Hive >> mail # user >> Can I merge files after I loaded them into hive?


Can I merge files after I loaded them into hive?
Hi, all.

Can I merge files after I loaded them into hive?
This is my situation:

I have a log table, partitioned by date, that stores nginx access logs.
The raw log files are loaded into Hive every hour.
For now, each log file is small, say 10 MB or even less.
So each partition ends up with 24 small files.
I think this is inefficient and will consume more Hadoop heap memory.
That's why I want to merge the small files.

Can Hive merge those files automatically?
Or does Hive provide any tools to merge files?
Or can I just use hadoop dfs -cat to do that?
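
To make the last option concrete, here is a minimal local sketch in Python of what a cat-style merge amounts to: plain byte concatenation of the hourly files in order. It assumes the logs are uncompressed, line-oriented text, and it works on local files rather than on HDFS. The function name merge_text_files and the file names in the usage note are hypothetical, for illustration only.

```python
from pathlib import Path


def merge_text_files(sources, destination):
    """Concatenate plain-text files into one file, in the order given.

    Assumption: every source is uncompressed, line-oriented text, so
    joining the bytes keeps each log record intact.
    """
    with open(destination, "wb") as out:
        for source in sources:
            data = Path(source).read_bytes()
            out.write(data)
            # Add a newline if a file does not end with one, so its last
            # record is not glued to the first record of the next file.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return destination
```

For example, merge_text_files(sorted(Path("logs").glob("access-*.log")), "merged.log") would combine one day's hourly files into a single file. A merge like this only makes sense for text files, since other storage formats cannot be joined byte by byte.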

--

Regards,
Cheng Su
Bejoy KS 2012-11-15, 08:10
Bejoy KS 2012-11-15, 10:08
Роман Павленко 2012-11-15, 10:20
Cheng Su 2012-11-15, 10:41