Can I merge files after I have loaded them into Hive?
This is my situation:
There is a log table partitioned by date, which stores nginx access logs.
The raw log files are loaded into Hive every hour.
At the moment each log file is small, about 10 MB or even less.
So each partition ends up with 24 small files.
I think this is inefficient and will consume more Hadoop heap memory.
That's why I want to merge the small files.
Can Hive merge those files automatically?
Or does Hive provide a tool to merge files?
Or can I just use hadoop dfs -cat to do that?
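
To make the question concrete, here is a minimal sketch of the kind of in-Hive merge I have in mind. It assumes Hive's hive.merge.* settings are available in the installed version, and the table name access_log, the partition column dt, and the column list are all hypothetical stand-ins for my real schema. I have not verified that this is the recommended approach.

```sql
-- Sketch only: access_log, dt, and the column names are hypothetical.
-- Ask Hive to add a merge step when a job's output files are small.
SET hive.merge.mapfiles=true;                 -- merge outputs of map-only jobs
SET hive.merge.mapredfiles=true;              -- merge outputs of map-reduce jobs
SET hive.merge.smallfiles.avgsize=134217728;  -- merge if the average output file is below 128 MB
SET hive.merge.size.per.task=268435456;       -- aim for merged files of about 256 MB

-- Rewrite one day's partition onto itself so its files are written again.
INSERT OVERWRITE TABLE access_log PARTITION (dt='2012-11-14')
SELECT remote_addr, time_local, request, status, body_bytes_sent
FROM access_log
WHERE dt='2012-11-14';
```

The idea would be to run this once per day, after the last hourly load for that date, so that the 24 small files are replaced by fewer, larger ones.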
Bejoy KS 2012-11-15, 08:10
Bejoy KS 2012-11-15, 10:08
Роман Павленко 2012-11-15, 10:20
Cheng Su 2012-11-15, 10:41