Hive >> mail # user >> Flume data to hive

Re: Flume data to hive
In Hive, writing data directly into a partitioned table's directory is not
enough: you also need to register that partition's metadata with the Hive
metastore before queries can see the data.

So just add those partitions and it should work fine.
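For example, using the table and partition names from your mail below, the registration could look something like this (the epoch value is illustrative; use the actual directory names Flume created):

```sql
-- Register one partition directory that Flume wrote:
ALTER TABLE targeting ADD IF NOT EXISTS PARTITION (epoch=123445);

-- Or let Hive scan the table's directory and register every partition
-- directory it finds (useful when Flume keeps creating new ones):
MSCK REPAIR TABLE targeting;
```

After either statement, `SHOW PARTITIONS targeting;` should list the new partitions and your queries should return rows.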
On Wed, Jan 15, 2014 at 1:15 AM, Chen Wang <[EMAIL PROTECTED]> wrote:

> Hey guys,
> I am using Flume to sink data directly into my Hive table. However, there
> seems to be some schema inconsistency, and I am not sure how to
> troubleshoot it.
> I created a table 'targeting' in Hive that uses SequenceFile storage with
> Snappy compression, partitioned by 'epoch'. After the table was created, I
> could see a folder called 'targeting' under my database folder:
> /hive/cwang49.db/targeting
> I then used Flume to flow my log data into this folder directly; the
> Flume configuration is:
> sinks.HDFS.type = hdfs
> sinks.HDFS.hdfs.path = maprfs:///hive/cwang49.db/targeting/epoch=%{epoch}
> sinks.HDFS.hdfs.fileType = SequenceFile
> sinks.HDFS.hdfs.codeC = snappy
> When I run the Flume node, I can see the folder epoch=123445 created, and
> there are files under it as well. However, when I run a Hive query against
> the table, it returns empty results.
> Could this be caused by some schema discrepancy? Do I still need to load
> partition metadata into Hive before I can see the partition? (I recall
> doing this for external tables.) How can I troubleshoot this?
> Thanks a bunch!
> Chen

Nitin Pawar