Hive, mail # user - Flume data to hive


Re: Flume data to hive
Nitin Pawar 2014-01-14, 20:02
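In Hive, when you load data into a partitioned table, you also need to add
that partition's info to the Hive metastore. Files written straight into the
table's directory are not visible to queries until the partition is registered.

So you can just add those partitions and it should work fine.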
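
For example, here is a minimal sketch using the table name, partition value,
and path from the quoted message below. The database name cwang49 is an
assumption inferred from the cwang49.db directory, and the partition value may
need quotes if epoch is declared as a string column:

```sql
-- Assumption: database name inferred from the cwang49.db directory.
USE cwang49;

-- Register one partition directory that Flume has already written.
ALTER TABLE targeting ADD IF NOT EXISTS PARTITION (epoch=123445)
LOCATION 'maprfs:///hive/cwang49.db/targeting/epoch=123445';

-- Alternatively, scan the table directory and register every
-- epoch=... folder that is missing from the metastore.
MSCK REPAIR TABLE targeting;

-- Check which partitions the metastore now knows about.
SHOW PARTITIONS targeting;
```

Since Flume keeps creating new epoch=... folders, one of these statements
would likely need to be rerun (or scheduled) whenever a new partition appears.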
On Wed, Jan 15, 2014 at 1:15 AM, Chen Wang <[EMAIL PROTECTED]> wrote:

> Hey guys,
> I am using flume to directly sink data into my hive table. However, there
> seems to be some schema inconsistency, and I am not sure how to
> troubleshoot it.
>
> I created a Hive table 'targeting'; it uses sequence files with snappy
> compression and is partitioned by 'epoch'. After the table was created, I
> could see a folder called 'targeting' under my folder:
> /hive/cwang49.db/targeting
>
> I then use Flume to write my log data into this folder directly. The
> Flume configuration is:
> sinks.HDFS.type = hdfs
> sinks.HDFS.hdfs.path = maprfs:///hive/cwang49.db/targeting/epoch=%{epoch}
> sinks.HDFS.hdfs.fileType = SequenceFile
> sinks.HDFS.hdfs.codeC = snappy
>
> When I run the Flume node, I can see the folder epoch=123445 created, and
> there are files under the folder as well. However, when I run a Hive query
> against the table, it returns no rows.
>
> I think this might be caused by some schema discrepancy? Do I still need
> to load partition metadata into Hive before I can see the partition? (I
> recall doing this for an external table.) How can I troubleshoot this?
>
> Thanks a bunch!
> Chen
>

--
Nitin Pawar