Flume >> mail # user >> Processing data from HDFS


Abhijeet Pathak 2013-01-24, 07:20
Nitin Pawar 2013-01-24, 07:29
Alexander Alten-Lorenz 2013-01-24, 07:40
Abhijeet Pathak 2013-01-25, 05:20
Re: Processing data from HDFS
Then you could use Hive's HBase handler: http://mapredit.blogspot.de/2012/12/using-hives-hbase-handler.html

- Alex

On Jan 25, 2013, at 6:20 AM, Abhijeet Pathak <[EMAIL PROTECTED]> wrote:

> I've evaluated Pig, but it's not suitable for my purpose.
>
> The CSV files I have can have different column names and a different column sequence in each file.
> Also, the key is not present in the CSV, so we need to calculate a row key for each record.
>
> Regards,
> Abhijeet Pathak
>
>
> ________________________________________
> From: Alexander Alten-Lorenz [[EMAIL PROTECTED]]
> Sent: 24 January 2013 1:10 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Processing data from HDFS
>
> Use Pig. You can find a well-written example here:
> http://blog.whitepages.com/2011/10/27/hbase-storage-and-pig/
>
> Regards
>
> On Jan 24, 2013, at 8:29 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
>> How are the files coming into HDFS?
>>
>> There is a direct HBase sink available for writing data into HBase.
>>
>> Also, to go from HDFS to HBase, you will need to write your own MapReduce job to
>> put the data into HBase.
>>
>>
>> On Thu, Jan 24, 2013 at 12:50 PM, Abhijeet Pathak <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I have a folder in HDFS where a bunch of files get created periodically.
>>> I know that Flume currently does not support reading from an HDFS folder.
>>>
>>> What is the best way to transfer this data from HDFS to HBase (with or
>>> without using Flume)?
>>>
>>>
>>> Regards,
>>> Abhijeet Pathak
>>>
>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
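
For readers following the thread, here is a minimal sketch of the per-record logic Abhijeet describes: CSV files whose column names and column order differ from file to file, and a row key that has to be calculated for each record. It is plain Python with no Hadoop or HBase dependency and is not taken from any of the posts. The column names in KEY_COLUMNS, the MD5-based key scheme, and the helper names make_row_key and parse_csv are all assumptions made for illustration.

```python
import csv
import hashlib
import io

# Hypothetical key fields; the thread does not say which columns identify a record.
KEY_COLUMNS = ("customer_id", "event_time")


def make_row_key(record, key_columns=KEY_COLUMNS):
    """Derive a deterministic row key from selected fields (assumed scheme)."""
    # Missing or empty fields contribute an empty string to the key material.
    raw = "|".join(record.get(col) or "" for col in key_columns)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def parse_csv(text, key_columns=KEY_COLUMNS):
    """Yield (row_key, {column: value}) pairs, reading columns by header name."""
    # DictReader maps each value to its header, so column order does not matter.
    reader = csv.DictReader(io.StringIO(text))
    for record in reader:
        yield make_row_key(record, key_columns), dict(record)


# Two sample files with different column sets and a different column order.
file_a = "customer_id,event_time,amount\n42,2013-01-24T07:20,9.50\n"
file_b = "amount,region,event_time,customer_id\n9.50,EU,2013-01-24T07:20,42\n"

rows_a = list(parse_csv(file_a))
rows_b = list(parse_csv(file_b))

# The same logical record yields the same row key regardless of file layout.
print(rows_a[0][0] == rows_b[0][0])
```

In a MapReduce job like the one Nitin suggests, each (row_key, columns) pair would presumably be turned into an HBase write inside the mapper; that part is not shown here.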