Flume >> mail # user >> Transferring compressed (gzip) files


Re: Transferring compressed (gzip) files
Sadu,
   Flume is designed to transfer a continuous stream of events into Hadoop.
It appears that in your use case each gzip file is a collection of events
that needs to be moved. The closest thing I can see to Flume supporting
your use case is the spooling directory source
(https://issues.apache.org/jira/browse/FLUME-1425), which has not yet been
released.
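
For illustration, a rough sketch of what an agent built around that source
might look like (a minimal sketch, assuming the property names in the
FLUME-1425 patch survive into the release; all paths below are
placeholders). One caveat: the source reads files as newline-delimited
text, so the gzip files would have to be decompressed into the spool
directory first, and the HDFS sink can then re-compress them on write:

  # agent name 'a1' and all paths are illustrative placeholders
  a1.sources = src1
  a1.channels = ch1
  a1.sinks = sink1

  # Spooling directory source (FLUME-1425, unreleased): each completed
  # file dropped into spoolDir is turned into a stream of line events
  a1.sources.src1.type = spooldir
  a1.sources.src1.spoolDir = /var/spool/flume/incoming
  a1.sources.src1.channels = ch1

  # Durable file channel
  a1.channels.ch1.type = file
  a1.channels.ch1.checkpointDir = /var/lib/flume/checkpoint
  a1.channels.ch1.dataDirs = /var/lib/flume/data

  # HDFS sink, re-compressing the event stream with gzip on write
  a1.sinks.sink1.type = hdfs
  a1.sinks.sink1.channel = ch1
  a1.sinks.sink1.hdfs.path = hdfs://namenode:8020/flume/events
  a1.sinks.sink1.hdfs.fileType = CompressedStream
  a1.sinks.sink1.hdfs.codeC = gzip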
-roshan
On Mon, Oct 22, 2012 at 11:14 AM, Sadananda Hegde <[EMAIL PROTECTED]> wrote:

> Hi Harish,
>
> I am still exploring my options, and that's part of my question too: which
> source should I be using?
>
> Currently I have set up my Flume NG configuration to use the exec source
> (exec source, file channel, and HDFS sink), but I can change to a
> different source if it handles the compressed files.
>
> Thanks,
> Sadu
> On Mon, Oct 22, 2012 at 10:27 AM, Harish Mandala <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Which of the flume sources are you trying to use?
>>
>> Regards,
>> Harish
>>
>> On Mon, Oct 22, 2012 at 11:18 AM, Sadananda Hegde <[EMAIL PROTECTED]> wrote:
>>
>>> My application servers produce data files in compressed (gzip) format. I
>>> am planning to use Flume NG (1.2.0) to collect those files and transfer
>>> them to a Hadoop cluster (write to HDFS). Is it possible to read and
>>> transfer them without uncompressing them first? My sink would be HDFS,
>>> and there are options to compress before writing to HDFS. That would
>>> work fine if my source were an uncompressed text file that needed to be
>>> stored in HDFS in compressed format. But in my case, the source itself
>>> is compressed. What would be the best option to handle such cases?
>>>
>>> Thanks for your help.
>>>
>>> Sadu
>>>
>>
>>
>
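
P.S. In the meantime, and assuming your gzip files contain newline-delimited
text, one stopgap with the exec source you already have is to point it at
zcat, so events arrive as plain text lines and the HDFS sink re-compresses
them on write. A minimal sketch (the command and all paths are placeholders,
and remember the exec source makes no delivery guarantees if the agent dies):

  a1.sources = src1
  a1.channels = ch1
  a1.sinks = sink1

  # Exec source: stream the decompressed lines of one gzip file.
  # zcat emits text lines, which the line-oriented exec source expects.
  a1.sources.src1.type = exec
  a1.sources.src1.command = zcat /data/app/events.log.gz
  a1.sources.src1.channels = ch1

  a1.channels.ch1.type = file
  a1.channels.ch1.checkpointDir = /var/lib/flume/checkpoint
  a1.channels.ch1.dataDirs = /var/lib/flume/data

  # HDFS sink writes the stream back out gzip-compressed
  a1.sinks.sink1.type = hdfs
  a1.sinks.sink1.channel = ch1
  a1.sinks.sink1.hdfs.path = hdfs://namenode:8020/flume/events
  a1.sinks.sink1.hdfs.fileType = CompressedStream
  a1.sinks.sink1.hdfs.codeC = gzip

Since exec runs a single command, covering many files would take a wrapper
script that zcats each completed file in turn.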