Flume >> mail # user >> Question about gzip compression when using Flume Ng


Re: Question about gzip compression when using Flume Ng
Bhaskar,

Your suggestion worked like magic!! I don't believe my eyes!!

hadoop@jobtracker301:/home/hadoop/sagar/debug$ hget
/ngpipes-raw-logs/2013-01-15/0200/collector102.ngpipes.sac.ngmoco.com.1358216630511.gz
.

hadoop@jobtracker301:/home/hadoop/sagar/debug$ gunzip
collector102.ngpipes.sac.ngmoco.com.1358216630511.gz
hadoop@jobtracker301:/home/hadoop/sagar/debug$ ls -ltrh
total 34M
-rw-r--r-- 1 hadoop hadoop 34M 2013-01-15 02:29
collector102.ngpipes.sac.ngmoco.com.1358216630511

The file decompresses fine!!
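The check above can be reproduced anywhere with `gzip -t`, which verifies the stream without writing the decompressed output. A self-contained sketch (in the thread the file came from HDFS via `hget`, apparently a local alias for `hadoop fs -get`; here a small .gz is fabricated so the check runs on any box):

```shell
# Fabricate a small gzip file standing in for the one pulled from HDFS.
echo "sample log line" | gzip > sample.gz

# gzip -t tests integrity only; a file written through a broken codec
# would fail here (e.g. "not in gzip format") instead of at gunzip time.
if gzip -t sample.gz; then
  echo "gzip stream OK"
fi
rm -f sample.gz
```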

This is what I did:

   - Downloaded the latest Cloudera packages from here -
   https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation
   - It installed Hadoop to /usr/lib, so I pointed HADOOP_HOME to
   /usr/lib/hadoop and restarted Flume!!
   - That's it!! - time to party :)
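The steps above boil down to the following sketch (paths are the CDH4 defaults from this thread; the restart command is an assumption about this particular setup):

```shell
# Point HADOOP_HOME at the CDH4 install base so Flume NG can pick up
# Hadoop's jars (and transitive deps such as guava) on its classpath.
export HADOOP_HOME=/usr/lib/hadoop

# Flume only reads this at startup, so it must be restarted afterwards,
# e.g. (service name is an assumption):
# sudo service flume-ng-agent restart

echo "HADOOP_HOME=$HADOOP_HOME"
```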

Thank you so much guys for your prompt replies!!

Sagar
On Mon, Jan 14, 2013 at 5:25 PM, Bhaskar V. Karambelkar <[EMAIL PROTECTED]
> wrote:

> Sagar,
> You're better off downloading and unzipping CDH3u5 or CDH4 somewhere, and
> pointing the HADOOP_HOME env. variable to the base directory.
> That way you won't have to worry about which jar files are needed and
> which are not.
> Flume will auto add all JARs from the Hadoop Installation that it needs.
>
> regards
> Bhaskar
>
>
> On Mon, Jan 14, 2013 at 7:43 PM, Sagar Mehta <[EMAIL PROTECTED]> wrote:
>
>> OK, so I dropped the new hadoop-core jar into /opt/flume/lib [I got some
>> errors about the guava dependencies, so I put in that jar too]
>>
>> smehta@collector102:/opt/flume/lib$ ls -ltrh | grep -e "hadoop-core" -e
>> "guava"
>> -rw-r--r-- 1 hadoop hadoop 1.5M 2012-11-14 21:49 guava-10.0.1.jar
>> -rw-r--r-- 1 hadoop hadoop 3.7M 2013-01-14 23:50
>> hadoop-core-0.20.2-cdh3u5.jar
>>
>> Now I don't even see the file being created in HDFS, and the Flume log is
>> happily talking about housekeeping for some file channel checkpoints,
>> updating pointers et al.
>>
>> Below is a tail of the Flume log:
>>
>> hadoop@collector102:/data/flume_log$ tail -10 flume.log
>> 2013-01-15 00:42:10,814 [Log-BackgroundWorker-channel2] INFO
>>  org.apache.flume.channel.file.Log - Updated checkpoint for file:
>> /data/flume_data/channel2/data/log-36 position: 129415524 logWriteOrderID:
>> 1358209947324
>> 2013-01-15 00:42:10,814 [Log-BackgroundWorker-channel2] INFO
>>  org.apache.flume.channel.file.LogFile - Closing RandomReader
>> /data/flume_data/channel2/data/log-34
>> 2013-01-15 00:42:10,814 [Log-BackgroundWorker-channel1] INFO
>>  org.apache.flume.channel.file.Log - Updated checkpoint for file:
>> /data/flume_data/channel1/data/log-36 position: 129415524 logWriteOrderID:
>> 1358209947323
>> 2013-01-15 00:42:10,814 [Log-BackgroundWorker-channel1] INFO
>>  org.apache.flume.channel.file.LogFile - Closing RandomReader
>> /data/flume_data/channel1/data/log-34
>> 2013-01-15 00:42:10,819 [Log-BackgroundWorker-channel2] INFO
>>  org.apache.flume.channel.file.LogFileV3 - Updating log-34.meta
>> currentPosition = 18577138, logWriteOrderID = 1358209947324
>> 2013-01-15 00:42:10,819 [Log-BackgroundWorker-channel1] INFO
>>  org.apache.flume.channel.file.LogFileV3 - Updating log-34.meta
>> currentPosition = 18577138, logWriteOrderID = 1358209947323
>> 2013-01-15 00:42:10,820 [Log-BackgroundWorker-channel1] INFO
>>  org.apache.flume.channel.file.LogFile - Closing RandomReader
>> /data/flume_data/channel1/data/log-35
>> 2013-01-15 00:42:10,821 [Log-BackgroundWorker-channel2] INFO
>>  org.apache.flume.channel.file.LogFile - Closing RandomReader
>> /data/flume_data/channel2/data/log-35
>> 2013-01-15 00:42:10,826 [Log-BackgroundWorker-channel1] INFO
>>  org.apache.flume.channel.file.LogFileV3 - Updating log-35.meta
>> currentPosition = 217919486, logWriteOrderID = 1358209947323
>> 2013-01-15 00:42:10,826 [Log-BackgroundWorker-channel2] INFO
>>  org.apache.flume.channel.file.LogFileV3 - Updating log-35.meta
>> currentPosition = 217919486, logWriteOrderID = 1358209947324
>>
>> Sagar
>>
>>
>> On Mon, Jan 14, 2013 at 3:38 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
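For context on the thread's subject: gzip output from the Flume NG HDFS sink is typically enabled with a stanza like the one below. The agent/sink/channel names and the path are hypothetical, modeled on this thread; `hdfs.fileType` and `hdfs.codeC` are standard Flume NG HDFS sink keys.

```shell
# Write a minimal HDFS-sink config fragment for gzip-compressed output.
cat > hdfs-sink-gzip.conf <<'EOF'
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = /ngpipes-raw-logs/%Y-%m-%d/%H%M
agent1.sinks.sink1.hdfs.fileType = CompressedStream
agent1.sinks.sink1.hdfs.codeC = gzip
agent1.sinks.sink1.channel = channel1
EOF

# Show the line that selects the codec.
grep codeC hdfs-sink-gzip.conf
rm -f hdfs-sink-gzip.conf
```

Note that with the codec misconfigured (or the wrong Hadoop jars on the classpath, as earlier in this thread), the sink can produce files that gunzip rejects even though they carry a .gz name.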