Flume user mailing list: Question about gzip compression when using Flume NG


Sagar Mehta 2013-01-14, 19:18
Re: Question about gzip compression when using Flume Ng
Can you post your full config?

- Connor
On Mon, Jan 14, 2013 at 11:18 AM, Sagar Mehta <[EMAIL PROTECTED]> wrote:

> Hi Guys,
>
> I'm using Flume NG and it works great for me. In essence, I'm using an exec
> source to tail -F a logfile, feeding two HDFS sinks through a file channel.
> So far so good. Now I'm trying to enable gzip compression with the following
> config, per the Flume NG User Guide at
> http://flume.apache.org/FlumeUserGuide.html:
>
> #gzip compression related settings
> collector102.sinks.sink1.hdfs.codeC = gzip
> collector102.sinks.sink1.hdfs.fileType = CompressedStream
> collector102.sinks.sink1.hdfs.fileSuffix = .gz
>
> However, here is what appears to be happening:
>
> Flume writes gzip-compressed output [I see the .gz files in the output
> buckets], but when I try to decompress a file I get a warning about
> 'trailing garbage ignored', and the decompressed output is in fact smaller
> than the compressed .gz file.
>
> hadoop@jobtracker301:/home/hadoop/sagar/temp$ ls -ltr collector102.ngpipes.sac.ngmoco.com.1357936638713.gz
> -rw-r--r-- 1 hadoop hadoop 5381235 2013-01-11 20:44 collector102.ngpipes.sac.ngmoco.com.1357936638713.gz
>
> hadoop@jobtracker301:/home/hadoop/sagar/temp$ gunzip collector102.ngpipes.sac.ngmoco.com.1357936638713.gz
> gzip: collector102.ngpipes.sac.ngmoco.com.1357936638713.gz: decompression OK, trailing garbage ignored
>
> hadoop@jobtracker301:/home/hadoop/sagar/temp$ ls -l
> -rw-r--r-- 1 hadoop hadoop 58898 2013-01-11 20:44 collector102.ngpipes.sac.ngmoco.com.1357936638713
>
> Below are some hopefully helpful details.
>
> I'm using apache-flume-1.4.0-SNAPSHOT-bin:
>
> smehta@collector102:/opt$ ls -l flume
> lrwxrwxrwx 1 root root 31 2012-12-14 00:44 flume -> apache-flume-1.4.0-SNAPSHOT-bin
>
> I also have the hadoop-core jar on the Flume classpath:
>
> smehta@collector102:/opt/flume/lib$ ls -l hadoop-core-0.20.2-cdh3u2.jar
> -rw-r--r-- 1 hadoop hadoop 3534499 2012-12-01 01:53 hadoop-core-0.20.2-cdh3u2.jar
>
> Everything is working well except the compression part, and I'm not quite
> sure what I'm missing. While I debug this, any ideas or help would be much
> appreciated.
>
> Thanks in advance,
> Sagar
>
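Connor's request for the full config is a natural starting point. For context, a minimal sketch of the setup Sagar describes (an exec source tailing a logfile, a file channel, and an HDFS sink with the gzip settings quoted above) might look like the following. The source/channel names, tail command, HDFS path, and channel directories are illustrative assumptions, not Sagar's actual configuration:

# Hypothetical single-sink agent config; only the three hdfs.* compression
# settings at the bottom are taken from the original message.
collector102.sources = src1
collector102.channels = ch1
collector102.sinks = sink1

collector102.sources.src1.type = exec
collector102.sources.src1.command = tail -F /var/log/app/events.log
collector102.sources.src1.channels = ch1

collector102.channels.ch1.type = file
collector102.channels.ch1.checkpointDir = /var/flume/checkpoint
collector102.channels.ch1.dataDirs = /var/flume/data

collector102.sinks.sink1.type = hdfs
collector102.sinks.sink1.channel = ch1
collector102.sinks.sink1.hdfs.path = hdfs://namenode/flume/events

#gzip compression related settings (from the original message)
collector102.sinks.sink1.hdfs.codeC = gzip
collector102.sinks.sink1.hdfs.fileType = CompressedStream
collector102.sinks.sink1.hdfs.fileSuffix = .gz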
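On the 'trailing garbage ignored' warning itself: gunzip prints it when, after a valid gzip member ends, the remaining bytes do not start with the gzip magic number (0x1f 0x8b); it decompresses what it can and discards the rest, which matches the decompressed output being smaller than the .gz file. Below is a small diagnostic sketch (my own Python script, not something posted in this thread) that walks a local copy of the file, counts clean gzip members, and reports where any non-gzip bytes begin:

# gzip_members.py -- diagnostic sketch: count gzip members in a file and
# locate the first byte of any non-gzip "trailing garbage".
import sys
import zlib

def scan(path):
    with open(path, "rb") as f:
        data = f.read()
    pos, members = 0, 0
    while pos < len(data):
        if data[pos:pos + 2] != b"\x1f\x8b":  # gzip magic bytes
            print("non-gzip data starts at byte %d (%d trailing bytes)"
                  % (pos, len(data) - pos))
            return
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect a gzip wrapper
        out = d.decompress(data[pos:])
        if not d.eof:
            print("member %d is truncated" % (members + 1))
            return
        members += 1
        print("member %d: %d bytes decompressed" % (members, len(out)))
        pos = len(data) - len(d.unused_data)  # jump past this member
    print("%d clean gzip member(s), no trailing garbage" % members)

if __name__ == "__main__":
    scan(sys.argv[1])

If the file turns out to be one small gzip member followed by a large run of raw bytes, that would point at the sink writing data past the end of the compressed stream (for example on flush or file reopen) rather than at gunzip itself.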
Replies in this thread:

Sagar Mehta 2013-01-14, 22:34
Connor Woodson 2013-01-14, 22:52
Sagar Mehta 2013-01-14, 23:12
Brock Noland 2013-01-14, 23:16
Sagar Mehta 2013-01-14, 23:24
Sagar Mehta 2013-01-14, 23:27
Brock Noland 2013-01-14, 23:38
Sagar Mehta 2013-01-15, 00:43
Brock Noland 2013-01-15, 00:54
Sagar Mehta 2013-01-15, 01:03
Connor Woodson 2013-01-15, 01:17
Sagar Mehta 2013-01-15, 01:52
Bhaskar V. Karambelkar 2013-01-15, 01:25
Connor Woodson 2013-01-15, 01:26
Sagar Mehta 2013-01-15, 02:36
Connor Woodson 2013-01-14, 23:17