Pig, mail # user - Compressing output using block compression


Re: Compressing output using block compression
Mohit Anchlia 2012-04-03, 18:39
Is bzip2 not advisable? I think it is splittable too and is supported out of
the box.
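
For context on why bzip2 can be split at all: the format writes data in independent blocks, each introduced by a fixed 48-bit marker that a reader can scan for. Below is a minimal sketch using only Python's standard `bz2` module; the helper name `first_block_offset` and the sample data are made up for illustration. It only inspects the file format and says nothing about how a particular Hadoop or Pig version handles splits.

```python
import bz2

# 48-bit start-of-block marker defined by the bzip2 format.
BLOCK_MAGIC = bytes.fromhex("314159265359")


def first_block_offset(compressed: bytes) -> int:
    """Return the byte offset of the first block marker (hypothetical helper).

    Only the first block is guaranteed to start on a byte boundary; later
    blocks are bit-aligned, so a real splitter has to scan bit by bit.
    """
    return compressed.find(BLOCK_MAGIC)


data = b"some log line\n" * 1000
compressed = bz2.compress(data, compresslevel=9)

print(compressed[:4])                  # stream header: b'BZh9'
print(first_block_offset(compressed))  # 4, directly after the header
```

Whether splitting actually happens in a job still depends on the input format and codec in use, so it is worth checking against the Hadoop and Pig versions you run.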

On Thu, Mar 29, 2012 at 8:08 PM, 帝归 <[EMAIL PROTECTED]> wrote:

> When I use LzoPigStorage, it loads all files under a directory. But I
> want to compress every file under a directory and keep each file name
> unchanged, just with a .lzo extension added. How can I do this? Do I need
> to write a MapReduce job?
>
> 2012/3/30 Jonathan Coveney <[EMAIL PROTECTED]>
>
> > check out:
> >
> >
> > https://github.com/kevinweil/elephant-bird/tree/master/src/java/com/twitter/elephantbird/pig/store
> >
> > 2012/3/29 Mohit Anchlia <[EMAIL PROTECTED]>
> >
> > > Thanks! When I store output how can I tell pig to compress it in LZO
> > > format?
> > >
> > > On Thu, Mar 29, 2012 at 4:02 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > You might find the elephant-bird project helpful for reading and
> > > > creating LZO files, in raw hadoop or using Pig.
> > > > (disclaimer: I'm a committer on elephant-bird)
> > > >
> > > > D
> > > >
> > > > On Wed, Mar 28, 2012 at 9:49 AM, Prashant Kommireddi
> > > > <[EMAIL PROTECTED]> wrote:
> > > > > Pig supports LZO for splittable compression.
> > > > >
> > > > > Thanks,
> > > > > Prashant
> > > > >
> > > > > On Mar 28, 2012, at 9:45 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > >> We currently have 100s of GB of uncompressed data which we would like to
> > > > >> zip using some compression that is block compression so that we can use
> > > > >> multiple input splits. Does pig support any such compression?
> > > >
> > >
> >
>
>
>
> --
> ‘(hello world)
>