-
LZO & Pig (Elephantbird?)
Evert Lammerts 2011-01-12, 20:10
Hello list,

I've installed the LZO codecs (https://github.com/kevinweil/hadoop-lzo) and now I'm looking into using LZO in Pig. Elephant Bird (https://github.com/kevinweil/elephant-bird) seems to provide some nice prefab loaders, but its requirements do not fit our Hadoop installation (we're on CDH3b2 with Pig 0.7; EB cannot be used with anything > 0.6). Also, the need for Thrift 0.2 is unclear to me, since Thrift is now at 0.5.

Now I did find this project, http://code.google.com/p/hadoop-gpl-packing/, saying EB can handle even Pig 0.8. This confuses me: can I or can I not use Elephant Bird with Pig 0.7, or even upgrade to Pig 0.8?

Since EB is probably not an option, does anybody have some pointers on how to use LZO'ed files with Pig?

Thanks!
Evert Lammerts
-
RE: LZO & Pig (Elephantbird?)
Tyler Coffin 2011-01-12, 20:23
There's a fork of elephant-bird where pig-8 support is being worked on: https://github.com/dvryaboy/elephant-bird/tree/pig-08

I haven't given it a shot yet.
-
Re: LZO & Pig (Elephantbird?)
Dmitriy Ryaboy 2011-01-12, 22:44
I am working on the Pig 0.8 compatibility layer; it mostly works, fwiw. Converting to Thrift 0.5 would be fairly straightforward; unfortunately the signatures of Thrift messages changed, so the code is not entirely backwards compatible. I don't think the changes are material for what we do with Pig.

Are you trying to load protobufs or Thrift files, or do you just want LZO support? If you just want plain-text LZO loading, the loaders in the pig-08 branch totally work.

Let me know if you have any issues.

D
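For the plain-text case, a Pig script only needs to register the Elephant Bird jar and name the loader in a LOAD statement. The Python sketch below just renders such a script as text. The jar path, input path and loader class name in the example call are placeholders, not names taken from the pig-08 branch, so look up the real ones there before relying on it.

```python
def lzo_load_script(jar_path, input_path, loader_class, alias="raw"):
    """Render a small Pig script that loads input_path through loader_class."""
    lines = [
        # Make the loader's classes visible to Pig.
        "REGISTER %s;" % jar_path,
        # Read the LZO-compressed input with the given load function.
        "%s = LOAD '%s' USING %s();" % (alias, input_path, loader_class),
        # Peek at a few records instead of dumping the whole input.
        "sample_rows = LIMIT %s 10;" % alias,
        "DUMP sample_rows;",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # All three arguments are hypothetical placeholders.
    print(lzo_load_script(
        "/path/to/elephant-bird.jar",
        "/data/logs/*.lzo",
        "some.package.LzoTextLoader",
    ))
```

Generating the script rather than hard-coding it keeps the loader class in one place, which may help while the class names are still moving between branches.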
-
Re: LZO & Pig (Elephantbird?)
Dmitriy Ryaboy 2011-01-12, 22:51
P.S. Thrift 0.2 and 0.5 are binary-compatible, so you can read messages generated with 0.5 using files compiled with Thrift 0.2, and vice versa. We have some projects that use 0.5 and some that are still on 0.2, and all that means is that you install both versions of the compiler on your dev box and flip your aliases depending on which project you are building.
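The alias flipping described above could also be done by a small build helper instead of by hand. This is a minimal sketch, assuming both compilers are installed side by side; the install paths and project names are hypothetical, and only the `thrift --gen <language> <file>` invocation is the compiler's usual form.

```python
# Hypothetical install locations for the two Thrift compilers; adjust to
# wherever each version actually lives on your dev box.
THRIFT_COMPILERS = {
    "0.2": "/opt/thrift-0.2/bin/thrift",
    "0.5": "/opt/thrift-0.5/bin/thrift",
}

# Hypothetical mapping from project name to the Thrift version it builds with.
PROJECT_THRIFT_VERSION = {
    "legacy-loaders": "0.2",
    "new-service": "0.5",
}


def thrift_command(project, idl_file, language="java"):
    """Return the compiler command line for a project's pinned Thrift version."""
    version = PROJECT_THRIFT_VERSION[project]
    compiler = THRIFT_COMPILERS[version]
    # Build the argument list: compiler binary, target language, IDL file.
    return [compiler, "--gen", language, idl_file]


if __name__ == "__main__":
    print(" ".join(thrift_command("legacy-loaders", "messages.thrift")))
```

The returned list can be handed to `subprocess.call` as is, so a build never depends on which alias happens to be active in the shell.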
-
RE: LZO & Pig (Elephantbird?)
Evert Lammerts 2011-01-13, 15:43
> Are you trying to load protobufs or thrift files, or do you just want Lzo support?

Protobufs would be nice, but Elephant Bird is not ready yet for Pig 0.7 / 0.8, right?

> If you just want plain text lzo loading, the loaders in the pig-08 branch totally work.

Does Pig 0.8 work with Hadoop 0.20.1 as well?

Thanks for the support!
Evert
-
Re: LZO & Pig (Elephantbird?)
Dmitriy Ryaboy 2011-01-13, 19:26
Depends on your definition of ready. I haven't put it into production and there's a bit of clean-up left, but as far as I know there are no critical bugs; just some interface issues which are shared between the current master and the version for Pig 0.8.

Pig 0.8 should work with Hadoop 0.20.1; I've been working with it under CDH2 (which is a patched 0.20.1).

-Dmitriy
-
Re: LZO & Pig (Elephantbird?)
Dmitriy Lyubimov 2011-01-20, 20:16
We just open-sourced some load and store funcs for Pig 0.7 on CDH3b3 that read and write protobufs from/to sequence files and HBase; we actually use them in production. There's no codegen, and I guess they do not support LZO files directly (but one might enable LZO inside sequence files if needed). It works rather nicely for us. There might be a need for some minor adjustments, since we use Grunt integrated in our redundant clients rather than spinning off Grunt on its own. We haven't switched to 0.8 yet, but I gather the API gap for loadfuncs is narrower between 0.7 and 0.8 than between 0.6 and 0.7 (we actually have some decommissioned funcs for 0.6 in that tree that we don't use anymore, too): https://github.com/dlyubimov/ecoadapters
-
Re: LZO & Pig (Elephantbird?)
Dmitriy Ryaboy 2011-01-20, 20:41
FWIW we don't do codegen anymore either, in both the 0.6 and the 0.8-compatible branches. Pointing to a description file is a good idea; we'll add that.

D
-
Re: LZO & Pig (Elephantbird?)
Dmitriy Lyubimov 2011-01-20, 20:44
Yes, I figured you guys had gone far ahead since I last checked it. Our code has been spinning for the past 7 months or so, I think from before we had a chance to figure out all the details of EB.
-
Re: LZO & Pig (Elephantbird?)
Evert Lammerts 2011-01-13, 20:18
Alright, thanks! I'm going to give that a try and will let you know how it goes.

Cheers,
Evert
-
RE: LZO & Pig (Elephantbird?)
Gerrit van Vuuren 2011-01-17, 12:04
Hi,

FYI: The project http://code.google.com/p/hadoop-gpl-packing/ is an effort to package all the LZO and Elephant Bird code into one rpm or deb for x86_64, amd64-64 and i386-32. I created this for use at my current workplace, and we've decided to open-source and share it.

Normally, to start with LZO you need to compile and package several projects together. This project makes the task easier: just download and run rpm -i or dpkg. The libraries get installed into /opt/hadoopgpl, and from there you reference them from your Pig scripts.

The newest release contains Elephant Bird for pig-0.8.0, compiled from these branches:
(1) https://github.com/dvryaboy/elephant-bird
(2) https://github.com/hirohanin/elephant-bird/
(3) https://github.com/gerritjvv/elephant-bird

Some confusion might arise while these are being developed and pulled back into the original branch, but the current state as I understand it is: branch 1 contains everything from (2); branch 3 contains a merge of 1 and 2, with some extra classes I'm using at my workplace.

We're using elephant-bird LZO and GPB with pig-0.8.0, and it's working without problems so far.

Also, on the home page of this project I've tried to provide easy-to-follow instructions on how to link in and configure Pig and Hadoop for use with LZO and GPB.

Cheers,
Gerrit
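Referencing the installed libraries from a Pig script comes down to one REGISTER statement per jar. The sketch below walks an install prefix and prints those statements. It assumes nothing about the layout under /opt/hadoopgpl beyond the prefix mentioned above, and the helper name is made up for illustration.

```python
import os


def register_statements(lib_dir):
    """Build one Pig REGISTER line per jar found under lib_dir (recursively)."""
    jars = []
    for root, _dirs, files in os.walk(lib_dir):
        for name in files:
            if name.endswith(".jar"):
                jars.append(os.path.join(root, name))
    # Sort for a stable, reproducible script.
    return ["REGISTER %s;" % path for path in sorted(jars)]


if __name__ == "__main__":
    # /opt/hadoopgpl is the install prefix mentioned above; its internal
    # layout is not assumed here, so the tree is simply walked for jars.
    for line in register_statements("/opt/hadoopgpl"):
        print(line)
```

The printed lines can be pasted at the top of a Pig script. Registering every jar found is the blunt option; trimming the list to the jars a script really needs is probably worthwhile once things work.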