-
Example using Binary SerDe
Hans Uhlig 2012-01-22, 18:23
I am attempting to use LazyBinarySerDe to read sequence files output by a MapReduce job. Is there an example of how the data needs to be packed by the final reduce, and how the tables need to be set up so they can read the output?
-
Re: Example using Binary SerDe
Aniket Mokashi 2012-01-23, 01:41
Hi Hans,
Can you please elaborate on the use case? Is your data already in a binary format readable by LazyBinarySerDe (that is, could you mount a Hive table with that SerDe over it)? Or are you trying to write data using MapReduce (Java) into a location that can then be read by a table declared to use LazyBinarySerDe?
Please elaborate more.
Thanks, Aniket
-
Re: Example using Binary SerDe
Hans Uhlig 2012-01-23, 03:28
Hi Aniket,
I am looking to run some data through a MapReduce job, and I want the output sequence files to be compatible with block-compressed, partitioned LazyBinarySerDe tables so I can map external tables onto them. The current job uses a POJO that implements Writable to serialize to disk. This is easy to read back in MapReduce, but I am not sure how to read it with Hive. Do I need to define it as a struct, or just as normal fields with LazyBinarySerDe as the row format?
-
Re: Example using Binary SerDe
Aniket Mokashi 2012-01-23, 05:27
Does that mean you would like to read the POJO objects using Hive? Is your POJO a custom Writable? As I understand it, LazyBinarySerDe is a SerDe that converts a BytesWritable into columns. Your RecordReader would return a BytesWritable, and the SerDe, together with its ObjectInspector, would convert it into typed columns. So directly converting these POJOs into columns would not be straightforward.
In my opinion, writing a SerDe for this case would also be quite tough (but doable). You might need your own record reader (InputFormat) and then a SerDe of your own to inspect the objects.
If you control the way you store your POJO, you may want to pass it through the SerDe and create a BytesWritable before storing it. That would make the problem much simpler.
Thanks, Aniket
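For the table side of the question, a rough sketch of what the Hive DDL might look like, assuming the job writes each row as a BytesWritable value already serialized by LazyBinarySerDe. The table name, columns, partition column, and paths below are all hypothetical placeholders, not anything taken from the actual job:

```sql
-- Hypothetical table: name, columns, partition column, and paths are placeholders.
-- Assumes each sequence file value is a BytesWritable produced by LazyBinarySerDe
-- with the same column order and types as declared here.
CREATE EXTERNAL TABLE job_output (
  id   INT,
  name STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe'
STORED AS SEQUENCEFILE
LOCATION '/path/to/job/output';

-- Each partition directory written by the job has to be registered explicitly.
ALTER TABLE job_output ADD PARTITION (dt = '2012-01-22')
LOCATION '/path/to/job/output/dt=2012-01-22';
```

Block compression is recorded in the sequence file header, so it should not need anything extra in the table definition. Hive reads only the value of each sequence file record, so the key written by the reducer would be ignored.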