Hive >> mail # user >> Example using Binary SerDe


Re: Example using Binary SerDe
Does that mean you would like to read the POJO objects using Hive? Is your
POJO a custom Writable?
LazyBinarySerDe, in my opinion, is a SerDe that converts a BytesWritable into
columns. Your RecordReader would return a BytesWritable, and the SerDe, along
with its ObjectInspector, would convert it to typed columns. So directly
converting these POJOs into columns would not be straightforward.

In my opinion, writing a SerDe for this case would also be quite tough (but
doable). You might need your own record reader (InputFormat) and then a
SerDe of your own to inspect the objects.

If you control the way you store your POJO, you may want to pass it through
the SerDe and create a BytesWritable before storing it. That would make the
problem much simpler.
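
To make the flow above concrete, here is a minimal, JDK-only sketch of the
round trip: a POJO is turned into a byte row before it is stored, and a
separate step turns that byte row back into typed columns. Everything in it is
an assumption for illustration: the `Visit` class, its fields, and the
`toRowBytes`/`toColumns` helpers are hypothetical, and the byte layout is plain
`DataOutputStream` encoding, not the LazyBinary wire format. In a real job the
write side would presumably go through LazyBinarySerDe's serialize call with a
matching struct ObjectInspector, and the resulting BytesWritable would be the
SequenceFile value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

class PojoRowSketch {

    // Hypothetical POJO standing in for the job's custom Writable.
    static final class Visit {
        final String user;
        final int pageCount;
        final long timestamp;

        Visit(String user, int pageCount, long timestamp) {
            this.user = user;
            this.pageCount = pageCount;
            this.timestamp = timestamp;
        }
    }

    // Write side: flatten the POJO into one byte row before storing it.
    // In a Hadoop job these bytes would be wrapped in a BytesWritable.
    // NOTE: illustrative encoding only, not the LazyBinary format.
    static byte[] toRowBytes(Visit v) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(v.user);
            out.writeInt(v.pageCount);
            out.writeLong(v.timestamp);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read side: the role the SerDe and its ObjectInspector play in Hive,
    // turning an opaque byte row into typed column values in schema order.
    static List<Object> toColumns(byte[] row) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(row));
            String user = in.readUTF();
            int pageCount = in.readInt();
            long timestamp = in.readLong();
            return Arrays.<Object>asList(user, pageCount, timestamp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] row = toRowBytes(new Visit("alice", 3, 1327288080000L));
        System.out.println(toColumns(row));
    }
}
```

The point of the sketch is the division of labor: once the stored value is
already a flat byte row in the layout the table's SerDe expects, the read side
needs no knowledge of the POJO class at all.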

Thanks,
Aniket
On Sun, Jan 22, 2012 at 7:28 PM, Hans Uhlig <[EMAIL PROTECTED]> wrote:

> Hi Aniket,
>
> I am looking to run some data through a MapReduce job, and I want the output
> sequence files to be compatible with Block Compressed Partitioned
> LazyBinarySerDe so I can map external tables to them. The current job uses a
> POJO that extends Writable to serialize to disk. This is easy to read back
> in for MapReduce, but I am not sure how to read it with Hive. Do I need to
> define it as a struct, or as just normal fields with LazyBinarySerDe as the
> row format?
>
> On Sun, Jan 22, 2012 at 5:41 PM, Aniket Mokashi <[EMAIL PROTECTED]> wrote:
>
>> Hi Hans,
>>
>> Can you please elaborate on the use case more? Is your data already in a
>> binary format readable by LazyBinarySerDe (if you mount a table with that
>> SerDe in Hive)?
>> OR
>> are you trying to write data using MapReduce (Java) into a location that
>> can then be read by a table that is declared to use LazyBinarySerDe?
>>
>> Please elaborate more.
>>
>> Thanks,
>> Aniket
>>
>> On Sun, Jan 22, 2012 at 10:23 AM, Hans Uhlig <[EMAIL PROTECTED]> wrote:
>>
>>> I am attempting to use LazyBinarySerDe to read sequence files output by
>>> a MapReduce job. Is there an example of how the data needs to be packed by
>>> the final reduce, and how the tables are set up so they can read the
>>> output?
--
"...:::Aniket:::... Quetzalco@tl"