Hive >> mail # user >> 2 questions about SerDe

Koert Kuipers 2012-02-21, 15:37
Re: 2 questions about SerDe
Have a look at the code for the LazySerDes. When you deserialize in the
SerDe, you don't actually have to deserialize all the columns. deserialize()
can return an object that has not actually been deserialized, and you can
write an ObjectInspector that deserializes a field from that structure only
when it's needed (that is, when the ObjectInspector is called).
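
A rough sketch of that idea in plain Java, in case it helps. It does not use the Hive API: LazyRow, LazyRowInspector, LazySerDeSketch and the comma-separated integer record format are all made up for illustration, and the inspector only loosely mirrors the role a struct ObjectInspector plays in Hive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical row object: keeps the raw record and decodes nothing up front.
final class LazyRow {
    private final String raw;                 // undecoded record, e.g. "10,20,30,40"
    private String[] parts;                   // stays null until the first field access
    private final Map<Integer, Integer> decoded = new HashMap<>();

    LazyRow(String raw) { this.raw = raw; }

    // Number of fields decoded so far (only here to make the laziness visible).
    int decodedCount() { return decoded.size(); }

    // Decode one integer field on demand and cache the result.
    Integer intField(int index) {
        Integer cached = decoded.get(index);
        if (cached != null) {
            return cached;
        }
        if (parts == null) {
            parts = raw.split(",", -1);
        }
        int value = Integer.parseInt(parts[index].trim());
        decoded.put(index, value);
        return value;
    }
}

// Hypothetical inspector: the only place where a field actually gets decoded.
final class LazyRowInspector {
    Object getFieldData(Object row, int index) {
        return ((LazyRow) row).intField(index);
    }
}

// Hypothetical SerDe-like class: deserialize() just wraps the raw record.
final class LazySerDeSketch {
    Object deserialize(String record) {
        return new LazyRow(record);
    }

    public static void main(String[] args) {
        LazySerDeSketch serde = new LazySerDeSketch();
        LazyRowInspector inspector = new LazyRowInspector();
        Object row = serde.deserialize("10,20,30,40");
        System.out.println(inspector.getFieldData(row, 1));   // prints 20
        System.out.println(((LazyRow) row).decodedCount());   // prints 1
    }
}
```

With this shape, a query that touches only one column pays for decoding only that column, even though the SerDe itself never learns which columns the query needs.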


On Tue, Feb 21, 2012 at 7:37 AM, Koert Kuipers <[EMAIL PROTECTED]> wrote:

> 1) Is there a way in initialize() of a SerDe to know if it is being used
> as a Serializer or a Deserializer? If not, can I define the Serializer and
> Deserializer separately instead of defining a SerDe (so I have two
> initialize methods)?
> 2) Is there a way to find out which columns are being used? Say someone
> does select a,b,c from test, and my SerDe gets initialized for use in
> that query: how can I know that only a,b,c are needed? I would like to
> take advantage of this information so I don't deserialize unnecessary
> information, without having to resort to more complex lazy deserialization
> tactics.