Pig >> mail # dev >> Some questions on intermediate serialization in Pig

Re: Some questions on intermediate serialization in Pig
Maverick goes in RECORD_1, Goose goes in RECORD_2 and Goose's dipshit
ejection seat goes in RECORD_3. 1 has crooked teeth. 2 is a bloody
corpse. And 3... well 3 is to blame for it all.

Russell Jurney http://datasyndrome.com

On May 23, 2012, at 4:51 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> I'm trying to understand how intermediate serialization in Pig works at a
> deeper level (understanding the whole code path, not just BinInterSedes in
> its own vacuum). Right now I am looking at
> InterRecordReader/InterRecordWriter/InterStorage. Is that the right place
> to look for understanding how BinInterSedes is actually called?
>
> Further, I'm trying to better understand the RECORD_1/RECORD_2/RECORD_3
> thing. My guess is that it's to make the file splittable? But I'm not
> really sure. I'd love any pointers about where to look for how
> BinInterSedes is used, and how intermediate storage happens.
>
> Thanks!
> Jon
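
For readers following the splittability guess above, here is a minimal sketch of how a fixed marker sequence can let a reader start at an arbitrary byte offset and still find a record boundary. It is not taken from the Pig source: the class name SyncMarkerSketch, the marker byte values, and both helper methods are hypothetical and chosen only for illustration. It also ignores the case where the marker bytes occur inside a record's payload, which a real format has to handle.

```java
import java.util.ArrayList;
import java.util.List;

public class SyncMarkerSketch {
    // Hypothetical marker values for illustration only; not read from the Pig source.
    static final byte RECORD_1 = 0x01;
    static final byte RECORD_2 = 0x02;
    static final byte RECORD_3 = 0x03;

    /**
     * Scans forward from 'from' for the three-byte marker and returns the
     * index of the first payload byte after it, or -1 if no marker is found.
     */
    static int findNextRecord(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + 2 < data.length; i++) {
            if (data[i] == RECORD_1 && data[i + 1] == RECORD_2 && data[i + 2] == RECORD_3) {
                return i + 3;
            }
        }
        return -1;
    }

    /**
     * Returns the payload start offsets of every record whose marker begins
     * inside [splitStart, splitEnd). A record belongs to the split in which
     * its marker starts, so adjacent splits never read the same record twice.
     */
    static List<Integer> recordsInSplit(byte[] data, int splitStart, int splitEnd) {
        List<Integer> starts = new ArrayList<>();
        int pos = findNextRecord(data, splitStart);
        while (pos != -1 && pos - 3 < splitEnd) {
            starts.add(pos);
            pos = findNextRecord(data, pos);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Three records, each prefixed by the marker; the split point (7)
        // falls in the middle of the second record's marker.
        byte[] data = {1, 2, 3, 'a', 'b', 1, 2, 3, 'c', 1, 2, 3, 'd', 'e'};
        System.out.println(recordsInSplit(data, 0, 7));
        System.out.println(recordsInSplit(data, 7, data.length));
    }
}
```

The point of the sketch is that a reader handed the second split does not need to know where records begin: it skips bytes until it sees the full marker, and the reader of the first split keeps going past its nominal end to finish the record whose marker it already saw.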