Pig, mail # user - mapred.min.split.size


Re: mapred.min.split.size
Corbin Hoenes 2010-08-05, 20:21
So what does Pig do when I have a 5 GB file? Does it simply hardcode the split size to the block size? Is there no way to tell it to operate on a larger split size?
On Jul 27, 2010, at 3:41 PM, Richard Ding wrote:

> For Pig loaders, each split can contain at most one file, no matter what the split size is.
>
> You can concatenate the input files before loading them.
>
> Thanks,
> -Richard
> -----Original Message-----
> From: Corbin Hoenes [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 27, 2010 2:09 PM
> To: [EMAIL PROTECTED]
> Subject: mapred.min.split.size
>
> Is there a way to set the mapred.min.split.size property in Pig? I set it, but it doesn't seem to have changed the mappers' HDFS_BYTES_READ counter. My mappers are finishing in ~10 seconds, and I have ~20,000 of them.
>
>
>
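To make the arithmetic behind Richard's answer concrete, here is a rough Python sketch of how many splits (and so mappers) to expect. It assumes the loader follows the stock Hadoop FileInputFormat rule, split size = max(minSize, min(maxSize, blockSize)), and that a split never spans two files. Whether Pig's loaders of this era honor mapred.min.split.size for a single large file is exactly the open question in this thread, so treat the 5 GB result as what plain Hadoop would do, not as confirmed Pig behavior. The 64 MB block size, the 1 MB file size, and the function names are assumptions made up for illustration, and Hadoop's small slop factor on the last split is ignored.

```python
MB = 1024 * 1024

def split_size(block_size, min_size=1, max_size=2**63 - 1):
    # Stock FileInputFormat rule: max(minSize, min(maxSize, blockSize)).
    return max(min_size, min(max_size, block_size))

def estimate_splits(file_sizes, block_size, min_size=1):
    # A split never spans files, so every non-empty file
    # contributes at least one split, however small it is.
    size = split_size(block_size, min_size)
    # -(-n // size) is integer ceiling division.
    return sum(-(-n // size) for n in file_sizes if n > 0)

BLOCK = 64 * MB  # assumed HDFS block size

# One 5 GB file: a larger minimum split size reduces the split count.
one_big_file = [5 * 1024 * MB]
print(estimate_splits(one_big_file, BLOCK))                     # block-sized splits
print(estimate_splits(one_big_file, BLOCK, min_size=512 * MB))  # larger splits

# 20,000 small files (1 MB each, hypothetical): the minimum split size
# changes nothing, because each file still needs its own split.
many_small_files = [1 * MB] * 20000
print(estimate_splits(many_small_files, BLOCK, min_size=512 * MB))
```

Under these assumptions, raising the minimum split size only helps when individual files are larger than a block. With ~20,000 small inputs the mapper count stays at ~20,000, which is why concatenating the files before loading is the suggested fix.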