Hive user mailing list: Real-life experience of forcing smaller input splits?

David Morel 2013-01-25, 06:16
Re: Real-life experience of forcing smaller input splits?
Hi David,

What file format and compression type are you using?


On 25 Jan 2013, at 07:16, David Morel <[EMAIL PROTECTED]> wrote:

> Hello,
> I have seen many posts on various sites and mailing lists, but haven't
> found a firm answer anywhere: is it possible, yes or no, to force a
> smaller split size than a block on the mappers, from the client side?
> I'm not after pointers to the docs (unless you're very, very sure :-)
> but after real-life experience along the lines of 'yes, it works this
> way, I've done it like this...'
> All the parameters that I could find (especially those specifying a max
> input split size) seem to have no effect, and the files that I have are
> so heavily compressed that they completely saturate the mappers' memory
> when processed.
> A solution I could imagine for this specific issue is reducing the
> block size, but for now I simply went with disabling in-file
> compression for those files. Changing the block size on a per-file
> basis is something I'd like to avoid if at all possible.
> All the Hive settings that we tried only got me as far as raising the
> number of mappers from 5 to 6 (yay!), whereas I would have needed at
> least ten times more.
> Thanks!
> D.Morel
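For context, the split-size settings usually tried in this situation look like the following. This is a minimal sketch, assuming a Hadoop 1.x-era deployment (the `mapred.*` names were later renamed to `mapreduce.input.fileinputformat.split.maxsize` and friends); the exact values are illustrative, not a recommendation:

```sql
-- Cap each input split at 64 MB instead of the full HDFS block size
-- (illustrative value; has no effect on non-splittable compressed files):
SET mapred.max.split.size=67108864;
SET mapred.min.split.size.per.node=1;
SET mapred.min.split.size.per.rack=1;
-- CombineHiveInputFormat (Hive's default) merges small files into larger
-- splits; switching to plain HiveInputFormat avoids that merging:
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
```

Note that with a non-splittable codec such as gzip, one file is always one split regardless of these settings, which is consistent with the behaviour described in the thread.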
Replies in this thread:
Nitin Pawar 2013-01-25, 06:47
Edward Capriolo 2013-01-25, 07:46
Bertrand Dechoux 2013-01-25, 09:37
David Morel 2013-01-25, 09:53
David Morel 2013-01-25, 12:28
Dean Wampler 2013-01-25, 13:39
Edward Capriolo 2013-01-25, 07:44