Re: AW: How to split a big file in HDFS by size
Marcos Ortiz 2011-06-20, 15:39
Evert Lammerts at Sara.nl did something similar to your problem: splitting
a big 2.7 TB file into chunks of 10 GB.
This work was presented at the BioAssist Programmers' Day in January of
this year under the title
"Large-Scale Data Storage and Processing for Scientist in The Netherlands".
P.S.: I sent the message with a copy to him.
On 6/20/2011 10:38 AM, Niels Basjes wrote:
> On Mon, Jun 20, 2011 at 16:13, Mapred Learn<[EMAIL PROTECTED]> wrote:
>> But this file is a gzipped text file, so in this case it will only go to 1 mapper. If it were
>> split into 60 1 GB files, the map-reduce job would finish earlier than with one 60 GB file,
>> since it would have 60 mappers running in parallel. Isn't that so?
> Yes, that is very true.
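Since gzip is not a splittable codec, the whole file goes to a single mapper. A common workaround, sketched below with illustrative paths and chunk sizes (none of these names come from the thread), is to split the uncompressed text at line boundaries, gzip each chunk independently, and upload the chunks so each one becomes its own map task:

```shell
set -e
# Create a small demo input file (stand-in for the real 60 GB file).
seq 1 500000 > bigfile.txt

# Split at line boundaries into ~1 MB chunks (use -C 1G for real data);
# -C guarantees no record is cut in half, -d gives numeric suffixes.
split -C 1M -d bigfile.txt chunk_

# Compress each chunk independently; each .gz is a standalone stream,
# so every chunk can feed its own mapper.
gzip chunk_*

# Upload to HDFS if the hadoop CLI is available (path is hypothetical);
# skipped gracefully here when hadoop is not installed.
command -v hadoop >/dev/null && hadoop fs -put chunk_*.gz /data/input/ || true

# Sanity check: decompressing all chunks in order reproduces the original.
zcat chunk_*.gz | cmp - bigfile.txt && echo "chunks match original"
```

With 60 such 1 GB chunks in the input directory, a MapReduce job gets 60 mappers running in parallel instead of one.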
Marcos Luís Ortíz Valmaseda
Software Engineer (UCI)