MapReduce dev mailing list
Problem using MapReduce to produce a big file
Hi list,

I want to produce a big input file (or a collection of big files) with an
MR job. My output is very simple, like the following:

1,text
2,text
3,text
......

The problem is that, across the whole output, the first column is a running
count. But if I have 16 map tasks, the output will be 16 files, like
xxx_m1_xxx
......
xxx_m15_xxx
I do not know how to guarantee that the first output file contains (if each
file has 2 records)

1,text
2,text

and the second contains
3,text
4,text

so that I can combine them into
1,text
2,text
3,text
4,text
What I am thinking is: can I find out the position of the current map task
when the mapper is set up? For example, if the map tasks form an array, can I
get the index of the current map task through the map task context?
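
To make the idea concrete, here is a minimal sketch of how the first column could be derived from a task index. It assumes every map task writes the same, known number of records (2, as in the example above) and that task indexes start at 0. In a real mapper the index could probably be read in setup() with context.getTaskAttemptID().getTaskID().getId(); here it is passed in as a plain int so the sketch runs without Hadoop. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskOffsetSketch {

    // First counter value for a task, assuming 0-based task indexes
    // and a fixed number of records per task (both are assumptions).
    static long firstCount(int taskIndex, long recordsPerTask) {
        return (long) taskIndex * recordsPerTask + 1;
    }

    // Builds the "count,text" lines one task would write.
    static List<String> linesForTask(int taskIndex, int recordsPerTask) {
        List<String> lines = new ArrayList<>();
        long count = firstCount(taskIndex, recordsPerTask);
        for (int i = 0; i < recordsPerTask; i++) {
            lines.add((count + i) + ",text");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Simulate two map tasks, each writing 2 records.
        for (int task = 0; task < 2; task++) {
            for (String line : linesForTask(task, 2)) {
                System.out.println(line);
            }
        }
    }
}
```

With 2 records per task, task 0 produces counts 1 and 2 and task 1 produces counts 3 and 4, so concatenating the files in task order gives a continuous count. This only works if the number of records per task is fixed in advance; with uneven task sizes a second pass would be needed to compute the offsets.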

Can anyone help?