MapReduce, mail # user - Split the File using mapreduce


Ranjini Rathinam 2013-12-27, 13:26
Nitin Pawar 2013-12-27, 13:38
Re: Split the File using mapreduce
Yanbo Liang 2013-12-27, 14:53
Have you installed Hive on your Hadoop cluster?
If so, Hive SQL is probably the simplest and most efficient way to do this.
Otherwise, you can write a MapReduce program using
org.apache.hadoop.mapred.lib.MultipleOutputFormat, so that the output from the
Reducer can be written to more than one file.
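
For the MapReduce route, the per-record work is just cutting each line after the eighth comma. Below is a minimal sketch of that step in plain Java, with no Hadoop dependency so it can be tried on its own. The class name FieldSplitter and its methods are hypothetical, not part of Hadoop. It assumes a simple CSV with no quoted or embedded commas. In a real job, the two halves would be emitted from the map (or reduce) function to two different outputs, for example through the multiple-output classes in org.apache.hadoop.mapred.lib; that wiring is not shown here.

```java
// Hypothetical helper, not part of Hadoop: splits one CSV record into two halves.
class FieldSplitter {

    /**
     * Splits a comma-separated line into two comma-separated strings:
     * the first firstCount fields, and all remaining fields.
     * Assumes fields contain no quoted or embedded commas.
     */
    static String[] split(String line, int firstCount) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(",", -1);
        if (firstCount < 0 || firstCount > fields.length) {
            throw new IllegalArgumentException(
                "expected at least " + firstCount + " fields, got " + fields.length);
        }
        String head = join(fields, 0, firstCount);
        String tail = join(fields, firstCount, fields.length);
        return new String[] { head, tail };
    }

    // Joins fields[from..to) with commas.
    private static String join(String[] fields, int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            if (i > from) {
                sb.append(',');
            }
            sb.append(fields[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up 16-field sample record for illustration only.
        String[] parts = split("1,Ann,10,HR,5,Addr,Acme,555,777,P1,r,a,b,c,d,e", 8);
        System.out.println(parts[0]); // first eight fields
        System.out.println(parts[1]); // remaining eight fields
    }
}
```

Inside a mapper, the same call would run once per input line, with parts[0] written to one output and parts[1] to the other.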
2013/12/27 Nitin Pawar <[EMAIL PROTECTED]>

> 1) If you have a CSV file and do this often, and want to avoid writing a lot
> of code, create a Hive table with a "," delimiter, then select the columns
> you want from the table and write them to a file.
>
> 2) If you are good at scripting, look at Pig, and write the results to
> files.
>
> 3) If you want to do it with a MapReduce program of your own, take a look at
> MultipleOutputFormat and TextInputFormat.
>
>
> On Fri, Dec 27, 2013 at 6:56 PM, Ranjini Rathinam <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I have a file with 16 fields, such as
>> id,name,sa,dept,exp,address,company,phone,mobile,project,redk,........ and so on.
>>
>> My scenario is to split the first eight attributes into one file and
>> the other eight attributes into another file using a MapReduce program.
>>
>> So the first eight attributes and their values go in one file as
>> id,name,sa,dept,exp,address,company,phone
>>
>> and the rest of the attributes and their values go in another file, using a
>> MapReduce program.
>>
>> I am using Hadoop version 0.20 and Java 1.6.
>> Thanks in advance
>>
>> Regards,
>> Ranjini.R
>
> --
> Nitin Pawar
>