MapReduce user mailing list: How to Create an effective chained MapReduce program.


ilyal levin 2011-09-05, 15:49
Joey Echeverria 2011-09-05, 16:41
ilyal levin 2011-09-05, 19:21
Roger Chen 2011-09-05, 19:50
ilyal levin 2011-09-05, 22:33
ilyal levin 2011-09-05, 23:53
Joey Echeverria 2011-09-06, 00:16
Niels Basjes 2011-09-06, 05:57
ilyal levin 2011-09-06, 07:16
David Rosenstrauch 2011-09-06, 20:26
ilyal levin 2011-09-07, 22:10
David Rosenstrauch 2011-09-07, 22:17
Re: How to Create an effective chained MapReduce program.
You might find it easier to understand this if you use one of the
job-scripting tools like Oozie or Hamake. They put the whole
assemblage of jobs into one file.

On Wed, Sep 7, 2011 at 3:17 PM, David Rosenstrauch <[EMAIL PROTECTED]> wrote:

> * open a SequenceFile.Reader on the sequence file
> * in a loop, call next(key,val) on the reader to read the next key/val pair
> in the file (see: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html#next(org.apache.hadoop.io.Writable,%20org.apache.hadoop.io.Writable))
> * write code to format the key & val into whatever appropriate format you
> want, and write them to the console
> * when next(key,val) returns false, exit the loop
>
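A minimal sketch of the steps quoted above, assuming the file's key and value
classes implement Writable; the class name SequenceFileDumper and the
tab-separated console output are illustrative, not part of the original
suggestion:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SequenceFileDumper {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path(args[0]);  // e.g. a part file written by the previous job

        // Open a SequenceFile.Reader on the sequence file.
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
            // Instantiate key/value objects of whatever classes the file declares.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable val = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);

            // next(key, val) fills both objects and returns false at end of file.
            while (reader.next(key, val)) {
                // Format key & value however you like and write them to the console.
                System.out.println(key + "\t" + val);
            }
        } finally {
            reader.close();
        }
    }
}

You would run it with hadoop jar against a part file from the previous job's
output directory (the jar and path names here are hypothetical), e.g.:
hadoop jar my-tools.jar SequenceFileDumper /user/me/job-output/part-00000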
> HTH,
>
> DR
>
>
> On 09/07/2011 06:10 PM, ilyal levin wrote:
>
>> Can you be more specific about how to do this? In general, is there a way
>> to convert the binary files I have into text files?
>>
>>
>>
>> On Tue, Sep 6, 2011 at 11:26 PM, David Rosenstrauch <[EMAIL PROTECTED]> wrote:
>>
>>  On 09/06/2011 01:57 AM, Niels Basjes wrote:
>>>
>>>  Hi,
>>>>
>>>> In the past I've had the same situation, where I needed the data for
>>>> debugging. Back then I chose to create a second job consisting simply of
>>>> SequenceFileInputFormat, IdentityMapper, IdentityReducer, and finally
>>>> TextOutputFormat.
>>>>
>>>> In my situation that worked great for my purpose.
>>>>
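A minimal sketch of that kind of pass-through job, using the old mapred API
that ships IdentityMapper and IdentityReducer; the driver class name, the
Text key/value classes, and the command-line paths are assumptions to adapt
to whatever the first job actually writes:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class SeqToTextJob {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SeqToTextJob.class);
        conf.setJobName("seqfile-to-text");

        // Read the binary SequenceFile output of the previous job...
        conf.setInputFormat(SequenceFileInputFormat.class);
        // ...and write the same records back out as plain text.
        conf.setOutputFormat(TextOutputFormat.class);

        // Identity map/reduce: records pass through unchanged,
        // only the on-disk format changes.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);

        // Assumed key/value classes; use whatever the first job emitted.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}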
>>>>
>>> I did something similar at my last job, but rather than writing a second
>>> map/reduce job for this, we just wrote a simple command-line app that used
>>> the Hadoop Java API to dump the contents of the binary file as text (JSON)
>>> to the console.
>>>
>>> HTH,
>>>
>>> DR
>>>
>>
--
Lance Norskog
[EMAIL PROTECTED]
ilyal levin 2011-09-08, 09:24