MapReduce >> mail # user >> How to Create an effective chained MapReduce program.


ilyal levin 2011-09-05, 15:49
Joey Echeverria 2011-09-05, 16:41
ilyal levin 2011-09-05, 19:21
Roger Chen 2011-09-05, 19:50
ilyal levin 2011-09-05, 22:33
ilyal levin 2011-09-05, 23:53
Joey Echeverria 2011-09-06, 00:16
Niels Basjes 2011-09-06, 05:57
ilyal levin 2011-09-06, 07:16
David Rosenstrauch 2011-09-06, 20:26
ilyal levin 2011-09-07, 22:10
David Rosenstrauch 2011-09-07, 22:17
Re: How to Create an effective chained MapReduce program.
You might find this easier to understand if you use one of the
low-level job-scripting languages like Oozie or Hamake. They put the whole
assemblage of stuff into one file.

On Wed, Sep 7, 2011 at 3:17 PM, David Rosenstrauch <[EMAIL PROTECTED]> wrote:

> * open a SequenceFile.Reader on the sequence file
> * in a loop, call next(key,val) on the reader to read the next key/val pair
> in the file (see:
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html#next%28org.apache.hadoop.io.Writable,%20org.apache.hadoop.io.Writable%29)
> * write code to format the key & val into whatever appropriate format you
> want, and write them to the console
> * when next(key,val) returns false, exit the loop
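
The loop described in the steps above can be sketched roughly as follows. This is only an illustration of the control flow, not the code David's team wrote. To keep it runnable without Hadoop on the classpath, `PairReader` is a hypothetical stand-in for `SequenceFile.Reader`, and the class name `SequenceDump`, the `fromList` helper, and the tab-separated output format are all assumptions. With the real API the two holders would be `Writable` instances (for example, created from the classes returned by `reader.getKeyClass()` and `reader.getValueClass()`).

```java
import java.io.PrintStream;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SequenceDump {

    /**
     * Hypothetical stand-in for SequenceFile.Reader. Like the real
     * next(key, val), it fills the two holders with the next pair and
     * returns false once the end of the file is reached.
     */
    public interface PairReader {
        boolean next(StringBuilder key, StringBuilder val);
    }

    /**
     * Reads pairs until next() returns false and writes each one as a
     * "key<TAB>value" line. Returns the number of pairs written.
     */
    public static int dump(PairReader reader, PrintStream out) {
        StringBuilder key = new StringBuilder();
        StringBuilder val = new StringBuilder();
        int count = 0;
        while (reader.next(key, val)) {
            // Format the pair however you like here; tab-separated is an assumption.
            out.println(key + "\t" + val);
            count++;
        }
        return count;
    }

    /** In-memory reader over fixed pairs, used only to exercise dump(). */
    public static PairReader fromList(List<String[]> pairs) {
        Iterator<String[]> it = pairs.iterator();
        return (key, val) -> {
            if (!it.hasNext()) {
                return false;
            }
            String[] pair = it.next();
            // Reuse the holders, as the Hadoop reader reuses its Writables.
            key.setLength(0);
            key.append(pair[0]);
            val.setLength(0);
            val.append(pair[1]);
            return true;
        };
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
                new String[] {"a", "1"},
                new String[] {"b", "2"});
        dump(fromList(pairs), System.out);
    }
}
```

In a real tool, `dump` would be handed a reader opened on the sequence file, and the formatting line is where a JSON encoding could replace the tab-separated output.
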
>
> HTH,
>
> DR
>
>
> On 09/07/2011 06:10 PM, ilyal levin wrote:
>
>> Can you be more specific about how to do this? In general, is there a way
>> to convert the binary files I have to text files?
>>
>>
>>
>> On Tue, Sep 6, 2011 at 11:26 PM, David Rosenstrauch <[EMAIL PROTECTED]> wrote:
>>
>>> On 09/06/2011 01:57 AM, Niels Basjes wrote:
>>>
>>>> Hi,
>>>>
>>>> In the past I've had the same situation where I needed the data for
>>>> debugging. Back then I chose to create a second job with simply
>>>> SequenceFileInputFormat, IdentityMapper, IdentityReducer and finally
>>>> TextOutputFormat.
>>>>
>>>> In my situation that worked great for my purpose.
>>>
>>> I did something similar at my last job, but rather than writing a second
>>> map/reduce job for this, we just wrote a simple command-line app that used
>>> the Hadoop Java API to dump the contents of the binary file as text (JSON)
>>> to the console.
>>>
>>> HTH,
>>>
>>> DR
>>
--
Lance Norskog
[EMAIL PROTECTED]
ilyal levin 2011-09-08, 09:24