HDFS >> mail # user >> Re: Job config before read fields


Re: Job config before read fields
I think you have to override/extend the comparator to achieve that,
similar to what is done for a secondary sort.
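
A rough sketch of that idea, in plain Java so it runs without Hadoop on
the classpath. The stack trace quoted below goes through
WritableComparator.compare, which is where the keys get deserialized
during the sort; a comparator that works on the serialized bytes avoids
calling readFields at that point. The key layout (a 4-byte int followed
by UTF-8 text) and the names RawKeyOrder, serialize and compareRaw are
made up for illustration and are not from the original code. In a real
job this logic would presumably go into a WritableComparator subclass
overriding compare(byte[], int, int, byte[], int, int), registered with
Job.setSortComparatorClass.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical key layout (an assumption for this sketch): a 4-byte
// big-endian int "group" followed by the UTF-8 bytes of a "name".
final class RawKeyOrder {

    // Stand-in for what the key's write(DataOutput) would emit.
    static byte[] serialize(int group, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + nameBytes.length)
                .putInt(group)
                .put(nameBytes)
                .array();
    }

    // Decode a big-endian int starting at offset s.
    static int readInt(byte[] b, int s) {
        return ((b[s] & 0xff) << 24)
                | ((b[s + 1] & 0xff) << 16)
                | ((b[s + 2] & 0xff) << 8)
                | (b[s + 3] & 0xff);
    }

    // Same parameter shape as the raw compare method in Hadoop's
    // WritableComparator: (buffer, start, length) for each key.
    // Nothing is deserialized into key objects here.
    static int compareRaw(byte[] b1, int s1, int l1,
                          byte[] b2, int s2, int l2) {
        int g1 = readInt(b1, s1);
        int g2 = readInt(b2, s2);
        if (g1 != g2) {
            return Integer.compare(g1, g2);
        }
        // Same group: compare the remaining bytes as unsigned values.
        int n1 = l1 - 4;
        int n2 = l2 - 4;
        int n = Math.min(n1, n2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + 4 + i] & 0xff;
            int y = b2[s2 + 4 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // One name is a prefix of the other: shorter sorts first.
        return n1 - n2;
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        keys.add(serialize(2, "a"));
        keys.add(serialize(1, "c"));
        keys.add(serialize(1, "b"));
        keys.sort((x, y) -> compareRaw(x, 0, x.length, y, 0, y.length));
        for (byte[] k : keys) {
            String name = new String(k, 4, k.length - 4, StandardCharsets.UTF_8);
            System.out.println(readInt(k, 0) + " " + name);
        }
    }
}
```

If the comparison itself depends on the metadata, I believe a comparator
that implements Configurable is handed the job configuration when Hadoop
instantiates it, but that is worth checking against the Hadoop version
in use.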

Regards,
Shahab
On Fri, Aug 30, 2013 at 9:01 PM, Adrian CAPDEFIER <[EMAIL PROTECTED]> wrote:

> Howdy,
>
> I apologise for the lack of code in this message, but the code is fairly
> convoluted and it would obscure my problem. That being said, I can put
> together some sample code if really needed.
>
> I am trying to pass some metadata between the map & reduce steps. This
> metadata is read and generated in the map step and stored in the job
> config. It also needs to be recreated on the reduce node before the
> key/value fields can be read in the readFields function.
>
> I had assumed that I would be able to override the Reducer.setup()
> function and that would be it, but apparently the readFields function is
> called before the Reducer.setup() function.
>
> My question is: what is the best place (or any place) on the reduce node
> where I can access the job configuration/context before the readFields
> function is called?
>
> This is the stack trace:
>
>         at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1111)
>         at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:70)
>         at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1399)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:699)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
>