Chris Nauroth 2012-10-15, 18:37
I think it would work, but I'm wondering if it would be easier for your
application to restructure the keys emitted from the mapper tasks so that
you can take advantage of the sorting inherently done during the shuffle.
For each reduce task, your reducer code will receive keys emitted from
mappers in sorted order. Therefore, if the keys emitted from your mapper
contain the item's priority, then the shuffle would provide the sort order
that you need. This might lead you down the path of writing a custom
WritableComparable to use as the map output key, but this is usually pretty
straightforward.
Also, keep in mind that if you run multiple reduce tasks, then each reducer
receives only a subset of the keys emitted from the mappers, so the sorted
order holds within each reducer rather than globally. Depending on your
application logic, this may or may not be a problem.
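To make the composite-key idea concrete, here is a minimal sketch of such a key. The class name PriorityKey and its fields (priority, itemId) are hypothetical and only for illustration. To keep the sketch compilable without Hadoop on the classpath, it implements plain Comparable; in a real job the class would instead be declared as implementing org.apache.hadoop.io.WritableComparable<PriorityKey>, whose write, readFields and compareTo methods have the same signatures as shown.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Objects;

// Hypothetical composite key: priority first, then an item id as tie-breaker.
// Assumption: in a real job this would be declared
//   implements org.apache.hadoop.io.WritableComparable<PriorityKey>
final class PriorityKey implements Comparable<PriorityKey> {
    private int priority;
    private String itemId = "";

    // No-arg constructor so the framework can create an instance
    // and then populate it through readFields.
    PriorityKey() {}

    PriorityKey(int priority, String itemId) {
        this.priority = priority;
        this.itemId = itemId;
    }

    int getPriority() { return priority; }
    String getItemId() { return itemId; }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(priority);
        out.writeUTF(itemId);
    }

    // Deserialize the fields in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        priority = in.readInt();
        itemId = in.readUTF();
    }

    // Lower priority value sorts first; item id breaks ties.
    @Override
    public int compareTo(PriorityKey other) {
        int byPriority = Integer.compare(priority, other.priority);
        return byPriority != 0 ? byPriority : itemId.compareTo(other.itemId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PriorityKey)) return false;
        PriorityKey k = (PriorityKey) o;
        return priority == k.priority && itemId.equals(k.itemId);
    }

    // The default HashPartitioner picks the reducer from the key's hashCode,
    // so it should be consistent with equals.
    @Override
    public int hashCode() {
        return Objects.hash(priority, itemId);
    }
}
```

With a key like this as the map output key, each reducer would see its keys in ascending priority order. If you need descending order, flip the comparison in compareTo.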
On Mon, Oct 15, 2012 at 11:07 AM, Aseem Anand <[EMAIL PROTECTED]> wrote:
> Hi Chris,
> I had a few PriorityQueues at the mappers which I wished to send to some
> reducers. After this, each reducer (receiving PriorityQueues from each
> mapper) would perform some operations on these by removing the top and
> hence accessing the elements in sorted order (which is essential to my
> application). I also thought of pushing them into an ArrayWritable but was
> wondering if there would be an existing implementation of PriorityQueue.
> Would it be advisable to insert the elements into an ArrayWritable in sorted
> order and reconstruct the merged PriorityQueues at the other end?
> On Mon, Oct 15, 2012 at 11:07 PM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
>> Hello Aseem,
>> I'm aware of nothing in Hadoop or related projects that provides a
>> PriorityQueueWritable. You could achieve this by taking some existing
>> priority queue class and subclassing it or wrapping it to implement the
>> Writable.write and Writable.readFields methods.
>> If you could give us some additional context around what you want to
>> solve, then we might be able to offer some other suggestions. For example,
>> depending on the problem, maybe you could sort values and wrap them in
>> ArrayWritable (which already exists), which would save you the trouble of
>> coding your own custom Writable.
>> Thank you,
>> On Mon, Oct 15, 2012 at 9:56 AM, Aseem Anand <[EMAIL PROTECTED]> wrote:
>>> Is anyone familiar with a PriorityQueueWritable that can be used to pass
>>> data from mappers to reducers?
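For reference, the wrapping approach mentioned in the thread (a priority queue class that implements write and readFields) could look roughly like the sketch below. The class name IntPriorityQueueWritable and the choice of Integer elements are assumptions made for illustration. The sketch compiles without Hadoop; in a real job the class would be declared as implementing org.apache.hadoop.io.Writable, whose two methods have the signatures shown.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.PriorityQueue;

// Hypothetical wrapper around java.util.PriorityQueue<Integer>.
// Assumption: in a real job this would be declared
//   implements org.apache.hadoop.io.Writable
final class IntPriorityQueueWritable {
    private final PriorityQueue<Integer> queue = new PriorityQueue<>();

    void add(int value) { queue.add(value); }

    // Removes and returns the smallest element, or null if the queue is empty.
    Integer poll() { return queue.poll(); }

    int size() { return queue.size(); }

    // Merge another queue into this one, e.g. queues from several mappers.
    void merge(IntPriorityQueueWritable other) { queue.addAll(other.queue); }

    // Write the element count, then each element.
    // Iteration follows the internal heap layout, not sorted order;
    // that is fine because readFields rebuilds the heap.
    public void write(DataOutput out) throws IOException {
        out.writeInt(queue.size());
        for (int value : queue) {
            out.writeInt(value);
        }
    }

    // Clear first: Hadoop may reuse the same instance for several records.
    public void readFields(DataInput in) throws IOException {
        queue.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            queue.add(in.readInt());
        }
    }
}
```

On the reducer side, calling merge for each received queue and then polling repeatedly would yield the combined elements in ascending order.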