MapReduce >> mail # user >> Re: Map Shuffle Bytes


Re: Map Shuffle Bytes
This isn't called a 'shuffle'; it is a plain remote read, which is why
your original question was confusing. Thanks for clarifying!

In that case, you could count the bytes coming in from the required
record reader. For example, the LineRecordReader (used by
TextInputFormat) emits a LongWritable key that denotes the current byte
offset in the file, which you could use as a simple, progressing
counter of bytes read thus far.
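
As a rough illustration of the offset-as-counter idea, here is a minimal
sketch in plain Java with no Hadoop dependency. The class name
SplitByteTracker and its methods are hypothetical, and it assumes each
record ends in a single '\n'. In a real Mapper, the returned delta would
be passed to something like context.getCounter(group, name).increment(delta).

```java
/**
 * Hypothetical helper: tracks bytes consumed from one input split,
 * using each record's byte offset (the value of the offset key) as
 * the progress signal.
 */
final class SplitByteTracker {
    private final long splitStart; // byte offset where this split begins
    private long bytesRead;        // bytes consumed so far, relative to splitStart

    SplitByteTracker(long splitStart) {
        this.splitStart = splitStart;
    }

    /**
     * @param offsetKey byte offset of the record's first byte
     * @param lineBytes record length in bytes, excluding the line terminator
     * @return the number of newly consumed bytes, i.e. the amount to add
     *         to a job counter for this record
     */
    long record(long offsetKey, int lineBytes) {
        // Assumption: exactly one terminator byte ('\n') follows each record.
        long endOfRecord = offsetKey + lineBytes + 1;
        long newTotal = endOfRecord - splitStart;
        long delta = newTotal - bytesRead;
        bytesRead = newTotal;
        return delta;
    }

    long bytesRead() {
        return bytesRead;
    }
}
```

Feeding it the offsets 0, 6 and 11 with line lengths 5, 4 and 5 yields
deltas of 6, 5 and 6 bytes. Note that this counts bytes read by the
record reader, whether they were local or remote.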

On Wed, Dec 26, 2012 at 5:16 PM, Eduard Skaley <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I mean TO the mappers. I'm using the CompositeInputFormat for my application
> to compute map-side joins.
> I want to join two datasets, A and B; one is stored on node 1 and the other
> on node 2.
> For example if the join will be computed on node 2 then the inputsplit of
> the dataset which is stored on node 1 has to be transferred to node 2.
> I want to count the bytes which are shuffled (transferred) TO the mapper of
> node 2.
>
>> Hi,
>>
>> What do you mean by "shuffled bytes [to] the mappers"? If you mean
>> "from", it is "Reduce shuffle bytes" you look for; otherwise, you may
>> be looking for the per-map counter of "Map output bytes".
>>
>> Per-partition counters can be constructed on the user side if needed,
>> by pre-computing the partition before emit (using the same
>> partitioner) and counting up the bytes of your objects for its
>> counter.
>>
>> On Tue, Dec 25, 2012 at 6:03 PM, Eduard Skaley <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> Hello guys,
>>>
>>> I need a counter for shuffled bytes to the mappers.
>>> Is there an existing one, or should I define one myself?
>>> How can I implement such a counter?
>>>
>>> Thank you and happy Christmas time,
>>> Eduard
>>
>>
>>
>
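
The per-partition counting suggested in the quoted reply above could be
sketched roughly as follows, again in plain Java without Hadoop on the
classpath. PartitionByteCounter is a hypothetical name. The partition
formula mirrors Hadoop's default HashPartitioner, but String.hashCode()
is not the same as the hash of a Writable key such as Text, so a real
job should call its own Partitioner.getPartition(key, value, numPartitions)
on the actual key. Sizes here are UTF-8 lengths, which only approximate
serialized sizes.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: tallies emitted bytes per target partition. */
final class PartitionByteCounter {
    private final long[] bytesPerPartition;

    PartitionByteCounter(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.bytesPerPartition = new long[numPartitions];
    }

    /** Same arithmetic as the default hash partitioner: non-negative hash mod partitions. */
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /**
     * Call this just before emitting (key, value).
     * @return the partition the pair was counted against
     */
    int count(String key, String value) {
        int partition = partitionFor(key, bytesPerPartition.length);
        // Assumption: UTF-8 length stands in for the serialized record size.
        long size = key.getBytes(StandardCharsets.UTF_8).length
                  + value.getBytes(StandardCharsets.UTF_8).length;
        bytesPerPartition[partition] += size;
        return partition;
    }

    long bytesFor(int partition) {
        return bytesPerPartition[partition];
    }
}
```

At the end of the map task, each tally could be published as its own
counter, for example one counter name per partition index.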

--
Harsh J