Re: how to figure out the range of a split that failed?
Thanks for the tip.
I have actually already tried your method; the command I wrote is shown below:

cerr << "reporter:counter:SkippingTaskCounters,MapProcessedRecords,1\n";

This actually produced some skipped records in the skip folder. The problem
is that the skipped records' text was all messed up, so I couldn't recycle
them. The broken text is at the end of this mail.
I don't know the reason. Maybe it's because I wrote some other information
to the error stream, such as the document ID, using the command below:

cerr << "Processing: " << docID << endl;

Anyway, if I make any progress, I will post an update here.
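Not from the original post, but for reference, here is a minimal sketch of how those two stderr lines might fit together in a C++ streaming mapper; the tab-delimited docID extraction and the stdout key/value format are illustrative assumptions.

#include <iostream>
#include <string>

int main() {
    std::string line;
    // Assumes line-oriented input on stdin, one record per line, with the
    // document ID as the first tab-separated field (illustrative).
    while (std::getline(std::cin, line)) {
        std::string docID = line.substr(0, line.find('\t'));

        // Free-form diagnostics on stderr just end up in the task log;
        // only lines matching the reporter: protocol are treated specially.
        std::cerr << "Processing: " << docID << std::endl;

        // ... per-record work goes here; emit key<TAB>value on stdout ...
        std::cout << docID << '\t' << 1 << '\n';

        // Acknowledge the record so skip mode can narrow the failing range
        // down to the records that were never acknowledged.
        std::cerr << "reporter:counter:SkippingTaskCounters,MapProcessedRecords,1"
                  << std::endl;
    }
    return 0;
}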

Broken Text (only the first line is recognizable; the rest of the dump was unreadable binary, reduced to a placeholder here):
SEQ ... org.apache.hadoop.io.LongWritable ... org.apache.hadoop.io.Text ... org.apache.hadoop.io.compress.DefaultCodec ...
[many lines of compressed binary data]
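A note not in the original mail, just a guess from that header: the SEQ magic number and the LongWritable/Text/DefaultCodec class names suggest the skip output is a compressed Hadoop SequenceFile rather than text corrupted by the extra stderr writes, so it will always look garbled in a plain-text viewer. If that is right, something like

hadoop fs -text <path-to-skip-file>

(path illustrative) should decode the records back to readable key/value text, since the -text subcommand understands SequenceFiles.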
2010/7/6 Sharad Agarwal <[EMAIL PROTECTED]>

> To be precise, you have to write to the error stream:
> for map:
> reporter:counter:SkippingTaskCounters,MapProcessedRecords,<count>
>
> for reduce:
> reporter:counter:SkippingTaskCounters,ReduceProcessedGroups,<count>
>
>
> edward choi wrote:
>
>> Thanks for the response. I went to the web page you pointed me to and
>> several other pages that I found.
>> I am still not sure if I got it right.
>> If I am trying to increment COUNTER_MAP_PROCESS_RECORDS using Hadoop
>> Streaming, is the example below the way to do it? (assuming that I am
>> using C++)
>>
>> example:
>> cerr << "reporter:counter:counters,linecount,1" << endl;
>>
>>
>>
>