
Avro user mailing list: is this a bug?


RE: is this a bug?
Ey-Chih Chow 2011-03-10, 23:16

After I made the change mentioned in the previous message, the MR job ran.  However, this did not fix the problem I mentioned at the beginning of this thread.  I got the following trace from the reducer:
====================================================================================================
attempt_20110310145147365_0002_r_000000_0/syslog:2011-03-10 14:52:31,226 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000000000000000000000000000000000 whose rowKey is 0000000000000000000000000000000000000
attempt_20110310145315542_0002_r_000000_0/syslog:2011-03-10 14:53:59,010 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000000000000000000000000000000000 whose rowKey is 0000000000000000000000000000000000000
attempt_20110310145315542_0002_r_000000_0/syslog:2011-03-10 14:53:59,016 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000100000000000000000000000000001 whose rowKey is 0000000200000000000000000000000000002
attempt_20110310145315542_0002_r_000000_0/syslog:2011-03-10 14:53:59,017 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000200000000000000000000000000002 whose rowKey is 0000000300000000000000000000000000003
attempt_20110310145315542_0002_r_000000_0/syslog:2011-03-10 14:53:59,021 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000300000000000000000000000000003 whose rowKey is 0000000400000000000000000000000000004
attempt_20110310145315542_0002_r_000000_0/syslog:2011-03-10 14:53:59,023 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000400000000000000000000000000004 whose rowKey is 0000000500000000000000000000000000005
attempt_20110310145315542_0002_r_000000_0/syslog:2011-03-10 14:53:59,024 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000500000000000000000000000000005 whose rowKey is 0000000500000000000000000000000000005
====================================================================================================

If we add the following two lines to the reducer code:
====================================================================================================
boolean workAround = getConf().getBoolean(NgActivityGatheringJob.NG_AVRO_BUG_WORKAROUND, true);
Utf8 dupKey = (workAround) ? new Utf8(key.toString()) : key; // use dupKey instead of the key passed to the reducer
====================================================================================================

we get the following trace, which we consider the right behavior:
====================================================================================================
2011-03-10 15:04:33,431 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000000000000000000000000000000000 whose rowKey is 0000000000000000000000000000000000000
attempt_20110310150517897_0002_r_000000_0/syslog:2011-03-10 15:06:01,374 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000000000000000000000000000000000 whose rowKey is 0000000000000000000000000000000000000
attempt_20110310150517897_0002_r_000000_0/syslog:2011-03-10 15:06:01,381 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000100000000000000000000000000001 whose rowKey is 0000000100000000000000000000000000001
attempt_20110310150517897_0002_r_000000_0/syslog:2011-03-10 15:06:01,383 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000200000000000000000000000000002 whose rowKey is 0000000200000000000000000000000000002
attempt_20110310150517897_0002_r_000000_0/syslog:2011-03-10 15:06:01,389 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000300000000000000000000000000003 whose rowKey is 0000000300000000000000000000000000003
attempt_20110310150517897_0002_r_000000_0/syslog:2011-03-10 15:06:01,391 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000400000000000000000000000000004 whose rowKey is 0000000400000000000000000000000000004
attempt_20110310150517897_0002_r_000000_0/syslog:2011-03-10 15:06:01,393 INFO com.ngmoco.ngpipes.sourcing.NgActivityGatheringReducer: working on 0000000500000000000000000000000000005 whose rowKey is 0000000500000000000000000000000000005
====================================================================================================
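
For context, here is a minimal sketch of where those two lines sit in the reducer.  The names NgActivityGatheringReducer, DeviceRow, and NgActivityGatheringJob.NG_AVRO_BUG_WORKAROUND come from this thread; the key/value types and the body of reduce() are only assumptions, and the sketch assumes the reducer can reach the job configuration via getConf(), as in the snippet above:
====================================================================================================
import java.io.IOException;

import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical reducer illustrating the workaround above.
public class NgActivityGatheringReducer extends AvroReducer<Utf8, DeviceRow, DeviceRow> {

  @Override
  public void reduce(Utf8 key, Iterable<DeviceRow> values,
                     AvroCollector<DeviceRow> collector, Reporter reporter)
      throws IOException {
    boolean workAround =
        getConf().getBoolean(NgActivityGatheringJob.NG_AVRO_BUG_WORKAROUND, true);
    // Copy the key: the reduce runner can reuse the same Utf8 instance across
    // calls, so a reference kept past this call can change underneath us, as
    // in the first trace where rowKey ran one key ahead of the key being worked on.
    Utf8 dupKey = workAround ? new Utf8(key.toString()) : key;

    for (DeviceRow row : values) {
      // ... build and collect output using dupKey instead of key ...
      collector.collect(row);
    }
  }
}
====================================================================================================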
Ey-Chih Chow
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: RE: is this a bug?
Date: Thu, 10 Mar 2011 14:28:41 -0800
I changed the Games__ field of the DeviceRow to
union {null, array<DynamicColumn4Games>} Games__;
and the system no longer seems to complain.  Is this the right fix?  Thanks.
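
For reference, a quick way to sanity-check that layout outside of MapReduce is to serialize a DeviceRow whose Games__ field is left null, using the same writer class that shows up in the stack trace below.  This is only a sketch and assumes DeviceRow is the specific class generated from the avdl (so it has a no-arg constructor and a SCHEMA$ field):
====================================================================================================
import java.io.ByteArrayOutputStream;

import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectDatumWriter;

// Hypothetical standalone check: write one DeviceRow with Games__ == null.
public class DeviceRowNullCheck {
  public static void main(String[] args) throws Exception {
    DeviceRow row = new DeviceRow();
    // ... set any other required DeviceRow fields here; Games__ stays null ...
    ReflectDatumWriter<DeviceRow> writer =
        new ReflectDatumWriter<DeviceRow>(DeviceRow.SCHEMA$);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    writer.write(row, encoder);  // with the null-first union this is expected to succeed
    encoder.flush();
    System.out.println("serialized " + out.size() + " bytes");
  }
}
====================================================================================================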
Ey-Chih Chow

From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: RE: is this a bug?
Date: Thu, 10 Mar 2011 11:33:13 -0800
Thanks.  I tried to migrate from 1.4.0 to 1.5.0 and got some error messages that never showed up with 1.4.0.  Could you tell me what we should change?  Our avdl record, DeviceRow, has a field defined as follows:

union {array<DynamicColumn4Games>, null} Games__;

The error messages are as follows:

11/03/10 11:31:02 INFO mapred.TaskInProgress: Error from attempt_20110310113041953_0001_m_000000_0: java.lang.NullPointerException: in com.ngmoco.hbase.DeviceRow in union null of union in field Games__ of com.ngmoco.hbase.DeviceRow
    at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
    at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperSerializer.serialize(AvroSerialization.java:131)
    at org.apache.avro.mapred.AvroSerialization$AvroWrapperSerializer.serialize(AvroSerialization.java:114)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
    at org.apache.avro.mapred.HadoopMapper$MapCollector.collect(HadoopMapper