Avro, mail # user - issue with writing an array of records


Re: issue with writing an array of records
Scott Carey 2013-01-08, 10:36

On 1/7/13 8:35 AM, "Alan Miller" <[EMAIL PROTECTED]> wrote:
>Hi, I have a schema with an array of records (I'm open to other
>suggestions too) field
>called ifnet to store misc attribute name/values for a host's network
>interfaces.
>e.g.
>
>
>{ "type": "record",
>  "namespace": "com.company.avro.data",
>  "name": "MyRecord",
>  "doc": "My Data Record.",
>  "fields": [
>    // (required) fields
>    {"name":          "time_stamp", "type": "long"
>      },
>    {"name":            "hostname", "type": "string"
>      },
>
>    // (optional) array of ifnet instances
>    {"name": "ifnet",
>     "type": ["null", {
>                "type": "array",
>                "items": { "type": "record", "name": "Ifnet",
>                           "namespace": "com.company.avro.data",
>                           "fields": [ {"name": "name",     "type": "string"},
>                                       {"name": "send_bps", "type": "long"  },
>                                       {"name": "recv_bps", "type": "long"  }
>                           ]
>                }
>              }
>     ]
>    }
>
> ]
>}

First thought: why the union of null and the array?  It may be easier to
simply write an empty list when there is no Ifnet data.
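
To illustrate the point: if ifnet is declared as a plain array (optionally
with "default": []), reader code can iterate the list unconditionally
instead of null-checking it first. The sketch below uses plain Java and a
hypothetical IfnetStat class as a stand-in for the generated Ifnet class; it
does not call any Avro API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated Ifnet class (only the fields used here).
final class IfnetStat {
    final String name;
    final long sendBps;

    IfnetStat(String name, long sendBps) {
        this.name = name;
        this.sendBps = sendBps;
    }
}

final class EmptyListDemo {
    // With a non-nullable array field, an empty list simply contributes nothing.
    static long totalSendBps(List<IfnetStat> ifnet) {
        long total = 0;
        for (IfnetStat s : ifnet) {
            total += s.sendBps;
        }
        return total;
    }

    public static void main(String[] args) {
        List<IfnetStat> none = new ArrayList<>();
        System.out.println(totalSendBps(none));
    }
}
```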

>
>
>I can write the records, (time_stamp and hostname are correct) but
>my "array of records" field (ifnet) only contains the last element of my
>java List.
>
>Am I writing the field correctly?  I'm trying to write the ifnet field
>with a
>java.util.List<com.company.avro.data.Ifnet>
>
>Here's the related code lines that write the ifnet field. (Yes, I'm
>attempting to use reflection
>because Ifnet is only 1 of approx 11 other array of record fields I'm
>trying to implement.)
>
>   Class[] paramObj = new Class[1];
>   paramObj[0] = Ifnet.class;
>   Method method = cls.getMethod(methodName, List.class);
>   jsonObj = new Ifnet();
>   listOfObj = new ArrayList<Ifnet>();
>   .......  
>
>
>   // in a loop building the List<Ifnet>.......
>
>    LOG.info(String.format("   [%s] %s %s(%s) as %s", name,
>k,methNm,v,types[j].toString()));
>   .......  
>
>    LOG.info(String.format("   [%s] setting name to %s", name, name));
>
>   .......  
>
>   listOfObj.add(jsonObj);
>
>   .......
>
>  // then finally I call invoke with a List of Ifnet records
>
>  if (method != null) { method.invoke(obj, listOfObj); }
>  LOG.info(String.format("  invoking %s.%s",
>method.getClass().getSimpleName(), method.getName()));
>  LOG.info(String.format("  param: listObj<%s> with %d entries" ,
>jsonObj.getClass().getName(), listOfObj.size()));
>
>
>and the respective output
>20130107T172303  INFO c.c.a.d.MyDriver - Set                ifnet json via             setIfnet(Ifnet object)
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0c] setting name to e0c
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0c] send_bps setSendBps(0) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0c] setting name to e0c
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0c] recv_bps setRecvBps(0) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0d] setting name to e0d
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0d] send_bps setSendBps(0) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0d] setting name to e0d
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0d] recv_bps setRecvBps(0) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0a] setting name to e0a
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0a] send_bps setSendBps(170720) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0a] setting name to e0a
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0a] recv_bps setRecvBps(244480) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0b] setting name to e0b
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0b] send_bps setSendBps(0) as class java.lang.Long
>20130107T172303  INFO c.c.a.d.MyDriver -    [e0b] setting name to e0b

Which API are you using: Generic, Specific, or Reflect?  I am suspicious
of the reflection.  Does it work when you bind it to one specific type?  I
suspect you could use generics to avoid the reflection, but I have seen
similar tasks get a little ugly.
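
As a rough sketch of what "generics instead of reflection" could look like:
one generic helper takes the element constructor and the typed setters as
method references, so the same code can serve each of the array-of-record
fields without Method.invoke. Host, Iface, and fill are hypothetical names
made up for this example (stand-ins for MyRecord and Ifnet), not part of
Avro or of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

final class TypedListSetter {
    // Hypothetical stand-ins for the generated Ifnet and MyRecord classes.
    static final class Iface {
        String name;
    }

    static final class Host {
        List<Iface> ifnet = new ArrayList<>();

        void setIfnet(List<Iface> value) {
            this.ifnet = value;
        }
    }

    // Creates one new element per name, then passes the list to the typed setter.
    static <R, E> void fill(R record,
                            List<String> names,
                            Supplier<E> newElement,
                            BiConsumer<E, String> setName,
                            BiConsumer<R, List<E>> setList) {
        List<E> items = new ArrayList<>();
        for (String n : names) {
            E element = newElement.get(); // fresh instance on every iteration
            setName.accept(element, n);
            items.add(element);
        }
        setList.accept(record, items);
    }

    public static void main(String[] args) {
        Host host = new Host();
        TypedListSetter.<Host, Iface>fill(host, Arrays.asList("e0a", "e0b"),
                Iface::new, (i, n) -> i.name = n, Host::setIfnet);
        System.out.println(host.ifnet.size());
    }
}
```

The compiler checks the setter's parameter type here, which a
getMethod(methodName, List.class) lookup cannot do.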

This is something you should be able to step through in a debugger; when
you append the datum to the file, you can check the state of the object
being written and that it has all of the expected data, then step through
the code in the DatumWriter (if you use an IDE and Maven, it should be
trivial to attach the Avro source to your debugger) and see what it does
with your list.
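
One concrete thing to check while inspecting the list: whether its entries
are distinct objects. In the quoted snippet, jsonObj = new Ifnet() appears
before the loop and the same variable is added inside it; the elided lines
may well create a new instance per iteration, so treat this only as a
possibility to rule out. If a single instance is reused, every entry in the
list ends up showing the values set last. The sketch below is plain Java
with a hypothetical Iface class and helper names invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class ListStateCheck {
    // Hypothetical mutable element, standing in for a generated record class.
    static final class Iface {
        String name;
    }

    // Counts distinct object instances in the list (by identity, not equals()).
    static int distinctInstances(List<?> items) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        seen.addAll(items);
        return seen.size();
    }

    // Assumed failure mode: one instance is mutated and added repeatedly.
    static List<Iface> buildReusing(List<String> names) {
        List<Iface> out = new ArrayList<>();
        Iface shared = new Iface();
        for (String n : names) {
            shared.name = n;
            out.add(shared);
        }
        return out;
    }

    // Intended behaviour: a new instance for every entry.
    static List<Iface> buildFresh(List<String> names) {
        List<Iface> out = new ArrayList<>();
        for (String n : names) {
            Iface fresh = new Iface();
            fresh.name = n;
            out.add(fresh);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        Collections.addAll(names, "e0c", "e0d", "e0a");
        System.out.println(distinctInstances(buildReusing(names)));
        System.out.println(distinctInstances(buildFresh(names)));
    }
}
```

If distinctInstances(list) is smaller than list.size() just before the
datum is appended, the problem is in how the list is built, not in the
DatumWriter.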