issue with writing an array of records
Alan Miller 2013-01-07, 16:35
Hi, I have a schema with a field called ifnet, an array of records (I'm open to other suggestions too), that stores miscellaneous attribute names/values for a host's network interfaces, e.g.
{ "type": "record",
  "namespace": "com.company.avro.data",
  "name": "MyRecord",
  "doc": "My Data Record.",
  "fields": [
    // (required) fields
    {"name": "time_stamp", "type": "long" },
    {"name": "hostname", "type": "string" },

    // (optional) array of ifnet instances
    {"name": "ifnet",
     "type": ["null", {
       "type": "array",
       "items": { "type": "record", "name": "Ifnet",
                  "namespace": "com.company.avro.data",
                  "fields": [ {"name": "name", "type": "string"},
                              {"name": "send_bps", "type": "long" },
                              {"name": "recv_bps", "type": "long" }
                            ]
                }
       }
     ]
    }
  ]
}
I can write the records, (time_stamp and hostname are correct) but my "array of records" field (ifnet) only contains the last element of my java List.
Am I writing the field correctly? I'm trying to write the ifnet field with a java.util.List<com.company.avro.data.Ifnet>
Here are the relevant lines of code that write the ifnet field. (Yes, I'm attempting to use reflection because Ifnet is only 1 of approx 11 other array of record fields I'm trying to implement.)

    Class[] paramObj = new Class[1];
    paramObj[0] = Ifnet.class;
    Method method = cls.getMethod(methodName, List.class);
    jsonObj = new Ifnet();
    listOfObj = new ArrayList<Ifnet>();
    .......

    // in a loop building the List<Ifnet>.......
    LOG.info(String.format(" [%s] %s %s(%s) as %s", name, k, methNm, v, types[j].toString()));
    .......
    LOG.info(String.format(" [%s] setting name to %s", name, name));
    .......
    listOfObj.add(jsonObj);
    .......

    // then finally I call invoke with a List of Ifnet records
    if (method != null) { method.invoke(obj, listOfObj); }
    LOG.info(String.format(" invoking %s.%s", method.getClass().getSimpleName(), method.getName()));
    LOG.info(String.format(" param: listObj<%s> with %d entries", jsonObj.getClass().getName(), listOfObj.size()));
and the respective output:

    20130107T172303 INFO c.c.a.d.MyDriver - Set ifnet json via setIfnet(Ifnet object)
    20130107T172303 INFO c.c.a.d.MyDriver - [e0c] setting name to e0c
    20130107T172303 INFO c.c.a.d.MyDriver - [e0c] send_bps setSendBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0c] setting name to e0c
    20130107T172303 INFO c.c.a.d.MyDriver - [e0c] recv_bps setRecvBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0d] setting name to e0d
    20130107T172303 INFO c.c.a.d.MyDriver - [e0d] send_bps setSendBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0d] setting name to e0d
    20130107T172303 INFO c.c.a.d.MyDriver - [e0d] recv_bps setRecvBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0a] setting name to e0a
    20130107T172303 INFO c.c.a.d.MyDriver - [e0a] send_bps setSendBps(170720) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0a] setting name to e0a
    20130107T172303 INFO c.c.a.d.MyDriver - [e0a] recv_bps setRecvBps(244480) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0b] setting name to e0b
    20130107T172303 INFO c.c.a.d.MyDriver - [e0b] send_bps setSendBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0b] setting name to e0b
    20130107T172303 INFO c.c.a.d.MyDriver - [e0b] recv_bps setRecvBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0P] setting name to e0P
    20130107T172303 INFO c.c.a.d.MyDriver - [e0P] send_bps setSendBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [e0P] setting name to e0P
    20130107T172303 INFO c.c.a.d.MyDriver - [e0P] recv_bps setRecvBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [losk] setting name to losk
    20130107T172303 INFO c.c.a.d.MyDriver - [losk] send_bps setSendBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - [losk] setting name to losk
    20130107T172303 INFO c.c.a.d.MyDriver - [losk] recv_bps setRecvBps(0) as class java.lang.Long
    20130107T172303 INFO c.c.a.d.MyDriver - invoking Method.setIfnet
    20130107T172303 INFO c.c.a.d.MyDriver - param: listObj<com.synopsys.iims.be.storage.Ifnet> with 6 entries
    20130107T172303 INFO c.c.a.d.MyDriver - Set time_stamp integer via setTimeStamp to 1357513251
The print statements suggest I'm passing a list of 6 items with different values, but when I dump the records I see an array of 6 entries whose values all reflect the last entry in my java.util.List.
    [amiller@localhost $] hadoop fs -copyToLocal /data/test_2013-01-07.avro
    [amiller@localhost $] avro cat --fields=time_stamp,ifnet test_2013-01-07.avro|less
    {"time_stamp": 1357513251, "ifnet": [{"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}]}
    {"time_stamp": 1357513550, "ifnet": [{"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}]}

Any ideas?

Alan
Re: issue with writing an array of records
Scott Carey 2013-01-08, 10:36
First thought: Why the union of null and the array? It may be easier to simply have an empty list when there are no Ifnet data.
What API are you using? Generic, Specific, or Reflect? I am suspicious of the reflection; does it work when you bind it to one specific type? I suspect you could use generics to avoid the reflection, but I have seen similar tasks get a little ugly.
This is something you should be able to step through in a debugger; when you append the datum to the file, you can check the state of the object being written and that it has all of the expected data, then step through the code in the DatumWriter (if you use an IDE and maven, it should be trivial to attach the Avro source to your debugger) and see what it does with your list.
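As a complement to stepping through in a debugger, the "check the state of the object being written" step can be sketched as a small identity check run just before the append. This is only an illustration: `DistinctInstanceCheck` and `distinctInstances` are hypothetical names, not part of Avro or of Alan's code, and plain `Object` instances stand in for the Ifnet records.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class DistinctInstanceCheck {

    // Counts how many different object instances a list holds, comparing by
    // reference (==) rather than by equals().
    static int distinctInstances(List<?> items) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        seen.addAll(items);
        return seen.size();
    }

    public static void main(String[] args) {
        // Same instance added three times: size is 3, but only 1 distinct object.
        Object shared = new Object();
        List<Object> aliased = new ArrayList<>();
        aliased.add(shared);
        aliased.add(shared);
        aliased.add(shared);

        // Three separate instances.
        List<Object> separate = new ArrayList<>();
        separate.add(new Object());
        separate.add(new Object());
        separate.add(new Object());

        System.out.println(aliased.size() + " entries, " + distinctInstances(aliased) + " distinct");
        System.out.println(separate.size() + " entries, " + distinctInstances(separate) + " distinct");
    }
}
```

If a list reports 6 entries but fewer distinct instances, every slot that shares an instance will serialize with identical field values, regardless of what was logged while the list was being built.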
Re: issue with writing an array of records
Alan Miller 2013-01-08, 11:48
Thanks Scott, I figured it out. The problem was simply an improper loop design. I corrected the loop code and now I get an array of Ifnet records.
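The corrected loop is not shown in the thread. One loop mistake that produces exactly this symptom (N entries, all equal to the last one) is creating the record object once before the loop and adding that same instance on every iteration, which is consistent with the earlier snippet where `jsonObj = new Ifnet()` appears ahead of the loop. The sketch below assumes that is what happened; `Iface`, `buildShared`, and `buildFresh` are hypothetical stand-ins, not Alan's code or the generated Ifnet class.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopReuseDemo {

    // Hypothetical stand-in for the generated Ifnet record class.
    static class Iface {
        String name;
    }

    // Buggy pattern: one object is created before the loop, mutated each
    // iteration, and added repeatedly. Every list slot points to the same object.
    static List<Iface> buildShared(String[] names) {
        List<Iface> list = new ArrayList<>();
        Iface shared = new Iface();
        for (String n : names) {
            shared.name = n;
            list.add(shared);
        }
        return list;
    }

    // Fixed pattern: a new object is created inside the loop for each entry.
    static List<Iface> buildFresh(String[] names) {
        List<Iface> list = new ArrayList<>();
        for (String n : names) {
            Iface fresh = new Iface();
            fresh.name = n;
            list.add(fresh);
        }
        return list;
    }

    public static void main(String[] args) {
        String[] names = {"e0c", "e0d", "losk"};
        System.out.println(buildShared(names).get(0).name); // prints "losk"
        System.out.println(buildFresh(names).get(0).name);  // prints "e0c"
    }
}
```

Logging inside the loop shows a different value on each pass in both versions, which would explain why the print statements looked correct while the written file did not.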
    [amiller@loclhost $] avro cat --fields hostname,ifnet data_2013-01-08.avro
    {"hostname": "host08", "ifnet": [{"send_bps": 0, "recv_bps": 0, "name": "e0P"}, {"send_bps": 171256, "recv_bps": 197512, "name": "vif1"}, {"send_bps": 0, "recv_bps": 0, "name": "vif2"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}]}
    {"hostname": "host11", "ifnet": [{"send_bps": 9992353544, "recv_bps": 14049010776, "name": "e1a"}, {"send_bps": 0, "recv_bps": 0, "name": "e1b"}, {"send_bps": 2736, "recv_bps": 2960, "name": "e0P"}, {"send_bps": 9376, "recv_bps": 19312, "name": "c0a"}, {"send_bps": 0, "recv_bps": 0, "name": "vif1"}, {"send_bps": 9376, "recv_bps": 19312, "name": "c0b"}, {"send_bps": 0, "recv_bps": 0, "name": "e0M"}, {"send_bps": 0, "recv_bps": 0, "name": "losk"}]}

I did not have to change my schema. For anyone interested, I verified I could write my array of records with the schema using this test program.
    import java.io.File;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.io.DatumWriter;
    import org.apache.avro.specific.SpecificDatumWriter;

    public class ArrayOfRecTest {
        public static void main(String[] args) throws IOException {
            String fileName = "/tmp/test.avro";
            File avroFile = new File(fileName);

            // NOTE: MyRecord.java was generated from MyRecord.avsc and avro.compiler.specific.SchemaTask
            Schema schema = new MyRecord().getSchema();
            DatumWriter<MyRecord> writer = new SpecificDatumWriter<MyRecord>(schema);
            DataFileWriter<MyRecord> dataFileWriter = new DataFileWriter<MyRecord>(writer);
            dataFileWriter.create(schema, avroFile);

            MyRecord record = new MyRecord();
            record.setHostname("localhost");
            record.setTimeStamp(1357639723L);

            // Each Ifnet is a separate instance, so each array entry keeps its own values.
            List<Ifnet> ifnetList = new ArrayList<Ifnet>();
            Ifnet ifc1 = new Ifnet("eth0", 1234L, 5678L);
            ifnetList.add(ifc1);
            Ifnet ifc2 = new Ifnet("eth1", 100L, 200L);
            ifnetList.add(ifc2);
            Ifnet ifc3 = new Ifnet("eth2", 0L, 0L);
            ifnetList.add(ifc3);
            record.setIfnet(ifnetList);

            dataFileWriter.append(record);
            dataFileWriter.flush();
            dataFileWriter.close();
        }
    }

Alan