Avro >> mail # user >> Writing in Cpp and reading in Python


Writing in Cpp and reading in Python
Hi,

I am using the following schema to write in C++ and read in Python.

{
  "type": "record",
  "name": "jok_obj",
  "fields": [
    {"name": "val",
     "type": ["null", "boolean", "long", "int", "double", "float", "string",

              {"name": "date", "type": "record",
               "fields": [
                 {"name": "value", "type": "int"}
               ]
              },

              {"name": "datetime", "type": "record",
               "fields": [
                 {"name": "date", "type": "int"},
                 {"name": "tics", "type": "int"}
               ]
              },

              {"name": "timestamp", "type": "record",
               "fields": [
                 {"name": "sec", "type": "long"},
                 {"name": "microsec", "type": "long"}
               ]
              },

              {"type": "map", "values": "jok_obj"},
              {"type": "array", "items": "jok_obj"}
             ]
    }
  ]
}
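
For reference, Avro's binary encoding writes a union value as the zero-based index of the chosen branch (as a zigzag varint long), followed by the value itself. Below is a minimal Python 3 sketch, standard library only, of what the leading bytes would be for each branch of the "val" union above. The names BRANCHES, zigzag_varint and union_prefix are illustrative and not part of the Avro API.

```python
# Branch order of the "val" union, copied from the schema above.
BRANCHES = ["null", "boolean", "long", "int", "double", "float", "string",
            "date", "datetime", "timestamp", "map", "array"]


def zigzag_varint(n: int) -> bytes:
    """Encode a signed 64-bit integer the way Avro encodes int/long."""
    # Zigzag maps small magnitudes (positive or negative) to small codes.
    z = ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def union_prefix(branch: str) -> bytes:
    """Bytes that precede the value when `branch` is the selected type."""
    return zigzag_varint(BRANCHES.index(branch))


for name in BRANCHES:
    print(f"{name:10s} -> {union_prefix(name).hex()}")
```

Under this rule, "double" sits at index 4 and so gets the single prefix byte 0x08, with the 8 bytes of the double expected to follow.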

I encode the C++ object to a memoryOutputStream, read it back through a
memoryInputStream using StreamReader, and finally convert it to a
std::string. Decoding that string in C++ works fine, but decoding it in
Python fails.
====
    std::string AvroObj::encode()
    {
        std::auto_ptr<avro::OutputStream> out = avro::memoryOutputStream();
        avro::EncoderPtr e = avro::binaryEncoder();
        e->init(*out);
        avro::encode(*e, obj);

        std::auto_ptr<avro::InputStream> in = avro::memoryInputStream(*out);
        avro::StreamReader* reader = new avro::StreamReader(*in);

        std::stringstream ss;
        while(reader->hasMore()) {
            ss << reader->read();
        }

        return ss.str();
    }

====
I am trying to encode {"val" : 0.0}, which encodes to "\x08". But when
I send this to Python, it fails with:

=============================================...
  File "/u/nanda/jok/lib/python/*****/jok/rpc.py", line 451, in to_avro
    record = dr.read(decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 445, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 490, in read_data
    return self.read_record(writers_schema, readers_schema, decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 690, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 488, in read_data
    return self.read_union(writers_schema, readers_schema, decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 654, in read_union
    return self.read_data(selected_writers_schema, readers_schema, decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 458, in read_data
    return self.read_data(writers_schema, s, decoder)
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 476, in read_data
    return decoder.read_double()
  File "/u/nanda/avro-src-1.6.1/lang/py/src/avro/io.py", line 218, in read_double
    ((ord(self.read(1)) & 0xffL) << 48) |
TypeError: ord() expected a character, but string of length 0 found
===============================================
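
The last frames of the traceback show the reader selecting the double branch and then running out of input while reading that value. A minimal Python 3 sketch of that read path follows, standard library only. It is not the avro.io implementation: read_long, read_double and read_val are hypothetical helpers, and only the double branch is handled.

```python
import io
import struct


def read_long(buf: io.BytesIO) -> int:
    """Decode one zigzag varint, as Avro uses for int/long and union indexes."""
    shift, acc = 0, 0
    while True:
        chunk = buf.read(1)
        if not chunk:
            raise EOFError("input ended inside a varint")
        byte = chunk[0]
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte of the varint
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo zigzag


def read_double(buf: io.BytesIO) -> float:
    """Decode an 8-byte little-endian IEEE 754 double."""
    raw = buf.read(8)
    if len(raw) != 8:
        raise EOFError(f"double needs 8 bytes, only {len(raw)} left")
    return struct.unpack("<d", raw)[0]


def read_val(data: bytes):
    """Decode the single union field "val"; only branch 4 (double) is handled."""
    buf = io.BytesIO(data)
    branch = read_long(buf)
    if branch != 4:
        raise NotImplementedError(f"branch {branch} not covered by this sketch")
    return branch, read_double(buf)


print(read_val(b"\x08" + b"\x00" * 8))
try:
    read_val(b"\x08")
except EOFError as exc:
    print("short input:", exc)
```

In this sketch the one-byte input selects the double branch and then fails for lack of data, which mirrors the point at which the reported TypeError is raised.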
Digging further, I found that Python encodes {"val" : 0.0} as
"\x08\x00\x00\x00\x00\x00\x00\x00\x00". Any string shorter than
this gives the above error.
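
The 9-byte form matches Avro's binary encoding rules: one byte for the union branch index, then the double as 8 little-endian bytes. Note also that "\x08" is exactly what remains if those 9 bytes are cut at the first NUL byte. That would be consistent with the buffer being handled as a NUL-terminated C string somewhere between the C++ encoder and the Python reader, though it does not prove it. The Python 3 sketch below shows both observations. encode_val_double and as_c_string are hypothetical helpers, and the truncation step only simulates C-string semantics.

```python
import struct


def encode_val_double(x: float) -> bytes:
    """Encode {"val": x} for the schema above, assuming the double branch (index 4)."""
    branch_index = 4
    # Zigzag of a small non-negative index is index * 2 and fits in one byte.
    prefix = bytes([branch_index << 1])
    return prefix + struct.pack("<d", x)


def as_c_string(data: bytes) -> bytes:
    """Simulate what survives if `data` is read as a NUL-terminated C string."""
    return data.split(b"\x00", 1)[0]


full = encode_val_double(0.0)
print(len(full), full)
print(as_c_string(full))
```

If truncation of this kind is the cause, carrying the length explicitly on the C++ side (for example constructing the string as std::string(ptr, len) rather than from a bare char pointer) would preserve the embedded zero bytes.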

Could you please suggest what might be going wrong?

Thanks,
Gaurav Nanda

Replies in this thread:
Miki Tebeka 2012-05-15, 18:07
Gaurav Nanda 2012-05-15, 18:28
Gaurav Nanda 2012-05-15, 18:34