Avro >> mail # user >> Fast write using avro-c


amit nanda 2013-10-31, 10:25
Re: Fast write using avro-c
Hi,

I'm not exactly sure what you are doing, but if I understood your sample code correctly, it looks like you are allocating a new avro_value_t for each record that you write. That shouldn't be necessary. You can allocate a single avro_value_t once with avro_generic_value_new and reuse it for every record. To my understanding, you can use avro_value_reset(value) to clear all the fields of a record before filling it again.

-Mika

On Oct 31, 2013, at 12:25 PM, amit nanda <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am writing an Avro file with 3 million records, each having around 40 values.
>
> The issue I am facing is that the writing is very slow, taking around 80 seconds.
>
> I am using the Avro value interface for this. Below are some of the APIs that I am using.
>
>
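> /* Look up the column by name, then set it directly or through its union branch. */
> avro_value_get_by_name(&_row, name.c_str(), &_column, NULL);
> avro_type_t type = avro_value_get_type(&_column);
> if (type == AVRO_UNION)
> {
>     /* Select branch 1 of the union and write the int into that branch. */
>     avro_value_set_branch(&_column, 1, &_branch);
>     avro_value_set_int(&_branch, value);
> }
> else
>     /* Not a union: set the column itself (_branch is not valid here). */
>     avro_value_set_int(&_column, value);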
>
> Once all the values for a record are set, I append the record to the writer.
>
> avro_file_writer_append_value(_writer, &_row);
> avro_value_decref(&_row);
> avro_generic_value_new(_writer_iface, &_row);
>
> Am I doing something wrong here?
> Is there any way to increase the speed at which I am writing the data?
>
> Thanks
> Amit
amit nanda 2013-10-31, 13:57
Bruce Mitchener 2013-10-31, 10:29