Avro, mail # user - Fast write using avro-c


Re: Fast write using avro-c
Mika Ristimaki 2013-10-31, 12:06
Hi,

I'm not exactly sure what you are doing, but if I understood your sample code correctly, it looks like you are allocating a new avro_value_t for each record that you write. That shouldn't be necessary. You can allocate a single avro_value_t once with avro_generic_value_new and reuse it for every record. To my understanding, you can call avro_value_reset(value) between records to clear all the fields of the record.

-Mika

On Oct 31, 2013, at 12:25 PM, amit nanda <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am writing an Avro file with 3 million records, each record having around 40 values.
>
> The issue I am facing is that the writing is very slow, taking around 80 seconds.
>
> I am using the Avro value interface for this. Below are some of the API calls that I am using.
>
>
> avro_value_get_by_name(&_row, name.c_str(), &_column, NULL);
> avro_type_t type = avro_value_get_type(&_column);
> if (type == AVRO_UNION)
> {
>     // Nullable column: select branch 1 of the union and set the int there.
>     avro_value_set_branch(&_column, 1, &_branch);
>     avro_value_set_int(&_branch, value);
> }
> else
>     // Plain int column: set the value on the column itself.
>     avro_value_set_int(&_column, value);
>
> Once all the values for a record are set, I append the record to the writer.
>
> avro_file_writer_append_value(_writer, &_row);
> avro_value_decref(&_row);
> avro_generic_value_new(_writer_iface, &_row);
>
> Am I doing something wrong here?
> Is there any way to increase the speed at which I am writing the data?
>
> Thanks
> Amit