Avro, mail # user - Fast write using avro-c


Re: Fast write using avro-c
amit nanda 2013-10-31, 13:57
Hi,

I am doing the same thing; I copied an old sample by mistake. The issue I am
facing is that CPU usage and run time are both high. I am new to Avro and
using it for the first time, so I wanted to check whether I am doing
something wrong.

Other things I wanted to cross-check:
(i) Is there any way to set a full record at once, instead of setting
each value of the record individually?
(ii) If my schema has 10 values but a given record has fewer than 10,
do the values missing from that record need to be set to NULL?

Thanks
Amit
On Thu, Oct 31, 2013 at 5:36 PM, Mika Ristimaki <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm not exactly sure what you are doing, but if I understood your sample
> code correctly, it looks like you are allocating a new avro_value_t for
> each record that you write. That shouldn't be necessary. You can allocate
> one avro_value_t with avro_generic_value_new and reuse it for every
> record. To my understanding you can use avro_value_reset(value) to null
> out all the fields of a record.
>
> -Mika
>
> On Oct 31, 2013, at 12:25 PM, amit nanda <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I am writing an Avro file with 3 million records, each record having
> > around 40 values.
> >
> > The issue I am facing is that the writing is very slow, taking around 80
> > seconds.
> >
> > I am using the Avro value interface for this. Below are some of the
> > APIs that I am using.
> >
> >
> > avro_value_get_by_name(&_row, name.c_str(), &_column, NULL);
> > avro_type_t type = avro_value_get_type(&_column);
> > if(type == AVRO_UNION)
> > {
> >       /* nullable column: select branch 1 of the union, then set it */
> >       avro_value_set_branch(&_column, 1, &_branch);
> >       avro_value_set_int(&_branch, value);
> > }
> > else
> >       /* plain int column: set the column itself, not _branch */
> >       avro_value_set_int(&_column, value);
> >
> > Once all the values for a record are set, I append the record to the
> > writer.
> >
> > avro_file_writer_append_value(_writer, &_row);
> > avro_value_decref(&_row);
> > avro_generic_value_new(_writer_iface, &_row);
> >
> > Am I doing something wrong here?
> > Is there any way to increase the speed at which I am writing the data?
> >
> > Thanks
> > Amit
>
>