Avro >> mail # user >> Fast write using avro-c


Re: Fast write using avro-c
Hi,

I am doing the same thing; I copied an old sample by mistake. The issue I
am facing is that CPU usage and run time are high. I am new to Avro and
using it for the first time, so I wanted to check whether I am doing
something wrong.

Other things I wanted to cross-check are:
(i) Is there any way to set a full record at one time, instead of setting
each value of a record individually?
(ii) If my schema has 10 values but a record has fewer than 10, do the
values missing from that record need to be set to NULL?

Thanks
Amit
On Thu, Oct 31, 2013 at 5:36 PM, Mika Ristimaki <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm not exactly sure what you are doing, but if I understood correctly
> your sample code, it looks like you are allocating a new avro_value_t for
> each record that you write. That shouldn't be necessary. You can reuse the
> same avro_value_t for each record that you have allocated once with
> avro_generic_value_new. To my understanding you can use
> avro_value_reset(value) to null out all the fields from a record.
>
> -Mika
>
> On Oct 31, 2013, at 12:25 PM, amit nanda <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I am writing an Avro file with 3 million records, each record having
> > around 40 values.
> >
> > The issue I am facing is that the writing is very slow, taking around
> > 80 seconds.
> >
> > I am using the Avro value interface for this. Below are some of the
> > API calls that I am using.
> >
> >
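> > /* Look up the field by name, then set it (unwrapping a union if needed). */
> > avro_value_get_by_name(&_row, name.c_str(), &_column, NULL);
> > avro_type_t type = avro_value_get_type(&_column);
> > if (type == AVRO_UNION)
> > {
> >       /* Select branch 1 of the union, assumed here to be the int branch. */
> >       avro_value_set_branch(&_column, 1, &_branch);
> >       avro_value_set_int(&_branch, value);
> > }
> > else
> > {
> >       /* Not a union: set the field itself, since _branch is unset here. */
> >       avro_value_set_int(&_column, value);
> > }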
> >
> > Once all the values for a record are set, I append the record to the
> > writer.
> >
> > avro_file_writer_append_value(_writer, &_row);
> > avro_value_decref(&_row);
> > avro_generic_value_new(_writer_iface, &_row);
> >
> > Am I doing something wrong in this?
> > Is there any way to increase the speed at which I am writing the data?
> >
> > Thanks
> > Amit
>
>