Kafka >> mail # user >> Hadoop-consumer & partition question


Re: Hadoop-consumer & partition question
I haven't used this script in a while, but if I remember correctly, you
should have a different offset file for each broker/partition combination...

In any case, the article you linked to is an outdated version of that
script (as mentioned in the block at the very beginning of the post, BTW).

A quick look at the script you linked to shows that it manages only a
single offset file, which would explain why you're consuming just one
partition. The latest version of the script manages all of the offset
files, so I think that should solve your problem.

You can find the latest version of the script here:
https://gist.github.com/1671887

--
Felix
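As a sanity check on the offset file discussed below: the record Fredrik pastes ends in a human-readable trailer. Assuming the field order is broker URI, topic, partition, offset (inferred from the pasted file; the real file is a Hadoop SequenceFile of `kafka.etl.KafkaETLKey` records, not plain text), a quick sketch to decode that trailer might look like:

```python
# Hypothetical sketch: decode the human-readable trailer of one offset record.
# The field order (broker URI, topic, partition, offset) is an assumption
# based on the file pasted in the message below, not a documented format.
def parse_offset_record(trailer: str) -> dict:
    uri, topic, partition, offset = trailer.split()
    return {
        "uri": uri,              # broker address, e.g. tcp://localhost:9092
        "topic": topic,          # Kafka topic name
        "partition": int(partition),
        "offset": int(offset),   # next offset to consume from
    }

record = parse_offset_record("tcp://localhost:9092 device-events-2 0 5133247")
print(record)
```

Under that reading, the `0` is indeed the partition number, which matches the question asked below: a single offset file pinned to partition 0 would explain why only partition 0 gets imported.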

On Mon, Aug 13, 2012 at 2:39 AM, Fredrik Emilsson <
[EMAIL PROTECTED]> wrote:

> Thanks for the information!
>
> Here is my offset file:
>
> SEQkafka.etl.KafkaETLKey"org.apache.hadoop.io.BytesWritableh*À¦ê$ßíãðÒF.tcp://localhost:9092
>    device-events-2 0       5133247
>
> Is the 0 in the file the partition? If so then I have to identify how to
> set the partition number. I guess 0 is the default value.
>
>   Regards,
>     Fredrik
>
> -----Original Message-----
> From: Neha Narkhede [mailto:[EMAIL PROTECTED]]
> Sent: 10 August 2012 17:34
> To: [EMAIL PROTECTED]
> Subject: Re: Hadoop-consumer & partition question
>
> The Hadoop consumer uses an offsets file to know which partitions to
> consume and from which offset. What does your offsets file look like?
>
> Thanks,
> Neha
>
> On Fri, Aug 10, 2012 at 7:06 AM, Fredrik Emilsson <
> [EMAIL PROTECTED]> wrote:
> > Hello,
> >
> >
> >
> >   I have a topic that has two partitions and a single broker. Adding
> > events works well; there are events in both partition 0 and partition 1.
> > The problem arises when I try to import them into Hadoop: it seems that
> > only the events in partition 0 are imported. I am using a script (found
> > here:
> > http://felixgv.com/post/69/automating-incremental-imports-with-the-kafka-hadoop-consumer/)
> > to import them.
> >
> >
> >
> >   Does anyone know what the problem could be?
> >
> >
> >
> >   Regards,
> >
> >     Fredrik
> >
> >
> >
> >
>