Hadoop >> mail # user >> Problem with using BinSedesTuple as Mapper key


Gayatri Rao 2012-04-23, 05:30
Harsh J 2012-04-23, 05:51
Re: Problem with using BinSedesTuple as Mapper key
Hi Gayatri,
It looks like you might want to try Pangool (http://pangool.net), a
low-level enhancement of the default Hadoop API that uses tuples and
simplifies grouping, sorting, and joining datasets in Hadoop.

On Mon, Apr 23, 2012 at 7:30 AM, Gayatri Rao <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I am using BinSedesTuple as a mapper key to emit a tuple of values, but
> somehow identical keys do not go to the same reducer, so I do not get
> aggregated values.
> Is it not recommended to use it as a mapper key?
>
> For example, in my mapper I emit:
>
> Mapper:
> Output key : BinSedesTuple  value: int
>
>
> Example output:
> tuple.append(url);
> tuple.append(category);
>
> Reducer:
> Input key: BinSedesTuple value: int
> Output key: Text value: int
>
> Example output:
> url1 category1 3
> url1 category1 2
>
> In the reducer output I get multiple records with the same key. My
> expected output is
> url1 category1 5
>
> Any ideas what might be wrong?
>
>
> Thanks,
> Gayatri
>
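
The thread does not say what caused the split groups, so the following is only a sketch of the general contract a composite key has to satisfy, not a diagnosis. It assumes the job uses Hadoop's default HashPartitioner, which picks a reducer from key.hashCode(), and that keys are grouped by their comparison order. UrlCategoryKey, partition, and sumByKey are hypothetical names invented for this illustration; the key class is a plain-Java stand-in and not Pig's BinSedesTuple, and the code runs without Hadoop on the classpath.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class CompositeKeySketch {

    /** Hypothetical (url, category) key; a stand-in, not Pig's BinSedesTuple. */
    static final class UrlCategoryKey implements Comparable<UrlCategoryKey> {
        final String url;
        final String category;

        UrlCategoryKey(String url, String category) {
            this.url = url;
            this.category = category;
        }

        // Equality, hashing, and ordering are all derived from the field
        // values, so two separately built keys with the same content agree.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof UrlCategoryKey)) {
                return false;
            }
            UrlCategoryKey other = (UrlCategoryKey) o;
            return url.equals(other.url) && category.equals(other.category);
        }

        @Override
        public int hashCode() {
            return Objects.hash(url, category);
        }

        @Override
        public int compareTo(UrlCategoryKey other) {
            int byUrl = url.compareTo(other.url);
            return byUrl != 0 ? byUrl : category.compareTo(other.category);
        }

        @Override
        public String toString() {
            return url + " " + category;
        }
    }

    /** Same arithmetic as Hadoop's default HashPartitioner. */
    static int partition(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Simulates reduce-side grouping: keys that compare equal share one total. */
    static Map<UrlCategoryKey, Integer> sumByKey(UrlCategoryKey[] keys, int[] values) {
        Map<UrlCategoryKey, Integer> totals = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            totals.merge(keys[i], values[i], Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two distinct objects with the same content, as two map() calls would emit.
        UrlCategoryKey first = new UrlCategoryKey("url1", "category1");
        UrlCategoryKey second = new UrlCategoryKey("url1", "category1");

        // Both keys must be routed to the same reducer.
        System.out.println(partition(first, 4) == partition(second, 4)); // prints "true"

        Map<UrlCategoryKey, Integer> totals =
                sumByKey(new UrlCategoryKey[] {first, second}, new int[] {3, 2});
        for (Map.Entry<UrlCategoryKey, Integer> e : totals.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue()); // prints "url1 category1 5"
        }
    }
}
```

If a key type hashed or compared by something other than its contents, equal-looking keys could be routed to different reducers or kept in separate groups, which would resemble the output described above. Whether that is what happens with BinSedesTuple in this job is not established in the thread.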