HBase, mail # user - Results from a Map/Reduce


RE: Results from a Map/Reduce
Jonathan Gray 2010-12-17, 20:00
There's not much in the way of examples for coprocessors besides the implementation of Security.  Check out HBASE-2000 and go from there.  If you're fairly new to HBase, then wait a couple of months and there should be much better support around coprocessors.

I'm not aware of a way to have a final result returned directly to the main() method.  What exactly are you trying to do with the result once it's available to you?  Could the MR job put the result back into HBase, or could your reducer contain the logic you need to apply to the final result?
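
As a rough illustration of the single-reducer roll-up suggested earlier in this thread (quoted below), here is a minimal plain-Java sketch. It does not use the Hadoop or HBase APIs; the class and method names (SingleReducerSketch, mapTask, reduceTask, run) are hypothetical stand-ins, and a sum is only assumed as the aggregation. In a real job the same shape would presumably come from configuring one reducer, e.g. job.setNumReduceTasks(1), with the reducer writing its single output somewhere the client can read it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingleReducerSketch {

    // Hypothetical stand-in for one Map task: it aggregates only its own
    // slice of the rows and emits a single partial sum.
    static long mapTask(List<Long> slice) {
        long partial = 0;
        for (long value : slice) {
            partial += value;
        }
        return partial;
    }

    // Hypothetical stand-in for the single Reduce task: the one place where
    // every partial result meets and the final aggregation happens.
    static long reduceTask(List<Long> partials) {
        long total = 0;
        for (long partial : partials) {
            total += partial;
        }
        return total;
    }

    // Runs every "map task" and funnels all partials into the one "reducer".
    static long run(List<List<Long>> slices) {
        List<Long> partials = new ArrayList<>();
        for (List<Long> slice : slices) {
            partials.add(mapTask(slice));
        }
        return reduceTask(partials);
    }

    public static void main(String[] args) {
        List<List<Long>> slices = Arrays.asList(
                Arrays.asList(1L, 2L, 3L),   // rows seen by the first map task
                Arrays.asList(10L, 20L));    // rows seen by the second map task
        System.out.println(run(slices));     // prints 36
    }
}
```

The point of the sketch is only that the client never sees the million rows: each map task shrinks its slice to one value, and the single reducer sees just those partials.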

> -----Original Message-----
> From: Peter Haidinyak [mailto:[EMAIL PROTECTED]]
> Sent: Friday, December 17, 2010 11:56 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Results from a Map/Reduce
>
> Does that mean that when the job.waitForCompletion(true) returns that I
> have the results from the Reducer(s) available to me? I haven't seen much
> on coprocessors, can you point me to some examples of their use?
>
> Thanks
> -Pete
>
> -----Original Message-----
> From: Jonathan Gray [mailto:[EMAIL PROTECTED]]
> Sent: Friday, December 17, 2010 11:13 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Results from a Map/Reduce
>
> Hey Peter,
>
> That System.exit line is nothing important, just the main thread waiting for
> the tasks to finish before closing.
>
> You're interested in having the MR job return a single result?  To do that, you
> would need to roll-up the processing done in each of your Map tasks into a
> single Reduce task.  With one reducer, you can have a single point to do the
> final aggregation of the result.
>
> I'm not sure exactly what kind of aggregation you are doing, but funneling
> into a single reducer can range from "no problem" to "don't even try it".  Sounds
> like you just want a final number or something, so it shouldn't be an issue.
>
> You might also consider doing your aggregations with coprocessors if you're
> into experimenting on HBase Trunk :)
>
> As for FirstKeyOnlyFilter:
>
> /**
>  * A filter that will only return the first KV from each row.
>  * <p>
>  * This filter can be used to more efficiently perform row count operations.
>  */
>
> That's what it does.  If you scan a table, regardless of what you ask for in the
> query, the filter will just return whatever the first KeyValue is on each row
> and will skip every other column/version/value of that row.
>
> Like it says, it's generally useful for doing row counting but that's about it.
>
> JG
>
> > -----Original Message-----
> > From: Peter Haidinyak [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, December 17, 2010 10:56 AM
> > To: [EMAIL PROTECTED]
> > Subject: Results from a Map/Reduce
> >
> > Hi, dumb question again.
> >   I have been using a Scan to return a result back to my client, which
> > works fine except when I am returning a million rows just to aggregate
> > the results.
> > The next logical step would be to do the aggregation in a Map/Reduce.
> > I've been looking at what samples I could find and they all seem to do this...
> >
> >     System.exit(job.waitForCompletion(true) ? 0 : 1);
> >
> > My question: is there a way to return a result from the job, similar
> > to getting a ResultScanner back and iterating through the results?
> >
> > Also, is there a good definition of what a 'FirstKeyOnlyFilter' does?
> >
> > Thanks
> >
> > -Pete
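
To make the FirstKeyOnlyFilter description in this thread concrete, here is a small plain-Java model of the behavior as described: only the first KeyValue of each row is returned, which is enough to count rows. This is a sketch, not the HBase implementation; the class FirstKeyOnlySketch, the in-memory map standing in for a table, and the string cells are all assumptions made for illustration. With the real client, the filter would presumably be attached to a scan with scan.setFilter(new FirstKeyOnlyFilter()).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstKeyOnlySketch {

    // Models the filter: for each row, keep only its first cell and
    // skip every other column/version/value of that row.
    static Map<String, String> firstKeyOnly(Map<String, List<String>> table) {
        Map<String, String> firstCells = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            if (!row.getValue().isEmpty()) {
                firstCells.put(row.getKey(), row.getValue().get(0));
            }
        }
        return firstCells;
    }

    // Row counting only needs one cell per row, so the filtered result suffices.
    static int rowCount(Map<String, List<String>> table) {
        return firstKeyOnly(table).size();
    }

    public static void main(String[] args) {
        // Hypothetical table: row key -> cells in stored order.
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("row1", Arrays.asList("cf:a=1", "cf:b=2", "cf:c=3"));
        table.put("row2", Arrays.asList("cf:a=9"));

        System.out.println(firstKeyOnly(table)); // prints {row1=cf:a=1, row2=cf:a=9}
        System.out.println(rowCount(table));     // prints 2
    }
}
```

The saving is in what gets shipped back: one cell per row instead of every column and version, which is why the filter is mainly useful for row counts.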