HBase >> mail # user >> Occasional regionserver crashes following socket errors writing to HDFS


Re: Occasional regionserver crashes following socket errors writing to HDFS
Hmmm.

That could be. I don't know what Doug wrote except that I knew he mentioned he updated the docs on it.

This is really kind of a basic issue.  It just makes sense.
As you already point out, you and Andrew already noticed this back in 2009 and 2010.
I just don't think you took it far enough. Essentially HBase can be used in place of a reducer. In terms of a M/R job, M/M using HBase is going to be more efficient. (Assuming that you are already running HBase.)   I really can't see any reason to use a reducer when using HBase.
Maybe I'm being stupid, but every example I've looked at, you can refactor it to not use a reducer.
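
To make the point concrete, here is a minimal sketch of the refactoring, using a word count as the example. It is only an illustration under stated assumptions: FakeTable is a hypothetical in-memory stand-in for an HBase table, not the HBase client API, and the function names are made up for this note. In a real job the mappers would write to the table themselves (for instance with increments or puts) and the job would run with zero reducers.

```python
from collections import defaultdict


class FakeTable:
    """Hypothetical stand-in for an HBase table: a keyed store with increments."""

    def __init__(self):
        self.rows = defaultdict(int)

    def increment(self, row_key, amount=1):
        # In this sketch the table itself does the aggregation per row key.
        self.rows[row_key] += amount


def count_with_reducer(splits):
    """Classic M/R shape: map emits (word, 1), shuffle groups, reduce sums."""
    shuffled = defaultdict(list)
    for split in splits:                  # one mapper per input split
        for line in split:
            for word in line.split():
                shuffled[word].append(1)  # map output, grouped by the shuffle
    # Reduce phase: one sum per key.
    return {word: sum(ones) for word, ones in shuffled.items()}


def count_map_only(splits, table):
    """Map-only shape: each mapper writes straight to the table, no reduce phase."""
    for split in splits:
        for line in split:
            for word in line.split():
                table.increment(word)     # the keyed store replaces shuffle + reduce
    return dict(table.rows)


splits = [["a b a"], ["b c"]]
with_reducer = count_with_reducer(splits)
map_only = count_map_only(splits, FakeTable())
print(with_reducer == map_only)
```

Both versions end up with the same counts; the difference is that the map-only version skips the shuffle and sort, which is where the efficiency argument comes from.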

I also think you may read a bit more into my posts than I intend. ;-)

-Mike

On May 10, 2012, at 10:28 PM, Stack wrote:

> On Thu, May 10, 2012 at 6:28 PM, Michael Segel
> <[EMAIL PROTECTED]> wrote:
>> That section was written by Doug after he and I had the same debate many moons ago.
>
>
> I'm not sure that is correct.  If you git blame that section, you'll
> see that stack and andrew are the authors and that the edits were made
> in 2009 and 2010.
>
> There is this section in the book but it doesn't seem to have the
> benefit of your input:
> http://hbase.apache.org/book.html#mapreduce.example.summary.noreducer
>
>
>> While I can't say with absolute certainty that you shouldn't use a reducer, what I can say is that in every situation where I have seen an M/R job writing to HBase, you end up not wanting to use a reducer. If you want a clear and concise statement, you can say that the rule of thumb is that you don't want to use a reducer, and that cases where you would need to first use a reducer are the rare exception.
>>
>
> Please file an issue w/ a patch.  It'd be good to get your experience
> into the doc.
>
>> The reason I ask people to think about this topic is that unless you have a really good foundation in databases, not relying on a reducer is a bit counterintuitive. (Which is why I said that you really need to clear your mind and focus on this issue.)
>>
>
> Let's make it so that if you don't have a foundation in dbs, if you
> read the doc., you won't need such a background to get the best of
> hbase.
>
>> PS. If you care to read the thread, I didn't become condescending until a certain individual piped up about how refactoring the M/R was a 'distraction' to the issue at hand.
>> Not to mention his flip response w the Google paper?
>>
>
> There are a few problems w/ the above.
>
> + You presume I did not read the thread before responding
> + That the condescending tone started after Dave's intercessions (I
> was not referring to this thread only).
>
> Michael, fellas like you help move the hbase story along.   Generally,
> I see that you do a great job in this forum and in others.  In my
> previous note, I was just trying to give a pointer that what you might
> consider jest, others can read as condescending or sarcasm.
>
> St.Ack
>