HBase >> mail # user >> Occasional regionserver crashes following socket errors writing to HDFS


Re: Occasional regionserver crashes following socket errors writing to HDFS
Dave, do you really want to go there?

OP has a couple of issues and he was going down a rabbit hole.
(You can choose whether that's a reference to The Matrix, Jefferson Starship, Alice in Wonderland... or all of the above.)

So to put him on the correct path, I recommended the following, not in any order...

1) Increase the region size for this table only.
2) Look at decreasing the number of regions managed by each RS (which is why you increase the region size).
3) Up dfs.balance.bandwidthPerSec. (How often does HBase move regions, and how exactly does it move them?)
4) Look at enabling MSLAB and tuning GC. This cuts down on the overhead.
5) Refactor his job....
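
To make points 1 and 2 concrete, here is a rough back-of-the-envelope sketch of how region size drives the number of regions each RS has to manage. The table size, region sizes, and server count are made-up numbers for illustration, and the class and method names are hypothetical. It also assumes regions fill up completely and spread evenly across servers, which a real cluster will not do exactly.

```java
// Hypothetical helper: estimates region counts from table size and region size.
// All figures are illustrative assumptions, not measurements from the OP's cluster.
public class RegionSizing {

    // Number of regions needed to hold tableMb of data when each region
    // holds at most regionMb (ceiling division).
    static long regionCount(long tableMb, long regionMb) {
        return (tableMb + regionMb - 1) / regionMb;
    }

    // Regions per region server, assuming an even spread (ceiling division).
    static long regionsPerServer(long regions, long servers) {
        return (regions + servers - 1) / servers;
    }

    public static void main(String[] args) {
        long tableMb = 2L * 1024 * 1024; // assumed 2 TB table
        long servers = 10;               // assumed 10 region servers

        long smallRegions = regionCount(tableMb, 256);   // 256 MB regions
        long largeRegions = regionCount(tableMb, 4096);  // 4 GB regions

        System.out.println("256 MB regions: " + smallRegions
                + " total, " + regionsPerServer(smallRegions, servers) + " per RS");
        System.out.println("4 GB regions:   " + largeRegions
                + " total, " + regionsPerServer(largeRegions, servers) + " per RS");
    }
}
```

Under these assumptions, a 16x larger region size means roughly 16x fewer regions per RS for the same data.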

Oops.
Ok I didn't put that in the list.
But that was the last thing I wrote as a separate statement.
Clearly you didn't take my advice and think about the problem....

To prove a point.... you wrote:
'Many mapreduce algorithms require a reduce phase (e.g. sorting)'

Ok. So tell me why you would want to sort your input into HBase, and whether that's really a good thing?
Oops!... :-)
On May 10, 2012, at 12:31 PM, Dave Revell wrote:
> This "you don't need a reducer" conversation is distracting from the real
> problem and is false.
>
> Many mapreduce algorithms require a reduce phase (e.g. sorting). The fact
> that the output is written to HBase or somewhere else is irrelevant.
>
> -Dave
>
> On Thu, May 10, 2012 at 6:26 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
> [SNIP]
