Re: Errors after major compaction
> Well, the master doesn't know that s05 has the region open -- thats
> why it gives it to s02 -- and then, there is no channel available to
> s05 to figure who has what

The way I see it, that's the root of the problem. It would probably
make sense for each RS to be able to figure this out independently of
the master. I don't see a way to do that other than storing the region
assignments in a central "reliable" location (read: ZK). Each RS would
register itself there when it opens a region and constantly monitor
the assignments of the regions it holds, looking for other RSs that
registered the same region. When that happens, the RSs can either try
to work out which one should own the region, or both close it and let
the master select a new RS. This is obviously a rough idea that needs
more polishing, such as how to handle stale records left by dead
servers, but it's the only way I can think of to guarantee there is no
double assignment, other than using broadcasts and election
algorithms.
I can work out the details if people think it's interesting. There's
also a discussion about it in HBASE-4060.
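
To make the idea a bit more concrete, here is a rough sketch of the
bookkeeping only. It is not HBase or ZooKeeper code: the class
RegionRegistry and all of its methods are made-up names, and an
in-memory map stands in for the per-region records that would live in
ZK. It shows registration on open, the check each RS would run for
the regions it holds, and the "everyone closes" way of resolving a
conflict.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the central registry (ZK in the proposal).
class RegionRegistry {
    // region name -> servers that currently claim to have it open
    private final Map<String, Set<String>> claims = new ConcurrentHashMap<>();

    // Called by a region server when it opens a region.
    void register(String region, String server) {
        claims.computeIfAbsent(region, r -> ConcurrentHashMap.newKeySet()).add(server);
    }

    // Called by a region server when it closes a region.
    void unregister(String region, String server) {
        Set<String> owners = claims.get(region);
        if (owners != null) {
            owners.remove(server);
        }
    }

    // The periodic check: which OTHER servers claim this region?
    // An empty result means there is no double assignment.
    Set<String> conflictingOwners(String region, String server) {
        Set<String> empty = Collections.emptySet();
        Set<String> others = new TreeSet<String>(claims.getOrDefault(region, empty));
        others.remove(server);
        return others;
    }

    // One possible resolution: drop every claim so that all holders
    // close the region and the master can pick a new server.
    // Returns the servers that have to close it.
    Set<String> closeAll(String region) {
        Set<String> owners = claims.remove(region);
        return owners == null ? new TreeSet<String>() : new TreeSet<String>(owners);
    }
}
```

In the scenario quoted above, s05 would find s02 listed when it
checks the region and could then trigger closeAll. Cleaning up the
records of dead servers is left out here; with ZK, ephemeral nodes
would presumably take care of most of that.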