HBase, mail # dev - Meeting about hbase 0.95 release


Meeting about hbase 0.95 release
Jonathan Hsieh 2013-05-06, 18:43
These are my rough notes from a meatspace meeting that was held on 5/3/13

tl;dr a large group of folks pushing on the 0.95 release went through the
myriad of jirae and hashed out which ones we should try to get into the
next 0.95 release, and bumped several issues from the 0.95.1 release.

There was also a substantial discussion about getting hadoop2 tests to pass
(and understanding why they currently fail) and also about the new table
namespacing feature.

Other topics: more people please review, BoF session for Hadoop Summit,
Integration testing suite.

----

(notes get sparser towards the end).

5/3/13
Location: Hortonworks HQ, Palo Alto, CA
Who:  Stack, JD, Jon H, Jimmy X, Himanshu, Dave W, Elliott C, Ted Y, Enis,
Deveraj, Sergey, Jeff Z, (Matteo, Nicholas L on the phone).

Went through blockers and critical lists (goal, resolve or nix before
release):

Blocker:
HBASE-5995 - log roll on dn pipeline restart (in progress)

Critical:
HBASE-8366 logs full trace stuff.
HBASE-5923 check and clean - nix
HBASE-8450 - hbase-defaults.xml updates - stack to do, everyone please look
at it
HBASE-7997 - on last class moves. - not critical
HBASE-6891 - hadoop2 - investigate whether the speed regression is in
hadoop2 or hadoop1.
* make unit tests pass right now.  regression investigation not blocking.
* HBASE-8337 Hadoop2  - SCR problem - can resolve.  have reasonable reason
HBASE-7006 - mttr - distributed log replay - give feature good name and
close it out.
HBASE-3787 - increment non-idempotent - sergey - working on tests.
HBASE-8449 - recover lease "fun and games" - stack working on it.
HBASE-7932 - region locations in meta.
HBASE-7897 - cells and tags. change interface. cell interface must be
hardened in 0.96. - big for intel
HBASE-8483 - zk leak.

Selected Majors to bump to critical?:
HBASE-8143 - short circuit read.  OOM.  every reader 1MB buffer, 600 block
readers, then lots of memory used.
- move to critical - enis: maybe set to hfile block size
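To put rough numbers on the HBASE-8143 note, here is a back-of-the-envelope sketch. The 600 readers and the 1MB buffer come from the notes above; the 64KB figure is an assumption (the usual default hfile block size), not something stated in the meeting.

```python
# Rough buffer arithmetic for short circuit reads (illustrative only).
KIB = 1024
MIB = 1024 * KIB


def total_buffer_bytes(readers, buffer_bytes):
    """Total memory held if every block reader keeps its own buffer."""
    return readers * buffer_bytes


readers = 600  # number of block readers, from the notes

# Each reader with a 1MB buffer, as described in the notes.
current = total_buffer_bytes(readers, 1 * MIB)

# Assumption: buffer sized to a 64KB hfile block instead.
proposed = total_buffer_bytes(readers, 64 * KIB)

print(current / MIB)   # 600.0
print(proposed / MIB)  # 37.5
```

So 600 readers hold about 600MB with 1MB buffers, against about 37.5MB under the assumed 64KB block size.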

Would be good to fix:
HBASE-8385 -
HBASE-7910 -
HBASE-7391 -
HBASE-7709 - infinite loop in m/m replication - would be good to fix.
HBASE-7564 - replication refactor. -- if done done, if not leave out.
HBASE-7958 - stats per column family per-region - remove from 0.95.1
HBASE-6294 - usability, nice to fix.
HBASE-8479 - compile issue - generics problem.
HBASE-4050 - metrics to metric2 umbrella blocked by HBASE-7074 - docs for
hbase-4050
HBASE-6580 htable pool broken - leaving in for now.
HBASE-7839 - dead machine in integration - nice to have
HBASE-7840 - nice to have

----
Enis:

HBASE-8015 namespaces  (discussion below)
- file system structure likely changes things, ideally before 0.96
HBASE-6721 RS grouping stuff - can do with coproc and load balancer plugins
(not blocking).
HBASE-7999 system tables? (deprecated if namespaces added)

namespace descriptor -
- distribute in cluster.  similar to zk permission watcher

'.' vs some other separator char for <namespace><sep><table>
- '.' is what db's use, but is valid hbase table name and breaks hbase
tables
- other chars would look weird but don't break existing hbase tables.
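A small sketch of the separator problem, for illustration only. The regex approximates the existing table name rule and is an assumption, the table name "web.clicks" is made up, and ':' is just a stand-in for "some other char" (the notes do not pick one).

```python
import re

# Assumption: approximation of the current table name rule, where '.'
# is a legal character inside a name.
LEGACY_TABLE_NAME = re.compile(r"^[A-Za-z0-9_][A-Za-z0-9_.-]*$")


def split_qualified(name, sep):
    """Split <namespace><sep><table> at the first separator.

    Returns (namespace, table); namespace is None when sep is absent.
    """
    if sep in name:
        namespace, table = name.split(sep, 1)
        return namespace, table
    return None, name


existing = "web.clicks"  # hypothetical table created before namespaces
assert LEGACY_TABLE_NAME.match(existing)

# With '.', the old table is misread as table "clicks" in namespace "web".
print(split_qualified(existing, "."))  # ('web', 'clicks')

# With a char that is not legal in old names, the old table is untouched.
print(split_qualified(existing, ":"))  # (None, 'web.clicks')
```

Under these assumptions, a '.' separator cannot tell an existing dotted table name from a namespaced one, which is the breakage noted above.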

the proposed dir structure move has impact on upgrades
- do a shutdown restart upgrade?
- Discussed potential approach where empty name space files don't move (and
don't break rolling upgrade)
- discussed how to deal with matteo's hfile pool approach (one dir with
hfiles, meta just has pointers, not dependent on dir structure)

how does this interact with rolling upgrades?
how does this interact with hdfs quotas?

stack:
- not in 0.95 unless progress is made.
- Let's try to get it committed to a branch ala snapshots

----

patch available - discussion
- please do more reviews.

----

integration tests
- alex 94
- elliott 95

run via maven against distributed cluster
- system test framework
- cherry-pick, run via command line tool.
- 0.94 chaos monkey - every 1 in 5 fails
- 0.96 - 1 in 10 fails.

machines don't come back up.

4 tests are good
- test too many regions
- want tests to scale to scale
- want to kill less -- not completely random.

bigtop does this
- put in maven.
recover lease thing
locality improvements
Hadoop summit
- day before - birds of a feather
compactions

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]