HBase >> mail # dev >> Hackathon notes 3/21/2011


Hackathon notes 3/21/2011
Dear HBase developers,

Last Monday, several HBase contributors met up at the StumbleUpon offices
for a bit of a hackathon. We spent the beginning of the day discussing a few
general topics, and then from about 11am through 7pm or so most of us
hunkered down to hacking on various projects. I was the secretary for the
morning, so here are the notes. Please excuse any typos or if I got your
name wrong - I was never cut out for stenography.

Thanks to those who came, and special thanks to the folks at StumbleUpon for
space, food, and beer!

Agenda:
 - Upcoming releases:
   - 0.90.2 - when to release? a few bugs
   - 0.91.x - should we do one?
   - 0.92.0 - when and what?
 - Next user group meetup?
 - Upcoming features:
   - Rolling restart improvements?
   - Online config change
   - Security and build issues
   - Distributed splitting
 - Maybe produce some code today! (power through above, then work on
respective priorities)

---

People:

 - Stack @ StumbleUpon
 - Todd @ Cloudera
 - Elliot @ NGMoco - using 0.89 in prod, 0.90.1 about to be rolled out
 - Ted Yu from CarrierIQ
 - Liyin and Nicolas from Facebook, using 0.89 for messaging product
 - Benoit from SU - TSDB
 - Mingjie, Eugene, Gary from TrendMicro - using some internal build
which is like trunk (security + coprocessors frankenbuild)
 - JD from SU
 - Prakash Khemani from FB - his group is on 0.90 - increment heavy workload
   - has a patch for distributed splitting
   - if a server goes down, takes 10-15 minutes to catch up, so wants
to reduce that time window
 - Marc, independent consultant with MetaMarkets right now - 0.90.1
"pseudo production" work
 - Ryan from StumbleUpon
-----

0.90.2:
 - next week? (week of 3/28?)
   - there are some bugs that need to be fixed still
   - candidate end of this week, then some time for testing
 - Stack has volunteered to be release manager

0.91.x:
 - should we do it?
   - people seem to think yes
   - but we shouldn't put much effort into testing these pre-release
   - there are a lot of interesting things in trunk that people might
want to play with

0.92.x:
 - JD would like to have something more than alpha quality in time
for Hadoop Summit (3rd or 4th week of June)
 - What are pending items?
   - Coprocessors
   - Online schema changes? Makes Coprocessors more useful
   - HBASE-1502 - removing heartbeats
   - HBASE-2856 - ACID fixes
   - Distributed splitting
 - Time based or feature based? We want to try being truly time based
 - May 1st for first release candidate

Next meetup:
 - some time in April? in south bay?

Features:
 - Rolling restart: Stack working on it
 - Online schema edit? FB finds it a pain point but Nicolas not sure
where it ranks on their priority list
 - Online config changes?
 - Online schema change is probably more important than online config
change, since config change can be done with rolling restart
   - For co-processors, we need to attack some classloading issues
before online schema change can really reload coprocessor
implementations
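
To illustrate why a config change can ride on a rolling restart, here is a
minimal sketch of the loop in Python. It is not the HBase script: the
`restart` and `is_healthy` callables and the polling parameters are
hypothetical placeholders for whatever a cluster actually uses to bounce a
region server and check that it is serving again.

```python
import time


def rolling_restart(servers, restart, is_healthy,
                    poll_interval=1.0, max_polls=60, sleep=time.sleep):
    """Restart servers one at a time, waiting for each to report healthy.

    `restart` and `is_healthy` are hypothetical callables supplied by the
    caller; a restarted server is assumed to pick up the new config.
    """
    restarted = []
    for server in servers:
        restart(server)
        for _ in range(max_polls):
            if is_healthy(server):
                break
            sleep(poll_interval)
        else:
            # Stop here so at most one server is down at any time.
            raise RuntimeError(
                f"{server} did not come back; restarted so far: {restarted}")
        restarted.append(server)
    return restarted
```

Because the loop halts on the first server that fails to return, a bad
config takes out one server rather than the whole cluster.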

Security and build:
 - Security code has been isolated as much as possible:
   - two separate layers:
     - RPC layer does secure RPC - pluggable RPC implementation and
subclassing for HBaseServer and Client classes
     - Loadable coprocessors for auth
 - But building is difficult - need to build against a secure Hadoop
in order to do this
   - conditional build step? maven module?
 - Stack and Gary will look into how to build and release this:
   - maybe Maven profiles? modules?
   - separate jar to be added to classpath with stuff that depends on
security

Distributed splitting:
 - HLogSplitter code is pretty different on FB's 0.90 branch
 - But most stuff plugs easily into trunk
 - Same interface:
   - call splitLog with server name
   - master uses SplitLogManager - puts log splitting tasks in ZK
   - each RS has SplitLogWorkers - watch for tasks, race to grab them in ZK
   - each RS splits logs one at a time
   - RS pings the master on the tasks as it splits them
   - master can preempt a task away from a worker
   - when master comes up it needs to grab orphaned tasks
 - some unit tests done, but hasn't been substantially tested on real
cluster
 - Current splitting does batching - multiple input logs go to one
output file per region
   - new splitting creates 3-4x as many files for recovered.edits
   - this is OK - we already handle this with seqids
 - If whole cluster goes down, something like MapReduce makes more sense
 - this feature is targeted towards single-RS failure
   - currently seeing downtime of 10 minutes when RS goes down
   - FB has various internal scripts/tools ("HyperShell") that let
them do the full-cluster-failure case efficiently, but they don't have
a clean way of open sourcing it
   - Maybe we can build something like this with hbase-regionservers.sh
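
The task hand-off described above (master publishes tasks, workers race to
claim them, workers ping, master can preempt) can be sketched as a small
in-memory model in Python. This is an illustration only and not the
SplitLogManager / SplitLogWorker code: the `TaskBoard` class, its method
names, and the timeout are all hypothetical, and a lock stands in for the
atomic updates that ZooKeeper would provide.

```python
import threading
import time


class TaskBoard:
    """In-memory stand-in for the split tasks kept in ZK (hypothetical)."""

    def __init__(self, timeout=5.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._tasks = {}  # log path -> {"owner", "beat", "done"}
        self._timeout = timeout
        self._clock = clock

    def publish(self, log_path):
        """Master side: publish one log file as an unowned split task."""
        with self._lock:
            self._tasks.setdefault(
                log_path, {"owner": None, "beat": None, "done": False})

    def try_claim(self, log_path, worker):
        """Worker side: atomically grab a task; only one racer wins."""
        with self._lock:
            task = self._tasks[log_path]
            if task["done"] or task["owner"] is not None:
                return False
            task["owner"], task["beat"] = worker, self._clock()
            return True

    def heartbeat(self, log_path, worker):
        """Worker side: ping; False means the task was taken away."""
        with self._lock:
            task = self._tasks[log_path]
            if task["owner"] != worker:
                return False
            task["beat"] = self._clock()
            return True

    def finish(self, log_path, worker):
        """Worker side: mark the task done if this worker still owns it."""
        with self._lock:
            task = self._tasks[log_path]
            if task["owner"] != worker:
                return False
            task["done"] = True
            return True

    def preempt_stale(self):
        """Master side: release tasks whose owner stopped pinging."""
        now = self._clock()
        released = []
        with self._lock:
            for path, task in self._tasks.items():
                stale = (not task["done"] and task["owner"] is not None
                         and now - task["beat"] > self._timeout)
                if stale:
                    task["owner"], task["beat"] = None, None
                    released.append(path)
        return released

    def unowned(self):
        """Master side: unfinished tasks nobody owns, e.g. after a restart."""
        with self._lock:
            return [p for p, t in self._tasks.items()
                    if not t["done"] and t["owner"] is None]
```

In this model the board outlives any single master, so a master that comes
up later can call `unowned()` and `preempt_stale()` to find tasks left
behind by a previous master or a dead worker.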

What are we working on:
 - Todd - maybe making YCSB runnable as integration test
 - Stack - rolling restart? with Nicolas's help perhaps
 - Marc - add some new cases to hbck
 - Ryan - maybe porting RPC to Thrift?
   - wants to resolve the meta-in-ZK ticket as "wontfix"
 - Prakash - distributed splitting
 - JD - fix bugs he saw over the weekend
 - Gary - work on splitting out security build (maven pom file fun)
 - Eugene: ZK-938 - kerberos stuff for ZooKeeper (necessary for HBase
security)
   - or maybe just fix some open bugs in HBase
 - Mingjie: open bugs for secure HBase (Access Control related)
 - Benoit: busy working on StumbleUpon stuff - mostly just observing
 - Nicolas: multithreaded compactions - needs to be refactored and cleaned
up
   - they have very big storefiles (10GB+) so their compactions take 1hr+
   - or just talking to people about stuff - easier than IRC
 - Liyin - add ability to do ZK miniclusters with multiple ZKs
 - Ted - working on pending patches / testing
 - Elliot: HBASE-3541 - HBase rest multigets

Todd Lipcon