-Speculative execution and TableOutputFormat
Bryan Keller 2011-09-10, 20:22
I believe there is a problem with Hadoop's speculative execution (which is on by default), and HBase's TableOutputFormat. If I understand correctly, speculative execution can launch the same task on multiple nodes, but only "commit" the one that finishes first. The other tasks that didn't complete are killed.
I encountered some strange behavior with speculative execution and TableOutputFormat. It looks like context.write() will submit the rows to HBase (when the write buffer is full). But there is no "rollback" if the task that submitted the rows did not finish first and is later killed. The rows remain submitted.
My particular job uses a partitioner so one node will process all records that match the partition. The reducer selects among the records and persists these to HBase. With speculative execution turned on, the reducer for the partition is actually run on 2 nodes, and both end up inserting into HBase, even though the second reducer is eventually killed. The results were not what I wanted.
Turning off speculative execution resolved my issue. I think this should be set off by default when using TableOutputFormat, unless there is a better way to handle this.