On 23/09/10 19:25, Renaud Delbru wrote:
> On 23/09/10 19:22, Stack wrote:
>>> Will the TotalOrderPartitioner found in the hadoop library not work for
>>> 0.20.x ?
>> You might have to do what Todd did in TRUNK where he brought over the
>> 'mapred' TotalOrderPartitioner to go against the new 'mapreduce' API
>> (The bulk load is done against the hadoop 'new' API 'mapreduce' as
>> opposed to 'mapred' package). You might even be able to just copy
>> what Todd has done in trunk over to your 0.20 install?
> Yes, it is what we did, and it seems to work.
The job failed because the TotalOrderPartitioner requires a
partitions.lst file, which should contain the list of start keys for
each region. However, since we are building the table from scratch,
we do not yet know the start keys of each partition. Is there a way
to bypass this, or do we first need to run a scan over our data
collection to create this partition list?
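
If a pre-scan does turn out to be necessary, the key-selection part
could look roughly like the sketch below. It is plain Java with no
Hadoop dependency, and it only illustrates the idea: sort a sample of
row keys and pick evenly spaced ones as region boundaries. The class
name SplitPointSketch and the method chooseSplitPoints are made up for
this example, the keys are assumed to be ASCII strings (so that String
ordering matches byte ordering), and writing the result out as the
partitions file is not shown. For N partitions the
TotalOrderPartitioner expects N-1 sorted split keys, which is what the
method returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Hadoop or HBase.
class SplitPointSketch {

    // Returns numRegions - 1 split keys chosen at evenly spaced
    // positions among the sorted, de-duplicated sampled keys.
    static List<String> chooseSplitPoints(List<String> sampledKeys, int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be >= 1");
        }
        // TreeSet sorts the sample and drops duplicate keys.
        List<String> sorted = new ArrayList<>(new TreeSet<>(sampledKeys));
        if (sorted.size() < numRegions) {
            throw new IllegalArgumentException(
                "need at least " + numRegions + " distinct keys, got " + sorted.size());
        }
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            // Integer division picks the key at the i-th quantile boundary.
            splits.add(sorted.get(i * sorted.size() / numRegions));
        }
        return splits;
    }
}
```

For example, with the sampled keys "a" to "h" and 4 regions, this
picks "c", "e" and "g" as split points. The first region implicitly
starts before "c", so no entry is emitted for it.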