Re: Reduce task failing on job with error java.lang.IllegalStateException: Keys appended out-of-order
Something that may be worth thinking more about: whether you use
bulk or live ingest, col3 sorts lexicographically after col16.
You may not be able to output that ordering for bulk ingest without further
padding or encoding, as Bill said, but you may also not be happy getting your
data back in that order if you use live ingest. Either way, you may need to
think about the encoding/sorting issues.
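
To make the padding point concrete, here is a minimal sketch; the qualifier
names and the pad width of two are illustrative assumptions, not taken from
the job in question:

    import java.util.TreeSet;

    public class PadDemo {
        // Zero-pad the numeric suffix so lexicographic order matches numeric order.
        static String pad(String prefix, int n, int width) {
            return prefix + String.format("%0" + width + "d", n);
        }

        public static void main(String[] args) {
            TreeSet<String> raw = new TreeSet<String>();   // TreeSet sorts lexicographically
            raw.add("col3");
            raw.add("col16");
            System.out.println(raw);                       // [col16, col3] -- 16 before 3

            TreeSet<String> padded = new TreeSet<String>();
            padded.add(pad("col", 3, 2));                  // "col03"
            padded.add(pad("col", 16, 2));                 // "col16"
            System.out.println(padded);                    // [col03, col16] -- numeric order restored
        }
    }
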
On Thu, Dec 6, 2012 at 9:15 AM, Chris Burrell <[EMAIL PROTECTED]> wrote:

> Is this a limitation of the bulk ingest approach? Does the MapReduce job
> need to give the data to the AccumuloFileOutputFormat in
> a lexicographically-sorted manner? If so, is this not a rather big
> limitation of this approach, as you need to ensure your data comes in from
> your various data sources in a form such that the Accumulo keys are then
> sorted?
>
> This seems to suggest that although the bulk ingest itself would be very
> quick, you would lose most of the time saved sorting and adapting the
> source files in the MR job.
>
> Chris
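
For reference, the usual wiring for this approach looks roughly like the
sketch below; AccumuloFileOutputFormat writes RFiles, which is where the
sorted-key requirement comes from. The job name and output path are
placeholders, and the mapper/reducer setup is omitted:

    import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;

    public class BulkIngestJob {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "bulk-ingest");
            job.setJarByClass(BulkIngestJob.class);

            // Reduce output must be Key/Value pairs, written in sorted order,
            // because AccumuloFileOutputFormat produces RFiles.
            job.setOutputFormatClass(AccumuloFileOutputFormat.class);
            job.setOutputKeyClass(Key.class);
            job.setOutputValueClass(Value.class);

            // RFiles land here (placeholder path); afterwards they are handed
            // to the table, e.g. via TableOperations.importDirectory(...).
            AccumuloFileOutputFormat.setOutputPath(job, new Path("/tmp/bulk-output"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
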
>
>
>
> On 6 December 2012 14:08, William Slacum <[EMAIL PROTECTED]> wrote:
>
>> Excuse me, 'col3' sorts lexicographically *after* 'col16'.
>>
>>
>> On Thu, Dec 6, 2012 at 9:07 AM, William Slacum <[EMAIL PROTECTED]> wrote:
>>
>>> 'col3' sorts lexicographically before 'col16'. You'll either need to
>>> encode your numerics or zero-pad them.
>>>
>>>
>>> On Thu, Dec 6, 2012 at 9:03 AM, Andrew Catterall <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>> I am trying to run a bulk ingest to import data into Accumulo, but it is
>>>> failing at the reduce task with the error below:
>>>>
>>>>
>>>>
>>>> java.lang.IllegalStateException: Keys appended out-of-order.  New key
>>>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col3
>>>> [myVis] 9223372036854775807 false, previous key
>>>> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col16
>>>> [myVis] 9223372036854775807 false
>>>>
>>>>         at org.apache.accumulo.core.file.rfile.RFile$Writer.append(RFile.java:378)
>>>>
>>>>
>>>>
>>>> Could this be caused by the order in which the writes are being done?
>>>>
>>>>
>>>> -- Background
>>>>
>>>>
>>>> The input file is a tab-separated file.  A sample row would look like:
>>>>
>>>> Data1    Data2    Data3    Data4    Data5    …             DataN
>>>>
>>>>
>>>>
>>>> The map parses each row of the data into a Map<String, String>,
>>>> which will contain the following:
>>>>
>>>> Col1       Data1
>>>>
>>>> Col2       Data2
>>>>
>>>> Col3       Data3
>>>>
>>>> …
>>>>
>>>> ColN      DataN
>>>>
>>>>
>>>> An outputKey is then generated for this row in the format
>>>> client@timeStamp@randomUUID.
>>>>
>>>> Then for each entry in the Map<String, String> an outputValue is generated
>>>> in the format ColN|DataN.
>>>>
>>>> The outputKey and outputValue are written to Context.
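
As an aside, a hedged sketch of how such a row key could be built; the helper
name and the yyyyMMddHHmmss timestamp format are assumptions inferred from the
sample key in the stack trace above:

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.UUID;

    public class RowKeys {
        // Hypothetical helper: builds client@timeStamp@randomUUID,
        // e.g. client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a
        static String buildRowKey(String client) {
            String ts = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
            return client + "@" + ts + "@" + UUID.randomUUID();
        }
    }
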
>>>>
>>>>
>>>>
>>>> This completes successfully; however, the reduce task fails.
>>>>
>>>>
>>>> My ReduceClass is as follows:
>>>>
>>>>
>>>>
>>>>       public static class ReduceClass extends Reducer<Text, Text, Key, Value> {
>>>>
>>>>          public void reduce(Text key, Iterable<Text> keyValues,
>>>>                Context output) throws IOException, InterruptedException {
>>>>
>>>>                 // for each value belonging to the key
>>>>                 for (Text keyValue : keyValues) {
>>>>
>>>>                      // split the keyValue into Col and Data
>>>>                      String[] values = keyValue.toString().split("\\|");
>>>>
>>>>                      // Generate key: row is the grouped key, family "foo",
>>>>                      // qualifier ColN, visibility "myVis"
>>>>                      Key outputKey = new Key(key, new Text("foo"),
>>>>                            new Text(values[0]), new Text("myVis"));
>>>>
>>>>                      // Generate value from DataN
>>>>                      Value outputValue = new Value(values[1].getBytes(),
>>>>                            0, values[1].length());
>>>>
>>>>                      // Write to context
>>>>                      output.write(outputKey, outputValue);
>>>>                 }
>>>>          }
>>>>       }
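
Given Bill's diagnosis, one possible fix, sketched below under the assumption
that each row's qualifiers are unique: buffer the row's entries in a TreeMap
so they reach the RFile writer in Key-sorted order. This only fixes the append
order; zero-padding the qualifiers, as suggested earlier in the thread, is
still needed if col3 should come back before col16.

    import java.io.IOException;
    import java.util.Map;
    import java.util.TreeMap;

    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SortedReduceClass extends Reducer<Text, Text, Key, Value> {
        @Override
        public void reduce(Text key, Iterable<Text> keyValues, Context output)
                throws IOException, InterruptedException {
            // Key implements Comparable, so a TreeMap returns entries in the
            // sorted order RFile's writer expects within this row.
            TreeMap<Key, Value> sorted = new TreeMap<Key, Value>();
            for (Text keyValue : keyValues) {
                String[] values = keyValue.toString().split("\\|");
                sorted.put(new Key(key, new Text("foo"), new Text(values[0]),
                        new Text("myVis")), new Value(values[1].getBytes()));
            }
            for (Map.Entry<Key, Value> entry : sorted.entrySet()) {
                output.write(entry.getKey(), entry.getValue());
            }
        }
    }
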