Accumulo, mail # user - Reduce task failing on job with error java.lang.IllegalStateException: Keys appended out-of-order


Andrew Catterall 2012-12-06, 14:03
Re: Reduce task failing on job with error java.lang.IllegalStateException: Keys appended out-of-order
William Slacum 2012-12-06, 14:07
Column qualifiers are compared lexicographically (byte by byte), not
numerically, so 'col3' and 'col16' do not sort in numeric order. You'll
either need to encode your numerics or zero-pad them.
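
To illustrate the zero-padding suggestion, here is a minimal sketch in plain
Java (no Accumulo classes involved). The class name QualifierPadding, the
helpers pad and sorted, and the width of 4 digits are assumptions made for
illustration only; the width has to be large enough for the highest column
number. String.compareTo is used as a stand-in for Accumulo's byte-wise
comparison, which gives the same order for ASCII qualifiers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class QualifierPadding {

    // Hypothetical helper: turns ("col", 3, 4) into "col0003".
    // The width is an assumption; it must cover the largest column number.
    public static String pad(String prefix, int n, int width) {
        return prefix + String.format("%0" + width + "d", n);
    }

    // Returns a copy of the qualifiers sorted as plain strings,
    // i.e. lexicographically.
    public static List<String> sorted(List<String> qualifiers) {
        List<String> copy = new ArrayList<>(qualifiers);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>();
        List<String> padded = new ArrayList<>();
        for (int n : new int[] {2, 3, 16}) {
            plain.add("col" + n);
            padded.add(pad("col", n, 4));
        }
        // Unpadded: lexicographic order differs from numeric order.
        System.out.println(sorted(plain));
        // Padded: lexicographic order and numeric order agree.
        System.out.println(sorted(padded));
    }
}
```

With padding in place, emitting the columns in numeric order also emits them
in the order the file writer expects.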

On Thu, Dec 6, 2012 at 9:03 AM, Andrew Catterall <[EMAIL PROTECTED]> wrote:

> Hi,
>
>
> I am trying to run a bulk ingest to import data into Accumulo but it is
> failing at the reduce task with the below error:
>
>
>
> java.lang.IllegalStateException: Keys appended out-of-order.  New key
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col3 [myVis]
> 9223372036854775807 false, previous key client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a
> foo:col16 [myVis] 9223372036854775807 false
>
>         at org.apache.accumulo.core.file.rfile.RFile$Writer.append(RFile.java:378)
>
>
>
> Could this be caused by the order in which the writes are being done?
>
>
> -- Background
>
> The input file is a tab separated file.  A sample row would look like:
>
> Data1    Data2    Data3    Data4    Data5    …             DataN
>
>
>
> The map parses the data, for each row, into a Map<String, String>.  This
> will contain the following:
>
> Col1       Data1
>
> Col2       Data2
>
> Col3       Data3
>
> …
>
> ColN      DataN
>
>
> An outputKey is then generated for this row in the format
> client@timeStamp@randomUUID
>
> Then for each entry in Map<String, String> an outputValue is generated in
> the format ColN|DataN
>
> The outputKey and outputValue are written to Context.
>
>
>
> This completes successfully; however, the reduce task fails.
>
>
> My ReduceClass is as follows:
>
>
>
>       public static class ReduceClass extends Reducer<Text,Text,Key,Value> {
>
>          public void reduce(Text key, Iterable<Text> keyValues, Context output)
>                  throws IOException, InterruptedException {
>
>                 // for each value belonging to the key
>                 for (Text keyValue : keyValues) {
>
>                      // split the keyValue into Col and Data
>                      String[] values = keyValue.toString().split("\\|");
>
>                      // Generate key
>                      Key outputKey = new Key(key, new Text("foo"),
>                              new Text(values[0]), new Text("myVis"));
>
>                      // Generate value
>                      Value outputValue = new Value(values[1].getBytes(),
>                              0, values[1].length());
>
>                      // Write to context
>                      output.write(outputKey, outputValue);
>                 }
>          }
>       }
>
>
>
>
> -- Expected output
>
>
>
> I am expecting the contents of the Accumulo table to be as follows:
>
>
>
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col1 [myVis]
> Data1
>
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col2 [myVis]
> Data2
>
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col3 [myVis]
> Data3
>
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col4 [myVis]
> Data4
>
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col5 [myVis]
> Data5
>
> …
>
> client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:ColN [myVis]
> DataN
>
>
>
>
>
> Thanks,
>
> Andrew
>
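
On the question of write order: MapReduce sorts by key but, absent a
secondary sort, makes no promise about the order of the values handed to a
single reduce() call, while the file writer in the stack trace rejects keys
that arrive out of order. One possible approach, sketched below in plain
Java, is to buffer the ColN|DataN pairs of a row in a TreeMap and emit them
in sorted order. The class and method names are hypothetical, the Accumulo
Key/Value construction and the output.write call are left out so that the
sketch runs on its own, and it assumes that every value contains a '|', that
column names are unique within a row, and that all columns of one row fit in
memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedRowEmitter {

    // Hypothetical stand-in for the body of reduce(): takes the "ColN|DataN"
    // values of one row in arbitrary order and returns them ordered by
    // column qualifier, the order in which they would be written.
    public static List<String> emitInOrder(Iterable<String> keyValues) {
        // TreeMap keeps its keys sorted by String.compareTo, which matches
        // byte-wise ordering for ASCII qualifiers.
        Map<String, String> sortedColumns = new TreeMap<>();
        for (String keyValue : keyValues) {
            // Limit of 2 so that a '|' inside the data is preserved.
            String[] parts = keyValue.split("\\|", 2);
            sortedColumns.put(parts[0], parts[1]);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : sortedColumns.entrySet()) {
            // In a real reducer this is where the Key and Value would be
            // built and passed to output.write(...).
            out.add(e.getKey() + " -> " + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> arrival = Arrays.asList("col3|Data3", "col16|Data16", "col1|Data1");
        for (String line : emitInOrder(arrival)) {
            System.out.println(line);
        }
    }
}
```

Sorting in the reducer and zero-padding address different things: the sort
makes the emitted order valid for the writer, while padding makes that order
coincide with the numeric column order.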
William Slacum 2012-12-06, 14:08
Chris Burrell 2012-12-06, 14:15
Josh Elser 2012-12-06, 15:15
Chris Burrell 2012-12-06, 18:35
Josh Elser 2012-12-07, 03:33
Michael Flester 2012-12-07, 03:34