Reduce task failing on job with error java.lang.IllegalStateException: Keys appended out-of-order
Hi,
I am trying to run a bulk ingest to import data into Accumulo, but it is
failing at the reduce task with the error below:

java.lang.IllegalStateException: Keys appended out-of-order.  New key
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col3 [myVis] 9223372036854775807 false,
previous key
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col16 [myVis] 9223372036854775807 false

        at org.apache.accumulo.core.file.rfile.RFile$Writer.append(RFile.java:378)

Could this be caused by the order in which the writes are being done?
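
For reference, the RFile writer requires keys to be appended in sorted order, and the ordering of two keys can be checked directly with Key.compareTo. A minimal, purely illustrative sketch using the row and columns named in the exception (this is not part of the job itself):

import org.apache.accumulo.core.data.Key;
import org.apache.hadoop.io.Text;

public class KeyOrderCheck {
    public static void main(String[] args) {
        Text row = new Text("client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a");

        // The two keys named in the exception message
        Key previous = new Key(row, new Text("foo"), new Text("col16"), new Text("myVis"));
        Key next     = new Key(row, new Text("foo"), new Text("col3"),  new Text("myVis"));

        // Negative: 'previous' sorts before 'next', so appending 'next' after
        // 'previous' is in order. Positive: appending 'next' would be out of order.
        System.out.println(previous.compareTo(next));
    }
}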
-- Background
The input file is a tab-separated file.  A sample row would look like:

Data1    Data2    Data3    Data4    Data5    …             DataN

The map parses each row into a Map<String, String>, which will contain the
following:

Col1       Data1
Col2       Data2
Col3       Data3
…
ColN       DataN
An outputKey is then generated for this row in the format *client@timeStamp@randomUUID*.

Then, for each entry in the Map<String, String>, an outputValue is generated in
the format *ColN|DataN*.

The outputKey and outputValue are written to Context.
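
For illustration, here is a minimal sketch of the map side as described above; the class name MapClass, the tab parsing, and the timestamp formatting are assumptions rather than the actual code from the job:

// imports for the enclosing job class
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.UUID;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {

        // Split the tab-separated row into Data1 .. DataN
        String[] fields = line.toString().split("\t");

        // Build the row key in the format client@timeStamp@randomUUID
        String rowId = "client@"
                + new SimpleDateFormat("yyyyMMddHHmmss").format(new Date())
                + "@" + UUID.randomUUID();
        Text outputKey = new Text(rowId);

        // Emit one ColN|DataN value per field
        for (int i = 0; i < fields.length; i++) {
            context.write(outputKey, new Text("Col" + (i + 1) + "|" + fields[i]));
        }
    }
}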

The map completes successfully; however, the reduce task fails.
My ReduceClass is as follows:

public static class ReduceClass extends Reducer<Text, Text, Key, Value> {

    @Override
    public void reduce(Text key, Iterable<Text> keyValues, Context output)
            throws IOException, InterruptedException {

        // for each value belonging to the key
        for (Text keyValue : keyValues) {

            // split the keyValue into Col and Data
            String[] values = keyValue.toString().split("\\|");

            // Generate key
            Key outputKey = new Key(key, new Text("foo"), new Text(values[0]), new Text("myVis"));

            // Generate value
            Value outputValue = new Value(values[1].getBytes(), 0, values[1].length());

            // Write to context
            output.write(outputKey, outputValue);
        }
    }
}
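
For context, a rough sketch of how a job like this might be wired up to produce RFiles for bulk import; the driver class name, input/output paths, and job name below are placeholders and assumptions, not the exact configuration used here:

import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkIngestJob {                              // hypothetical driver class
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "bulk-ingest");           // assumed job name
        job.setJarByClass(BulkIngestJob.class);

        job.setMapperClass(MapClass.class);               // map described above
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        job.setReducerClass(ReduceClass.class);
        job.setOutputKeyClass(Key.class);
        job.setOutputValueClass(Value.class);

        // The reducer writes sorted Key/Value pairs into RFiles for bulk import
        job.setOutputFormatClass(AccumuloFileOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path("/user/andrew/input"));  // assumed path
        FileOutputFormat.setOutputPath(job, new Path("/user/andrew/bulk"));  // assumed path

        job.waitForCompletion(true);
    }
}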
-- Expected output

I am expecting the contents of the Accumulo table to be as follows:

client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col1 [myVis] Data1
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col2 [myVis] Data2
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col3 [myVis] Data3
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col4 [myVis] Data4
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col5 [myVis] Data5
…
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:ColN [myVis] DataN

Thanks,

Andrew