Re: OutOfMemoryError: Java heap space after data load
Hi Eric,
Yes, I loaded 4.5 million entries with the shell to verify that things were
working properly.  With three shells running, and 99% of the data going to
a single TabletServer (due to my quick-hack row key structure, which I'm
changing to better mimic what the real row key structure will be), it
ingested 1,400 entries per second.  Again, it flushed every 10,000 records
(30,000 entries).
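
(The hotspotting, by the way, comes from every row key leading with the
current timestamp, so new rows all sort into the last tablet.  The usual
fix is to lead with some kind of shard or salt.  Purely as a hypothetical
illustration, in Java, since that's what the real loaders will be written
in; the shard count and key format below are made up, not our real design:

public class ShardedKey {
    // Hypothetical: 8 shards, chosen arbitrarily for illustration.
    static final int SHARDS = 8;

    // Prefix a stable shard id derived from the vehicle id so that
    // time-ordered writes fan out across SHARDS tablets instead of
    // piling into the last one.
    static String shardedRow(String timestamp, String vehicle, String stream) {
        int shard = (vehicle.hashCode() & 0x7fffffff) % SHARDS;
        // Produces a key of the form "NN:<timestamp>-<vehicle>-<stream>".
        return String.format("%02d:%s-%s-%s", shard, timestamp, vehicle, stream);
    }

    public static void main(String[] args) {
        System.out.println(shardedRow("20130426172656.954191", "3300", "04"));
    }
}

Time-range scans then have to fan out across all the shard prefixes, so
it's a trade of scan convenience for write throughput.)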

Here is the start of the junk data text file, showing 2 logical records
(3 entries each):

table junkmeta
insert 20130426172656.954191-3300-04 attr vehicle "3300"
insert 20130426172656.954191-3300-04 attr stream "04"
insert 20130426172656.954191-3300-04 data rawmsg "NzBwWP5xETl7B6eX7GHA9Kb4nv5rt7gx7HZkqtrMRjfwWZmnAIO h"
insert 20130426172656.954348-2200-11 attr vehicle "2200"
insert 20130426172656.954348-2200-11 attr stream "11"
insert 20130426172656.954348-2200-11 data rawmsg "8LyKX3CKKfHPz1ZtdLn9ZCM0troO7gGnrVaKBxHHY1ICmpc8ewNZ8uKMlPHZZs9WxO"

I ran it from this simple shell script:

#!/bin/bash

AUSER=junk
AUSERPWD=XXXX
LOADFILE=/root/load-junkmeta.txt
LOG=${LOADFILE%.txt}.log

echo "$(date) Starting Accumulo shell to load $LOADFILE, with output piped to $LOG ..." | tee $LOG
/usr/lib/accumulo/bin/accumulo shell -u $AUSER -p $AUSERPWD < $LOADFILE >> $LOG
# Append (-a) so the completion message doesn't clobber the shell output above.
echo "$(date) Load complete." | tee -a $LOG

Crude, but effective enough to validate that the cluster is functioning
well so the developers can poke at it with their real programs. ;-)
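
When they do, the natural replacement for shell inserts is the Java
client's BatchWriter, which buffers many cells and sends them in batches
instead of one flush per insert.  A minimal sketch of loading the same
junkmeta records that way; the instance name, ZooKeeper host, and buffer
settings are placeholders, and this assumes the 1.4-era client API:

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.io.Text;

public class JunkMetaLoad {
    public static void main(String[] args) throws Exception {
        // Placeholder instance name and ZooKeeper quorum.
        Connector conn = new ZooKeeperInstance("test", "zkhost:2181")
                .getConnector("junk", "XXXX".getBytes());

        // 50 MB buffer, 60 s max latency, 3 writer threads.
        BatchWriter bw = conn.createBatchWriter("junkmeta", 50000000L, 60000L, 3);

        // One Mutation carries all three entries of a logical record,
        // mirroring the shell file above.
        Mutation m = new Mutation(new Text("20130426172656.954191-3300-04"));
        m.put(new Text("attr"), new Text("vehicle"), new Value("3300".getBytes()));
        m.put(new Text("attr"), new Text("stream"), new Value("04".getBytes()));
        m.put(new Text("data"), new Text("rawmsg"), new Value("...".getBytes()));
        bw.addMutation(m);

        bw.close(); // flushes anything still buffered
    }
}

A Mutation is also applied atomically within its row, which the
one-insert-per-cell shell approach doesn't give you.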
On Apr 29, 2013, at 2:32 PM, Eric Newton <[EMAIL PROTECTED]> wrote:

>> For a quick test I have a text file I generated to load 500,000 rows of
>> sample data using the Accumulo shell.
>
> So you used the shell to insert lots of data?  One cell at a time?
>
> -Eric