Accumulo >> mail # user >> Removing splits [SEC=UNCLASSIFIED]


Re: Removing splits [SEC=UNCLASSIFIED]
On Apr 8, 2013 8:56 PM, "David Medinets" <[EMAIL PROTECTED]> wrote:
>
> My understanding is that Delete Range will actually delete rows from the
> Accumulo table. Perhaps Merge Tablets is a better option
> (http://accumulo.apache.org/1.4/user_manual/Table_Configuration.html#Merging_tablets)?

Good point. Delete range would be for when you want to age off the data. If
you want to keep the data in fewer tablets, look at merging.

Billie
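
To make the difference between the two operations concrete, here is a minimal in-memory sketch. It is a toy model and does not connect to Accumulo: the class TableModel, its methods, and the sample row IDs are all hypothetical. In a real cluster the corresponding calls would presumably be TableOperations.merge and TableOperations.deleteRows, or the shell's merge and deleterows commands, as described in the manual sections linked in this thread. The handling of split points that sit exactly on the range bounds is an assumption of this model and may differ from the real operations.

```java
import java.util.TreeMap;
import java.util.TreeSet;

// Toy in-memory model of a table; it does not talk to Accumulo.
class TableModel {
    final TreeMap<String, String> rows = new TreeMap<>(); // row ID -> value
    final TreeSet<String> splits = new TreeSet<>();       // split points

    // n split points divide the table into n + 1 tablets.
    int tabletCount() {
        return splits.size() + 1;
    }

    // Merge: drop the split points strictly between the bounds; every row is kept.
    // A null bound means "unbounded" on that side.
    void merge(String start, String end) {
        splits.removeIf(s -> isAfter(s, start) && isBefore(s, end));
    }

    // Delete range: drop the rows in (start, end] and, in this model,
    // the split points strictly between the bounds as well.
    void deleteRange(String start, String end) {
        rows.keySet().removeIf(r -> isAfter(r, start) && (end == null || r.compareTo(end) <= 0));
        splits.removeIf(s -> isAfter(s, start) && isBefore(s, end));
    }

    static boolean isAfter(String key, String lower) {
        return lower == null || key.compareTo(lower) > 0;
    }

    static boolean isBefore(String key, String upper) {
        return upper == null || key.compareTo(upper) < 0;
    }

    // Hypothetical sample data shaped like the daily-batch splits in the question.
    static TableModel sample() {
        TableModel t = new TableModel();
        t.splits.add("20130407-1");
        t.splits.add("20130407-2");
        t.splits.add("20130407-3");
        t.splits.add("20130408-1");
        t.rows.put("20130407-1a", "x");
        t.rows.put("20130407-2a", "y");
        t.rows.put("20130408-1a", "z");
        return t;
    }

    public static void main(String[] args) {
        TableModel merged = sample();
        merged.merge(null, null); // whole table: all data kept, one tablet left
        System.out.println("merge:  rows=" + merged.rows.size() + " tablets=" + merged.tabletCount());

        TableModel aged = sample();
        aged.deleteRange("20130407", "20130408"); // age off one day's data
        System.out.println("delete: rows=" + aged.rows.size() + " tablets=" + aged.tabletCount());
    }
}
```

In the model, merging leaves the row count unchanged and only reduces the tablet count, while deleting a range reduces both.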

> Splits are used by the AccumuloInputFormat class (during a map-reduce
> job) to determine the number of mappers used by the job. If you have 1,000
> splits, you'll get 1,000 mappers. Additionally, let's say that you have a
> table with information from A to Z in one tablet (i.e., no splits). If you
> split on M, then you'll have two tablets. As the table is split again and
> again, more tablets are created. Eventually these tablets are spread over
> the cluster. If you don't create the splits yourself, then Accumulo will
> split a tablet automatically when its size passes some configurable
> setting.
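
The A-to-Z example above can be sketched in a few lines of Java. This is a toy model only: the class SplitModel and its methods are hypothetical and do not use the Accumulo API. It assumes each tablet covers the rows after the previous split point up to and including its own split point, so n split points give n + 1 tablets and the mapper count quoted above is approximate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model of how split points carve a table into tablets; no Accumulo involved.
class SplitModel {
    // Each tablet covers (previous split, this split]; the last one is open-ended.
    static List<String> tabletRanges(TreeSet<String> splits) {
        List<String> ranges = new ArrayList<>();
        String prev = "-inf";
        for (String split : splits) {
            ranges.add("(" + prev + ", " + split + "]");
            prev = split;
        }
        ranges.add("(" + prev + ", +inf)");
        return ranges;
    }

    // Index of the tablet holding a row: the number of split points strictly below it.
    static int tabletIndexFor(TreeSet<String> splits, String row) {
        return splits.headSet(row, false).size();
    }

    public static void main(String[] args) {
        TreeSet<String> splits = new TreeSet<>(Arrays.asList("M"));
        System.out.println(tabletRanges(splits));        // two tablets after splitting on M
        System.out.println(tabletIndexFor(splits, "C")); // first tablet
        System.out.println(tabletIndexFor(splits, "T")); // second tablet
    }
}
```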
>
>
> On Mon, Apr 8, 2013 at 8:42 PM, Billie Rinaldi <[EMAIL PROTECTED]> wrote:
>>
>> On Apr 8, 2013 7:36 PM, "Dickson, Matt MR" <[EMAIL PROTECTED]> wrote:
>> >
>> > UNCLASSIFIED
>> >
>> > Hi guys,
>> >
>> > Just a simple question. We ingest data in daily batches and create
>> > splits on the data to distribute the loading, e.g. the splits are
>> > 20130407-1, 20130407-2, ... 20130407-n.
>> >
>> > Once this data is loaded, the splits will not be required again. Is
>> > there a maximum number of splits a table can have? How can splits be
>> > removed once they are no longer required? I can't see any command for
>> > this in the API.
>>
>> You can delete a range through the shell or the API:
>>
>> http://accumulo.apache.org/1.4/user_manual/Table_Configuration.html#Delete_Range
>>
>> Billie
>>
>> >
>> > Thanks in advance,
>> > Matt Dickson
>> >
>> > IMPORTANT: This email remains the property of the Department of
>> > Defence and is subject to the jurisdiction of section 70 of the Crimes Act
>> > 1914. If you have received this email in error, you are requested to
>> > contact the sender and delete the email.
>
>