HBase user mailing list: delete rows from hbase


Earlier messages in this thread:
Oleg Ruchovets 2012-06-18, 22:08
Jean-Daniel Cryans 2012-06-18, 22:18
Oleg Ruchovets 2012-06-18, 23:13
Jean-Daniel Cryans 2012-06-18, 23:18
Amitanand Aiyer 2012-06-18, 23:36
shashwat shriparv 2012-06-19, 07:43
Mohammad Tariq 2012-06-19, 10:46
Kevin Odell 2012-06-19, 13:26
Oleg Ruchovets 2012-06-19, 16:17
Anoop Sam John 2012-06-20, 08:38
Re: delete rows from hbase
Hi,

The simple way to do this as a map/reduce job is the following....

Use the HTable input format and a Scan that selects the records you want to delete.
Inside Mapper.setup(), create a connection to the HTable you want to delete the records from.
Inside Mapper.map(), each invocation hands you a row that matched the scan you set up in your ToolRunner.  If the record matches the criteria you want to delete, just issue a delete command passing in that row key.

And voila! You are done.

No muss, no fuss, and no reducer.

It's that easy.

There is no output to return to your client job, except perhaps a count of the records you deleted, and that's an easy thing to do with dynamic counters.
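In code, a minimal sketch of such a mapper could look like the following, assuming the 0.94-era client API; the class name, table name, and counter names are just placeholders, not anything from this thread:

import java.io.IOException;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;

public class DeleteMapper extends TableMapper<NullWritable, NullWritable> {

    private HTable table;

    @Override
    protected void setup(Context context) throws IOException {
        // Open a connection to the table we are deleting from
        // ("my_table" is a placeholder).
        table = new HTable(context.getConfiguration(), "my_table");
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        // Every row handed to map() already matched the Scan's filter,
        // so just delete it by row key.
        table.delete(new Delete(row.get()));
        // Optional: dynamic counter reporting how many rows were removed.
        context.getCounter("delete", "rowsDeleted").increment(1);
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        table.close();
    }
}

The driver would then just be a Scan plus TableMapReduceUtil.initTableMapperJob(...), zero reduce tasks, and a NullOutputFormat, since the mapper writes no output of its own.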

HTH
-Mike

On Jun 20, 2012, at 3:38 AM, Anoop Sam John wrote:

> Hi
>      Has anyone tried the possibility of an Endpoint implementation with which the delete can be done directly with the scan on the server side?
> In the samples below I can see
> Client -> Server - Scan for certain rows ( we want the rowkeys satisfying our criteria)
> Client <- Server - returns the Results
> Client -> Server - Delete calls
>
> Instead, using Endpoints we can make one call from Client to Server in which both the scan and the delete happen...
>
> -Anoop-
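A rough sketch of the client side of the Endpoint idea Anoop describes, assuming the 0.94-era CoprocessorProtocol / coprocessorExec API. The BulkDeleteProtocol interface, its delete(Scan) method, and the table and filter values are hypothetical, and the server-side implementation (which would extend BaseEndpointCoprocessor and scan-and-delete inside each region) is omitted:

import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.Batch;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.ipc.CoprocessorProtocol;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkDeleteClient {

    // Hypothetical protocol: the server-side endpoint would run the Scan
    // inside its region, delete every matching row locally, and return
    // how many rows it removed.
    public interface BulkDeleteProtocol extends CoprocessorProtocol {
        long delete(Scan scan) throws IOException;
    }

    public static void main(String[] args) throws Throwable {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "my_table");   // placeholder table name
        final Scan scan = new Scan();
        scan.setFilter(new PrefixFilter(Bytes.toBytes(args[0])));

        // One call per region: the scan and the delete both happen server side,
        // so the row keys never travel back to the client.
        Map<byte[], Long> deletedPerRegion = table.coprocessorExec(
                BulkDeleteProtocol.class,
                scan.getStartRow(), scan.getStopRow(),
                new Batch.Call<BulkDeleteProtocol, Long>() {
                    public Long call(BulkDeleteProtocol instance) throws IOException {
                        return instance.delete(scan);
                    }
                });

        long total = 0;
        for (Long n : deletedPerRegion.values()) {
            total += n;
        }
        System.out.println("Deleted " + total + " rows");
        table.close();
    }
}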
> ________________________________________
> From: Oleg Ruchovets [[EMAIL PROTECTED]]
> Sent: Tuesday, June 19, 2012 9:47 PM
> To: [EMAIL PROTECTED]
> Subject: Re: delete rows from hbase
>
> Thank you all for the answers. I am trying to speed up my solution and use
> map/reduce over HBase.
>
> Here is the code:
> I want to use Delete (the map function deletes the row) and I pass the same
> tableName to TableMapReduceUtil.initTableMapperJob
> and TableMapReduceUtil.initTableReducerJob.
>
> Question: is it possible to pass Delete as I did in the map function?
>
>
>
>
> public class DeleteRowByCriteria {
>    final static Logger LOG = LoggerFactory.getLogger(DeleteRowByCriteria.class);
>    public static class MyMapper extends
> TableMapper<ImmutableBytesWritable, Delete> {
>
>        public String account;
>        public String lifeDate;
>
>        @Override
>        public void map(ImmutableBytesWritable row, Result value, Context
> context) throws IOException, InterruptedException {
>            context.write(row, new Delete(row.get()));
>        }
>    }
>    public static void main(String[] args) throws ClassNotFoundException,
> IOException, InterruptedException {
>
> String tableName = args[0];
> String filterCriteria = args[1];
>
>        Configuration config = HBaseConfiguration.create();
>        Job job = new Job(config, "DeleteRowByCriteria");
>        job.setJarByClass(DeleteRowByCriteria.class);
>
>        try {
>
>            Filter campaignIdFilter = new
> PrefixFilter(Bytes.toBytes(filterCriteria));
>            Scan scan = new Scan();
>            scan.setFilter(campaignIdFilter);
>            scan.setCaching(500);
>            scan.setCacheBlocks(false);
>
>
>            TableMapReduceUtil.initTableMapperJob(
>                    tableName,
>                    scan,
>                    MyMapper.class,
>                    null,
>                    null,
>                    job);
>
>
>            TableMapReduceUtil.initTableReducerJob(
>                    tableName,
>                    null,
>                    job);
>            job.setNumReduceTasks(0);
>
>            boolean b = job.waitForCompletion(true);
>            if (!b) {
>                throw new IOException("error with job!");
>            }
>
>        }catch (Exception e) {
>            LOG.error(e.getMessage(), e);
>        }
>    }
> }
>
>
>
> On Tue, Jun 19, 2012 at 9:26 AM, Kevin O'dell <[EMAIL PROTECTED]> wrote:
>
>> Oleg,
>>
>> Here is some code that we used for deleting all rows with user name
>> foo.  It should be fairly portable to your situation:
>>
>> import java.io.IOException;
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.hbase.HBaseConfiguration;
Later messages in this thread:
Oleg Ruchovets 2012-06-20, 11:56
Michael Segel 2012-06-20, 14:10