HBase >> mail # user >> MR missing lines


Thread:
Jean-Marc Spaggiari 2012-12-16, 12:52
Kevin Odell 2012-12-16, 14:05
Asaf Mesika 2012-12-16, 16:28
Jean-Marc Spaggiari 2012-12-17, 00:20
Jean-Marc Spaggiari 2012-12-17, 12:15
Jean-Marc Spaggiari 2012-12-18, 13:37
Anoop Sam John 2012-12-19, 05:11
Jean-Marc Spaggiari 2012-12-20, 00:39
Anoop Sam John 2012-12-20, 04:24

Re: MR missing lines
You can use MR counters to count your overall Deletes, to see if they
match your table's row count. Also, does your job's input record count
match the expected count of the table you intended to clear?
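
[Editor's note: as an illustration of this suggestion, here is a minimal
driver-side sketch. All names here are hypothetical (the table name
"my_table", the class names, and the "CLEANUP"/"DELETES_SENT" counter
labels are made up for the example, not taken from this thread). The
mapper bumps a custom counter for every Delete it sends, and the driver
reads that counter back after the job finishes so it can be compared with
the expected row count. DeleteBufferingMapper stands in for the delete
mapper; a sketch of it follows the quoted message below.]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class DeleteJobDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "cleanup-deletes");
    job.setJarByClass(DeleteJobDriver.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // rows fetched per RPC; tune against scanner timeouts
    scan.setCacheBlocks(false);  // avoid polluting the block cache during a full scan

    // "my_table" is a placeholder; DeleteBufferingMapper is sketched further below.
    TableMapReduceUtil.initTableMapperJob(
        "my_table", scan, DeleteBufferingMapper.class, null, null, job);
    job.setNumReduceTasks(0);    // map-only job

    boolean ok = job.waitForCompletion(true);

    // Read the custom counter back and compare it with the expected row count
    // (e.g. the 123608 rows mentioned in the thread) and with the built-in
    // "Map input records" counter shown in the job UI.
    long deletesSent = job.getCounters()
        .findCounter("CLEANUP", "DELETES_SENT").getValue();
    System.out.println("Job succeeded: " + ok + ", Deletes sent: " + deletesSent);
  }
}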

On Sun, Dec 16, 2012 at 6:22 PM, Jean-Marc Spaggiari
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a table where I run an MR job each time it exceeds 100,000 rows.
>
> When the target is reached, all the feeding processes are stopped.
>
> Yesterday it reached 123608 rows. So I stopped the feeding process,
> and ran the MR.
>
> For each row, the MR job creates a Delete. The Delete is added to a
> list, and when the list reaches 10 elements, it is sent to the table.
> In the cleanup method, the list is sent to the table if there are any
> elements left in it.
>
> So at the end of the MR, I should have an empty table.
>
> The table is split across 128 regions, and I have 8 region servers.
>
> What is disturbing me is that after the MR, I had 38 rows remaining
> in the table. The MR took 348 minutes to run. So I ran the MR again,
> which this time took 2 minutes, and now I have 1 row remaining in the
> table.
>
> I looked at the logs (for the run that left 38 rows) and there is
> nothing in them. There were some scanner timeout exceptions during the
> run over the 100K rows.
>
> I'm running HBase 0.94.3.
>
> I will have another 100K rows today, so I will re-run the job. I will
> increase the timeout to make sure I get no exceptions, but even when I
> ran over the 38 rows with no exceptions, one was still remaining...
>
> Any idea why, and where I can search? It's not really an issue for me
> since I can just re-run the job, but it might be an issue for others.
>
> JM

--
Harsh J
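
[Editor's note: for reference, here is a minimal sketch of the buffering
pattern JM describes in the quoted message, not his actual code. The
class name, the table name "my_table", the batch size constant, and the
counter labels are all assumptions made for this example. Each input row
produces a Delete, the Deletes are batched in a list of 10, and cleanup()
flushes whatever is left so the final partial batch is not dropped.]

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;

public class DeleteBufferingMapper
    extends TableMapper<NullWritable, NullWritable> {

  private static final int BATCH_SIZE = 10;

  private HTable table;
  private final List<Delete> buffer = new ArrayList<Delete>();

  @Override
  protected void setup(Context context) throws IOException {
    // "my_table" is a placeholder for the table being cleared.
    table = new HTable(context.getConfiguration(), "my_table");
  }

  @Override
  protected void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    buffer.add(new Delete(row.get()));
    if (buffer.size() >= BATCH_SIZE) {
      flush(context);
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException, InterruptedException {
    // Send whatever is left so the last partial batch is not silently dropped.
    if (!buffer.isEmpty()) {
      flush(context);
    }
    table.close();
  }

  private void flush(Context context) throws IOException {
    int size = buffer.size();
    table.delete(buffer);  // HTable.delete(List<Delete>) sends the batched Deletes
    context.getCounter("CLEANUP", "DELETES_SENT").increment(size);
    buffer.clear();
  }
}

[Comparing the DELETES_SENT counter from this sketch against the job's
map input record count, as suggested at the top of this message, would
show whether every scanned row actually produced a Delete that reached
the table.]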