Re: Inconsistent row count between mapreduce and shell count
Hmm... Can you show us the exact commands you executed?

And just to rule out the obvious:
1. There were no writes while you did the row count?
2. In the RowCount M/R case you specified neither a range nor any columns?
Do you always get the exact same numbers in both cases? Or do they vary?

Thanks.

-- Lars
----- Original Message -----
From: kiran chitturi <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Cc:
Sent: Saturday, February 9, 2013 4:49 PM
Subject: Re: Inconsistent row count between mapreduce and shell count

Yes. I just counted the number of regions at
'http://machine1:60010/table.jsp?name=documents' and the count is 53, which
is equal to the number of completed tasks in Hadoop.
Thanks,
Kiran.
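
As a rough check on the figures reported in this thread, the gap between the two counts can be put in terms of regions. This is a minimal sketch, not part of HBase: the variable names are illustrative, and the even spread of rows across regions is an assumption made only for the estimate.

```python
# Figures reported in this thread.
shell_count = 2_152_416      # rows reported by the HBase shell count
rowcounter_rows = 1_389_991  # ROWS counter reported by the RowCounter job
regions = 53                 # regions listed for the table (also the completed map tasks)

# Size of the discrepancy, absolute and relative to the shell count.
missing = shell_count - rowcounter_rows
missing_fraction = missing / shell_count

# Assumption for illustration only: rows are spread evenly across regions.
avg_rows_per_region = shell_count / regions
regions_worth_missing = missing / avg_rows_per_region

print(f"missing rows: {missing}")
print(f"missing fraction: {missing_fraction:.1%}")
print(f"average rows per region (if evenly spread): {avg_rows_per_region:.0f}")
print(f"missing rows expressed in average regions: {regions_worth_missing:.1f}")
```

If rows were spread evenly, the gap would amount to far more than five regions' worth of rows, so the five killed attempts alone would not obviously explain it. Real tables are rarely that uniform, so this is only a rough guide.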
On Sat, Feb 9, 2013 at 7:43 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Apart from the 5 killed tasks, was the number of successful tasks equal to
> the number of regions in your table?
>
> Thanks
>
> On Sat, Feb 9, 2013 at 4:14 PM, kiran chitturi <[EMAIL PROTECTED]> wrote:
>
> > Hi!
> >
> > I am using HBase 0.94.1 on a distributed cluster of 20 nodes.
> >
> > When I ran count on a table in the HBase shell, I got a count of
> > 2152416 rows.
> >
> > When I did the same thing using the RowCounter MapReduce job, I got the
> > value below:
> >
> > org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
> > 13/02/10 00:05:06 INFO mapred.JobClient:     ROWS=1389991
> >
> > The same thing happened when I used Pig to count or do other operations.
> > The two results are inconsistent.
> >
> > During the MapReduce job, I noticed that 5 tasks were killed. When I
> > traced them back to the TaskTracker logs on the node, I found entries
> > similar to the log below.
> >
> > 2013-02-09_23:58:58.40665 13/02/09 23:58:58 INFO mapred.TaskTracker: JVM with ID: jvm_201302090035_0015_m_1905604998 given task: attempt_201302090035_0015_m_000012_1
> > 2013-02-09_23:59:03.57016 13/02/09 23:59:03 INFO mapred.TaskTracker: Received KillTaskAction for task: attempt_201302090035_0015_m_000012_1
> > 2013-02-09_23:59:03.57034 13/02/09 23:59:03 INFO mapred.TaskTracker: About to purge task: attempt_201302090035_0015_m_000012_1
> > 2013-02-09_23:59:03.61003 13/02/09 23:59:03 INFO util.ProcessTree: Killing process group9745 with signal TERM. Exit code 0
> >
> > I have also tried to run the tool 'hbck' but it shows no inconsistencies.
> >
> > Can you please suggest why there is an inconsistency and how I can
> > correct it?
> >
> > Thanks,
> > --
> > Kiran Chitturi
> >
>

--
Kiran Chitturi
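
The TaskTracker log quoted above can be read more easily once the attempt IDs are split into their parts. This is a minimal sketch with hypothetical names (`killed_attempts`, `ATTEMPT_RE`), not a Hadoop API. It assumes the usual MR1 layout `attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number>`, and the log lines are copied from the thread with the timestamps trimmed.

```python
import re

# TaskTracker log lines copied from the thread (timestamps trimmed).
LOG_LINES = [
    "INFO mapred.TaskTracker: JVM with ID: jvm_201302090035_0015_m_1905604998 "
    "given task: attempt_201302090035_0015_m_000012_1",
    "INFO mapred.TaskTracker: Received KillTaskAction for task: "
    "attempt_201302090035_0015_m_000012_1",
    "INFO mapred.TaskTracker: About to purge task: "
    "attempt_201302090035_0015_m_000012_1",
]

# Assumed layout: attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number>
ATTEMPT_RE = re.compile(r"attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)")


def killed_attempts(lines):
    """Return parsed attempt IDs for lines that record a KillTaskAction."""
    result = []
    for line in lines:
        if "Received KillTaskAction" not in line:
            continue
        match = ATTEMPT_RE.search(line)
        if match:
            jt_id, job, kind, task, attempt = match.groups()
            result.append({
                "job": f"job_{jt_id}_{job}",
                "type": "map" if kind == "m" else "reduce",
                "task": int(task),
                "attempt": int(attempt),
            })
    return result


for info in killed_attempts(LOG_LINES):
    print(info)
```

For the excerpt in this thread, the killed attempt is attempt number 1 of map task 12. Attempt numbering starts at 0, so this is the second attempt at that task. A kill a few seconds after launch would be consistent with, for example, a duplicate attempt being stopped once another attempt of the same task finished, but the excerpt alone does not confirm that.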