HBase >> mail # user >> NoSuchColumnFamilyException with rowcounter


Earlier messages in thread (collapsed):
- Jean-Marc Spaggiari (2012-10-11, 17:26)
- Kevin Odell (2012-10-11, 17:30)
- Jean-Marc Spaggiari (2012-10-11, 17:43)
- Stack (2012-10-11, 20:09)
- Jean-Daniel Cryans (2012-10-11, 20:10)
- Jean-Marc Spaggiari (2012-10-11, 20:20)
- Jean-Daniel Cryans (2012-10-11, 20:27)
- Jean-Marc Spaggiari (2012-10-11, 20:36)

Jean-Daniel Cryans (2012-10-11, 20:42)
Re: NoSuchColumnFamilyException with rowcounter
2 tasks at the same time, for a total of 25 tasks at the end.

Maybe, as you are saying, I'm not talking to the right jobtracker? I'm
running the command line on the master server.

If I look at the map tasks, I can see that:
Input Split Locations /default-rack/node1

With different values depending on the task, but on the same page I
can see machine=/default-rack/node3 (which is my master).

How/where should I run this? Should I point it at the ZooKeeper instance instead?

Thanks,

JM
2012/10/11 Jean-Daniel Cryans <[EMAIL PROTECTED]>:
> 2 tasks total, or 2 running at the same time? If the latter, it just
> means that you are using the local job tracker instead of your real
> job tracker, because HBase couldn't find your MR config.
>
> J-D
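(J-D's point above, that Hadoop silently falls back to the in-process LocalJobRunner when it can't find the MR config, can be checked directly. The sketch below reads the Hadoop 1.x `mapred.job.tracker` property, which is the version used in this thread; the paths are illustrative, not from the original emails.)

```python
import xml.etree.ElementTree as ET

def job_tracker_address(mapred_site_xml):
    """Return the configured mapred.job.tracker, or 'local' if unset.

    With Hadoop 1.x, an unset or 'local' value means jobs run in-process
    with the LocalJobRunner instead of being submitted to the cluster,
    which caps parallelism at what one machine can do.
    """
    try:
        root = ET.parse(mapred_site_xml).getroot()
    except (OSError, ET.ParseError):
        return "local"  # no readable config -> Hadoop defaults to local
    for prop in root.iter("property"):
        if prop.findtext("name") == "mapred.job.tracker":
            return prop.findtext("value", default="local")
    return "local"
```

If this returns "local" for the conf directory that actually lands on the job's classpath, the rowcounter will use the local runner no matter what `-Dmapred.map.tasks` says.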
>
> On Thu, Oct 11, 2012 at 1:36 PM, Jean-Marc Spaggiari
> <[EMAIL PROTECTED]> wrote:
>> Hi J-D,
>>
>> I have about 20M rows over 25 regions on 6 nodes. So that means I
>> should see something like 6 tasks, or even 25, right? And not just 2?
>> Keys are 128 bytes long. Values are 1 byte.
>>
>> I tried also to update mapreduce.tasktracker.map.tasks.maximum but
>> this is "the number of map tasks that should be launched on each node,
>> not the number of nodes to be used for each map task.", so there was
>> no change, as expected.
>>
>> JM
>>
>> 2012/10/11 Jean-Daniel Cryans <[EMAIL PROTECTED]>:
>>> On Thu, Oct 11, 2012 at 1:20 PM, Jean-Marc Spaggiari
>>> <[EMAIL PROTECTED]> wrote:
>>>> I'm now using this command line and it's working fine (except for the
>>>> number of tasks).
>>>> HADOOP_CLASSPATH=`/home/hbase/hbase-0.94.0/bin/hbase classpath`:`/home/hadoop/hadoop-1.0.3/bin/hadoop classpath` \
>>>>   /home/hadoop/hadoop-1.0.3/bin/hadoop jar /home/hbase/hbase-0.94.0/hbase-0.94.1.jar \
>>>>   rowcounter -Dhbase.client.scanner.caching=100 -Dmapred.map.tasks=6 \
>>>>   -Dmapred.map.tasks.speculative.execution=false work_proposed
>>>>
>>>> I simply don't know if the -D parameters are taken into consideration
>>>> since I get the same results (number of tasks, execution time, etc.)
>>>> with and without them.
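(As a rough sanity check on the caching flag: a sketch, assuming the ~20M rows JM mentions in this thread and 0.94's default `hbase.client.scanner.caching` of 1. The flag's main effect is on how many scanner `next()` round trips the job makes, not on the number of tasks.)

```python
import math

def scanner_rpcs(total_rows, caching):
    """Estimate scanner next() RPCs for a full-table scan.

    Each next() call returns up to `caching` rows, so counting the whole
    table costs roughly ceil(rows / caching) round trips per scan.
    """
    return math.ceil(total_rows / caching)

print(scanner_rpcs(20_000_000, 1))    # default caching in 0.94 -> 20,000,000
print(scanner_rpcs(20_000_000, 100))  # -Dhbase.client.scanner.caching=100 -> 200,000
```

So caching=100 should cut RPC overhead substantially on a table this size, even if the wall-clock difference is masked by other costs.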
>>>
>>> Using a higher caching value won't do much good if you don't have a
>>> lot of rows. Since you didn't include any data like that in your
>>> email, I won't guess how much 100 would help your case.
>>>
>>> The number of map tasks when mapping an HBase table will be the number
>>> of regions you have in that table. Unfortunately you can't change it
>>> unless you write your own input format for HBase.
>>>
>>> J-D
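(For readers landing here: the one-split-per-region behavior J-D describes comes from HBase's table input format, which emits one split per region. A custom input format wanting more map tasks than regions has to subdivide each region's key range itself. Purely as an illustration, with no HBase API: keys are ints here for simplicity, whereas real row keys are byte arrays and need byte-midpoint logic instead of integer arithmetic.)

```python
def subdivide_splits(region_ranges, pieces_per_region):
    """Turn one (start_key, end_key) range per region into N sub-ranges.

    Mirrors the bookkeeping a custom input format would do: keep region
    boundaries, but hand out several key sub-ranges per region so the
    job gets more map tasks than regions.
    """
    splits = []
    for start, end in region_ranges:
        step = max(1, (end - start) // pieces_per_region)
        cuts = list(range(start, end, step))[:pieces_per_region] + [end]
        splits.extend(zip(cuts, cuts[1:]))
    return splits

# e.g. 25 regions x 4 sub-splits -> up to 100 map tasks instead of 25.
```

Each sub-split still maps to one region, so locality is preserved; the cost is more scanner setup overhead per region.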
Later messages in thread (collapsed):
- Jean-Daniel Cryans (2012-10-11, 20:59)
- Jean-Marc Spaggiari (2012-10-11, 21:06)
- Jean-Daniel Cryans (2012-10-11, 21:16)
- Jean-Marc Spaggiari (2012-10-11, 21:46)
- Jean-Daniel Cryans (2012-10-11, 21:50)
- Jean-Marc Spaggiari (2012-10-11, 22:18)