HBase >> mail # user >> M/R scan problem


Re: M/R scan problem
1. Yes. I configure my job using this line:
TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
    ScanMapper.class, Text.class, MapWritable.class, job);

which uses TableInputFormat internally.

2. One split per region? What do you mean? How do I do that?

3. HBase version 0.90.2.

4. No exceptions. The logs are very clean.
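
On point 2, as far as I understand it, TableInputFormat already produces one split per region that overlaps the scan's start/stop range, so there may be nothing extra to configure. Below is a minimal sketch of that idea in plain Java. It is not the HBase implementation: row keys are Strings instead of byte[], an empty string stands for "unbounded", and RegionSplitSketch, Split and splitsFor are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of HBase.
class RegionSplitSketch {

    /** The key range [start, stop) that a single map task would scan. */
    static final class Split {
        final String start;
        final String stop;

        Split(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + stop + ")";
        }
    }

    /**
     * Returns one split per region that overlaps the scan range.
     * regionStartKeys must be sorted; region i covers
     * [regionStartKeys[i], regionStartKeys[i + 1]), and the last region is
     * unbounded. An empty scanStart or scanStop means "no bound".
     */
    static List<Split> splitsFor(String[] regionStartKeys, String scanStart, String scanStop) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.length; i++) {
            String regionStart = regionStartKeys[i];
            String regionEnd = (i + 1 < regionStartKeys.length) ? regionStartKeys[i + 1] : "";

            // Region lies entirely before the scan range.
            boolean endsBeforeScan = !regionEnd.isEmpty() && !scanStart.isEmpty()
                    && regionEnd.compareTo(scanStart) <= 0;
            // Region lies entirely at or after the scan's stop key.
            boolean startsAfterScan = !scanStop.isEmpty()
                    && regionStart.compareTo(scanStop) >= 0;
            if (endsBeforeScan || startsAfterScan) {
                continue;
            }

            // Clip the region's range to the scan's range.
            String start = (scanStart.compareTo(regionStart) > 0) ? scanStart : regionStart;
            String stop;
            if (regionEnd.isEmpty()) {
                stop = scanStop;
            } else if (scanStop.isEmpty() || regionEnd.compareTo(scanStop) < 0) {
                stop = regionEnd;
            } else {
                stop = scanStop;
            }
            splits.add(new Split(start, stop));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four regions starting at "", "g", "n", "t"; scan rows from "j" up to "p".
        String[] regionStartKeys = {"", "g", "n", "t"};
        for (Split s : splitsFor(regionStartKeys, "j", "p")) {
            System.out.println(s); // prints [j, n) then [n, p)
        }
    }
}
```

In this sketch the scan touches only two of the four regions, so only two map tasks would be created, each limited to the part of its region that falls inside the start/stop keys.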

On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Do you use TableInputFormat?
> To scan a large number of rows, it would be better to produce one split per
> region.
>
> What HBase version do you use?
> Do you find any exceptions in the master / region server logs around the
> moment of the timeout?
>
> Cheers
>
> On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter <[EMAIL PROTECTED]>
> wrote:
>
> > Hi all,
> > I'm running a scan using the M/R framework.
> > My table contains hundreds of millions of rows, and I'm scanning about
> > 50 million of them using start/stop keys.
> >
> > The problem is that some map tasks get stuck and the task manager kills
> > these maps after 600 seconds. When retrying the task everything works fine
> > (sometimes).
> >
> > To verify that the problem is in HBase (and not in the map code) I removed
> > all the code from my map function, so it looks like this:
> > public void map(ImmutableBytesWritable key, Result value, Context context)
> >     throws IOException, InterruptedException {
> > }
> >
> > Also, when the map got stuck on a region, I tried to scan this region
> > (using a simple scan from a Java main) and it worked fine.
> >
> > Any ideas?
> >
> > Thanks,
> > Lior
> >
>