HBase, mail # user - M/R scan problem


Re: M/R scan problem
Lior Schachter 2011-07-04, 14:37
1. Yes. I configure my job using this call:
TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
ScanMapper.class, Text.class, MapWritable.class, job);

which uses TableInputFormat internally.

2. One split per region? What do you mean, and how do I do that?

3. HBase version 0.90.2.

4. No exceptions. The logs are very clean.
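
For readers following point 2: "one split per region" can be pictured as intersecting the scan's start/stop key range with each region's key range, so that each map task reads from exactly one region. The sketch below is plain Java and does not call the HBase API. The class RegionSplitSketch, the KeyRange type, and the sample region boundaries are hypothetical names made up for illustration, and keys are modeled as Strings rather than the byte arrays HBase uses. As far as I understand, TableInputFormat already computes its splits along these lines, but treat that as an assumption rather than a statement about the 0.90.2 code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegionSplitSketch {

    /** Half-open key range [start, stop); an empty string means "unbounded" on that side. */
    static final class KeyRange {
        final String start;
        final String stop;

        KeyRange(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + stop + ")";
        }
    }

    /** Larger of two start keys; an empty start is unbounded below, so the other one wins. */
    static String maxStart(String a, String b) {
        if (a.isEmpty()) return b;
        if (b.isEmpty()) return a;
        return a.compareTo(b) >= 0 ? a : b;
    }

    /** Smaller of two stop keys; an empty stop is unbounded above, so the other one wins. */
    static String minStop(String a, String b) {
        if (a.isEmpty()) return b;
        if (b.isEmpty()) return a;
        return a.compareTo(b) <= 0 ? a : b;
    }

    /** Returns one split for every region whose key range overlaps the scan range. */
    static List<KeyRange> splitsPerRegion(List<KeyRange> regions, KeyRange scan) {
        List<KeyRange> splits = new ArrayList<>();
        for (KeyRange region : regions) {
            String start = maxStart(region.start, scan.start);
            String stop = minStop(region.stop, scan.stop);
            // Keep the region only if the clipped range is non-empty.
            if (stop.isEmpty() || start.compareTo(stop) < 0) {
                splits.add(new KeyRange(start, stop));
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Hypothetical region boundaries for a table with four regions.
        List<KeyRange> regions = Arrays.asList(
                new KeyRange("", "g"),
                new KeyRange("g", "n"),
                new KeyRange("n", "t"),
                new KeyRange("t", ""));

        // A scan with start key "h" and stop key "p" touches only the two middle regions.
        for (KeyRange split : splitsPerRegion(regions, new KeyRange("h", "p"))) {
            System.out.println(split);
        }
        // prints:
        // [h, n)
        // [n, p)
    }
}
```

With splits clipped like this, a map task that hangs can be traced back to a single region and its key range, which is what makes the per-region view useful when debugging a stuck scan.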

On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Do you use TableInputFormat?
> To scan a large number of rows, it would be better to produce one split per
> region.
>
> What HBase version do you use ?
> Do you find any exception in master / region server logs around the moment
> of timeout ?
>
> Cheers
>
> On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> > I'm running a scan using the M/R framework.
> > My table contains hundreds of millions of rows, and I'm scanning about
> > 50 million of them using start/stop keys.
> >
> > The problem is that some map tasks get stuck and the task manager kills
> > these maps after 600 seconds. When retrying the task everything works
> > fine (sometimes).
> >
> > To verify that the problem is in HBase (and not in the map code) I removed
> > all the code from my map function, so it looks like this:
> > public void map(ImmutableBytesWritable key, Result value, Context context)
> >         throws IOException, InterruptedException {
> > }
> >
> > Also, when the map got stuck on a region, I tried to scan this region
> > (using a simple scan from a Java main) and it worked fine.
> >
> > Any ideas ?
> >
> > Thanks,
> > Lior
> >
>