HBase, mail # user - Scanner problem after bulk load hfile


Re: Scanner problem after bulk load hfile
Amit Sela 2013-07-14, 08:45
If new regions are created during the bulk load (are you pre-splitting?),
try calling myTable.clearRegionCache() after the bulk load (or right after
the pre-splitting, if you do pre-split).
This should clear the client-side region cache. I needed it because I
pre-split my tables for bulk load.
BTW, I'm using HBase 0.94.2.
Good luck!
On Fri, Jul 12, 2013 at 6:50 PM, Rohit Kelkar <[EMAIL PROTECTED]> wrote:

> I am having problems scanning a table created by bulk loading an HFile.
> This is what I am doing:
> Once the HFile is created, I use the following code to bulk load it:
>
> LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
> HTable myTable = new HTable(conf, mytablename.getBytes());
> loadTool.doBulkLoad(new Path(outputHFileBaseDir + "/" + mytablename),
> myTable);
>
> Then I scan the table using:
>
> HTable table = new HTable(conf, mytable);
> Scan scan = new Scan();
> scan.addColumn("cf".getBytes(), "q".getBytes());
> ResultScanner scanner = table.getScanner(scan);
> for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
>     numRowsScanned += 1;
> }
>
> This code crashes with the following error: http://pastebin.com/SeKAeAST
> If I remove the scan.addColumn call, the code works.
>
> Similarly, in the hbase shell:
> - A simple count 'mytable' in hbase shell gives the correct count.
> - A scan 'mytable' gives correct results.
> - get 'mytable', 'myrow', 'cf:q' crashes
>
> Running hadoop dfs -ls /hbase/mytable shows the .tableinfo, .tmp, the
> region directory, etc.
>
> Now, if I run major_compact 'mytable' and then execute my code with the
> scan.addColumn statement, it works. The get 'mytable', 'myrow', 'cf:q'
> also works.
>
> My question is:
> What is major_compact doing to enable the scanner that the
> LoadIncrementalHFiles tool is not? I am sure I am missing a step after
> LoadIncrementalHFiles.
>
> - R
>