HBase, mail # user - Scanner problem after bulk load hfile


Re: Scanner problem after bulk load hfile
Amit Sela 2013-07-15, 14:14
Well, I know it's kind of voodoo but try it once before pre-split and once
after. Worked for me.
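
A minimal sketch of that suggestion, assuming the 0.94-era client API; the table name and HFile path below are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class ClearCacheAroundBulkLoad {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table name

        // Once: right after pre-splitting, drop the client's cached region
        // locations so it re-fetches the new region boundaries.
        table.clearRegionCache();

        LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
        loadTool.doBulkLoad(new Path("/tmp/hfiles/mytable"), table); // hypothetical path

        // And once more after the bulk load, in case the load created regions.
        table.clearRegionCache();
        table.close();
    }
}
```

clearRegionCache() only invalidates the client-side region location cache, not the table itself, so calling it in both places is cheap.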
On Mon, Jul 15, 2013 at 7:27 AM, Rohit Kelkar <[EMAIL PROTECTED]> wrote:

> Thanks Amit, I am also using 0.94.2. I am also pre-splitting and I tried
> the table.clearRegionCache() but it still doesn't work.
>
> - R
>
>
> On Sun, Jul 14, 2013 at 3:45 AM, Amit Sela <[EMAIL PROTECTED]> wrote:
>
> > If new regions are created during the bulk load (are you pre-splitting?),
> > maybe try myTable.clearRegionCache() after the bulk load (or even after
> > the pre-splitting if you do pre-split).
> > This should clear the region cache. I needed to use this because I am
> > pre-splitting my tables for bulk load.
> > BTW I'm using HBase 0.94.2
> > Good luck!
> >
> >
> > On Fri, Jul 12, 2013 at 6:50 PM, Rohit Kelkar <[EMAIL PROTECTED]>
> > wrote:
> >
> > > I am having problems while scanning a table created by bulk loading
> > > HFiles. This is what I am doing -
> > > Once the HFile is created I use the following code to bulk load:
> > >
> > > LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
> > > HTable myTable = new HTable(conf, mytablename.getBytes());
> > > loadTool.doBulkLoad(new Path(outputHFileBaseDir + "/" + mytablename),
> > > myTable);
> > >
> > > Then scan the table using-
> > >
> > > HTable table = new HTable(conf, mytable);
> > > Scan scan = new Scan();
> > > scan.addColumn("cf".getBytes(), "q".getBytes());
> > > ResultScanner scanner = table.getScanner(scan);
> > > for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
> > > numRowsScanned += 1;
> > > }
> > >
> > > This code crashes with following error - http://pastebin.com/SeKAeAST
> > > If I remove the scan.addColumn from the code then the code works.
> > >
> > > Similarly on the hbase shell -
> > > - A simple count 'mytable' in hbase shell gives the correct count.
> > > - A scan 'mytable' gives correct results.
> > > - get 'mytable', 'myrow', 'cf:q' crashes
> > >
> > > The hadoop dfs -ls /hbase/mytable shows the .tableinfo, .tmp, the
> > > directory for the region etc.
> > >
> > > Now if I do a major_compact 'mytable' and then execute my code with the
> > > scan.addColumn statement then it works. Also the get 'mytable',
> > > 'myrow', 'cf:q' works.
> > >
> > > My question is:
> > > What is major_compact doing to enable the scanner that the
> > > LoadIncrementalHFiles tool is not? I am sure I am missing a step after
> > > the LoadIncrementalHFiles.
> > >
> > > - R
> > >
> >
>
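
For the scan step in the quoted code, a sketch that also releases the scanner (the original snippet never closes it), assuming the same 0.94-era API; the table name is hypothetical, and the column family and qualifier follow the thread's example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScanAfterBulkLoad {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table name
        table.clearRegionCache(); // per the advice earlier in the thread

        Scan scan = new Scan();
        scan.addColumn("cf".getBytes(), "q".getBytes());
        ResultScanner scanner = table.getScanner(scan);
        try {
            int numRowsScanned = 0;
            for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                numRowsScanned += 1;
            }
            System.out.println("rows scanned: " + numRowsScanned);
        } finally {
            scanner.close(); // release server-side scanner resources
            table.close();
        }
    }
}
```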
Rohit Kelkar 2013-07-16, 20:15
Jimmy Xiang 2013-07-16, 20:28
Rohit Kelkar 2013-07-16, 21:33
Ted Yu 2013-07-16, 21:41
Jimmy Xiang 2013-07-16, 21:41
lars hofhansl 2013-07-16, 22:40
Ted Yu 2013-07-16, 23:15
Rohit Kelkar 2013-07-16, 23:20
Rohit Kelkar 2013-07-16, 23:39
lars hofhansl 2013-07-17, 00:21
Rohit Kelkar 2013-07-16, 23:11
Rohit Kelkar 2013-07-16, 21:53