Can you please provide your hbase-env.sh, hbase-site.xml, and a describe of
the table in question? Also, what is your row key setup? When you are
doing the write do you see different region servers being written to or
just one? How many rows are in this 5GB of data?
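If it helps, the schema and per-regionserver load can be pulled with something like the following on the master ('mytable' below is just a placeholder for your table name):

```shell
# Write the diagnostic commands to a script you can run with `hbase shell`
# on the master; 'mytable' is a placeholder table name.
cat > /tmp/diag.rb <<'EOF'
describe 'mytable'
status 'detailed'
EOF
# hbase shell /tmp/diag.rb
```

If `status 'detailed'` shows requests piling up on one region server while the others sit idle, the writes are likely all landing in a single region.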
On Sat, Oct 27, 2012 at 8:05 AM, nick maillard <
[EMAIL PROTECTED]> wrote:
> Hi everyone
> So I've set up a Hadoop/HBase/Hive cluster of 3 Ubuntu machines:
> master: Ubuntu 64-bit, 8-core 3GHz, 16GB RAM, gigabit Ethernet connection
> 2 slaves: the same
> I went through the various documentation, blogs and articles on Hadoop
> and/or HBase understanding and tuning: 7 map/reduce tasks, raised heap
> params, xcievers, compression, speculative execution off, etc.
> I've installed YCSB to start stress testing, as well as testing on my own
> set of data. Looking around I saw a lot of experiences and tools to test,
> but to put it simply I don't know what I should expect.
> When I import a 5 GB file through importtsv it takes about an hour (my
> imports are not incremental).
> When I stress test with one thread writing 10 million entries it takes
> over an hour.
> When I ask through Hive something like 'select * from tableA where
> valueC=1' on a table of about 1.5 million elements, it takes 4 minutes to
> resolve. Arguably I should have a row key to really get a good time, but
> this example is to test map/reduce against a dataset.
> So all in all, what should I expect? Is my dataset too small, making this
> seem like a relatively long time? The writes seem really long, and
> resolving through map/reduce seems long as well. Of course, maybe the time
> would be the same for a much larger set, which would make a lot more
> sense.
> Just for info, I have checked with iostat and my disks are about 95% idle.
> So I'd be grateful if someone were kind enough to share what kind of
> performance I could expect with my cluster, just to see if my setup is
> really not responding how it should, or if I'm using it the wrong way. Or
> if this is coherent.
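For reference, a typical importtsv run looks like the below; the table name, column family, and paths are placeholders, not taken from your setup. Generating a small TSV first lets you sanity-check the key/column layout before pointing it at the 5 GB file:

```shell
# Build a small sample TSV (row key <TAB> value) to verify the format
# importtsv expects; 'mytable' and 'cf:val' are placeholder names.
seq 1 1000 | awk '{printf "row%07d\t%d\n", $1, $1*2}' > /tmp/sample.tsv
head -3 /tmp/sample.tsv

# On the cluster (file must be in HDFS, HBase on the PATH):
# hadoop fs -put /tmp/sample.tsv /user/nick/sample.tsv
# hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
#     -Dimporttsv.columns=HBASE_ROW_KEY,cf:val mytable /user/nick/sample.tsv
```

If the import is funneling into a single region, pre-splitting the table or using importtsv's bulk-load path (-Dimporttsv.bulk.output plus completebulkload) usually helps far more than heap tuning.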
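On the YCSB side, a single loader thread will understate what the cluster can absorb, so it's worth confirming the thread count. A typical invocation might look like this; the column family name 'family' and the workload file are assumptions on my part, so adjust them to match your table:

```shell
# Hypothetical YCSB load against HBase; 'family' and the workload path
# are placeholders -- adjust them to match your table's describe output.
# Run from the YCSB install directory on a client node.
bin/ycsb load hbase -P workloads/workloada \
    -p columnfamily=family -p recordcount=10000000 -threads 8 -s
```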
Customer Operations Engineer, Cloudera