HBase >> mail # user >> sum, avg, count, etc...


sum, avg, count, etc...
I am trying to compute some simple statistics on my data, but it's taking
longer than expected.

Here is how my data is structured in HBase.

keys (symbol#epoch timestamp#exchange)
msft#1319562974#NASDAQ
t#1319562974#NYSE
yhoo#1319562974#NASDAQ
msft#1319562975#NASDAQ

The values look like this (for instance, Microsoft):
...
price=26.81
openclose...

There are about 300 values per key.
So, for instance, if I want to calculate the avg price of msft, I set up a
start and stop filter, and it is able to calculate it by tick. But it's taking
about 7 seconds to go through 500 keys. Is that normal? Is there a faster way
to calculate sum, avg, and count in HBase? Would I need to redo my schema?
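
To make the approach above concrete, here is a minimal sketch of a start/stop
range scan that collects sum, avg and count in one pass. It does not use the
HBase client at all: a sorted in-memory TreeMap stands in for the table, and
the names PriceStats, Stats and aggregate are made up for illustration. The
second price is invented sample data; only 26.81 comes from my table. It
assumes ASCII row keys, so that Java string order matches byte order.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PriceStats {
    /** Running aggregates collected in a single pass over the scanned rows. */
    static final class Stats {
        long count;
        double sum;

        double avg() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    /**
     * Walks the half-open key range [symbol#, symbol$) of a sorted table.
     * '$' is the character right after '#', so the range covers exactly
     * the keys that start with "symbol#".
     */
    static Stats aggregate(NavigableMap<String, Double> table, String symbol) {
        String startRow = symbol + "#"; // inclusive
        String stopRow = symbol + "$";  // exclusive
        Stats stats = new Stats();
        for (Map.Entry<String, Double> row
                : table.subMap(startRow, true, stopRow, false).entrySet()) {
            stats.count++;
            stats.sum += row.getValue();
        }
        return stats;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the table: row key -> price column only.
        NavigableMap<String, Double> table = new TreeMap<>();
        table.put("msft#1319562974#NASDAQ", 26.81);
        table.put("msft#1319562975#NASDAQ", 26.83); // invented sample value
        table.put("t#1319562974#NYSE", 28.50);      // invented sample value
        table.put("yhoo#1319562974#NASDAQ", 16.20); // invented sample value

        Stats msft = aggregate(table, "msft");
        System.out.printf("count=%d sum=%.2f avg=%.2f%n",
                msft.count, msft.sum, msft.avg());
    }
}
```

The sketch only carries the price per row, whereas my real rows hold about
300 values each.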

tia

--
--- Get your facts first, then you can distort them as you please.--