HBase >> mail # user >> sum, avg, count, etc...


Rita 2011-10-26, 10:21
Doug Meil 2011-10-26, 13:18
Gary Helmling 2011-10-26, 18:49
Rita 2011-10-27, 00:27
Re: sum, avg, count, etc...
For the values,
...
price=26.81
open
close
...
Does HBase do a full scan across all values, or does it have a
constant-time lookup, O(1)?
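
One way to picture the question above is a toy model rather than HBase itself. HBase keeps cells sorted by row key, column family, and qualifier, so the sketch below stores one row's columns in a sorted map and contrasts a keyed lookup with a walk over every value. The class name, method names, and the `fieldNNN` qualifiers are made up for illustration; only `price=26.81` comes from the post. How much data a real region server reads depends on the HBase version, the block size, and which columns the request names, so treat this as an analogy and not a measurement.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: one row's columns held in a sorted map.
// All names are hypothetical; this is not the HBase client API.
class SortedColumnSketch {

    // Builds a row with 300 columns: 299 made-up fields plus the price from the post.
    static NavigableMap<String, String> sampleRow() {
        NavigableMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 299; i++) {
            row.put(String.format("field%03d", i), "x"); // made-up filler columns
        }
        row.put("price", "26.81");
        return row;
    }

    // Keyed lookup in the sorted map: O(log n) in this model,
    // so neither a full scan nor a constant-time hash lookup.
    static String getColumn(NavigableMap<String, String> row, String qualifier) {
        return row.get(qualifier);
    }

    // Walks the columns in sorted order and reports how many were examined
    // before the qualifier was found (or all of them if it is absent).
    static int cellsExaminedByFullScan(NavigableMap<String, String> row, String qualifier) {
        int examined = 0;
        for (Map.Entry<String, String> cell : row.entrySet()) {
            examined++;
            if (cell.getKey().equals(qualifier)) {
                break;
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = sampleRow();
        System.out.println("price via keyed lookup: " + getColumn(row, "price"));
        System.out.println("cells examined by full walk: "
                + cellsExaminedByFullScan(row, "price"));
    }
}
```

In this model `price` sorts after every `fieldNNN` qualifier, so the full walk examines all 300 cells while the keyed lookup does not.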

On Wed, Oct 26, 2011 at 8:27 PM, Rita <[EMAIL PROTECTED]> wrote:

> Thanks for all of your responses.
>
> The original file is a text file, and searching it with grep takes
> minutes. So taking 7 seconds isn't too bad.
>
> Thanks again for your time and advice.
>
>
> On Wed, Oct 26, 2011 at 2:49 PM, Gary Helmling <[EMAIL PROTECTED]> wrote:
>
>> Also, make sure that you're either setting a stop row on the scan, or
>> if you're using a filter, try wrapping it in a WhileMatchFilter.  This
>> tells the scanner it can stop as soon as the filter starts rejecting
>> rows.  Otherwise you can wind up getting back just the data you
>> expect, but still scanning all the way to the end of the table, just
>> filtering out all the remaining rows.
>>
>> On Wed, Oct 26, 2011 at 6:18 AM, Doug Meil
>> <[EMAIL PROTECTED]> wrote:
>> > Hi there-
>> >
>> > First, make sure you aren't tripping on any of these issues:
>> >
>> > http://hbase.apache.org/book.html#perf.reading
>> >
>> >
>> >
>> >
>> >
>> > On 10/26/11 6:21 AM, "Rita" <[EMAIL PROTECTED]> wrote:
>> >
>> >>I am trying to do some simple statistics with my data, but it's taking
>> >>longer than expected.
>> >>
>> >>
>> >>
>> >>Here is how my data is structured in HBase.
>> >>
>> >>keys (symbol#epoch time stamp#exchange)
>> >>msft#1319562974#NASDAQ
>> >>t#1319562974#NYSE
>> >>yhoo#1319562974#NASDAQ
>> >>msft#1319562975#NASDAQ
>> >>
>> >>The values look like this (for instance, Microsoft):
>> >>...
>> >>price=26.81
>> >>open
>> >>close
>> >>...
>> >>
>> >>There are about 300 values per key.
>> >>
>> >>
>> >>So, for instance, if I want to calculate the avg price of msft, I am
>> >>setting up a start and stop filter, and it's able to calculate it by
>> >>tick. But it's taking about 7 seconds to go through 500 keys. Is that
>> >>normal? Is there a faster way to calculate sum, avg, count in HBase?
>> >>Would I need to redo my schema?
>> >>
>> >>tia
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>--
>> >>--- Get your facts first, then you can distort them as you please.--
>> >
>> >
>>
>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>

--
--- Get your facts first, then you can distort them as you please.--
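
For readers following the thread, the stop-row and WhileMatchFilter advice quoted above, together with the sum/avg/count question, can be sketched outside HBase. The block below is a minimal model and not HBase client code: a `TreeMap` stands in for a table sorted by row key, the key layout copies the `symbol#epoch#exchange` examples from the original post, and each row is reduced to a single price. The class name, method names, and every price other than 26.81 are made up for illustration. It assumes the stop row is exclusive, and it uses `msft$` as the stop row because `$` is the next ASCII character after `#`.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: a sorted map stands in for an HBase table ordered by row key.
// Every name below is hypothetical; none of it is the HBase client API.
class RowKeyScanSketch {

    // Running aggregate plus the number of rows the simulated scanner touched.
    static final class Stats {
        long count;
        double sum;
        long rowsVisited;

        double average() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    // Bounded scan: start row inclusive, stop row exclusive.
    static Stats scanWithStopRow(NavigableMap<String, Double> table,
                                 String startRow, String stopRow) {
        Stats stats = new Stats();
        for (Map.Entry<String, Double> row
                : table.subMap(startRow, true, stopRow, false).entrySet()) {
            stats.rowsVisited++;
            stats.count++;
            stats.sum += row.getValue();
        }
        return stats;
    }

    // Prefix filter with no stop row. With stopAtFirstReject == false the scan
    // walks to the end of the table; with true it stops at the first rejected
    // row, which is the idea behind wrapping a filter in WhileMatchFilter.
    static Stats scanWithPrefixFilter(NavigableMap<String, Double> table,
                                      String startRow, String prefix,
                                      boolean stopAtFirstReject) {
        Stats stats = new Stats();
        for (Map.Entry<String, Double> row : table.tailMap(startRow, true).entrySet()) {
            stats.rowsVisited++;
            if (row.getKey().startsWith(prefix)) {
                stats.count++;
                stats.sum += row.getValue();
            } else if (stopAtFirstReject) {
                break;
            }
        }
        return stats;
    }

    // Four rows shaped like the keys in the post; one price per row.
    static NavigableMap<String, Double> sampleTable() {
        NavigableMap<String, Double> table = new TreeMap<>();
        table.put("msft#1319562974#NASDAQ", 26.81); // price from the post
        table.put("msft#1319562975#NASDAQ", 26.83); // made-up price
        table.put("t#1319562974#NYSE", 28.50);      // made-up price
        table.put("yhoo#1319562974#NASDAQ", 16.20); // made-up price
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, Double> table = sampleTable();
        Stats bounded = scanWithStopRow(table, "msft#", "msft$");
        Stats filterOnly = scanWithPrefixFilter(table, "msft#", "msft#", false);
        Stats whileMatch = scanWithPrefixFilter(table, "msft#", "msft#", true);

        System.out.printf("msft count=%d sum=%.2f avg=%.2f%n",
                bounded.count, bounded.sum, bounded.average());
        System.out.println("rows visited with stop row: " + bounded.rowsVisited);
        System.out.println("rows visited with filter only: " + filterOnly.rowsVisited);
        System.out.println("rows visited with stop at first reject: " + whileMatch.rowsVisited);
    }
}
```

All three variants compute the same count and sum for msft. In this four-row table the bounded scan touches 2 rows, the filter-only scan touches 4, and the stop-at-first-reject scan touches 3, since it has to see one rejected row before it can stop.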
Doug Meil 2011-10-29, 15:26