HBase, mail # user - sum, avg, count, etc...


Rita 2011-10-26, 10:21
Doug Meil 2011-10-26, 13:18
Gary Helmling 2011-10-26, 18:49
Rita 2011-10-27, 00:27
Re: sum, avg, count, etc...
Rita 2011-10-29, 12:38
For the values,
...
price=26.81
open
close
...
Does HBase do a full scan across all of a row's values to find one of them,
or is it a constant-time lookup, O(1)?

On Wed, Oct 26, 2011 at 8:27 PM, Rita <[EMAIL PROTECTED]> wrote:

> Thanks for all of your responses.
>
> The original file is a text file and when I try to search that using grep
> it takes minutes. So, taking 7 seconds ain't too bad.
>
> Thanks again for your time and advice.
>
>
> On Wed, Oct 26, 2011 at 2:49 PM, Gary Helmling <[EMAIL PROTECTED]> wrote:
>
>> Also, make sure that you're either setting a stop row on the scan, or
>> if you're using a filter, try wrapping it in a WhileMatchFilter.  This
>> tells the scanner it can stop as soon as the filter starts rejecting
>> rows.  Otherwise you can wind up getting back just the data you
>> expect, but still scanning all the way to the end of the table, just
>> filtering out all the remaining rows.
>>
>> On Wed, Oct 26, 2011 at 6:18 AM, Doug Meil
>> <[EMAIL PROTECTED]> wrote:
>> > Hi there-
>> >
>> > First, make sure you aren't tripping on any of these issues..
>> >
>> > http://hbase.apache.org/book.html#perf.reading
>> >
>> >
>> >
>> >
>> >
>> > On 10/26/11 6:21 AM, "Rita" <[EMAIL PROTECTED]> wrote:
>> >
>> >>I am trying to do some simple statistics with my data but it's taking
>> >>longer than expected.
>> >>
>> >>
>> >>
>> >>Here is how my data is structured in hbase.
>> >>
>> >>keys (symbol#epoch timestamp#exchange)
>> >>msft#1319562974#NASDAQ
>> >>t#1319562974#NYSE
>> >>yhoo#1319562974#NASDAQ
>> >>msft#1319562975#NASDAQ
>> >>
>> >>The values look like this (for instance microsoft)
>> >>...
>> >>price=26.81
>> >>open
>> >>close
>> >>...
>> >>
>> >>There are about 300 values per key.
>> >>
>> >>
>> >>So, for instance, if I want to calculate the avg price of msft I am
>> >>setting up a start and stop filter and it's able to calculate it by
>> >>tick. But it's taking about 7 seconds to go through 500 keys. Is that
>> >>normal? Is there a faster way to calculate sum, avg, count in hbase?
>> >>Would I need to redo my schema?
>> >>
>> >>tia
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>--
>> >>--- Get your facts first, then you can distort them as you please.--
>> >
>> >
>>
>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>

--
--- Get your facts first, then you can distort them as you please.--
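
To make the stop-row and WhileMatchFilter advice quoted above concrete, here is a minimal sketch in plain Java. It does not use the HBase client API: a java.util.TreeMap stands in for the table, on the assumption that what matters is that HBase keeps rows sorted by row key. The class and method names are hypothetical, and every price except the 26.81 from the thread is made up for illustration. It compares how many rows a scan has to visit with a stop row, with a filter only, and with a filter that gives up at the first rejected row (the effect Gary describes for WhileMatchFilter).

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Stand-in for an HBase table: a TreeMap keeps keys sorted, much as HBase
// keeps rows sorted by row key. This is NOT the HBase client API.
final class PrefixScanSketch {

    // Outcome of one simulated scan.
    static final class Result {
        final double average;  // mean price of matching rows (NaN if none)
        final int rowsVisited; // rows the simulated scanner had to examine
        Result(double average, int rowsVisited) {
            this.average = average;
            this.rowsVisited = rowsVisited;
        }
    }

    // Sample rows using the key layout from the thread.
    // Only 26.81 appears in the thread; the other prices are invented.
    static NavigableMap<String, Double> sampleTable() {
        NavigableMap<String, Double> table = new TreeMap<>();
        table.put("msft#1319562974#NASDAQ", 26.81);
        table.put("msft#1319562975#NASDAQ", 26.83);
        table.put("t#1319562974#NYSE", 29.10);
        table.put("yhoo#1319562974#NASDAQ", 16.50);
        return table;
    }

    // Start row plus stop row: only the rows in [start, stop) are touched.
    static Result scanWithStopRow(NavigableMap<String, Double> table, String symbol) {
        String startRow = symbol + "#";
        String stopRow = symbol + "$"; // '$' is the character right after '#'
        double sum = 0;
        int visited = 0;
        for (double price : table.subMap(startRow, true, stopRow, false).values()) {
            sum += price;
            visited++;
        }
        return new Result(visited == 0 ? Double.NaN : sum / visited, visited);
    }

    // Start row plus a prefix check. Without stopAtFirstMiss the scan runs to
    // the end of the table and merely discards non-matching rows; with it,
    // the scan ends at the first rejected row.
    static Result scanWithFilter(NavigableMap<String, Double> table, String symbol,
                                 boolean stopAtFirstMiss) {
        String startRow = symbol + "#";
        double sum = 0;
        int matched = 0;
        int visited = 0;
        for (Map.Entry<String, Double> row : table.tailMap(startRow, true).entrySet()) {
            visited++;
            if (row.getKey().startsWith(startRow)) {
                sum += row.getValue();
                matched++;
            } else if (stopAtFirstMiss) {
                break; // rows are sorted, so no later row can match
            }
        }
        return new Result(matched == 0 ? Double.NaN : sum / matched, visited);
    }
}
```

All three variants return the same average for "msft"; only the number of rows visited differs. On a real table the gap grows with the number of rows that sort after the symbol of interest, which is why setting a stop row, or stopping at the first rejected row, is worth checking before redoing the schema.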
Doug Meil 2011-10-29, 15:26