Drill >> mail # dev >> Steam lib


Re: Steam lib
Ted -

Any chance we can add your quantile estimator to stream-lib?

Matt

On Wed, Nov 13, 2013 at 5:38 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> I also have a new quantile estimator that dominates all other
> implementations that I know of on speed and accuracy (10us per point added,
> 8K data size to get a few ppm accuracy for high or low quantiles and about
> 0.05% accuracy on middle quantiles like the median).
>
> On Wed, Nov 13, 2013 at 8:53 AM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
>
>> Summingbird uses algebird. I think Stripe might also have a library, Avi
>> Bryant was toying with this for a while.
>>
>> Algebird has some nice features like not doing approximation at all for
>> small sets (just use the real values), etc. We also recently did a bunch of
>> work to make sure we can serialize all approximate structures so they can
>> be correctly reused by different computations, sent across the wire, etc.
>>
>> I don't recall doing speed comparisons and the like; it would be
>> interesting to see them if you guys are choosing what library to use.
>>
>> On Nov 13, 2013, at 12:33 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>>
>> > stream-lib is used quite widely and is generally high quality.
>> >
>> > The other competitive library is Brick House from Klout.
>> >
>> >
>> > http://engineering.klout.com/2013/01/introducing-brickhouse-major-open-source-release-from-klout/
>> >
>> > On Tue, Nov 12, 2013 at 7:28 PM, Timothy Chen <[EMAIL PROTECTED]> wrote:
>> >
>> >> Just saw this library today and thought it's something we can potentially
>> >> leverage:
>> >>
>> >> https://github.com/addthis/stream-lib
>> >>
>> >> It has a number of algorithms for approximate stream processing and has code for
>> >> cardinality estimation (HyperLogLog) and others.
>> >>
>> >> Looks like Twitter's SummingBird uses this library too.
>> >>
>> >> Tim
>> >>
>>