NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
HBase >> mail # user >> Hbase for real-time data aggregation


Re: Hbase for real-time data aggregation
In my experience, HBase is not a bad choice for this. The only problem is
that you will not get ready-made features. If you go with it, look at the
indexing options available for HBase. You can try HSearch and the Lily
project for indexing and fast retrieval.

On Fri, Jan 6, 2012 at 11:25 PM, prasenjit mukherjee
<[EMAIL PROTECTED]> wrote:

> I need to design a near real-time system where documents ( with
> fields:id,keywords,timestamp ) are getting added to the system. The
> requirement is to get top-k keywords from the documents added to the
> system in last x minutes. The typical document addition rate is around
> 100 documents/sec, which may increase in the future ( hence technology
> should be horizontally scalable ).
>
> I am thinking of using hbase. For each document we can add a set of
> keys ( for all the keywords in that doc )  with timestamp_keywords.
> During query time we can run a map-reduce job over a keyrange ( from
> ts1_* to ts2_* ) to compute the keyword frequency for that range.
>
> Are there better technologies for this use case, such as MongoDB,
> Cassandra, or Storm? The use case is primarily about aggregation.
>
> -prasen
>
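
The row-key scheme proposed in the quoted message (one key per keyword,
prefixed by timestamp, then a range scan from ts1 to ts2 to count keyword
frequencies) can be sketched roughly as below. This is only an in-memory
illustration, not HBase client code: a sorted Python list stands in for
HBase's lexicographically ordered row keys, and the names SortedKeyStore,
make_row_key, add_document and top_k_keywords are hypothetical. It assumes
zero-padded epoch-second timestamps so that string order matches time
order, appends the document id to keep keys unique (an addition not in the
original proposal), and assumes keywords contain no underscore.

```python
import bisect
from collections import Counter

TS_WIDTH = 10  # assumption: epoch seconds, zero-padded so string order == time order


def make_row_key(ts, keyword, doc_id):
    """Build a key shaped like <timestamp>_<keyword>_<doc_id> (hypothetical layout)."""
    return f"{ts:0{TS_WIDTH}d}_{keyword}_{doc_id}"


class SortedKeyStore:
    """Tiny stand-in for a table whose rows are kept sorted by key."""

    def __init__(self):
        self._keys = []

    def put(self, key):
        # Insert while preserving sort order; ignore exact duplicates.
        idx = bisect.bisect_left(self._keys, key)
        if idx == len(self._keys) or self._keys[idx] != key:
            self._keys.insert(idx, key)

    def scan_range(self, start_key, stop_key):
        """Return keys k with start_key <= k < stop_key (start inclusive, stop exclusive)."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return self._keys[lo:hi]


def add_document(store, doc_id, keywords, ts):
    """Write one key per distinct keyword in the document."""
    for kw in set(keywords):
        store.put(make_row_key(ts, kw, doc_id))


def top_k_keywords(store, ts1, ts2, k):
    """Count keywords in keys with ts1 <= timestamp <= ts2; return the k most frequent."""
    start = f"{ts1:0{TS_WIDTH}d}_"
    stop = f"{ts2 + 1:0{TS_WIDTH}d}_"  # exclusive upper bound covers all of ts2
    counts = Counter()
    for key in store.scan_range(start, stop):
        # Assumes keywords contain no "_"; the doc id may contain it.
        _, keyword, _ = key.split("_", 2)
        counts[keyword] += 1
    return counts.most_common(k)


store = SortedKeyStore()
add_document(store, "d1", ["hbase", "storm"], 100)
add_document(store, "d2", ["hbase"], 105)
add_document(store, "d3", ["hbase", "cassandra"], 200)
print(top_k_keywords(store, 100, 110, 1))  # [('hbase', 2)]
```

In an actual HBase table the range lookup would be a scan bounded by start
and stop rows, with the counting done in the map-reduce job the message
describes. The sketch only shows why timestamp-prefixed keys make the
"last x minutes" query a contiguous range.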

--
Shashwat Shriparv
09900059620
09663531241
