HBase >> mail # user >> HBase tasks


RE: HBase tasks
Hi,
> But what to do, if I have an HBase in-memory table,
Why do you say an in-memory table? Is all the data in memory? Can you explain a bit about this?

Yes, there is an MR job to scan the HBase table data (full or partial).
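For reference, a MapReduce scan over an HBase table is usually wired up with TableMapReduceUtil. The following is a minimal sketch, assuming the HBase client and MapReduce jars are on the classpath and a cluster is running; the table name "mytable", the output path argument, and the map logic are placeholders, not details from this thread:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ScanTableJob {
  static class RowMapper extends TableMapper<Text, Text> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
        throws java.io.IOException, InterruptedException {
      // Placeholder logic: emit the row key; a real job would read cells
      // from `value` and do the actual processing here.
      ctx.write(new Text(Bytes.toString(row.get())), new Text(""));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "scan-mytable");
    job.setJarByClass(ScanTableJob.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // fetch more rows per RPC round-trip
    scan.setCacheBlocks(false);  // don't churn the block cache from a batch scan
    // For a partial scan, restrict the key range with
    // scan.setStartRow(...) / scan.setStopRow(...).

    TableMapReduceUtil.initTableMapperJob(
        "mytable", scan, RowMapper.class, Text.class, Text.class, job);
    job.setNumReduceTasks(0);    // map-only scan
    FileOutputFormat.setOutputPath(job, new Path(args[0]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The same job covers the full table or only part of it, depending on the key range set on the Scan.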

When you say you want to retrieve data fast, what is the amount of data? How many regions? Have you done any testing with the scan APIs?

Which version of HBase?

-Anoop-
________________________________________
From: Pavel Hančar [[EMAIL PROTECTED]]
Sent: Saturday, April 06, 2013 10:15 PM
To: [EMAIL PROTECTED]
Subject: HBase tasks

  Hello,
maybe I don't understand one basic thing. MapReduce jobs are meant for long-running
jobs that process big data. But what should I do if I have an HBase
in-memory table where I would like to process all (or selected) records
with minimal response time? Also with MapReduce?
   If so, are there any features to speed up the processing? Is it possible to
avoid some disk writes/reads?
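For context, HBase has no separate in-memory table type; rather, a column family can be marked IN_MEMORY, which gives its blocks priority in the region server's block cache (the data is still persisted to disk). A sketch in the HBase shell, with placeholder table and family names:

```ruby
# Run inside `hbase shell`; 'mytable' and 'cf' are placeholders.
create 'mytable', {NAME => 'cf', IN_MEMORY => 'true'}

# For an existing table (older HBase versions require disable/enable
# around schema changes):
disable 'mytable'
alter 'mytable', {NAME => 'cf', IN_MEMORY => 'true'}
enable 'mytable'
```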
  I am trying to compare some vectors extracted from pictures and to sort the
output with a single empty reducer. Then I read the output from a web
application. In particular, the final write of the single reducer's output,
followed by the web application reading it back, seems strange to me. Is it
possible to get an iterator from the reducer instead of the output file?
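As far as I know, a reducer cannot hand a live iterator to another process: its output is materialized (typically to HDFS) and read back. If the result set is small enough, one alternative, sketched below under that assumption, is to pull the scored records into the client (the web application) and do the final sort there, exposing an iterator directly; the Score class and sample data are hypothetical, standing in for the vector-comparison output:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

public class ClientSideSort {
  // Hypothetical (imageId, similarity) pair, standing in for one record
  // of the vector-comparison output described above.
  static final class Score {
    final String imageId;
    final double similarity;
    Score(String imageId, double similarity) {
      this.imageId = imageId;
      this.similarity = similarity;
    }
  }

  // Sort scores descending by similarity and return an iterator, so the
  // web application can stream results without a file round-trip.
  static Iterator<Score> sortedIterator(List<Score> scores) {
    List<Score> copy = new ArrayList<>(scores);
    copy.sort(Comparator.comparingDouble((Score s) -> s.similarity).reversed());
    return copy.iterator();
  }

  public static void main(String[] args) {
    List<Score> scores = new ArrayList<>();
    scores.add(new Score("img1", 0.42));
    scores.add(new Score("img2", 0.91));
    scores.add(new Score("img3", 0.17));
    for (Iterator<Score> it = sortedIterator(scores); it.hasNext(); ) {
      Score s = it.next();
      System.out.println(s.imageId + " " + s.similarity);
    }
  }
}
```

This only makes sense when the scored results fit comfortably in the client's memory; for larger outputs the MR write-then-read pattern is the normal approach.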
  Thanks,
  Pavel Hančar