The scenario I am thinking of is to: 1) output to the local file system (e.g. Linux) instead of HDFS, and 2) have each region server output only its own data to its own node's file system.
To elaborate on 2) a bit: basically, this would be like exporting HBase data to the local file system without going through the network, with one file created on each node.
Is there a way to achieve this? Actually, the receiving side of 1) doesn't have to be a file system; it could be another process that consumes the data. But let's use the file system to simplify the scenario for now.
This sounds an awful lot like a map-only MR job... With Hadoop Streaming, you should be able to achieve your goal of piping to an arbitrary process.
On Tue, Aug 19, 2014 at 4:26 PM, Demai Ni <[EMAIL PROTECTED]> wrote:
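As a rough illustration of the map-only Streaming idea, here is a minimal sketch of a mapper script that copies whatever arrives on stdin into a file on the local disk of the node where the task runs. Several things here are assumptions, not part of the original discussion: the `EXPORT_DIR` variable and the file naming are hypothetical, the job's input format is assumed to hand each HBase row to the mapper as one text line, and the task id is read from the `mapreduce_task_attempt_id` environment variable with a fallback to the process id. Note also that task placement next to the region is best effort, so this does not by itself guarantee that no data crosses the network.

```python
#!/usr/bin/env python
"""Sketch of a map-only Hadoop Streaming mapper that writes to local disk."""
import os
import socket
import sys


def local_output_path(directory, hostname, task_id):
    """Build a per-task file name on the node's local file system."""
    return os.path.join(directory, "export-%s-%s.tsv" % (hostname, task_id))


def copy_records(lines, out):
    """Write each non-empty input line to `out`; return how many were written."""
    count = 0
    for line in lines:
        record = line.rstrip("\n")
        if record:
            out.write(record + "\n")
            count += 1
    return count


def main():
    # EXPORT_DIR is a hypothetical setting for this sketch, not a Hadoop option.
    directory = os.environ.get("EXPORT_DIR", "/tmp/hbase-export")
    # Assumption: the task attempt id is available in the environment;
    # fall back to the process id when it is not.
    task_id = os.environ.get("mapreduce_task_attempt_id", str(os.getpid()))
    if not os.path.isdir(directory):
        os.makedirs(directory)
    path = local_output_path(directory, socket.gethostname(), task_id)
    with open(path, "w") as out:
        copied = copy_records(sys.stdin, out)
    # stderr ends up in the task log; nothing is emitted as job output.
    sys.stderr.write("wrote %d records to %s\n" % (copied, path))


if __name__ == "__main__":
    main()
```

The sketch writes one file per task rather than one per node, because several map tasks may run on the same node and would otherwise interleave their writes. Replacing the `open(...)` call with a pipe to another program would cover the "arbitrary process" variant.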
A coprocessor is certainly possible. You haven't shared your motivation, only a specific implementation, so I cannot assist further.
On Tue, Aug 19, 2014 at 6:28 PM, Demai Ni <[EMAIL PROTECTED]> wrote:
I am not sure of the exact use case yet; I am just doing some experiments. The current idea is to join with data from an MPP database and have a program from the MPP side run on each HBase node, so that instead of collecting all the data in one place, the join can happen at the region server level. Actually, a join may not be a good example here. The idea is to access data at the region server level but still be able to leverage HBase filters.
Demai on the run
On Aug 19, 2014, at 7:39 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote: