

Re: hadoop cluster for querying data on mongodb
Hi Ayon,

I tried to set up the Hadoop cluster using hadoop-0.20.2 and it seems to be
OK, but when I tried another version of Hadoop, such as hadoop-0.20.3,
running start-all.sh gave me an error like this:

uvm12dk: Unrecognized option: -jvm
uvm12dk: Could not create the Java virtual machine.

Would you be so kind as to help me with this problem?

Thanks.

Martinus

On Wed, Dec 21, 2011 at 1:12 PM, Ayon Sinha <[EMAIL PROTECTED]> wrote:

> Couple of things:
> 1. Hadoop's strength is in data locality, so keep most of your Hadoop
> heavy lifting on the local filesystem (HDFS, where Hadoop computation is
> shipped to the nodes with the data).
> 2. Assuming you are pulling data into Hadoop from Mongo to crunch and put
> the resulting data back into Mongo as only the 1st and the last step in
> your entire workflow, you are basically looking for a MongoInputFormat and
> MongoOutputFormat (I made up the class names). You are probably looking for
> https://jira.mongodb.org/browse/HADOOP/component/10736
>
> Your other option, if using Pig or Hive, is to write loader UDFs, similar
> to PigStorage, HBaseStorage, etc.
>
> -Ayon
> See My Photos on Flickr <http://www.flickr.com/photos/ayonsinha/>
> Also check out my Blog for answers to commonly asked questions.<http://dailyadvisor.blogspot.com>
>
>   ------------------------------
> *From:* Martinus Martinus <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]
> *Sent:* Tuesday, December 20, 2011 7:31 PM
> *Subject:* hadoop cluster for querying data on mongodb
>
> Hi,
>
> I have a Hadoop cluster running and have my data inside a MongoDB database.
> I have already written Java code to query data on MongoDB using the
> mongodb-java driver. Right now, I want to use the Hadoop cluster to run my
> Java code to get and put data from and to the Mongo database. Has anyone
> done this before? Can you explain to me how to do that?
>
> Thanks.
>
>
>
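The workflow Ayon describes in point 2 (read from Mongo as the first step, crunch the data, write back to Mongo as the last step) can be sketched in plain Java. This is only an illustration of the shape of the job, not the mongo-hadoop connector's API: the class `MongoRoundTripSketch`, the `Doc` type, and the methods `readInput`, `crunch`, and `writeOutput` are all made-up names, and in-memory lists stand in for the Mongo collections and for the Hadoop map/reduce phases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MongoRoundTripSketch {

    /** Hypothetical stand-in for one document in a Mongo collection. */
    public static final class Doc {
        public final String key;
        public final int value;

        public Doc(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /** First step: stands in for what an input format would read from Mongo. */
    public static List<Doc> readInput() {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("a", 1));
        docs.add(new Doc("b", 2));
        docs.add(new Doc("a", 3));
        return docs;
    }

    /** Middle steps: the "crunching", here just summing the values per key. */
    public static Map<String, Integer> crunch(List<Doc> docs) {
        // TreeMap keeps the keys sorted, so the output order is stable.
        Map<String, Integer> totals = new TreeMap<>();
        for (Doc d : docs) {
            totals.merge(d.key, d.value, Integer::sum);
        }
        return totals;
    }

    /** Last step: stands in for what an output format would write back to Mongo. */
    public static List<Doc> writeOutput(Map<String, Integer> totals) {
        List<Doc> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            out.add(new Doc(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Only the first and last calls would touch Mongo in a real job.
        for (Doc d : writeOutput(crunch(readInput()))) {
            System.out.println(d.key + "\t" + d.value);
        }
    }
}
```

In a real job, only `readInput` and `writeOutput` would be replaced by Mongo-backed input and output formats; everything in between would run on data already inside the cluster, which is the data-locality point from item 1.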