HDFS, mail # user - hadoop cluster for querying data on mongodb


Martinus Martinus 2011-12-21, 03:31
Ayon Sinha 2011-12-21, 05:12
Re: hadoop cluster for querying data on mongodb
Martinus Martinus 2011-12-23, 09:04
Hi Ayon,

I tried to set up the Hadoop cluster using hadoop-0.20.2 and it seems to be
OK, but when I tried another version of Hadoop, such as hadoop-0.20.3,
running start-all.sh gave me an error like this:

uvm12dk: Unrecognized option: -jvm
uvm12dk: Could not create the Java virtual machine.

Would you be so kind as to help me with this problem?

Thanks.

Martinus

On Wed, Dec 21, 2011 at 1:12 PM, Ayon Sinha <[EMAIL PROTECTED]> wrote:

> A couple of things:
> 1. Hadoop's strength is in data locality, so you want most of your Hadoop
> heavy lifting to happen on the local filesystem (HDFS, where Hadoop
> computation is shipped to the nodes with the data).
> 2. Assuming you are pulling data into Hadoop from Mongo to crunch and
> putting the resulting data back into Mongo as only the first and last steps
> in your entire workflow, you are basically looking for a MongoInputFormat
> and MongoOutputFormat (I made up the class names). You are probably looking
> for
> https://jira.mongodb.org/browse/HADOOP/component/10736
>
> Your other option, if using Pig or Hive, is to write loader UDFs, similar
> to PigStorage, HBaseStorage, etc.
>
> -Ayon
> See My Photos on Flickr <http://www.flickr.com/photos/ayonsinha/>
> Also check out my Blog for answers to commonly asked questions.<http://dailyadvisor.blogspot.com>
>
>   ------------------------------
> *From:* Martinus Martinus <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]
> *Sent:* Tuesday, December 20, 2011 7:31 PM
> *Subject:* hadoop cluster for querying data on mongodb
>
> Hi,
>
> I have a Hadoop cluster running and have my data inside a MongoDB database.
> I have already written Java code to query data on MongoDB using the
> mongodb-java driver. Right now, I want to use the Hadoop cluster to run my
> Java code to get and put data from and to the Mongo database. Has anyone
> done this before? Can you explain to me how to do that?
>
> Thanks.
>
>
>
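
For anyone following the thread, the workflow Ayon outlines above (pull data
from Mongo as the first step, crunch it in Hadoop, push the results back to
Mongo as the last step) can be pictured roughly as below. This is only a
plain-Java sketch under loud assumptions: in-memory lists stand in for the
MongoDB collections, a simple group-and-sum stands in for the MapReduce job,
and every class and method name is hypothetical. None of it is taken from
Hadoop, the mongodb-java driver, or any Mongo connector.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: no Hadoop or MongoDB classes are used here.
// All names below are hypothetical stand-ins for the three workflow steps.
final class MongoRoundTripSketch {

    // Step 1 (stand-in for an input format): read "documents" from the source.
    // Each document is reduced to a (key, amount) pair; a real job would read
    // these from a MongoDB collection instead of a hard-coded list.
    static List<Map.Entry<String, Long>> readSource() {
        List<Map.Entry<String, Long>> docs = new ArrayList<>();
        docs.add(new SimpleEntry<>("books", 3L));
        docs.add(new SimpleEntry<>("music", 5L));
        docs.add(new SimpleEntry<>("books", 4L));
        return docs;
    }

    // Step 2 (the crunching): group by key and sum the amounts. This mirrors a
    // map phase that emits (key, amount) followed by a summing reduce phase.
    static Map<String, Long> crunch(List<Map.Entry<String, Long>> docs) {
        Map<String, Long> totals = new TreeMap<>(); // sorted keys, stable output
        for (Map.Entry<String, Long> doc : docs) {
            totals.merge(doc.getKey(), doc.getValue(), Long::sum);
        }
        return totals;
    }

    // Step 3 (stand-in for an output format): write the results to the sink.
    // A real job would insert these into a MongoDB collection.
    static List<String> writeSink(Map<String, Long> totals) {
        List<String> sink = new ArrayList<>();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            sink.add(e.getKey() + "=" + e.getValue());
        }
        return sink;
    }

    public static void main(String[] args) {
        // Prints [books=7, music=5]
        System.out.println(writeSink(crunch(readSource())));
    }
}
```

The point of the shape is that only steps 1 and 3 touch Mongo. Everything in
between works on data Hadoop already holds, which is where the data-locality
advantage mentioned above would come from.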
Joey Echeverria 2011-12-25, 19:57
Martinus Martinus 2011-12-26, 02:31
Martinus Martinus 2011-12-26, 04:14