HDFS, mail # user - hadoop cluster for querying data on mongodb


Re: hadoop cluster for querying data on mongodb
Martinus Martinus 2011-12-26, 02:31
Hi Joey,

Could you explain that in more detail? Do you mean I should create a new user
and a new group, or just use ssh?

Thanks.

On Mon, Dec 26, 2011 at 3:57 AM, Joey Echeverria <[EMAIL PROTECTED]> wrote:

> Don't start your daemons as root. They should be started as a system
> account, typically hdfs for the HDFS services and mapred for the
> MapReduce ones.
>
> -Joey
>
> On Fri, Dec 23, 2011 at 4:04 AM, Martinus Martinus
> <[EMAIL PROTECTED]> wrote:
> > Hi Ayon,
> >
> > I tried to set up the Hadoop cluster using hadoop-0.20.2 and it seems to
> > be OK, but when I tried another version of Hadoop, such as
> > hadoop-0.20.3, running start-all.sh gave me an error like this:
> >
> > uvm12dk: Unrecognized option: -jvm
> > uvm12dk: Could not create the Java virtual machine.
> >
> > Would you be so kind as to help me with this problem?
> >
> > Thanks.
> >
> > Martinus
> >
> >
> > On Wed, Dec 21, 2011 at 1:12 PM, Ayon Sinha <[EMAIL PROTECTED]> wrote:
> >>
> >> A couple of things:
> >> 1. Hadoop's strength is data locality, so you want most of your Hadoop
> >> heavy lifting to happen on the local filesystem (HDFS, where the Hadoop
> >> computation is shipped to the nodes with the data).
> >> 2. Assuming you are pulling data into Hadoop from Mongo to crunch, and
> >> putting the resulting data back into Mongo, as only the first and last
> >> steps in your entire workflow, you are basically looking for a
> >> MongoInputFormat and MongoOutputFormat (I made up the class names). You
> >> are probably looking for
> >> https://jira.mongodb.org/browse/HADOOP/component/10736
> >>
> >> Your other option, if you are using Pig or Hive, is to write loader
> >> UDFs, similar to PigStorage, HBaseStorage, etc.
> >>
> >> -Ayon
> >>
> >> ________________________________
> >> From: Martinus Martinus <[EMAIL PROTECTED]>
> >> To: [EMAIL PROTECTED]
> >> Sent: Tuesday, December 20, 2011 7:31 PM
> >> Subject: hadoop cluster for querying data on mongodb
> >>
> >> Hi,
> >>
> >> I have a Hadoop cluster running and have my data in a MongoDB database.
> >> I have already written Java code to query data in MongoDB using the
> >> mongodb-java driver. Now I want to use the Hadoop cluster to run my Java
> >> code to get data from, and put data into, the Mongo database. Has anyone
> >> done this before? Can you explain to me how to do that?
> >>
> >> Thanks.
> >>
> >>
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
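
Joey's advice in this thread (hdfs for the HDFS services, mapred for the
MapReduce ones) can be written down as a small daemon-to-account table. The
Java sketch below is only an illustration of that mapping: it prints the
commands one might run and does not execute anything. The class name
DaemonAccounts, the install path /opt/hadoop, and the use of sudo are
assumptions that do not come from the thread. The hdfs and mapred accounts are
assumed to exist already, so creating the users and groups is outside the
sketch, and the exact start script may differ between Hadoop versions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): maps each daemon to the system
// account it would run under, and builds the matching start command.
final class DaemonAccounts {
    // Daemon name -> system account, following "hdfs for the HDFS services
    // and mapred for the MapReduce ones".
    static final Map<String, String> ACCOUNTS = new LinkedHashMap<>();
    static {
        ACCOUNTS.put("namenode", "hdfs");
        ACCOUNTS.put("secondarynamenode", "hdfs");
        ACCOUNTS.put("datanode", "hdfs");
        ACCOUNTS.put("jobtracker", "mapred");
        ACCOUNTS.put("tasktracker", "mapred");
    }

    // Returns the account for a daemon, rejecting names not in the table.
    static String accountFor(String daemon) {
        String account = ACCOUNTS.get(daemon);
        if (account == null) {
            throw new IllegalArgumentException("Unknown daemon: " + daemon);
        }
        return account;
    }

    // Builds (but does not run) a command that starts one daemon as its
    // system account. Assumes sudo is available and that hadoopHome contains
    // bin/hadoop-daemon.sh.
    static List<String> startCommand(String hadoopHome, String daemon) {
        List<String> cmd = new ArrayList<>();
        cmd.add("sudo");
        cmd.add("-u");
        cmd.add(accountFor(daemon));
        cmd.add(hadoopHome + "/bin/hadoop-daemon.sh");
        cmd.add("start");
        cmd.add(daemon);
        return cmd;
    }

    // True when the given user name is root.
    static boolean isRoot(String userName) {
        return "root".equals(userName);
    }

    public static void main(String[] args) {
        // /opt/hadoop is a placeholder; pass the real install path as args[0].
        String hadoopHome = args.length > 0 ? args[0] : "/opt/hadoop";
        if (isRoot(System.getProperty("user.name"))) {
            System.err.println("Running as root: start the daemons through "
                    + "the accounts below instead of directly.");
        }
        for (String daemon : ACCOUNTS.keySet()) {
            System.out.println(String.join(" ", startCommand(hadoopHome, daemon)));
        }
    }
}
```

Run with no arguments, the first line printed is
"sudo -u hdfs /opt/hadoop/bin/hadoop-daemon.sh start namenode". Keeping the
mapping in one table makes it easy to check that no daemon ends up being
launched as root.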