Re: hadoop cluster for querying data on mongodb
Hi Joey,

Can you explain that in more detail? Do you mean I should create a new user
and a new group, or just ssh?

Thanks.

On Mon, Dec 26, 2011 at 3:57 AM, Joey Echeverria <[EMAIL PROTECTED]> wrote:

> Don't start your daemons as root. They should be started under a system
> account, typically hdfs for the HDFS services and mapred for the
> MapReduce ones.
>
> -Joey
>
> On Fri, Dec 23, 2011 at 4:04 AM, Martinus Martinus
> <[EMAIL PROTECTED]> wrote:
> > Hi Ayon,
> >
> > I tried to set up the Hadoop cluster using hadoop-0.20.2 and it seems to be
> > OK, but when I tried another version of Hadoop, such as hadoop-0.20.3,
> > running start-all.sh gave me an error like this:
> >
> > uvm12dk: Unrecognized option: -jvm
> > uvm12dk: Could not create the Java virtual machine.
> >
> > Would you be so kind as to help me with this problem?
> >
> > Thanks.
> >
> > Martinus
> >
> >
> > On Wed, Dec 21, 2011 at 1:12 PM, Ayon Sinha <[EMAIL PROTECTED]> wrote:
> >>
> >> A couple of things:
> >> 1. Hadoop's strength is data locality, so you want most of your Hadoop
> >> heavy lifting to happen on the local filesystem (HDFS, where the Hadoop
> >> computation is shipped to the nodes that hold the data).
> >> 2. Assuming you are pulling data into Hadoop from Mongo to crunch, and
> >> putting the resulting data back into Mongo, only as the first and last
> >> steps of your entire workflow, you are basically looking for a
> >> MongoInputFormat and MongoOutputFormat (I made up the class names). You
> >> are probably looking for
> >> https://jira.mongodb.org/browse/HADOOP/component/10736
> >>
> >> Your other option, if you are using Pig or Hive, is to write loader UDFs
> >> similar to PigStorage, HBaseStorage, etc.
> >>
> >> -Ayon
> >>
> >> ________________________________
> >> From: Martinus Martinus <[EMAIL PROTECTED]>
> >> To: [EMAIL PROTECTED]
> >> Sent: Tuesday, December 20, 2011 7:31 PM
> >> Subject: hadoop cluster for querying data on mongodb
> >>
> >> Hi,
> >>
> >> I have a Hadoop cluster running and my data is in a MongoDB database. I
> >> have already written Java code to query the data in MongoDB using the
> >> mongodb-java driver. Now I want to use the Hadoop cluster to run my Java
> >> code to read data from, and write data to, the Mongo database. Has anyone
> >> done this before? Can you explain how to do it?
> >>
> >> Thanks.
> >>
> >>
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
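
For the "new user and new group, or just ssh?" question, here is a minimal
sketch of what Joey's advice could look like in practice. It only prints the
commands so they can be reviewed before anything is run. The hdfs and mapred
account names come from Joey's message; the "hadoop" group name, the
/usr/local/hadoop default, and the helper function names are assumptions made
for illustration. The data and log directories would also need to be owned by
these accounts, which is not shown here.

```shell
#!/bin/sh
# Sketch only. Account names follow the convention Joey mentions (hdfs, mapred).
# The "hadoop" group and the /usr/local/hadoop default are assumptions.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Map a daemon name to the account it would conventionally run as.
daemon_user() {
  case "$1" in
    namenode|secondarynamenode|datanode) echo hdfs ;;
    jobtracker|tasktracker)              echo mapred ;;
    *) echo "unknown daemon: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the command that starts one daemon as its service account.
start_cmd() {
  user=$(daemon_user "$1") || return 1
  echo "sudo -u $user $HADOOP_HOME/bin/hadoop-daemon.sh start $1"
}

# Print the one-time account setup, to be reviewed and run by hand as root.
setup_cmds() {
  echo "groupadd hadoop"
  echo "useradd -r -g hadoop hdfs"
  echo "useradd -r -g hadoop mapred"
}
```

Calling setup_cmds once per node, and then start_cmd for each daemon that node
should run, gives a list of commands to check against your own layout. Whether
this clears the "-jvm" error depends on the start scripts in the release you
are using, so treat it as something to try rather than a guaranteed fix.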