Re: Hive + mongoDB
Sandeep,

Did you try using hive-mongo (https://github.com/yc-huang/Hive-mongo)?

It's pretty easy to use as well, if you want to start with analytics
directly.
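In case it helps, a hive-mongo backed table ends up looking roughly like the
sketch below. The storage handler class and the mongo.* property names are
written from memory of that project's README, so treat them as assumptions and
verify against the repo; the table, column, database and collection names are
just placeholders.

CREATE EXTERNAL TABLE mongo_users (id INT, name STRING, age INT)
STORED BY 'org.yong3.hive.mongo.MongoStorageHandler'
WITH SERDEPROPERTIES ('mongo.column.mapping' = '_id,name,age')
TBLPROPERTIES ('mongo.host' = 'localhost', 'mongo.port' = '27017',
               'mongo.db' = 'test', 'mongo.collection' = 'users');

Once the table exists you can query the collection with ordinary HiveQL,
e.g. SELECT name, age FROM mongo_users WHERE age > 30;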
On Thu, Sep 12, 2013 at 2:02 PM, Sandeep Nemuri <[EMAIL PROTECTED]> wrote:

> Thanks all,
> I am trying to import data with this program,
> but when I compiled this code I got errors.
>
> Here is the code
>
> import java.io.*;
> import org.apache.commons.logging.*;
> import org.apache.hadoop.conf.*;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.*;
> import org.apache.hadoop.mapreduce.lib.output.*;
> import org.apache.hadoop.mapreduce.*;
> import org.bson.*;
> import com.mongodb.hadoop.*;
> import com.mongodb.hadoop.util.*;
>
> public class ImportWeblogsFromMongo {
>
>     private static final Log log =
>             LogFactory.getLog(ImportWeblogsFromMongo.class);
>
>     // Map-only job: each MongoDB document arrives as a BSONObject; the
>     // mapper emits the md5 field as the key and the remaining weblog
>     // fields as a tab-separated value.
>     public static class ReadWeblogsFromMongo
>             extends Mapper<Object, BSONObject, Text, Text> {
>
>         @Override
>         public void map(Object key, BSONObject value, Context context)
>                 throws IOException, InterruptedException {
>
>             System.out.println("Key: " + key);
>             System.out.println("Value: " + value);
>
>             String md5  = value.get("md5").toString();
>             String url  = value.get("url").toString();
>             String date = value.get("date").toString();
>             String time = value.get("time").toString();
>             String ip   = value.get("ip").toString();
>
>             // TextOutputFormat already separates key and value with a tab,
>             // so the value does not need a leading "\t".
>             String output = url + "\t" + date + "\t" + time + "\t" + ip;
>             context.write(new Text(md5), new Text(output));
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>
>         final Configuration conf = new Configuration();
>
>         // Read the "example" collection of the "mongo_hadoop" database.
>         MongoConfigUtil.setInputURI(conf,
>                 "mongodb://localhost:27017/mongo_hadoop.example");
>         MongoConfigUtil.setCreateInputSplits(conf, false);
>         System.out.println("Configuration: " + conf);
>
>         final Job job = new Job(conf, "Mongo Import");
>         Path out = new Path("/user/mongo_data");
>         FileOutputFormat.setOutputPath(job, out);
>         job.setJarByClass(ImportWeblogsFromMongo.class);
>         job.setMapperClass(ReadWeblogsFromMongo.class);
>         job.setOutputKeyClass(Text.class);
>         job.setOutputValueClass(Text.class);
>         job.setInputFormatClass(MongoInputFormat.class);
>         job.setOutputFormatClass(TextOutputFormat.class);
>
>         // Map-only: no reducers needed.
>         job.setNumReduceTasks(0);
>         System.exit(job.waitForCompletion(true) ? 0 : 1);
>     }
> }
>
>
>
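A side note on the Hive end of this: once a job like the one above has written
its tab-separated output under /user/mongo_data, exposing it to Hive is just an
external table over delimited text. This is only a sketch; the table and column
names are made up and simply mirror the fields the mapper emits, in the order
it writes them (md5 key first, then url, date, time, ip).

-- External table over the MapReduce output directory; dropping the table
-- leaves the files in place.
CREATE EXTERNAL TABLE weblogs (
    md5      STRING,
    url      STRING,
    log_date STRING,
    log_time STRING,
    ip       STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/mongo_data';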
> On Wed, Sep 11, 2013 at 11:50 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>
>> The docs are at https://github.com/mongodb/mongo-hadoop/tree/master/hive
>>
>> You need to build mongo-hadoop, and then use the documented syntax to
>> create BSON tables in Hive.
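
For what it's worth, the BSON-backed table form in those docs boils down to a
SerDe plus BSON input/output formats. A minimal sketch, assuming mongo-hadoop
has already been built and its core and hive jars added to Hive; the table,
column names, and HDFS path are illustrative, and the exact class names and
properties should be checked against the docs linked above:

CREATE EXTERNAL TABLE weblogs_bson (id STRING, url STRING, ip STRING)
ROW FORMAT SERDE 'com.mongodb.hadoop.hive.BSONSerDe'
STORED AS INPUTFORMAT 'com.mongodb.hadoop.mapred.BSONFileInputFormat'
OUTPUTFORMAT 'com.mongodb.hadoop.hive.output.HiveBSONFileOutputFormat'
LOCATION '/user/bson_dumps/weblogs';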
>>
>>
>> On Wed, Sep 11, 2013 at 11:11 AM, Jitendra Yadav <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> 1. You may use the Hadoop-MongoDB connector and create a MapReduce
>>> program to move your data from MongoDB into Hive.
>>>
>>> https://github.com/mongodb/mongo-hadoop
>>>
>>>
>>> 2. As an alternative, you can also use the Pig-MongoDB combination: get
>>> the data from MongoDB through Pig, and then create a table in Hive
>>> that points to the Pig output file on HDFS.
>>>
>>> https://github.com/mongodb/mongo-hadoop/blob/master/pig/README.md
>>>
>>> Regards
>>> Jitendra
>>> On 9/11/13, Jérôme Verdier <[EMAIL PROTECTED]> wrote:
>>> > Hi,
>>> >
>>> > You can use Talend to import data from MongoDB into Hive.
>>> >
>>> > More information here: http://www.talend.com/products/big-data
>>> >
>>> >
>>> > 2013/9/11 Sandeep Nemuri <[EMAIL PROTECTED]>
>>> >
>>> >> Hi everyone,
>>> >> I am trying to import data from MongoDB to Hive. I got some jar files
>>> >> to connect Mongo and Hive. Now how do I import the data from MongoDB
>>> >> to Hive?
>>> >>
>>> >> Thanks in advance.
>>> >>
>>> >> --
>>> >> --Regards
>>> >>   Sandeep Nemuri
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > *Jérôme VERDIER*
>>> > 06.72.19.17.31
>>> > [EMAIL PROTECTED]
>>> >
>>>
>>
>>
>>
>> --
>> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>>

Nitin Pawar