Hive >> mail # user >> Hive + mongoDB


Re: Hive + mongoDB
Can you just try setting ("mongo.column.mapping" = "DocId, User, Username,
Name, ShippingAddress, Orders, OrderDate")?

I have not yet tried nested documents, but nested structures can change
within a single collection in Mongo, so just the basic top-level mapping should do.

Once I get some free time, I will try to put together an example.
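
In the meantime, here is a rough, untested sketch of what a top-level-only mapping might look like for the sample document quoted below. It assumes the hive-mongo handler pairs the entries in "mongo.column.mapping" with the Hive columns by position, as in the working mongo_users2 table, and that the nested "User" sub-document can be surfaced as a plain string. The second assumption is a guess and has not been verified against hive-mongo 0.0.3. The table name complex_top_level and the column name user_doc are made up for illustration.

```sql
-- Sketch only: map just the two top-level fields of the sample document.
-- Assumption: mapping entries line up with Hive columns by position.
-- Assumption (unverified): the nested "User" sub-document is readable as a string.
add jar /usr/lib/hive/lib/mongo-2.7.3.jar;
add jar /usr/lib/hive/lib/hive-mongo-0.0.3.jar;

-- complex_top_level and user_doc are hypothetical names.
create external table complex_top_level (docid string, user_doc string)
stored by "org.yong3.hive.mongo.MongoStorageHandler"
with serdeproperties ( "mongo.column.mapping" = "DocId,User" )
tblproperties ( "mongo.host" = "192.168.0.199", "mongo.port" = "27017",
"mongo.db" = "mongo_hadoop", "mongo.user" = "mongo_hadoop", "mongo.passwd" = "password",
"mongo.collection" = "complex" );

-- The storage details live in the DDL only; the query itself stays plain.
select * from complex_top_level limit 10;
```

Note that only DocId and User are actually top-level keys in that document; Username, Name, ShippingAddress and Orders sit inside User, so a flat mapping may not reach them.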
On Thu, Sep 19, 2013 at 2:39 PM, Sandeep Nemuri <[EMAIL PROTECTED]> wrote:

> Hi Nithin ,
>                I have used
>
> add jar /usr/lib/hive/lib/mongo-2.7.3.jar;
> add jar /usr/lib/hive/lib/hive-mongo-0.0.3.jar;
>
> create external table mongo_users2 (id int ,name string ,age int)
> stored by "org.yong3.hive.mongo.MongoStorageHandler"
> with serdeproperties ( "mongo.column.mapping" = "_id,name,age" )
> tblproperties ( "mongo.host" = "192.168.0.199", "mongo.port"="27017",
> "mongo.db" ="test", "mongo.user" = "test", "mongo.passwd" = "password",
> "mongo.collection" = "users" );
>
>
> It worked for me; now I am able to extract data from MongoDB.
>
> I have a nested data like
>
> {
>    "DocId": "ABC",
>    "User": {
>      "Id": 1234,
>      "Username": "sam1234",
>      "Name": "Sam",
>      "ShippingAddress": {
>        "Address1": "123 Main St.",
>        "Address2": null,
>        "City": "Durham",
>        "State": "NC"
>      },
>      "Orders": [
>        {
>          "ItemId": 6789,
>          "OrderDate": "11/11/2012"
>        },
>        {
>          "ItemId": 4352,
>          "OrderDate": "12/12/2012"
>        }
>      ]
>    }
>  }
>
>     To extract this collection
> i have used
>
> CREATE EXTERNAL TABLE complex_json3 (
> DocId string,
> User struct<Id:int,
> Username:string,
> Name: string,
> ShippingAddress:struct<Address1:string,
>                                      Address2:string,
>                                      City:string,
>                                      State:string>,
> Orders:array<struct<ItemId:int,
> OrderDate:string>>>
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> stored by "org.yong3.hive.mongo.MongoStorageHandler"
> with serdeproperties ( "mongo.column.mapping" =
> "DocId,User.Id,User.Username,User.Name,User.ShippingAddress.Address1,User.ShippingAddress.Address2,User.ShippingAddress.City,User.ShippingAddress.State,User.Orders.ItemId,User.Orders.OrderDate"
> )
> tblproperties ( "mongo.host" = "192.168.0.199", "mongo.port"="27017",
> "mongo.db" ="mongo_hadoop", "mongo.user" = "mongo_hadoop", "mongo.passwd" =
> "password", "mongo.collection" = "complex" );
>
> I am not sure whether the mongo.column.mapping syntax is correct,
> but I am not able to make it work because the data is nested.
>
>
>
>
> On Fri, Sep 13, 2013 at 9:34 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
>> Can you share your create table ddl for table name docs?
>>
>> Select statement does not need all those details. Those are part of
>> create table DDL only.
>>
>>
>> On Fri, Sep 13, 2013 at 4:24 PM, Sandeep Nemuri <[EMAIL PROTECTED]> wrote:
>>
>>> Hi nithin
>>>
>>> Thanks for your help
>>> I have used this query in hive to retrieve the data from mongodb
>>>
>>> add jar /usr/lib/hadoop/lib/mongo-2.8.0.jar;
>>> add jar /usr/lib/hive/lib/hive-mongo-0.0.3-jar-with-dependencies.jar;
>>>
>>> select * from docs
>>> input format "org.yong3.hive.mongo.MongoStorageHandler"
>>> with serdeproperties ( "mongo.column.mapping" =
>>> "_id,dayOfWeek,bc3Year,bc5Year,bc10Year,bc20Year,bc1Month,bc2Year,bc3Year,bc30Year,bc1Year,bc7Year,bc6Year"
>>> )
>>> tblproperties ( "mongo.host" = "127.0.0.1", "mongo.port" = "27017",
>>> "mongo.db" = "sample", "mongo.user" = "sample", "mongo.passwd" =
>>> "password", "mongo.collection" = "docs" );
>>>
>>>
>>> I got an Error
>>>
>>> FAILED: Parse Error: line 2:6 mismatched input 'format' expecting EOF
>>> near 'input'
>>>
>>>
>>>
>>> On Thu, Sep 12, 2013 at 6:23 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>>>
>>>> Try creating a table with your existing Mongo DB and collection, and see
>>>> whether the data can be read by the user or not.
>>>> What you need to do is mongo collection column mapping exactly with

Nitin Pawar