Hive, mail # user - Hive + mongoDB


Re: Hive + mongoDB
Nitin Pawar 2013-09-19, 09:20
Can you just try setting ("mongo.column.mapping" = "DocId, User, Username,
Name, ShippingAddress, Orders, OrderDate")?

I have not yet tried nested documents, but nested structures can vary within
a single collection in Mongo, so mapping just the basic top-level fields
should do.

Once I get some free time, I will try to put together an example.
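
A minimal, untested sketch of what such a top-level-only mapping might look like, reusing the jars and connection properties from the message quoted below. The table name complex_top and the column names doc_id and user_doc are made up for illustration. It assumes that the mapping is positional (as in the working mongo_users2 table, where id maps to _id) and, more speculatively, that the storage handler hands back the nested User subdocument as a JSON string; if it does not, only the doc_id column would be usable.

```sql
-- Jars as used in the quoted message.
add jar /usr/lib/hive/lib/mongo-2.7.3.jar;
add jar /usr/lib/hive/lib/hive-mongo-0.0.3.jar;

-- Hypothetical table: map only the two top-level fields of the document.
-- Assumption: the nested "User" subdocument arrives as a JSON string.
CREATE EXTERNAL TABLE complex_top (doc_id string, user_doc string)
STORED BY "org.yong3.hive.mongo.MongoStorageHandler"
WITH SERDEPROPERTIES ( "mongo.column.mapping" = "DocId,User" )
TBLPROPERTIES ( "mongo.host" = "192.168.0.199", "mongo.port" = "27017",
"mongo.db" = "mongo_hadoop", "mongo.user" = "mongo_hadoop",
"mongo.passwd" = "password", "mongo.collection" = "complex" );

-- The storage clauses belong to the DDL only; the query itself stays plain.
-- Nested values are pulled out of the JSON string at query time.
SELECT doc_id,
       get_json_object(user_doc, '$.Username') AS username,
       get_json_object(user_doc, '$.ShippingAddress.City') AS city
FROM complex_top;
```

get_json_object is a built-in Hive function; whether it is useful here depends on the assumption above about how user_doc is returned.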

On Thu, Sep 19, 2013 at 2:39 PM, Sandeep Nemuri <[EMAIL PROTECTED]> wrote:

> Hi Nitin,
> I have used
>
> add jar /usr/lib/hive/lib/mongo-2.7.3.jar;
> add jar /usr/lib/hive/lib/hive-mongo-0.0.3.jar;
>
> create external table mongo_users2 (id int ,name string ,age int)
> stored by "org.yong3.hive.mongo.MongoStorageHandler"
> with serdeproperties ( "mongo.column.mapping" = "_id,name,age" )
> tblproperties ( "mongo.host" = "192.168.0.199", "mongo.port"="27017",
> "mongo.db" ="test", "mongo.user" = "test", "mongo.passwd" = "password",
> "mongo.collection" = "users" );
>
>
> It worked for me; now I am able to extract data from MongoDB.
>
> I have nested data like
>
> {
>    "DocId": "ABC",
>    "User": {
>      "Id": 1234,
>      "Username": "sam1234",
>      "Name": "Sam",
>      "ShippingAddress": {
>        "Address1": "123 Main St.",
>        "Address2": null,
>        "City": "Durham",
>        "State": "NC"
>      },
>      "Orders": [
>        {
>          "ItemId": 6789,
>          "OrderDate": "11/11/2012"
>        },
>        {
>          "ItemId": 4352,
>          "OrderDate": "12/12/2012"
>        }
>      ]
>    }
>  }
>
> To extract this collection I have used
>
> CREATE EXTERNAL TABLE complex_json3 (
> DocId string,
> User struct<Id:int,
> Username:string,
> Name: string,
> ShippingAddress:struct<Address1:string,
>                                      Address2:string,
>                                      City:string,
>                                      State:string>,
> Orders:array<struct<ItemId:int,
> OrderDate:string>>>
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> stored by "org.yong3.hive.mongo.MongoStorageHandler"
> with serdeproperties ( "mongo.column.mapping" =
> "DocId,User.Id,User.Username,User.Name,User.ShippingAddress.Address1,User.ShippingAddress.Address2,User.ShippingAddress.City,User.ShippingAddress.State,User.Orders.ItemId,User.Orders.OrderDate"
> )
> tblproperties ( "mongo.host" = "192.168.0.199", "mongo.port"="27017",
> "mongo.db" ="mongo_hadoop", "mongo.user" = "mongo_hadoop", "mongo.passwd" =
> "password", "mongo.collection" = "complex" );
>
> I am not sure whether the mongo.column.mapping syntax is correct or not,
> but I am not able to make it work, as the data is nested.
>
>
>
>
> On Fri, Sep 13, 2013 at 9:34 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
>> Can you share your create table ddl for table name docs?
>>
>> Select statement does not need all those details. Those are part of
>> create table DDL only.
>>
>>
>> On Fri, Sep 13, 2013 at 4:24 PM, Sandeep Nemuri <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Nitin,
>>>
>>> Thanks for your help.
>>> I have used this query in Hive to retrieve the data from MongoDB:
>>>
>>> add jar /usr/lib/hadoop/lib/mongo-2.8.0.jar;
>>> add jar /usr/lib/hive/lib/hive-mongo-0.0.3-jar-with-dependencies.jar;
>>>
>>> select * from docs
>>> input format "org.yong3.hive.mongo.MongoStorageHandler"
>>> with serdeproperties ( "mongo.column.mapping" =
>>> "_id,dayOfWeek,bc3Year,bc5Year,bc10Year,bc20Year,bc1Month,bc2Year,bc3Year,bc30Year,bc1Year,bc7Year,bc6Year"
>>> )
>>> tblproperties ( "mongo.host" = "127.0.0.1", "mongo.port" = "27017",
>>> "mongo.db" = "sample", "mongo.user" = "sample", "mongo.passwd" =
>>> "password", "mongo.collection" = "docs" );
>>>
>>>
>>> I got an error:
>>>
>>> FAILED: Parse Error: line 2:6 mismatched input 'format' expecting EOF
>>> near 'input'
>>>
>>>
>>>
>>> On Thu, Sep 12, 2013 at 6:23 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>>>
>>>> Try creating a table with your existing Mongo db and collection and see
>>>> whether the data can be read by the user or not.
>>>> What you need to do is mongo collection column mapping exactly with

Nitin Pawar