Harish Mandala 2013-01-23, 23:21
-RE: Exporting some fairly complex json to sql server
Venkatesan Ranganathan 2013-01-24, 00:40
May be you can do it in 2 steps, an MR job to split your files into 2 separate files and then load them with SQOOP? I don't think SQOOP today supports it. You can also load to a one table only at a time.
> Date: Wed, 23 Jan 2013 18:21:24 -0500
> Subject: Exporting some fairly complex json to sql server
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> I've been playing with Sqoop, and it seems to fit my use case (to export
> some log data from HDFS to Microsoft SQL Server).
> A look at the documentation shows that sqoop will export/import between
> tables of similar schema. However, my data export is more complicated.
> Allow me to describe it. I have JSON strings stored in Hadoop Sequence
> files, with each string indexed by timestamp. Each JSON string is similar
> to the following:
> "Unique_Key": "123",
> "Timestamp": "8948024",
> "Inner_Array": [
> "Name": "XYZ1",
> "Value": "abc1"
> "Name": "XYZ2",
> "Value": "abc2"
> "Unique_Key": "456",
> "Timestamp": "89489802",
> "Inner_Array": [
> "Name": "JDJ1",
> "Value": "sfj"
> Each string represents an array of objects, with the "Unique_Key" and
> "Timestamp" of each of these objects corresponding to a row in one SQL
> table (Let's call it Table A). Each object has inside it another
> "Inner_Array" - each element of this Inner_Array needs to go into another
> SQL table (Table B), and will be associated with the previous table using
> the Unique_Key as a foreign key.
> So, the schema of the two SQL tables will be:
> Table A:
> Unique_Key (Primary Key) | TimeStamp
> Table B:
> Unique_Key (Foreign Key) | Name | Value
> If I wanted to implement this functionality in Sqoop (placing nested JSON
> in multiple tables), it seems I would need to firstly implement a "JSON
> parser" in lib and add schema mapping specifications to the configuration.
> We would also need to provide an option for parser selection. Is there
> anything I am missing? Any comments? Is this functionality already being
> implemented by someone?
> Thanks for your patient reading,