Re: Mapping MySQL schema to Avro
On a quick pass, this looks sane, and it nests many-to-one sets of data
within the parent record.
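
Since the attached schema may not come through for everyone, here is a
hypothetical sketch of that shape (the Order/OrderLine names are invented,
not from Bart's schema): a many-to-one child table becomes an array of
nested records inside the parent:

{"type":"record",
 "name":"Order",
 "fields":[
  {"name":"id", "type":"long"},
  {"name":"lines", "type":{
   "type":"array",
   "items":{
    "type":"record",
    "name":"OrderLine",
    "fields":[
     {"name":"product_id", "type":"int"},
     {"name":"quantity", "type":"int"}
    ]}}}
 ]
}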

A few things to think about:
* A double in MySQL will be a double in Avro, not a long (see the
type-mapping sketch after the example below).
* Each field that can be null in the database should be a union of null
and the field type.  For example, if a schema was

-- example MySQL DDL
CREATE TABLE Foo (
  id INT NOT NULL PRIMARY KEY,
  product_id INT NOT NULL,
  message VARCHAR(100)
);

-----
then the record would need three fields -- the first two are integers that
are not nullable, and the last one is a string that may be null:

{"type":"record",
 "name":"Foo",
 "fields":[
  {"name":"id", "type":"int"},
  {"name":"product_id", "type":"int"},
  {"name":"message", "type":["null", "string"]}
 ]
}
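
On the type question in the mail below: a bigint does map to an Avro long
(both are 64-bit integers), while a double should stay a double. A
field-level sketch, with invented column names:

{"type":"record",
 "name":"TypeMappingExample",
 "fields":[
  {"name":"some_bigint_column", "type":"long"},
  {"name":"some_double_column", "type":"double"}
 ]
}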

-Scott

On 11/24/12 5:23 AM, "Bart Verwilst" <[EMAIL PROTECTED]> wrote:

>Hello!
>
>I'm currently writing an importer to import our MySQL data into Hadoop
>(as Avro files). Attached you can find the schema I'm converting to
>Avro, along with the corresponding Avro schema I would like to use for
>my imported data. I was wondering if you guys could go over the schema
>and determine if this is sane/optimal, and if not, how I should improve
>it.
>
>As a side note, I converted bigints to long, and had one occurrence of
>double, which I also converted to long in the Avro; I'm not sure if
>that's the correct type?
>
>Thanks in advance for your expert opinions! ;)
>
>Kind regards,
>
>Bart