Sqoop >> mail # user >> How does one preprocess the data so that they can be exported using sqoop


How does one preprocess the data so that they can be exported using sqoop
Hi

I would be grateful for any tips on how to "prepare" data so that it can
be exported to a PostgreSQL database using Sqoop.

As an example:

Given some files of events (user events, product events,
productActivity events):

[file0001]
event:user properties:{name:"john" ...}
event:product properties:{ref:123,color:"blue",...
event:productActivity properties:{user:"john", product:"ref", action:"buy"}
.....

How does one generate the primary keys and build the user_product join
table so that everything is ready to be exported?

In other words:

function(Input:eventfile) => output:[productFile, userFile,
user_productFile with auto generated primary keys ]

What goes into the function?
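
To make the question concrete, here is a minimal single-process sketch of the kind of function I have in mind. It is only an illustration under several assumptions: the events are already parsed into (event_type, properties) pairs, users are identified by "name" and products by "ref", and the names assign_id, split_events and to_csv are hypothetical helpers, not part of Sqoop. Surrogate keys are plain counters, which would not work as-is across parallel tasks.

```python
import csv
import io


def assign_id(key, table):
    """Return the surrogate id for key, allocating the next integer on first sight."""
    if key not in table:
        table[key] = len(table) + 1
    return table[key]


def split_events(events):
    """Split parsed events into user, product and user_product rows.

    Assumes each event is an (event_type, properties) pair, that users are
    identified by "name" and products by "ref".
    """
    user_ids = {}        # natural key (name) -> generated primary key
    product_ids = {}     # natural key (ref)  -> generated primary key
    product_color = {}   # ref -> color, filled in when the product event is seen
    user_products = []   # join rows: (id, user_id, product_id, action)

    for event_type, props in events:
        if event_type == "user":
            assign_id(props["name"], user_ids)
        elif event_type == "product":
            assign_id(props["ref"], product_ids)
            product_color[props["ref"]] = props.get("color", "")
        elif event_type == "productActivity":
            # Ids are allocated on first reference, so event order does not matter.
            uid = assign_id(props["user"], user_ids)
            pid = assign_id(props["product"], product_ids)
            user_products.append((len(user_products) + 1, uid, pid, props["action"]))

    users = [(uid, name) for name, uid in user_ids.items()]
    products = [(pid, ref, product_color.get(ref, "")) for ref, pid in product_ids.items()]
    return users, products, user_products


def to_csv(rows):
    """Render rows as comma-delimited text, one record per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample mirroring the events in [file0001].
events = [
    ("user", {"name": "john"}),
    ("product", {"ref": 123, "color": "blue"}),
    ("productActivity", {"user": "john", "product": 123, "action": "buy"}),
]
users, products, user_products = split_events(events)
```

Each of the three row lists would then be written to its own file with to_csv and exported into its own table, with the join rows carrying the generated user and product ids as foreign keys.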

I hope this makes sense!

Thank you in advance for any help

-matt