I should say up front that I'm only just getting started, and this is on CDH3u3.
I'm struggling with the Sqoop options at the moment, trying to import MySQL
tables into Hive whose data contains tabs, newlines, and all sorts of other
special characters.
I cannot figure out a combination of options on the sqoop command line that
turns these into either a literal '\t' (in plain text, as mysql on the
command line does) or something else usable. Nor can I get
--mysql-delimiters to produce something usable (the quotes are left in the
final file in Hive). I have tried many things and tried to understand the
docs, and I am failing miserably to achieve anything usable.
The best I can do for now is pipe the output of mysql -B (which does all
the escaping I need) into a FIFO, have hadoop read from that FIFO with
-put, and then use the resulting file in Hive.
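
To make the workaround concrete, here is a rough sketch of what I mean. It is not my exact script: the function name export_table, the database and table names, and the HDFS path are all placeholders, and feeding the FIFO to "hadoop fs -put -" on stdin is just one way to wire it up.

```shell
#!/bin/sh
# Sketch of the FIFO workaround. Every name here (export_table, the
# database, the table, the HDFS path) is a placeholder for illustration.

# export_table DB TABLE HDFS_DEST
# Streams "SELECT * FROM TABLE" through a named pipe into HDFS.
export_table() {
    db=$1
    table=$2
    dest=$3

    tmpdir=$(mktemp -d) || return 1
    fifo="$tmpdir/$table.fifo"
    mkfifo "$fifo" || { rm -rf "$tmpdir"; return 1; }

    # -B (batch mode) prints tab-separated rows and escapes embedded
    # tabs and newlines as \t and \n; -N omits the column-name header.
    mysql -B -N -e "SELECT * FROM $table" "$db" > "$fifo" &
    writer=$!

    # "hadoop fs -put -" reads from stdin, so hand it the FIFO.
    hadoop fs -put - "$dest" < "$fifo"
    rc=$?

    # Report a failure from either side of the pipe.
    wait "$writer" || rc=$?
    rm -rf "$tmpdir"
    return "$rc"
}

# Example call (placeholder names):
# export_table mydb mytable /user/hive/warehouse/mytable/part-00000
```

This keeps the dump off the local disk, but it is a single stream, so it gets none of the parallelism a Sqoop import would give.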
Could anyone either:
- provide a sqoop command line that actually works in
production (preferably with --direct, since the files I have to load
are quite big), or
- provide alternative solutions, since maybe I'm going a completely
Thanks a million!
Arvind Prabhakar 2012-05-12, 21:33