ranjith raghunath 2012-11-17, 13:41
Miki Tebeka 2012-11-17, 15:36
Thanks for the response. When you say avro tools, do you mean avro-tools-.....jar?
Let me also run the flow by all of you:
1. Use Sqoop to import data from an RDBMS into Avro format.
2. Use avro tools to extract the schema file.
3. Use the Avro SerDe to generate/update the Hive table.
Because the table definition would come from the extracted schema, this
would eliminate the need to statically map the fields in Hive.
Does this flow make sense?
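To make steps 2 and 3 concrete, here is a minimal sketch in plain Python. It relies only on what the Avro specification says about object container files: the header starts with the magic bytes Obj\x01, followed by a metadata map whose avro.schema entry holds the writer schema as JSON. The function names (read_schema, hive_ddl) are hypothetical, as are any table name, location, and schema URL passed to them. The generated DDL assumes a Hive build that ships the AvroSerDe and is untested against a real cluster, so treat it as a starting point rather than a finished tool.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length)."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of data while reading a long")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear: last byte of this long
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Decode one Avro bytes/string value: a long length, then the payload."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data while reading bytes")
    return data


def read_schema(stream):
    """Return the writer schema (parsed JSON) from a container file header."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"].decode("utf-8"))


def hive_ddl(table, location, schema_url):
    """Build a CREATE TABLE statement that points Hive at an .avsc file.

    Assumption: the AvroSerDe and Avro container formats are available
    in the Hive installation; all three arguments are caller-supplied.
    """
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'\n"
        "STORED AS\n"
        "  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'\n"
        "  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'\n"
        f"LOCATION '{location}'\n"
        f"TBLPROPERTIES ('avro.schema.url'='{schema_url}');"
    )
```

Typical use would be to open the file in binary mode, call read_schema on it, dump the result with json.dump to an .avsc file, copy that file somewhere Hive can read, and then run the statement returned by hive_ddl. Since the column list is never spelled out in the DDL, re-extracting the schema after a new import is enough to pick up changed fields.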
On Nov 17, 2012 9:36 AM, "Miki Tebeka" <[EMAIL PROTECTED]> wrote:
> You can use the "avro" utility that comes when you install the Python
> package (or fastavro if you need 3.X support). Then run "avro cat
> --print-schema /path/to/avro/file".
> On Sat, Nov 17, 2012 at 5:41 AM, ranjith raghunath <
> [EMAIL PROTECTED]> wrote:
>> I could really use some advice on this topic.
>> I am pulling files in avro format from an external source (outside of the
>> cluster). How can I generate the avro schema file? The end goal is to have
>> it exposed in Hive.
Miki Tebeka 2012-11-17, 19:40
ranjith raghunath 2012-11-17, 19:48