Pig, mail # user - Mixed input formats in LOAD path


Re: Mixed input formats in LOAD path
Ruslan Al-Fakikh 2012-06-15, 12:37
Hi Johannes,

I guess you'd have to write a custom Loader for such a situation, but
why do you need to load everything in one pass? You can load each
type of file separately (with multiple LOAD statements) and then
combine the resulting aliases with a JOIN or a UNION.
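
A rough sketch of the multiple-LOAD approach, in case it helps. The
paths, the glob patterns, and the field names (id, name) are made up for
illustration, and I am assuming the Avro records carry the same two
fields. It also assumes the piggybank AvroStorage loader and its Avro
dependencies are available on your installation; newer Pig releases
ship a built-in AvroStorage instead.

```pig
-- Hypothetical jar location; the Avro and JSON jars that piggybank's
-- AvroStorage depends on must be registered as well.
REGISTER piggybank.jar;

-- Load only the Avro files; the schema is taken from the Avro data itself.
avro_data = LOAD '/data/input/*.avro'
            USING org.apache.pig.piggybank.storage.avro.AvroStorage();

-- Load only the tab-separated files; the schema here is an assumption.
tsv_data = LOAD '/data/input/*.tsv'
           USING PigStorage('\t')
           AS (id:long, name:chararray);

-- Merge both relations into one alias, matching fields by name.
combined = UNION ONSCHEMA avro_data, tsv_data;
```

The globs let both formats stay in the same directory, as long as the
file names tell them apart. UNION ONSCHEMA matches fields by name, so it
needs both inputs to have a schema.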

Ruslan

On Fri, Jun 15, 2012 at 4:13 PM, Johannes Schwenk
<[EMAIL PROTECTED]> wrote:
> Hi all,
>
> is it possible to have an input path (as a parameter to a LOAD statement)
> that contains several files in *different formats*, say serialized Avro
> data and tab-separated values, and make Pig read the data into one alias?
> I guess I have to write a UDF for this? How should I start? Can you
> sketch out a rough plan for how to proceed?
>
>
> Greetings,
> Johannes Schwenk
>
> --
> Software Developer (Reporting)
> ________________________________________________________
>
> ADITION technologies AG
> Schwarzwaldstraße 78b
> 79117 Freiburg
>
> http://www.adition.com
>
> T +49 / (0)761 / 88147 - 30
> F +49 / (0)761 / 88147 - 77
> SUPPORT +49  / (0)1805 - ADITION
>
> (Landline rate 14 ct/min; mobile rates at most 42 ct/min)
>
> Registered at the Amtsgericht Düsseldorf (district court) under HRB 54076
> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
> Chairman of the Supervisory Board: Daniel Raimer, attorney at law
> VAT ID No.: DE 218 858 434
>

--
Best Regards,
Ruslan Al-Fakikh