Thanks Bill. Any ideas on how to hide the location of HDFS files from the user?
On Tue, Dec 11, 2012 at 9:42 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> I think the latter would be better. Since the LoadFunc would be decoupled
> from the data exporter, you could schedule the exporting independently of
> the loading. We do something similar, without the $query part.
> On Tue, Dec 11, 2012 at 1:10 AM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> > I was working on a LoadFunc and needed some ideas/second opinion on the
> > best way to do this:
> > 1. We use an API to download data from a database as flat files.
> >    - A query is given with the table name and the fields required to
> >      extract data.
> > 2. Once step 1 is done, upload the data to HDFS.
> > 3. Upload the schema file to HDFS.
> > 4. The LoadFunc reads the schema file and parses the data.
> > A strict requirement is to hide the details of the location of these HDFS
> > files from the user issuing the pig query. For a user it could look as
> > simple as:
> > A = load 'scheme://SampleTable' using CustomLoader('$query');
> > The user here only issues the load statement on a table with a query, and
> > the API calls for importing from the database could happen in the background.
> > What would be the best way to do this? Is it better to do the above as part
> > of the LoadFunc, or would it be more beneficial to do it separately and
> > communicate the location from the API import to the LoadFunc?
> > Thanks,
> > Prashant
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [EMAIL PROTECTED] going forward.*
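
For reference, here is a minimal sketch of the decoupled approach discussed above: the exporter records where it wrote each table, and the loader translates the logical name from the load statement into that location, so the Pig script never mentions an HDFS path. Everything in it is hypothetical and not from the thread: the class name TableLocationResolver, the register/resolve methods, and the in-memory map (a real setup would need a store shared between the exporter and the Pig job, such as a properties file or a database). In an actual LoadFunc, the resolve call would plausibly sit in relativeToAbsolutePath or setLocation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: maps logical table names to HDFS paths users never see. */
final class TableLocationResolver {
    private final String prefix; // e.g. "scheme://"
    private final Map<String, String> locations = new ConcurrentHashMap<>();

    TableLocationResolver(String scheme) {
        this.prefix = scheme + "://";
    }

    /** Called by the exporter once the flat files and schema file are on HDFS. */
    void register(String table, String hdfsDir) {
        locations.put(table, hdfsDir);
    }

    /** Called by the loader: turns "scheme://SampleTable" into the registered directory. */
    String resolve(String location) {
        if (location == null || !location.startsWith(prefix)) {
            throw new IllegalArgumentException(
                "Expected " + prefix + "<table>, got: " + location);
        }
        String table = location.substring(prefix.length());
        String hdfsDir = locations.get(table);
        if (hdfsDir == null) {
            throw new IllegalStateException("No export registered for table: " + table);
        }
        return hdfsDir;
    }
}
```

With this shape, the export job would call register("SampleTable", <directory it wrote to>) when it finishes, and the loader would call resolve("scheme://SampleTable") before reading. Failing loudly on an unregistered table keeps a query from silently reading stale or missing data.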