RE: Read/Write from RDBMS using PIG/Hadoop
Olga Natkovich 2009-05-04, 22:12
That's definitely an interesting insight. Thanks for sharing!
> -----Original Message-----
> From: Ted Dunning [mailto:[EMAIL PROTECTED]]
> Sent: Monday, May 04, 2009 11:27 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Read/Write from RDBMS using PIG/Hadoop
> I have done this with other map-reduce programs with some
> interesting results that were predictable in hindsight:
> a) having mappers open database connections is a great way to
> take down your database. Databases are not usually ready to
> handle the data volumes that map-reduce programs would like
> to take from them.
> b) having an output format that puts output from a map-reduce
> program directly into a database is not usually faster than
> producing output in flat files and using special data load
> commands to re-import into the database.
> The upshot is that exporting from the database in flat file
> format, processing using map-reduce and then re-importing
> flat files isn't all that bad an alternative. I was hoping
> for a sexier solution, but the boring answer worked pretty well.
> On Mon, May 4, 2009 at 3:13 AM, Nellai
> <[EMAIL PROTECTED]> wrote:
> > Is there a way we can use PIG to interact with an RDBMS? Do we have any
> > API to handle such a scenario? Is there a way we can use Hadoop's API
> > (Hadoop 0.19 DBInputFormat/DBOutputFormat) to interact with an RDBMS
> > using PIG? Please let me know if someone has tried this.
> > --
> Ted Dunning, CTO
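
For reference, here is a minimal sketch of the flat-file round trip Ted describes (export, process, bulk re-import), in Python using only the standard library. It is an illustration under assumptions, not anyone's actual setup: SQLite stands in for the RDBMS, the table names `orders` and `order_totals`, the tab-separated layout, and all function names are hypothetical, and the in-process aggregation is only a placeholder for the real map-reduce or Pig job.

```python
import csv
import os
import sqlite3
import tempfile
from collections import defaultdict


def export_to_flat_file(conn, path):
    """Dump the (hypothetical) orders table to a tab-separated file in one scan."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for row in conn.execute("SELECT user_id, amount FROM orders"):
            writer.writerow(row)


def process_flat_file(in_path, out_path):
    """Placeholder for the map-reduce job: sum amounts per user, file in, file out."""
    totals = defaultdict(int)
    with open(in_path, newline="") as f:
        for user_id, amount in csv.reader(f, delimiter="\t"):
            totals[user_id] += int(amount)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for user_id in sorted(totals):
            writer.writerow([user_id, totals[user_id]])


def bulk_load(conn, path):
    """Re-import the results in a single transaction.

    Assumption: a production database would use its own bulk-load command
    here instead of executemany; the point is one load, not many connections.
    """
    with open(path, newline="") as f:
        rows = [(uid, int(total)) for uid, total in csv.reader(f, delimiter="\t")]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO order_totals (user_id, total) VALUES (?, ?)", rows
        )


def run_demo(workdir):
    """Run export -> process -> bulk load against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user_id TEXT, amount INTEGER)")
    conn.execute("CREATE TABLE order_totals (user_id TEXT, total INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("alice", 10), ("bob", 5), ("alice", 7)],
    )
    exported = os.path.join(workdir, "orders.tsv")
    processed = os.path.join(workdir, "totals.tsv")
    export_to_flat_file(conn, exported)
    process_flat_file(exported, processed)
    bulk_load(conn, processed)
    return conn.execute(
        "SELECT user_id, total FROM order_totals ORDER BY user_id"
    ).fetchall()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(run_demo(workdir))
```

The database is touched only twice, once for the export scan and once for the load, so the number of connections stays constant no matter how many tasks the processing step fans out to. That is the property point (a) above is after.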