Re: DataCreator
Sounds like Pig.  Or Cascading.  Or Hive.

Seriously, isn't this already available?

On Wed, Feb 16, 2011 at 7:06 AM, Guy Doulberg <[EMAIL PROTECTED]> wrote:

> Hey all,
> I want to consult with you Hadoopers about a Map/Reduce application I want
> to build.
> I want to build a map/reduce job that reads files from HDFS, performs some
> sort of transformation on the file lines, and stores them in several
> partitions depending on the source of the file or its data.
> I want this application to be as configurable as possible, so I designed
> interfaces to Parse, Decorate and Partition (on HDFS) the data.
> I want to be able to configure different data flows, with different
> parsers, decorators and partitioners, using a config file.
> Do you think you would use such an application? Does it fit an open-source
> project?
> Now, I have some technical questions:
> I was thinking of using reflection to load all the classes I would need,
> according to the configuration, during the setup process of the Mapper.
> Do you think it is a good idea?
> Is there a way to send the Mapper objects or interfaces from the Job
> declaration?
> Thanks,
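
On the two technical questions, here is a rough sketch of the reflection idea in plain Java, with no Hadoop dependency so it can be run on its own. Everything in it is an assumption made for illustration: the Parser, Decorator and Partitioner interfaces, the three sample implementations, and the flow.* config keys are hypothetical names, not the poster's actual design. A java.util.Properties object stands in for the job configuration. The point it tries to show is that the job declaration passes class names as strings rather than live objects, and the task instantiates them once during setup. Hadoop's own Configuration.getClass and ReflectionUtils.newInstance follow the same pattern.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical plug-in roles, named after the ones in the message above.
interface Parser { String[] parse(String line); }
interface Decorator { String[] decorate(String[] fields); }
interface Partitioner { String partitionFor(String[] fields); }

// Sample implementations, for illustration only.
class TabParser implements Parser {
    public String[] parse(String line) { return line.split("\t"); }
}

class UpperCaseDecorator implements Decorator {
    public String[] decorate(String[] fields) {
        String[] out = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            out[i] = fields[i].toUpperCase(Locale.ROOT);
        }
        return out;
    }
}

class FirstFieldPartitioner implements Partitioner {
    public String partitionFor(String[] fields) { return "source=" + fields[0]; }
}

// Stands in for a Mapper: the constructor plays the part of setup(),
// and map() is called once per input line.
class ConfiguredFlow {
    private final Parser parser;
    private final Decorator decorator;
    private final Partitioner partitioner;

    ConfiguredFlow(Properties conf) {
        // Assumed config keys; each one holds a fully qualified class name.
        parser = load(conf, "flow.parser", Parser.class);
        decorator = load(conf, "flow.decorator", Decorator.class);
        partitioner = load(conf, "flow.partitioner", Partitioner.class);
    }

    // Looks up a class name in the config, checks that it implements the
    // expected role, and creates an instance via its no-argument constructor.
    static <T> T load(Properties conf, String key, Class<T> role) {
        String className = conf.getProperty(key);
        if (className == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        try {
            Class<? extends T> impl = Class.forName(className).asSubclass(role);
            return impl.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    // Returns "<partition>/<decorated line>" for one input line.
    String map(String line) {
        String[] fields = decorator.decorate(parser.parse(line));
        return partitioner.partitionFor(fields) + "/" + String.join("\t", fields);
    }
}

class DataFlowSketch {
    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("flow.parser", TabParser.class.getName());
        conf.setProperty("flow.decorator", UpperCaseDecorator.class.getName());
        conf.setProperty("flow.partitioner", FirstFieldPartitioner.class.getName());

        ConfiguredFlow flow = new ConfiguredFlow(conf);
        System.out.println(flow.map("web\tclick"));
    }
}
```

Loading in the constructor (setup) rather than in map() keeps the reflection cost to once per task instead of once per record. A misconfigured class name or a class that does not implement the expected interface fails immediately at setup, before any records are processed.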