Hadoop >> mail # user >> DataCreator


Re: DataCreator
Sounds like Pig.  Or Cascading.  Or Hive.

Seriously, isn't this already available?

On Wed, Feb 16, 2011 at 7:06 AM, Guy Doulberg <[EMAIL PROTECTED]> wrote:

>
> Hey all,
> I'd like to consult you Hadoopers about a Map/Reduce application I want
> to build.
>
> I want to build a map/reduce job that reads files from HDFS, performs some
> sort of transformation on the file lines, and stores them in several
> partitions depending on the source of the file or its data.
>
> I want this application to be as configurable as possible, so I designed
> interfaces to parse, decorate and partition (on HDFS) the data.
>
> I want to be able to configure different data flows, with different
> parsers, decorators and partitioners, using a config file.
>
> Do you think you would use such an application? Would it be a good fit for
> an open-source project?
>
> Now, I have some technical questions:
> I was thinking of using reflection to load all the classes I need,
> according to the configuration, during the Mapper's setup phase.
> Do you think that is a good idea?
>
> Is there a way to pass objects or interface implementations to the Mapper
> from the job declaration?
>
> Thanks,
>
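
On the reflection question in the quoted message, here is a minimal sketch of what loading parsers, decorators and partitioners from configuration could look like. It is plain Java with no Hadoop dependency, and every name in it (FlowConfigSketch, the three interfaces, the sample implementations, the flow.* keys) is hypothetical and not taken from the thread. java.util.Properties stands in for the job configuration, and a real Mapper would do the loading once in its setup method rather than per record.

```java
import java.util.Properties;

// Hypothetical sketch: all class names and config keys are illustrative assumptions.
class FlowConfigSketch {

    /** Turns a raw input line into fields. */
    public interface Parser {
        String[] parse(String line);
    }

    /** Transforms the parsed fields. */
    public interface Decorator {
        String[] decorate(String[] fields);
    }

    /** Chooses the output partition (for example a directory name) for a record. */
    public interface Partitioner {
        String partitionFor(String[] fields);
    }

    public static class CommaParser implements Parser {
        public String[] parse(String line) {
            return line.split(",");
        }
    }

    public static class UpperCaseDecorator implements Decorator {
        public String[] decorate(String[] fields) {
            String[] out = new String[fields.length];
            for (int i = 0; i < fields.length; i++) {
                out[i] = fields[i].toUpperCase();
            }
            return out;
        }
    }

    public static class FirstFieldPartitioner implements Partitioner {
        public String partitionFor(String[] fields) {
            return "source=" + fields[0];
        }
    }

    /** Looks up a class name under the given key and instantiates it via reflection. */
    static <T> T load(Properties conf, String key, Class<T> iface) {
        String className = conf.getProperty(key);
        if (className == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        try {
            // asSubclass fails fast if the configured class does not implement iface.
            Class<? extends T> impl = Class.forName(className).asSubclass(iface);
            return impl.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load " + className + " for " + key, e);
        }
    }

    /** Runs one line through the configured parser, decorator and partitioner. */
    static String route(Properties conf, String line) {
        Parser parser = load(conf, "flow.parser", Parser.class);
        Decorator decorator = load(conf, "flow.decorator", Decorator.class);
        Partitioner partitioner = load(conf, "flow.partitioner", Partitioner.class);

        String[] fields = decorator.decorate(parser.parse(line));
        return partitioner.partitionFor(fields) + "/" + String.join(",", fields);
    }

    /** Builds a sample configuration; a real one would be read from a config file. */
    static Properties demoConfig() {
        Properties conf = new Properties();
        conf.setProperty("flow.parser", CommaParser.class.getName());
        conf.setProperty("flow.decorator", UpperCaseDecorator.class.getName());
        conf.setProperty("flow.partitioner", FirstFieldPartitioner.class.getName());
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(route(demoConfig(), "web,hello"));
    }
}
```

As for passing objects from the job declaration: the job configuration travels to the tasks as string key/value pairs, so the usual approach is to ship class names (plus any parameters) rather than live objects. Hadoop's Configuration has setClass and getClass for this, and ReflectionUtils.newInstance can instantiate the result; check the exact signatures against the Hadoop version in use.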