Bigtop >> mail # user >> A first glance/reminder/hack at the BigPetStore pipeline


Re: A first glance/reminder/hack at the BigPetStore pipeline
On 10/08/2013 03:16 PM, Jay Vyas wrote:
> Hi folks.
>
> I've been hacking around on the big pet store idea.  So far I've only got
> the template for the synthetic data set generator:
>
> https://raw.github.com/jayunit100/hadoop-example-jobs/master/src/main/java/org/bigtop/bigpetstore/PetStoreTransactionGeneratorJob.java
>
> This is the "first" phase implementation of a MapReduce job that will
> generate a synthetic data set of transactions in a pet store.
>
> It is meant to be configurable, so people can use it to generate as many
> transactions as they want.  I will also add more "products" to it.
>
> 2) The next step will be to flesh out the transaction data and then
> write up aggregations in Hive, Pig, and MapReduce.  That will serve
> as the ETL blueprint.
>
> 3) Then the interesting part will come: feeding those ETL'd statistics
> into an available data store that is Bigtop-supported, i.e. Solr
> indices and HBase key-values.
>
> At that point the sample application will be ready and the first
> iteration of bigtop.blueprints will be ready to share.
>
> If anyone has initial thoughts or wants to jump in, let me know. :)
>
> Jay Vyas
> http://jayunit100.blogspot.com
Looks like a great start!
Can't wait to see the following parts.

Some notes:
* Missing license header
* Package name should probably be org.apache.bigtop.blueprint.bigpetstore
* It would be nice to split all these classes into different files
* It would be nice to group instance variables in one place (e.g.
int soFar is declared right in the middle between two methods)
* It would be nice to extract strings such as "Dud Job", "transactions"
or "transaction_files" into constants
* I have spotted some System.out.println calls
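
As a rough illustration of several of the notes above (license header, constants instead of inline strings, grouped instance variables, a logger instead of System.out.println), here is a minimal sketch. The class name PetStoreJobSketch, the constant names, and the recordTransaction method are hypothetical and not taken from the actual job. java.util.logging is used only to keep the example dependency-free; the real job may well prefer a different logging library. The license header is abbreviated and should be replaced with the full standard ASF header.

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. (Abbreviated here; a real source file
 * should carry the full standard ASF license header.)
 */

// A real file would also start with the suggested package declaration:
// package org.apache.bigtop.blueprint.bigpetstore;

import java.util.logging.Logger;

// Hypothetical class: stands in for one of the classes split out of the job.
final class PetStoreJobSketch {

    // String literals mentioned in the review, extracted into constants.
    // The constant names are assumptions chosen for this sketch.
    static final String DUD_JOB = "Dud Job";
    static final String TRANSACTIONS = "transactions";
    static final String TRANSACTION_FILES = "transaction_files";

    // A logger replaces direct System.out.println calls.
    private static final Logger LOG =
            Logger.getLogger(PetStoreJobSketch.class.getName());

    // Instance variables grouped together at the top of the class.
    private int soFar = 0;

    // Hypothetical method: counts one generated transaction and logs progress.
    int recordTransaction() {
        soFar++;
        LOG.fine("Generated " + soFar + " " + TRANSACTIONS);
        return soFar;
    }
}
```

With the strings held in constants, a later rename of "transaction_files" only needs to change one line, and the log level can be tuned without touching the job code.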

Thanks,
Bruno