A first glance/reminder/hack at the BigPetStore pipeline
Jay Vyas 2013-10-08, 22:16
I've been hacking around on the big pet store idea. So far I've only got
the template for the synthetic data set generator:
This is the "first" phase implementation of a MapReduce job that will
generate a synthetic data set of transactions in a pet store.
It is meant to be configurable, so people can use it to generate as many
transactions as they want. I will also add more "products" to it.
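
To make the "configurable" part concrete, here is a rough sketch of the
idea in plain Python rather than MapReduce. It is not the actual generator:
the product catalog, the state list, the record layout, and the function
name generate_transactions are all made up for illustration. The only
point is that the transaction count is a parameter.

```python
import random

# Hypothetical product catalog and state list; the real generator's
# products and fields are not specified in this post.
PRODUCTS = {"dog-food": 10.50, "cat-litter": 7.25, "fish-flakes": 3.00}
STATES = ["AZ", "CA", "CO", "NY", "TX"]


def generate_transactions(count, seed=0):
    """Yield `count` synthetic (id, state, product, price) tuples."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    names = sorted(PRODUCTS)
    for txn_id in range(count):
        product = rng.choice(names)
        yield (txn_id, rng.choice(STATES), product, PRODUCTS[product])


if __name__ == "__main__":
    # Print five transactions as comma-separated lines.
    for txn in generate_transactions(5):
        print(",".join(str(field) for field in txn))
```

In a real MapReduce version, the count would presumably come from the job
configuration and be split across mappers; that part is left out here.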
2) The next step will be to flesh out the transaction data and then write
up aggregations in Hive, Pig, and MapReduce. That will serve as the
3) Then the interesting part will come: feeding those ETL'd statistics
into an available data store that is Bigtop-supported, i.e. Solr indices
and HBase key-values.
At that point the sample application will be ready and the first iteration
of bigtop.blueprints will be ready to share.
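
For a feel of what the aggregations in step 2 might compute, here is a
small map/reduce-style sketch in plain Python. Everything in it is an
assumption: the sample records, the (id, state, product, price) layout,
and the choice of "sales count and revenue per product" as the statistic.
The real Hive, Pig, and MapReduce versions may look quite different.

```python
from collections import defaultdict

# Hypothetical sample records: (transaction id, state, product, price).
TRANSACTIONS = [
    (0, "CA", "dog-food", 10.5),
    (1, "CA", "cat-litter", 7.25),
    (2, "NY", "dog-food", 10.5),
    (3, "CA", "dog-food", 10.5),
]


def map_phase(records):
    """Emit one (product, price) pair per transaction."""
    for _txn_id, _state, product, price in records:
        yield product, price


def reduce_phase(pairs):
    """Group pairs by key; return {product: (sale count, total revenue)}."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: (len(vals), sum(vals)) for key, vals in grouped.items()}


def sales_by_product(records):
    """Run the map phase and then the reduce phase over the records."""
    return reduce_phase(map_phase(records))


if __name__ == "__main__":
    for product, (count, revenue) in sorted(sales_by_product(TRANSACTIONS).items()):
        print(product, count, revenue)
```

The resulting per-product rows are the kind of ETL'd statistics that
step 3 would then load into a store.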
If anyone has initial thoughts or wants to jump in, let me know. :)