Re: A first glance/reminder/hack at the BigPetStore pipeline
Jay Vyas 2013-12-03, 16:42
Thanks Bruno! I missed this :). Yes, I'll add these in!
On Fri, Oct 25, 2013 at 4:50 AM, Bruno Mahé <[EMAIL PROTECTED]> wrote:
> On 10/08/2013 03:16 PM, Jay Vyas wrote:
>> Hi folks.
>> I've been hacking around on the big pet store idea. So far I've only got
>> the template for the synthetic data set generator:
>> This is the "first" phase implementation of a MapReduce job that will
>> generate a synthetic data set of transactions in a pet store.
>> It is meant to be configurable, so people can use it to generate as many
>> transactions as they want. I will also add more "products" to it.
>> 2) The next step will be to flesh out the transaction data and then
>> write up aggregations in Hive, Pig, and MapReduce. That will serve
>> as the ETL blueprint.
>> 3) Then the interesting part will come: feeding those ETL'd statistics
>> into an available data store that is supported by Bigtop, i.e. Solr
>> indices and HBase key-values.
>> At that point the sample application will be ready and the first
>> iteration of bigtop.blueprints will be ready to share.
>> If anyone has initial thoughts or wants to jump in, let me know. :)
>> Jay Vyas
> Looks like a great start!
> Can't wait to see the following parts.
> Some notes:
> * Missing license header
> * Package name should probably be org.apache.bigtop.blueprint.bigpetstore
> * It would be nice to split all these classes into different files
> * It would be nice to group instance variables at the same location (ex:
> int soFar is declared right in the middle between two methods)
> * It would be nice to extract strings such as "Dud Job", "transactions" or
> "transaction_files" into constants
> * I have spotted some System.out.println calls
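
The review notes above could be applied roughly as follows. This is only a minimal sketch, not the actual BigPetStore generator, whose source is not shown in this thread. The class name, the `total` field, and the record format are hypothetical; only the string literals and the `soFar` field name are taken from the notes. The suggested package declaration and the license header are left as comments so the sketch compiles on its own, and java.util.logging is used as one possible replacement for System.out.println.

```java
// (license header would go here)
// package org.apache.bigtop.blueprint.bigpetstore;  // suggested in the review

import java.util.logging.Logger;

// Hypothetical class; stands in for one of the classes to be split into its own file.
final class PetStoreTransactionSketch {

    // String literals extracted into constants (values quoted in the review).
    static final String JOB_NAME = "Dud Job";
    static final String TRANSACTIONS = "transactions";
    static final String TRANSACTION_FILES = "transaction_files";

    // A logger instead of System.out.println.
    private static final Logger LOG =
            Logger.getLogger(PetStoreTransactionSketch.class.getName());

    // Instance variables grouped together at the top of the class.
    private final int total; // assumed: number of transactions to generate
    private int soFar;       // number generated so far

    PetStoreTransactionSketch(int total) {
        this.total = total;
    }

    boolean hasNext() {
        return soFar < total;
    }

    // Returns a placeholder record id and advances the counter.
    String next() {
        soFar++;
        LOG.fine(() -> "Generated " + soFar + " of " + total + " " + TRANSACTIONS);
        return TRANSACTIONS + "-" + soFar;
    }

    public static void main(String[] args) {
        PetStoreTransactionSketch gen = new PetStoreTransactionSketch(3);
        while (gen.hasNext()) {
            LOG.info(gen.next());
        }
    }
}
```

Keeping the literals in one place means the job name and the output path names can be changed without hunting through the mapper and input format code.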