Re: A first glance/reminder/hack at the BigPetStore pipeline
Thanks, Bruno! I missed this :). Yes, I'll add these in!
On Fri, Oct 25, 2013 at 4:50 AM, Bruno Mahé <[EMAIL PROTECTED]> wrote:

> On 10/08/2013 03:16 PM, Jay Vyas wrote:
>
>> Hi folks.
>>
>> I've been hacking around on the big pet store idea.  So far I've only got
>> the template for the synthetic data set generator:
>>
>> https://raw.github.com/jayunit100/hadoop-example-jobs/master/src/main/java/org/bigtop/bigpetstore/PetStoreTransactionGeneratorJob.java
>>
>> This is the "first" phase implementation of a MapReduce job that will
>> generate a synthetic data set of transactions in a pet store.
>>
>> It is meant to be configurable, so people can use it to generate as many
>> transactions as they want.  I will also add more "products" to it.
>>
>> 2) The next step will be to flesh out the transaction data and then
>> write up aggregations in Hive, Pig, and MapReduce.  That will serve
>> as the ETL blueprint.
>>
>> 3) Then the interesting part will come: feeding those ETL'd statistics
>> into an available data store that is Bigtop-supported, i.e. Solr
>> indices and HBase key-values.
>>
>> At that point the sample application will be ready and the first
>> iteration of bigtop.blueprints will be ready to share.
>>
>> If anyone has initial thoughts or wants to jump in, let me know. :)
>>
>> Jay Vyas
>> http://jayunit100.blogspot.com
>>
>
>
> Looks like a great start!
> Can't wait to see the following parts.
>
> Some notes:
> * Missing license header
> * Package name should probably be org.apache.bigtop.blueprint.bigpetstore
> * It would be nice to split all these classes into different files
> * It would be nice to group instance variables at the same location (ex:
> int soFar is declared right in the middle between two methods)
> * It would be nice to extract strings such as "Dud Job", "transactions" or
> "transaction_files" into constants
> * I have spotted some System.out.println calls
>
> Thanks,
> Bruno
>
>
--
Jay Vyas
http://jayunit100.blogspot.com
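
For illustration only, here is a rough sketch of what a file might look like
with Bruno's notes applied. It is not the actual
PetStoreTransactionGeneratorJob: the class name, the constant names, the
product list and the record format are all made up, Hadoop is left out so the
snippet compiles on its own, and java.util.logging stands in for whatever
logger the project would really use. Only the three string literals, the
"soFar" counter and the suggested package name come from the thread.

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// In a real source tree the suggested package would be declared here:
// package org.apache.bigtop.blueprint.bigpetstore;
// It is commented out so the sketch compiles as a single standalone file.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.logging.Logger;

/** Hypothetical stand-in for the generator; one class per file. */
class PetStoreTransactionSketch {

    // String literals extracted into constants. The constant names are
    // guesses at what each string is used for in the original job.
    static final String JOB_NAME = "Dud Job";
    static final String TRANSACTIONS_KEY = "transactions";
    static final String TRANSACTION_FILES_KEY = "transaction_files";

    // Assumption: java.util.logging replaces System.out.println here.
    private static final Logger LOG =
            Logger.getLogger(PetStoreTransactionSketch.class.getName());

    // Made-up product names, purely illustrative.
    private static final String[] PRODUCTS = {"dog-food", "cat-litter", "fish-tank"};

    // Instance variables grouped in one place, including soFar.
    private final int totalTransactions;
    private final Random random;
    private int soFar = 0;

    PetStoreTransactionSketch(int totalTransactions, long seed) {
        if (totalTransactions < 0) {
            throw new IllegalArgumentException("totalTransactions must be >= 0");
        }
        this.totalTransactions = totalTransactions;
        this.random = new Random(seed);
    }

    /** True while fewer than the configured number of records were produced. */
    boolean hasNext() {
        return soFar < totalTransactions;
    }

    /** Returns one synthetic record as "index,product" and advances the counter. */
    String next() {
        String product = PRODUCTS[random.nextInt(PRODUCTS.length)];
        String record = soFar + "," + product;
        soFar++;
        return record;
    }

    /** Produces all remaining records and logs a summary line. */
    List<String> generateAll() {
        List<String> records = new ArrayList<>();
        while (hasNext()) {
            records.add(next());
        }
        LOG.info(JOB_NAME + ": generated " + records.size() + " " + TRANSACTIONS_KEY);
        return records;
    }

    public static void main(String[] args) {
        // The number of transactions is the configurable part.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        new PetStoreTransactionSketch(count, 42L).generateAll();
    }
}
```

Running it with an integer argument sets how many records are generated. In
the real job that number would presumably come from the Hadoop job
configuration rather than from the command line.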