Re: reference architecture
You just made my year. Let me know how I can make it better (off list).

Russell Jurney twitter.com/rjurney
On Oct 29, 2012, at 2:17 PM, "Daniel Käfer" <[EMAIL PROTECTED]> wrote:

> Thank you, that book is exactly what I'm looking for.
>
> Regards
> Daniel Käfer
>
> On Saturday, 27.10.2012, at 02:19 -0700, Russell Jurney wrote:
>> Russell Jurney http://datasyndrome.com
>>
>> On Oct 25, 2012, at 12:24 PM, "Daniel Käfer" <[EMAIL PROTECTED]> wrote:
>>
>>> Hello all,
>>>
>>> I'm looking for a reference architecture for Hadoop. The only result I
>>> found is the Lambda architecture by Nathan Marz[0].
>>>
>>> By architecture I mean answers to questions like:
>>> - How should I store the data? CSV, Thrift, ProtoBuf?
>> You should use Avro.
>>> - How should I model the data? ER model, star schema, something new?
>> You should use a document format.
>>> - normalized or denormalized or both (master data normalized, then
>>> transformation to denormalized, like ETL)
>> Denormalize fully, into a document format.
>>> - How should I combine database and HDFS files?
>> Don't. Put everything on HDFS.
>>>
>>> Are there any other documented architectures for Hadoop?
>> I really did make an example in my book. It is just one example, but
>> you wanted answers to questions that always 'depend.' You can check it
>> out in slides: http://www.slideshare.net/mobile/hortonworks/agile-analytics-applications-on-hadoop
>>>
>>> Regards
>>> Daniel Käfer
>>>
>>>
>>> [0] http://www.manning.com/marz/ only a preprint so far, not yet completed
>>>
>
>
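
For readers who want something concrete, here is a minimal sketch of the advice quoted above (use Avro, denormalize fully into a document format). It is not taken from the book or the slides. The record names, field names, and sample data are all hypothetical, and the example uses only the Python standard library: it builds an Avro-style schema as plain JSON and folds normalized rows into one nested document per user. Actually writing Avro files to HDFS would need an Avro library, which is not shown here.

```python
import json

# Hypothetical normalized inputs; all names and values are invented for illustration.
users = {1: {"name": "Ada", "country": "UK"}}
orders = [
    {"order_id": 10, "user_id": 1, "item": "book", "price": 12.5},
    {"order_id": 11, "user_id": 1, "item": "pen", "price": 1.2},
]

# Avro schemas are written as JSON. This assumed schema describes one
# denormalized document per user, with that user's orders nested inside.
USER_DOC_SCHEMA = {
    "type": "record",
    "name": "UserDocument",
    "fields": [
        {"name": "user_id", "type": "long"},
        {"name": "name", "type": "string"},
        {"name": "country", "type": "string"},
        {"name": "orders", "type": {"type": "array", "items": {
            "type": "record",
            "name": "Order",
            "fields": [
                {"name": "order_id", "type": "long"},
                {"name": "item", "type": "string"},
                {"name": "price", "type": "double"},
            ],
        }}},
    ],
}


def denormalize(users, orders):
    """Fold the order rows into their owning user, one document per user."""
    docs = {uid: {"user_id": uid, **user, "orders": []} for uid, user in users.items()}
    for order in orders:
        nested = {k: v for k, v in order.items() if k != "user_id"}
        docs[order["user_id"]]["orders"].append(nested)
    return list(docs.values())


documents = denormalize(users, orders)
print(json.dumps(documents[0], indent=2))
```

Each resulting document carries everything needed to answer questions about one user without a join, which is the point of storing fully denormalized documents on HDFS.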