Hadoop >> mail # user >> Architecture question on Ingesting Data into Hadoop


Re: Architecture question on Ingesting Data into Hadoop
Hi Apurva,

I would use a data ingestion tool like Apache Flume to make the task
easier without much human intervention. Create sources for your different
systems and the rest will be taken care of by Flume. It is not a must
to use something like Flume, but it will definitely make your life easier
and help you build a more sophisticated system, IMHO.
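As a rough sketch, a Flume agent that tails a log file on one of your
source systems and lands the events in HDFS could be configured like
this (the agent, source, channel, sink, and path names are all
placeholders, not something from your setup):

```properties
# Illustrative Flume agent config: tail an application log, buffer in
# memory, and write the events into HDFS partitioned by date.
agent1.sources  = src1
agent1.channels = ch1
agent1.sinks    = sink1

# Source: follow an application log file (path is an assumption)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/myapp/app.log
agent1.sources.src1.channels = ch1

# Channel: buffer events in memory between source and sink
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# Sink: write events into HDFS, one directory per day
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.channel = ch1
agent1.sinks.sink1.hdfs.path = /flume/myapp/%Y-%m-%d
agent1.sinks.sink1.hdfs.fileType = DataStream
# exec sources don't stamp events, so use the local time for %Y-%m-%d
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
```

You would run one such agent per source system (or fan several sources
into one agent) and point them all at the same HDFS cluster.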

You need HBase when you need real-time, random read/write access to your
data: basically, when you intend to have low-latency access to small
amounts of data from within a large data set and you have a flexible
schema.
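To illustrate that access pattern, here is what it looks like from the
HBase shell (the table, row key, and column names are made up for the
example):

```
hbase> create 'events', 'd'
hbase> put 'events', 'user42#2014-03-25', 'd:clicks', '17'
hbase> get 'events', 'user42#2014-03-25'
```

The point is the single-row get: you pull a few cells out of a table
that may hold billions of rows, in milliseconds, without scanning the
data set the way a MapReduce or Hive job would.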

And for the last part of your question, use Apache Hive. It provides
warehousing capabilities on top of an existing Hadoop cluster, with an
SQL-like interface to query the stored data. It will also be of help if
you move to Impala later, since Impala can read the same table
definitions from the Hive metastore.
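As a minimal sketch, assuming the files Flume lands in HDFS are
tab-delimited (the path, column names, and delimiter here are
assumptions), you could lay a Hive table directly over them:

```sql
-- Illustrative external table over data already sitting in HDFS;
-- dropping the table leaves the underlying files untouched.
CREATE EXTERNAL TABLE app_events (
  event_time STRING,
  user_id    STRING,
  action     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/flume/myapp/';

-- Then query it SQL-style:
SELECT action, COUNT(*) AS cnt
FROM app_events
GROUP BY action;
```

Using EXTERNAL here means Hive only records the schema and location in
its metastore, which is exactly what Impala would pick up as well.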

HTH

Warm Regards,
Tariq
cloudfront.blogspot.com
On Tue, Mar 25, 2014 at 1:41 AM, Geoffry Roberts <[EMAIL PROTECTED]>wrote: