There are too many issues to cover in one reply, I guess. I would
recommend reading "Hadoop: The Definitive Guide" by Tom White. Some of
its chapters cover these questions.
Also, what did you mean by "real time"? Hadoop is not designed to
return query results in real time. It is better suited to offline data
analysis, because each query can take minutes or hours to finish.
AFAIK, HBase provides some real-time functionality, though.
For Hadoop automation, you can try Oozie. We are using Opswise in our company.
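
On the "alert of failure process" part: whatever scheduler you end up with, the basic idea is to check the job's exit status and notify someone when it is non-zero. Below is a rough Python sketch of that idea only. It is not how Oozie or Opswise do it; the function name run_with_alert and the alert callback are made up for illustration, and the jar and class names in the usage comment are placeholders.

```python
import subprocess
import sys


def run_with_alert(command, alert):
    """Run a command and call alert(message) if it fails.

    Returns the exit code (127 if the executable could not be found).
    """
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # The executable itself is missing, e.g. not on PATH.
        alert(f"Job could not start, command not found: {command[0]}")
        return 127
    if result.returncode != 0:
        # Keep only the tail of stderr so the alert stays readable.
        tail = result.stderr.strip()[-500:]
        alert(f"Job failed (exit {result.returncode}): {' '.join(command)}\n{tail}")
    return result.returncode


# Example usage (placeholder jar and class names, alert just prints to stderr):
#   job = ["hadoop", "jar", "my-job.jar", "com.example.MyJob"]
#   run_with_alert(job, alert=lambda msg: print(msg, file=sys.stderr))
```

In practice the alert callback would send a mail or page someone instead of printing, and a workflow tool would handle retries and dependencies between jobs.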
On Mon, Oct 1, 2012 at 5:36 PM, yogesh dhari <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have understood Hadoop and the Hadoop ecosystem (Pig for ETL, Hive as
> a data warehouse, Sqoop as an import tool). I have worked and learned on a
> single-node cluster with demo data.
> Since Hadoop is best suited to Unix platforms, please help me understand the
> requirements, from start to finish, for using Hadoop in production.
> What would be needed to use Hadoop on a real-time project,
> such as Hadoop automation on Unix and alerts for failed processes?
> Please put some light on using Hadoop on real time and what objectives are
> Thanks & Regards
> Yogesh Kumar