pgaurav 2012-09-05, 08:42
Welcome to the Hadoop community. :)
Hadoop is meant for processing large data volumes. That said, for your
custom requirements you should write your own mapper and reducer that
contain your business logic for processing the input data. You can also
have a look at Hive and Pig, which are tools built on top of MapReduce
that are widely used for data analysis. Hive supports SQL-like queries. If
your requirements can be satisfied with Hive or Pig, it is highly
recommended to go with those.
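
To make the "write your own mapper and reducer" idea concrete, here is a minimal sketch of the map / sort / reduce flow in plain Python. It is only an illustration under stated assumptions: the CSV layout (a region in the first column, a numeric amount in the second) and the names map_lines, reduce_pairs and run_local are hypothetical, and nothing here calls the Hadoop API. It runs in a single process so you can see what each phase does with your data.

```python
import csv
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one (region, amount) pair per CSV record.

    Assumed (hypothetical) layout: column 0 = region, column 1 = amount.
    """
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed records
        region, amount = row[0].strip(), row[1].strip()
        try:
            yield region, float(amount)
        except ValueError:
            continue  # skip a header row or a non-numeric amount


def reduce_pairs(pairs):
    """Reducer: sum the amounts for each key; input must be sorted by key."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in one process."""
    mapped = sorted(map_lines(lines), key=lambda kv: kv[0])
    return dict(reduce_pairs(mapped))


if __name__ == "__main__":
    sample = ["region,amount", "north,10.5", "south,4", "north,1.5"]
    for region, total in run_local(sample).items():
        print(f"{region}\t{total}")
```

With Hadoop Streaming, the same two functions would typically live in separate scripts that read records from standard input and print tab-separated key/value lines, and the framework would do the sorting between them. An aggregation this simple is also the kind of job that a single Hive GROUP BY query could replace.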
On Wed, Sep 5, 2012 at 2:12 PM, pgaurav <[EMAIL PROTECTED]> wrote:
> Hi Guys,
> I’m 5 days old in the Hadoop world and am trying to evaluate it as a
> long-term solution for our client.
> I did some R&D on Amazon EC2 / EMR:
> 1. Load the data (text / CSV) to S3
> 2. Write the mapper / reducer / JobClient and upload the jar to S3
> 3. Start a job flow
> I tried two code samples: word count and CSV data processing.
> My question is: what should be done to further analyse the data (reporting,
> search)? Do I need to implement it in the Mapper class itself? Do I need to
> dump the data to a database and then write a custom application? What
> is the standard way to analyse the data?
Nitin Pawar 2012-09-05, 08:53
Bertrand Dechoux 2012-09-05, 08:57