Hadoop, mail # user - Need Info


Re: Need Info
sudha sadhasivam 2009-10-22, 03:43
Is this a modification to Hadoop itself, or an application that runs on it?
 
If it is an application, first write the basic algorithm and see which portions can be parallelised.
Then put the parallelisable code into the map portion.
For this, try out the sample examples (wordcount, grep, sort, etc.) in Hadoop. They will help you understand the input (K,V) pairs and the output collector format.
 
The remaining portions are written as in ordinary Java.
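
To make the (K,V) idea concrete, here is a minimal sketch of the wordcount pattern in plain Java. It deliberately uses no Hadoop classes, so it runs with a stock JDK. The names WordCountSketch, Pair, map, reduce, and run are illustrative assumptions, not Hadoop API. In a real job the map step would emit its pairs through Hadoop's output collector rather than returning a list, and the framework, not the run method, would group values by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map/reduce word count pattern.
// All names here are hypothetical; nothing below is Hadoop API.
public class WordCountSketch {

    /** One intermediate (key, value) pair, as a map step would emit it. */
    public static final class Pair {
        public final String key;
        public final int value;

        public Pair(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Map step: input key = offset of the line, input value = the line text.
     * Emits (word, 1) for every whitespace-separated token.
     * This is the part that can run in parallel, one call per line.
     */
    public static List<Pair> map(long offset, String line) {
        List<Pair> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new Pair(token.toLowerCase(), 1));
            }
        }
        return out;
    }

    /** Reduce step: receives one word and all counts emitted for it. */
    public static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    /** Stand-in for the framework: run map, group by key, then reduce. */
    public static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            for (Pair p : map(offset, line)) {
                grouped.computeIfAbsent(p.key, k -> new ArrayList<>()).add(p.value);
            }
            offset += line.length() + 1; // +1 for the assumed newline
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {brown=1, dog=1, fox=1, lazy=1, quick=1, the=2}
        System.out.println(run(Arrays.asList("the quick brown fox", "the lazy dog")));
    }
}
```

The point of the split is that map looks at one line in isolation, so many copies can work on different parts of the input at once, while reduce only needs the values that share a key.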
G Sudha Sadasivam

--- On Thu, 10/22/09, shwitzu <[EMAIL PROTECTED]> wrote:
From: shwitzu <[EMAIL PROTECTED]>
Subject: Re: Need Info
To: [EMAIL PROTECTED]
Date: Thursday, October 22, 2009, 6:07 AM

Thanks for Responding.

I read about HDFS and understood how it works. I also installed Hadoop on
my Windows machine using Cygwin, tried a sample driver code, and made sure it
works.

But my concern is: given the problem statement, how should I proceed?

Could you please give me some clues, pseudocode, or a design?

Thanks in anticipation.

Doss_IPH wrote:
>
> First and foremost, you need to understand the Hadoop platform
> infrastructure.
> Currently, I am working on a real-time application using Hadoop. I think
> that Hadoop will fit your requirements.
> Hadoop is mainly for three things:
> 1. Scalability: no limit on storage.
> 2. Processing petabytes of data in distributed, parallel mode.
> 3. Fault tolerance (automatic block replication): recovering data from
> failure.
>
>
> shwitzu wrote:
>>
>> Hello Sir!
>>
>> I am new to Hadoop. I have a project based on web services. I have my
>> information in 4 databases, with different files in each one of them: say,
>> images in one, video in another, documents in another, etc. My task is to
>> develop a web service that accepts a keyword from the client, processes the
>> request, and sends the actual requested file back to the user. Now I have
>> to use the Hadoop Distributed File System in this project.
>>
>> I have the following questions:
>>
>> 1) How should I start with the design?
>> 2) Should I upload all the files and create Map, Reduce, and Driver code?
>> Once I run my application, will it automatically go to the file system
>> and get the results back to me?
>> 3) How do I handle the binary data? I want to store binary-format data
>> using MTOM in my database.
>>
>> Please let me know how I should proceed. I don't know much about
>> Hadoop and I am searching for some help. It would be great if you could
>> assist me. Thanks again.
>>
>>
>
>

--
View this message in context: http://www.nabble.com/Need-Info-tp25901902p25996385.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.