I think you should look at using a SequenceFile as the input format.
Basically, the way this works is that you run a separate Java process that takes several image files, reads the raw bytes of each into memory, then stores each image as a key-value pair in a SequenceFile. Repeat this for all the images, writing the SequenceFile into HDFS. Because each image is a single record, a map task always receives a whole image, never a fragment. This may take a while, but you'll only have to do it once.
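To make the packing step concrete, here is a minimal sketch in plain Java. It is not the Hadoop API: the class name ImagePacker and the record layout (file name, length, raw bytes) are my own stand-ins so that the example runs without a cluster. In a real job you would, I believe, write the same pairs through a SequenceFile.Writer, with the file name as a Text key and the image bytes as a BytesWritable value.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the packing process: one record per image,
// key = file name, value = the whole image as raw bytes.
final class ImagePacker {

    // Append every image to a single container file as (name, length, bytes).
    static void pack(List<Path> images, Path container) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(container))) {
            for (Path image : images) {
                byte[] raw = Files.readAllBytes(image);        // whole image in memory
                out.writeUTF(image.getFileName().toString());  // key
                out.writeInt(raw.length);                      // value length
                out.write(raw);                                // value
            }
        }
    }

    // Read the records back; each value is a complete, unsplit image.
    static Map<String, byte[]> unpack(Path container) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(container))) {
            while (true) {
                String name;
                try {
                    name = in.readUTF();
                } catch (EOFException endOfFile) {
                    break; // no more records
                }
                byte[] raw = new byte[in.readInt()];
                in.readFully(raw);
                records.put(name, raw);
            }
        }
        return records;
    }
}
```

Calling ImagePacker.pack(images, container) once over all your files gives you one large file of whole-image records, which is the same shape of data the SequenceFile would hold.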
From: AMARNATH, Balachandar [mailto:[EMAIL PROTECTED]]
Sent: 06 March 2013 11:07
To: [EMAIL PROTECTED]
Subject: Map reduce technique
I am new to the map-reduce paradigm. I read in a tutorial that the 'map' function splits the data into key-value pairs. Does this mean the map-reduce framework splits the data into pieces automatically, or do we need to explicitly provide a method to split it? If it splits automatically, how does it split an image file (by size, etc.)? I ask because processing an image file as a whole will give different results than processing it in chunks.
With thanks and regards