|
Praveen Sripati
2011-09-24, 13:42
GOEKE, MATTHEW
2011-09-24, 14:50
Praveen Sripati
2011-09-24, 16:56
GOEKE, MATTHEW
2011-09-26, 02:51
Ravi Teja
2011-09-26, 05:02
Ravi Teja
2011-09-26, 05:25
|
-
How to pull data in the Map/Reduce functions? Praveen Sripati 2011-09-24, 13:42
Hi,
Normally the Hadoop framework calls map()/reduce() for each record in the input split. I read in 'Hadoop: The Definitive Guide' that data can be pulled using the new MR API. What is the new API for pulling data in map()/reduce(), and is there any sample code?

Thanks,
Praveen
-
RE: How to pull data in the Map/Reduce functions? GOEKE, MATTHEW 2011-09-24, 14:50
Praveen,
Functionality-wise, you don't gain much from using the new API, and most would actually recommend that you stay with the old API, as it will not be "officially" deprecated until 0.22 / 0.23 (I can't remember which one). If you want to take a look at the classes, dig into the packages org.apache.hadoop.mapred.* (old) and org.apache.hadoop.mapreduce.* (new). Also, I thought that the second edition of the Definitive Guide and Hadoop in Action covered the new API.

Matt

This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.

All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware". Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying this e-mail or any attachment.

The information contained in this email may be subject to the export control laws and regulations of the United States, potentially including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of Treasury, Office of Foreign Asset Controls (OFAC). As a recipient of this information you are obligated to comply with all applicable U.S. export laws and regulations.
-
Re: How to pull data in the Map/Reduce functions? Praveen Sripati 2011-09-24, 16:56
Matt,
Neither of the books has much information about the new MR API. I was reading 'Hadoop: The Definitive Guide' and came across a single page on the new API. I wanted to try the new MR API, but could not find much information either in the book or on the internet.

Thanks,
Praveen
-
RE: How to pull data in the Map/Reduce functions? GOEKE, MATTHEW 2011-09-26, 02:51
A quick Google search for "mapreduce new api" came up with this link: http://wisdombase.net/wiki/index.php?title=Upgrading_to_the_new_2.0_mapreduce_API. After quickly reviewing it, I would say it is a decent reference that you can follow. There are definitely better ones out there, but this should help you get started.
Matt
-
RE: How to pull data in the Map/Reduce functions? Ravi Teja 2011-09-26, 05:02
Hi Praveen,
> that data can be pulled using the new MR API

The next key/value pair can be retrieved from the context object that is passed to map(), by calling nextKeyValue() on it. So in the new API you are able to pull the next record yourself.

Regards,
Ravi Teja
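To illustrate the pull style described above, here is a minimal sketch that runs without a Hadoop installation. RecordContext, contextOf and pullInPairs are hypothetical stand-ins made up for this example; they are not Hadoop classes. The interface only mirrors three methods that the new-API context exposes: nextKeyValue(), getCurrentKey() and getCurrentValue(). In a real job the same loop would most likely go into an override of Mapper.run(Context), whose default implementation calls map() once per record.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in sketch: nothing here is a Hadoop class. RecordContext is a
// hypothetical interface that mirrors the pull methods of the new-API context.
class PullDemo {

    interface RecordContext<K, V> {
        boolean nextKeyValue();   // advance to the next record; false when input is exhausted
        K getCurrentKey();        // key of the record most recently advanced to
        V getCurrentValue();      // value of the record most recently advanced to
    }

    // Builds an in-memory context over a list of values, using the record index as the key.
    static RecordContext<Long, String> contextOf(List<String> values) {
        Iterator<String> it = values.iterator();
        return new RecordContext<Long, String>() {
            private long key = -1;
            private String value;

            public boolean nextKeyValue() {
                if (!it.hasNext()) {
                    return false;
                }
                key++;
                value = it.next();
                return true;
            }

            public Long getCurrentKey() { return key; }

            public String getCurrentValue() { return value; }
        };
    }

    // Pull-style loop: the caller decides when to advance, here consuming
    // up to two records per iteration instead of being handed one at a time.
    static List<String> pullInPairs(RecordContext<Long, String> context) {
        List<String> out = new ArrayList<>();
        while (context.nextKeyValue()) {
            String first = context.getCurrentValue();
            if (context.nextKeyValue()) {
                out.add(first + "+" + context.getCurrentValue());
            } else {
                out.add(first);   // odd record left over at the end of the input
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [a+b, c]
        System.out.println(pullInPairs(contextOf(List.of("a", "b", "c"))));
    }
}
```

The point of the sketch is only the control flow: with a pull interface the code can read several records before emitting anything, which the one-call-per-record map() signature does not allow directly.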