Hemanth Yamijala
2012-12-30, 08:24
nagarjuna kanamarlapudi
2012-12-30, 08:31
Michael Segel
2012-12-30, 16:27
Jonathan Bishop
2012-12-30, 19:00
nagarjuna kanamarlapudi
2012-12-30, 19:02
Jonathan Bishop
2012-12-30, 19:08
Niels Basjes
2012-12-30, 19:38
Bertrand Dechoux
2012-12-31, 08:17
Edward Capriolo
2012-12-31, 16:03
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Hemanth Yamijala 2012-12-30, 08:24
If it is a small number, A seems the best way to me.

On Friday, December 28, 2012, Kshiva Kps wrote:
> Which one is current ..
>
> What is the preferred way to pass a small number of configuration
> parameters to a mapper or reducer?
>
> A. As key-value pairs in the jobconf object.
> B. As a custom input key-value pair passed to each mapper or reducer.
> C. Using a plain text file via the Distributedcache, which each mapper
>    or reducer reads.
> D. Through a static variable in the MapReduce driver class (i.e., the
>    class that submits the MapReduce job).
>
> Answer: B
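For readers who want to see what option A looks like in practice, here is a minimal sketch for a Hadoop Streaming mapper written in Python. It assumes the Streaming behaviour of exporting each job configuration property to the task as an environment variable, with non-alphanumeric characters such as `.` replaced by `_`. The property names `myjob.min.length` and `myjob.tag` and both function names are hypothetical, chosen only for illustration.

```python
import os


def load_params(environ=os.environ):
    """Read parameters that were set as key-value pairs in the job configuration.

    Assumption: Hadoop Streaming exports each jobconf property as an
    environment variable with '.' replaced by '_'.
    'myjob.min.length' and 'myjob.tag' are hypothetical property names.
    """
    return {
        "min_length": int(environ.get("myjob_min_length", "0")),
        "tag": environ.get("myjob_tag", "untagged"),
    }


def map_lines(lines, params):
    """Yield 'tag<TAB>line' for every input line at least min_length long."""
    for line in lines:
        line = line.rstrip("\n")
        if len(line) >= params["min_length"]:
            yield f"{params['tag']}\t{line}"
```

A mapper script would call `map_lines(sys.stdin, load_params())` and print each record. The values would be supplied at submission time with generic options such as `-D myjob.min.length=3`. In the Java API the equivalent is to call `set` on the job's `Configuration` in the driver and read it back through `context.getConfiguration()` in the mapper or reducer.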
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
nagarjuna kanamarlapudi 2012-12-30, 08:31
A is the best way if it is considerably small. Had it been something like all the parameters together making up a GB of data, then the distributed cache could have been a better option. Custom input key-value pairs are out of scope, as it might be difficult to satisfy all your requirements that way.

On Sunday, December 30, 2012, Hemanth Yamijala wrote:
> If it is a small number, A seems the best way to me.
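The distributed-cache alternative mentioned above can be sketched in the same style. This assumes a Hadoop Streaming job where a side file is shipped with the `-files` generic option and therefore appears in each task's working directory. The file name `params.txt`, its `key=value` layout, and the function name `read_params` are assumptions made for this example, not something the thread specifies.

```python
from pathlib import Path


def read_params(path="params.txt"):
    """Parse 'key=value' lines from a side file shipped alongside the job.

    Assumption: the file was distributed to every task (for example via
    the distributed cache) and is readable at `path`. Blank lines and
    lines starting with '#' are skipped.
    """
    params = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params
```

Each task reads the file once at start-up. That extra step is why this route only pays off when the parameters are too large to sit comfortably in the job configuration.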
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Michael Segel 2012-12-30, 16:27
Ed,

There are some who are of the opinion that these certifications are worthless. I tend to disagree; however, I don't think that they are the best way to demonstrate one's abilities. IMHO they should provide a baseline.

We have seen these types of questions on the list and in the forums. They appear to be taken from a certain vendor's prior certification tests and accumulated over time.

The sad thing is that when we respond to newbie questions we need to ask ourselves if the question is real or if they are asking the question because it's a certification question.

I'd also be careful in expressing your opinion... I wonder how long before a certain someone expresses their displeasure in your comment. ;-)

Just saying! :-)

On Dec 28, 2012, at 7:20 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
> Yes. another big data, data scientist, no ops, devops, cloud computing
> specialist is born. Thank goodness we have multiple choice tests to
> identify the best coders and administrators.
>
> On Friday, December 28, 2012, Michel Segel <[EMAIL PROTECTED]> wrote:
> > Sounds like someone is cheating on a test...
> >
> > Sent from a remote device. Please excuse any typos...
> > Mike Segel
> >
> > On Dec 28, 2012, at 3:10 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> >
> > Answer B sounds pathologically bad to me.
> > A or C are the only viable options.
> > Neither B nor D work. B fails because it would be extremely hard to
> > get the right records to the right components and because it pollutes
> > data input with configuration data. D fails because statics don't
> > work in parallel programs.
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Jonathan Bishop 2012-12-30, 19:00
E. Store them in hbase...

On Sun, Dec 30, 2012 at 12:24 AM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:
> If it is a small number, A seems the best way to me.
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
nagarjuna kanamarlapudi 2012-12-30, 19:02
Only if you have few mappers and reducers.

On Monday, December 31, 2012, Jonathan Bishop wrote:
> E. Store them in hbase...
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Jonathan Bishop 2012-12-30, 19:08
Nagarjuna,

Can you explain in more detail: what is the cost of using hbase as configuration storage for MR jobs, say if there are many of them?

Jon

On Sun, Dec 30, 2012 at 11:02 AM, nagarjuna kanamarlapudi <[EMAIL PROTECTED]> wrote:
> Only if u have few mappers and reducers
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Niels Basjes 2012-12-30, 19:38
F. Put a mongodb replica set on all hadoop worker nodes and let the tasks query the mongodb at localhost.

(This is what I did recently with a multi-GiB dataset.)

--
Kind regards,
Niels Basjes
(Sent from mobile)

On 30 Dec 2012 at 20:01, "Jonathan Bishop" <[EMAIL PROTECTED]> wrote:
> E. Store them in hbase...
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Bertrand Dechoux 2012-12-31, 08:17
G. Use cascading so that you don't have to actually provide the parameters yourself, because there is a transparent serialization of what will become the mapper and the reducer (but it is really a hidden kind of A).

http://www.cascading.org/

About certifications: of course, cheating is not allowed. And if you are indeed cheating, you are open to the 'retributions' you agreed on. But at the same time, you can find online resources which are plain wrong. I think the only good answers are
1) use the API and figure it out yourself
2) do not trust everybody (even well-intentioned people can be wrong, and the same can be said about public opinion)
3) read a good reference (like http://hadoopbook.com/)

The mailing list could have a rule stating that such posts are not allowed. It really looks like a copy-and-paste from somewhere. Any author should provide more context if there really is a point which is not understood.

Bertrand

On Sun, Dec 30, 2012 at 8:38 PM, Niels Basjes <[EMAIL PROTECTED]> wrote:
> F. put a mongodb replica set on all hadoop workernodes and let the tasks
> query the mongodb at localhost.
>
> (this is what I did recently with a multi GiB dataset)
-
Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer
Edward Capriolo 2012-12-31, 16:03
Z. Implement passing simple small objects in the most complicated manner possible: try JPOX, on top of hbase, configured by smartfrog and puppet with environments, on heroku.

On Mon, Dec 31, 2012 at 3:17 AM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:
> G. Use cascading so that way you don't have to actually provide the
> parameters yourself because there is a transparent serialization of what
> will become the mapper and the reducer. (but it is really a hidden kind-of
> A).