Ted Pedersen
2011-02-28, 00:34
Ted Dunning
2011-02-28, 00:55
Lance Norskog
2011-02-28, 02:20
Simon
2011-02-28, 03:23
Ted Pedersen
2011-02-28, 03:31
Ted Dunning
2011-02-28, 04:38
Evert Lammerts
2011-02-28, 09:04
Tom Deutsch
2011-02-28, 15:04
Ted Pedersen
2011-03-02, 18:58
-
Hadoop Case Studies? Ted Pedersen 2011-02-28, 00:34
Greetings all,

I'm teaching an undergraduate Computer Science class that is using Hadoop quite heavily, and would like to include some case studies at various points during this semester.

We are using Tom White's "Hadoop: The Definitive Guide" as a text, and that includes a very nice chapter of case studies which might even provide enough material for my purposes.

But I wanted to check and see if there were other case studies out there that might provide motivating and interesting examples of how Hadoop is currently being used. The idea is to find material that goes beyond simply saying "X uses Hadoop" to explaining in more detail how and why X is using Hadoop.

Any hints would be very gratefully received.

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
-
Re: Hadoop Case Studies? Ted Dunning 2011-02-28, 00:55
Ted,

Greetings back at you. It has been a while.

Check out Jimmy Lin and Chris Dyer's book about text processing with Hadoop:

http://www.umiacs.umd.edu/~jimmylin/book.html
-
Re: Hadoop Case Studies? Lance Norskog 2011-02-28, 02:20
This is an exercise that will appeal to undergrads: pull the Craigslist personals ads from several cities, and do text classification. Given a training set of all the cities, attempt to classify test ads by city. (If Peter Harrington is out there, I stole this from you.)

Lance

--
Lance Norskog
[EMAIL PROTECTED]
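The exercise above can be prototyped on a laptop before it is moved to a cluster. What follows is only a minimal sketch: the thread does not name a classifier, so multinomial naive Bayes with add-one smoothing is an assumption, as are the function names (`map_ad`, `reduce_counts`, `train`, `classify`) and the idea of passing in ads that have already been collected. The word counting is written in a map/reduce shape so it resembles what a Hadoop job would do, but nothing here calls Hadoop.

```python
import math
from collections import Counter, defaultdict


def map_ad(city, text):
    """Mapper: emit ((city, word), 1) for every word in one ad."""
    for word in text.lower().split():
        yield (city, word), 1


def reduce_counts(pairs):
    """Reducer: sum the counts emitted for each (city, word) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def train(labelled_ads):
    """Build per-city word counts and per-city ad counts from (city, text) pairs."""
    pairs = [kv for city, text in labelled_ads for kv in map_ad(city, text)]
    word_counts = defaultdict(Counter)
    for (city, word), count in reduce_counts(pairs).items():
        word_counts[city][word] = count
    ads_per_city = Counter(city for city, _ in labelled_ads)
    return word_counts, ads_per_city


def classify(text, word_counts, ads_per_city):
    """Return the city with the highest naive Bayes log-score for the ad text."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_ads = sum(ads_per_city.values())
    best_city, best_score = None, -math.inf
    for city, counts in word_counts.items():
        total_words = sum(counts.values())
        # Log prior: the fraction of training ads that came from this city.
        score = math.log(ads_per_city[city] / total_ads)
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_city, best_score = city, score
    return best_city
```

For a class, `train` would be given a list of `(city, ad_text)` pairs held out from the test ads, and `classify` would be run on each test ad to measure accuracy per city. On a real cluster the mapper and reducer would be separate programs; here they are ordinary functions so the logic can be checked quickly.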
-
Re: Hadoop Case Studies? Simon 2011-02-28, 03:23
I think you can also simulate the PageRank algorithm with Hadoop.

--
Regards,
Simon
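As a rough illustration of the PageRank suggestion, the sketch below simulates one map/shuffle/reduce round per iteration in plain Python. It is not Hadoop code and not anything posted in the thread: the function names, the damping factor of 0.85, and the fixed iteration count are all assumptions, and it further assumes every node has at least one out-link and every link target is itself a key in the graph (dangling nodes would need extra handling).

```python
from collections import defaultdict

DAMPING = 0.85  # conventional value; an assumption, not taken from the thread


def map_phase(node, rank, out_links):
    """Mapper: pass the link list along and send each neighbour a rank share."""
    yield node, ("links", out_links)
    for target in out_links:
        yield target, ("rank", rank / len(out_links))


def reduce_phase(values, num_nodes):
    """Reducer: sum incoming rank shares and apply the damping formula."""
    incoming = sum(payload for kind, payload in values if kind == "rank")
    return (1 - DAMPING) / num_nodes + DAMPING * incoming


def pagerank(graph, iterations=20):
    """Run a fixed number of simulated MapReduce rounds over an adjacency dict."""
    n = len(graph)
    ranks = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        # Shuffle: group every emitted value by its key, as Hadoop would.
        shuffled = defaultdict(list)
        for node, links in graph.items():
            for key, value in map_phase(node, ranks[node], links):
                shuffled[key].append(value)
        ranks = {node: reduce_phase(vals, n) for node, vals in shuffled.items()}
    return ranks
```

Calling `pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})` returns a dict of ranks that sum to 1 under the assumptions above. In an actual Hadoop job each iteration would be a separate MapReduce pass that writes the graph and ranks back to HDFS, which is why the mapper re-emits the link list alongside the rank shares.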
-
Re: Hadoop Case Studies? Ted Pedersen 2011-02-28, 03:31
Thanks for all these great ideas. These are really very helpful.

What I'm also hoping to find are articles or papers that describe what particular companies or organizations have done with Hadoop. How does Facebook use Hadoop for example (that's one of the case studies in the White book), or how does last.fm use Hadoop (another of the case studies in the White book).

One interesting resource is the list of "powered by Hadoop" projects available here:

http://wiki.apache.org/hadoop/PoweredBy

Some of these entries provide links to more detailed discussions of what an organization is doing, as in the following from Twitter:

http://www.slideshare.net/kevinweil/hadoop-pig-and-twitter-nosql-east-2009

So any additional descriptions of what specific organizations are doing with Hadoop (to the extent they are willing to share) would be really helpful (these sorts of "real world" cases tend to be particularly motivating).

Cordially,
Ted
-
Re: Hadoop Case Studies? Ted Dunning 2011-02-28, 04:38
At any large company that makes heavy use of Hadoop, you aren't going to find any concise description of all the ways that Hadoop is used.

That said, here is a concise description of some of the ways that Hadoop is (was) used at Yahoo:

http://www.slideshare.net/ydn/hadoop-yahoo-internet-scale-data-processing
-
RE: Hadoop Case Studies? Evert Lammerts 2011-02-28, 09:04
Hi Ted,

For what it's worth, here's a short article listing some of the cases that we (SARA, Dutch center for HPC) are supporting on our cluster at the moment:

http://blog.bottledbits.com/2011/01/sara-hadoop-pilot-project-use-cases-on-hadoop-in-the-dutch-center-for-hpc/

Cheers,
Evert Lammerts
Consultant eScience & Cloud Services
SARA Computing & Network Services
Operations, Support & Development
Phone: +31 20 888 4101
Email: [EMAIL PROTECTED]
http://www.sara.nl
-
RE: Hadoop Case Studies? Tom Deutsch 2011-02-28, 15:04
Ted - ping me offline and I'll help. Most of what we're doing is classified or client confidential, but there are some I can share.

------------------------------------------------
Tom Deutsch
Program Director
CTO Office: Information Management
Hadoop Product Manager / Customer Exec
BigInsights
IBM
3565 Harbor Blvd
Costa Mesa, CA 92626-1420
[EMAIL PROTECTED]
-
Re: Hadoop Case Studies? Ted Pedersen 2011-03-02, 18:58
Greetings all,

Since posting my original request I ran across the following, which is a nice example of what I'd call a case study. Gives a few details at least and is kind of an interesting or creative use of Hadoop...

http://engineering.foursquare.com/2011/02/28/how-we-found-the-rudest-cities-in-the-world-analytics-foursquare/

Enjoy,
Ted