I need some raw big data
Yin Steve 2012-12-07, 15:01
Hello, I'm Steve, and I need some raw big data for studying MapReduce programming. Where can I find some? I'm especially interested in weblogs, traffic info, etc. My English is not very good, so if you can give me a URL that lets me download a big file directly, that would be great. Waiting for your reply.
Re: I need some raw big data
Harsh J 2012-12-07, 15:48
You can find some real-world data samples at InfoChimps' data marketplace: http://www.infochimps.com/marketplace

-- Harsh J
Re: I need some raw big data
Phillip Rhodes 2012-12-07, 15:57
Try some of the links off of this Quora thread: http://www.quora.com/Data/Where-can-I-find-large-datasets-for-modeling-confidence-during-the-financial-crisis-which-is-open-to-the-public

You might also try googling "Enron corpus". Or check out CommonCrawl.org.

Phil
Re: I need some raw big data
Chris Nauroth 2012-12-07, 21:55
Another suggestion is Google Books Ngrams: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
Re: I need some raw big data
Mohammad Tariq 2012-12-07, 22:35
Hello Yin,

You may find this interesting: https://github.com/unitedstates

Regards,
Mohammad Tariq
Re: I need some raw big data
Sujit Dhamale 2012-12-08, 05:08
Hi,

You can use the National Climatic Data Center (NCDC) data, which is a good candidate for Hadoop. Below are the steps to download the data.

1. Create a folder on your local drive. I created "/home/sujit/Desktop/Data/".
2. Create the script below and run it:

    for i in {1901..2012}
    do
        cd /home/sujit/Desktop/Data/
        wget -r --no-parent --reject "index.html*" http://ftp3.ncdc.noaa.gov/pub/data/noaa/$i/
    done

Kind Regards,
Sujit Dhamale
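The download steps in the message above can also be sketched as a small reusable script. This is only an illustration, not the poster's script: the base URL and the wget options come from the message, while `DATA_DIR`, `ncdc_year_url`, and `fetch_years` are hypothetical names introduced here. It uses wget's `-P` option to set the target directory instead of calling `cd` inside the loop, and a plain `while` loop so it does not depend on bash brace expansion.

```shell
#!/bin/sh
# Sketch: fetch a range of NCDC yearly directories into one local folder.
# BASE_URL is taken from the message above. DATA_DIR and both function
# names are assumptions made for this example.

BASE_URL="http://ftp3.ncdc.noaa.gov/pub/data/noaa"
DATA_DIR="${DATA_DIR:-$HOME/ncdc-data}"

# Print the directory URL for a single year.
ncdc_year_url() {
  printf '%s/%s/\n' "$BASE_URL" "$1"
}

# Download every year from $1 to $2 (inclusive) into DATA_DIR.
fetch_years() {
  year="$1"
  end="$2"
  mkdir -p "$DATA_DIR" || return 1
  while [ "$year" -le "$end" ]; do
    # -P sets the download directory, so no cd is needed in the loop.
    wget -r --no-parent --reject "index.html*" -P "$DATA_DIR" \
      "$(ncdc_year_url "$year")" \
      || echo "warning: year $year failed, continuing" >&2
    year=$((year + 1))
  done
}

# Nothing is downloaded until fetch_years is called, for example:
#   fetch_years 1901 1902
```

Starting with one or two years is a reasonable way to check that the server still answers at this address before asking for the whole 1901 to 2012 range.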
Re: I need some raw big data
Bruce Durling 2012-12-07, 15:55
You can also have a play with some open data from the UK COINS http://data.gov.uk/dataset/coins or have a look around the NHS Information Centre http://www.ic.nhs.uk/

cheers,
Bruce

--
@otfrom | CTO & co-founder @MastodonC | mastodonc.com