Re: I need some raw big data
Sujit Dhamale 2012-12-08, 05:08
Hi,
You can use National Climatic Data Center (NCDC) data, which is a good candidate for Hadoop. Below are the steps to download the data.

1. Create a folder on your local drive. I created "/home/sujit/Desktop/Data/".

2. Create the script below and run it (the {1901..2012} range needs bash):

    for i in {1901..2012}
    do
        cd /home/sujit/Desktop/Data/
        wget -r --no-parent --reject "index.html*" "http://ftp3.ncdc.noaa.gov/pub/data/noaa/$i/"
    done

Kind Regards
Sujit Dhamale
(+91 9970086652)

On Sat, Dec 8, 2012 at 4:05 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Hello Yin,
>
> You may find this interesting:
> https://github.com/unitedstates
>
> Regards,
> Mohammad Tariq
>
> On Sat, Dec 8, 2012 at 3:25 AM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
>
>> Another suggestion is Google Books Ngrams:
>>
>> http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
>>
>> On Fri, Dec 7, 2012 at 7:57 AM, Phillip Rhodes <[EMAIL PROTECTED]> wrote:
>>
>>> On Fri, Dec 7, 2012 at 10:48 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>>> >
>>> > On Fri, Dec 7, 2012 at 8:31 PM, Yin Steve <[EMAIL PROTECTED]> wrote:
>>> >> Hello, I'm Steve who need some raw big data for studying mapreduce
>>> >> programming. Where can i find them? especially those about weblog, traffic
>>> >> info etc. My English is not so well, if you can give me a URL which directly
>>> >> help me download the big file, That'll be great.
>>> >> Waiting for your reply......
>>>
>>> Try some of the links off of this Quora thread:
>>>
>>> http://www.quora.com/Data/Where-can-I-find-large-datasets-for-modeling-confidence-during-the-financial-crisis-which-is-open-to-the-public
>>>
>>> You might also try googling "Enron corpus". Or check out CommonCrawl.org.
>>>
>>> Phil
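
The download steps in the reply above can also be wrapped in a small reusable function. This is only a sketch: it assumes the NCDC URL from the post is still reachable, and the names `DATA_DIR`, `BASE_URL`, `DRY_RUN`, `year_url` and `download_years` are hypothetical, introduced here for illustration. It uses a plain `while` loop instead of the `{1901..2012}` brace range so that it also runs under a POSIX `sh`.

```shell
# Assumed defaults (not from the original post except the URL itself).
DATA_DIR="${DATA_DIR:-$HOME/ncdc-data}"
BASE_URL="${BASE_URL:-http://ftp3.ncdc.noaa.gov/pub/data/noaa}"

# Print the directory URL for one year, e.g. .../noaa/1901/
year_url() {
  printf '%s/%s/\n' "$BASE_URL" "$1"
}

# Fetch every year from $1 to $2 (inclusive) into DATA_DIR.
# The body runs in a subshell, so the cd and the loop variables
# do not leak into the calling shell.
# With DRY_RUN=1 it only prints the URLs it would fetch.
download_years() (
  first="$1"
  last="$2"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    mkdir -p "$DATA_DIR" && cd "$DATA_DIR" || exit 1
  fi
  year="$first"
  while [ "$year" -le "$last" ]; do
    url="$(year_url "$year")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "would fetch $url"
    else
      # Same wget options as in the post; a failed year does not stop the loop.
      wget -r --no-parent --reject "index.html*" "$url"
    fi
    year=$((year + 1))
  done
)
```

Running `DRY_RUN=1 download_years 1901 1903` first is a cheap way to check the URLs before starting the real download with `download_years 1901 2012`, which may take a long time and a lot of disk space.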