Another suggestion is Google Books Ngrams:
On Fri, Dec 7, 2012 at 7:57 AM, Phillip Rhodes <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 7, 2012 at 10:48 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> > On Fri, Dec 7, 2012 at 8:31 PM, Yin Steve <[EMAIL PROTECTED]> wrote:
> >> Hello, I'm Steve, and I need some raw big data for studying MapReduce
> >> programming. Where can I find it? I'm especially interested in web logs,
> >> info, etc. My English is not so good, so if you can give me a URL where
> >> I can download a big file, that would be great.
> >> Waiting for your reply.
> Try some of the links off of this Quora thread:
> You might also try googling "Enron corpus", or check out CommonCrawl.org.