We are somewhat new to Hadoop and are looking to run some experiments with HDFS, Pig, and HBase.
With that in mind, I have a few questions:
What is the easiest (preferably free) Hadoop distro to get started with? Cloudera?
What host OS distro/release is recommended?
What is the easiest environment to get started with? Amazon EC2? Does anyone offer virtual or hosted prebuilt Hadoop instances?
Where can we find some "big data" files that people have used for testing?
Feel free to RTFM me to the right place ;-)