Your best bet would be to take a look at Hadoop's synthetic load generator.
10^8 files would be a problem in most cases because you'd need a really
beefy NN (NameNode) for that (~48GB of JVM heap and all that). The biggest
one I've heard about holds something on the order of 1.15*10^8 objects
(files & dirs) and serves the largest Hadoop cluster in the world, in Yahoo!'s
production setup. You might want to check YDN (the Yahoo! Developer Network)
for more details about this case, I guess.
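
For a rough sense of why the NN heap becomes the limit, here is a
back-of-envelope sketch in Python. It assumes about 150 bytes of heap per
namespace object (file, directory, or block), which is a commonly quoted rule
of thumb and not a number from this thread; the function name and its
parameters are made up for illustration. Real usage depends on path lengths,
replication, and JVM overhead, so treat the result as a lower bound rather
than a sizing guide.

```python
def estimate_namenode_heap_gib(num_files, blocks_per_file=1, num_dirs=0,
                               bytes_per_object=150):
    """Rough NameNode heap estimate in GiB.

    bytes_per_object is an assumed rule-of-thumb value, not a measured one.
    Each file counts as one object plus one object per block.
    """
    objects = num_files * (1 + blocks_per_file) + num_dirs
    return objects * bytes_per_object / 2**30


if __name__ == "__main__":
    # 10^8 small files, one block each, directories ignored
    print(f"{estimate_namenode_heap_gib(100_000_000):.1f} GiB")
```

Even with these optimistic inputs the estimate lands in the tens of GiB, so a
figure like ~48GB once you add headroom is plausible.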
Hope it helps,
On Mon, May 30, 2011 at 10:44AM, ccxixicc wrote:
> Hi all
> I'm doing a test and need create lots of files ( 100 million ) in
> HDFS. I use a shell script to do this , it's very very slow, how to
> create a lot files in HDFS quickly?