HDFS, mail # user - why did I achieve such poor performance of HDFS


Re: why did I achieve such poor performance of HDFS
Konstantin Boudnik 2009-08-03, 17:02
Hi Hao.

Thanks for the observation. I'll leave comments on your particular situation
to someone who knows HDFS better than I do, but I would like to ask you a
couple of questions:
   - Do you have that particular test in a completely separable form? I.e., is
it automated, and can it be reused easily by someone else?
   - Could you share this test with the rest of the community through a JIRA or
some other channel?
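
For illustration, a separable and reusable form of such a GET test might look
like the Python sketch below. It is only a sketch under stated assumptions: the
function names (hadoop_get, run_client, measure) are hypothetical, the only
Hadoop interface it relies on is the `hadoop fs -get` shell command, and it
simulates clients as threads on one machine rather than on separate client
hosts, so its numbers are not comparable to a multi-host run.

```python
import os
import random
import subprocess
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def hadoop_get(hdfs_path):
    """Fetch one file via the `hadoop fs -get` shell command; return its size in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        local = os.path.join(tmp, "out")
        subprocess.run(["hadoop", "fs", "-get", hdfs_path, local], check=True)
        return os.path.getsize(local)


def run_client(paths, fetch):
    """One simulated client: fetch each path in turn, return (bytes, seconds)."""
    start = time.perf_counter()
    total_bytes = sum(fetch(p) for p in paths)
    return total_bytes, time.perf_counter() - start


def measure(all_paths, n_clients, files_per_client, fetch=hadoop_get, seed=0):
    """Run n_clients concurrently; return each client's throughput in KB/s.

    Assumption: 1 KB = 1000 bytes. Clients are threads, not separate hosts.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    samples = [rng.sample(all_paths, files_per_client) for _ in range(n_clients)]
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        results = list(pool.map(lambda s: run_client(s, fetch), samples))
    return [b / 1000 / t if t > 0 else float("inf") for b, t in results]
```

Passing the fetch function as a parameter keeps the timing logic independent of
the cluster, so the harness can be checked with a stub before it is pointed at
a real HDFS instance.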

Thanks,
   Konstantin (aka Cos)

On 8/3/09 12:59 AM, Hao Gong wrote:
> Hi all,
>
> I have been using HDFS as a distributed storage system for an experiment,
> but in my tests I find that the performance of HDFS is very poor.
>
> I ran two scenarios. 1) Middle-size file test: I PUT 200,000 middle-size
> files (random sizes from 20KB to 20MB) into HDFS and had 10 clients GET
> 5,000 random files simultaneously. The average GET throughput per client
> was very poor (below roughly 14,000 KBps). 2) Large-size file test: I PUT
> 20,000 large files (random sizes from 250MB to 750MB) into HDFS and had
> 10 clients GET 100 random files simultaneously. The average GET
> throughput per client was also very poor (below roughly 12,500 KBps).
>
> I'm puzzled by these results. Why does HDFS perform so poorly? The
> throughput available to a client is far below the limit of the network
> bandwidth. Is there any parameter I need to change to get high
> performance from HDFS (I used the default parameter values)?
>
> My environment is as follows:
>
> 1) 30 common PCs as HDFS slaves (Core 2 E7200, 4 GB RAM, 1.5 TB HDD)
>
> 2) 10 common PCs as HDFS clients (Core 2 E7200, 4 GB RAM, 1.5 TB HDD)
>
> 3) One common PC as the HDFS master (Core 2 E7200, 4 GB RAM, 1.5 TB HDD)
>
> 4) A 1000 Mbps switch and links in a star network topology
>
> 5) The Hadoop version is 0.20.0; the JRE version is 1.6.0_11
>
> If anybody has researched the performance of HDFS, please contact me.
> Thank you very much.
>
> Best regards,
>
> Hao Gong
>
> Huawei Technologies Co., Ltd
>
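
To put the quoted figures next to the stated link speed, the short Python
calculation below converts a 1000 Mbit/s link rate to KB/s and expresses each
reported throughput as a fraction of it. It assumes that "KBps" in the message
means kilobytes per second with 1 KB = 1000 bytes, and it ignores protocol
overhead; the function names are illustrative only.

```python
# Assumption: "KBps" = kilobytes per second, 1 KB = 1000 bytes.
LINK_MBIT_PER_S = 1000  # the "1000M" switch and links from the environment list


def link_limit_kbytes_per_s(mbit_per_s):
    """Convert a link rate in Mbit/s to KB/s (8 bits per byte)."""
    return mbit_per_s * 1000 / 8


def fraction_of_link(throughput_kbytes_per_s, mbit_per_s=LINK_MBIT_PER_S):
    """Reported throughput as a fraction of the raw link rate."""
    return throughput_kbytes_per_s / link_limit_kbytes_per_s(mbit_per_s)


for label, kbps in [("middle-size files", 14000), ("large files", 12500)]:
    share = fraction_of_link(kbps)
    print(f"{label}: {kbps} KB/s is {share:.1%} of a 1000 Mbit/s link")
```

Under these assumptions the raw link limit is 125,000 KB/s, so the two reported
figures correspond to about 11.2% and 10.0% of a single gigabit link.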

--
With best regards,
Konstantin Boudnik (aka Cos)

         Yahoo! Grid Computing
         +1 (408) 349-4049

2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
Attention! Streams of consciousness are disallowed