HDFS, mail # dev - profiling hdfs write path


Re: profiling hdfs write path
Todd Lipcon 2012-11-26, 01:35
Hi Radim,

Currently the write path is CPU-intensive for a couple of reasons:
1) It doesn't yet use the native CRC code.
2) It makes several unnecessary copies and byte buffer allocations, both in
the client and in the DataNode.
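
To make the second point concrete, here is a minimal sketch (not the actual
HDFS code) of per-chunk checksumming done two ways: once with a fresh copy
and a fresh checksum object per chunk, and once in place over the original
array. The class name ChunkChecksums, both method names, and the use of
java.util.zip.CRC32 as a stand-in for Hadoop's own checksum classes are
assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration; none of these names come from the HDFS source.
public class ChunkChecksums {

    // Number of chunks needed to cover dataLength bytes.
    static int chunkCount(int dataLength, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        return (dataLength + chunkSize - 1) / chunkSize;
    }

    // Wasteful variant: copies every chunk into a new array and creates a
    // new CRC32 object before checksumming it.
    static long[] checksumsWithCopies(byte[] data, int chunkSize) {
        int n = chunkCount(data.length, chunkSize);
        long[] sums = new long[n];
        for (int i = 0; i < n; i++) {
            int off = i * chunkSize;
            int len = Math.min(chunkSize, data.length - off);
            byte[] copy = Arrays.copyOfRange(data, off, off + len); // extra allocation and copy
            CRC32 crc = new CRC32();                                // extra allocation
            crc.update(copy, 0, copy.length);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Leaner variant: checksums each chunk directly from the source array
    // and reuses a single CRC32 instance.
    static long[] checksumsInPlace(byte[] data, int chunkSize) {
        int n = chunkCount(data.length, chunkSize);
        long[] sums = new long[n];
        CRC32 crc = new CRC32();
        for (int i = 0; i < n; i++) {
            int off = i * chunkSize;
            int len = Math.min(chunkSize, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }
}
```

Both methods return the same checksums; the second simply avoids one array
allocation, one copy, and one object allocation per chunk. How much that
matters in practice would have to be measured with a profiler.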

There are open JIRAs for both, and I have a preliminary patch that helped
a lot, but it hasn't been a high priority. On most clusters, writes become
network-bound before they become CPU-bound. On the other hand, as 10GbE
becomes fairly common, this will probably matter more soon. I'm hoping
to find time to get back to finishing the patches in the next few months.

-Todd

On Sun, Nov 25, 2012 at 1:41 PM, Radim Kolar <[EMAIL PROTECTED]> wrote:

> Has anybody tried to profile why the HDFS write path is so CPU-intensive?
>

--
Todd Lipcon
Software Engineer, Cloudera