How are your clusters behaving today? I hope they are running well and strong.
In the past, on some bad days, I saw 'Too many fetch-failures' errors; they
were fixed by raising dfs.datanode.max.xcievers to 6k.
I am still curious: how do we monitor the consumption of this value on
each datanode? I see it like any other system resource that should be
monitored and trended over time; a heatmap of this metric would probably
work. That way we could possibly pinpoint potential problems too.
Is this metric exposed in some way?
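
To make the idea concrete, here is a rough sketch of the kind of polling I have in mind. It rests on assumptions I have not verified for every release: that the datanode serves a JSON /jmx page on its HTTP port (50075 is only a guess at the default and differs between versions), and that one of the beans carries an attribute named XceiverCount. The function names are made up for illustration.

```python
import json
from typing import Optional
from urllib.request import urlopen


def extract_xceiver_count(jmx: dict) -> Optional[int]:
    """Return the first XceiverCount found in a /jmx JSON payload, else None.

    Assumes the payload has the shape {"beans": [{...}, ...]} and that the
    attribute is called "XceiverCount" (an assumption, check your version).
    """
    for bean in jmx.get("beans", []):
        if "XceiverCount" in bean:
            return int(bean["XceiverCount"])
    return None


def poll_datanode(host: str, port: int = 50075, timeout: float = 5.0) -> Optional[int]:
    """Fetch /jmx from one datanode and return its current xceiver count.

    The port default is an assumption; pass the HTTP port your datanodes use.
    """
    url = f"http://{host}:{port}/jmx"
    with urlopen(url, timeout=timeout) as resp:
        return extract_xceiver_count(json.load(resp))
```

Calling poll_datanode for every datanode on a fixed interval and storing (time, host, count) would give the rows needed for a heatmap, with the configured dfs.datanode.max.xcievers as the ceiling to compare against.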