From time to time I see some reduce tasks failing with this error:
Error: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
I don't see any issues in HDFS during this period. For example, for the specific node on which this happened, I checked the logs, and the only thing going on at that point was a pipeline recovery.
So I'm not quite sure how there can be no more good datanodes in a cluster of 15 nodes with a replication factor of three?
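For reference, I haven't changed the client-side settings the error message refers to, so they should be at their defaults. If it helps, this is roughly what relaxing the policy in `hdfs-site.xml` would look like (the values shown here are illustrative, not what I'm running):

```xml
<!-- Client-side settings controlling datanode replacement on pipeline failure -->
<configuration>
  <!-- Whether to attempt replacing a failed datanode at all (default: true) -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <!-- NEVER / DEFAULT / ALWAYS; DEFAULT only replaces under certain
       conditions (e.g. replication >= 3 and half the pipeline failed) -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>DEFAULT</value>
  </property>
  <!-- If true, continue writing with the remaining datanodes instead of
       failing when no replacement can be found -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
    <value>false</value>
  </property>
</configuration>
```

I'd rather understand why replacement is failing than just set `best-effort` to `true` and mask the problem.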