Adding OSDs to a cluster using the UVS manager is easy. However, if you are adding OSD servers to a production cluster, you should take care with the workflow to avoid triggering a PG remap storm that could impact your client services.


Steps to add OSD servers to the cluster safely:


  1. Create a staging root in the CRUSH map and deploy the new OSDs and servers into it.
  2. Create the CRUSH host and chassis buckets in the staging root:
    1. The CRUSH host bucket name is the server hostname.
    2. The CRUSH chassis bucket name is host-$hostname. For example, if the server hostname is "uvs-01", set the CRUSH chassis name to "host-uvs-01".
  3. Set the osd nobackfill and norebalance flags for safety.
  4. Create the new OSDs.
  5. Set osd_max_backfills to 1.
  6. Move the hosts to the production root.
  7. Unset the nobackfill and norebalance flags.
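
A minimal CLI sketch of the steps above, assuming a staging root named `staging`, a production root named `default`, and a new server named `uvs-01` (all names are illustrative):

```shell
# 1-2. Create the staging root and the chassis/host buckets under it
ceph osd crush add-bucket staging root
ceph osd crush add-bucket host-uvs-01 chassis
ceph osd crush move host-uvs-01 root=staging
ceph osd crush add-bucket uvs-01 host
ceph osd crush move uvs-01 chassis=host-uvs-01

# 3. Pause data movement before creating the OSDs
ceph osd set nobackfill
ceph osd set norebalance

# 4. Create the new OSDs here (via the UVS manager or ceph-volume),
#    placed under host uvs-01 in the staging root.

# 5. Throttle backfill to one PG per OSD at a time
ceph config set osd osd_max_backfills 1

# 6. Move the host into the production root; this triggers the remap
ceph osd crush move uvs-01 root=default

# 7. Let backfilling start
ceph osd unset nobackfill
ceph osd unset norebalance
```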


- `ceph osd set nobackfill` to disable initial backfilling


- `ceph config set osd osd_mclock_override_recovery_settings true` to override the mClock scheduler's backfill settings


- Let the orchestrator add one host at a time. I would wait between hosts until all the peering is done and only the backfilling is left over.

From my experience, adding a whole host is not a problem, unless you are hit by the pglog_dup bug (which was fixed in Pacific, IIRC).
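
One way to do this with the cephadm orchestrator, checking that peering has settled before moving on (hostname and address are illustrative):

```shell
# Add the next host to the cluster
ceph orch host add uvs-02 192.168.1.12

# Wait until no PGs are still peering or activating; only
# backfill_wait/backfilling states should remain before adding the next host
ceph pg stat
ceph -s
```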


- `ceph tell 'osd.*' injectargs '--osd-max-backfills 1'` to limit the backfilling as much as possible 


- `ceph osd unset nobackfill` to start the actual backfill process
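
After unsetting the flag, backfill progress can be watched with the standard status commands:

```shell
# Overall cluster health and recovery throughput
ceph -s

# Per-PG breakdown; watch the backfilling/backfill_wait counts drain
ceph pg stat

# Confirm the flags really are cleared
ceph osd dump | grep flags
```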


- `ceph config set osd osd_mclock_override_recovery_settings false` after backfilling is done.


I would restart all OSDs after that to make sure the OSDs pick up the correct backfill settings.
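
A rolling restart can be done per host so that only one failure domain is down at a time (the systemd unit name assumes a package-based deployment; under cephadm, use `ceph orch restart` with your OSD service name instead):

```shell
# Prevent OSDs from being marked out while they restart
ceph osd set noout

# On each OSD host, one host at a time, waiting for PGs to return
# to active+clean between hosts:
systemctl restart ceph-osd.target

# Once all hosts are done, re-enable normal out behaviour
ceph osd unset noout
```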

:) Make sure your mons have enough oomph to handle the workload.