https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/37IHL7P2HAE5CXMPK4UJXEUEYVHSWJ5A/


Just in case anybody is interested: Using dm-cache works and boosts performance -- at least for my use case. 


The "challenge" was to get 100 (identical) Linux-VMs started on a three node hyperconverged cluster. The hardware is nothing special, each node has a Supermicro server board with a single CPU with 24 cores and 4 x 4 TB hard disks. And there's that extra 1 TB NVMe... 


I know that the general recommendation is to use the NVMe for WAL and DB (metadata), but this didn't seem appropriate for my use case, and I'm still not quite sure about the failure scenarios with that configuration. So instead I made each drive a logical volume (managed by one OSD) and added 85 GiB of NVMe to each LV as a read-only cache. 
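
For anyone who wants to reproduce it, the per-OSD layout looks roughly like this. This is only a sketch: device paths, VG/LV names and the NVMe partitioning are my assumptions, and since plain dm-cache has no strict "read-only" mode, writethrough is shown as the closest equivalent (reads are cached, writes always hit the HDD as well, so no dirty data ever sits only on the NVMe):

    # First OSD; repeat for sdb..sdd and nvme0n1p2..p4.
    # A PV belongs to exactly one VG, so the 1 TB NVMe is assumed to be
    # split into one partition per HDD/OSD (each large enough for the
    # 85 GiB cache plus cache-pool metadata).
    vgcreate ceph-vg0 /dev/sda /dev/nvme0n1p1

    # Data LV fills the HDD, 85 GiB cache pool goes on the NVMe partition
    lvcreate -l 100%PVS -n osd0 ceph-vg0 /dev/sda
    lvcreate --type cache-pool -L 85G -n cache0 ceph-vg0 /dev/nvme0n1p1
    lvconvert --type cache --cachepool ceph-vg0/cache0 --cachemode writethrough ceph-vg0/osd0

    # Hand the cached LV over to Ceph
    ceph-volume lvm create --data ceph-vg0/osd0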


Each VM uses, as its system disk, an RBD image cloned from a snapshot of the master image. The idea was that with this configuration all VMs share most (actually almost all) of the data on their system disk, and that this data would be served from the cache. 
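
The provisioning itself is just standard RBD snapshot cloning, roughly like this (pool and image names are made up for illustration):

    rbd snap create rbd/master@base
    rbd snap protect rbd/master@base
    rbd clone rbd/master@base rbd/vm001-disk    # repeat per VM

Because the clones are copy-on-write, all 100 system disks initially reference the same objects of the master image, which is what lets a comparatively small cache cover nearly all of the boot reads.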


Well, it works. When booting the 100 VMs, almost all read operations are satisfied from the cache, so I get close to NVMe speed but have paid only for conventional hard drives (well, SSDs aren't that much more expensive nowadays, but the hardware is four years old). 
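
Whether the reads really come from the NVMe can be checked per OSD LV, e.g. with dmsetup (device name derived from the VG/LV names assumed above); the status line of a dm-cache target includes read hit and read miss counters:

    dmsetup status ceph--vg0-osd0

lvs can report the same counters through its cache_read_hits / cache_read_misses fields.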


So, nothing sophisticated, but as I couldn't find anything about this kind of setup, it might be of interest nevertheless. 



 - Michael