How to enable Ceph RBD mirroring
There's lots of talk but not a lot of info online. This worked for me.
I'm enabling replication at pool level here. Note that only images that have the 'journaling' feature enabled will be mirrored, irrespective of the state of your pool. So to clarify, you need to enable replication on the pool AND enable the journaling feature on the image itself.
1 - Make sure the pool exists on the local AND remote clusters.
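In this case our local cluster is ceph (the default) and our remote cluster is adleast.
On the ADLEast cluster run the command below. Depending on your Ceph release you may also have to give a placement group count after the pool name.
ceph osd pool create ADLWEST-vms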
2 - Enable mirroring on the pool on both clusters
rbd mirror pool enable ADLWEST-vms pool
rbd mirror pool enable ADLWEST-vms pool --cluster adleast
3 - Add peers to the pool
rbd --cluster adleast mirror pool peer add ADLWEST-vms client.admin@ceph
rbd mirror pool peer add ADLWEST-vms client.admin@adleast
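If you have several pools to set up, steps 2 and 3 can be wrapped into one small function. This is only a sketch: the names `setup_pool_mirroring`, `run` and the `DRY_RUN` switch are my own and not part of Ceph, and it assumes `client.admin` is the peer user on both sides, as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: enable pool-mode mirroring and add peers in both directions.
# setup_pool_mirroring, run and DRY_RUN are hypothetical names, not part of Ceph.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: setup_pool_mirroring POOL LOCAL_CLUSTER REMOTE_CLUSTER
setup_pool_mirroring() {
    local pool=$1 local_cluster=$2 remote_cluster=$3

    # Step 2: enable mirroring in "pool" mode on both clusters.
    run rbd --cluster "$local_cluster" mirror pool enable "$pool" pool
    run rbd --cluster "$remote_cluster" mirror pool enable "$pool" pool

    # Step 3: each cluster registers the other one as its peer.
    run rbd --cluster "$remote_cluster" mirror pool peer add "$pool" "client.admin@$local_cluster"
    run rbd --cluster "$local_cluster" mirror pool peer add "$pool" "client.admin@$remote_cluster"
}

# Example (prints the four commands without touching either cluster):
# DRY_RUN=1 setup_pool_mirroring ADLWEST-vms ceph adleast
```

Running it with DRY_RUN=1 first lets you eyeball the commands before anything is changed.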
4 - Enable replication on the desired images in the pool
rbd feature enable ADLWEST-vms/VM-Cacti.raw journaling --journal-pool ADLWEST-journal
Note the --journal-pool argument: it lets you send all the journal data for that VM to a different pool, which might help reduce the performance impact of journaling/mirroring on your cluster. Your journal pool needs to be as fast as, if not faster than, the pool the image resides in, or else it will become a bottleneck. Also a ==really important gotcha==: if you are using KVM (or anything with cephx authentication, I guess), the user account you use to access the cluster (Cinder, for example) MUST have access to this journal pool, otherwise your IO will just hang inexplicably! Trust me, I learnt this one the hard way!
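If there are a lot of images, step 4 can be looped over the whole pool. A rough sketch, assuming every image in the pool should be mirrored and already has the exclusive-lock feature that journaling depends on; `enable_journaling_all` is just a name I made up.

```shell
#!/usr/bin/env bash
# Sketch: enable journaling on every image in a pool (hypothetical helper).
# usage: enable_journaling_all POOL JOURNAL_POOL
enable_journaling_all() {
    local pool=$1 journal_pool=$2 image

    rbd ls -p "$pool" |
    while IFS= read -r image
    do
        # Journaling requires exclusive-lock to be enabled on the image already.
        # stdin is redirected so rbd cannot swallow the rest of the image list.
        rbd feature enable "$pool/$image" journaling --journal-pool "$journal_pool" < /dev/null \
            || echo "could not enable journaling on $pool/$image" >&2
    done
}

# Example:
# enable_journaling_all ADLWEST-vms ADLWEST-journal
```

Before running it, it is worth checking the caps of whichever client user your hypervisor uses (something like `ceph auth get client.cinder`, user name assumed) so that it can reach the journal pool.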
Useful script
List the info on all images in a pool
#!/bin/bash
# checkMirrorStatus.sh: print the mirror status of every image in pool $1
rbd ls -p "$1" |
while IFS= read -r line
do
    rbd mirror image status "$1/$line"
done
Should yield a result like
bash checkMirrorStatus.sh ADLWEST-vms
ADLWest-RGW-LB02.raw:
  global_id: 4196f19b-3ddb-4dce-a15d-0a281898298d
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
ADLWest-RGW02.raw:
  global_id: 859a6377-9872-4f0f-9c5f-4cb69bcf101d
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
ADLWest-Tunnel1.raw:
  global_id: 0e36b8bd-cf07-42e7-8875-ea4e63f9dcfa
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
VM-ADLWest-PRTG.raw:
  global_id: 473c6d0a-4e6b-492b-a143-b240e4b6194d
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
VM-Cacti.raw:
  global_id: 1aac9fbd-0eb4-47bd-a7f7-c7e0adebb5a9
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
VM-OS-Net02.raw:
  global_id: ee6532e1-c11f-4728-a327-559e91eee39e
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
VM-SMTP01.raw:
  global_id: 4fb8a975-54e4-486a-a119-ae741c4163af
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:25:47
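With more than a handful of images that output gets long. One way to condense it is to pipe it through awk and keep just the image name and state. A sketch, assuming the status format rbd printed for me (an unindented image name ending in a colon, then an indented `state:` line); `summarize_states` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: reduce `rbd mirror image status` output to "image state" pairs.
# Reads the status text on stdin.
summarize_states() {
    awk '
        # Unindented line ending in ":" is the image name.
        /^[^ ]/ && /:$/ { sub(/:$/, ""); image = $0 }
        # Indented "state: ..." line belongs to the last image seen.
        $1 == "state:"  { print image, $2 }
    '
}

# Example:
# bash checkMirrorStatus.sh ADLWEST-vms | summarize_states
```

Anything that is not in the state you expect then stands out in a one-line-per-image list.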
Useful command
Show the status of your image replication
rbd mirror image status ADLWEST-vms/VM-Cacti.raw
VM-Cacti.raw:
  global_id: 1aac9fbd-0eb4-47bd-a7f7-c7e0adebb5a9
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 18:58:16
rbd mirror image status ADLWEST-vms/VM-Cacti.raw --cluster=adleast
VM-Cacti.raw:
  global_id: 1aac9fbd-0eb4-47bd-a7f7-c7e0adebb5a9
  state: up+syncing
  description: bootstrapping, IMAGE_COPY/COPY_OBJECT 21%
  last_update: 2017-06-25 18:58:50
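Rather than re-running the status command by hand while the initial copy runs, you could poll it. A rough sketch, assuming the non-primary side ends up in the up+replaying state once the copy is done (that is what I saw); `wait_for_replaying` and its interval and try-count arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: poll a non-primary image until the initial copy has finished.
# wait_for_replaying is a hypothetical helper, not an rbd subcommand.
# usage: wait_for_replaying POOL/IMAGE CLUSTER [INTERVAL_SECONDS] [MAX_TRIES]
wait_for_replaying() {
    local image=$1 cluster=$2 interval=${3:-30} tries=${4:-120} state

    while [ "$tries" -gt 0 ]
    do
        # Pull the value of the "state:" line out of the status output.
        state=$(rbd mirror image status "$image" --cluster "$cluster" |
                awk '$1 == "state:" { print $2 }')
        echo "$image on $cluster: ${state:-unknown}"
        [ "$state" = "up+replaying" ] && return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1    # gave up before the image reached up+replaying
}

# Example:
# wait_for_replaying ADLWEST-vms/VM-Cacti.raw adleast 60
```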
Then when the replication is done you'll see something like this
rbd mirror image status ADLWEST-vms/VM-Cacti.raw
VM-Cacti.raw:
  global_id: 1aac9fbd-0eb4-47bd-a7f7-c7e0adebb5a9
  state: up+stopped
  description: remote image is non-primary or local image is primary
  last_update: 2017-06-25 19:23:18
rbd mirror image status ADLWEST-vms/VM-Cacti.raw --cluster=adleast
VM-Cacti.raw:
  global_id: 1aac9fbd-0eb4-47bd-a7f7-c7e0adebb5a9
  state: up+replaying
  description: replaying, master_position=[object_number=21, tag_tid=0, entry_tid=57097], mirror_position=[object_number=6, tag_tid=0, entry_tid=10886], entries_behind_master=46211
  last_update: 2017-06-25 19:22:57
I believe that 'entries_behind_master' is how many journal entries the replica still has to replay to catch up with the master. In the output above it matches the difference between the two entry_tid values (57097 - 10886 = 46211). So a write-heavy VM might fall quite far behind the master, but an idle VM should show zero.
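To keep an eye on that lag, the number can be pulled out of the status output. A minimal sketch, assuming the description line keeps the format shown in the output above; `entries_behind` is a name I made up.

```shell
#!/usr/bin/env bash
# Sketch: pull entries_behind_master out of `rbd mirror image status` output.
# Reads the status text on stdin; prints nothing if the field is absent.
entries_behind() {
    sed -n 's/.*entries_behind_master=\([0-9][0-9]*\).*/\1/p'
}

# Example:
# rbd mirror image status ADLWEST-vms/VM-Cacti.raw --cluster=adleast | entries_behind
```

Fed the replaying output above, this would print 46211; on the primary side, where the field does not appear, it prints nothing.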