For the purposes of this post I'll assume we are working with two MariaDB servers configured for master/master replication, one of which has failed. In this case Server01 is healthy while Server02 has stopped replicating.
First we need to ensure that no queries are hitting Server02 / the failed server. This will typically mean stopping the services that use it or blocking their network access to it, e.g. stopping httpd.
We'll also want to ensure replication is stopped on the failed server (Server02):
SERVER02> stop slave;
Now on Server01 / the working server issue:
SERVER01> stop slave;
SERVER01> flush tables with read lock; (This temporarily blocks writes. Keep this session open, because the lock is released as soon as the client disconnects.)
SERVER01> show master status;
We'll make a note of the output of the above command - it should read something like:
File: mysql-bin.123456
Position: 123
Binlog_Do_DB: <replicated_database>
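If you'd rather not copy the values by hand, the recorded output can be turned into the CHANGE MASTER TO statement needed later on with a small shell helper. This is only a sketch: build_change_master is a made-up name, and it assumes the status was captured in the vertical \G layout (one "Field: value" pair per line), as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: reads the text of `show master status \G` on stdin
# and prints a CHANGE MASTER TO statement built from File and Position.
build_change_master() {
    awk -F': *' '
        # Drop the leading padding that the \G layout puts before field names.
        { gsub(/^[ \t]+/, "", $1) }
        $1 == "File"     { file = $2 }
        $1 == "Position" { pos = $2 }
        END {
            # Fail rather than print a half-filled statement.
            if (file == "" || pos == "") exit 1
            # \047 is the octal escape for a single quote.
            printf "CHANGE MASTER TO master_log_file=\047%s\047, master_log_pos=%s;\n", file, pos
        }
    '
}

# Demo using the sample values from this post.
build_change_master <<'EOF'
            File: mysql-bin.123456
        Position: 123
    Binlog_Do_DB: <replicated_database>
EOF
# prints: CHANGE MASTER TO master_log_file='mysql-bin.123456', master_log_pos=123;
```

With a live server the input could instead come from something like mysql -u root -p -e 'SHOW MASTER STATUS\G' piped into the function.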
Then, from a shell on Server01 / the working server, take a backup of the database:
SERVER01> mysqldump -u<username> -p --lock-tables --databases <database-name[s]> > export.sql
Copy export.sql over to Server02 / the failed server and restore the backup there:
SERVER02> mysql -u root -p < export.sql
Now on Server01 / the working server release the read lock so it starts accepting writes again:
SERVER01> UNLOCK TABLES;
Then on Server02 / the failed server point replication at the binary log coordinates we recorded from Server01 earlier (replace xxxxxx and yy with the File and Position values):
SERVER02> CHANGE MASTER TO master_log_file='mysql-bin.xxxxxx', master_log_pos=yy;
SERVER02> START SLAVE;
To verify we can issue the following (Slave_IO_Running and Slave_SQL_Running should both read Yes):
SERVER02> show slave status \G
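The slave status output is long, so it can help to pull out just the fields that matter. Below is a rough sketch of that check as a shell function. check_slave_threads is a hypothetical name, and it assumes the \G output is piped in on stdin. It succeeds only when both replication threads report Yes.

```shell
#!/bin/sh
# Hypothetical helper: reads the text of `show slave status \G` on stdin,
# prints the two thread states plus the lag, and returns non-zero unless
# both Slave_IO_Running and Slave_SQL_Running are "Yes".
check_slave_threads() {
    awk -F': *' '
        # Drop the leading padding that the \G layout puts before field names.
        { gsub(/^[ \t]+/, "", $1) }
        $1 == "Slave_IO_Running"      { io = $2 }
        $1 == "Slave_SQL_Running"     { sql = $2 }
        $1 == "Seconds_Behind_Master" { lag = $2 }
        END {
            printf "IO=%s SQL=%s lag=%s\n", io, sql, lag
            exit !(io == "Yes" && sql == "Yes")
        }
    '
}

# Demo with a trimmed, healthy-looking sample.
check_slave_threads <<'EOF'
     Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
EOF
# prints: IO=Yes SQL=Yes lag=0
```

Against a real server this could be fed from something like mysql -u root -p -e 'SHOW SLAVE STATUS\G', and the same check applies to Server01 once replication has been set up in the other direction.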
Now we need to do the reverse and make sure Server01 / the working server replicates from Server02 / the failed server. On Server02 issue:
SERVER02> show master status \G
Record the File and Position values from the output again.
Now on Server01 / the working server point replication at those coordinates:
SERVER01> CHANGE MASTER TO master_log_file='mysql-bin.xxxxxx', master_log_pos=yy;
SERVER01> START SLAVE;
and then to verify replication issue:
SERVER01> SHOW SLAVE STATUS \G
Finally, reverse anything you did at the start to block traffic to Server02 / the failed server, e.g. start services, update the firewall etc.