Mediawiki/OAI mirror/pragmatic synchronisation

From Bjoern Hassler's website

This page explains how to do pragmatic two-way mirroring of two mediawikis.

This isn't a production-ready recipe; it's a highly experimental proof of concept. I've documented the steps, but I don't recommend doing this on anything other than an experimental mediawiki install. USE AT YOUR OWN RISK!

1 Installation

The idea is to install the OAI extension on both wikis: the remote-server and the local-client. It works like this:

  1. Install OAI extension on remote-server. See Mediawiki/OAI mirror/OAIRepository for instructions.
  2. Install a local mediawiki, and install the OAI extension on this local mediawiki ("local-client").
  3. Also install:
    • wikisync perl code: This is perl. It can live anywhere on your system, but it must be able to access the checkpoint code and the OAI updater code (in the OAI extension). It needs a number of perl modules.
    • checkpoint code: This is just php, and is easy. It goes into the OAI extension directory.
  4. Finally, run the OAI updater on the local-client to pull all data into the local-client.

2 Synchronisation

When you run the wikisync perl code, this happens:

  • The code checks the OAI extension ON THE CLIENT to see whether there were updates since the last 'checkpoint' (which is recorded in the OAI tables of the database).
  • It also checks for updates on the server.
  • The updates are compared:
    • If the two sets of modified pages are disjoint, we simply send the client-modified pages to the server, and then run the updater. This is fairly straightforward.
    • If there is overlap, diff3 is used to merge changes.
      • If the changes merge without conflicts, they are submitted.
      • If there are conflicts, a version with conflict markers is submitted, immediately followed by a version in which the conflicts are 'quashed'. (Not implemented yet: ideally the users who contributed to such a page locally would be notified, e.g. by a note on their user talk page, letting them know that they need to manually reinstate their conflicting edits.)
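
The decision logic above can be sketched roughly as follows. This is an illustration in Python, not the actual wikisync perl code: the function names `partition_changes` and `merge_with_diff3` are made up for this sketch, and it assumes that the two sets of modified page titles have already been obtained from the client and the server, and that GNU diff3 is on the path.

```python
import subprocess
from typing import Dict, Set, Tuple


def partition_changes(client: Set[str], server: Set[str]) -> Dict[str, Set[str]]:
    """Split modified page titles into the cases described above.

    Hypothetical helper: 'client' and 'server' are the titles modified
    on each side since the last checkpoint.
    """
    overlap = client & server
    return {
        "push": client - overlap,   # changed only locally: send to the server
        "pull": server - overlap,   # changed only remotely: the updater fetches these
        "merge": overlap,           # changed on both sides: needs a diff3 merge
    }


def merge_with_diff3(mine: str, base: str, theirs: str) -> Tuple[str, bool]:
    """Merge three files with 'diff3 -m'; return (merged text, has_conflicts).

    'base' is the page text at the last checkpoint, 'mine' the local
    version and 'theirs' the server version (all file paths).
    GNU diff3 exits with 0 on a clean merge, 1 on conflicts, 2 on trouble.
    """
    result = subprocess.run(
        ["diff3", "-m", mine, base, theirs],
        capture_output=True,
        text=True,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr)
    return result.stdout, result.returncode == 1
```

If `partition_changes` returns an empty "merge" set, we are in the straightforward disjoint case. Otherwise each overlapping page goes through `merge_with_diff3`, and the returned flag decides whether the result can be submitted directly or needs the conflict-marker treatment.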

3 Issues

Page moves aren't treated well: after a move, changes that may need to be merged sit in different page histories. So we need to detect page moves, and make sure that local changes are merged into the new page before the move is applied locally.
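
One possible way to handle this, sketched in Python under some assumptions: suppose the server-side moves since the last checkpoint are already available as a mapping from old title to new title (how to detect them is left open here, and `retarget_local_changes` is a made-up name). Each locally changed page is then pointed at the title it should be merged into, following chains of moves.

```python
from typing import Dict, Set


def retarget_local_changes(local_changed: Set[str], moves: Dict[str, str]) -> Dict[str, str]:
    """Map each locally changed title to the title its changes should be merged into.

    'moves' is a hypothetical old-title -> new-title mapping of page moves
    on the server. Chains such as A -> B -> C are followed to the end.
    """
    targets = {}
    for title in local_changed:
        target = title
        seen = set()  # guards against cycles such as A -> B -> A
        while target in moves and target not in seen:
            seen.add(target)
            target = moves[target]
        targets[title] = target
    return targets
```

Local changes would then be merged into the target title first, and only afterwards would the move itself be applied locally.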

Ideally this would be packaged into an extension that runs synchronisation from an edit hook, depending on whether connectivity is available.

This is a crude and horrendous process, and requires all sorts of tweaking. But for a seasoned php programmer, it shouldn't be too hard to tie all of this down properly.