Managing Checkpoints in a Clustered Virtuoso Server

What?

Managment of checkpoints across nodes in a Virtuoso clustered server environment.

Why?

The default checkpoint mode of a Virtuoso cluster server instance might not align with the expectations of specific usage scenarios.

How?

The checkpoint on cluster by default is the usual checkpoint setting in the virtuoso INI file, therefore each cluster node can do automatic checkpoints for itself.

By default checkpointing of a Virtuoso clustered server is performed manually by each node of the cluster, as specified in the Virtuoso configuration file of each node, and thus may not always be in sync.

The master node of the cluster can be configured to configured with a scheduled event to run the cl_exec('checkpoint') function to force all nodes of the cluster to perform a checkpoint simultaneously, there by ensuring the data across all nodes is kept in sync. If this behaviour is not desirable, then the operator can disable auto checkpoint, and make a scheduled task on one of the cluster nodes to execute:


SQL> cl_exec('checkpoint');

Done. -- 13901 msec.
SQL>

The automatic checkpoints on each individual nodes of the cluster set in there Virtuoso configuration files can then be disabled as they are no longer required.

Note, it is recommended a checkpoint be run after the loading of RDF or other datasets into a Virtuoso cluster, or after any large job has been run against a Virtuoso cluster, to ensure all transactions are fully committed, and prevent any data loss when the cluster is restarted.

Related