Difference between revisions of "Recital Replication Getting Started"

From Recital Documentation Wiki

Revision as of 00:34, 2 December 2009

Recital Replication Summary

There are two types of replication in Recital:

  • Master to Slave(s): updates are performed on one master server and published for slaves to subscribe to.
  • Peer to Peer: updates are performed on many servers and published for each server to subscribe to.

Master to Slave Replication (MSR)

This type of replication enables data from one Recital database server (the master) to be published so that one or more other Recital database servers (the slaves) can subscribe to it. Replication is asynchronous: replication slaves do not need to be connected permanently to receive updates published on the master. They only need to connect to the publisher when updating the subscription. This means that updates can occur over long-distance connections and even over temporary links such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases or even selected tables within a database.

Target uses for MSR in Recital include:

  • Scale-out solutions - spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves.
  • Data security - because data is replicated to the slave, and the slave can pause the replication process, it is possible to run backup services on the slave without corrupting the corresponding master data.
  • Analytics - live data can be created on the master, while the analysis of the information can take place on the slave without affecting the performance of the master.
  • Long-distance data distribution - if a branch office would like to work with a copy of your main data, you can use replication to create a local copy of the data for their use without requiring permanent access to the master.
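
The scale-out use above can be illustrated with a small routing sketch. This is not part of Recital: the class name ReplicatedRouter, the host names and the rule "anything that is not a SELECT is a write" are assumptions made purely for illustration.

```python
from itertools import cycle


class ReplicatedRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads if no slave is configured.
        self._readers = cycle(slaves or [master])

    def route(self, statement):
        # Simplification (assumption): only SELECT counts as a read.
        words = statement.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)
        return self.master


if __name__ == "__main__":
    router = ReplicatedRouter("master-host", ["slave-a", "slave-b"])
    for sql in ("INSERT INTO orders VALUES (1)", "SELECT * FROM orders"):
        print(sql, "->", router.route(sql))
```

Adding another slave to the list increases read capacity without changing where writes go, which is the property the scale-out use relies on.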

Peer to Peer Replication (PPR)

This type of replication enables data from one or more Recital database servers to be published so that one or more additional Recital database servers can subscribe to it. Replication is asynchronous: the replication subscription service does not need to be connected permanently to receive updates from the publication service. This means that updates can occur over long-distance connections and even over temporary links such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases or even selected tables within a database.

Target uses for PPR in Recital include:

  • Scale-out solutions - spreading the load among multiple servers to improve performance.

Enabling Replication

Before transactions can be published in the queue table, replication must be turned on. This is done by adding the SET REPLICATION ON command to a local or systemwide configuration file. The systemwide file is the config.db stored in the conf directory under the root Recital installation directory; for a local setting, update the config.db in the local directory. You may also add the command to any 4GL program file.


When replication is turned on, all tables that are already open and all tables opened afterwards are flagged for replication. You can turn replication off again to stop this, but if you only want to exclude certain tables from replication it is better to use the allow and deny access control lists. Index files associated with the tables are also updated. Multiple index tag files are handled automatically. Single index files are sorted by name and then appended to the table name when stored with the transaction.


To turn replication on, the Replication service must already be installed on the system. If it has not been installed, the error "You must configure the replication service first with 'dbservice RRS" will be returned.


The first time a table is flagged for replication, all rows must be updated with a sequence number. If the table contains a large number of rows, this process may take some time. Once it has been done, the table is flagged so that the process will not be repeated. All new rows inserted after this point will automatically receive their own unique sequence number. For the life of a table these numbers will never be duplicated.
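
The sequence numbering described above can be modelled with a short sketch. How Recital stores and generates these numbers internally is not documented here, so the class SequencedTable, the "_seq" field name and the counter starting at 1 are all assumptions for illustration only.

```python
from itertools import count


class SequencedTable:
    """Toy model of a table that is numbered once, then numbers new rows."""

    def __init__(self, rows):
        self.rows = [dict(r) for r in rows]
        self.flagged = False
        # A monotonically increasing counter never hands out a value twice.
        self._next_seq = count(1)

    def flag_for_replication(self):
        """Number every existing row; return how many rows were updated."""
        if self.flagged:
            return 0  # already done, the one-off pass is skipped
        for row in self.rows:
            row["_seq"] = next(self._next_seq)
        self.flagged = True
        return len(self.rows)

    def insert(self, row):
        row = dict(row)
        if self.flagged:
            row["_seq"] = next(self._next_seq)
        self.rows.append(row)
        return row


if __name__ == "__main__":
    table = SequencedTable([{"name": "a"}, {"name": "b"}])
    print(table.flag_for_replication(), "rows numbered")
    print(table.insert({"name": "c"}))
```

The one-off pass touches every existing row, which is why the first flagging of a large table is the slow step.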


Access Control Lists

If you wish to limit which tables are replicated, you can do so with access control lists. Access control lists are defined in two files stored in the conf directory under the root Recital installation directory. Comment lines can be added to these files by starting the line with the # symbol. Each access control specification is either a "directory path" or [database name!]tablename, where the database name is optional. Wildcards may also be specified: * matches any text and % matches a single character. Name expansion from environment variables can also be used.

Example controls

Example           Description
southwind!*       Includes all tables in the southwind database.
recital*          Includes all tables in all databases or directories that start with recital.
/tmp/*            Includes all tables in the directory "/tmp/".
${DB_TMPDIR}/*    Includes all tables in the directory expanded from the environment variable DB_TMPDIR.
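
The matching rules for these specifications can be sketched in Python. This is an illustrative reading of the rules stated above, not Recital's implementation: the function names are hypothetical, and both case-sensitive matching and matching each pattern against the whole qualified name (for example "southwind!orders" or a full path) are assumptions.

```python
import os
import re


def acl_pattern_to_regex(pattern):
    """Translate one ACL pattern into a compiled regular expression."""
    pattern = os.path.expandvars(pattern)  # expands ${DB_TMPDIR} and similar
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # '*' matches any run of text
        elif ch == "%":
            parts.append(".")    # '%' matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def parse_acl(text):
    """Parse ACL file contents, skipping blank lines and '#' comment lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        entries.append(acl_pattern_to_regex(line))
    return entries


def matches_any(name, entries):
    """True if the qualified table name matches at least one entry."""
    return any(rx.fullmatch(name) for rx in entries)


if __name__ == "__main__":
    os.environ.setdefault("DB_TMPDIR", "/var/tmp")
    acl = parse_acl("# demo list\nsouthwind!*\n${DB_TMPDIR}/*\n")
    for name in ("southwind!orders", "northwind!orders"):
        print(name, matches_any(name, acl))
```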


replication.allow

This file lists the names of the tables which will be replicated if REPLICATION is SET ON. If this list is not empty, only matching tables will be replicated.


replication.deny

This file lists the names of the tables which will NOT be replicated if REPLICATION is SET ON. If this list is not empty, any matching tables will not be replicated.
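
How the two lists combine can be sketched as a single decision function. The source does not say what happens when a table matches both lists, so the sketch assumes that deny takes precedence; that assumption, the function names and the use of Python's fnmatch module as a stand-in matcher are all illustrative only.

```python
from fnmatch import fnmatchcase


def _matches(name, patterns):
    # Stand-in matcher: map the '%' single-character wildcard onto
    # fnmatch's '?'. '*' already means "any text" in fnmatch.
    return any(fnmatchcase(name, p.replace("%", "?")) for p in patterns)


def is_replicated(name, allow, deny, replication_on=True):
    """Decide whether a table is replicated under the allow and deny lists."""
    if not replication_on:
        return False
    # A non-empty allow list restricts replication to matching tables.
    if allow and not _matches(name, allow):
        return False
    # Assumption: a deny match wins even if the allow list also matches.
    if deny and _matches(name, deny):
        return False
    return True


if __name__ == "__main__":
    allow = ["southwind!*"]
    deny = ["*!audit"]
    for table in ("southwind!orders", "southwind!audit", "northwind!orders"):
        print(table, is_replicated(table, allow, deny))
```

With both lists empty, every table is replicated while replication is on, which matches the default behaviour described under Enabling Replication.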

Starting Replication Services

There are two replication services that can be used in Recital: a subscriber and a publisher.


Subscriber

The subscriber service is used by any system that needs to update its databases from published data. In a Master Slave configuration each slave would start a subscriber service and connect to the published data on the Master server. In a Peer to Peer configuration each peer would start a subscriber service and connect to the published data on the publication server.


Publisher

The publication service is only used in a Peer to Peer configuration and only the master publisher would run this service. This service will retrieve the data from the replication queue tables on each peer. This service checks for conflicts before publishing the data for subscriber services.


Recital Database Server

In order for either of these services to work, the Recital Database Server must be installed and running on the system publishing the data. In a Master Slave configuration this would be the master server. In a Peer to Peer configuration a Recital Database Server must be running on each peer.


ODBC Data Source

In order for the replication services to connect to the Recital Database Server you must define an ODBC data source in the odbc.ini file. The data source name takes the form "Recital Replication Service on" followed by the node name. The driver name is Recital and the database format is ODBC:RECITAL: The following options can also be added;

Starting Master Slave Configuration

A Master Slave configuration in Recital works with the published data stored in the queue table of the replication database. The Recital engine on the master system (the source of the database changes) writes all updates, inserts, deletes, recalls, packs and zaps into the queue table of the replication database. These transactions are stored in an XML format in the queue table.


Slaves are configured to subscribe to the master and to execute the published events on the slave's local databases. The master plays a passive role in this scenario: once replication has been enabled, all statements are published in the replication queue. If required, you can configure the master to publish only the events that apply to particular databases or tables.


Each subscribed slave will get a copy of the changes published since the last time it connected. Slaves keep a record of the position within the published queue that they have read and processed. This means that multiple slaves can be connected to the master and executing different parts of the same published data. Because the slaves control this process, individual slaves can be connected and disconnected from the server without affecting the master's operation. Also, because each slave remembers its position within the published queue, it is possible for slaves to be disconnected, reconnect and then 'catch up' by continuing from the recorded position.
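
The position tracking described above can be modelled in a few lines. This is a conceptual sketch only: the classes PublishedQueue and Slave, and the use of a list index as the recorded position, are assumptions and do not reflect how the replication queue table is actually stored.

```python
class PublishedQueue:
    """Toy stand-in for the master's published queue (append-only)."""

    def __init__(self):
        self._events = []

    def publish(self, event):
        self._events.append(event)

    def read_from(self, position):
        # The master keeps no per-slave state; the slave supplies its position.
        return self._events[position:]


class Slave:
    """Each slave remembers how far into the queue it has processed."""

    def __init__(self, slave_id):
        self.slave_id = slave_id
        self.position = 0
        self.applied = []

    def sync(self, queue):
        """Apply everything published since the recorded position."""
        pending = queue.read_from(self.position)
        for event in pending:
            self.applied.append(event)  # stand-in for executing the event
            self.position += 1
        return len(pending)


if __name__ == "__main__":
    queue = PublishedQueue()
    for n in range(3):
        queue.publish(f"update {n}")
    a, b = Slave("a"), Slave("b")
    print("a caught up on", a.sync(queue))
    queue.publish("update 3")
    print("b caught up on", b.sync(queue))  # b was offline, reads all four
    print("a caught up on", a.sync(queue))  # a only needs the newest event
```

Because the position lives on the slave, a slave that disconnects costs the master nothing and simply resumes from its own bookmark later.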


Both the master and each slave must be configured with a unique id. In addition, the slave must be configured with information about the master host name, RTQ database name and position within that file.

Starting Master Publication service

On the system specified as the Master you must start the Publication Service first with the command;

When the publication service starts it will attempt to connect to all the subscribed peer servers. If a server is not online, the service sets the error status in the peer table for that server. It will try to establish a connection to unconnected servers each time it wakes to process transactions, so when a server comes back online the service will connect and process all transactions that have been waiting since it last connected.
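
The reconnect behaviour can be sketched as follows. The PeerTable class, its field names and the try_connect callable are hypothetical; the real peer table layout is not described here.

```python
class PeerTable:
    """Toy peer table tracking connection and error status per peer."""

    def __init__(self, peers):
        self.status = {
            peer: {"connected": False, "error": False} for peer in peers
        }

    def wake(self, try_connect):
        """One wake cycle: retry every peer that is not yet connected.

        try_connect is a caller-supplied function (an assumption of this
        sketch) returning True when the peer could be reached.
        """
        for peer, state in self.status.items():
            if state["connected"]:
                continue
            ok = try_connect(peer)
            state["connected"] = ok
            state["error"] = not ok  # flag unreachable peers, clear on success
        return [p for p, s in self.status.items() if s["connected"]]


if __name__ == "__main__":
    online = {"peer-a"}
    table = PeerTable(["peer-a", "peer-b"])
    print(table.wake(lambda p: p in online))  # peer-b is flagged with an error
    online.add("peer-b")
    print(table.wake(lambda p: p in online))  # peer-b connects on the next wake
```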

Starting Subscriber Service

The subscriber service sleeps for the specified number of seconds between each set of transactions. It then connects to the master publisher and retrieves all data published since it last connected. For each transaction that it processes it will attempt to open the table in the required mode: for ZAP and PACK transactions it must open the table exclusively, and for all others it will open it shared. If it cannot find the table specified via the path, or cannot open it in the required mode, an error will be returned. The Subscriber Service performs all required locks on tables so that existing Recital users can coexist.
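
The open-mode rule can be expressed as a small sketch. The transaction dictionaries, the open_table callable and the choice to continue with the next transaction after an error are assumptions made for illustration; the source only states that an error is returned.

```python
# Operations that, per the description above, need exclusive table access.
EXCLUSIVE_OPS = {"ZAP", "PACK"}


def required_open_mode(operation):
    """Return the open mode a transaction needs."""
    return "exclusive" if operation.upper() in EXCLUSIVE_OPS else "shared"


def process_batch(transactions, open_table):
    """Try to open the table for each transaction in its required mode.

    open_table(table, mode) is a hypothetical callable that raises OSError
    when the table is missing or cannot be opened in that mode.
    Returns a list of (transaction, message) pairs for the failures.
    """
    errors = []
    for tx in transactions:
        mode = required_open_mode(tx["op"])
        try:
            open_table(tx["table"], mode)
        except OSError as exc:
            errors.append((tx, str(exc)))
    return errors


if __name__ == "__main__":
    def fake_open(table, mode):
        if table == "missing":
            raise FileNotFoundError(f"cannot find table {table}")
        print(f"opened {table} in {mode} mode")

    batch = [
        {"op": "UPDATE", "table": "orders"},
        {"op": "PACK", "table": "orders"},
        {"op": "INSERT", "table": "missing"},
    ]
    print(process_batch(batch, fake_open))
```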

Starting replication services

The replication services are administered with the recitalreplication command. See recitalreplication for full details.