Recital Replication Configuration

Recital Engine Configuration

Queue Table

In Recital, replication works by storing transactions for publication in the queue table, which is stored in the replication database. The replication queue table meta data is defined below.

Queue Table
Column Name   Type        Width   Description
PUBLISHER     Character   20      Publisher name
PUBLICATION   Character   60      Publication number
TRANID        Integer     8       Transaction ID
DATA          Object      8       Transaction XML file
CMDTYPE       Integer     4       Command type number
INSERTDATE    DateTime    8       Insert date
CMDSTATE      Character   1       Current transaction state
TRANKEY       Character   73      Transaction key
TRANTYPE      Character   1       Transaction type


Publisher column

This is the node name of the system that the transaction was processed on.


Publication column

This is the publication ID of the system that the transaction was processed on. This ID is specified during installation and is unique among all the systems that the replication services are installed on. It can be found in the recital.conf file and is defined by the DB_PUBLICATIONID variable. This ID should not be changed once the replication service has been started on the system, as it is used to determine which system the transaction was processed on.
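
For example, the entry in recital.conf might look like this (a minimal excerpt assuming the usual variable=value form; the value 01 is illustrative and must differ on each system):

DB_PUBLICATIONID=01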


TranID column

The transaction ID is a unique sequence number for each transaction in the queue table.


Data column

Each transaction is contained in an XML file which is stored in this column. The XML format of the transaction is explained in the XML format section.


CMDType column

This is the numeric value of the command for this transaction. A full list of supported commands and their numeric values can be found in the commands section.


InsertDate column

This contains the date and time that the transaction was performed.


CMDState column

The current state of the transaction. When a transaction is first inserted into the queue table the state is set to W. This indicates that this transaction is in a wait state. When new transactions are performed on the same row from the same table this flag is checked to see if the transaction can be optimized. When this transaction has been processed by a Slave or Peer service the state is set to P. Once a transaction's state has been set to Processed no more optimization can be performed on this row. See the section on optimization for more information.


TranKey column

A unique key to identify transactions performed on a row in a table. This key is made up of the table name and any open single index files, plus the sequence number of the row that the transaction was performed on.


TranType column

This is used to specify the type of transaction in the queue table. This value is always set to Q for the replication service.
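
Because the queue is an ordinary Recital table held in the replication database, pending transactions can be inspected with SQL. The following is a sketch only, assuming the replication database is the current database and the queue table is named queue; the W state marks waiting transactions, as described above:

SELECT publisher, tranid, cmdtype, insertdate, trankey
  FROM queue
 WHERE cmdstate = 'W';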

Queue table indexes

The queue table contains the following indexes:

Queue Table Indexes
Expression                 Tag Name
CMDSTATE+TRANKEY+SYNCNUM   CMDSTATE
SYNCNUM                    SYNCNUM

XML format

Each transaction is stored as an XML file in the DATA column, which is a large binary object type. The XML file is defined by the following elements and attributes.

_rep_tn attribute

The RECITAL_DATAEXCHANGE element has an attribute called _rep_tn which contains the table name for the transaction. The target name is defined when the table being replicated is opened. The format of the target attribute depends on how the table is opened. Tables with the same name on the same system can be replicated because the table name is prefixed with either the database name or the directory path. The following table explains more about the target attribute format.

Target Attribute
Format                    Description
databasename!tablename    This is the format if the table is in a database. Tables being replicated this way must be located in the same database on each system.
directorypath tablename   This is the format if the table is in a directory. Tables being replicated this way must be located in the same directory on each system. Directory links can be used to match the directory paths.
tablename,indexname,...   If single index files are open on the table, their names are sorted in ascending order before being appended in a comma-separated list to the end of the table name.
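
For example, the accounts table opened from the finance database (as in the XML example below) would have the target finance!accounts; a free table accounts opened with single index files named region and state would have a target of the form accounts,region,state (the index names here are illustrative).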

RECORD content element

This element contains the record contents.


_rep_tt attribute

Each record element must contain a transaction type attribute. The transaction type can be one of the following:

Transaction Types
Transaction Type
INSERT
PACK
DELETE
RECALL
ZAP
UPDATE
ALTER


_rep_wc attribute

If the transaction type (_rep_tt attribute) is UPDATE, the record element must also contain a _rep_wc attribute. The format of this attribute is SYNCNUM=syncnum. This where condition is used when the slave replication service performs the update on the target table.
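
For example, an UPDATE record addressing the row whose sync number is 01000000000000A2 (a value borrowed from the example later in this section) might carry the attribute as sketched below; placing it on the RECORD element is an assumption based on the description above:

<RECORD _rep_tt="UPDATE" _rep_wc="SYNCNUM=01000000000000A2">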

_rep_id text element

Each record element must contain a text element called _rep_id. The text contained in this element is the unique publication ID from the system that the transaction originated from. This value is used in the peer to peer configuration to determine if the transaction being processed by the subscription service originated from the same system. If this is the case the transaction will not be processed.


This ID is specified during installation and is unique between all the systems that the replication services are installed on. It can be found in the recital.conf and is defined by the DB_PUBLICATIONID variable. This ID should not be changed once the replication service has been started on the system as it is used to determine which system the transaction was processed on.


_rep_sn text element

Each record must contain a text element called _rep_sn. This value can be used to identify the row on which to perform the transaction.


Field name text elements

If the transaction type in the record element is an INSERT or UPDATE, then an element for each column that is being changed will be included. Only columns that have been changed will be added; see the optimization section for more information. Each field name text element contains a pair of values separated by a vertical bar (|). A before image of the data is stored with the transaction for the peer to peer configuration, in case a conflict occurs and a change must be rolled back. For example, a change to a column named balance from 350 to 125 would be formatted like this:

<BALANCE>125|350</BALANCE>

All text stored in field name elements is converted to hex so that it will not conflict with any XML directives.


The following is an example XML file for the SQL command:

INSERT INTO accounts (last_name, first_name) VALUES ('Jefferson', 'Frank');

<RECITAL_DATAEXCHANGE _rep_tn="finance!accounts">
    <RECORD _rep_tt="INSERT">
        <_rep_id>01</_rep_id>
        <_rep_sn>01000000000000A2</_rep_sn>
        <LAST_NAME>|02A9E02</LAST_NAME>
        <FIRST_NAME>|AB075F1</FIRST_NAME>
    </RECORD>
</RECITAL_DATAEXCHANGE>

SYNCNUM

When replication is enabled in the Recital Engine, a unique sequence number must be added for each row in every table being replicated. The first time a table is flagged for replication, all rows must be updated with a sequence number; if the table contains a large number of rows, this process may take some time. You can also issue the recitalconvert replicate command at the OS prompt to convert all the tables in the current directory. Once a table has been converted, it is flagged so this process will not be done again. All new rows inserted after this point will automatically contain their own unique sequence number. For the life of a table these numbers will never be duplicated.
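
For example, to pre-convert all tables held in a data directory before enabling replication (the directory path here is illustrative; the command is issued exactly as named above):

# cd /usr/data/accounts
# recitalconvert replicate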


The unique publication ID is also included with the sequence, so that this number is not only unique for this table but is unique among all systems replicating this table in a peer to peer configuration. In a master slave configuration this is not as important, as only one system will ever be inserting rows.

Optimization

There are several specific optimization procedures enabled for the replication process.


Record buffer

When replication is enabled on a table, a before image record of the row is kept. After each update on the row, this image is compared with the new data and if they are both the same the transaction is not added to the queue. This prevents multiple updates on the same row with the same data being added to the replication queue.


XML attributes

When replication is enabled on a table, a before image record of the row is kept. If the transaction type in the record element is an INSERT or UPDATE, then an element for each column is included in the XML file. Only columns that are different between the before image record and the new data are added to the transaction.


This has several benefits. The XML file that contains the replication transaction will only contain the data being changed, so generally it will be smaller. This helps with the storage size of the queue tables and with network traffic for the slave subscription service. In a peer to peer configuration it also helps limit the chance of conflicts, as the replication is being done at a column level.


Multiple transactions

The queue table contains a column used to store the current transaction state. When the first transaction is added for a row on a table, this state is set to wait. It will remain this way until a slave or peer service processes the transaction and sets the state of the transaction to Processed. When new UPDATE transactions are added to the queue, existing transactions for that row which are in a wait state are searched for. If a transaction is found for that row in a wait state and the transaction is an INSERT or UPDATE then the new transaction is merged in. If the column is not found in the current XML file it will be added. If the column is already in the XML then the data is updated with the new data.

If the transaction that is found is an INSERT, then it will remain an INSERT transaction and the data from the UPDATE transaction will be merged into it. In this case the transaction queue will have no record of an UPDATE being done.
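
As a schematic illustration (hypothetical column and values, shown unencoded in the same style as the field name element example above): if a waiting UPDATE already carries <BALANCE>125|350</BALANCE> and a later UPDATE on the same row changes FIRST_NAME, the FIRST_NAME element is simply added to the waiting transaction; if the later UPDATE changed BALANCE again, the existing BALANCE element's data would be updated instead, so only one queue entry remains.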


Statements

Transactions that are performed by the Recital Database Engine are stored in the queue table for publication. The following table lists the statements and commands that are currently supported for replication.

Supported Statements
Statement    Type Number
APPEND       5042
DELETE       5022
EDIT         5082
INSERT       5042
MERGE        5042
PACK         33
ZAP          82
REPLACE      5082
RECALL       36
READ         5082
BROWSE       5082
SQL INSERT   4042
SQL UPDATE   5082
SQL DELETE   5022

Enabling Replication

IMPORTANT: You must add the directory path or database name that contains the tables you want to replicate into the replication.allow file. Only tables that match in this file and are not matched in the replication.deny file will be replicated.

In order for transactions to be published in the queue table, replication must first be set on. This can be done by adding the SET REPLICATION ON command to a local or systemwide configuration file: the config.db stored in the conf directory, under the root Recital installation directory, is used for systemwide settings, or you can update the config.db in the local directory. You may also add the command to any 4GL program file.
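
For example, the systemwide config.db (or a local config.db) would simply contain the line below, alongside any other settings already in that file:

SET REPLICATION ON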


When replication is turned on all tables already open and all tables opened afterwards are flagged for replication. You may turn replication off to disable replication, but if you want to filter out tables from replication it's better to use the allow and deny access control lists. Index files associated with the tables are also updated. Multiple index tag files are handled automatically. Single index files are sorted by name and then added onto the table name when stored with the transaction.


To turn replication on you must have already installed the Replication service on the system. If this has not been done, the error "You must configure the replication service first with 'recitalreplication rrs'" will be returned.


The first time a table is flagged for replication, all rows must be updated with a sequence number; if the table contains a large number of rows, this process may take some time. Once a table has been converted, it is flagged so this process will not be done again. All new rows inserted after this point will automatically contain their own unique sequence number. For the life of a table these numbers will never be duplicated.

Access Control Lists

You must add the directory path or database name that contains the tables you want to replicate into the replication.allow file. Only tables that match in this file and are not matched in the replication.deny file will be replicated.

If you wish to limit which tables are replicated you may do this with access control lists. Access control lists are defined in two files stored in the conf directory, which is in the root Recital installation directory. Comment lines can be added to these files by starting the line with the # symbol. The access control specification is "directory path" or [database name!]tablename, where an optional database name may be specified. Wild cards may also be specified, with '*' for all text matches and '%' for single character matches. Name expansion from environment variables can also be used.

Example controls
Example          Description
southwind!*      This would include all tables in the southwind database.
/tmp/*           This would include all tables in the directory "/tmp/".
${DB_TMPDIR}/*   This would include all tables in the directory expanded from the environment variable DB_TMPDIR.


replication.allow

This file describes the names of the tables which will be replicated if REPLICATION is SET ON. If this list is not empty then only tables matching will be replicated.

If this file is empty then no tables will be replicated even if SET REPLICATION is ON.
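
For example, a replication.allow file that replicates all tables in the southwind database plus one data directory might look like this (the directory path is illustrative):

# Replicate every table in the southwind database
southwind!*
# Replicate every table in this data directory
/usr/data/accounts/*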

replication.deny

This file describes the names of the tables which will NOT be replicated if REPLICATION is SET ON. If this list is not empty then any tables matching will not be replicated.

Replication Services Configuration

There are two replication services that can be used in Recital: a subscriber and a publisher.


Subscriber

The subscriber service is used by any system that needs to update its databases from published data. In a Master Slave configuration each slave would start a subscriber service and connect to the published data on the Master server. In a Peer to Peer configuration each peer would start a subscriber service and connect to the published data on the publication server.


Publisher

The publication service is only used in a Peer to Peer configuration and only the master publisher would run this service. This service will retrieve the data from the replication queue tables on each peer. This service checks for conflicts before publishing the data for subscriber services.


Recital Database Server

In order for either of these services to work, the Recital Database Server must be installed and running on the system publishing the data. In a Master Slave configuration this would be the master server. In a Peer to Peer configuration a Recital Database Server must be running on each peer.


ODBC Data Source

In order for the replication services to connect to the Recital Database Server you must define an ODBC data source in the odbc.ini file. The data source name format is "Recital Replication Service on" plus the node name. The Driver name is Recital and the Database entry starts with ODBC:RECITAL:, to which the following options can be appended:


ServerName

This is the publisher's server name, i.e. the server that is running the Recital Database Server.


UserName

This is the username used to log in on the publisher server. This is optional here as it can be specified as a command line argument when starting the service.


Password

This is the password for the username used to log in on the publisher server. This is optional here as it can be specified as a command line argument when starting the service.


Database

This is the default database that will be opened when the service has connected to the Recital Database Server. This should be set to replication.

Exclusive

This specifies how the replication database is opened and should be set to false so that it is opened shared.


odbc.ini example

[Recital Replication Service on jazz]
Driver = Recital
DATABASE=ODBC:RECITAL:SERVERNAME=jazz.recital.local;USERNAME=replication;PASSWORD=recital;DATABASE=replication;EXCLUSIVE=FALSE

Initial data setup

Before starting the replication services, the best way to initially load the data from the master server onto the peers or slaves is to use the Recital recitaldump and recitalrestore commands:


# recitaldump -d /directory/path
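
The resulting dump can then be loaded on each slave or peer with recitalrestore. The following is a sketch only, assuming recitalrestore accepts the same -d directory option shown for recitaldump above:

# recitalrestore -d /directory/path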

recitalreplication

The replication services are administered with the recitalreplication command. See recitalreplication for full details.

Master Slave Configuration

A Master Slave configuration in Recital works with the published data stored in the queue table of the replication database. The Recital engine on the master system (the source of the database changes) writes all updates, inserts, deletes, recalls, packs and zaps into the queue table of the replication database. These transactions are stored in an XML format in the queue table.


Slaves are configured to subscribe to the master and to execute the published events on the slave's local databases. The Master is dumb in this scenario. Once replication has been enabled, all statements are published in the replication queue. If required, you can configure the Master to only publish events that apply to particular databases or tables.


Each subscribed slave will get a copy of the changes published since the last time it connected. Slaves keep a record of the position within the published queue that they have read and processed. This means that multiple slaves can be connected to the master and executing different parts of the same published data. Because the slaves control this process, individual slaves can be connected and disconnected from the server without affecting the master's operation. Also, because each slave remembers its position within the published queue, it is possible for slaves to be disconnected, reconnect and then 'catch up' by continuing from the recorded position.


Both the master and each slave must be configured with a unique id. In addition, the slave must be configured with information about the master host name, RTQ database name and position within that file.


Starting MSR

You should start the Master Slave Replication processes in the following order:

Starting Master

On the system specified as the Master you must start the Recital Database Server first with the command:

# recitalreplication start

Starting Slaves

On the systems specified as Slaves you must start the Subscriber Service with the command:

# recitalreplication start

Checking status

Once you have started the subscriber services on the Slaves you can check their current status with the command:

# recitalreplication status

An example output for the Slave Subscriber Service would be:

Subscriber service 'port058' is running on a 60 second delay.
Connected to publisher 'jazz' in a Master Slave configuration.
Current transaction queue size is 0
Last transaction processed was 0 at
Remaining transactions in queue None

Processing Transactions

The subscriber service will sleep the specified number of seconds between each set of transactions. Then it will connect to the master publisher and retrieve all published data since it last connected. For each transaction that it processes it will attempt to open the table in the required mode. For ZAP and PACK transactions it must open the table exclusively; for all others it will open it shared. If it cannot find the table specified via the path, or cannot open it in the required mode, an error will be returned.


When the subscriber has to perform changes to existing rows in tables, it uses the sync number specified in the transaction to locate the row in the specified table. As all sync numbers are in ascending order, it uses an algorithm that locates the row by repeatedly narrowing the search into smaller sections.

Peer to Peer Configuration

A Peer to Peer configuration is similar in some ways to an MSR configuration in that you still need to define a master publication server. This server holds the master published data, which is updated by the publication service from each peer.


Master Publisher Queue Table

Replication works by storing transactions for publication in the pub_queue table, which is stored in the replication database. The replication pub_queue table meta data is defined below and is the same as the local queue table.

Queue Table Meta Data
Column Name   Type        Width   Description
PUBLISHER     Character   20      Publisher name
PUBLICATION   Character   60      Publication number
TRANID        Integer     8       Transaction ID
DATA          Object      8       Transaction XML file
CMDTYPE       Integer     4       Command type number
INSERTDATE    DateTime    8       Insert date
CMDSTATE      Character   1       Current transaction state
TRANKEY       Character   73      Transaction key
TRANTYPE      Character   1       Transaction type


Each peer server will keep its own local queue table in its own replication database for updates and changes performed on that server only. In this way each peer is also the same as a Master in an MSR configuration. For more information on the local queue table see the section Recital Engine Configuration.


The publication service that runs on the master publisher will subscribe to every peer server. It will add all the data from the queue table into the master queue table called pub_queue.

Peer Table

Configuration and status information about each peer is stored in the peer table. The peer table meta data is defined below.

Peer Table Meta Data
Column Name      Type        Width   Description
PEERNAME         Character   20      Peer name
PUBLICATION      Character   60      Publication number
PEERSYNCNUM      Character   16      Last peer sync number
PEERLASTUPDATE   DateTime    8       Last peer sync update date and time
PUBSYNCNUM       Character   16      Last publisher sync number
PUBLASTUPDATE    DateTime    8       Last publisher sync update
SUBSYNCNUM       Character   16      Last subscriber sync number
SUBLASTUPDATE    DateTime    8       Last subscriber sync update
STATUS           Character   1       Current publisher status


PeerName column

This is the peer name that information is stored for in this row. A row for each peer defined in the replication peers list is inserted into this table. The rows are added when the table is first created and when the publication server is started.


Publication column

This is the publication ID of the peer name specified in this row. This ID is specified during installation of the peer server and is unique among all the systems that the replication services are installed on. It can be found in the recital.conf file and is defined by the DB_PUBLICATIONID variable on the system that this peer is installed on. This ID should not be changed once the replication service has been started on the system, as it is used to determine which system the transaction was processed on. The first time the subscription service is started on this peer, it will update this row with its ID. During this process it will check for duplicate IDs and return an error if one is found.


PeerSyncNum column

This is the last sync number that the publication service processed in this peer server's queue table. When the publication service connects to the peer it uses this sync number to know where to start processing from in the queue table. If this value is empty then the publication service will process all transactions in the queue table on the peer server.


PeerLastUpdate column

Last time the publication service updated the peer sync number.


PubSyncNum column

This is the last sync number that the publication service processed for this peer in the pub_queue table. This number is used during conflict resolution to make sure that any conflicts found are in the range of unprocessed records.


PubLastUpdate column

This is the last time the publication sync number was updated.


SubSyncNum column

This is the last sync number that the subscription service on this peer processed in the pub_queue table. When the peer's subscriber service connects to the publication service it uses this sync number to know where to start processing from in the pub_queue table. If this value is empty then the subscriber service will process all transactions published in the pub_queue table.

SubLastUpdate column

Last time the subscriber service on the peer updated the subscriber sync number.


Status column

This specifies the current status of the publication service with the peer server. The following values will be used.

Status Column
Value Description
D Disconnected
C Connected
E Error on connection

Peer table indexes

The peer table contains the following indexes:

Peer Table Indexes
Expression   Tag Name
PEERNAME     PEERNAME

Starting PPR

On each Peer system you must start the Recital Database Server first with the command:

# recitalreplication start

Then you should start the Peer to Peer Replication processes in the following order:


Starting Master Publication service

On the system specified as the Master you must start the Publication Service first with the command:

# recitalreplication start

When the publication service starts it will attempt to connect to all the subscribed peer servers. If they are not online, it will set the error status in the peer table for any server it cannot connect to. It will try to establish a connection to unconnected servers each time it wakes to process transactions, so when a server comes back online it will connect and process all transactions waiting since it last connected.

Starting Subscriber Service

On each Peer system you must start the Subscription Service with the command:

# recitalreplication start 

The subscriber service will sleep the specified number of seconds between each set of transactions. Then it will connect to the master publisher and retrieve all published data since it last connected. For each transaction that it processes it will attempt to open the table in the required mode. For ZAP and PACK transactions it must open the table exclusively; for all others it will open it shared. If it cannot find the table specified via the path, or cannot open it in the required mode, an error will be returned. The Subscriber Service will perform all required locks on tables so existing Recital users can coexist.

Checking status

Once you have started the subscriber services on the peers you can check their current status with the command:

# recitalreplication status

An example output for the Peer Subscriber Service would be:

Recital Replication Service on pubserv
Subscriber service jazz (02) is running on a 10 second delay.
Version 10.0.0 Compiled on xxx
Connected to publication pubserv in a Peer to Peer environment.
Last SYNCNUM processed was 0100000000000002 at %s

You can check the status of the publication server with:

# recitalreplication status

An example output for the Master Publisher Service would be:

Publication service pubserv is running on a 10 second delay.
Version 10.0.0 Compiled on xxx

Peer Name: jazz Publication ID: 02 Status: Connected SYNCNUM last update
Publication 0100000000003856 2009-02-11 15:59:42
Subscription 0100000000003856 2009-01-30 16:21:03
Peer 0200000000001B9D 2009-02-11 15:59:42

Peer Name: port052 Publication ID: Unknown Status: Connected
SYNCNUM last update
Publication 0100000000003856 2009-02-11 15:59:42
Subscription 
Peer 0100000000001CB9 2009-02-11 15:59:42

Subscription Index

When the subscriber has to perform changes to existing rows in tables, it uses the sync number specified in the transaction to locate the row in the specified table. It uses a single index file called subscription.ndx to store these sync numbers for each table being replicated. If the index does not exist it will be created in the replication database. If the table being processed is not in the index, it will be added with all its sync numbers. Transactions that insert new rows into tables will have their sync numbers added as they are inserted.

Conflicts

While the Publication Service is adding in the transactions it checks for conflicts between the master pub_queue table and the peers local queue table. If there are no conflicts then transactions are added to the master publisher queue table.


A conflict is defined as a transaction that updates data in the same table, in the same row and in the same column. Because the Recital Replication Service only replicates changes at the column level, it reduces the chances of a conflict occurring.


The publication service uses a round robin when processing peers. As a conflict can occur on any of the peers, the first transaction published has priority.


If a conflict is found then the transaction is added to the conflict_queue table and not the pub_queue table. The sync number of the published transaction is added to the publication ID of the conflicted transaction before it is added to the conflict_queue table, so it can be located later when the transaction is rolled back on the peer with the conflict.


If this is the first conflict on this transaction, then the current sync number is added onto the publication ID, and the publication ID of the peer with the conflict is also added to the ID of the current published transaction.


All future transactions performed on this row by the peer with the conflict will automatically be added to the conflict_queue table until the conflict is cleared. It makes no difference which column is updated, as the row will be rolled back and then updated with the published transaction.


When the Subscription Service running on the peer server processes the published transaction that it was in conflict with, it will first roll the row back to its state before the conflicted transaction was performed. It will then process the published transaction from the publisher.


Once the Subscriber Service clears the conflict on the peer, it will insert a marker row into its local queue table to notify the publication server that the conflict has been resolved. All transactions after this marker row in the local peer queue table for that row will no longer be blocked from processing.


When the Publication Server gets the marker record notifying it that the conflict has been cleared, it will remove the Peer Publication ID record from the transaction marked as conflicted. Then any future transactions for that row from that table will be processed normally.


Conflict Queue Table

Transactions flagged as conflicted are stored in the conflict_queue table, which is stored in the replication database. The replication conflict_queue table meta data is defined below and is the same as the local queue table.

Queue Table Meta Data
Column Name   Type        Width   Description
PUBLISHER     Character   20      Publisher name
PUBLICATION   Character   60      Publication number
TRANID        Integer     8       Transaction ID
DATA          Object      8       Transaction XML file
CMDTYPE       Integer     4       Command type number
INSERTDATE    DateTime    8       Insert date
CMDSTATE      Character   1       Current transaction state
TRANKEY       Character   73      Transaction key
TRANTYPE      Character   1       Transaction type

Maintenance

Data in the queue table that has already been replicated can be cleared periodically to keep the replication database from growing indefinitely.
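
A minimal sketch of such a clean-up, assuming the queue table is named queue and that processed transactions are those whose CMDSTATE has been set to P (as described in the CMDState column section):

DELETE FROM queue WHERE cmdstate = 'P';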