SQL Server Transactional Replication Internals
SQL Server Transactional Replication is one of the most common Replication techniques used to share, copy, or distribute data to multiple destinations. In this article, we will discuss Replication, various Replication Types, and pay special attention to the Transactional Replication work.
What is SQL Transactional Replication?
Replication is the SQL Server technology for Copying or Distributing data from one database to another while maintaining Data consistency.
Replication can be used to transfer data from one database to another
- on the same instance or another instance in the same server;
- or across Servers in a single location or multiple locations.
First, we should go through the Replication Architecture and understand the Replication terminologies.
SQL Server Replication Architecture & Terminologies
- The Publisher is the Source Database instance that publishes the data changes that can be distributed to another database. Data from a Single Publisher can be sent to a Single Subscriber or multiple Subscribers.
- Subscriber is the Destination Database Instance where we distribute the data changes captured from the Publisher Database. Subscriber can be either a Publisher Instance or another instance in Publisher Server/ another Server in the same location/distant location. Before the data changes are distributed to the Subscriber Database instance, these data are stored in the Distributor.
- The Distributor is a Database that stores change logs captured from Publisher databases. When a Server is enabled as a Distributor, it will create a system Database named the distribution database.
By the location of distribution databases, they can be classified as either local or remote distributors.
โ Local Distributor is a distribution database residing on the Publisher database instance.
โ Remote Distributor is a distribution database residing either in the Subscriber database instance or in any other SQL Server instance apart from the Publisher database instance.
The deciding factor is where to place the Distribution database on the Publisher instance (another instance). it depends on the Server resources available to handle the Data Distribution load.
According to the way of how the data will be sent from the Distribution database to the Subscriber instance, it can be classified into either Push or Pull Subscriptions.
โ Push Subscription means that the Distribution database takes responsibility to push the data to the Subscriber database instance.
โ Pull Subscription means that the Subscriber database instance takes responsibility to pull the available data from the Distribution database and apply it to the Subscriber database.
- Articles are the fundamental unit of Replication. It indicates any data changes on this database object or article that will be replicated from Publisher to Subscriber. The Article can be a Table, View, Indexed View, Stored Procedure, or the User Defined Function.
- Publications are a collection of one or more Articles from the database in Publisher.
- Subscription defines what Publication will be received. Also, it determines from which Publication and on what schedule the data is replicated. Subscription can be either Push or Pull (from Publication to Subscription).
- Replication Agents are standalone programs responsible for tracking changes and distributing data across the Publisher to Distributor and Subscriber. All Replication Agents execute as Jobs under SQL Server Agent. Thus, it can be administered via SSMS under SQL Server Agent Jobs or Replication Monitor. The following types of Replication Agents are available:
- Snapshot Agent โ Used by almost all types of Replication. The Snapshot Agent runs from Server holding the distribution database. It prepares the Schema and initial data of all Articles included in a Publication on a Publisher. Also, it creates the Snapshot files in a snapshot folder and records Synchronization details in the Distribution database.
- Log Reader Agent โ Used by Transactional Replication. The goal is to read the data changes of articles enabled for Replication from the Publisher Database Transaction Logs and stored to the Distribution Database. the Log Reader Agent runs from the Distributor Server.
- Distribution Agent โ Used by Transactional and Snapshot Replication. It applies the initial Snapshot files and incremental or available pending transactions from the Distribution database to the Subscriber database. The Distribution Agent runs from the Distributor Server for Push Subscriptions and the Subscriber Server for Pull Subscriptions.
- Merge Agent โ Used by Merge Replication only. It applies the initial snapshot files and reconciliation of differential or incremental changes across the Publisher or Subscriber. The Merge Agent runs on the Distributor Server for Push Replication and from the Subscriber Server for Pull Subscriptions.
- Queue Reader Agent โ Queue Reader Agent is used by Transactional Replication with queued update option. It moves changes from Subscriber to Publisher. The Queue Reader Agent runs from the Distributor Server.
- Replication Maintenance Jobs โ As explained earlier, all Replication Agents are standalone programs set up while configuring Replication. They run as jobs under SQL Server Agent Jobs. Few significant jobs to be noted are Distribution Clean Up: Distribution, Agent History Clean Up: Distribution, and Expired Subscription Clean Up.
Types of Replication in SQL Server
Now when we know the terminology, letโs dive into the types of Replication.
- Transactional Replication. As the name suggests, every transaction or data change within the transactional scope on Publisher will be sent to Subscriber in near real-time with minor delays depending upon the network bandwidth and server resources. Transactional Replication uses the Log Reader Agent to read the data changes from the Transactional Logs of the Publisher database. It also uses the Distribution Agent to apply changes to Subscriber. Occasionally it may use Snapshot Agent to take initial Snapshot data of all Replicated articles. Transactional Replication Publication may fall under the below categories:
- Standard Transactional Replication โ Subscriber is a read-only database from the Transactional Replication perspective. Any changes performed by anyone on the Subscriber database will not be tracked and updated to the Publisher Database. The Standard Transactional Replication is often referred to as Transactional Replication.
- Transactional Replication with Updatable Subscriptions is an Enhancement of Standard Transactional Replication that tracks the data changes taking place on Subscribes. Whenever data changes are initiated on an Updateable Subscription, they will be first propagated to Publisher and then to other Subscribers.
- Peer-to-Peer Replication is an Enhancement of the Standard Transactional Replication. It propagates transactionally consistent changes in near real-time across multiple server instances.
- Bidirectional Replication is an Enhancement of the Standard Transactional Replication that allows two servers (limit to only 2 Servers and hence named Bidirectional) to exchange the data changes across each other with any server acting as a Publisher (to send changes to another Server) or as a Subscriber (to receive changes from another server).
- Merge Replication โ Supports capturing data changes that take place across both Publisher and Subscriber and distributes it to the other Server. The Merge Replication requires the ROWGUID column on the Table Articles involved in Merge Replication. It uses Triggers to capture the data changes across Publisher and Subscriber. Also, it delivers the changes across Servers when both Publisher and Subscriber are connected to the network. Merge Replication uses the Merge Agent to replicate the data changes across Publisher and Subscriber.
- Snapshot Replication โ As the name indicates, Snapshot Replication doesnโt rely on either Transactional Logs or Triggers to capture the changes. It takes a snapshot of articles involved in Publication and applies it to the Subscriber with the records available at the time of the snapshot. The Snapshot Replication uses the Snapshot Agent to take a snapshot of the Publisher and uses the Distribution Agent to apply these records to Subscriber.
SQL Server Transactional Replication
Transactional Replication is typically preferred in scenarios where the OLTP Publisher database has heavy Data INSERT/UPDATE and/or DELETE activities.
Since the Publisher server instance has huge DISK IO happening, generating Reports may cause severe blockings. It also can impact the Server performance. Hence, another Server with near real-time data is good for offloading the Reporting requirements.
One of the fundamental requirements for Transactional Replication is that the tables replicated should have a Primary Key available.
We can summarize how Transactional Replication works. Have a look at the below Transactional Replication Architecture diagram taken from the official Microsoft documentation.
The Publication is created on the Publisher database that comprises the list of Articles for replicating to the Subscriber database.
Transactional Replication will typically be initialized from Publisher to Distributor via the Snapshot Agent or Full Backups. The Snapshot Agent is supported via Replication Configuration Wizard. The Full Backup is supported via TSQL Statements to initialize the Transactional Replication.
The Log Reader Agent scans the Transactional Log of the Publisher database for tracked Articles. Then it copies the data changes from the Transactional Log to the Distribution database.
The Distribution database can be either in Publisher or Subscriber; it can also be or another independent SQL Server instance.
Note also the following things:
- Log Reader Agent runs continuously from the Distributor Server to scan for new commands marked for Replication. However, if you donโt wish to run continuously and want to run on a schedule instead, we can change the Log Reader Agent SQL Job that will be created out.
- Log Reader Agent picks up all records marked for Replication from the Transactional Log in batches and sends them to the Distribution database.
- Log Reader Agent picks up only Committed transactions from the Transactional Log of Publisher Database. So, any long-running queries on the Publisher database can directly impact Replication as it is waiting for the active transaction to complete.
The Distribution Agent picks up all undistributed new commands from the Distribution database and applies them to the Subscription database via either Push or Pull Mechanism. As mentioned earlier, if Push Subscription Distributor takes ownership to apply the changes from Distribution database to Subscriber whereas in Pull Subscription Subscriber database takes ownership to fetch the changes from Distribution database to Subscriber.
Once the records are distributed successfully from the Distribution to Subscriber database, they will be marked as Distributed and marked for deletion from the Distribution database. One of the Key Replication Maintenance jobs named Distribution Clean Up: Distribution job runs once every 10 minutes to delete the distributed records from the Distribution database to maintain the Distribution database size under control.
With the detailed explanation of concepts of Replication and Transactional Replication, we can get our hands on it by configuring Replication for AdventureWorks database and verifying Replication for every component discussed theoretically.
Configuring Transactional Replication Step by Step (via SSMS GUI)
Transactional Replication configuration involves 3 major steps:
- Configuring Distribution database
- Publication Creation
- Subscription Creation
Before trying to configure Replication, make sure that Replication Components are installed as part of SQL Server installation or use SQL Server media to install Replication Components, as they are necessary for the task.
In SSMS, connect to the Publisher Database Instance and right-click on Replication:
Distribution is not configured right now. Hence, we have the Configure Distribution option. We can either Configure the Distribution database using the Configure Distribution wizard or via the Publication Creation wizard.
To configure the Distribution database and Publication, follow the below steps:
Expand Replication and right-click on New Publication.
Conclusion
Thanks for reading this long power-packed article! We have gone through a variety of topics, such as:
- Replication Architecture and Terminologies
- SQL Server Replication Types
- SQL Server Transactional Replication in Detail
- SQL Server Transactional Replication Configuration (Default approach)
- SQL Server Transactional Replication Verification
- SQL Server Transactional Replication in action