Understanding Transactional Replication in SQL Server

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

Transactional replication is a popular method for keeping multiple SQL Server instances synchronized by capturing changes from one server (publisher) and applying them to others (subscribers). We will explore its key aspects along with code samples and best practices.

USE master;
GO
CREATE DATABASE MyTestDB;
GO
USE MyTestDB;
GO
CREATE TABLE Person(ID INT PRIMARY KEY IDENTITY, Name NVARCHAR(100));
INSERT INTO dbo.Person VALUES ('John Doe'),('Jane Smith');
GO
-- Create publication
EXEC sp_addpublication @publication = 'MyPublication',
   @database = 'MyTestDB';

-- Add table to publication
EXEC sp_addarticle @publication = 'MyPublication',
    @article = 'Person',
    @source_owner = 'dbo',
    @source_object = 'Person',
    @type = 'logbased',
    @description = '';
GO

Explaination of Code:

It Creates a database named "MyTestDB."
Within "MyTestDB," creates a table named "Person" with two columns: "ID" and "Name".
Inserts two rows into the "Person" table.
Creates a publication named "MyPublication" for "MyTestDB."
Adds the "Person" table to the publication, specifying log-based replication.
Basically It sets up a database, a table, and establishes transactional replication for the table.

Subscription Setup:


-- On subscriber side
CREATE DATABASE MySubscribedDB;
GO
USE MySubscribedDB;
GO
-- Define push subscription
EXEC sp_addsubscription @publication = 'MyPublication',
      @subscriptiondb = 'MySubscribedDB',
      @destination_db = '',
      @sync_type = 'automatic',
      @subscription_type = N'Push',
      @update_mode = N'read only',
      @subscriber_security_mode = 1;
GO
-- Start Agent
EXEC sp_startpublication_snapshot @publication = 'MyPublication';
GO

Explaination of Code :

Creates a database named "MySubscribedDB".
Defines a push subscription for "MyPublication" to sync with "MySubscribedDB."
Specifies automatic synchronization, push subscription type, read-only mode, and starts the replication agent.
In essence, it prepares the subscriber database for receiving and updating data through transactional replication.

How Transactional Replication Works:

Transactional replication uses a publish-and-subscribe model where changes made to the published database are captured and then propagated to the subscribers. The process can be broken down into three main steps:

Snapshot Generation: A snapshot of the entire published database or specific articles (tables, views, etc.) is taken at regular intervals. This snapshot includes schema and data and serves as the initial state for all new subscribers.
Change Data Capture: As transactions occur on the published database, they are recorded by the distribution agent and stored in the distribution database. These change records include information about which rows were inserted, updated, or deleted.
Distribution: The distribution agent applies the changes to the subscription databases according to their schedule and settings. Changes are applied sequentially, ensuring consistency between the publisher and subscribers.

Comparison with similar concepts:

Feature	Transactional Replication	Change Tracking	Change Data Capture (CDC)	AlwaysOn Availability Groups
Purpose	Replicate data changes in real-time	Lightweight mechanism for tracking changes	Detailed audit trail at the column level	High availability, data protection through synchronization
Granularity	Individual transactions	Changes at the row level	Changes at the column level	Operates at the database level
Latency	Near real-time with potential latency	Changes tracked as they occur, polled periodically	Changes captured as they occur, polled periodically	Near real-time synchronization between replicas
Use Case	Reporting, load balancing, data distribution	Identify modified rows within a timeframe	Auditing, historical changes, synchronizing data warehouses	High availability, disaster recovery, automatic failover

In summary, these features serve different purposes within SQL Server:

Transactional Replication: For real-time replication of data.
Change Tracking: For lightweight tracking of changes at the row level.
Change Data Capture (CDC): For detailed auditing of changes at the column level.
AlwaysOn Availability Groups: For high availability and data protection through database-level synchronization.

Limitations of Transactional Replication

While Transactional replication is a powerful feature in SQL Server that allows the replication of changes made to a database in near real-time. However, like any technology, it has its limitations. Some common limitations of transactional replication are:

Latency:
While transactional replication provides near real-time data replication, there can still be some latency between changes made on the publisher and their replication to subscribers. The latency depends on factors such as network speed, the volume of changes, and the configuration of the replication.
Network Overhead:
Replicating every transaction to subscribers can create significant network overhead, especially in high-volume transaction environments. This can affect the overall performance of the network.
Consistency:
Ensuring consistency across all nodes can be challenging. If there are conflicts or errors during replication, it may lead to inconsistencies between the publisher and subscribers.
Initial Snapshot:
Transactional replication requires an initial snapshot of the entire data set to be transferred to the subscribers before real-time replication begins. This snapshot process can be time-consuming and resource-intensive, especially for large databases.
Schema Changes:
Schema changes, such as adding or modifying columns, may require additional steps in transactional replication. Managing schema changes can add complexity to the replication process.
Limited to SQL Server:
Transactional replication is specific to SQL Server and may not be compatible with other database management systems.

Applications

Transactional replication is commonly used in various scenarios such as:

Geographically dispersed systems
Reporting servers
Load balancing
Backup and disaster recovery

Question

Which type(s) of replication does SQL Server support?

Transactional, Merge, Snapshot, Peer-to-Peer

Only Transactional Replication

Master-Slave, Multi-Master, Broadcast

None of the above

SQL Server supports four main types of replication – Transactional, Merge, Snapshot, and Peer-to-Peer Replication. They cater to diverse use cases, latency tolerances, and data sizes, offering flexibility in choosing the appropriate replication technique based on individual requirements.

Question

Regarding modifying replicated schema, which statement is correct?

Altering schema doesn't impact replication as long as corresponding triggers remain unchanged.

Simple column additions don't affect existing replication but require manual adjustments for deletions or renaming.

You cannot make any schema changes once replication is enabled.

Any schema modification breaks replication until explicitly addressed via scripts

While working with SQL Server replication, making certain schema modifications, like adding new columns, won't disrupt ongoing replication processes directly. Nevertheless, some manual interventions might still be needed, particularly when dealing with column removals or name alterations.

Key Takeaways Transactional replication

Transactional replication keeps multiple SQL Server instances synchronized.
It involves a publisher, distributor, and subscriber setup
Real-world usage covers geographical dispersion, load balancing, backup, and disaster recovery.Map.

Understanding Transactional Replication in SQL Server

Database Management System (DBMS) Software Engineering

Table Of Contents:

Question

Which type(s) of replication does SQL Server support?

Question

Regarding modifying replicated schema, which statement is correct?

Key Takeaways Transactional replication

Aligning Flexbox

Different ways to add element to List in Python