Replication Technology
Replication is a set of SQL agent job which will be created while configuring replication. This Replication SQL Agent job will copy and send data and database objects (Tables i.e Articles, SP, Views & UDF etc) from one database to another database. There will be three roles in replication i.e. Source DB server is publisher and destination/Target DB server is subscriber. In between there will be distributor DB server. All three will communicate with each other. We can have all 3 in the same servers or 2 severs or each one will be in dedicated server to configure replication. It is based on client requirement and DBA needs we choose as per requirement.
SQL agent job or replication agent will be created in the background while configuring replication.
Additionally, Publication is publishing the objects from publisher to subscriber. Subscriber is subscribing from that publisher. distributor is middle man getting a data from publisher and send to subscriber.
Publisher is primary server – main server, Subscriber is secondary server – this would be used for reporting purpose and distributor is middle man can be in the same publisher server or dedicated server.
Publication is database objects will be hosted or created in publisher server. A single database can be published as different publication name to multiple subscribers. Check how many agent jobs are created.
Subscription is database objects will be hosted or created in each subscriber server.
Distribution is database will be created in distributor server while configuring replication. This database will be available under system database not as normal user database.
At first we need to configure distributor, publisher and finally subscriber.
Basic designs with how the SQL agent job created and how many jobs?
Design:1 We can have only one Publication with multiple subscribers, under that publication.
Here you can only have one snapshot agent job that will push the data to all your subscription. Not good for large database. In case of any issues, the snapshot will be pushed to all the subscribers.
Design:2 We can have one publication with one subscriber underneath. Here you can have snapshot agent job for each publication.
Snapshot agent is for each publication, Distributor agent is for each subscription and log reader agent is for each Published database. All the agents are called from SQL agent job.
How do you configure or initialize data from publisher to subscriber?
There are two methods for the initial data movement. 1. Via snapshot agent job (It is normal replication GUI configuration) or 2. Via Backup and restore method by using T-SQL, this will be used for very large database.
All the above is based on requirement, but important to understand how replication and agent jobs created based on your design.
Article is table.
Replication is objects level or column level copy/send of data from publisher to subscriber. Mostly used for reporting purpose. Whatever changes we made in publisher database immediately applied to subscriber database only for the replicated objects / columns not for all the objects OR columns.
Table must have primary key for transactional replication.
When to use replication
- To Take out subset of data from publisher to subscriber i.e. selected columns for reporting purpose, to avoid blocking of publisher
- To Remove PII (Bank no, Mobile no etc) that is replicate data to subscriber instead giving access to production we can give access to subscriber without PII data.
- To have more index on subscriber database which make query to run faster
- To have a control on security by restricting access to the user
- You can have replication database in simple recovery mode and can be used in different SQL version & edition. No need of enterprise edition.
Practical description of Transitional replication with agent program details:
Agent is nothing but it is SQL agent job.
In any type of replication Snapshot agent first needs to be run and it will write initial data (i.e. all the tables DDL and DML statements/query of selected objects for replication) into a snapshot share folder location using BCP bulk insert method. This will read and write data from table into snapshot location and block the users in the publisher database. Careful to generate snapshot or run snapshot agent for larger database.
It also records information about synchronization of data in the distribution database. The Snapshot Agent runs at the distributor server side. It is per publication.
The Log Reader agent moves upcoming transactions marked for replication from the Publisher database LDF file to the distribution database. Each database published will have its own Log Reader Agent (It is per database) that runs on the distributor server side and connects to the Publisher to get data.
Ex: If you have 3 published databases then you will have 3 log reader agents. If you have only one published database with many subscribers, still you will have only one log reader agent.
The Distribution agent will mainly for two tasks and some other task as well 1. It applies the initial snapshot to the Subscriber by reading the snapshot folder and applies it on subscriber database 2. It moves all other upcoming transactions held in the distribution database to Subscribers database by reading both snapshot folder and distribution database. It will read snapshot file and write into subscriber database. Be careful it will block the users in subscriber database. The Distribution Agent runs at either the Distributor server for push subscriptions or at the Subscriber server side for pull subscriptions.
The Distribution Agent is used on both snapshot replication and transactional replication types. It is per subscriber.
Only one line answer of agents
Snapshot agent – Will move initial DDL & DML statements from publisher to snapshot folder. When it runs it block the publisher table.
Log reader agent – Will move all the upcoming transaction from publisher DB to distributor DB.
Distribution agent – Will move both initial data from share path to subscriber and also upcoming data from distributor to subscriber. When it runs it block the subscriber table.
There will be replication agent / jobs created for each database replication configured based on design and type. Replication Agent is nothing it is SQL agent job which runs based on schedule we configured in the replication initial configuration. (We can change the schedule later)
Changing or resetting the password for agent login account password will stop the replication as well. Better use alternate logins.
How many replication agent or SQL jobs created?
It is based on the replication design we choose.
In general, snapshot agent is for each publication, Distributor agent is for each subscription and log reader agent is for each Published database. All the agents are called from SQL agent job.
The agent jobs or replication agent will be created based on replication topology design.
Ex: 3 separate server for Publisher, Subscriber and distributor as dedicated to each.
Ex: Publisher on server1, Subscriber on server2 and distributor on server3. Where the replication agent job will be created?
Snapshot and Log reader job is created in distributor server server3.
distributor agent job created in distributor server for push subscription.
So, all three agents will be in distributor server itself server3.
Ex: 2 server design as both publisher and distributor on same server – server 1 and subscriber server as different server – server 2. One publisher to many subscriber.
Ex: Publisher and distributor agent job on server1, Subscriber agent job on server1 for push subscription.
All three agent jobs will be created in Publisher server1 itself, since distributor is on same server.
Distributor model design:
distributor database can be configured on publisher itself or on subscriber side or on dedicated separate server.
Two methods of subscription for sending data – Push & Pull subscription.
It is best to have push method and easy to administrate.
Example:
One database called Muthu is published to different servers into 4 subscriber database.
It has 4 Publication with 4 subscription. Each publication will have only one subscription.
There will be 4 snapshot agent. Since snapshot agent is per publication.
There will be 4 distributor agent. Since distributor agent is per subscription.
There will be only one log reader agent. Since it is for per database.
Notes
For any new publication we have to use same login to connect to pub and sub otherwise it will error out. (OR) we can to choose impersonate option in the replication setting.
Replication Monitor:
Right click -> Replication -> Lunch replication monitor. There will be tabs. Following can be looked in each tabs.
Click left side click connect correct publisher server name.
Publication
Here you can see all publication status, you cannot do anything here
Subscription watch list
Here you can see all subscription.
Here you can start and stop the subscription that means it will call the distribution agent job. You can do the same SQL agent job. So you can do the same from SQL agent as well.
Important setting in right click of each subscription:
In view details there are three tabs. Data moving history will be available here.
Publisher to distributor. distributor to Subscriber and Undistributed commands. This is important to view when we initialize from snapshot agent.
You can reinitialize subscription from here as well, understand this will generate snapshot. (We should not run reinitialize subscription for large database) This only done in of business hours.
Agent
Here you can see all agents. There will be drop down to see agents.
You can stop and start agent from here, again this will start and stop agent jobs only.
Types of Replication:
Snapshot replication – In this replication Snapshot Agent will create snapshot of all articles from publisher and send or replicates to subscriber database.
When do you use it?
When to use SQL server Snapshot Replication
SQL Server Snapshot replication is useful for the databases which are not critical and/or data of the do not change frequently..
- Data do not change frequently and we can run snapshot agent job daily, weekly or monthly etc
- The database is not critical
- The database is for reporting purpose
Transactional replication – Log Reader Agent will create snapshot of all tables from publisher and send or replicates to subscriber other database.
When do you use it?
When to use Transactional Replication:
- Transactional replication used for the critical databases which require less downtime and immediate changes to reflect to subscriber
- It is useful when data need on an incremental basis
- It is useful for the database where a large amount of data changes frequently
Three agents used in Transactional replication. The Snapshot agent, The Log reader agent, and Distribution agent.
- Snapshot agent takes a snapshot of the published articles and put it into the snapshot folder
- The Log reader agent reads the transaction logs of the published database and transfers the committed transaction to the distribution database
- Distribution agent copied the snapshot data from the snapshot folder and the transaction log from the distribution database and applied to the subscriber server database
Merge replication – Merge Agent will create snapshot of all tables from publisher and send or replicates to subscriber other database.
Merge replication is used to replicate data from a publisher to a subscriber database and vice versa.
When do you use it?
Merge replication, like transactional replication, typically starts with a snapshot of the publication database objects and data. Subsequent data changes and schema modifications made at the Publisher and Subscribers are tracked with triggers. The Subscriber synchronizes with the Publisher when connected to the network and exchanges all rows that have changed between the Publisher and Subscriber since the last time synchronization occurred.
Merge replication is typically used in server-to-client environments. Merge replication is appropriate in any of the following situations:
- Multiple Subscribers might update the same data at various times and propagate those changes to the Publisher and to other Subscribers.
- Subscribers need to receive data, make changes offline, and later synchronize changes with the Publisher and other Subscribers.
- Each Subscriber requires a different partition of data.
- Conflicts might occur and, when they do, you need the ability to detect and resolve them.
- The application requires net data change rather than access to intermediate data states. For example, if a row changes five times at a Subscriber before it synchronizes with a Publisher, the row will change only once at the Publisher to reflect the net data change (that is, the fifth value).