INFO: Getting Started with Microsoft SQL Server Replication
ID: Q89937
|
The information in this article applies to:
-
Microsoft SQL Server versions 6.0, 6.5
SUMMARY
This article provides a reference for some of the most common issues
you can encounter when installing and using SQL Server replication. This
article is not a replacement for the SQL Server documentation. All of the
concepts, terms, and topics below are documented in the SQL Server
"Administrator's Companion," Part 6, Replication (both the printed version
and the Books Online). This section of the manual is the most complete
reference for replication issues, questions, and troubleshooting. Anyone
who is planning to install and use SQL Server replication should thoroughly
read all chapters in this section of the manual.
Contents:
- A table does not appear in the list of possible articles when creating a publication in SQL Enterprise Manager (SEM).
- LogReader and Distribution subsystem tasks do not show any entries inthe Task History of SQL Enterprise Manager after setting up replication.
- Initial installation of replication is really just a 3-step process.
- Connectivity problems with replication tasks.
- Tables do not get synchronized immediately after subscribing.
- What are all the tasks in replication, and how many can I expect to
have on a distribution server?
- What is the difference between a "filter stored procedure" and a
"custom stored procedure"?
- Does SQL Server support multiple publication servers of the same table?
- Monitoring proper replication of changes to subscription servers.
- Can TEXT/IMAGE columns be replicated?
- Windows NT Server names can conflict with replication.
- What are the "job batch size" and "commit batch size"?
MORE INFORMATION
The following represents a list of some of the more common issues and
topics that have resulted from supporting customers who are installing and
using SQL Server Replication:
- A table does not appear in the list of possible articles when creating a
publication in SQL Enterprise Manager.
Any table that will be published must have a PRIMARY KEY CONSTRAINT defined
using the CREATE TABLE or ALTER TABLE Transact-SQL statement. A constraint
may also be created using SQL Enterprise Manager.
Note that just creating a unique index will not allow the table to appear
in the list of possible articles. Users who have upgraded their database
from prior versions of SQL Server usually do not have PRIMARY KEY
CONSTRAINTS, but rather have used unique indexes because constraints were
not supported until SQL Server version 6.0.
- LogReader and Distribution tasks do not show any entries in the Task
History of SQL Enterprise Manager after setting up replication.
LogReader and Distribution subsystem tasks are created with a frequency by
default as "Auto Start." (When setting up subscription servers, you can
change the default for the Distribution task to a different frequency.)
This means the tasks will start when they are first created as tasks or any
time SQLExecutive is started. They run continuously in a polling "loop,"
pausing for each loop n milliseconds as defined by the "polling interval"
set with the "-i" parameter. The tasks are set up by default as one second
for LogReader and two seconds for Distribution. Under this type of task
scheduling, these two tasks will not display any task history provided
there are no errors. Any error or retry attempt will cause an entry to be
displayed in the Task History. If SQLExecutive is shutdown, these tasks
will have an entry in the Task History with a message of "This task was
shutdown." This message is not an error and just indicates the task ended
as part of the shutdown to SQLExecutive.
If a LogReader or Distribution task is scheduled to run "On Demand" or
"Recurring," then they will process jobs that are available until no more
jobs are available, display information about its processing to the Task
History, and then terminate. For example, a LogReader task will display the
following message "Processed <n> replicated transactions consisting of <m>
commands from '<published db>.'"
One method that can be used to examine a more detailed view of the
processing of a LogReader, SYNC, or Distribution task is to add the -o
parameter (syntax is -o<filename>. DO NOT use a space between -o and
<filename> or the file will not be created) to each of these tasks. This parameter
will force the task to dump details of replication processing.
The parameter can be added by editing any of the tasks using SQL Enterprise
Manager. Remember that if the task is set to "Auto Start," the parameter
change will not take effect until SQLExecutive is restarted. This parameter
should only be used for troubleshooting situations as the amount of output
written to the file can be substantial and could cause a degradation in
performance. The -o parameter is valid for any LogReader, Distribution, and
SYNC task. These tasks open the file to "append" entries if the file exists
and share it so a user could view the file while the task is still
executing. Remember that this parameter is only for troubleshooting
purposes and only typically used when working with a support professional. The task
will not display any errors if the file cannot be created for any
reason (for example, Syntax incorrect, path does not exist, ...).
See the SQL Server "Administrator's Companion" in the section titled
"Replication Task Scheduling" for more information about adding or editing
parameters for SQL Executive replication tasks.
- Initial installation of replication is really just a three step process.
After SQL Server is installed, installing replication using SQL Enterprise
Manager requires three steps: installing the distribution server,
configuring publication servers, and configuring subscription servers.
To ensure proper setup of replication, these steps should be
done in the following order:
- Install the distribution server by selecting Replication
Configuration/Install Publishing from the Server menu within SQL
Enterprise Manager.
- Configure a publication server by selecting the valid list of
subscription servers and databases to be published. This can be done
by selecting Replication Configuration/Publishing from the Server
menu, while having the focus on the publication server in SQL
Enterprise Manager. The Replication topology window also provides an
easy method to perform this operation. Valid subscription servers are
selected under the list of servers titled "Enable publishing to these
servers." A subscription server must be selected here first before
you can configure the subscription server.
- Configure a subscription server by selecting the valid list of
publication servers and databases to be subscribed. This can be done
by selecting Replication Configuration/Subscribing from the Server
menu while having the focus on the subscription server in SQL
Enterprise Manager. As before, the Replication Topology window can
also be used to perform this operation. Valid publication servers are
found in the list of servers titled "Enable subscribing from these
servers." A publication server must be selected here before a
subscription server can subscribe to publications from a publication
server. If you try to perform this step before step b., you will see
the following error in SQL Enterprise Manager:
Error 18483: Unable to connect to site '<publication server>'
because 'sa' is not defined as a remote login at the site.
Note the above assumes the distribution server is local. See the SQL Server
"Administrator's Companion" for more information on setting up a remote
distribution server. More information on using the Replication Topology
Window including drag-and-drop features can be found in the SQL Server
"Administrator's Companion."
- Connectivity problems with replication tasks.
Distribution task connectivity
SQL Server replication relies on integrated security connections. A
distribution task will "force" a trusted connection using ODBC to the
subscription server. Because of this design, the distribution server must
use the named pipes or multiprotocol network library as the default network
library (the Client Configuration Utility is used to set the default
network library).
It is also important to understand the security context of the SQL
Executive service. All replication tasks connect under the security context
of the SQL Executive service. Because a trusted connection is used to
connect to the subscription server, the account used by SQL Executive to
log on as a service must have permissions on the subscription server. A
trusted connection by a distribution task therefore requires that the
distribution and subscription servers belong to the same domain or trusted
domains. This is the reason that the SQL Executive service is normally
configured to log on as a domain administrator.
One simple method to test the connectivity of replication is to log on to
the distribution server using the same account SQL Executive logs on with.
Then use ISQL.EXE with the -E option to force a trusted connection to the
subscription server. While this test uses DB-Library and the Distribution
task uses ODBC, it will still provide a reliable method to test a trusted
connection to the subscription server. Setting the subscription server to
integrated security and using the ODBCPING.EXE utility can accomplish the
same test. If a Distribution task fails to connect to the subscription or
distribution server, it will display a standard ODBC error for connection
failed, such as "08001 [Microsoft][ODBC SQL Server Driver] Unable to
connect to DataSource," or "08001 [Microsoft][ODBC SQL Server
Driver][DBNMPNTW] Specified SQL Server not found."
For more information on troubleshooting issues with the ODBC SQL Server
Driver, see the following Microsoft Knowledge Base article:
Q140895 INF: Diagnostic Tips for the Microsoft SQL Server ODBC Driver
LogReader and SYNC Task Connectivity
Both a LogReader and SYNC task connect to the publication and distribution
servers using DB-Library, forcing a "trusted" connection. Because the
default network library needs to be named pipes or multiprotocol for the
subscription server, these tasks will use these network libraries to
connect to their servers. Similar techniques to test Distribution task
connectivity can be applied to a LogReader or SYNC task, except that they
are DB-Library applications. If the LogReader or SYNC task fails to connect
to either server, it will display the following error in the Task History
of SQL Enterprise Manager:
Unable to connect to '<server>.'
If any of these tasks fail to connect to a server, they are automatically
configured to "retry" the connection based on the configuration of the task
from SQL Enterprise Manager. By default, each task is configured to retry
once per minute for a maximum of 1440 times (once each minute for 24
hours). This can be changed by clicking Options when editing the task.
- Tables do not get synchronized immediately after subscribing.
To properly understand why a table may not be automatically synchronized
after subscribing, you need to understand what each type of task does for
replication.
A LogReader task is responsible for reading changes from the published
database transaction log and inserting jobs into the distribution database.
A LogReader task is created with a name combining the publication server
and published database name (for example, mypub_pubs).
A Distribution task is responsible for reading jobs from the distribution
database and executing commands from these jobs against a subscription
server. A Distribution task is created with a name combining the publisher
server/published database and subscription server/subscribed database (for
example, mypub_pubs_mysub_pubs).
A SYNC task is responsible for creating a data file (BCP format) and
inserting "sync" jobs into the distribution database. The SYNC task
performs this operation for any new subscription or any scheduled table
refresh publication. It is the distribution task that is responsible for
actually synchronizing the table and its data to a subscription server.
However, the distribution task cannot perform this task until the SYNC task
creates a job in the distribution database for the synchronization.
Remember that a SYNC task is created by SQL Enterprise Manager as part of
creating a publication. If you build publications manually, you must create
a SYNC task using the sp_addtask procedure. SQL Enterprise Manager
constructs a SYNC task name by combining the publication server, published
database, publication name, and a unique number. An example task name is
mypub_pubs_authors_10.
By default, the SYNC task is scheduled to run every five minutes. Because
the subscription may have taken place immediately after the publication was
created, the SYNC task may not have executed yet. If you have created a
publication and subscribed to it, check the status of the SYNC task in SQL
Enterprise Manager. If the SYNC task has successfully completed and added
the proper sync jobs, it will show an entry in the Task History like:
Sync event for publication: <publication name>. Created bcp file
<filename.tmp> for article <article name> sync job.
If you see an entry in the Task History like:
No new subscriptions were found for articles in publication:
<publication name>.
this means that no new subscriptions for this publication were found. The
SYNC task runs even if no one has subscribed. For this reason, most users
will probably want to change the default schedule of the SYNC task to an
interval longer than five minutes. This depends on how often new
subscriptions occur or if you decide to create "scheduled table refresh"
publications.
- What are all the tasks in replication and how many can I expect to have
on a distribution server?
A LogReader task exists for each published database for each publication
server. This task is created when the database is marked as "published"
(sp_dboption or SQL Enterprise Manager can be used to mark a database as
"published").
A SYNC task exists for every publication created. This task is created by
default when a publication is created using SQL Enterprise Manager. The
task must be manually added if the publication is created outside SQL
Enterprise Manager. Note that one SYNC task is used no matter how many
articles exist in a given publication.
A "Cleanup" task is created for every publication server/subscription
server combination. This task is created when a subscription server is
enabled on the publication server. This task is created by the
sp_addsubscriber procedure.
A Distribution task is created for every published db/subscribed db
combination. This task is created when the first publication in a published
database is subscribed to by a given subscription server. This task is
created by the sp_addsubscription or sp_subscribe procedure.
Remember that all tasks run on the distribution server, even if the
distribution server and published database(s) are on different computers.
Looking at the example below will help explain how many distribution tasks
can be created:
Assume that a distribution server is configured to handle five publication
servers, each with one published database. Now, three subscription servers
are configured to subscribe to at least one publication for each published
database for each publication server. All publications are placed in the
same database on each subscription server.
Based on this description, 15 distribution tasks will be required on the
distribution server to handle this environment. If a given publication was
subscribed to more than one time into a different database on the
subscription server, this will increase the number of tasks required by the
number of databases. In the example above, if each subscription server
pulled each publication into five different databases, the number of tasks
increases to 75.
The following table summarizes the expected number of tasks:
LogReader Tasks 1 per published database
SYNC Tasks 1 per publication
Distribution Tasks 1 per published
database/subscribed database combination
Cleanup Tasks (sp_replcleanup) 1 per publication
server/subscription server combination
- What is the difference between a "filter stored procedure" and a "custom
stored procedure"?
A "filter stored procedure" is created to support "horizontal
partitioning." This concept allows for changes to be "filtered" before they
are stored in the distribution database. A filter stored procedure can be
generated by SQL Enterprise Manager or created manually by using the CREATE
PROCEDURE FOR REPLICATION Transact-SQL statement. A filter stored procedure
is executed in the context of the LogReader task and is applied against the
changes read from the published database transaction log before jobs are
placed in the distribution database.
A "custom stored procedure" is a "hook" provided for advanced replication
situations. This is a procedure created by an application developer to
replace a standard INSERT, UPDATE, or DELETE command that will be executed
against the subscription server. A procedure of this type should be used if
you need to perform any special processing to be run on the subscription
server instead of the standard INSERT, UPDATE, or DELETE statement. The
protocol for creating a procedure of this type is defined in the SQL Server
"Administrator Companion" under the title "Adding Stored Procedures for
Insert, Update, and Delete."
The custom procedure is specified when defining an article and must be
created on the subscription server. When a job is placed in the
distribution database for an article with a custom stored procedure, the
procedure name will be placed as the command to run for the job in the
distribution database (in the MSjob_commands table). The distribution task
will then execute the procedure on the subscription server.
Here is an example INSERT custom stored procedure for the
employee table in the pubs database:
create procedure myinsert @emp_id char(9), @fname
varchar(20), @minit char(1),
@lname varchar(30), @job_id smallint, @job_lvl tinyint,
@pub_id char(4),
@hire_date datetime
as
insert into employee values (@emp_id, @fname, @minit,
@lname, @job_id, @job_lvl,
@pub_id, @hire_date)
The "Administrator's Companion" in the section mentioned above discusses
how error handling should be included into custom stored procedures using
RAISERROR.
- Does SQL Server support multiple publication servers of the same table?
One of the most common questions is how many servers can modify and publish
the same table. While SQL Server replication is a strategic component of
Microsoft's distributed enterprise computing platform, it is not designed
to be a fully distributed database management system.
SQL Server replication is not designed to handle "collisions" between two
publishing servers. If one server updates a row that another server is
trying to delete and the changes are replicated to each other, a
"collision" can result. One possible result of the collision is that the
distribution task of the server trying to replicate the update command will
fail because the row it is trying to update no longer exists on the
subscription server. Even worse, infinite loops of changes could result if
publications are not defined properly.
Replication can support multiple publishers of the same table provided that
each server "owns" a set of rows within the table. Each server would
provide a method (such as a stored procedure) to prevent any modifications
or inserts of rows outside the set they "own." This can easily be
accomplished by designing the table to include a column that designates
ownership and then use a horizontally partitioned article to publish that
table.
- Monitoring proper replication of changes to subscription servers.
There is no one "best" method to monitor proper replication of changes to a
subscription server. Devising your own method to fit your environment is
the recommended approach. Provided that the LogReader, SYNC, and
Distribution tasks are not encountering any errors, all changes should be
replicated properly. If you suspect a particular transaction was not
executed on a subscription server, it might be helpful to examine rows in
the MSjob_commands table of the distribution database. This table contains
all SQL statements that will be executed against subscription servers. Note
that the syntax of these commands may appear to be unusual under certain
circumstances because all commands are created to adhere to the ODBC SQL
Grammar specifications.
Another table that is useful to monitor replication is the MSlast_job_info
table that is created on the subscription server database. This table will
contain the last successful job that was applied to a subscription server
for a given publication server. It can be compared with the MSjobs,
MSjob_commands, and MSsubscriber_jobs to determine what jobs have and have
not been applied to the subscription server.
- Can TEXT/IMAGE columns be replicated?
SQL Server version 6.0
TEXT/IMAGE columns can participate in replication, but only for "scheduled
table refresh" publications. This means that a synchronization of the table
can be done to include the TEXT/IMAGE data, but not logged changes of that
column. If a published table is "transaction based" and contains a
TEXT/IMAGE column, any value for this column, after initial
synchronization, will be distributed as (null).
SQL Server version 6.5
Support for TEXT/IMAGE transaction-based publications is now a new feature
of SQL 6.5. Note that only applications that use the "WITH LOG" option when
writing out text/image data will be able to take advantage of this support.
- Windows NT Server names can conflict with replication.
If the characters in the name of your Windows NT server conflict with the
valid list of characters that SQL server supports for an identifier, you
will run into problems when setting up replication. The SQL Server Setup
program issues a warning when installing the product that the Windows NT
server name will cause problems when setting up replication.
The most common situation for this problem is when the Windows NT computer
name (that is, the name of the Windows NT computer as defined in the
Networks application in Control Panel) contains a hyphen "-" or "space"
character.
- What are the "job batch size" and "commit batch size"?
Job batch size is a parameter (-b) that can be configured for both a
LogReader and Distribution task to control the maximum number of jobs
processed by both tasks. For a LogReader task, this is the maximum number
of jobs (that is, transactions) that will be read out of the transaction
log and placed into the distribution database. This parameter is actually
called "transbatchsize." The default is 100 and is passed as a parameter to
sp_replcmds to extract, at most, this number of committed transactions from
the published database transaction log. The commit batch size (-c) is by
default the same as the job batch size for the LogReader unless explicitly
specified. It defines the maximum number of jobs that are applied to the
distribution database before a COMMIT TRAN is executed. Decreasing this
value from its default may help reduce deadlock contention between
LogReader and Distribution tasks, but could also affect performance of a
LogReader task.
The job batch size for the distribution task defines the maximum number of
jobs that will be read from the distribution database to execute on a
subscription server. The commit batch size defines the maximum number of
jobs that will be executed on a subscription server before a COMMIT TRAN is
executed on the subscription server.
An example might help clarify these terms. If the LogReader was configured
with a job batch size of 10, it would read at most 10 jobs (that is,
transactions) at a time from the published database transaction log and add
these to the distribution database. If the commit batch size is not
explicitly specified, all 10 jobs would be added to the distribution
database in one logical transaction (BEGIN/COMMIT block). However, if the
commit batch size were reduced to five, then five jobs at a time would be
committed in one logical transaction against the distribution database.
This same concept applies to the Distribution task against the subscription
server. Note that if the subscription server "republishes" the changes from
the Distribution server, what was originally 10 transactions from the
publication server may now appear to be one transaction, because the
Distribution task committed the 10 jobs as one logical transaction. Large
commit batch sizes from the Distribution task can result in very large
transactions to process for subscription servers that "republish."
Be careful in reducing these parameters to extremely low numbers if the
volume of replicated transactions are high. This may help in reducing
deadlock situations, but could decrease performance. Only proper testing
for your environment will indicate what the optimal setting should be. For
most users, leaving the defaults provide a good base to work from.
Additional query words:
rep admin sqlfaqtop
Keywords : kbusage SSrvAdmin SSrvRep
Version : winnt:6.0,6.5
Platform : winnt
Issue type : kbinfo
Last Reviewed: June 30, 1999