Working with Linked
Servers by
Bill Wunder
Linked servers allow you to query
virtually any OLE DB data source from T-SQL running on a SQL
Server 2000 creating a tremendous backdrop for distributed data
architectures. If the OLE DB provider provides the necessary
interfaces you can also insert, update, and delete data at that
OLE DB data source. Reciprocally, SQL Server is also an OLE DB
data source that can be accessed through quite a variety of OLE
DB Providers including linked servers from other SQL Servers.
There really are quite a number
of interesting possible OLE DB data sources. The better known
include Excel, Access, Oracle, DB2, and ODBC. The less well
known include LDAP, Active Directory, Exchange, Outlook MySQL,
HTML and VSAM. That is certainly not an exhaustive list. In
reality give a good C/C++ programmer enough time and they can
turn just about anything that can be connected to a computer
into an OLE DB data source. There's a nice list of commercial
non-Microsoft OLE DB providers at
sqlsummit.com
and you may also want to consider this well discussed listing of
most current
Microsoft OLE DB
providers in the MSDN
Library and this discussion of
OLE DB Providers Tested with SQL Server
in Books Online where you will find examples of how to configure
a connection some common and interesting linked server OLE DB
providers. Microsoft provides the SQLOLEDB provider for
connection to a SQL Server 2000 as the data source.
Before linked servers, earlier
releases of SQL Server supported distributed queries to other
SQL Servers using "Remote Servers" via RPC. This required
configuration at both the data serving and the data requesting
SQL Server and had all the performance, scalability, and
security issues of the underlying RPC protocol. Remote Servers
are still available in SQL Server 2000 but provided "for
backward compatibility only". Looks like they are still there in
Yukon beta 1 but if you are using them now, it's a good idea to
begin migrating remote servers to linked servers.
Hopefully, in SQL Server 2005 we
will be able to use the native .Net providers when creating
linked servers and enjoy the many advantages of that provider.
Dino Esposito has a very nice online article in Visual
Studio magazine: "Create
More Efficient Database Code"
on the .Net provider and how it differs from the SQLOLEDB
provider. Since linked servers are not yet implemented in the
Yukon beta 1 release, I guess we'll just have to wait and see if
T-SQL linked servers will gain the new .Net powers and
efficiency.
So what is an OLE DB provider
anyway? According to the "OLE
DB Programmers Reference"
of Microsoft's MSDN library:
OLE DB architecture is built
upon the precept of an application accessing diverse data
stores through a small application built specifically for
that purpose. The application that uses OLE DB functionality
is called the consumer, while the one that accesses
the data by exposing OLE DB interfaces is called the
provider.
This leads us to an important
point when we talk about using linked servers in T-SQL queries.
Using a linked server makes the SQL Server where the linked
server call is executed the consumer of the provided data. That
means that we are asking a SQL Server to be a middle man for
data. The network loading and security issues of this can be
huge! I have seen many cases where a developer will use a linked
server to get data from a remote source rather than making a
connection to that source directly from his application. This is
just laziness - and not the good
fourth lazy form
kind of lazy either! - it is simply wrong. SQL Server is...,
well..., a server not a client application or a mid tier
transport utility. It should be obvious to anyone that moving
data from an OLE DB data source to a SQL Server to your
application is less efficient and creates a more complex
security model than moving data from an OLE DB data source to
your application.
OK, so what are Linked Servers
good for then? If you need to join data from two or more
different data sources linked servers are your friend. Even with
this usage linked servers can carry some performance and
security costs. Basically what you want to do is use the
principle of least
privilege and then make
sure the record sets transferred on a linked server connection
are small or make sure that the rows are moved across the linked
server connection as a "fire hose" or fast forward only cursor.
That's right! If you watch a linked server call that uses the
SQLOLEDB provider to get a multi-row data set from another SQL
Server in SQL Profiler trace on that remote server you will see
that the data is transferred using a
Server API cursor.
Just like cursors in any other context, they can slow you down.
There is more than one way to work with a query to produce a
"fire hose" from the linked server data source. One consistently
useful method is to insert the rows into a temp table and then
process the necessary join as a local query using the temp table
rather than joining across the network. Bottom line here is that
good testing will result in good results.
You can
configure linked
servers using Enterprise
Manager but - as with almost anything else you do in SQL Server
- I recommend using the system stored procedures to configure
linked servers. The system stored procedures are:
sp_addlinkedserver
- You'll find yet another listing of the OLE DB
provider names listed on the sp_addlinkedserver help
page in Books Online. Just remember that the
preferred provider to link to SQL Servers is
SQLOLEDB. Another interesting tip is that you do not
need to specify a catalog when defining a SQLOLEDB
linked server. The advantage is that the linked
server will not "break" if a database is dropped or
it's name is changed. That can be a very obscure
problem. The limitation is that this does require
the database name be provided in all linked server
queries. And that should be a best practice anyway.
sp_addlinkedsrvlogin
- There are a few
security
considerations
when creating linked servers that you must address
when creating logins. Using linked servers on
trusted connections without the benefit of Active
Directory Delegation can be a real pain. It's
definitely a security nightmare, but you can add [NT
AUTHORITYANONYMOUS LOGIN] as a trusted login at the
data source and grant permissions to this ambiguous
login to work around the delegation issue in an
environment without delegation. If a SQL login has
the same name but different passwords you'll have to
map that user as to the same user name with the
other password in the linked server login
configuration. That also means when you change the
password you have to manually update the password in
the linked server configuration. You can also map
users to other users in the linked server. Be
careful because that could get confusing in a hurry.
sp_droplinkedsrvlogin
- Keep in mind that when using system stored
procedures you must drop the logins to a linked
server before you can drop the linked server.
Enterprise Manager takes care of all of that for you
behind the scenes.
sp_dropserver
- Since linked servers are store in
master.dbo.sysservers along with remote servers
and even the local instance, use the same procedure
to remove any of them.
sp_linkedservers
or
sp_helpserver -
sp_helpserver will show you linked servers as well
as remote servers.
sp_serveroption
- This is an important one that can slip through the
cracks. Use this procedure to establish "data
access" - the ability to directly select, insert,
update, and delete - across the linked server, "rpc"
to call stored procedures on the link, and even
connection time out and query time out settings that
unfortunately default to unlimited in SQL Server
2000.
|
I use a
stored procedure that calls the
system stored procedures to assure that the linked servers
between the SQL Servers in my shop are all the same. I have also
experienced occasional corruption of a linked server so having a
stored procedure available that will quickly reproduce a linked
server has proven invaluable in crisis mode. I also take
advantage of the ability to alias linked servers. So for example
if I have an application stored procedure that must reference
another server I can create a linked server with the same name
in development, acceptance and production so that the
application stored procedure does not have to be edited when
promoted from development or acceptance.
Finally, you may be wondering how
are linked servers be used in T-SQL code. You can use linked
sever in fully qualified or four part named references (linked
server.database.owner.object) or you can use
OPENQUERY to execute
pass through queries. Using OPENQUERY can help you write those
fire hose queries I mentioned earlier without even needing to
think about it. Unfortunately, many developers find the
OPENQUERY syntax confusing and avoid using it.
To sum it up, linked servers can
be very useful, but are better used with small distributed data
sets that large data collections. My preference is that linked
server calls in applications are looking for a single row or
value on the remote data source. Of couse, as everyone likes to
remind me, I don't always get what I want.
Bill |