INFO: ADO/ASP Scalability FAQ
ID: q176056
|
The information in this article applies to:
-
Microsoft Visual InterDev, version 1.0
-
Microsoft Active Server Pages, versions 1.0, 1.0b
-
ActiveX Data Objects (ADO), version 1.0
SUMMARY
This article addresses two common ActiveX Data Objects (ADO) questions
regarding the proper approach to handling connections in an Active Server
Pages (ASP) page:
- Why is it a good idea to open and close ADO connections within a page
instead of hanging on to them in session variables?
- What about users with specific connection requirements, can they use
connection pooling in this case?
Here are the key concepts touched upon to answer the above questions:
- Description of the Web environment.
- Web site scaling and efficient use of server resources.
- Description of connection pooling.
MORE INFORMATION
Question 1
Why are we opening/closing ADO connections within a page instead of hanging
on to them in, let's say, session variables? That's what we do in all our
"traditional" client/server applications (holding connections I mean) and
that seems to be okay.
Answer 1
The question comes down to the fact that you are in a very different
environment when using ASP pages than you were in the traditional
client/server applications. And when you look at it, many large
client/server applications did do connection pooling anyway, especially if
they were built in modular ways. The problems with pegging down a
connection per client are similar between the traditional client/server
application and the Web server environment, but they are exacerbated by the
nature of the Web server and so this becomes more of an issue in a Web
environment.
In the Web server, everything is multi-threaded, and there is no concept of
a "connection" between the Web browsing client and the server. The session
ID is a cookie passed back and forth from browser to server (which if the
browser does not support cookies or has that specifically turned off in
their options, session state does not work anyway so there's one good
reason to avoid it for database connections!). The only way a session
becomes invalid is if the user just doesn't return to the server within the
timeout period, or if you make some explicit call from the script to nuke
the session. The result is there's no good way to know when the client does
not need that connection anymore.
In a Web server, the entire idea is to scale like crazy. A public Web
server could easily have 100,000 clients maintaining active sessions
(probability increases as session timeout goes up) and while an intranet
server may not have such a load, you would not want to do something in your
architecture that would prohibit that possibility.
So first off, you can't establish a connection per client if you plan on
scaling past a few thousand concurrent active sessions. You can take steps
to try and avoid that type of scaling, but why? You quickly realize that
you are a server process, and in scalability, the name of the game is
resource conservation. You've got a ratio of 100,000 : 1 (or may be
1,000,000 : 1) and you have to make it work. How? Resource pooling. This is
why Microsoft Transaction Server and Internet Information Server have put
so much energy into resource pooling. They want to scale to support a huge
transaction load.
So what happens if you do not pool? You have idle connections wasting
server and network resources. You also have some interesting threading
issues that can occur if multiple concurrent threads end up whacking on the
same connection (though the session ID might save you here, but it's
conceivable that a browser could submit two concurrent requests using the
same session ID and you could get into a world of hurt with transactions or
with SQL Server's inability to process more than one command at a time on a
given connection).
If you "open" the connection (really getting it from a pool), use it and
"close" it again (really returning it to the pool), the connection can
safely be handed to another thread processing a different command. If the
server load gets light, the connection pool is trimmed back automatically
and others using that server get better performance. If the server load
gets heavy, the pool can grow as needed. However, you'll probably notice
that you can support lots of clients with just a few connections. I would
bet that you can get a really high ratio of clients (sessions) to DB
connections if you keep your DB work focused and concentrated.
As I said above, anyone who has written a traditional client/server
application that was intended to be modular (you plug in new components or
services over time) quickly realized that connection pooling was a must.
Since services have to be autonomous, they have to create and use
connection. You would not want each client desktop creating 5 or 10
connections to the database, so you make a shared ConnectionPool component.
And that is just what the ODBC 3.0 Driver Manager is. And it's a really
good one at that!
So I hope this gives you some insight on why it's really necessary. If you
don't want to scale beyond a few thousand concurrent sessions, have lots of
money to spend on your database server, and don't think you'll ever get
simultaneous requests with the same session ID, then don't worry about it.
Just throw your connection in the session object. However, I would really
strongly suggest you just use connection pooling and get in the habit of
doing discreet work since this is the model you use when doing Microsoft
Transaction Server objects, and it's the model that will scale well in this
distributed world of ours.
Question 2
How does connection pooling work for users with specific connection
requirements? Pooling connections among anonymous users who are operating
against the same data objects using similar connection semantics is nice,
but it isn't as helpful when each user must be authenticated and should
only have access to a limited number of data objects based on their
identity (a messaging session comes to mind).
Can connection pooling be used in the latter case?
Answer 2
It does take authentication into consideration, and it won't hand out a
connection that has a different identity than the one that is being
requested (if you submit a connection string with "UID=me;PWD=foo", it
won't return a pooled connection that has a UID and PWD of something
different). This also works when using integrated security in that you
won't get a connection that was opened under a different security context
than you are currently running under.
You do raise a very good question though. If you want to use an
authenticated connection (integrated security) but want to keep the
scalability high, how do you do it?
Since there is no way to change the security context of an existing open
connection, connection pooling would only buy you the ability to reuse a
connection that was opened on the first site hit by that user. The pool
would keep it alive for a reasonable amount of time and if the user kept
interacting with the site, the connection would remain live. If they user
went away, it would eventually fall out of the pool and be closed. This
still offers some goodness since you code as if you're doing drop/reconnect
and the pool optimizes the case where the user comes back quickly, but
overcomes the problem of not knowing when the user really leaves by aging
the connection and killing it if the user doesn't return in the specified
timeout period. You could argue that using the Session object would be a
way of doing this too, but note that here you could have the connection
pool timeout set differently than the session timeout, giving the user the
ability to go to lunch and still have the session state there, but not
tying up a connection the whole time.
It might be a better approach to classify users into roles like "admin,"
"user," and "browser." Each role has a SQL Server login (UID and PWD), and
you map the current user to that role based on their name or whatever. They
are still authenticated by the Web server, but you don't use integrated
security to the DB and just have three or four different login identities.
This enables you to still have really good scaling (since the majority of
people will fall into one or two roles), and you can also enforce certain
security in the application logic when needed (such as, only allow me to
see my stock option history). This of course is a bit flawed in that the
role and extra security logic is application enforced rather than being in
the system, but sometimes you have to do things that the system should do,
but doesn't do well at the moment. It would be ideal if the system could
pick a connection out of the pool and change the security context for it,
but I don't think that's possible today.
REFERENCES
For the latest Knowledge Base articles and other support information on
Visual InterDev and Active Server Pages, see the following page on the
Microsoft Technical Support site:
http://support.microsoft.com/support/vinterdev/
Additional query words:
Keywords : kbnokeyword kbADO kbVisID kbGrpASP
Version : WINDOWS:1.0
Platform : WINDOWS
Issue type : kbinfo
Last Reviewed: May 11, 1999