PRB: COM Causes Delay on Multi-homed Computers

ID: Q185012


The information in this article applies to:


SYMPTOMS

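Using COM in multi-homed scenarios might cause a delay in the following situations:

  1. A client activates a COM server on a multi-homed computer (Case 1).

  2. A client unmarshals an interface pointer to an object on a multi-homed computer (Case 2).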

In either case, if the first IP address in the IP bindings of the server machine is not reachable from the client machine, then there will be a delay, under the conditions described below.

Because the client OXID Resolver (OR) process and the client process itself cache OXIDs and the binding handles to reach them, the delay does not always occur. It occurs only if the OXID cannot be found in either the client process's OXID cache or the OR's cache, in which case the client's OR must resolve the OXID. Doing this may entail making remote calls on each of the IP addresses in the OXID's bindings.
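The lookup order described above can be sketched as follows. This is only an illustrative model, not the actual OR implementation: the function name resolve_oxid, the dictionary-based caches, and the remote_resolve callback are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Bindings = List[str]  # illustrative: a list of IP address strings


def resolve_oxid(
    oxid: int,
    process_cache: Dict[int, Bindings],
    or_cache: Dict[int, Bindings],
    remote_resolve: Callable[[int], Bindings],
) -> Bindings:
    """Return bindings for oxid, going remote only when both caches miss."""
    if oxid in process_cache:
        # 1. Hit in the client process's own cache: no delay.
        return process_cache[oxid]
    if oxid in or_cache:
        # 2. Hit in the client machine's OR cache: no delay.
        process_cache[oxid] = or_cache[oxid]
        return process_cache[oxid]
    # 3. Double miss: remote calls are needed, so the delay can occur here.
    bindings = remote_resolve(oxid)
    or_cache[oxid] = bindings
    process_cache[oxid] = bindings
    return bindings
```

In this model, a second process on the same client machine that shares or_cache resolves the same OXID without calling remote_resolve again.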

In addition, in Windows NT 4.0 SP3, the IP addresses in the returned bindings might be mangled, due to a bug in RPC over UDP. For additional information about this bug and its resolution, please see the following article in the Microsoft Knowledge Base:

Q183930 FIX: IP Is Mangled When Using UDP on Multi-homed Computers

NOTE: This bug can cause delays or errors in activation as well as interface pointer unmarshaling, independent of the reasons discussed in this article.


CAUSE

The delay in the two cases listed in the SYMPTOMS section occurs during the client's OXID resolution. OXID resolution involves making calls to the server machine using either IOXidResolver:ResolveOxid2 or IOXidResolver:ServerAlive. For further information about OXID resolution, refer to the DCOM Wire Protocol at:

http://msdn.microsoft.com/isapi/msdnlib.idc?theURL=/library/specs/distributedcomponentobjectmodelprotocoldcom10.htm


Activation calls are optimized to carry out OXID resolution during the call itself. However, if there are multiple bindings for the server's OXID, the client's OXID Resolver (OR) process tests the reachability of these bindings by making calls on them (ServerAlive). During interface pointer unmarshaling, the client's OR needs to call the server's OR to resolve the server OXID (ResolveOxid2) followed potentially by reachability tests (ServerAlive).

In Windows NT 4.0, the OR makes the remote calls sequentially. If the first IP address is not reachable, the call has to time out before the OR moves to the next IP address. The timeout period varies depending on the transport used. For UDP, which is the default DCOM transport for Windows NT, the timeout is 32 seconds.
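To make the cost of sequential probing concrete, the following sketch models it. The function name probe_sequentially, the is_reachable callback, and the sample addresses are assumptions made for illustration. Only the 32-second UDP timeout comes from this article.

```python
UDP_TIMEOUT_SECONDS = 32  # from this article; other transports differ


def probe_sequentially(bindings, is_reachable, timeout_seconds=UDP_TIMEOUT_SECONDS):
    """Return (first reachable binding or None, seconds lost to timeouts).

    Models an OR that tries each binding in order and must wait for a
    full timeout on every unreachable binding before trying the next.
    """
    waited = 0
    for address in bindings:
        if is_reachable(address):
            return address, waited
        waited += timeout_seconds
    return None, waited


# Hypothetical addresses: the first one is on a network the client cannot reach.
result = probe_sequentially(
    ["10.0.0.5", "192.168.1.5"],
    lambda address: address == "192.168.1.5",
)
print(result)  # ('192.168.1.5', 32)
```

Under this model, each unreachable binding listed ahead of the reachable one adds one full timeout period to the call.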


RESOLUTION

Case 1

The reason for the delay during activation is that the OXID information related to the server process that is returned to the client contains multiple bindings (that is, IP addresses) and the first binding is not reachable from the client. The solution is twofold:

  1. Make sure that the server does not return multiple bindings, but *only one* binding that corresponds to the reachable IP address.

You can instruct RPC in the server process to bind to a specific Network Interface Card (NIC). The server's OXID bindings returned in the activation response packet will then contain only the IP address corresponding to the specified NIC, so calls from the client can be assured of reachability.

To do this, create the following key in the registry (if it does not already exist). Use Regedt32.exe to create the key.

WARNING: Using Registry Editor incorrectly can cause serious, system-wide problems that may require you to reinstall Windows NT to correct them. Microsoft cannot guarantee that any problems resulting from the use of Registry Editor can be solved. Use this tool at your own risk.

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Rpc\Linkage

Now create a REG_MULTI_SZ named value under this key with the name Bind. Specify the NIC name that is reachable from the client as the value of the Bind named value. You can obtain the NIC name by running a tool such as ipconfig.

You need to restart the server machine for this change to take effect. It is also recommended that you restart the client machine to discard any cached binding handles.

NOTE: For this technique to work, the DCOM protocol must be TCP/IP.
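For readers who prefer to script the registry change rather than use Regedt32.exe, a minimal sketch follows. It assumes a Windows computer with Python and administrative rights. The function name set_rpc_bind is hypothetical, and the NIC name you pass is whatever name applies on your server; this article does not specify its format. Only the key path, the value name Bind, and the REG_MULTI_SZ type come from the text above. The WARNING about editing the registry applies equally here.

```python
RPC_LINKAGE_KEY = r"System\CurrentControlSet\Services\Rpc\Linkage"
BIND_VALUE_NAME = "Bind"


def set_rpc_bind(nic_names):
    """Write the Bind REG_MULTI_SZ value under the Rpc\\Linkage key.

    nic_names is a list of NIC names reachable from the client
    (placeholder values; use the names reported on your server).
    """
    if not nic_names:
        raise ValueError("at least one NIC name is required")

    import winreg  # Windows-only standard module, imported lazily

    # CreateKeyEx opens the key, creating it if it does not already exist.
    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, RPC_LINKAGE_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        # winreg takes REG_MULTI_SZ data as a list of strings.
        winreg.SetValueEx(
            key, BIND_VALUE_NAME, 0, winreg.REG_MULTI_SZ, list(nic_names)
        )
```

As with the manual procedure, the server must be restarted before the change takes effect.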

  2. Make sure that the server returns bindings in the "right" order. That is, the first binding is the one that is reachable from the client.

The bindings for a COM server process and for the OXID Resolver process are formed by calling gethostbyname (in Windows NT 4.0). The order in which this API returns IP addresses is usually not controllable in Windows NT 4.0. However, the following two articles in the Microsoft Knowledge Base explain how this order can be either changed or made the same as the adapter binding order for the TCP/IP protocol:

Q171320 How to Change the IP Address List Order Returned
Q164023 Fix for Gethostbyname() IP Address Order on Local Multihomed Mac

You need to restart the server computer for this change to take effect. It is also recommended that you restart the client computer to discard any cached binding handles.

Follow the instructions in the previously mentioned Knowledge Base articles. The end result should be that when you call gethostbyname() on the multi-homed computer, the API returns the "reachable" IP address first. Write a test program that calls the API and verify your results.
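A test program of this kind might look like the following sketch. It uses Python's socket.gethostbyname_ex, which wraps the platform's host name lookup, rather than calling the Winsock API directly, so treat it as an approximation of what COM sees. The function names and the resolver parameter are assumptions added for illustration and to allow testing without a network.

```python
import socket


def address_order(hostname=None, resolver=socket.gethostbyname_ex):
    """Return the IPv4 addresses for hostname in the order the resolver lists them.

    With hostname=None, the local computer's own host name is looked up.
    """
    name = hostname if hostname is not None else socket.gethostname()
    _canonical_name, _aliases, addresses = resolver(name)
    return list(addresses)


def reachable_address_is_first(reachable_ip, hostname=None,
                               resolver=socket.gethostbyname_ex):
    """Return True if reachable_ip is the first address the resolver returns."""
    addresses = address_order(hostname, resolver)
    return bool(addresses) and addresses[0] == reachable_ip
```

Run it on the multi-homed server itself, for example by printing address_order() and checking that the address the client can reach is listed first.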

NOTE: These fixes usually apply when you have physically multi-homed computers, not when the computer acquires a second IP address through a RAS or PPTP connection.

Case 2

This case is harder to resolve. When an interface pointer is marshaled from a machine, the marshaled object reference packet contains the bindings of the OR process so that clients may contact the OR for OXID resolution. The OR is also the endpoint mapper process for RPC servers running on the machine, and because any RPC server can potentially bind to all NICs, the OR must bind to all NICs. Hence, the marshaled object reference packet will always contain all the IP addresses of the machine.

One way to resolve this is to take advantage of the binding handle caching performed by the OR: have a dummy client process hold a reference to a dummy object on the server machine. As long as a client holds a reference to an object on the server machine, the server OR's reachable binding handle is cached in the client machine's OR. Any client process on the client machine can hold a reference to any object on the multi-homed server machine; these need not be the actual client process and the actual server process. Because of this caching, all subsequent OXID resolutions from other client processes to other objects on the multi-homed machine happen at normal speed. Note, however, that there is an upper limit to the number of entries in the cache.

Additional query words:


Keywords          : kbnetwork kbAPI kbNTOS400 kbRPC kbSDKPlatform kbWinOS95 kbGrpNet 
Version           : WINDOWS:95; winnt:4.0
Platform          : WINDOWS winnt 
Issue type        : kbprb 

Last Reviewed: June 22, 1999