PRB: SQL Server ODBC Driver Converts Language Events to Unicode

ID: Q234748


The information in this article applies to:


SYMPTOMS

When using the MDAC 2.1 or later version of the SQL Server ODBC driver (version 3.70.0623 or later) or the OLEDB provider (version 7.01.0623 or later), under some circumstances you may experience translation of character data from the client code page to the server code page, even when Autotranslation is disabled for the connection.


CAUSE

Autotranslation is not the only mechanism that can result in code page conversion. The SQL Server 7.0 ODBC driver and OLEDB provider introduce a new behavior when connecting to MSDE 1.0, SQL Server 7.0, or later versions of either. All SQL statements sent as a language event are converted to Unicode on the client before being sent to the server. The end effect of this is similar to an Autotranslation of all data flowing from the client to the server through a language event, regardless of the current Autotranslation setting for the connection. This will not introduce any difficulties except when trying to store non-translated character data from a code page other than SQL Server's code page.


WORKAROUND

Do not store code page X data in a code page Y SQL Server (for example, code page 950 data in a code page 1252 server). While this was possible in some circumstances with previous versions of SQL Server, it has always been unsupported. To a 1252 SQL Server, anything but a 1252 character is not valid character data. Non-Unicode character data from a different code page will not be sorted correctly, and in the case of dual-byte (DBCS) data SQL Server will not recognize character boundaries correctly. The best choice for the SQL Server's code page is the code page of the clients that will be accessing the server.

The server and client may have different code pages, but you must ensure that Autotranslation is enabled on the client so that you get proper translation of data to and from the server’s code page in all cases.

If your server must store data from multiple code pages, the supported solution is to store the data in Unicode columns (NCHAR/NVARCHAR/NTEXT).

If your situation requires that you store code page X data in a code page Y SQL Server, there are only two ways to do this reliably:


MORE INFORMATION

Autotranslation (that is, the “Perform Translation for character data" checkboxes in newer ODBC applications) converts character data from the client code page to the server code page before sending the data to the server, using Unicode as a translation medium. However, the 3.7 SQL Server ODBC driver also converts all SQL statements sent as a language event to Unicode before placing them on the wire, which has an effect that is similar to Autotranslation but is not governed by the Autotranslation setting. In contrast, character data flowing from the server back to the clients respect the Autotranslation flag; if Autotranslation is turned off the data arrives at the client application with the same character codes as the data had on the server. Similarly, translation of data for client-to-server RPC events can be disabled by turning off Autotranslation. A simple script that demonstrates how this behavior affects language events follows. This example was run from Query Analyzer on a code page 1252 client connecting to a code page 437 server:


    -- Turn Autotranslation off here.
    USE tempdb
    GO
    CREATE TABLE t1 (c1 int, c2 char(1))
    GO
    
    -- Enter a yen character, using the keystroke ALT-0165.
    INSERT INTO t1 VALUES (1, '¥') 
    SELECT c1, c2, ASCII (c2) FROM t1 

        c1          c2               
        ----------- ---- ----------- 
        1               157
        
        (1 row(s) affected)
 

Note the following about the preceding example:



    -- Turn Autotranslation back on before running the following batch.
    INSERT INTO t1 VALUES (2, '¥')
    SELECT c1, c2, ASCII (c2) FROM t1 

        c1          c2               
        ----------- ---- ----------- 
        1           ¥    157
        2           ¥    157
        
        (2 row(s) affected)
 
In this case, turning Autotranslation back on had no effect on the translation from the client to the server (that is, the same correct translation from character code 165 to character code 157 happened), but it did have an effect on the data retrieved from the server. Note that when the SELECT statement is run this time (with Autotranslation on), the yen symbols display correctly in the code page 1252 application because they have been translated from character code 157 back to character code 165 by the Autotranslation mechanism.

You will see this behavior (conversion of language events to Unicode on the client) when using any SQL Server ODBC driver version 3.70 or later and connecting to SQL Server 7.0 or later. It will not occur when using older ODBC drivers, or when using the 3.7 driver to connect to SQL Server 6.5 or earlier. In addition, if you are storing your data in Unicode columns (NCHAR/NVARCHAR/NTEXT) the conversion will not be an issue.

Additional query words: cp cs dbcs OemToAnsi AutoAnsiToOem nls charset character set SQLExecDirect SQLExecute


Keywords          : kbSQLServ700 
Version           : WINDOWS:1.0,3.7; winnt:7.0
Platform          : WINDOWS winnt 
Issue type        : kbprb 

Last Reviewed: August 2, 1999