Q&A: SQL Server Driver 2.50 is DBCS-Enabled

ID: Q136269


The information in this article applies to:


SUMMARY

This article answers general questions and provides background information on how the SQL Server Driver is DBCS-enabled. The article is divided into the following sections:

  1. What is DBCS?
  2. What does "DBCS-enabled" imply?
  3. Questions and Answers


MORE INFORMATION

What is DBCS?

Double-byte Character Set (DBCS) is a character encoding mechanism that accommodates the ideographic characters used in Far Eastern languages. A Single-byte Character Set (SBCS) stores each character in one byte and can therefore represent at most 256 characters. Characters in a DBCS can instead be addressed using a 16-bit (two-byte, or double-byte) notation. With 16-bit notation, you can represent up to 65,536 (2^16) characters.

DBCS code pages contain both single-byte and double-byte characters. The DBCS single-byte characters conform to the 8-bit national standards for each country and correspond closely to the ASCII character set.

In a double-byte character set, certain ranges of code points are designated as lead bytes. A lead byte, together with the byte that follows it, represents a single character. This second byte is called the trailing byte or trail byte. Each DBCS has a different set of lead-byte ranges and trail-byte ranges. Unlike lead bytes, trail bytes in some DBCS can overlap with the 7-bit ASCII character set.

For example, the Shift JIS (Japanese Industrial Standard) character set has trail-byte ranges of 0x40-0x7E and 0x80-0xFC. That means a byte holding the value 0x7D can represent the second half of a Kanji character, not necessarily a close brace character (}).
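
The overlap can be seen by encoding a double-byte character and looking at its second byte. The following Python sketch is only an illustration: it assumes Python's built-in "shift_jis" codec, and the character U+30DE (a Katakana letter) is chosen here because its trail byte happens to be 0x7D. Neither comes from the original article.

```python
# Sketch: encode one double-byte character in Shift JIS and inspect its bytes.
# U+30DE (KATAKANA LETTER MA) is an illustrative choice, not from the article.
ch = "\u30de"
encoded = ch.encode("shift_jis")

lead, trail = encoded[0], encoded[1]
print(f"lead byte  = 0x{lead:02X}")
print(f"trail byte = 0x{trail:02X}")

# The trail byte has the same value as the ASCII close brace character.
print(trail == ord("}"))
```

A program that scans this buffer one byte at a time would see a close brace that is not really there.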

What does "DBCS-enabled" imply?

If a program is described as DBCS-enabled, the following conditions are true when it runs on a DBCS platform:
  1. It can distinguish a trail byte from an ASCII character. For example, when it runs on Japanese versions of Windows or Windows NT, it can determine whether 0x7D is the trail byte of a Kanji character or a close brace.


  2. It differentiates character-based semantics from byte-based semantics. For example, a function such as "CharCount" should return the number of characters in a DBCS string rather than the number of bytes, and a function such as "CharNext" should move to the next character rather than the next byte.
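
Both conditions can be sketched as a lead-byte-aware scan. The Python example below is a minimal illustration, not the driver's implementation. It assumes the Shift JIS lead-byte ranges used by code page 932 (0x81-0x9F and 0xE0-0xFC). The names is_lead_byte, char_next, char_count, and find_single_byte are hypothetical helpers written for this sketch. A real Windows program would normally ask the platform (for example, through IsDBCSLeadByte) rather than hard-code the ranges.

```python
# Assumed lead-byte ranges for Shift JIS (code page 932).
LEAD_BYTE_RANGES = ((0x81, 0x9F), (0xE0, 0xFC))


def is_lead_byte(b):
    """Return True if byte value b starts a double-byte character."""
    return any(lo <= b <= hi for lo, hi in LEAD_BYTE_RANGES)


def char_next(buf, i):
    """Return the index of the character that follows the one at index i."""
    if is_lead_byte(buf[i]) and i + 1 < len(buf):
        return i + 2  # lead byte plus trail byte form one character
    return i + 1      # single-byte character


def char_count(buf):
    """Count characters rather than bytes."""
    count, i = 0, 0
    while i < len(buf):
        i = char_next(buf, i)
        count += 1
    return count


def find_single_byte(buf, value):
    """Return the indexes where value is a real single-byte character,
    skipping positions where it is only the trail byte of another character."""
    hits, i = [], 0
    while i < len(buf):
        nxt = char_next(buf, i)
        if nxt == i + 1 and buf[i] == value:
            hits.append(i)
        i = nxt
    return hits


# "a", then U+30DE (encoded as 0x83 0x7D), then a real close brace (0x7D).
data = "a\u30de}".encode("shift_jis")
print(len(data), char_count(data))          # byte count, character count
print(data.count(b"}"))                     # naive byte search
print(find_single_byte(data, ord("}")))     # lead-byte-aware search
```

The naive byte search reports two close braces, while the lead-byte-aware scan reports only the one that is an actual character.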


Questions and Answers

The following answers are based on connections to the English version of Microsoft SQL Server version 6.0.
  1. CAN I PUT A DBCS STRING INTO CHAR OR VARCHAR COLUMNS? CAN I RETRIEVE A DBCS STRING FROM THE SQL SERVER AND DISPLAY IT?

    Yes. The English version of SQL Server 6.0 provides no DBCS code pages, so when a driver connects to it, the server treats any DBCS string as a sequence of characters in one of three code pages: ISO 8859-1, CP850, or CP437. The code page is selected during SQL Server installation. No data is lost during insertion or retrieval.

    In order to display DBCS strings, however, your client application should run on a DBCS platform, such as the Japanese version of Windows. As soon as you fetch a DBCS string from the SQL Server, the Japanese version of Windows can display these characters for you.
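
The reason no data is lost can be sketched as a byte-level round trip. The Python example below is an illustration under stated assumptions: it uses Python's "shift_jis", "iso-8859-1", "cp850", and "cp437" codecs to stand in for the client and server code pages, and it does not contact a SQL Server. The sample string is arbitrary.

```python
# Three double-byte characters, encoded the way a Japanese client would send them.
original = "\u65e5\u672c\u8a9e"
client_bytes = original.encode("shift_jis")

for server_code_page in ("iso-8859-1", "cp850", "cp437"):
    # The server sees one single-byte character for each byte it receives.
    as_seen_by_server = client_bytes.decode(server_code_page)
    # Returning those same characters reproduces the original bytes.
    returned_bytes = as_seen_by_server.encode(server_code_page)
    print(server_code_page, len(as_seen_by_server), returned_bytes == client_bytes)

# A client on a DBCS platform can interpret the returned bytes as Shift JIS again.
print(client_bytes.decode("shift_jis") == original)
```

The server counts six characters where the client sees three, but the bytes themselves pass through unchanged.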


  2. CAN I USE A DBCS CHARACTER OR STRING IN A LIKE CLAUSE?

    Yes. Because the driver is DBCS-enabled, it parses trail bytes correctly. For example, it does not interpret a trail byte that has the same value as the percent sign (%) or underscore character (_) as a wildcard, and it ignores trail bytes that have the same value as the single quotation mark (') or close brace character (}).



    ODBC provides two wildcards in a LIKE clause: the percent sign matches zero or more of any character, and the underscore character matches any one character. When you connect to the English version of SQL Server version 6.0, the underscore character actually matches one byte, because the server treats the data as single-byte characters. Matching one double-byte character therefore takes two underscores.
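
The byte-level behavior of the underscore can be modeled with a small matcher. The Python sketch below is only a model of the behavior described above, not SQL Server's implementation. The helpers like_to_bytewise_regex and bytewise_like are hypothetical, they ignore any ESCAPE clause, and they assume the value is Shift JIS encoded.

```python
import re


def like_to_bytewise_regex(pattern):
    """Translate a LIKE pattern (bytes) into a regex in which '_' matches
    exactly one byte and '%' matches any run of bytes."""
    parts = []
    for b in pattern:
        if b == ord("%"):
            parts.append(b".*")
        elif b == ord("_"):
            parts.append(b".")
        else:
            parts.append(re.escape(bytes([b])))
    return re.compile(b"".join(parts), re.DOTALL)


def bytewise_like(value, pattern):
    """Return True if the whole value matches the pattern byte by byte."""
    return like_to_bytewise_regex(pattern).fullmatch(value) is not None


value = "\u65e5".encode("shift_jis")   # one character stored as two bytes

print(bytewise_like(value, b"_"))      # one underscore covers only one byte
print(bytewise_like(value, b"__"))     # two underscores cover both bytes
print(bytewise_like(value, b"%"))      # percent sign covers any number of bytes
```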


  3. CAN I USE DBCS CHARACTERS TO NAME MY TABLES, COLUMNS AND OTHER OBJECTS?

    Yes, because SQL Server 6.0 treats any DBCS characters as characters in one of its SBCS code pages. Remember to use double quotation marks to enclose your DBCS identifier, in order to avoid syntax error messages from the SQL Server.
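
Enclosing an identifier in double quotation marks can be sketched as follows. This Python example only builds the statement text and does not run it. The helper quote_identifier and the table and column names are hypothetical, and doubling an embedded double quotation mark is assumed from the usual SQL convention rather than taken from this article.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotation marks, doubling any
    embedded double quotation mark (assumed SQL convention)."""
    return '"' + name.replace('"', '""') + '"'


table = quote_identifier("\u9867\u5ba2")     # hypothetical DBCS table name
column = quote_identifier("\u540d\u524d")    # hypothetical DBCS column name
statement = f"CREATE TABLE {table} ({column} VARCHAR(40))"
print(statement)
```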


  4. HOW DO YOU DEFINE SORT ORDERS FOR DBCS IN SQL SERVER?

    Currently, the English version of SQL Server 6.0 has some predefined sort orders based on Single-byte Character Sets. There is no easy way to plug a customized DBCS-based sort order into the current SQL Server. As previously mentioned, the server treats any DBCS characters as characters in the code page it is currently using.


  5. I AM TOLD THAT DBCS ISSUES WILL BE ADDRESSED IN THE ODBC 3.0 TIME FRAME. SINCE THE SQL SERVER DRIVER 2.50 HAS ALREADY BEEN DBCS-ENABLED, WHAT WILL BE NEW IN ODBC 3.0?

    ODBC 3.0 will address DBCS issues from the specification's perspective. For example, in Kyle Geiger's book, "Inside ODBC," Chapter 9, section "ODBC 3.0", page 453, you can see two fields in a descriptor record: LENGTH and OCTET_LENGTH. Here, LENGTH specifies the number of characters in the column and OCTET_LENGTH gives the length of the column in bytes.
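
The difference between the two fields can be illustrated with a mixed string. In the Python sketch below, the variable names length and octet_length only mirror the descriptor fields for illustration and are not ODBC calls. The sample value and the use of Shift JIS are assumptions.

```python
# Three double-byte characters followed by three single-byte characters.
value = "\u65e5\u672c\u8a9eSQL"
encoded = value.encode("shift_jis")

length = len(value)            # number of characters, analogous to LENGTH
octet_length = len(encoded)    # number of bytes, analogous to OCTET_LENGTH
print(length, octet_length)
```

For SBCS data the two values are always equal, which is why the distinction only becomes visible with DBCS data.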


Additional query words: sql6 prog QA odbcfaq


Keywords          : kbprg SSrvProg 
Version           : 2.5.121 6.0
Platform          : WINDOWS 
Issue type        : 

Last Reviewed: April 20, 1999