Search HTML Filter Ignores UTF-8 Character Encoding
ID: Q188340
Search does not index text on HTML pages that have been UTF-8 encoded.
The HTML filter that ships with Site Server 3.0 is not capable of handling UTF-8 character encoding.
To resolve this problem, apply the latest Site Server 3.0 service pack.
Microsoft has confirmed this to be a problem in Site Server version 3.0. This problem has been corrected in the latest U.S. service pack for Microsoft Site Server version 3.0. For information about obtaining the service pack, query on the following word in the Microsoft Knowledge Base (without the spaces):
S E R V P A C K
The HTML filter has been updated to support UTF-8 encoding. Also, the
language and codepage tables have been updated.
UTF-8 is not automatically detected. Only documents explicitly tagged with:
<meta http-equiv=content-type content="text/html; charset=utf8">
are interpreted as UTF-8.
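The tag-based rule above can be illustrated with a short Python sketch. This is not the Site Server filter's code. The function names (declared_charset, is_tagged_utf8, decode_for_indexing), the acceptance of the "utf-8" spelling alongside "utf8", and the cp1252 fallback codepage are all assumptions made for illustration. The sketch only shows the general idea of trusting an explicit meta tag rather than sniffing the bytes.

```python
import re

# Illustrative only: a hypothetical check, not the Site Server HTML filter.
_META_TAG = re.compile(rb"<meta\b[^>]*>", re.IGNORECASE)
_HTTP_EQUIV = re.compile(rb"http-equiv\s*=\s*[\"']?content-type\b", re.IGNORECASE)
_CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE)


def declared_charset(html_bytes):
    """Return the charset named in a content-type meta tag, or None."""
    for tag in _META_TAG.finditer(html_bytes):
        text = tag.group(0)
        if _HTTP_EQUIV.search(text):
            match = _CHARSET.search(text)
            if match:
                return match.group(1).decode("ascii").lower()
    return None


def is_tagged_utf8(html_bytes):
    """True only when the page explicitly declares UTF-8 (no auto-detection)."""
    # Assumption: accept both the "utf8" spelling from the article and "utf-8".
    return declared_charset(html_bytes) in ("utf8", "utf-8")


def decode_for_indexing(html_bytes, fallback="cp1252"):
    """Decode as UTF-8 only if tagged; otherwise use an assumed fallback codepage."""
    encoding = "utf-8" if is_tagged_utf8(html_bytes) else fallback
    return html_bytes.decode(encoding, errors="replace")
```

With this sketch, a page carrying the meta tag is decoded as UTF-8, while the same bytes without the tag are decoded with the fallback codepage, so multi-byte characters come out garbled and would not match a search for the original text.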
Additional query words: sp prodsrch prodsitesrv3
Keywords : prodsitesrv3 prodsrch SS3Sp1Public Ss3Sp1fix
Version : WINNT:3.0
Platform : winnt
Issue type : kbbug
Last Reviewed: July 15, 1999