Documents with Korean and Chinese Text Are Not Indexed Correctly

ID: Q215499


The information in this article applies to:

Microsoft Site Server version 3.0


SYMPTOMS

The Korean and Chinese wordbreakers do not work. As a result, documents that contain Korean or Chinese text are not indexed correctly.


CAUSE

This problem occurs because the Korean and Chinese wordbreakers are not installed.


RESOLUTION

To resolve this problem, install the latest service pack for Site Server 3.0.


STATUS

Microsoft has confirmed this to be a problem in Microsoft Site Server version 3.0. This problem has been corrected in the latest U.S. service pack for Microsoft Site Server version 3.0. For information on obtaining the service pack, query on the following word in the Microsoft Knowledge Base (without the spaces):

S E R V P A C K


MORE INFORMATION

In addition to applying the service pack, you must modify the Schema.txt file. This change must be made manually so that a previously modified version of the Schema.txt file is not overwritten. Modify the Schema.txt file in the \Microsoft Site Server\Data\Search\Config directory by inserting the following text immediately before the end tag </site_server_search_schema>:


<stoplist
    language="Chinese_Simplified"
    file="noise.chs"
    primarylanguage=4
    sublanguage=2>

<stoplist
    language="Chinese_Traditional"
    file="noise.cht"
    primarylanguage=4
    sublanguage=1>

<stoplist
    language="Korean_Default"
    file="noise.kor"
    primarylanguage=18
    sublanguage=1>
 
These entries enable the correct noise word lists for Korean (KOR), Traditional Chinese (CHT), and Simplified Chinese (CHS), which makes indexing more efficient and queries more meaningful.
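
The manual edit described above can also be sketched as a small script. The following Python example is only an illustration, not part of the service pack: the function name add_stoplists and the check for an existing noise.kor entry are assumptions made for this sketch. It works on the text of Schema.txt and places the three stoplist entries immediately before the end tag.

```python
END_TAG = "</site_server_search_schema>"

# Stoplist entries copied from this article.
STOPLIST_ENTRIES = """\
<stoplist
    language="Chinese_Simplified"
    file="noise.chs"
    primarylanguage=4
    sublanguage=2>

<stoplist
    language="Chinese_Traditional"
    file="noise.cht"
    primarylanguage=4
    sublanguage=1>

<stoplist
    language="Korean_Default"
    file="noise.kor"
    primarylanguage=18
    sublanguage=1>

"""


def add_stoplists(schema_text: str) -> str:
    """Return schema_text with the stoplist entries placed before the end tag.

    Hypothetical helper for illustration only.
    """
    # Simplified check (an assumption): if the Korean entry is already
    # present, treat the file as modified and return it unchanged.
    if 'file="noise.kor"' in schema_text:
        return schema_text
    position = schema_text.rfind(END_TAG)
    if position == -1:
        raise ValueError("end tag not found: " + END_TAG)
    return schema_text[:position] + STOPLIST_ENTRIES + schema_text[position:]
```

To apply the sketch to a file, read Schema.txt, pass its text to add_stoplists, and write the result back after saving a backup copy. The text encoding of Schema.txt is not stated in this article, so confirm it before rewriting the file.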

For existing Search catalogs, make the same modification to the Schema.txt file in the \Microsoft Site Server\Data\Search\Projects\Catalog Name\Build directory, where Catalog Name is the name of the catalog. After you make this modification, you must perform a full build of the catalog for the change to take effect.
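
When several catalogs exist, it may help to list every per-catalog Schema.txt file before editing. The following Python sketch assumes the directory layout described in this article (Projects, then the catalog name, then Build). The function name catalog_schema_files and the example path are hypothetical, and the actual installation drive and folder may differ.

```python
from pathlib import Path


def catalog_schema_files(search_data_dir):
    """List the Schema.txt file in each catalog's Build directory.

    search_data_dir is the Search data directory, for example
    r"C:\Microsoft Site Server\Data\Search" (an assumed location).
    """
    projects = Path(search_data_dir) / "Projects"
    # One match per catalog that has a Build\Schema.txt file.
    return sorted(projects.glob("*/Build/Schema.txt"))
```

Each path returned identifies a file that needs the stoplist entries. A full build of the corresponding catalog is still required afterward.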

Additional query words:


Keywords          : SS3SP2Fix 
Version           : winnt:3.0
Platform          : winnt 
Issue type        : kbbug 

Last Reviewed: March 29, 1999