Title of invention: Meta normalization for text
Abstract: A system and method for normalizing encoded text data, such as Unicode, that is extensible through metadata tagging without the use of character definition tables. First, metadata characters, which have no effect on the interpretation of the raw text data, are used to express the higher-order protocols of two encoded text strings. Next, if the two strings to be compared are not already in the same meta normal form, meta normal form conversion is performed on one or both of them. Finally, content equivalence is determined by comparing the characters of each string to each other. If a string contains a metadata character, that character is ignored for purposes of the equivalence comparison. The remaining characters represent the pure content of the string, e.g. characters without any particular glyph representation.
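The content equivalence step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the patented method: it assumes that metadata characters are modeled by the Unicode tag block (U+E0000 to U+E007F), which the abstract does not specify, and that both strings are already in the same meta normal form, so the conversion step is omitted. The names `is_metadata`, `pure_content`, `content_equivalent`, and `tag` are hypothetical.

```python
# Assumption: metadata characters are the Unicode tag block U+E0000..U+E007F.
# The abstract does not name a concrete range; this is for illustration only.
TAG_FIRST = 0xE0000
TAG_LAST = 0xE007F


def is_metadata(ch: str) -> bool:
    """Return True if ch falls in the assumed metadata (tag) range."""
    return TAG_FIRST <= ord(ch) <= TAG_LAST


def pure_content(text: str) -> str:
    """Drop metadata characters, leaving only the pure content."""
    return "".join(ch for ch in text if not is_metadata(ch))


def content_equivalent(a: str, b: str) -> bool:
    """Compare two strings while ignoring metadata characters.

    Assumes both strings are already in the same meta normal form.
    """
    return pure_content(a) == pure_content(b)


def tag(label: str) -> str:
    """Hypothetical helper: spell an ASCII label using tag characters."""
    return "".join(chr(TAG_FIRST + ord(c)) for c in label)


if __name__ == "__main__":
    plain = "file"
    # Same content, wrapped in a made-up "lig" metadata tag pair.
    tagged = tag("lig") + "file" + tag("/lig")
    print(content_equivalent(plain, tagged))
    print(content_equivalent("file", "fine"))
```

Because the comparison only filters by code point range, it needs no per-character definition table, which is in the spirit of the extensibility claim in the abstract.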
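Publication number: US6883007(B2)  Publication date: 2005.04.19
Application number: US20010931302  Filing date: 2001.08.16
Applicant: INTERNATIONAL BUSINESS MACHINES  Inventor: ATKIN STEVEN EDWARD
Classification: G06F17/22;(IPC1-7):G06F17/30;G06F17/00  Main classification: G06F17/22
Agency:  Agent:
Principal claim:
Address: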