Title of invention |
Method and apparatus for mapping multiword expressions to identifiers using finite-state networks |
Abstract |
Multiword expressions are mapped to identifiers using finite-state networks. Each of a plurality of multiword expressions is encoded into a regular expression. Each regular expression encodes a base form common to a plurality of derivative forms defined by ones of the multiword expressions. Each of the plurality of regular expressions is compiled with factorization into a set of finite-state networks. A union of the finite-state networks in the set of finite-state networks is performed to define a multiword finite-state network and a set of subnets. The multiword finite-state network and the set of subnets are traversed to identify a path corresponding to one of the plurality of multiword expressions, wherein only transitions originating from the multiword finite-state network are accounted for to ascertain a path number identifying a base form of the one of the plurality of multiword expressions.
|
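The path-numbering idea in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: a plain trie stands in for the compiled and unioned finite-state networks, each subnet is modeled as a simple set of surface forms, and every name here (`SUBNETS`, `EXPRESSIONS`, `Node`, `build`, `lookup`) is hypothetical. The sketch assumes a token matches at most one arc per state and does no backtracking. Only arcs of the main network contribute to the path number, so all derivative forms of an expression map to the identifier of its base form.

```python
# Hypothetical subnets: each label stands for a set of derivative surface forms.
SUBNETS = {
    "$KICK": {"kick", "kicks", "kicked", "kicking"},
    "$TAKE": {"take", "takes", "took", "taken", "taking"},
}

# Hypothetical base forms, each a sequence of literal tokens or subnet labels.
EXPRESSIONS = [
    ("$KICK", "the", "bucket"),
    ("$TAKE", "into", "account"),
    ("$TAKE", "place"),
    ("by", "and", "large"),
]


class Node:
    """One state of the main network (a trie in this sketch)."""

    def __init__(self):
        self.arcs = {}      # label -> Node
        self.final = False  # True if a base form ends at this state
        self.count = 0      # number of accepting paths reachable from here


def _count(node):
    """Fill in, bottom-up, how many accepting paths start at each state."""
    node.count = int(node.final) + sum(_count(n) for n in node.arcs.values())
    return node.count


def build(expressions):
    """Union all base forms into one main network and precompute path counts."""
    root = Node()
    for expr in expressions:
        node = root
        for label in expr:
            node = node.arcs.setdefault(label, Node())
        node.final = True
    _count(root)
    return root


def lookup(root, tokens):
    """Return the path number of the base form matching tokens, or None."""
    node, number = root, 0
    for tok in tokens:
        if node.final:
            number += 1  # the shorter path ending here is numbered first
        for label in sorted(node.arcs):
            child = node.arcs[label]
            # A subnet label matches any of its surface forms; the work done
            # inside the subnet adds nothing to the path number.
            if label == tok or tok in SUBNETS.get(label, ()):
                node = child
                break
            number += child.count  # skip all paths through earlier arcs
        else:
            return None  # no arc accepts this token
    return number if node.final else None


NETWORK = build(EXPRESSIONS)
```

With these assumed inputs, `lookup(NETWORK, "took place".split())` and `lookup(NETWORK, "takes place".split())` both return the same number, because the inflected verb is absorbed by the `$TAKE` subnet and only the main-network arcs are counted.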
Publication number |
US2004128122(A1) |
Publication date |
2004.07.01 |
Application number |
US20020248058 |
Filing date |
2002.12.13 |
Applicant |
XEROX CORPORATION |
Inventors |
PRIVAULT CAROLINE;POIRIER HERVE |
Classification |
G06F17/27;(IPC1-7):G06F17/28 |
Main classification |
G06F17/27 |
Patent agency |
|
Patent agent |
|
Main claim |
|
Address |
|