Title of invention: Method for document retrieval and for word sense disambiguation using neural networks
Abstract: A method for storing and searching documents, also useful in disambiguating word senses, and a method for generating a dictionary of context vectors. The dictionary of context vectors provides a context vector for each word stem in the dictionary. A context vector is a fixed-length list of component values corresponding to a list of word-based features, the component values being an approximate measure of the conceptual relationship between the word stem and the word-based feature. Documents are stored by combining the context vectors of the words remaining in the document after uninteresting words are removed. The summary vector obtained by adding all of the context vectors of the remaining words is normalized. The normalized summary vector is stored for each document. The database of normalized summary vectors is searched by using a query vector and identifying the document whose vector is closest to that query vector. The normalized summary vectors of each document can be stored using cluster trees according to a centroid-consistent algorithm to accelerate the searching process. Said searching process also gives an efficient way of finding nearest-neighbor vectors in high-dimensional spaces.
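The storage and search steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the three-feature toy dictionary `CONTEXT_VECTORS`, the `STOP_WORDS` list, the function names, and the choice of the dot product as the closeness measure (for unit-length vectors, the largest dot product corresponds to the smallest Euclidean distance). Stemming and the cluster-tree acceleration are omitted.

```python
import math

# Hypothetical toy dictionary: word stem -> context vector over 3 made-up features.
CONTEXT_VECTORS = {
    "bank": [0.9, 0.1, 0.0],
    "money": [1.0, 0.0, 0.0],
    "river": [0.0, 1.0, 0.1],
    "water": [0.0, 0.9, 0.3],
}
# Hypothetical list of "uninteresting" words to drop before summing.
STOP_WORDS = {"the", "a", "of", "and", "in"}


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def summary_vector(words):
    """Add the context vectors of the remaining words, then normalize the sum."""
    dim = len(next(iter(CONTEXT_VECTORS.values())))
    total = [0.0] * dim
    for word in words:
        word = word.lower()
        if word in STOP_WORDS or word not in CONTEXT_VECTORS:
            continue  # skip uninteresting or unknown words
        total = [t + c for t, c in zip(total, CONTEXT_VECTORS[word])]
    return normalize(total)


def closest_document(query_words, doc_vectors):
    """Return the name of the stored document whose vector is closest to the query.

    Assumption: closeness is measured by the dot product of unit vectors.
    This is a linear scan, not the cluster-tree search the abstract mentions.
    """
    query = summary_vector(query_words)
    return max(
        doc_vectors,
        key=lambda name: sum(q * d for q, d in zip(query, doc_vectors[name])),
    )


# Store one normalized summary vector per document.
documents = {
    "finance": summary_vector("the bank of money".split()),
    "nature": summary_vector("river and water".split()),
}
```

With this toy data, `closest_document(["money"], documents)` selects the `"finance"` entry, because the query vector points along the same feature as that document's summary vector.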
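Publication number: US5317507(A) Publication date: 1994.05.31
Application number: US19900610430 Application date: 1990.11.07
Applicant: GALLANT, STEPHEN I. Inventor: GALLANT, STEPHEN I.
Classification: G06F17/27;G06F17/30;(IPC1-7):G06F15/38 Main classification: G06F17/27
Agency: Agent:
Main claim:
Address: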