Title: Method, apparatus and system for linking documents
Abstract: A method, apparatus and system for linking documents, the method comprising the steps of: providing a plurality of clusters in an enterprise intranet, each cluster consisting of one or more documents; building a cluster page for each cluster to present the documents in the cluster; and building links between the cluster pages, between the documents, and between the cluster page and the document, based on analysis of the contents of the clusters and the documents. The present invention is useful for building links between separate documents, so that a link-analysis algorithm may be applied when searching these documents to improve search performance within the enterprise intranet.
Publication number: US8938451(B2)
Publication date: 2015.01.20
Application number: US200812133766
Filing date: 2008.06.05
Applicant: International Business Machines Corporation
Inventors: Zhang Li; Yang Li Ping; Liu Shixia
Classification: G06F17/30
Main classification: G06F17/30
Agency: Ryan, Mason & Lewis, LLP
Agent: Ryan, Mason & Lewis, LLP
Independent claim: 1. A method for linking documents, comprising a computer performing the steps of: obtaining a set of documents, wherein documents in the set of documents are not interlinked with other documents via one or more hyperlinks; forming a plurality of clusters from the set of documents, each cluster comprising one or more documents; building a cluster page for each cluster to represent the documents in the cluster; and building links based on analysis of the contents of the clusters and the documents; wherein, when a similarity of a topic of a first document of the set of documents to a second document of the set of documents is greater than a threshold, a link is built from the first document to the second document, the similarity being a cosine function of an angle between a topic vector of the first document and a document vector of the second document; wherein said step of building links comprises one or more of the following steps: building the link between the cluster pages; building the link from the cluster page to the document; building a link from the document to the cluster page; and building the link between the documents; wherein, when the number of documents commonly owned by a first cluster and a second cluster is greater than or equal to a threshold, build the links between the cluster page of the first cluster and the cluster page of the second cluster; and wherein, when the proportion of the number of commonly owned documents in the first cluster is greater than that in the second cluster, generate a link from the cluster page of the second cluster to the cluster page of the first cluster, otherwise generate a link from the cluster page of the first cluster to the cluster page of the second cluster.
Address: Armonk NY US
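
Illustrative sketch (not part of the patent record): the two linking rules stated in the independent claim can be tried out in a few lines of Python. This is one possible reading under stated assumptions, not the patented implementation. The function names (`cosine`, `document_links`, `cluster_page_link`), the dictionary-based inputs, and the threshold values are hypothetical. In particular, the "proportion of commonly owned documents" in a cluster is assumed here to mean the number of shared documents divided by that cluster's size; the claim text does not define it further.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def document_links(topic_vecs, doc_vecs, threshold):
    """Return (first, second) pairs where the cosine between the topic vector
    of the first document and the document vector of the second document
    is strictly greater than the threshold."""
    links = []
    for first, topic in topic_vecs.items():
        for second, doc in doc_vecs.items():
            if first != second and cosine(topic, doc) > threshold:
                links.append((first, second))
    return links


def cluster_page_link(first, second, min_shared):
    """Return the (source, target) cluster-page link, or None if the clusters
    share fewer than min_shared documents.

    first and second are (name, set_of_document_ids) pairs.
    """
    (name1, docs1), (name2, docs2) = first, second
    shared = docs1 & docs2
    if not shared or len(shared) < min_shared:
        return None
    # Assumption: proportion = shared documents / cluster size.
    p1 = len(shared) / len(docs1)
    p2 = len(shared) / len(docs2)
    if p1 > p2:
        return (name2, name1)  # second cluster page links to the first
    return (name1, name2)      # otherwise first links to the second


# Usage with made-up vectors and clusters.
topic_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
doc_vecs = {"a": [0.9, 0.1], "b": [0.8, 0.6]}
print(document_links(topic_vecs, doc_vecs, 0.7))  # [('a', 'b')]

c1 = ("C1", {"a", "b"})
c2 = ("C2", {"a", "b", "c", "d"})
print(cluster_page_link(c1, c2, 2))  # ('C2', 'C1')
```

In the usage example, both documents of C1 are shared with C2, so the shared proportion is higher in C1 (1.0) than in C2 (0.5), and the link runs from the C2 page to the C1 page, matching the direction rule in the claim under the assumed definition of proportion.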