Title of invention Method for mining path traversal patterns in a web environment by converting an original log sequence into a set of traversal sub-sequences
Abstract An efficient computer-implemented method of mining path traversal patterns in a communications network. The method comprises two steps. First, a method called MF (for maximal forward references) converts an original sequence of log data into a set of traversal subsequences. Each traversal subsequence represents a maximal forward reference from the starting point of a user access. This conversion filters out the effect of backward references, which are made mainly for ease of traveling, so that mining can concentrate on meaningful user access sequences. When a backward reference occurs, the current forward reference path terminates; the resulting path is termed a maximal forward reference. After a maximal forward reference is obtained, the method backtracks to the starting point of the forward reference and begins a new forward reference path. The occurrence of a null source node also indicates the termination of an ongoing forward reference path and the beginning of a new one. Second, methods are developed to determine the frequent traversal patterns, termed large reference sequences, from the maximal forward references obtained above, where a large reference sequence is a reference sequence that appears in the database often enough to exceed a predetermined threshold.
Publication number US5668988(A) Publication date 1997.09.16
Application number US19950525891 Filing date 1995.09.08
Applicant INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors CHEN, MING-SYAN;YU, PHILIP SHI-LUNG
Classification G06F17/30;(IPC1-7):G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address
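The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the `(source, destination)` log format, the use of `None` for a null source node, and the sample log are all assumptions made for illustration. The second function is a brute-force count of consecutive subsequences, swapped in for whatever mining methods the patent actually specifies, and it treats "exceed a predetermined threshold" as a strict `>` comparison.

```python
from collections import Counter


def maximal_forward_references(log):
    """Convert (source, destination) log pairs into maximal forward references.

    Assumed format: source is None when a new traversal starts.
    """
    results = []
    path = []          # current forward reference path
    forward = True     # True while the last move was a forward reference
    for source, dest in log:
        if source is None:
            # Null source node: end the ongoing path and start a new one.
            if path and forward:
                results.append(tuple(path))
            path = [dest]
            forward = True
        elif dest in path:
            # Backward reference: the forward path terminates here.
            if forward:
                results.append(tuple(path))
            del path[path.index(dest) + 1:]   # backtrack to the revisited node
            forward = False
        else:
            # Forward reference: extend the current path.
            path.append(dest)
            forward = True
    if path and forward:
        results.append(tuple(path))
    return results


def large_reference_sequences(references, threshold):
    """Brute-force count of consecutive subsequences (illustrative only)."""
    counts = Counter()
    for ref in references:
        seen = set()
        for i in range(len(ref)):
            for j in range(i + 1, len(ref) + 1):
                seen.add(ref[i:j])
        counts.update(seen)   # each sequence counted once per reference
    return {seq: n for seq, n in counts.items() if n > threshold}


# Hypothetical visit order for one user session.
visits = list("ABCDCBEGHGWAOUOV")
log = [(None, visits[0])] + list(zip(visits, visits[1:]))

refs = maximal_forward_references(log)
large = large_reference_sequences(refs, threshold=2)
print(["".join(r) for r in refs])
print({"".join(s): n for s, n in large.items()})
```

In this sketch each backward move closes the current path only if the previous move was forward, so a run of consecutive backward moves produces a single maximal forward reference rather than several shortened copies of it.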