Title of invention: Method and apparatus for learning, recognizing and generalizing sequences
Abstract: A method of generalizing a dataset having a plurality of sequences defined over a lexicon of tokens is provided. The method comprises: searching over the dataset for similarity sets, where each similarity set comprises a plurality of segments of size L having L-S common tokens and S uncommon tokens; and defining a plurality of equivalence classes corresponding to uncommon tokens of at least one similarity set. The method may further comprise a step in which a plurality of significant patterns are extracted, where each significant pattern corresponds to a most significant partial overlap between one sequence of the dataset and other sequences of the dataset. In one embodiment, a generalized dataset represented by a graph or a forest is constructed, and can be realized as a context-free grammar. The graph or forest can be used for generating sequences and/or testing grammatical structures.
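The similarity-set search described in the abstract can be illustrated with a minimal sketch. This is not the patented method: it assumes S = 1 (exactly one uncommon token per segment), applies no significance test, and builds no graph or grammar. The function names `similarity_sets` and `equivalence_classes`, the `min_size` parameter, and the toy corpus are all hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def similarity_sets(sequences, L, min_size=2):
    """Group length-L segments that agree on L-1 tokens (assumes S = 1).

    Returns a dict mapping a masked segment (None marks the uncommon slot)
    to the set of tokens observed in that slot.
    """
    groups = defaultdict(set)
    for seq in sequences:
        for start in range(len(seq) - L + 1):
            window = tuple(seq[start:start + L])
            for slot in range(L):
                # Mask one position; segments sharing this key have
                # the same L-1 common tokens in the same positions.
                key = window[:slot] + (None,) + window[slot + 1:]
                groups[key].add(window[slot])
    # Keep only slots filled by at least min_size distinct tokens.
    return {k: v for k, v in groups.items() if len(v) >= min_size}


def equivalence_classes(sim_sets):
    """Collect the distinct sets of uncommon tokens as candidate classes."""
    return {frozenset(tokens) for tokens in sim_sets.values()}


# Hypothetical toy corpus over a small lexicon.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the mat".split(),
    "the cat ran on the mat".split(),
]
sets = similarity_sets(corpus, L=3)
classes = equivalence_classes(sets)
```

On this toy corpus, segments such as "the cat sat" and "the dog sat" share two of three tokens, so "cat" and "dog" fall into one candidate class, while "sat" and "ran" fall into another.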
Publication number: US2007055662(A1)
Publication date: 2007.03.08
Application number: US20040566480
Filing date: 2004.08.01
Applicant: EDELMAN SHIMON; HORN DAVID; RUPPIN EYTAN; SOLAN TSACH
Inventor: EDELMAN SHIMON; HORN DAVID; RUPPIN EYTAN; SOLAN TSACH
Classification: G06F17/30
Main classification: G06F17/30
Agency:
Agent:
Main claim:
Address: