Title of invention: FAST AND SECURE RETRIEVAL OF DNA SEQUENCES
Abstract: Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are identified as being most similar to a query DNA or RNA sequence based on fitting of the retrieved sequence models to the query DNA or RNA sequence. The sequence models may be context tree weighting (CTW) models {Sx, θSx}, where Sx denotes the context tree model for the DNA or RNA sequence x stored in the database, and θSx denotes parameters of the context tree model Sx. The fitting may include, for each CTW model {Sx, θSx}, computing the codeword length for the query DNA or RNA sequence y using the CTW model {Sx, θSx}.
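The abstract does not spell out how the codeword length is computed. As a hedged illustration only, the standard information-theoretic reading is that the ideal codeword length of the query y = y_1 ... y_n under a model is the negative base-2 logarithm of the probability the model assigns to y, and that a shorter length indicates a better fit. The symbols L, P, n and \hat{x} below are notation assumed for this sketch, not taken from the record.

```latex
L\bigl(y \mid S_x, \theta_{S_x}\bigr) \;\approx\; -\sum_{i=1}^{n} \log_2 P\bigl(y_i \mid y_1^{\,i-1};\, S_x, \theta_{S_x}\bigr),
\qquad
\hat{x} \;=\; \arg\min_{x} \; L\bigl(y \mid S_x, \theta_{S_x}\bigr)
```

Under this reading, the stored sequence whose model yields the smallest codeword length for the query would be reported as the most similar one.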
Publication No.: US2016070859(A1) Publication date: 2016.03.10
Application No.: US201414786207 Filing date: 2014.04.30
Applicant: KONINKLIJKE PHILIPS N.V. Inventor: IGNATENKO Tanya
Classification: G06F19/28;G06F17/30 Main classification: G06F19/28
Agency: Agent:
Principal claim: 1. A non-transitory storage medium storing instructions executable by an electronic data processing device to perform a method including: generating a sequences index comprising sequence models for deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) sequences stored in a database, the generating including computing the sequence model for each DNA or RNA sequence stored in the database as a finite memory tree source model and parameters for the finite memory tree source model; wherein the sequence models are computed using context tree weighting (CTW); and identifying one or more DNA or RNA sequences stored in the database as being most similar to a query DNA or RNA sequence based on applying the sequence models to the query DNA or RNA sequence and on determining how well each sequence model fits the query DNA or RNA sequence.
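The two steps of the claim (building an index of per-sequence models, then ranking stored sequences by how well each model fits the query) can be sketched as follows. This is a minimal illustration and not the patented method: it swaps full context tree weighting for a much simpler fixed-depth context model with Krichevsky-Trofimov-style probability estimates, and it assumes a plain four-letter DNA alphabet. All names (`build_model`, `codeword_length`, `build_index`, `most_similar`, the `depth` parameter) are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: DNA only, no ambiguity codes


def build_model(seq, depth=2):
    """Count next-symbol occurrences for every context of length `depth`.

    Simplification: a single fixed-depth context model, not full CTW.
    """
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    for i in range(depth, len(seq)):
        counts[seq[i - depth:i]][seq[i]] += 1
    return {"depth": depth, "counts": dict(counts)}


def codeword_length(model, query):
    """Ideal code length of `query` in bits under `model`; lower means a better fit.

    The first `depth` symbols of the query have no full context and are skipped.
    """
    depth, counts = model["depth"], model["counts"]
    bits = 0.0
    for i in range(depth, len(query)):
        ctx = counts.get(query[i - depth:i], {})
        n_sym = ctx.get(query[i], 0)
        n_all = sum(ctx.values())
        # KT-style estimate for a 4-letter alphabet: (n_sym + 1/2) / (n_all + 4/2)
        p = (n_sym + 0.5) / (n_all + len(ALPHABET) / 2)
        bits -= math.log2(p)
    return bits


def build_index(database, depth=2):
    """Map each stored sequence name to its model (the 'sequences index')."""
    return {name: build_model(seq, depth) for name, seq in database.items()}


def most_similar(index, query, top=1):
    """Return the names of the `top` stored sequences whose models fit the query best."""
    ranked = sorted(index, key=lambda name: codeword_length(index[name], query))
    return ranked[:top]


# Toy data, invented for illustration only.
database = {"a": "ACGTACGTACGTACGT", "b": "AAAAAAAAAAAAAAAA"}
index = build_index(database)
print(most_similar(index, "ACGTACGT"))
```

In this sketch the retrieval step reads only the index of counts, not the stored sequences themselves. A full CTW implementation would instead weight over all tree models up to a maximum depth rather than fixing one depth.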
Address: Eindhoven NL