Title of invention DATA ANALYSIS DEVICE AND METHOD THEREFOR
Abstract The computational cost of the mapping process in genome, exome, and transcriptome analysis is reduced by sorting all cyclic permutations or all suffixes of the read sequences, so that a search can be performed with any base sequence as the key. In addition, for each base position in the genome sequence, the minimum length for uniqueness (MLU), i.e. the length at which the partial sequence starting from that position becomes unique in the genome, is computed and stored. In variation analysis, the target region is scanned and the number of read sequences containing a partial sequence of length MLU is inspected to estimate the position of a variation; read sequences are then collected at positions where a variation is likely to have occurred, and their sequences are compared.
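The MLU described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes a single-strand toy genome held in memory as a string, builds a naive suffix array by sorting, and uses the fact that the longest prefix a suffix shares with any other suffix is shared with one of its neighbours in sorted order. The function name `minimum_length_for_uniqueness` is hypothetical.

```python
def minimum_length_for_uniqueness(genome):
    """Return a list whose i-th entry is the MLU at position i, or None
    when no substring starting at i is unique (hypothetical helper)."""
    n = len(genome)
    # Naive suffix array; adequate for a toy example, not for a real genome.
    order = sorted(range(n), key=lambda i: genome[i:])

    def lcp(a, b):
        # Length of the longest common prefix of the suffixes at a and b.
        k = 0
        while a + k < n and b + k < n and genome[a + k] == genome[b + k]:
            k += 1
        return k

    mlu = [None] * n
    for rank, pos in enumerate(order):
        longest = 0
        if rank > 0:
            longest = max(longest, lcp(pos, order[rank - 1]))
        if rank + 1 < n:
            longest = max(longest, lcp(pos, order[rank + 1]))
        length = longest + 1  # one base beyond the longest shared prefix
        if pos + length <= n:  # otherwise the suffix ends before it is unique
            mlu[pos] = length
    return mlu


print(minimum_length_for_uniqueness("ACGTACGA"))
```

A production version would presumably also account for the reverse-complement strand and use a linear-time suffix array or FM-index construction; both are omitted here for brevity.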
Publication number US2015363549(A1) Publication date 2015.12.17
Application number US201314762897 Filing date 2013.11.20
Applicant HITACHI HIGH-TECHNOLOGIES CORPORATION Inventor KIMURA Kouichi
Classification G06F19/22;G06F19/28;G06F17/30 Main classification G06F19/22
Agency Agent
Principal claim 1. A data analysis device comprising a processing unit and a storage unit, wherein the storage unit is configured to store a genome sequence database and a read sequence database, the genome sequence database having a collection of genome sequence data, and the read sequence database having a collection of read sequence data, and the processing unit is configured to select a key sequence as a base sequence to be used for a search on the basis of a sequence of a specified genome region to be analyzed, determine a depth of the key sequence in the read sequence database, and extract read sequence data containing the key sequence from the read sequence database, and compare the extracted read sequence data with the sequence of the genome region to analyze the data.
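The search and extraction steps of the claim can be sketched as follows. This is an illustration under stated assumptions, not the claimed device: the read sequence database is modelled as a sorted in-memory list of all read suffixes, "depth" is taken to mean the number of distinct reads containing the key, and the names `build_suffix_index` and `reads_containing` are hypothetical.

```python
from bisect import bisect_left


def build_suffix_index(reads):
    """Sort every suffix of every read, keeping the id of the source read."""
    index = [(read[i:], read_id)
             for read_id, read in enumerate(reads)
             for i in range(len(read))]
    index.sort()
    return index


def reads_containing(index, key):
    """Return the sorted ids of reads that contain `key` as a substring."""
    # All suffixes that start with `key` form one contiguous run in sorted
    # order, beginning at the first entry not smaller than (key,).
    pos = bisect_left(index, (key,))
    hits = set()
    while pos < len(index) and index[pos][0].startswith(key):
        hits.add(index[pos][1])
        pos += 1
    return sorted(hits)


reads = ["ACGTAC", "GTACGA", "TTTT", "CGTACG"]
index = build_suffix_index(reads)
matching = reads_containing(index, "GTAC")
depth = len(matching)  # assumed meaning of "depth": distinct matching reads
print(depth, matching)
```

Because any base sequence can serve as the key, the same index answers queries for every key chosen from the genome region; the extracted reads would then be compared against the region's sequence.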
Address Tokyo JP