发明名称 System and method for identifying facts and legal discussion in court case law documents
摘要 A computer-implemented method of gathering large quantities of training data from case law documents (especially suitable for use as input to a learning algorithm that is used in a subsequent process of recognizing and distinguishing fact passages and discussion passages in additional case law documents) has steps of: partitioning text in the documents by headings in the documents, comparing the headings in the documents to fact headings in a fact heading list and to discussion headings in a discussion heading list, filtering from the documents the headings and text that is associated with the headings, and storing (on persistent storage in a manner adapted for input into the learning algorithm) fact training data and discussion training data that are based on the filtered headings and the associated text. Another method (of extracting features that are independent of specific machine learning algorithms needed to accurately classify case law text passages as fact passages or as discussion passages) has steps of: determining a relative position of the text passages in an opinion segment in the case law text, parsing the text passages into text chunks, comparing the text chunks to predetermined feature entities for possible matched feature entities, and associating the relative position and matched feature entities with the text passages for use by one of the learning algorithms. Corresponding apparatus and computer-readable memories are also provided.
申请公布号 US6772149(B1) 申请公布日期 2004.08.03
申请号 US19990401725 申请日期 1999.09.23
申请人 LEXIS-NEXIS GROUP 发明人 MORELOCK JOHN T.;WILTSHIRE, JR. JAMES S.;AHMED SALAHUDDIN;HUMPHREY TIMOTHY LEE;LU XIN ALLAN
分类号 G06F17/30;(IPC1-7):G06F17/30 主分类号 G06F17/30
代理机构 代理人
主权项
地址