Title of Invention |
COMPUTER-BASED SYSTEM AND METHOD FOR FINDING RULES OF LAW IN TEXT |
Abstract |
A system and method for binary classification of text units such as sentences, paragraphs and documents as either a rule of law (ROL) or not a rule of law (~ROL) (206). During a training phase (202) of the system and method of the present invention, an initialized knowledge base and labeled or pre-classified sentences are used to build a trained knowledge base. The trained knowledge base contains an equation (404), a threshold (405), and a plurality of statistical values called Z values (502). When text documents are input for classification, a Z value is generated for each term or token in the input text. The Z values are input to the equation, which calculates a score for each sentence. Each calculated score is compared to the threshold to classify each sentence as either ROL or ~ROL.
|
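The train-then-classify flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not disclose the actual equation (404), the threshold (405), or how Z values (502) are computed, so everything specific here is an assumption made for illustration: the Z value is taken to be a two-proportion z statistic comparing how often a token appears in ROL versus ~ROL training sentences, the score is the mean Z value of a sentence's tokens, and the threshold defaults to 0.0. The function names (`tokenize`, `train_z_values`, `score_sentence`, `classify`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Hypothetical tokenizer: split on whitespace, strip punctuation, lowercase.
    stripped = (t.strip(".,;:()\"'").lower() for t in sentence.split())
    return [t for t in stripped if t]


def train_z_values(labeled_sentences):
    """Build a token -> Z value table from (sentence, is_rol) pairs.

    Assumption: Z is a two-proportion z statistic on the fraction of
    ROL vs ~ROL sentences containing the token. The patent abstract
    does not specify the real formula.
    """
    rol_counts, non_counts = Counter(), Counter()
    n_rol = n_non = 0
    for sentence, is_rol in labeled_sentences:
        tokens = set(tokenize(sentence))  # count each token once per sentence
        if is_rol:
            n_rol += 1
            rol_counts.update(tokens)
        else:
            n_non += 1
            non_counts.update(tokens)
    if n_rol == 0 or n_non == 0:
        raise ValueError("training data needs both ROL and ~ROL sentences")

    z_values = {}
    for token in set(rol_counts) | set(non_counts):
        p_rol = rol_counts[token] / n_rol
        p_non = non_counts[token] / n_non
        pooled = (rol_counts[token] + non_counts[token]) / (n_rol + n_non)
        std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_rol + 1 / n_non))
        z_values[token] = (p_rol - p_non) / std_err if std_err > 0 else 0.0
    return z_values


def score_sentence(sentence, z_values):
    # Assumed "equation": mean Z value of the tokens; unseen tokens count as 0.
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(z_values.get(t, 0.0) for t in tokens) / len(tokens)


def classify(sentence, z_values, threshold=0.0):
    # Compare the score to the (assumed) threshold to pick a label.
    return "ROL" if score_sentence(sentence, z_values) > threshold else "~ROL"


if __name__ == "__main__":
    # Toy training data invented for this sketch, not taken from the patent.
    training = [
        ("A court must grant summary judgment when no genuine issue of material fact exists.", True),
        ("The statute of limitations must be strictly construed.", True),
        ("The plaintiff filed the complaint in March.", False),
        ("The defendant appealed the decision yesterday.", False),
    ]
    table = train_z_values(training)
    print(classify("A party must prove damages.", table))
    print(classify("The defendant filed yesterday.", table))
```

In this sketch the trained knowledge base is simply the `z_values` dictionary plus the threshold argument; a production system would presumably tune the threshold on held-out labeled sentences rather than fix it at zero.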
Publication Number |
CA2410881(C) |
Publication Date |
2007.01.09 |
Application Number |
CA20012410881 |
Filing Date |
2001.05.31 |
Applicant |
LEXIS-NEXIS |
Inventors |
LU, X. ALLAN;HUMPHREY, TIMOTHY L.;AHMED, SALAHUDDIN;COLLIAS, SPIRO G.;MORELOCK, JOHN T.;WILTSHIRE, JAMES S., JR. |
Classification Codes |
G06F17/27;G06F17/30;G06N5/00 |
Main Classification Code |
G06F17/27 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
|
Address |
|