Title of Invention |
METHOD AND DEVICE FOR STRUCTURING DOCUMENT CONTENTS |
Abstract |
A method for structuring document contents includes: generating a first instantiating rule corresponding to a first document based upon a first schema file with a style, which is a preset style, and a first XML file with a rule, which is a first structuring rule, in the first document; obtaining a first list of tags corresponding to structured first contents in the first document based upon a first tag structure tree of the first contents; obtaining M texts matching the first instantiating rule from discrete contents corresponding to the first list of tags, wherein the discrete contents are unstructured contents excluded from the structured first contents; determining N tags which can match the structured first contents among M tags corresponding to the M texts; and structuring N texts corresponding to the N tags based upon the N tags to obtain a second tag structure tree. |
Publication Number |
US2014181640(A1) |
Publication Date |
2014.06.26 |
Application Number |
US201314096790 |
Filing Date |
2013.12.04 |
Applicant |
BEIJING FOUNDER ELECTRONICS CO., LTD.; PEKING UNIVERSITY FOUNDER GROUP CO., LTD. |
Inventor |
SUN Mingming |
Classification |
G06F17/22 |
Main Classification |
G06F17/22 |
Agency |
|
Agent |
|
Principal Claim |
1. A method for structuring document contents, comprising:
generating a first instantiating rule corresponding to a first document based upon a first schema file with a style, which is a preset style, and a first XML file with a rule, which is a first structuring rule, in the first document;
obtaining a first list of tags corresponding to structured first contents in the first document based upon a first tag structure tree of the first contents;
obtaining M texts matching the first instantiating rule from discrete contents corresponding to the first list of tags, wherein the discrete contents are unstructured contents excluded from the structured first contents, and M is a positive integer equal to or larger than 1;
determining N tags which can match the structured first contents among M tags corresponding to the M texts; and
structuring N texts corresponding to the N tags based upon the N tags to obtain a second tag structure tree. |
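The claimed steps can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the record does not specify file formats or matching criteria, so everything concrete here is an assumption. The schema file is modeled as a hypothetical dictionary of regular expressions per tag (`schema_styles`), the XML rule file as a hypothetical `<rules>` document, the discrete contents as a list of strings, and "tags which can match the structured first contents" is read as "tags that already occur in the first tag structure tree".

```python
import copy
import re
import xml.etree.ElementTree as ET


def build_instantiating_rule(schema_styles, rule_xml):
    """Combine preset styles with structuring rules into {tag: compiled regex}.

    Both input formats are hypothetical stand-ins for the schema and XML files.
    """
    root = ET.fromstring(rule_xml)
    return {
        node.get("tag"): re.compile(schema_styles[node.get("tag")])
        for node in root.iter("rule")
        if node.get("tag") in schema_styles
    }


def match_discrete_texts(rule, discrete_texts):
    """Return (tag, text) pairs for the M discrete texts that match the rule."""
    matches = []
    for text in discrete_texts:
        for tag, pattern in rule.items():
            if pattern.search(text):
                matches.append((tag, text))
                break  # one tag per text in this sketch
    return matches


def structure_document(first_tree, schema_styles, rule_xml, discrete_texts):
    """Run the claimed steps and return (M matches, N matches, second tree)."""
    rule = build_instantiating_rule(schema_styles, rule_xml)
    # First list of tags, taken from the first tag structure tree.
    first_tags = {element.tag for element in first_tree.iter()}
    m_matches = match_discrete_texts(rule, discrete_texts)
    # Assumed criterion: keep only tags already present in the structured contents.
    n_matches = [(tag, text) for tag, text in m_matches if tag in first_tags]
    # Build the second tree without modifying the first one.
    second_tree = copy.deepcopy(first_tree)
    for tag, text in n_matches:
        ET.SubElement(second_tree, tag).text = text
    return m_matches, n_matches, second_tree


# Hypothetical sample data.
first_tree = ET.fromstring(
    "<document><heading>Chapter 1 Intro</heading><para>Body text.</para></document>"
)
schema_styles = {"heading": r"^Chapter \d+", "caption": r"^Figure \d+"}
rule_xml = "<rules><rule tag='heading'/><rule tag='caption'/></rules>"
discrete_texts = ["Chapter 2 Methods", "Figure 1 Overview", "stray note"]

m_matches, n_matches, second_tree = structure_document(
    first_tree, schema_styles, rule_xml, discrete_texts
)
print(len(m_matches), len(n_matches))
print(ET.tostring(second_tree, encoding="unicode"))
```

With this sample data, two discrete texts match the rule (M = 2), but only the `heading` tag already occurs in the first tree (N = 1), so only "Chapter 2 Methods" is attached in the second tree. Appending new elements under the root is a simplification; a real system would need a placement policy that the record does not describe.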
Address |
Beijing CN |