Title of Invention |
CITATION RECORD EXTRACTION SYSTEM AND METHOD, AND PROGRAM PRODUCT |
Abstract |
A citation record extraction system is provided. An HTML rendering engine receives a publication list web page and parses it to obtain layout information of the web page. A web page sequence builder generates a web page characteristic sequence for the web page according to the layout information. A web page repeated pattern analyzer analyzes the repeated patterns present in the web page characteristic sequence, screens out non-citation records therefrom, and obtains the citation records of the publication list web page.
|
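The three stages named in the abstract (parsing, sequence building, repeated pattern analysis) can be illustrated with a minimal sketch. It is not the patented implementation: the record does not say how layout information is encoded or how patterns are scored, so this sketch assumes that HTML tag names from Python's standard `html.parser` stand in for rendered layout information, and that the best pattern is the tag n-gram with the largest non-overlapping coverage. All names (`TagSequenceBuilder`, `find_repeated_pattern`, `extract_citation_records`, `SAMPLE_PAGE`) are hypothetical.

```python
from html.parser import HTMLParser


class TagSequenceBuilder(HTMLParser):
    """Builds a crude characteristic sequence: one symbol per start tag.

    Assumption: tag names approximate layout information. Text is attached
    to the most recently opened tag, which is only a rough heuristic.
    """

    def __init__(self):
        super().__init__()
        self.tags = []   # characteristic sequence
        self.texts = []  # text collected for each sequence position

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        self.texts.append("")

    def handle_data(self, data):
        if self.texts:
            self.texts[-1] += data.strip()


def non_overlapping_starts(seq, pattern):
    """Return start indexes of greedy, non-overlapping matches of pattern."""
    n, i, starts = len(pattern), 0, []
    while i + n <= len(seq):
        if tuple(seq[i:i + n]) == pattern:
            starts.append(i)
            i += n
        else:
            i += 1
    return starts


def find_repeated_pattern(seq, max_len=4):
    """Pick the n-gram that repeats at least twice and covers the most symbols."""
    candidates = {
        tuple(seq[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(seq) - n + 1)
    }
    best_pattern, best_starts, best_score = None, [], 0
    # Sorted for determinism; shorter patterns win ties.
    for pattern in sorted(candidates, key=lambda p: (len(p), p)):
        starts = non_overlapping_starts(seq, pattern)
        score = len(starts) * len(pattern)
        if len(starts) >= 2 and score > best_score:
            best_pattern, best_starts, best_score = pattern, starts, score
    return best_pattern, best_starts


def extract_citation_records(html):
    """Return one list of text fields per occurrence of the repeated pattern."""
    builder = TagSequenceBuilder()
    builder.feed(html)
    pattern, starts = find_repeated_pattern(builder.tags)
    if pattern is None:
        return []
    # Positions outside the repeated pattern (headings, contact lines)
    # are screened out simply by not being selected here.
    return [
        [t for t in builder.texts[s:s + len(pattern)] if t]
        for s in starts
    ]


SAMPLE_PAGE = """
<html><body><h1>Publications</h1><p>Contact: a@b.c</p>
<ul><li><b>Title A</b><i>Venue A, 2009</i></li>
<li><b>Title B</b><i>Venue B, 2010</i></li>
<li><b>Title C</b><i>Venue C, 2008</i></li></ul></body></html>
"""

records = extract_citation_records(SAMPLE_PAGE)
```

On the sample page the characteristic sequence is `html, body, h1, p, ul, li, b, i, li, b, i, li, b, i`, so the pattern `li, b, i` repeats three times and the heading and contact paragraph fall outside it. A real publication list would likely need positional or visual features from an actual rendering engine, which this sketch does not attempt.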
Publication Number |
US2011029528(A1) |
Publication Date |
2011.02.03 |
Application Number |
US20100834757 |
Filing Date |
2010.07.12 |
Applicant |
NATIONAL TAIWAN UNIVERSITY OF SCIENCE & TECHNOLOGY |
Inventors |
LEE HAHN-MING;HO JAN-MING;CHEN SHUI-SHI;YANG KAI-HSIANG;WANG RUEI-YUAN;YEH JEROME |
Classification |
G06F17/30 |
Main Classification |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
|
Address |
|