Title of invention: Collecting learning materials for informal learning
Abstract: A method of collecting learning materials for informal learning may include detecting an addition of an item to a curation. The method may include extracting one or more links in a page referenced by the item. The method may include downloading pages corresponding to the one or more links. The method may include filtering the downloaded pages to generate candidate index pages. The method may also include identifying an appropriate index page from the candidate index pages. The method may further include locating a primary information block in the appropriate index page. The method may also include generating an automated extraction rule configured to direct a system to the primary information block of the appropriate index page.
Publication number: US9589061(B2)  Publication date: 2017.03.07
Application number: US201414245764  Filing date: 2014.04.04
Applicant: FUJITSU LIMITED  Inventors: Wang Jun; Uchino Kanji
Classification: G06F17/30; H04L29/08  Main classification: G06F17/30
Agency:  Agent: Brennan Maschoff
Main claim: 1. A method of collecting learning materials for informal learning, the method comprising:
detecting an addition of an item to a curation;
extracting one or more links from a page referenced by the item;
downloading pages corresponding to the one or more links;
filtering the downloaded pages to generate candidate index pages by excluding a subset of the downloaded pages not having links that point back to the page referenced by the item;
detecting information blocks containing links pointing back to the page referenced by the item in one or more of the candidate index pages;
locating a primary information block in one or more of the candidate index pages based on a page structure analysis;
performing a uniform resource locator (URL) structures analysis of the one or more of the candidate index pages that include the links pointing back to the page referenced by the item in the primary information block;
based at least partially on the URL structures analysis, identifying an appropriate index page from the candidate index pages;
locating a primary information block in the appropriate index page, the primary information block including a portion of the appropriate index page where a majority of substantive information is contained; and
generating an automated extraction rule configured to direct a system to the primary information block of the appropriate index page.
Address: Kawasaki JP
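
The filtering step of claim 1 (excluding downloaded pages that have no link pointing back to the page referenced by the item) could be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the names `LinkCollector`, `extract_links`, and `candidate_index_pages`, and the assumption that downloaded pages arrive as a dict mapping URL to HTML text, are hypothetical and do not come from the record. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects absolute link targets from the <a href> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def extract_links(page_url, html):
    """Return the set of absolute URLs linked from the given HTML."""
    parser = LinkCollector(page_url)
    parser.feed(html)
    return parser.links


def candidate_index_pages(item_url, downloaded):
    """Keep only downloaded pages that contain a link back to item_url.

    `downloaded` is assumed (hypothetically) to map page URL -> HTML text.
    """
    target, _fragment = urldefrag(item_url)
    return [
        url
        for url, html in downloaded.items()
        if target in extract_links(url, html)
    ]
```

In this sketch a page such as `http://example.com/course/` containing `<a href="lesson1.html">` would be kept as a candidate for the item `http://example.com/course/lesson1.html`, while a page with no such back-link would be excluded. The later steps of the claim (page structure analysis, URL structures analysis, rule generation) are not covered here.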