Title of invention A machine learning system for extracting structured records from web pages and other text sources
Abstract A method for extracting a structured record (190) from a document (100) is described, where the structured record includes information related to a predetermined subject matter (120), organized into categories within the structured record. The method comprises the steps of identifying a span of text (130) in the document (100) according to criteria associated with the predetermined subject matter, and processing (150) the span of text to extract at least one text element associated with at least one of the categories of the structured record (190) from the document (100).
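The two steps named in the abstract (identifying a span of text, then processing it into categorized text elements) can be illustrated with a minimal sketch. This is not the patented method: the title refers to a machine learning system, whereas the sketch below substitutes hand-written keyword criteria and regular expressions purely to show the data flow. The subject matter (job advertisements), the keyword list, the field names, and the functions `identify_span`, `extract_record`, and `extract` are all hypothetical and do not come from the source.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical subject matter: job advertisements. The keywords and field
# patterns are illustrative assumptions, not taken from the patent.
SUBJECT_KEYWORDS = ("salary", "location")

FIELD_PATTERNS = {
    "title": re.compile(r"Position:\s*(.+)"),
    "location": re.compile(r"Location:\s*(.+)"),
    "salary": re.compile(r"Salary:\s*(.+)"),
}


@dataclass
class Record:
    """Structured record with one attribute per category."""
    title: Optional[str] = None
    location: Optional[str] = None
    salary: Optional[str] = None


def identify_span(document: str) -> Optional[str]:
    """Step 1: return the first paragraph that meets the subject criteria."""
    for paragraph in document.split("\n\n"):
        lowered = paragraph.lower()
        if all(keyword in lowered for keyword in SUBJECT_KEYWORDS):
            return paragraph
    return None


def extract_record(span: str) -> Record:
    """Step 2: extract at most one text element per category from the span."""
    record = Record()
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(span)
        if match:
            setattr(record, field, match.group(1).strip())
    return record


def extract(document: str) -> Optional[Record]:
    """Run both steps; return None when no span meets the criteria."""
    span = identify_span(document)
    return extract_record(span) if span is not None else None
```

In a learned system, both the span criteria and the per-category extraction would presumably be replaced by trained models; the rule-based stand-ins here only keep the example self-contained.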
Publication number EP1669896(A2) Publication date 2006.06.14
Application number EP20050111255 Filing date 2005.11.24
Applicant PANSCIENT PTY LTD. Inventors BAXTER, JONATHAN; SEYMORE, KIRSTIE
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address