Title of invention: Structure extraction from unstructured documents
Abstract: A document similarity detector may be used to determine a family of documents based on a similarity analysis between content of a seed document and content of the family of documents, the content of the seed document being associated with at least one database object having at least one field. A content extraction system may be used to determine a ranking of a plurality of terms from within at least one document of the family of documents, based on a relative frequency with which each of the plurality of terms appears within the family of documents, and may be configured to extract, based on the ranking, at least one term from the plurality of terms as being associated with a value of the at least one field.
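The two stages named in the abstract (grouping documents around a seed, then ranking terms by relative frequency across the group) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names, the use of Jaccard similarity over token sets, the 0.3 threshold, and the choice to rank terms that appear in fewer family documents first, on the guess that such terms are more likely to be field values than shared template text.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of two token lists (an assumed similarity measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def find_family(seed, candidates, threshold=0.3):
    """Return the candidates whose similarity to the seed meets the threshold."""
    seed_tokens = tokenize(seed)
    return [
        doc for doc in candidates
        if jaccard(seed_tokens, tokenize(doc)) >= threshold
    ]


def rank_terms(document, family):
    """Rank the document's distinct terms by the fraction of family
    documents containing them, rarest first (ties broken alphabetically)."""
    if not family:
        return sorted(set(tokenize(document)))
    doc_freq = Counter()
    for doc in family:
        # Count each term once per document.
        doc_freq.update(set(tokenize(doc)))
    terms = set(tokenize(document))
    return sorted(terms, key=lambda t: (doc_freq[t] / len(family), t))
```

Under these assumptions, the top-ranked terms of a family member are the candidates for field values, while terms shared by every member (labels such as "invoice" or "total" in a set of invoices) fall to the bottom of the ranking.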
Publication number: US7562088(B2) Publication date: 2009.07.14
Application number: US20060645861 Application date: 2006.12.27
Applicant: SAP AG Inventors: DAGA RAKSHIT; PANDEY GAURAV
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: