Title of invention: SYSTEM AND METHOD FOR CONTENT EXTRACTION FROM UNSTRUCTURED SOURCES
Abstract: A system and method for extracting content from unstructured sources is disclosed. The method includes analyzing web pages of a website, storing text and image data for each web page of the website, extracting a plurality of entities from the web page data, scoring each entity of the plurality of entities to provide an overall score for each entity, and defining a product based on the plurality of entities and the overall score for each entity.
Publication number: WO2011032121(A3) Publication date: 2011.07.07
Application number: WO2010US48707 Filing date: 2010.09.14
Applicant: ETSY, INC.; DAVIS, JASON Inventor: DAVIS, JASON
Classification: G06F17/21; G06F15/16; G06F17/40 Main classification: G06F17/21
Agency: Agent:
Principal claim:
Address:
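
The steps named in the abstract (store text and image data per page, extract entities, score each entity, define a product) can be illustrated with a minimal Python sketch. Nothing below comes from the patent itself: the `Page` class, the regular expressions, the "fraction of pages containing the entity" scoring rule, and the "highest score per entity type" selection rule are all assumptions chosen only to make the pipeline concrete.

```python
from __future__ import annotations

import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Page:
    """Stored text and image data for one web page (hypothetical structure)."""
    url: str
    text: str
    image_urls: list[str] = field(default_factory=list)


# Assumed entity patterns; the abstract does not say how entities are detected.
PRICE_RE = re.compile(r"\$\d+(?:\.\d{2})?")
NAME_RE = re.compile(r"^Product:\s*(.+)$", re.MULTILINE)


def extract_entities(page: Page) -> list[tuple[str, str]]:
    """Return (entity_type, value) pairs found in one page."""
    entities = [("price", m) for m in PRICE_RE.findall(page.text)]
    entities += [("name", m.strip()) for m in NAME_RE.findall(page.text)]
    entities += [("image", u) for u in page.image_urls]
    return entities


def score_entities(pages: list[Page]) -> dict[tuple[str, str], float]:
    """Assumed scoring rule: fraction of pages on which the entity appears."""
    counts: dict[tuple[str, str], int] = defaultdict(int)
    for page in pages:
        # set() so an entity repeated on one page is counted once for that page.
        for entity in set(extract_entities(page)):
            counts[entity] += 1
    return {entity: n / len(pages) for entity, n in counts.items()}


def define_product(scores: dict[tuple[str, str], float]) -> dict[str, str]:
    """Assumed selection rule: keep the highest-scoring value per entity type."""
    product: dict[str, str] = {}
    # Sort by descending score, breaking ties by the entity key for determinism.
    for (etype, value), _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        product.setdefault(etype, value)
    return product


pages = [
    Page("https://example.com/a", "Product: Mug\nPrice $12.00",
         ["https://example.com/mug.jpg"]),
    Page("https://example.com/b", "Product: Mug\nSale $10.00 was $12.00"),
]
scores = score_entities(pages)
product = define_product(scores)
```

In this toy input the price `$12.00` appears on both pages and `$10.00` on only one, so under the assumed rule the former is selected for the product. A real system would presumably combine several signals into the overall score; the abstract does not specify which.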