Identifying and characterizing an analogy in a document, Application No. US201514672897 - 传众 Patent Search

Title of invention	Identifying and characterizing an analogy in a document
Abstract	Disclosed is a method and system for identifying and characterizing an analogy in a document. In one implementation, the method comprises identifying a candidate document. The candidate document comprises an analogy for a target concept, a region of interest, and a linguistic marker included in the region of interest. Further, the method comprises classifying the candidate document as an analogy document or a non-analogy document based upon the size of the region of interest and a count of the linguistic marker. Furthermore, the method comprises identifying a source concept from the analogy document. Subsequently, the method comprises characterizing the source concept with corresponding metadata. The metadata comprises a familiarity of the source concept, a length of the source concept, and a readability of the source concept.
Publication No.	US9588965(B2)	Publication date	2017.03.07
Application No.	US201514672897	Filing date	2015.03.30
Applicant	TATA CONSULTANCY SERVICES LIMITED	Inventors	Pedanekar Niranjan;Kumar Varun;Bhat Savita Suhas
Classification	G06F17/27;G06F17/28	Main classification	G06F17/27
Agency	Thompson Hine LLP	Agent	Thompson Hine LLP
Main claim	1. A method for identifying and characterizing an analogy in a document, the method comprising: identifying a candidate document, wherein the candidate document comprises an analogy for a target concept, a region of interest, and a linguistic marker included in the region of interest; classifying the candidate document as an analogy document or a non-analogy document based upon a size of the region of interest and a count of the linguistic marker; identifying a source concept from the analogy document, wherein the source concept comprises the analogy; and characterizing the source concept with corresponding metadata, wherein the metadata comprises a familiarity of the source concept, a length of the source concept, and a readability of the source concept, and wherein the familiarity of the source concept is calculated using an extracting Distributional related words using Co-occurrences (DISCO) tool, the length of the source concept is calculated using the size of the region of interest, and the readability of the source concept is calculated using a Flesch-Kincaid readability score method.
Address	Mumbai, Maharashtra IN
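
Illustrative sketch	The classification and readability steps named in the claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the marker phrases in MARKERS, the thresholds min_words and min_markers, the use of word count as the "size" of the region of interest, and the vowel-group syllable heuristic are all hypothetical choices that do not appear in this record. The readability function uses the standard Flesch-Kincaid grade-level formula; the record does not say which Flesch-Kincaid variant is intended. The DISCO-based familiarity step is omitted.

```python
import re

# Hypothetical marker phrases; the record does not list the actual markers.
MARKERS = ("is like", "similar to", "analogous to", "just as", "think of")


def count_markers(region: str) -> int:
    """Count occurrences of the assumed marker phrases in a region of interest."""
    text = region.lower()
    return sum(text.count(marker) for marker in MARKERS)


def classify(region: str, min_words: int = 20, min_markers: int = 1) -> str:
    """Label a region using its size (in words) and its marker count.

    Both thresholds are illustrative assumptions.
    """
    size = len(region.split())
    if size >= min_words and count_markers(region) >= min_markers:
        return "analogy"
    return "non-analogy"


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel groups, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

For example, classify("An atom is like a tiny solar system", min_words=5) labels the region as an analogy because it has eight words and one assumed marker phrase, while flesch_kincaid_grade would then supply the readability entry of the source concept's metadata.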