Abstract |
The present invention relates to text analysis and discloses a text representation method. Aspects include identifying concepts in the text by using a knowledge base, determining relationships between the concepts, and generating a concept graph from those relationships. Aspects also include determining connected components of the concept graph, calculating a weight for each connected component, and determining the concepts that represent the text according to the weights of the connected components. By using the correlation between concepts in a knowledge base and the connected-component theory of graphs, the method finds the set of concepts that best represents the subject of the text and removes concepts irrelevant to the subject, thus improving the accuracy of text representation and reducing noise. |
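
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not say how a component's weight is computed or how the representative concepts are selected from the weights. The sketch assumes the weight is the total mention count of the component's concepts and that the heaviest component is kept. The names `connected_components`, `representative_concepts`, `concept_counts`, and `related_pairs` are hypothetical, and the knowledge base is reduced to a plain list of related concept pairs.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def representative_concepts(concept_counts, related_pairs):
    """Pick the concepts of the heaviest connected component.

    concept_counts: concept -> number of mentions in the text (hypothetical input).
    related_pairs:  concept pairs the knowledge base marks as related (hypothetical input).
    """
    # Keep only relationships whose two concepts both occur in the text.
    edges = [
        (a, b)
        for a, b in related_pairs
        if a in concept_counts and b in concept_counts
    ]
    components = connected_components(list(concept_counts), edges)
    # Assumed weighting: sum of mention counts within the component.
    return max(components, key=lambda c: sum(concept_counts[x] for x in c))


concept_counts = {
    "neural network": 3,
    "deep learning": 2,
    "gradient descent": 1,
    "banana": 1,
}
related_pairs = [
    ("neural network", "deep learning"),
    ("deep learning", "gradient descent"),
]
selected = representative_concepts(concept_counts, related_pairs)
```

In this toy input, the three machine-learning concepts form one component with an assumed weight of 6, while "banana" is an isolated component with weight 1, so it is dropped as noise.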