Title of invention |
SYSTEM AND METHOD FOR EXTRACTION OF OFF-TOPIC PART FROM CONVERSATION |
Abstract |
A system and method extract off-topic parts from a conversation. The system includes: a first corpus containing documents from a plurality of fields; a second corpus containing only documents from the field to which the conversation belongs; a determination means that designates as a lower limit subject word any word whose IDF value for the first corpus and IDF value for the second corpus are each below a first threshold value; a score calculation part that calculates, as a score, a TF-IDF value for each word included in the second corpus; a clipping part that sequentially cuts out intervals from text data representing the contents of the conversation; and an extraction part that extracts as an off-topic part any interval in which the average score of the words in the clipped interval is larger than a second threshold value. |
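The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed implementation. The abstract does not say how TF is computed, how a lower limit subject word is scored, or how intervals are sized, so the sketch assumes: TF is a word's relative frequency across the second corpus, lower limit subject words receive a fixed `floor` score, intervals are fixed-size sliding windows, and words absent from the second corpus are ignored when averaging. All function and parameter names are hypothetical.

```python
import math
from collections import Counter


def idf(docs):
    """IDF per word over a corpus given as a list of token lists: log(N / df)."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    return {w: math.log(n / c) for w, c in df.items()}


def lower_limit_words(idf_first, idf_second, threshold1):
    """Words whose IDF is below threshold1 in BOTH corpora."""
    return {
        w for w, v in idf_second.items()
        if v < threshold1 and w in idf_first and idf_first[w] < threshold1
    }


def word_scores(second_docs, idf_second, floor_words, floor=0.0):
    """TF-IDF score for each word in the second corpus.

    Assumption: TF is the relative frequency over the whole second corpus,
    and lower limit subject words are pinned to the constant `floor`.
    """
    counts = Counter(w for doc in second_docs for w in doc)
    total = sum(counts.values())
    return {
        w: floor if w in floor_words else (counts[w] / total) * idf_second[w]
        for w in counts
    }


def off_topic_intervals(tokens, scores, window, step, threshold2):
    """Slide a window over the conversation tokens and return (start, end)
    index pairs whose average word score exceeds threshold2.

    Assumption: words without a score are skipped when averaging.
    """
    hits = []
    for start in range(0, max(len(tokens) - window, 0) + 1, step):
        chunk = tokens[start:start + window]
        known = [scores[w] for w in chunk if w in scores]
        if known and sum(known) / len(known) > threshold2:
            hits.append((start, start + len(chunk)))
    return hits
```

In use, `idf` would be called once per corpus, the two results passed to `lower_limit_words`, and the resulting set handed to `word_scores` before scanning the transcript with `off_topic_intervals`. Both thresholds, the window size, and the step would need tuning on real conversation data.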
Application publication number |
US2013185308(A1) |
Application publication date |
2013.07.18 |
Application number |
US201313740473 |
Filing date |
2013.01.14 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
ITOH NOBUYASU;NISHIMURA MASAFUMI;YAMAGUCHI YUTO |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|