Title of invention: TEXT SEGMENTATION AND LABEL ASSIGNMENT WITH USER INTERACTION BY MEANS OF TOPIC SPECIFIC LANGUAGE MODELS, AND TOPIC-SPECIFIC LABEL STATISTICS
Abstract: The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text by making use of statistical models trained on annotated training data. The method segments the text into text sections and assigns labels to the text sections as section headings. The resulting segmentation and label assignment are provided to a user for general review. Additionally, alternative segmentations and label assignments are provided to the user, who can select alternative segmentations and alternative labels as well as enter a user-defined segmentation and a user-defined label. In response to the modifications introduced by the user, a plurality of different actions is initiated, including the re-segmentation and re-labeling of successive parts of the document or of the entire document.
Publication number: US2014236580(A1) Publication date: 2014.08.21
Application number: US201414184440 Filing date: 2014.02.19
Applicant: Nuance Communications Austria Inventors: Peters Jochen; Matusov Evgeny; Meyer Carsten; Klakow Dietrich
Classification: G06F17/21; G06F17/27 Main classification: G06F17/21
Agency: Agent:
Principal claim: 1. A method of segmentation of a text (512) into text sections and assigning a topic to each text section on the basis of annotated training data, the method comprising the steps of: segmenting the text (512) into text sections by making use of statistical models (514) extracted from training data, assigning a topic being indicative of the content of the text section to each text section by making use of the statistical models extracted from the training data, generating a structured text by inserting a label as a section heading into the text in order to assign the label to the text section, providing the structured text to a user (506), and processing modifications of the structured text in response to a user's review.
Address: Vienna AT
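The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the record does not say which statistical models are used, so the sketch assumes per-topic unigram language models with add-one smoothing, a fixed penalty for changing topic between sentences, and a Viterbi-style search over sentence-level topic sequences. All function names (`train_topic_models`, `segment_and_label`, `render`, `apply_user_label`) and the `switch_penalty` parameter are hypothetical.

```python
import math
from collections import Counter


def train_topic_models(annotated):
    """Build per-topic word counts from (label, text) training pairs."""
    counts = {}
    for label, text in annotated:
        counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for c in counts.values() for w in c}
    return counts, vocab


def log_prob(sentence, counter, vocab_size):
    """Unigram log-probability with add-one smoothing (assumed model)."""
    total = sum(counter.values())
    return sum(
        math.log((counter[w] + 1) / (total + vocab_size + 1))
        for w in sentence.lower().split()
    )


def segment_and_label(sentences, counts, vocab, switch_penalty=2.0):
    """Pick the best topic per sentence, then merge runs into sections."""
    topics = sorted(counts)
    size = len(vocab)
    # best[t] = (score, topic path) of the best sequence ending in topic t
    best = {t: (log_prob(sentences[0], counts[t], size), [t]) for t in topics}
    for sent in sentences[1:]:
        new_best = {}
        for t in topics:
            # Staying in the same topic is free; switching costs a penalty.
            score, prev = max(
                (best[p][0] - (0.0 if p == t else switch_penalty), p)
                for p in topics
            )
            new_best[t] = (
                score + log_prob(sent, counts[t], size),
                best[prev][1] + [t],
            )
        best = new_best
    path = max(best.values())[1]

    sections = []
    for sent, topic in zip(sentences, path):
        if sections and sections[-1][0] == topic:
            sections[-1][1].append(sent)
        else:
            sections.append((topic, [sent]))
    return sections


def render(sections):
    """Insert each label as a section heading above its text."""
    return "\n".join(
        f"{label.upper()}:\n" + " ".join(sents) for label, sents in sections
    )


def apply_user_label(sections, index, new_label):
    """Return a copy of sections with one label replaced by the user's choice."""
    updated = [(label, list(sents)) for label, sents in sections]
    updated[index] = (new_label, updated[index][1])
    return updated
```

A caller would train the models once on annotated text, run `segment_and_label` on the sentences of a new document, show the output of `render` to the reviewer, and feed any correction back through `apply_user_label`. Re-segmenting the remainder of the document after a correction, as the abstract describes, is left out of this sketch.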