Title of invention: Method and apparatus for text analysis
Abstract: An electronic text analyzer operates on an ordered block of digitally coded text by analyzing sequential strings thereof to determine paragraph and sentence boundaries. Each string is broken down into component words. Possible abbreviations are identified and checked against a table of common abbreviations to identify abbreviations which cannot end a sentence. End punctuation and the following string are analyzed to identify the terminal word of a sentence. When sentence boundaries have been determined, the text may be further processed by a grammar checker, a readability analyzer, or other higher-level text processing system. A preferred embodiment includes a readability analyzer having a syllable counter for determining the number of syllables in each word. The system includes a modified common-word table having an empirical syllable-count field. A checker first determines if a word is in the table and, if so, returns its syllable count. An exception table identifies words not conforming to a syllable counting algorithm. Each word not in the common-word or exception tables is modified, and the modified word is processed to derive its syllable count. In a preferred embodiment, tallies are kept of words per sentence, syllable count, sentences per paragraph, and similar data, and readability scores based on the tallies are displayed.
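
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The table contents, the function names (`split_sentences`, `count_syllables`, `tally`), the "next string does not start with a lowercase letter" boundary test, and the silent-"e" word modification are all assumptions chosen for illustration; the abstract does not specify them.

```python
import re

# Hypothetical table of abbreviations that cannot end a sentence.
NON_TERMINAL_ABBREVIATIONS = {"mr.", "mrs.", "dr.", "e.g.", "i.e.", "vs."}

# Hypothetical common-word table with an empirical syllable-count field.
COMMON_WORDS = {"the": 1, "a": 1, "is": 1, "every": 2, "people": 2}

# Hypothetical exception table for words the fallback heuristic miscounts.
EXCEPTIONS = {"area": 3, "being": 2}


def split_sentences(text):
    """Split text into sentences using end punctuation and the following string."""
    sentences, current = [], []
    tokens = text.split()
    for i, token in enumerate(tokens):
        current.append(token)
        if not token.endswith((".", "!", "?")):
            continue
        # A listed abbreviation cannot be the terminal word of a sentence.
        if token.lower() in NON_TERMINAL_ABBREVIATIONS:
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # Assumed rule: a following string that starts lowercase continues the sentence.
        if nxt is None or not nxt[0].islower():
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences


def count_syllables(word):
    """Look the word up in the tables first, then fall back to a vowel-group count."""
    w = re.sub(r"[^a-z]", "", word.lower())
    if not w:
        return 0
    if w in COMMON_WORDS:
        return COMMON_WORDS[w]
    if w in EXCEPTIONS:
        return EXCEPTIONS[w]
    # Assumed modification: drop a silent final "e" (but keep "-le" endings).
    if len(w) > 2 and w.endswith("e") and not w.endswith("le"):
        w = w[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", w)))


def tally(text):
    """Keep simple tallies of sentences, words, and syllables for a block of text."""
    sentences = split_sentences(text)
    words = [w for s in sentences for w in s.split()]
    syllables = sum(count_syllables(w) for w in words)
    return {
        "sentences": len(sentences),
        "words": len(words),
        "syllables": syllables,
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }


if __name__ == "__main__":
    sample = "Dr. Smith met Mr. Jones at 5 p.m. yesterday. They talked about the table."
    print(split_sentences(sample))
    print(tally(sample))
```

In this sketch, "Dr." and "Mr." are rejected as sentence ends by the abbreviation table, while "p.m." is rejected because the following string starts with a lowercase letter. A readability score would then be computed from the tallies; the abstract does not say which formula is used, so none is shown here.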
Publication number: US4773009(A) Publication date: 1988.09.20
Application number: US19860872094 Filing date: 1986.06.06
Applicant: HOUGHTON MIFFLIN COMPANY Inventors: KUCERA, HENRY; SOKOLOWSKI, RACHAEL; RUSSOM, JACQUELINE
Classification: G06F17/27;(IPC1-7):G06F15/40 Main classification: G06F17/27
Agency: Agent:
Main claim:
Address: