Title of invention RELEVANT CHARACTER STRING FORMATION DEVICE, PROGRAM AND STORAGE MEDIUM
Abstract PROBLEM TO BE SOLVED: To automatically extract pairs of character strings that have an appropriate semantic relation from document data. SOLUTION: A language analysis means 22 divides the document data into a sequence of consecutive segments. A segment area discrimination means 23 uses the divided segments to discriminate a continuous segment area that contains a first segment string satisfying a predetermined condition and a pair of segments, satisfying a predetermined condition, that enclose a second segment string composed of one or more segments. Specifically, the condition on the first segment string is that every segment in it has, as its independent word, an indeclinable word or a word of an equivalent word class; the condition on the pair of segments enclosing the second segment string is that the independent words of the two segments are symbolic characters that form a matched pair. An inter-segment string relation identification means 24 identifies the relation between the discriminated first segment string and second segment string. COPYRIGHT: (C)2006, JPO&NCIPI
Publication number JP2006190044(A) Publication date 2006.07.20
Application number JP20050000934 Filing date 2005.01.05
Applicant RICOH CO LTD Inventors KENMOCHI EIJI;SATO NAHOKO;SHIMADA ATSUO
Classification G06F17/30;G06F17/28 Main classification G06F17/30
Agency Agent
Main claim
Address
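
The discrimination step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the language analysis has already produced segments with part-of-speech tags, that "indeclinable word" can be approximated by a `noun` tag, that the first segment string immediately precedes the opening symbol, and that the paired symbols are brackets. The names `Segment`, `PAIRED_SYMBOLS`, and `find_candidate_pairs` are hypothetical, and the relation identification performed by means 24 is not modeled.

```python
from dataclasses import dataclass

# Assumption: bracket-like characters stand in for the abstract's
# "symbolic characters having a paired relation".
PAIRED_SYMBOLS = {"(": ")", "（": "）", "「": "」", "[": "]"}


@dataclass(frozen=True)
class Segment:
    text: str
    pos: str  # hypothetical tag set: "noun", "verb", "det", "symbol", ...


def is_nominal(seg):
    """Approximate 'indeclinable word or equivalent word class' by a noun tag."""
    return seg.pos == "noun"


def find_candidate_pairs(segments):
    """Return (first_string, second_string) pairs found in a segment list.

    first_string: the maximal run of nominal segments directly before an
    opening symbol. second_string: the segments enclosed by the matched
    pair of symbols.
    """
    pairs = []
    i = 0
    while i < len(segments):
        seg = segments[i]
        closer = PAIRED_SYMBOLS.get(seg.text) if seg.pos == "symbol" else None
        if closer is None:
            i += 1
            continue
        # Look ahead for the matching closing symbol.
        j = i + 1
        while j < len(segments) and not (
            segments[j].pos == "symbol" and segments[j].text == closer
        ):
            j += 1
        if j == len(segments) or j == i + 1:
            i += 1  # unclosed or empty brackets: no candidate here
            continue
        second = segments[i + 1 : j]
        # Walk backwards over nominal segments to collect the first string.
        k = i
        while k > 0 and is_nominal(segments[k - 1]):
            k -= 1
        first = segments[k:i]
        if first:
            pairs.append(
                (" ".join(s.text for s in first), " ".join(s.text for s in second))
            )
        i = j + 1
    return pairs


example = [
    Segment("The", "det"),
    Segment("World", "noun"),
    Segment("Health", "noun"),
    Segment("Organization", "noun"),
    Segment("(", "symbol"),
    Segment("WHO", "noun"),
    Segment(")", "symbol"),
    Segment("said", "verb"),
]
print(find_candidate_pairs(example))  # [('World Health Organization', 'WHO')]
```

Each returned pair is only a candidate: a later step, corresponding to the inter-segment string relation identification means, would still have to decide what relation, if any, holds between the two strings.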