Independent claim |
1. A method implemented on a computing device comprising one or more processors, the method comprising:
receiving a first term as a name or description representing an object, wherein the object includes a physical or conceptual object, a topic, or an attribute associated with one or more objects, wherein the first term is received from a source including manual input from a user, or automatic input from a computing device, for automatically gathering information or knowledge about the object represented by the first term from unstructured data sources using a machine-based method;
receiving a first group of text units comprising at least two words, or one or more phrases or sentences or paragraphs or documents, wherein at least half of the text units contain the first term or are from contents that contain the first term, and at least half of the text units contain one or more unspecified second terms each being different from the first term;
for one or more second terms in the first group of text units, producing a cumulative value based at least on the number of text units that contain both the first term and the second term;
producing a first score value based at least on dividing the cumulative value by at least half of the total number of the text units that contain the first term or are from contents that contain the first term;
selecting one or more of the second terms based on the first score value;
assembling the selected terms into a term set;
attaching the term set to the first term to form a dataset, wherein the function of the selected terms includes representing terms associated with the first term, or representing properties associated with the object, or representing information about the object with information that is automatically gathered from unstructured text contents by using a machine-based method; and
outputting the dataset as a form of information representation or knowledge representation for a specific object represented by the first term. |
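
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several choices are assumptions made only for illustration: the function name `build_term_dataset`, the simple regex tokenizer, the restriction to a single-word first term, the optional stopword filter, the use of the full count of matching text units as the divisor (the claim requires only "at least half" of that count), and the `threshold` selection rule.

```python
import re
from collections import Counter


def build_term_dataset(first_term, text_units, threshold=0.5, stopwords=frozenset()):
    """Illustrative sketch of the claimed steps (hypothetical helper, not from the patent)."""
    first = first_term.lower()

    # Assumption: a text unit "contains" a term if the term appears among its
    # lowercase word tokens. Only single-word terms are handled here.
    tokenized = [set(re.findall(r"[a-z0-9']+", unit.lower())) for unit in text_units]
    containing = [tokens for tokens in tokenized if first in tokens]
    if not containing:
        return {first_term: {}}

    # Cumulative value: number of text units containing both the first term
    # and a given second term.
    cumulative = Counter()
    for tokens in containing:
        for term in tokens:
            if term != first and term not in stopwords:
                cumulative[term] += 1

    # First score value: cumulative value divided by the number of text units
    # that contain the first term (assumed divisor; the claim allows any value
    # that is at least half of that number).
    total = len(containing)
    scores = {term: count / total for term, count in cumulative.items()}

    # Select second terms by score, assemble the term set, and attach it to
    # the first term to form the dataset.
    term_set = {term: score for term, score in scores.items() if score >= threshold}
    return {first_term: term_set}


units = [
    "The computer has a fast CPU.",
    "A computer needs memory and a CPU.",
    "My computer has memory.",
    "Bananas are yellow.",
]
dataset = build_term_dataset(
    "computer", units, threshold=0.6,
    stopwords=frozenset({"the", "a", "has", "and", "my"}),
)
print(dataset)
```

In this toy input, "cpu" and "memory" each co-occur with "computer" in two of the three text units that contain it, so both score 2/3 and pass the assumed 0.6 threshold, while terms that co-occur only once are dropped.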