Independent claim |
1. A computer-implemented method comprising:
identifying, using one or more computing devices, a context;
determining, using the one or more computing devices, a plurality of tagsets each including one or more tags describing an entity and a vocabulary of unique tags defined by the identified context;
generating, using the one or more computing devices, counts statistics using the plurality of tagsets and the vocabulary of unique tags;
determining a measure of co-occurrence consistency for a pair of tags in the vocabulary of unique tags based on the counts statistics, the measure of co-occurrence consistency indicating a likelihood of the pair of tags co-occurring in a tagset from the plurality of tagsets relative to random;
generating, using the one or more computing devices, a weighted tag co-occurrence graph including the pair of tags in the vocabulary of unique tags based on the measure of co-occurrence consistency;
denoising, using the one or more computing devices, the weighted tag co-occurrence graph; and
responsive to removing the noise, identifying, using the one or more computing devices, at least one community in the weighted tag co-occurrence graph. |
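
The steps recited in the claim can be illustrated with a minimal sketch. The claim does not name a specific consistency measure, denoising rule, or community-detection algorithm, so the choices here are assumptions made only for illustration: pointwise mutual information (PMI) stands in for the measure of co-occurrence relative to random, a simple weight threshold stands in for denoising, and connected components stand in for communities. All function names and the sample tagsets are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_graph(tagsets):
    """Build {(tag_a, tag_b): weight} from tagsets.

    Assumption: the weight is PMI, log(P(a, b) / (P(a) * P(b))), used here
    as a stand-in for the claim's co-occurrence measure. A positive value
    means the pair co-occurs more often than expected at random.
    """
    n = len(tagsets)
    sets = [set(t) for t in tagsets]
    # Count statistics: per-tag and per-pair tagset frequencies.
    # The keys of tag_counts form the vocabulary of unique tags.
    tag_counts = Counter(tag for s in sets for tag in s)
    pair_counts = Counter(
        pair for s in sets for pair in combinations(sorted(s), 2)
    )
    return {
        (a, b): math.log(n_ab * n / (tag_counts[a] * tag_counts[b]))
        for (a, b), n_ab in pair_counts.items()
    }


def denoise(graph, min_weight=0.0):
    """Assumed denoising rule: keep only edges above a weight threshold."""
    return {pair: w for pair, w in graph.items() if w > min_weight}


def communities(graph):
    """Assumed community rule: connected components, found by union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in graph:
        parent[find(a)] = find(b)

    groups = {}
    for tag in parent:
        groups.setdefault(find(tag), set()).add(tag)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical tagsets for one context (for example, photo tags).
tagsets = [
    ["beach", "sand", "sea"],
    ["beach", "sea"],
    ["sea", "sand"],
    ["city", "street"],
    ["city", "street", "night"],
    ["beach", "city"],
]

graph = cooccurrence_graph(tagsets)
result = communities(denoise(graph))
print(result)
```

In this sample, the pair ("beach", "city") co-occurs less often than chance would predict, so its edge is dropped during denoising and the two groups of tags separate into distinct communities. A production system would likely use a more robust community-detection method than connected components.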