Abstract |
597986 Disclosed is a method of grouping communication sessions based on data extracted from them. The method includes selecting a plurality of communication sessions from a data stream and searching each communication session for at least one entity. Each occurrence of an entity is then extracted along with its context, where a context is a predetermined number of characters before and/or after the entity. The method also includes determining, based on data structures within said contexts, which data structures (1, 2) of the communication sessions occur more frequently than chance. The method further includes sorting the communication sessions into groups, wherein communication sessions having similar data structures (1, 2) that were determined to occur more frequently than chance are sorted into the same group. |
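
The steps in the abstract (entity search, context extraction, a frequency test, grouping) can be illustrated with a minimal Python sketch. It is not the disclosed implementation, and everything specific in it is an assumption made for illustration: the entity is taken to be an email address, a "data structure" is taken to be a whole word inside the context window, "more frequently than chance" is approximated by comparing a word's rate inside contexts with its rate across the whole stream (the hypothetical `ratio` and `min_sessions` parameters), and "similar" is simplified to "identical set of retained words". The function names are likewise hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: the entity of interest is an email address.
ENTITY = re.compile(r"[\w.+-]+@[\w-]+\.\w+")
WORD = re.compile(r"\w+")


def context_words(session, width=25):
    """Words lying wholly within `width` characters before or after each entity."""
    words = []
    for ent in ENTITY.finditer(session):
        lo, hi = ent.start() - width, ent.end() + width
        for w in WORD.finditer(session):
            before = lo <= w.start() and w.end() <= ent.start()
            after = ent.end() <= w.start() and w.end() <= hi
            if before or after:
                words.append(w.group().lower())
    return words


def frequent_structures(sessions, width=25, ratio=2.0, min_sessions=2):
    """Context words over-represented relative to the whole stream.

    Stand-in for a proper significance test: a word is kept if it appears in
    the contexts of at least `min_sessions` sessions and its rate inside
    contexts exceeds `ratio` times its rate over all session text.
    """
    ctx, background, seen_in = Counter(), Counter(), Counter()
    for s in sessions:
        cw = context_words(s, width)
        ctx.update(cw)
        seen_in.update(set(cw))  # count each session at most once per word
        background.update(w.lower() for w in WORD.findall(s))
    ctx_total, bg_total = sum(ctx.values()), sum(background.values())
    if ctx_total == 0:
        return set()
    return {
        w for w in ctx
        if seen_in[w] >= min_sessions
        and ctx[w] / ctx_total > ratio * background[w] / bg_total
    }


def group_sessions(sessions, width=25, ratio=2.0, min_sessions=2):
    """Group session indices by the set of retained context words they share."""
    keep = frequent_structures(sessions, width, ratio, min_sessions)
    groups = defaultdict(list)
    for i, s in enumerate(sessions):
        key = frozenset(set(context_words(s, width)) & keep)
        groups[key].append(i)
    return list(groups.values())
```

Calling `group_sessions(sessions, width=12)` on a list of session strings returns lists of session indices, one list per group. Sessions with no retained context words all share the empty key and therefore end up in one group; a fuller treatment would likely replace the exact-match key with a similarity measure and the rate comparison with a statistical test.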