摘要 |
<p>A method of phonetically searching media information comprises receiving a plurality of search queries from one or more client systems and providing a phonetic representation of each search query. One or more search jobs are instantiated, each search job comprising a plurality of tasks, each task being arranged to sequentially read a block from an archive file. The archive file is stored within a distributed filing system (DFS) in which sequential blocks of data comprising the archive file are replicated to be locally available to one or more processors from a cluster of processors for executing the tasks. Each block stores index files corresponding to a plurality of the source media files, each index file containing a phonetic stream corresponding to audio information for a given source media file. Each task obtains phonetic representations of outstanding search queries for a block and sequentially searches the block for each outstanding search query. Responsive to matching a search query with a location within the phonetic stream for an index file, the location and an identifier of the source media are returned for responding to the search query.</p> |