Abstract |
The present invention relates to systems and methods for analyzing media material having a layout. A media material analyzer includes a segmenter and an article composer. The segmenter identifies block segments associated with columnar body text in the media material. The article composer determines which of the identified block segments belong to one or more articles in the media material. The article composer can determine whether candidate block segments belong to the same article based on language statistics information, layout transition information, or both. A system for searching media material having a layout over a network is also provided.
|
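The article composer described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the method of the invention: the `Block` structure, the use of word-count cosine similarity as a stand-in for "language statistics information", the fixed transition scores used for "layout transition information", and the weights and threshold are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
import math
import re


@dataclass
class Block:
    """A block segment of columnar body text (hypothetical structure)."""
    text: str
    column: int    # zero-based column index on the page
    top: float     # y-coordinate of the top edge (smaller is higher on the page)
    bottom: float  # y-coordinate of the bottom edge


def word_counts(text):
    """Count lowercase word tokens in a block's text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def language_score(a, b):
    """Cosine similarity of word counts; an assumed stand-in for language statistics."""
    wa, wb = word_counts(a.text), word_counts(b.text)
    dot = sum(wa[w] * wb[w] for w in wa)
    norm = (math.sqrt(sum(v * v for v in wa.values()))
            * math.sqrt(sum(v * v for v in wb.values())))
    return dot / norm if norm else 0.0


def layout_score(a, b):
    """Score how plausible the transition a -> b is (values are assumptions)."""
    if b.column == a.column and b.top >= a.bottom:
        return 1.0  # continues below in the same column
    if b.column == a.column + 1 and b.top <= a.top:
        return 0.8  # continues at or above the same height in the next column
    return 0.1      # any other transition is treated as unlikely


def same_article(a, b, w_lang=0.5, w_layout=0.5, threshold=0.5):
    """Combine both signals; setting either weight to 0 uses one signal alone."""
    score = w_lang * language_score(a, b) + w_layout * layout_score(a, b)
    return score >= threshold


def compose_articles(blocks):
    """Greedily chain blocks, visited in column-then-top order, into articles."""
    ordered = sorted(blocks, key=lambda blk: (blk.column, blk.top))
    articles = []
    for block in ordered:
        if articles and same_article(articles[-1][-1], block):
            articles[-1].append(block)
        else:
            articles.append([block])
    return articles
```

A segmenter would supply the `Block` list; here it is assumed to exist upstream. The greedy chaining compares each block only with the last block of the current article, which keeps the sketch short but would miss articles that interleave across columns.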