Title of invention: System and method for using XML to normalize documents
Abstract: A system, method, and processor-readable medium for normalizing documents using Extensible Markup Language (XML). The system may determine the type of an object repository storing at least one object, where the object may include metadata. The system may then identify the object stored in the object repository. At least one portion of the object may be extracted from the repository in XML format. Preferably, some of the metadata is preserved; the preserved metadata may include at least one of author, title, subject, date created, date modified, list of modifiers, and link list information. A mapping may be performed that maps at least one field in the object to a field designation identifier. The portion may then be transmitted to a processor, which may perform one or more processes on it. The processor may include at least one of a full-text engine, a metrics engine, and a taxonomy engine.
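The extraction and mapping steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `normalize`, the `FIELD_MAP` table, the XML element names, and the sample object are all hypothetical, chosen only to show how repository-specific fields might be mapped to field designation identifiers while metadata is preserved in XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from repository-specific field names to
# field designation identifiers (all names are illustrative assumptions).
FIELD_MAP = {
    "From": "author",
    "Subject": "title",
    "Categories": "subject",
    "Created": "date_created",
    "Modified": "date_modified",
}


def normalize(obj: dict, repository_type: str) -> str:
    """Return an XML string for one repository object.

    Metadata fields present in FIELD_MAP are preserved under <metadata>,
    each tagged with its field designation identifier.
    """
    root = ET.Element("document", {"repository": repository_type})
    meta = ET.SubElement(root, "metadata")
    for source_field, designation in FIELD_MAP.items():
        if source_field in obj:
            field = ET.SubElement(meta, "field", {"id": designation})
            field.text = str(obj[source_field])
    # The extracted portion of the object; here, simply its body text.
    body = ET.SubElement(root, "body")
    body.text = obj.get("Body", "")
    return ET.tostring(root, encoding="unicode")


# Illustrative sample object (not taken from the patent).
sample = {
    "From": "J. Doe",
    "Subject": "Status report",
    "Created": "2002-01-15",
    "Body": "All systems nominal.",
}
xml_text = normalize(sample, "mail")
```

The resulting string could then be handed to a downstream processor, such as a full-text or taxonomy engine, which would only need to understand the normalized field identifiers rather than each repository's native format.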
Publication number: US7222297(B2) | Publication date: 2007.05.22
Application number: US20020044913 | Filing date: 2002.01.15
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION | Inventors: GOODWIN JAMES PATRICK; WILSON PAUL LEWIS
Classification: G06N3/00; G06F7/00; G06F17/30; G06N5/00; G06Q10/00; G06Q30/00 | Main classification: G06N3/00
Agency: | Agent:
Main claim:
Address: