Abstract |
A system and method for developing a pharmacovigilance database from source data and reference data. The unedited source data contains verbatim terms. The method includes parsing source data into a relational safety database; performing cleanup on the relational safety database; and mapping verbatim terms from the cleaned safety database to at least one token from at least one reference source. Cleanup includes removing redundant entries, correcting misspellings, removing irrelevant non-alpha characters and noise words, and relocating dislocated terms. Mapping verbatim terms to tokens includes nominating tokens from the source data, choosing tokens from the reference sources, and linking chosen tokens to corresponding verbatim terms. In one embodiment, the history of cleanup and mapping is saved as the pedigree of the verbatim-to-token mapping.
|
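The cleanup and mapping steps described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the noise-word list, the spelling table, the reference token set, and the names `clean`, `map_terms`, and `Mapping` are all hypothetical, chosen only for illustration. The sketch works on an in-memory list rather than a relational database and does not cover relocating dislocated terms.

```python
import re
from dataclasses import dataclass, field

# Hypothetical illustration data; none of these values come from the source.
NOISE_WORDS = {"pt", "had", "the", "a", "an", "of"}
SPELLING_FIXES = {"nausia": "nausea", "headach": "headache"}
REFERENCE_TOKENS = {"nausea", "headache", "rash"}


@dataclass
class Mapping:
    verbatim: str                                  # original, unedited term
    tokens: list                                   # reference tokens linked to it
    pedigree: list = field(default_factory=list)   # history of cleanup and mapping


def clean(verbatim):
    """Return the cleaned words and the list of cleanup steps that changed them."""
    steps = []
    text = verbatim.lower()

    # Replace every character that is not a letter or whitespace with a space.
    stripped = re.sub(r"[^a-z\s]", " ", text)
    if stripped != text:
        steps.append("removed non-alpha characters")

    words = stripped.split()
    fixed = [SPELLING_FIXES.get(w, w) for w in words]
    if fixed != words:
        steps.append("corrected misspellings")

    kept = [w for w in fixed if w not in NOISE_WORDS]
    if kept != fixed:
        steps.append("removed noise words")

    return kept, steps


def map_terms(verbatims):
    """Drop redundant entries, clean each term, and link reference tokens."""
    results = []
    seen = set()
    for verbatim in verbatims:
        if verbatim in seen:  # redundant entry, already processed
            continue
        seen.add(verbatim)

        words, steps = clean(verbatim)
        # Nominate cleaned words and keep those found in the reference source.
        tokens = [w for w in words if w in REFERENCE_TOKENS]
        steps.append(f"linked tokens: {tokens}")
        results.append(Mapping(verbatim, tokens, steps))
    return results


if __name__ == "__main__":
    for m in map_terms(["Pt had NAUSIA!!", "Pt had NAUSIA!!", "headach, rash"]):
        print(m.verbatim, "->", m.tokens, m.pedigree)
```

Each `Mapping` keeps the original verbatim term next to its tokens and the ordered list of steps applied, which loosely mirrors the pedigree idea: the path from raw text to token can be inspected afterwards.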