Abstract |
Embodiments of the invention provide methods, systems, and articles of manufacture for modeling molecular properties based on information obtained from sources other than direct empirical measurements of the properties. Embodiments of the invention use "virtual data" related to molecular properties to train a molecular properties model. Virtual data about a molecule may include real-valued data (e.g., measurement values falling along a continuous range) or a positive or negative assertion about whether the molecule exhibits a property of interest. Virtual data may be generated using a variety of techniques and may be further characterized by a confidence in its accuracy. In addition, embodiments of the invention may use "virtual molecules" paired with virtual data to train a molecular properties model. The virtual molecules may themselves be generated in a variety of ways.
|
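As an illustration only, the idea of training on measured and virtual data together, with each virtual value carrying a confidence, could be sketched as a confidence-weighted least-squares fit. The abstract does not specify a model type or how confidence enters training, so everything here is an assumption: the `TrainingExample` record, the single numeric `descriptor`, the use of confidence as a sample weight, and the toy values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingExample:
    descriptor: float   # hypothetical single numeric descriptor of a molecule
    value: float        # property value, measured or virtual
    confidence: float   # assumed weight in (0, 1]; 1.0 for trusted measurements
    is_virtual: bool    # True if the value is not a direct empirical measurement


def fit_weighted_line(examples):
    """Weighted least-squares fit of value ~ slope * descriptor + intercept.

    Each example contributes in proportion to its confidence, so
    low-confidence virtual data influences the fit less.
    """
    total = sum(e.confidence for e in examples)
    if total <= 0:
        raise ValueError("at least one example must have positive confidence")
    mean_x = sum(e.confidence * e.descriptor for e in examples) / total
    mean_y = sum(e.confidence * e.value for e in examples) / total
    sxx = sum(e.confidence * (e.descriptor - mean_x) ** 2 for e in examples)
    if sxx == 0:
        raise ValueError("descriptors must not all be identical")
    sxy = sum(
        e.confidence * (e.descriptor - mean_x) * (e.value - mean_y)
        for e in examples
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (made up for illustration): two measured molecules and one
# virtual molecule paired with a virtual value at reduced confidence.
examples = [
    TrainingExample(descriptor=1.0, value=2.0, confidence=1.0, is_virtual=False),
    TrainingExample(descriptor=2.0, value=4.0, confidence=1.0, is_virtual=False),
    TrainingExample(descriptor=3.0, value=6.0, confidence=0.5, is_virtual=True),
]
slope, intercept = fit_weighted_line(examples)
```

Treating confidence as a sample weight is only one possible design. A positive or negative assertion about a property could be handled the same way in a classifier that accepts per-sample weights.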