Systems, methods, and computer-readable storage media are provided for inferring missing or ambiguous attribute values for entities, for use in generating Web ranking signals for online search. The inference is based on partial information about such entities and/or information about other similar entities, as extracted from multiple information sources. A plurality of heterogeneous input data sources are ingested and combined to produce output data having information content that is more than the sum of its parts. A generic platform is provided into which multiple data sources having information content related to entity attributes can be plugged in without requiring additional changes to the platform. This generic plugin model for extracting and inferring entity attribute values makes it easy to leverage new data sources as they become available, thereby improving the final inferred attribute data.
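
By way of illustration only, the plugin model described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `Observation`, `AttributeSource`, `StaticSource`, and `InferencePlatform` are hypothetical, as is the combination rule (summing a per-source weight for each candidate value and keeping the highest total). Inference from similar entities is not modeled here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Protocol


class Observation(NamedTuple):
    entity_id: str
    attribute: str
    value: str
    weight: float  # hypothetical confidence assigned by the source


class AttributeSource(Protocol):
    """Hypothetical plugin interface: any object with observations() can be plugged in."""

    def observations(self) -> Iterable[Observation]: ...


class StaticSource:
    """Toy source backed by an in-memory list (stands in for a real data feed)."""

    def __init__(self, rows: List[Observation]) -> None:
        self._rows = rows

    def observations(self) -> Iterable[Observation]:
        return iter(self._rows)


class InferencePlatform:
    """Combines observations from all registered sources (assumed voting rule)."""

    def __init__(self) -> None:
        self._sources: List[AttributeSource] = []

    def register(self, source: AttributeSource) -> None:
        # Adding a source requires no change to the platform itself.
        self._sources.append(source)

    def infer(self) -> Dict[str, Dict[str, str]]:
        # votes[(entity_id, attribute)][value] holds the summed weight for that value.
        votes = defaultdict(lambda: defaultdict(float))
        for source in self._sources:
            for obs in source.observations():
                votes[(obs.entity_id, obs.attribute)][obs.value] += obs.weight

        result: Dict[str, Dict[str, str]] = defaultdict(dict)
        for (entity_id, attribute), candidates in votes.items():
            # Keep the candidate value with the highest combined weight.
            result[entity_id][attribute] = max(candidates, key=candidates.get)
        return {entity_id: dict(attrs) for entity_id, attrs in result.items()}


platform = InferencePlatform()
platform.register(StaticSource([Observation("biz-1", "category", "restaurant", 0.6)]))
platform.register(StaticSource([
    Observation("biz-1", "category", "cafe", 0.5),
    Observation("biz-1", "city", "Seattle", 0.9),
]))
platform.register(StaticSource([Observation("biz-1", "category", "cafe", 0.4)]))

inferred = platform.infer()
print(inferred)  # {'biz-1': {'category': 'cafe', 'city': 'Seattle'}}
```

In this toy data, the first source alone would give the category "restaurant" and no city. Once the other two sources are registered, their combined weight for "cafe" (0.9) exceeds that for "restaurant" (0.6), so the ambiguous category resolves to "cafe", and the missing city is filled in, without any change to `InferencePlatform`.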