Title of invention INFORMATION SENSORS FOR SENSING WEB DYNAMICS
Abstract Disclosed herein are techniques and systems for building “information sensors,” which are programmable “focused crawlers” that periodically discover, extract, analyze and aggregate structured information around a topic from the Web. A platform for building an information sensor allows a user to specify one or more data elements within a data source that the user desires to monitor, and an update frequency at which the data elements are to be extracted. Code may be generated based on the user specifications for creation and submission of the information sensor for storage in a database with metadata containing the code and update frequency. Once created, information sensors are scanned to check if running conditions are met, and if met, they may be executed by retrieving the metadata using a sensor identifier (ID). The code is executed to locate a data source, and periodically extract specified data elements therefrom to output structured time-series data.
Publication number US2016125083(A1) Publication date 2016.05.05
Application number US201314896339 Filing date 2013.06.07
Applicant DOU Zhicheng; WEN Ji-Rong; MICROSOFT TECHNOLOGY LICENSING, LLC Inventor Dou Zhicheng; Wen Ji-Rong
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim 1. A method comprising: scanning, by one or more processors, a set of information sensors to determine that a running condition is met for executing at least one information sensor in the set of information sensors; at least partly in response to a determination the running condition is met for the at least one information sensor, retrieving metadata associated with the at least one information sensor, the metadata including an update frequency and code to extract one or more data elements from a data source, the code being user-editable and providing predefined functions for at least extracting the one or more data elements from the data source; running, by the one or more processors, the code to: locate the data source, identify the one or more data elements within the data source, and periodically extract the one or more data elements from the data source according to the update frequency; and storing each extracted data element as a data point in a structured time series.
Address Shanghai CN
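
The method of claim 1 (scan the sensors, check a running condition, retrieve metadata by sensor ID, run the extraction code, store each value as a time-series data point) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name in it (`SensorMetadata`, `running_condition_met`, `scan_and_run`, `extract_price`, `fake_page`) is hypothetical, the running condition is assumed to be "the update frequency has elapsed since the last run", and an in-memory dictionary stands in for a real Web data source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SensorMetadata:
    """Hypothetical metadata record stored for one information sensor."""
    sensor_id: str
    update_frequency: float                 # seconds between extractions
    code: Callable[[], Dict[str, float]]    # stand-in for the user-editable extraction code
    last_run: Optional[float] = None        # time of the previous execution, if any


Registry = Dict[str, SensorMetadata]
TimeSeries = Dict[str, List[Tuple[float, float]]]


def running_condition_met(meta: SensorMetadata, now: float) -> bool:
    # Assumption: a sensor is due when it has never run or its update
    # frequency has elapsed. The source does not define the condition.
    return meta.last_run is None or now - meta.last_run >= meta.update_frequency


def scan_and_run(registry: Registry, series: TimeSeries, now: float) -> List[str]:
    """Scan all sensors, run the ones that are due, return their IDs."""
    executed = []
    for sensor_id in list(registry):
        meta = registry[sensor_id]          # retrieve metadata by sensor ID
        if not running_condition_met(meta, now):
            continue
        # Run the extraction code and store each element as a data point.
        for name, value in meta.code().items():
            series.setdefault(f"{sensor_id}.{name}", []).append((now, value))
        meta.last_run = now
        executed.append(sensor_id)
    return executed


# Usage: a fake data source whose "price" element changes over time.
fake_page = {"price": 10.0}


def extract_price() -> Dict[str, float]:
    return {"price": fake_page["price"]}


registry: Registry = {"s1": SensorMetadata("s1", 60.0, extract_price)}
series: TimeSeries = {}

ran_at_0 = scan_and_run(registry, series, now=0.0)    # first scan: sensor runs
ran_at_30 = scan_and_run(registry, series, now=30.0)  # too early: sensor skipped
fake_page["price"] = 12.5
ran_at_60 = scan_and_run(registry, series, now=60.0)  # frequency elapsed: runs again
```

Passing `now` explicitly rather than reading the system clock keeps the scan deterministic, which makes the scheduling behaviour easy to check in isolation. A real scanner would presumably run on a timer and fetch and parse an actual page inside `code`.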