Title of invention: Learning-based data decontextualization
Abstract: Techniques are described for employing a crowdsourcing framework to analyze data related to the performance or operations of computing systems, or to analyze other types of data. A question is analyzed to determine data that is relevant to the question. The relevant data may be decontextualized to remove or alter contextual information included in the data, such as sensitive, personal, or business-related data. The question and the decontextualized data may then be presented to workers in a crowdsourcing framework, and the workers may determine an answer to the question based on an analysis or an examination of the decontextualized data. The answers may be combined, correlated, or otherwise processed to determine a processed answer to the question. Machine learning techniques are employed to adjust and refine the decontextualization.
Publication number: US9342796(B1); Publication date: 2016.05.17
Application number: US201314028396; Filing date: 2013.09.16
Applicant: AMAZON TECHNOLOGIES, INC.; Inventors: McClintock Jon Arron; Stathakopoulos George Nikolaos; Brezinski Dominique Imjya
Classification: G06N99/00; Main classification: G06N99/00
Agency: Lindauer Law PLLC; Agent: Lindauer Law PLLC
Principal claim: 1. A computer-implemented method, comprising:
accessing a question having a predetermined answer, the question being associated with operations of at least one computing system;
accessing at least one dataset associated with the question, the at least one dataset including data describing the operations of the at least one computing system and contextual information about the data;
selecting at least one decontextualization operation from a plurality of decontextualization operations, wherein the at least one decontextualization operation at least partly alters the contextual information about the data included in the at least one dataset;
applying the at least one decontextualization operation to the at least one dataset to determine at least one modified dataset in which the contextual information about the data included in the at least one dataset is at least partly altered;
sending the at least one modified dataset including the data and the altered contextual information and the question to a plurality of worker devices associated with a plurality of workers in a crowdsourcing framework;
receiving, from the plurality of worker devices, a plurality of answers to the question, the plurality of answers generated by the plurality of workers analyzing the at least one modified dataset having the contextual information about the data at least partly altered and the data in view of the question;
incorporating, into training data, the plurality of answers and information describing the at least one decontextualization operation; and
employing the training data in machine learning to train a decontextualizer to be used in subsequent data decontextualization to answer subsequent questions using the crowdsourcing framework.
Address: Reno, NV, US
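The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the two operations (`mask_ips`, `drop_usernames`), the log-line format, the majority-vote combiner, and the `accuracy` field are all hypothetical choices made for illustration, and the crowd answers are supplied as a plain list rather than collected from worker devices. The sketch only shows how a selected set of decontextualization operations might be applied to a dataset and how the resulting answers, together with a description of the operations used, might be recorded as one training example.

```python
import hashlib
import re
from collections import Counter

# Hypothetical pattern for IPv4-looking tokens in a log line.
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_ips(record: str) -> str:
    """Replace each IPv4-looking token with a stable pseudonym.

    The same address always maps to the same pseudonym, so workers can
    still see that two records came from the same host.
    """
    def pseudonym(match: re.Match) -> str:
        digest = hashlib.sha256(match.group(0).encode()).hexdigest()[:8]
        return f"host-{digest}"

    return IP_RE.sub(pseudonym, record)


def drop_usernames(record: str) -> str:
    """Redact the value of any user=<name> field (assumed log format)."""
    return re.sub(r"user=\S+", "user=<redacted>", record)


# The "plurality of decontextualization operations" to select from.
OPERATIONS = {"mask_ips": mask_ips, "drop_usernames": drop_usernames}


def decontextualize(dataset, op_names):
    """Apply the selected operations, in order, to every record."""
    modified = []
    for record in dataset:
        for name in op_names:
            record = OPERATIONS[name](record)
        modified.append(record)
    return modified


def combine_answers(answers):
    """Combine worker answers by majority vote (one possible combiner)."""
    return Counter(answers).most_common(1)[0][0]


def training_example(op_names, answers, predetermined_answer):
    """Record the answers and the operations used as one training example.

    The accuracy against the predetermined answer is an assumed training
    signal; the claim only requires the answers and the operation info.
    """
    correct = sum(a == predetermined_answer for a in answers)
    return {
        "operations": tuple(op_names),
        "answers": list(answers),
        "accuracy": correct / len(answers),
    }


if __name__ == "__main__":
    dataset = [
        "10.0.0.5 user=alice GET /login 500",
        "10.0.0.5 user=bob GET /login 500",
    ]
    ops = ["mask_ips", "drop_usernames"]
    print(decontextualize(dataset, ops))
    crowd_answers = ["yes", "yes", "no"]  # stand-in for worker responses
    print(combine_answers(crowd_answers))
    print(training_example(ops, crowd_answers, "yes"))
```

Because the question has a predetermined answer, the recorded accuracy gives a rough indication of whether a given combination of operations left the data usable. A learned decontextualizer could, under these assumptions, use such examples to prefer operations that hide contextual information without degrading answer quality.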