Title: AUTOMATED DATA PARSING
Abstract: A framing technique included in a set of framing techniques is applied to at least a portion of a target data set. When a result of the application of the framing technique meets a first condition, a typing technique included in a set of typing techniques is applied to the target data set. When a result of the application of the typing technique meets a second condition, a tokenization technique included in a set of tokenization techniques is applied to the target data set. When a result of the application of the tokenization technique meets a third condition, a parsing technique for the target data set is determined to include the framing technique, the typing technique and the tokenization technique. An indication of the parsing technique is generated.
Publication number: US2014280256(A1) Publication date: 2014.09.18
Application number: US201414216461 Filing date: 2014.03.17
Applicant: WOLFRAM ALPHA LLC Inventors: Wolfram Stephen;Beynon Taliesin Sebastian
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A method for determining a parsing technique for a target data set, the method comprising: receiving, via a communication link at one or more computing devices, a target data set; applying a framing technique included in a set of framing techniques to at least a portion of the target data set; when a result of the application of the framing technique meets a first condition across the at least the portion of the target data set, applying a typing technique included in a set of typing techniques to the at least the portion of the target data set, the typing technique corresponding to the framing technique; when a result of the application of the typing technique meets a second condition across the at least the portion of the target data set, applying a tokenization technique included in a set of tokenization techniques to the at least the portion of the target data set, the tokenization technique corresponding to the typing technique; and when a result of the application of the tokenization technique meets a third condition across the at least the portion of the target data set, determining the parsing technique for the target data set to include the framing technique, the typing technique and the tokenization technique; and causing an indication of the parsing technique to be generated by the one or more computing devices, wherein the set of framing techniques, the set of typing techniques and the set of tokenization techniques are included in a set of defined parsing techniques.
Address: Champaign IL US
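Illustrative sketch: The cascade described in the abstract and in claim 1 (framing, then typing, then tokenization, each gated by a condition) might look roughly like the Python below. This is a minimal sketch, not the applicant's implementation. Every name in it (`TECHNIQUES`, `frame_lines`, `frame_paragraphs`, `split_key_values`, `determine_parsing_technique`) is hypothetical, and the three conditions are assumptions chosen for illustration: the record does not say what the first, second and third conditions actually are.

```python
from typing import List, Optional, Tuple


def frame_lines(text: str) -> List[str]:
    """Hypothetical framing technique: one frame per non-empty line."""
    return [ln for ln in text.splitlines() if ln.strip()]


def frame_paragraphs(text: str) -> List[str]:
    """Hypothetical framing technique: one frame per blank-line-separated block."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def split_key_values(frame: str) -> List[str]:
    """Hypothetical tokenizer: split each 'key: value' line on its first colon."""
    return [part.strip()
            for line in frame.splitlines()
            for part in line.split(":", 1)]


# Assumed nesting: each framing technique lists the typing techniques that
# correspond to it, and each typing technique lists its tokenizers.
TECHNIQUES = {
    "lines": (frame_lines, {
        "comma_delimited": (lambda f: "," in f,
                            {"split_comma": lambda f: f.split(",")}),
        "tab_delimited": (lambda f: "\t" in f,
                          {"split_tab": lambda f: f.split("\t")}),
    }),
    "paragraphs": (frame_paragraphs, {
        "key_value": (lambda f: all(":" in ln for ln in f.splitlines()),
                      {"split_colon": split_key_values}),
    }),
}


def determine_parsing_technique(sample: str) -> Optional[Tuple[str, str, str]]:
    """Return (framing, typing, tokenization) names, or None if nothing fits."""
    for frame_name, (frame, typings) in TECHNIQUES.items():
        frames = frame(sample)
        # Assumed first condition: framing yields at least two frames.
        if len(frames) < 2:
            continue
        for type_name, (is_type, tokenizers) in typings.items():
            # Assumed second condition: every frame matches the type.
            if not all(is_type(f) for f in frames):
                continue
            for tok_name, tokenize in tokenizers.items():
                counts = {len(tokenize(f)) for f in frames}
                # Assumed third condition: every frame yields the same
                # number of tokens, and more than one token.
                if len(counts) == 1 and counts.pop() > 1:
                    return (frame_name, type_name, tok_name)
    return None
```

In this sketch the returned tuple plays the role of the "indication of the parsing technique"; a return value of `None` means no combination in the defined set satisfied all three assumed conditions on the sample.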