Title: PRE-TRAINING AND/OR TRANSFER LEARNING FOR SEQUENCE TAGGERS
Abstract: Systems and methods are provided for pre-training a sequence tagger, such as a hidden layered conditional random field model, with unlabeled data. Systems and methods for transfer learning are also provided. Accordingly, the systems and methods build more accurate, more reliable, and/or more efficient sequence taggers than previously utilized sequence taggers that are not pre-trained with unlabeled data and/or that are not capable of transfer learning/training.
Publication No.: US2016247501(A1)  Publication Date: 2016.08.25
Application No.: US201514625828  Filing Date: 2015.02.19
Applicant: Microsoft Technology Licensing, LLC  Inventors: Kim Young-Bum; Jeong Minwoo; Sarikaya Ruhi
Classification: G10L15/06; G06N99/00; G06F17/30; G06F17/27; G10L15/18  Main Classification: G10L15/06
Agency:  Agent:
Main Claim: 1. A sequence tagging system that provides for transfer learning, the sequence tagging system comprising: a computing device including a processing unit and a memory, the processing unit implementing a first hidden layered conditional random field (HCRF) model, the first HCRF model comprises a pre-training system and a first training system, the pre-training system is operable to: obtain unlabeled data; run a word clustering algorithm on the unlabeled data to generate word clusters; determine a pseudo-label for each input of the unlabeled data based on the word clusters to form pseudo-labeled data; extract pre-training features from the pseudo-labeled data; and estimate pre-training model parameters for the pre-training features utilizing a first training algorithm, wherein the pre-training model parameters are stored in a first hidden layer of the first HCRF model; the first training system is operable to: obtain a first set of labeled data for a first specific task; estimate first task specific model parameters based on a second training algorithm that is initialized utilizing the pre-training model parameters.
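The steps recited in the claim (cluster, pseudo-label, extract features, pre-train, then initialize task training from the pre-trained parameters) can be illustrated with a minimal Python sketch. This is not the claimed HCRF: it is a per-token stand-in with no sequence-level (CRF) structure, the clustering is a toy heuristic rather than any algorithm named in the record, the first training algorithm is assumed to be a plain perceptron, and every function name and data item below is hypothetical.

```python
from collections import Counter, defaultdict

def cluster_words(sentences):
    """Toy stand-in for word clustering (assumption): words that share the
    same most frequent left neighbour are placed in the same cluster."""
    left = defaultdict(Counter)
    for sent in sentences:
        prev = "<s>"
        for w in sent:
            left[w][prev] += 1
            prev = w
    cluster_ids, word2cluster = {}, {}
    for w in sorted(left):
        # Most frequent left neighbour; ties broken alphabetically.
        key = min(left[w].items(), key=lambda kv: (-kv[1], kv[0]))[0]
        word2cluster[w] = cluster_ids.setdefault(key, len(cluster_ids))
    return word2cluster

def features(sent, i):
    """Hypothetical feature set: word identity, 2-char suffix, previous word."""
    prev = sent[i - 1] if i > 0 else "<s>"
    return [f"w={sent[i]}", f"suf={sent[i][-2:]}", f"prev={prev}"]

def predict_hidden(theta, feats, n_hidden):
    """Pick the hidden unit with the highest feature score."""
    return max(range(n_hidden),
               key=lambda k: sum(theta.get((f, k), 0.0) for f in feats))

def pretrain(sentences, word2cluster, epochs=5):
    """Estimate feature-to-hidden weights with cluster ids as pseudo-labels
    (perceptron updates, an assumed choice of training algorithm)."""
    n_hidden = max(word2cluster.values()) + 1
    theta = {}
    for _ in range(epochs):
        for sent in sentences:
            for i, w in enumerate(sent):
                feats, gold = features(sent, i), word2cluster[w]
                guess = predict_hidden(theta, feats, n_hidden)
                if guess != gold:
                    for f in feats:
                        theta[(f, gold)] = theta.get((f, gold), 0.0) + 1.0
                        theta[(f, guess)] = theta.get((f, guess), 0.0) - 1.0
    return theta, n_hidden

def train_task(labeled, theta_init, n_hidden):
    """Task training initialized from the pre-trained hidden-layer weights;
    only the hidden-to-label counts are estimated from labeled data here."""
    theta = dict(theta_init)
    gamma = defaultdict(Counter)
    for sent, tags in labeled:
        for i, t in enumerate(tags):
            gamma[predict_hidden(theta, features(sent, i), n_hidden)][t] += 1
    return theta, gamma

def tag(sent, theta, gamma, n_hidden, default="O"):
    """Label each token via its hidden unit's most frequent task label."""
    out = []
    for i in range(len(sent)):
        h = predict_hidden(theta, features(sent, i), n_hidden)
        out.append(gamma[h].most_common(1)[0][0] if gamma[h] else default)
    return out

# Made-up unlabeled and labeled data, for illustration only.
unlabeled = [["fly", "to", "boston"], ["fly", "to", "seattle"],
             ["drive", "to", "denver"], ["drive", "to", "boston"]]
labeled = [(["fly", "to", "boston"], ["O", "O", "B-city"])]

word2cluster = cluster_words(unlabeled)
theta_pre, n_hidden = pretrain(unlabeled, word2cluster)
theta, gamma = train_task(labeled, theta_pre, n_hidden)
print(tag(["drive", "to", "denver"], theta, gamma, n_hidden))
# ['O', 'O', 'B-city']
```

In this sketch "denver" never appears in the labeled data; it receives the city label only because pre-training on pseudo-labels placed it in the same hidden unit as "boston", which is the intuition behind initializing task training from pre-trained parameters.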
Address: Redmond, WA, US