Title of Invention |
PRE-TRAINING AND/OR TRANSFER LEARNING FOR SEQUENCE TAGGERS |
Abstract |
Systems and methods for pre-training a sequence tagger, such as a hidden layered conditional random field model, with unlabeled data are provided. Additionally, systems and methods for transfer learning are provided. Accordingly, the systems and methods build sequence taggers that are more accurate, more reliable, and/or more efficient than previously utilized sequence taggers that are not pre-trained with unlabeled data and/or that are not capable of transfer learning/training. |
Publication Number |
US2016247501(A1) |
Publication Date |
2016.08.25 |
Application Number |
US201514625828 |
Filing Date |
2015.02.19 |
Applicant |
Microsoft Technology Licensing, LLC |
Inventors |
Kim Young-Bum;Jeong Minwoo;Sarikaya Ruhi |
Classification Numbers |
G10L15/06;G06N99/00;G06F17/30;G06F17/27;G10L15/18 |
Main Classification Number |
G10L15/06 |
Agency |
|
Agent |
|
Independent Claim |
1. A sequence tagging system that provides for transfer learning, the sequence tagging system comprising:
a computing device including a processing unit and a memory, the processing unit implementing a first hidden layered conditional random field (HCRF) model, the first HCRF model comprises a pre-training system and a first training system,
the pre-training system is operable to:
obtain unlabeled data;
run a word clustering algorithm on the unlabeled data to generate word clusters;
determine a pseudo-label for each input of the unlabeled data based on the word clusters to form pseudo-labeled data;
extract pre-training features from the pseudo-labeled data, and
estimate pre-training model parameters for the pre-training features utilizing a first training algorithm, wherein the pre-training model parameters are stored in a first hidden layer of the first HCRF model;
the first training system is operable to:
obtain a first set of labeled data for a first specific task;
estimate first task specific model parameters based on a second training algorithm that is initialized utilizing the pre-training model parameters. |
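The steps recited in the claim (cluster words, assign pseudo-labels, estimate pre-training parameters, then start task training from those parameters) can be illustrated with a toy pipeline. This is only a sketch under loud assumptions, not the claimed HCRF: the clustering rule (grouping words by their most frequent preceding token), the two features, the perceptron used for both training stages, and every name below (`cluster_words`, `pretrain`, `train_task`, `theta`, `gamma`, the sample sentences) are hypothetical stand-ins chosen for brevity. As a further simplification, the task stage keeps the pre-trained parameters `theta` fixed after initialization and estimates only the hidden-unit-to-label weights `gamma`.

```python
from collections import Counter, defaultdict

def cluster_words(sentences):
    """Toy stand-in for a word clustering algorithm (assumption): words whose
    most frequent preceding token is the same share a cluster."""
    prev_counts = defaultdict(Counter)
    for sent in sentences:
        for prev, word in zip(["<s>"] + sent, sent):
            prev_counts[word][prev] += 1
    # Signature = most frequent preceding token, ties broken alphabetically.
    signature = {w: min(c, key=lambda p: (-c[p], p)) for w, c in prev_counts.items()}
    ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
    return {w: ids[sig] for w, sig in signature.items()}

def features(sent, i):
    """Hypothetical feature set: current word and previous word."""
    prev = sent[i - 1] if i > 0 else "<s>"
    return [f"w={sent[i]}", f"prev={prev}"]

def predict(weights, feats, classes):
    # `classes` is sorted and max() keeps the first best, so ties are deterministic.
    return max(classes, key=lambda c: sum(weights.get((f, c), 0.0) for f in feats))

def train_perceptron(examples, classes, epochs=5):
    """Plain multiclass perceptron, used here in place of both training algorithms."""
    weights = {}
    for _ in range(epochs):
        for feats, gold in examples:
            pred = predict(weights, feats, classes)
            if pred != gold:
                for f in feats:
                    weights[(f, gold)] = weights.get((f, gold), 0.0) + 1.0
                    weights[(f, pred)] = weights.get((f, pred), 0.0) - 1.0
    return weights

def pretrain(unlabeled):
    """Pseudo-label every token with its cluster id and fit feature-to-hidden weights."""
    word_to_cluster = cluster_words(unlabeled)
    cluster_ids = sorted(set(word_to_cluster.values()))
    pseudo = [(features(s, i), word_to_cluster[w])
              for s in unlabeled for i, w in enumerate(s)]
    return train_perceptron(pseudo, cluster_ids), cluster_ids

def train_task(theta, cluster_ids, labeled):
    """Start from pre-trained theta (kept fixed here) and fit hidden-to-label weights."""
    labels = sorted({t for _, tags in labeled for t in tags})
    examples = [([f"h={predict(theta, features(s, i), cluster_ids)}"], t)
                for s, tags in labeled for i, t in enumerate(tags)]
    return train_perceptron(examples, labels), labels

def tag(theta, cluster_ids, gamma, labels, sent):
    hidden = [predict(theta, features(sent, i), cluster_ids) for i in range(len(sent))]
    return [predict(gamma, [f"h={h}"], labels) for h in hidden]

# Made-up sample data for illustration only.
UNLABELED = [["fly", "to", "boston"], ["fly", "to", "seattle"],
             ["fly", "to", "denver"], ["call", "mom"], ["call", "dad"]]
LABELED = [(["fly", "to", "boston"], ["O", "O", "CITY"]),
           (["call", "mom"], ["O", "PERSON"])]

theta, cluster_ids = pretrain(UNLABELED)
gamma, labels = train_task(theta, cluster_ids, LABELED)
print(tag(theta, cluster_ids, gamma, labels, ["fly", "to", "denver"]))
```

In this toy run, "denver" never appears in the labeled set, yet it is tagged `CITY` because the unlabeled data placed it in the same cluster as "boston". That is the intuition behind pre-training on pseudo-labels, shown here on invented data rather than as a result from the patent.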
Address |
Redmond WA US |