Compiled by: 專知
Editor: zenRRan
[Overview] cedrickchee maintains this project, a curated collection of machine (deep) learning resources for natural language processing (NLP), focused on Bidirectional Encoder Representations from Transformers (BERT), the attention mechanism, the Transformer architecture/networks, and transfer learning in NLP.
https://github.com/cedrickchee/awesome-bert-nlp
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context by Zihang Dai, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le and Ruslan Salakhutdinov.
Uses smart caching to improve the learning of long-term dependencies in the Transformer. Key results: state of the art on 5 language modeling benchmarks, including a perplexity of 21.8 on One Billion Word (LM1B) and 0.99 bits per character on enwik8. The authors claim that the method is more flexible, faster during evaluation (up to 1,874x speedup), generalizes well on small datasets, and is effective at modeling both short and long sequences. A minimal sketch of the segment-caching idea follows this paper list.
Conditional BERT Contextual Augmentation by Xing Wu, Shangwen Lv, Liangjun Zang, Jizhong Han and Songlin Hu.
SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering by Chenguang Zhu, Michael Zeng and Xuedong Huang.
Language Models are Unsupervised Multitask Learners by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever.
The Evolved Transformer by David R. So, Chen Liang and Quoc V. Le.
They used neural architecture search to improve the Transformer architecture. The key is to use evolutionary search and to seed the initial population with the Transformer itself. The resulting architecture is better and more efficient, especially for small model sizes.
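To make the Transformer-XL caching idea concrete, here is a heavily simplified sketch (not the authors' code): the hidden states of the previous segment are detached from the computation graph and prepended to the keys and values of the current segment, so attention can reach back beyond the segment boundary without backpropagating through it. The real model additionally uses relative positional encodings and caches per-layer states, which this sketch omits.

```python
import torch

def attend_with_memory(q_proj, k_proj, v_proj, segment, memory):
    # Keys/values see the cached previous segment plus the current one;
    # the cache is detached so no gradient flows into older segments.
    context = torch.cat([memory.detach(), segment], dim=0)
    q = q_proj(segment)                              # queries: current segment only
    k, v = k_proj(context), v_proj(context)
    attn = torch.softmax(q @ k.t() / k.size(-1) ** 0.5, dim=-1)
    out = attn @ v
    return out, segment                              # current states become the next cache

d_model = 64
q_proj, k_proj, v_proj = (torch.nn.Linear(d_model, d_model) for _ in range(3))
memory = torch.zeros(16, d_model)                    # empty cache before the first segment
for segment in torch.randn(4, 16, d_model):          # stream of 4 segments, 16 tokens each
    out, memory = attend_with_memory(q_proj, k_proj, v_proj, segment, memory)
```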
Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing from Google AI.
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning).
Dissecting BERT by Miguel Romero and Francisco Ingham - Understand BERT in depth with an intuitive, straightforward explanation of the relevant concepts.
A Light Introduction to Transformer-XL.
Generalized Language Models by Lilian Weng, Research Scientist at OpenAI.
Attention Concept
The Annotated Transformer by Harvard NLP Group - Further reading to understand the "Attention is all you need" paper.
Attention? Attention! - Attention guide by Lilian Weng from OpenAI.
Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention) by Jay Alammar, an Instructor from Udacity ML Engineer Nanodegree.
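For readers who want the core equation behind these guides in runnable form, here is a minimal PyTorch sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as defined in "Attention Is All You Need" (an illustrative toy example, not code taken from any of the resources above):

```python
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5     # (..., len_q, len_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = torch.softmax(scores, dim=-1)                  # attention distribution
    return weights @ value, weights

# toy self-attention example: 1 sequence of 5 tokens with 8-dimensional vectors
q = k = v = torch.randn(1, 5, 8)
output, weights = scaled_dot_product_attention(q, k, v)
print(output.shape, weights.shape)   # torch.Size([1, 5, 8]) torch.Size([1, 5, 5])
```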
Transformer Architecture
The Transformer blog post.
The Illustrated Transformer by Jay Alammar, an Instructor from Udacity ML Engineer Nanodegree.
Watch Łukasz Kaiser’s talk walking through the model and its details.
Transformer-XL: Unleashing the Potential of Attention Models by Google Brain.
Generative Modeling with Sparse Transformers by OpenAI - an algorithmic improvement of the attention mechanism to extract patterns from sequences 30x longer than possible previously.
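If you would rather poke at a running model than read the posts above, recent PyTorch releases (version 1.2 or later is assumed here) ship the encoder half of this architecture, multi-head self-attention plus a position-wise feed-forward network, as built-in modules. A minimal sketch:

```python
import torch
from torch import nn

# a small encoder stack in the spirit of "Attention Is All You Need"
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048)
encoder = nn.TransformerEncoder(layer, num_layers=6)

src = torch.randn(10, 32, 512)   # (sequence length, batch size, model dimension)
out = encoder(src)               # same shape: (10, 32, 512)
print(out.shape)
```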
OpenAI Generative Pre-Training Transformer (GPT) and GPT-2
Better Language Models and Their Implications.
Improving Language Understanding with Unsupervised Learning - this is an overview of the original GPT model.
How to build a State-of-the-Art Conversational AI with Transfer Learning by Hugging Face.
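As a small illustration of querying the released GPT-2 weights, the sketch below follows the usage documented for the GPT-2 classes bundled with huggingface/pytorch-pretrained-BERT (listed under Additional Reading below); treat it as a sketch of that package's API rather than a recipe from the posts above.

```python
import torch
from pytorch_pretrained_bert import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

# encode a prompt and greedily predict the next token
context = torch.tensor([tokenizer.encode("The Transformer architecture is")])
with torch.no_grad():
    logits, past = model(context)            # logits: (batch, seq_len, vocab_size)
next_token = torch.argmax(logits[0, -1, :]).item()
print(tokenizer.decode(context[0].tolist() + [next_token]))
```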
Additional Reading
huggingface/pytorch-pretrained-BERT - A PyTorch implementation of Google AI's BERT model with scripts to load Google's pre-trained models, by Hugging Face (a short usage sketch follows this list).
codertimo/BERT-pytorch - Google AI 2018 BERT pytorch implementation.
innodatalabs/tbert - PyTorch port of BERT ML model.
kimiyoung/transformer-xl - Code repository associated with the Transformer-XL paper.
dreamgonfly/BERT-pytorch - PyTorch implementation of BERT in "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding".
dhlee347/pytorchic-bert - PyTorch implementation of Google BERT.
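The Hugging Face package at the top of this list documents a short feature-extraction workflow for loading Google's pre-trained weights; a minimal sketch along those lines (the model name 'bert-base-uncased' and the example sentence are just placeholders):

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

text = "[CLS] who was jim henson ? [SEP] jim henson was a puppeteer [SEP]"
tokens = tokenizer.tokenize(text)
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
# tokens up to and including the first [SEP] belong to segment 0, the rest to segment 1
split = tokens.index('[SEP]') + 1
segment_ids = torch.tensor([[0] * split + [1] * (len(tokens) - split)])

with torch.no_grad():
    # one hidden-state tensor per layer, plus the pooled [CLS] representation
    encoded_layers, pooled_output = model(input_ids, segment_ids)
print(len(encoded_layers), encoded_layers[-1].shape, pooled_output.shape)
```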
Keras
Separius/BERT-keras - Keras implementation of BERT with pre-trained weights.
CyberZHG/keras-bert - Implementation of BERT that can load the official pre-trained models for feature extraction and prediction.
TensorFlow
Chainer