126 Landmark Deep Learning Papers, Organized from Getting Started to Applications | Practical Resources

2020-12-12 Leiphone (雷鋒網)

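If you are seriously determined to work in deep learning and do not want to merely coast along in the field, studying the papers of leading researchers is an unavoidable step. As a newcomer, your first question may well be: "There are so many papers. Which one should I read first?"

This article tries to answer that question. Its original title was "From Getting Started to Despair: The Endless Deep Learning Papers". Get your props ready and assume the posture of the diligent scholar who ties his hair to a roof beam and jabs his thigh with an awl to stay awake.

Just kidding.

Still, for developers without a formal computer science background, reading papers can be genuinely painful. The good news is that, to keep beginners from getting lost, a GitHub user known as songrotek has published a deep learning roadmap. It sorts the DL papers that newcomers most need to study into categories and rates each paper with stars according to its importance.

To date, this DL paper roadmap has earned nearly ten thousand stars on GitHub and is extremely popular. Leiphone considers it well worth introducing to our readers.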

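Without further ado, the roadmap is organized according to the following four principles:

  • From outline to detail

  • From classic to state of the art

  • From general to specific areas

  • Focus on the latest research breakthroughs

Author's note: many papers are very new but well worth reading.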

1 Deep Learning History and Basics

1.0 Book

[0] Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. "Deep learning." An MIT Press book. (2015). [pdf] (The textbook by Ian Goodfellow and other leading researchers, the bible of deep learning. You can study this book alongside the papers below.) ★★★★★

Link: https://github.com/HFTrader/DeepLearningBook/raw/master/DeepLearningBook.pdf

1.1 Survey

[1] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444. [pdf] (Survey by the three giants) ★★★★★

Link: http://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf

1.2 Deep Belief Network (DBN, a milestone on the eve of deep learning)

[2] Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554. [pdf] (The eve of deep learning) ★★★

Link: http://www.cs.toronto.edu/~hinton/absps/ncfast.pdf

[3] Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507. [pdf] (Milestone, shows the promise of deep learning) ★★★

Link: http://www.cs.toronto.edu/~hinton/science.pdf

1.3 ImageNet Evolution (deep learning took off from here)

[4] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. [pdf] (AlexNet, deep learning breakthrough) ★★★★★

Link: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

[5] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014). [pdf] (VGGNet, neural networks become very deep) ★★★

Link: https://arxiv.org/pdf/1409.1556.pdf

[6] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. [pdf] (GoogLeNet) ★★★

Link: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

[7] He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015). [pdf] (ResNet, extremely deep networks, CVPR best paper) ★★★★★

Link: https://arxiv.org/pdf/1512.03385.pdf

1.4 Speech Recognition Evolution

[8] Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal Processing Magazine 29.6 (2012): 82-97. [pdf] (Breakthrough in speech recognition) ★★★★

Link: http://cs224d.stanford.edu/papers/maas_paper.pdf

[9] Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. "Speech recognition with deep recurrent neural networks." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (RNN) ★★★

Link: http://arxiv.org/pdf/1303.5778.pdf

[10] Graves, Alex, and Navdeep Jaitly. "Towards End-To-End Speech Recognition with Recurrent Neural Networks." ICML. Vol. 14. 2014. [pdf] ★★★

Link: http://www.jmlr.org/proceedings/papers/v32/graves14.pdf

[11] Sak, Haşim, et al. "Fast and accurate recurrent neural network acoustic models for speech recognition." arXiv preprint arXiv:1507.06947 (2015). [pdf] (Google speech recognition system) ★★★

Link: http://arxiv.org/pdf/1507.06947

[12] Amodei, Dario, et al. "Deep speech 2: End-to-end speech recognition in english and mandarin." arXiv preprint arXiv:1512.02595 (2015). [pdf] (Baidu speech recognition system) ★★★★

Link: https://arxiv.org/pdf/1512.02595.pdf

[13] W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. "Achieving Human Parity in Conversational Speech Recognition." arXiv preprint arXiv:1610.05256 (2016). [pdf] (State of the art in speech recognition, Microsoft) ★★★★

Link: https://arxiv.org/pdf/1610.05256v1

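After reading the papers above, you will have a basic understanding of the history of deep learning and of the basic model architectures (including CNN, RNN, and LSTM), and you will understand how deep learning is applied to image and speech recognition. The papers that follow take you deeper into deep learning methods, applications in different fields, and the research frontier. I suggest reading selectively according to your interests and your work or research direction.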

2 Deep Learning Methods

2.1 Model

[14] Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012). [pdf] (Dropout) ★★★

Link: https://arxiv.org/pdf/1207.0580.pdf

[15] Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958. [pdf] ★★★

Link: http://www.jmlr.org/papers/volume15/srivastava14a.old/source/srivastava14a.pdf

[16] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015). [pdf] (Outstanding work of 2015) ★★★★

Link: http://arxiv.org/pdf/1502.03167

[17] Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016). [pdf] (Update of Batch Normalization) ★★★★

Link: https://arxiv.org/pdf/1607.06450.pdf?utm_source=sciontist.com&utm_medium=refer&utm_campaign=promote

[18] Courbariaux, Matthieu, et al. "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1." [pdf] (New model, fast) ★★★

Link: https://pdfs.semanticscholar.org/f832/b16cb367802609d91d400085eb87d630212a.pdf

[19] Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016). [pdf] (Innovation in training method, very good work) ★★★★★

Link: https://arxiv.org/pdf/1608.05343

[20] Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. "Net2net: Accelerating learning via knowledge transfer." arXiv preprint arXiv:1511.05641 (2015). [pdf] (Modifies a previously trained network to shorten training time) ★★★

Link: https://arxiv.org/abs/1511.05641

[21] Wei, Tao, et al. "Network Morphism." arXiv preprint arXiv:1603.01670 (2016). [pdf] (Modifies a previously trained network to shorten training time) ★★★

Link: https://arxiv.org/abs/1603.01670

2.2 Optimization

[22] Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." ICML (3) 28 (2013): 1139-1147. [pdf] (Momentum optimizer) ★★

Link: http://www.jmlr.org/proceedings/papers/v28/sutskever13.pdf

[23] Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014). [pdf] (Probably the most widely used optimizer at present) ★★★

Link: http://arxiv.org/pdf/1412.6980

[24] Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." arXiv preprint arXiv:1606.04474 (2016). [pdf] (Neural Optimizer, amazing work) ★★★★★

Link: https://arxiv.org/pdf/1606.04474

[25] Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding." CoRR, abs/1510.00149 2 (2015). [pdf] (ICLR best paper, a new direction for making NNs run fast, DeePhi Tech Startup) ★★★★★

Link: https://pdfs.semanticscholar.org/5b6c/9dda1d88095fa4aac1507348e498a1f2e863.pdf

[26] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016). [pdf] (Also a new direction for optimizing NNs, DeePhi Tech Startup) ★★★★

Link: http://arxiv.org/pdf/1602.07360

2.3 Unsupervised Learning / Deep Generative Models

[27] Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (Milestone, Andrew Ng, Google Brain, Cat) ★★★★

Link: http://arxiv.org/pdf/1112.6209.pdf&embed

[28] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013). [pdf] (VAE) ★★★★

Link: http://arxiv.org/pdf/1312.6114

[29] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014. [pdf] (GAN, a very cool idea) ★★★★★

Link: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

[30] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015). [pdf] (DCGAN) ★★★★

Link: http://arxiv.org/pdf/1511.06434

[31] Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv preprint arXiv:1502.04623 (2015). [pdf] (VAE with attention, outstanding work) ★★★★★

Link: http://jmlr.org/proceedings/papers/v37/gregor15.pdf

[32] Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016). [pdf] (PixelRNN) ★★★★

Link: http://arxiv.org/pdf/1601.06759

[33] Oord, Aaron van den, et al. "Conditional image generation with PixelCNN decoders." arXiv preprint arXiv:1606.05328 (2016). [pdf] (PixelCNN) ★★★★

Link: https://arxiv.org/pdf/1606.05328

2.4 Recurrent Neural Networks (RNN) / Sequence-to-Sequence Model

[34] Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013). [pdf] (LSTM, very good results, shows the power of RNNs) ★★★★

Link: http://arxiv.org/pdf/1308.0850

[35] Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014). [pdf] (The first Sequence-to-Sequence paper) ★★★★

Link: http://arxiv.org/pdf/1406.1078

[36] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014. [pdf] (Outstanding work) ★★★★★

Link: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

[37] Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014). [pdf] ★★★★

Link: https://arxiv.org/pdf/1409.0473v7.pdf

[38] Vinyals, Oriol, and Quoc Le. "A neural conversational model." arXiv preprint arXiv:1506.05869 (2015). [pdf] (Seq-to-Seq chatbot) ★★★

Link: http://arxiv.org/pdf/1506.05869.pdf

2.5 Neural Turing Machine

[39] Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014). [pdf] (Basic prototype of the future computer) ★★★★★

Link: http://arxiv.org/pdf/1410.5401.pdf

[40] Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural Turing machines." arXiv preprint arXiv:1505.00521 362 (2015). [pdf] ★★★

Link: https://pdfs.semanticscholar.org/f10e/071292d593fef939e6ef4a59baf0bb3a6c2b.pdf

[41] Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014). [pdf] ★★★

Link: http://arxiv.org/pdf/1410.3916

[42] Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015. [pdf] ★★★★

Link: http://papers.nips.cc/paper/5846-end-to-end-memory-networks.pdf

[43] Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015. [pdf] ★★★★

Link: http://papers.nips.cc/paper/5866-pointer-networks.pdf

[44] Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature (2016). [pdf] (Milestone, combines the ideas of the papers above) ★★★★★

Link: https://www.dropbox.com/s/0a40xi702grx3dq/2016-graves.pdf

2.6 Deep Reinforcement Learning

[45] Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013). [pdf] (The first paper with deep reinforcement learning in its title) ★★★★

Link: http://arxiv.org/pdf/1312.5602.pdf

[46] Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533. [pdf] (Milestone) ★★★★★

Link: https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf

[47] Wang, Ziyu, Nando de Freitas, and Marc Lanctot. "Dueling network architectures for deep reinforcement learning." arXiv preprint arXiv:1511.06581 (2015). [pdf] (ICLR best paper, a great idea) ★★★★

Link: http://arxiv.org/pdf/1511.06581

[48] Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." arXiv preprint arXiv:1602.01783 (2016). [pdf] (State-of-the-art method) ★★★★★

Link: http://arxiv.org/pdf/1602.01783

[49] Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015). [pdf] (DDPG) ★★★★

Link: http://arxiv.org/pdf/1509.02971

[50] Gu, Shixiang, et al. "Continuous Deep Q-Learning with Model-based Acceleration." arXiv preprint arXiv:1603.00748 (2016). [pdf] (NAF) ★★★★

Link: http://arxiv.org/pdf/1603.00748

[51] Schulman, John, et al. "Trust region policy optimization." CoRR, abs/1502.05477 (2015). [pdf] (TRPO) ★★★★

Link: http://www.jmlr.org/proceedings/papers/v37/schulman15.pdf

[52] Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489. [pdf] (AlphaGo) ★★★★★

Link: http://willamette.edu/~levenick/cs448/goNature.pdf

2.7 Deep Transfer Learning / Lifelong Learning / Reinforcement Learning

[53] Bengio, Yoshua. "Deep Learning of Representations for Unsupervised and Transfer Learning." ICML Unsupervised and Transfer Learning 27 (2012): 17-36. [pdf] (A tutorial) ★★★

Link: http://www.jmlr.org/proceedings/papers/v27/bengio12a/bengio12a.pdf

[54] Silver, Daniel L., Qiang Yang, and Lianghao Li. "Lifelong Machine Learning Systems: Beyond Learning Algorithms." AAAI Spring Symposium: Lifelong Machine Learning. 2013. [pdf] (A brief discussion of lifelong learning) ★★★

Link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.7800&rep=rep1&type=pdf

[55] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015). [pdf] (Work by the field's leading figures) ★★★★

Link: http://arxiv.org/pdf/1503.02531

[56] Rusu, Andrei A., et al. "Policy distillation." arXiv preprint arXiv:1511.06295 (2015). [pdf] (RL domain) ★★★

Link: http://arxiv.org/pdf/1511.06295

[57] Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. "Actor-mimic: Deep multitask and transfer reinforcement learning." arXiv preprint arXiv:1511.06342 (2015). [pdf] (RL domain) ★★★

Link: http://arxiv.org/pdf/1511.06342

[58] Rusu, Andrei A., et al. "Progressive neural networks." arXiv preprint arXiv:1606.04671 (2016). [pdf] (Outstanding work, a novel idea) ★★★★★

Link: https://arxiv.org/pdf/1606.04671

2.8 One-Shot Deep Learning

[59] Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. "Human-level concept learning through probabilistic program induction." Science 350.6266 (2015): 1332-1338. [pdf] (No deep learning involved, but worth reading) ★★★★★

Link: http://clm.utexas.edu/compjclub/wp-content/uploads/2016/02/lake2015.pdf

[60] Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. "Siamese Neural Networks for One-shot Image Recognition." (2015) [pdf] ★★★

Link: http://www.cs.utoronto.ca/~gkoch/files/msc-thesis.pdf

[61] Santoro, Adam, et al. "One-shot Learning with Memory-Augmented Neural Networks." arXiv preprint arXiv:1605.06065 (2016). [pdf] (A basic step toward one-shot learning) ★★★★

Link: http://arxiv.org/pdf/1605.06065

[62] Vinyals, Oriol, et al. "Matching Networks for One Shot Learning." arXiv preprint arXiv:1606.04080 (2016). [pdf] ★★★

Link: https://arxiv.org/pdf/1606.04080

[63] Hariharan, Bharath, and Ross Girshick. "Low-shot visual object recognition." arXiv preprint arXiv:1606.02819 (2016). [pdf] (A step toward larger-scale data) ★★★★

Link: http://arxiv.org/pdf/1606.02819

3 Applications

3.1 Natural Language Processing (NLP)

[1] Antoine Bordes, et al. "Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing." AISTATS(2012) [pdf] ★★★★

Link: https://www.hds.utc.fr/~bordesan/dokuwiki/lib/exe/fetch.php?id=en%3Apubli&cache=cache&media=en:bordes12aistats.pdf

[2] Mikolov, et al. "Distributed representations of words and phrases and their compositionality." ANIPS(2013): 3111-3119 [pdf] (word2vec) ★★★

Link: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

[3] Sutskever, et al. "Sequence to sequence learning with neural networks." ANIPS(2014) [pdf] ★★★

Link: http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

[4] Ankit Kumar, et al. "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing." arXiv preprint arXiv:1506.07285(2015) [pdf] ★★★★

Link: https://arxiv.org/abs/1506.07285

[5] Yoon Kim, et al. "Character-Aware Neural Language Models." NIPS(2015) arXiv preprint arXiv:1508.06615(2015) [pdf] ★★★

Link: https://arxiv.org/abs/1508.06615

[6] Jason Weston, et al. "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks." arXiv preprint arXiv:1502.05698(2015) [pdf] (bAbI tasks) ★★★

Link: https://arxiv.org/abs/1502.05698

[7] Karl Moritz Hermann, et al. "Teaching Machines to Read and Comprehend." arXiv preprint arXiv:1506.03340(2015) [pdf] (CNN/Daily Mail cloze-style questions) ★★

Link: https://arxiv.org/abs/1506.03340

[8] Alexis Conneau, et al. "Very Deep Convolutional Networks for Natural Language Processing." arXiv preprint arXiv:1606.01781(2016) [pdf] (State of the art in text classification) ★★★

Link: https://arxiv.org/abs/1606.01781

[9] Armand Joulin, et al. "Bag of Tricks for Efficient Text Classification." arXiv preprint arXiv:1607.01759(2016) [pdf] (Slightly behind the state of the art, but much faster) ★★★

Link: https://arxiv.org/abs/1607.01759

3.2 Object Detection

[1] Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013. [pdf] ★★★

Link: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

[2] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. [pdf] (RCNN) ★★★★★

Link: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

[3] He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014. [pdf] (SPPNet) ★★★★

Link: http://arxiv.org/pdf/1406.4729

[4] Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015. [pdf] ★★★★

Link: https://pdfs.semanticscholar.org/8f67/64a59f0d17081f2a2a9d06f4ed1cdea1a0ad.pdf

[5] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015. [pdf] ★★★★

Link: http://papers.nips.cc/paper/5638-analysis-of-variational-bayesian-latent-dirichlet-allocation-weaker-sparsity-than-map.pdf

[6] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015). [pdf] (YOLO, outstanding work, highly practical) ★★★★★

Link: http://homes.cs.washington.edu/~ali/papers/YOLO.pdf

[7] Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015). [pdf] ★★★

Link: http://arxiv.org/pdf/1512.02325

[8] Dai, Jifeng, et al. "R-FCN: Object Detection via Region-based Fully Convolutional Networks." arXiv preprint arXiv:1605.06409 (2016). [pdf] ★★★★

Link: https://arxiv.org/abs/1605.06409

3.3 Visual Tracking

[1] Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." Advances in neural information processing systems. 2013. [pdf] (The first paper to do visual tracking with deep learning, DLT Tracker) ★★★

Link: http://papers.nips.cc/paper/5192-learning-a-deep-compact-image-representation-for-visual-tracking.pdf

[2] Wang, Naiyan, et al. "Transferring rich feature hierarchies for robust visual tracking." arXiv preprint arXiv:1501.04587 (2015). [pdf] (SO-DLT) ★★★★

Link: http://arxiv.org/pdf/1501.04587

[3] Wang, Lijun, et al. "Visual tracking with fully convolutional networks." Proceedings of the IEEE International Conference on Computer Vision. 2015. [pdf] (FCNT) ★★★★

Link: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Wang_Visual_Tracking_With_ICCV_2015_paper.pdf

[4] Held, David, Sebastian Thrun, and Silvio Savarese. "Learning to Track at 100 FPS with Deep Regression Networks." arXiv preprint arXiv:1604.01802 (2016). [pdf] (GOTURN, very fast for a deep learning method, but still far slower than non-deep-learning methods) ★★★★

Link: http://arxiv.org/pdf/1604.01802

[5] Bertinetto, Luca, et al. "Fully-Convolutional Siamese Networks for Object Tracking." arXiv preprint arXiv:1606.09549 (2016). [pdf] (SiameseFC, the latest state of the art in real-time object tracking) ★★★★

Link: https://arxiv.org/pdf/1606.09549

[6] Martin Danelljan, Andreas Robinson, Fahad Khan, Michael Felsberg. "Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking." ECCV (2016) [pdf] (C-COT) ★★★★

Link: http://www.cvl.isy.liu.se/research/objrec/visualtracking/conttrack/C-COT_ECCV16.pdf

[7] Nam, Hyeonseob, Mooyeol Baek, and Bohyung Han. "Modeling and Propagating CNNs in a Tree Structure for Visual Tracking." arXiv preprint arXiv:1608.07242 (2016). [pdf] (VOT2016 winner, TCNN) ★★★★

Link: https://arxiv.org/pdf/1608.07242

3.4 Image Captioning

[1] Farhadi, Ali, et al. "Every picture tells a story: Generating sentences from images". In Computer Vision - ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010. [pdf] ★★★

Link: https://www.cs.cmu.edu/~afarhadi/papers/sentence.pdf

[2] Kulkarni, Girish, et al. "Baby talk: Understanding and generating image descriptions". In Proceedings of the 24th CVPR, 2011. [pdf] ★★★★

Link: http://tamaraberg.com/papers/generation_cvpr11.pdf

[3] Vinyals, Oriol, et al. "Show and tell: A neural image caption generator". In arXiv preprint arXiv:1411.4555, 2014. [pdf] ★★★

Link: https://arxiv.org/pdf/1411.4555.pdf

[4] Donahue, Jeff, et al. "Long-term recurrent convolutional networks for visual recognition and description". In arXiv preprint arXiv:1411.4389, 2014. [pdf]

Link: https://arxiv.org/pdf/1411.4389.pdf

[5] Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions". In arXiv preprint arXiv:1412.2306, 2014. [pdf] ★★★★★

Link: https://cs.stanford.edu/people/karpathy/cvpr2015.pdf

[6] Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. "Deep fragment embeddings for bidirectional image sentence mapping". In Advances in neural information processing systems, 2014. [pdf] ★★★★

Link: https://arxiv.org/pdf/1406.5679v1.pdf

[7] Fang, Hao, et al. "From captions to visual concepts and back". In arXiv preprint arXiv:1411.4952, 2014. [pdf] ★★★★★

Link: https://arxiv.org/pdf/1411.4952v3.pdf

[8] Chen, Xinlei, and C. Lawrence Zitnick. "Learning a recurrent visual representation for image caption generation". In arXiv preprint arXiv:1411.5654, 2014. [pdf] ★★★★

Link: https://arxiv.org/pdf/1411.5654v1.pdf

[9] Mao, Junhua, et al. "Deep captioning with multimodal recurrent neural networks (m-rnn)". In arXiv preprint arXiv:1412.6632, 2014. [pdf] ★★★

Link: https://arxiv.org/pdf/1412.6632v5.pdf

[10] Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention". In arXiv preprint arXiv:1502.03044, 2015. [pdf] ★★★★★

Link: https://arxiv.org/pdf/1502.03044v3.pdf

3.5 Machine Translation

Some milestone papers are listed in the RNN / Seq-to-Seq section.

[1] Luong, Minh-Thang, et al. "Addressing the rare word problem in neural machine translation." arXiv preprint arXiv:1410.8206 (2014). [pdf] ★★★★

Link: http://arxiv.org/pdf/1410.8206

[2] Sennrich, et al. "Neural Machine Translation of Rare Words with Subword Units". In arXiv preprint arXiv:1508.07909, 2015. [pdf] ★★★

Link: https://arxiv.org/pdf/1508.07909.pdf

[3] Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015). [pdf] ★★★★

Link: http://arxiv.org/pdf/1508.04025

[4] Chung, et al. "A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation". In arXiv preprint arXiv:1603.06147, 2016. [pdf] ★★

Link: https://arxiv.org/pdf/1603.06147.pdf

[5] Lee, et al. "Fully Character-Level Neural Machine Translation without Explicit Segmentation". In arXiv preprint arXiv:1610.03017, 2016. [pdf] ★★★★★

Link: https://arxiv.org/pdf/1610.03017.pdf

[6] Wu, Schuster, Chen, Le, et al. "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". In arXiv preprint arXiv:1609.08144v2, 2016. [pdf] (Milestone) ★★★★

Link: https://arxiv.org/pdf/1609.08144v2.pdf

3.6 Robotics

[1] Koutník, Jan, et al. "Evolving large-scale neural networks for vision-based reinforcement learning." Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013. [pdf] ★★★

Link: http://repository.supsi.ch/4550/1/koutnik2013gecco.pdf

[2] Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." Journal of Machine Learning Research 17.39 (2016): 1-40. [pdf] ★★★★★

Link: http://www.jmlr.org/papers/volume17/15-522/15-522.pdf

[3] Pinto, Lerrel, and Abhinav Gupta. "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours." arXiv preprint arXiv:1509.06825 (2015). [pdf] ★★★

Link: http://arxiv.org/pdf/1509.06825

[4] Levine, Sergey, et al. "Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection." arXiv preprint arXiv:1603.02199 (2016). [pdf] ★★★★

Link: http://arxiv.org/pdf/1603.02199

[5] Zhu, Yuke, et al. "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning." arXiv preprint arXiv:1609.05143 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1609.05143

[6] Yahya, Ali, et al. "Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search." arXiv preprint arXiv:1610.00673 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1610.00673

[7] Gu, Shixiang, et al. "Deep Reinforcement Learning for Robotic Manipulation." arXiv preprint arXiv:1610.00633 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1610.00633

[8] A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell. "Sim-to-Real Robot Learning from Pixels with Progressive Nets." arXiv preprint arXiv:1610.04286 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1610.04286.pdf

[9] Mirowski, Piotr, et al. "Learning to navigate in complex environments." arXiv preprint arXiv:1611.03673 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1611.03673

3.7 Art

[1] Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research. [html] (Deep Dream) ★★★★

Link: https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

[2] Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015). [pdf] (Outstanding work, the most successful method to date) ★★★★★

Link: http://arxiv.org/pdf/1508.06576

[3] Zhu, Jun-Yan, et al. "Generative Visual Manipulation on the Natural Image Manifold." European Conference on Computer Vision. Springer International Publishing, 2016. [pdf] (iGAN) ★★★★

Link: https://arxiv.org/pdf/1609.03552

[4] Champandard, Alex J. "Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks." arXiv preprint arXiv:1603.01768 (2016). [pdf] (Neural Doodle) ★★★★

Link: http://arxiv.org/pdf/1603.01768

[5] Zhang, Richard, Phillip Isola, and Alexei A. Efros. "Colorful Image Colorization." arXiv preprint arXiv:1603.08511 (2016). [pdf] ★★★★

Link: http://arxiv.org/pdf/1603.08511

[6] Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1603.08155.pdf

[7] Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. "A learned representation for artistic style." arXiv preprint arXiv:1610.07629 (2016). [pdf] ★★★★

Link: https://arxiv.org/pdf/1610.07629

[8] Gatys, Leon and Ecker, et al. "Controlling Perceptual Factors in Neural Style Transfer." arXiv preprint arXiv:1611.07865 (2016). [pdf] (Controls style transfer over spatial location, colour information, and across spatial scale) ★★★★

Link: https://arxiv.org/pdf/1611.07865

[9] Ulyanov, Dmitry and Lebedev, Vadim, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417 (2016). [pdf] (Texture generation and style transfer) ★★★★

Link: https://arxiv.org/pdf/1603.03417

3.8 Object Segmentation

[1] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation." in CVPR, 2015. [pdf] ★★★★★

Link: https://arxiv.org/pdf/1411.4038v2.pdf

[2] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. "Semantic image segmentation with deep convolutional nets and fully connected crfs." In ICLR, 2015. [pdf] ★★★★★

Link: https://arxiv.org/pdf/1606.00915v1.pdf

[3] Pinheiro, P.O., Collobert, R., Dollar, P. "Learning to segment object candidates." In: NIPS. 2015. [pdf] ★★★★

Link: https://arxiv.org/pdf/1506.06204v2.pdf

[4] Dai, J., He, K., Sun, J. "Instance-aware semantic segmentation via multi-task network cascades." in CVPR. 2016 [pdf] ★★★

Link: https://arxiv.org/pdf/1512.04412v1.pdf

[5] Dai, J., He, K., Sun, J. "Instance-sensitive Fully Convolutional Networks." arXiv preprint arXiv:1603.08678 (2016). [pdf] ★★★

Link: https://arxiv.org/pdf/1603.08678v1.pdf

Original source (republished by Leiphone with permission): https://github.com/songrotek/Deep-Learning-Papers-Reading-Roadmap
