The collection is best viewed with BibDesk, which shows the category of each paper, but any reference manager can be used to open the library.
Brown, S., Laird, A. R., Pfordresher, P. Q., Thelen, S. M., Turkeltaub, P., & Liotti, M. (2009). The somatotopy of speech: Phonation and articulation in the human motor cortex. Brain and Cognition, 70(1), 31--41. doi:10.1016/j.bandc.2008.12.006
Christiansen, M. H., & Chater, N. (2008). Language as shaped by the brain. Behavioral and Brain Sciences, 31(5), 489--509. doi:10.1017/S0140525X08004998
Fedorenko, E., Hsieh, P.-J., & Balewski, Z. (2015). A possible functional localiser for identifying brain regions sensitive to sentence-level prosody. Language, Cognition and Neuroscience, 30(1-2), 120-148. doi:10.1080/01690965.2013.861917
Fischl, B., & Dale, A. M. (2000). Measuring the thickness of the human cerebral cortex from magnetic resonance images. Proceedings of the National Academy of Sciences, 97(20), 11050--11055. doi:10.1073/pnas.200033797
Hutton, C., Draganski, B., Ashburner, J., & Weiskopf, N. (2009). A comparison between voxel-based cortical thickness and voxel-based morphometry in normal aging. NeuroImage, 48(2), 371--380. doi:10.1016/j.neuroimage.2009.06.043
Mack, J. E., Chandler, S. D., Meltzer-Asscher, A., Rogalski, E., Weintraub, S., Mesulam, M., & Thompson, C. K. (2015). What do pauses in narrative production reveal about the nature of word retrieval deficits in PPA? Neuropsychologia, 77, 211--222. doi:10.1016/j.neuropsychologia.2015.08.019
Sowell, E. R., Peterson, B. S., Kan, E., Woods, R. P., Yoshii, J., Bansal, R., . . . Toga, A. W. (2006). Sex Differences in Cortical Thickness Mapped in 176 Healthy Individuals between 7 and 87 Years of Age. Cerebral Cortex, 17(7), 1550--1560. doi:10.1093/cercor/bhl066
Thambisetty, M., Wan, J., Carass, A., An, Y., Prince, J. L., & Resnick, S. M. (2010). Longitudinal changes in cortical thickness associated with normal aging. NeuroImage, 52(4), 1215--1223. doi:10.1016/j.neuroimage.2010.04.258
Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., . . . Joliot, M. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage, 15(1), 273-289. doi:10.1006/nimg.2001.0978
Wildgruber, D., Ackermann, H., Kreifelts, B., & Ethofer, T. (2006). Cerebral processing of linguistic and emotional prosody: fMRI studies. Progress in Brain Research, 156, 249--268. doi:10.1016/S0079-6123(06)56013-3
Karpathy, A., & Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions. Paper presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption generator. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3156-3164.
Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A. C., Salakhutdinov, R., . . . Bengio, Y. (2015). Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. CoRR, abs/1502.03044.
Adams, T. (2017). AI-Powered Social Bots. CoRR, abs/1706.05143.
Serban, I. V., Sankar, C., Germain, M., Zhang, S., Lin, Z., Subramanian, S., . . . Bengio, Y. (2017). A Deep Reinforcement Learning Chatbot. ArXiv e-prints.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. A. (2013). Playing Atari with Deep Reinforcement Learning. CoRR, abs/1312.5602.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., . . . Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529--533. doi:10.1038/nature14236
Qi, F., & Wu, W. (2019). Human-like machine thinking: Language guided imagination. arXiv preprint arXiv:1905.07562.
Baroni, M., Joulin, A., Jabri, A., Kruszewski, G., Lazaridou, A., Simonic, K., & Mikolov, T. (2017). CommAI: Evaluating the first steps towards a useful general AI.
Hutter, M. (2005). Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Berlin: Springer.
Kaiser, L., Gomez, A. N., Shazeer, N., Vaswani, A., Parmar, N., Jones, L., & Uszkoreit, J. (2017). One Model To Learn Them All. CoRR, abs/1706.05137.
Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2016). Building Machines That Learn and Think Like People. Behavioral and Brain Sciences, 24, 1-101.
Mikolov, T., Joulin, A., & Baroni, M. (2015). A Roadmap towards Machine Intelligence. CoRR, abs/1511.08130.
Pascanu, R., Li, Y., Vinyals, O., Heess, N., Buesing, L., Racanière, S., . . . Battaglia, P. (2017). Learning model-based planning from scratch. ArXiv e-prints.
Rosa, M., & Feyereisl, J. (2016). A Framework for Searching for General Artificial Intelligence. CoRR, abs/1611.00685.
Weber, T., Racanière, S., Reichert, D. P., Buesing, L., Guez, A., Jimenez Rezende, D., . . . Wierstra, D. (2017). Imagination-Augmented Agents for Deep Reinforcement Learning. ArXiv e-prints.
Vinyals, O., Ewalds, T., Bartunov, S., Georgiev, P., Vezhnevets, A. S., Yeo, M., . . . Tsing, R. (2017). StarCraft II: A New Challenge for Reinforcement Learning.
Herbelot, A., & Baroni, M. (2017). High-risk learning: acquiring new word vectors from tiny data. EMNLP 2017.
Lazaridou, A., Peysakhovich, A., & Baroni, M. (2016). Multi-Agent Cooperation and the Emergence of (Natural) Language.
Boureau, Y. L., Bach, F., LeCun, Y., & Ponce, J. (2010). Learning mid-level features for recognition. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010 (pp. 2559--2566).
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., & Darrell, T. (2013). DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. CoRR, abs/1310.1531.
Farabet, C., Couprie, C., Najman, L., & LeCun, Y. (2013). Learning Hierarchical Features for Scene Labeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1915-1929. doi:10.1109/TPAMI.2012.231
Girshick, R. B., Donahue, J., Darrell, T., & Malik, J. (2013). Rich feature hierarchies for accurate object detection and semantic segmentation. CoRR, abs/1311.2524.
Gregor, K., Danihelka, I., Graves, A., & Wierstra, D. (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25 (pp. 1097--1105): Curran Associates, Inc.
Le, Q. V., Zou, W. Y., Yeung, S. Y., & Ng, A. Y. (2011). Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. Paper presented at CVPR 2011. http://dx.doi.org/10.1109/CVPR.2011.5995496
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., . . . Li, F.-F. (2014). ImageNet Large Scale Visual Recognition Challenge. CoRR, abs/1409.0575.
Taigman, Y., Yang, M., Ranzato, M., & Wolf, L. (2014). DeepFace: Closing the Gap to Human-Level Performance in Face Verification. Paper presented at the 2014 IEEE Conference on Computer Vision and Pattern Recognition. http://dx.doi.org/10.1109/CVPR.2014.220
Irsoy, O., & Cardie, C. (2014). Deep Recursive Neural Networks for Compositionality in Language. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 2096--2104): Curran Associates, Inc.
Andreas, J., Rohrbach, M., Darrell, T., & Klein, D. (2015). Deep Compositional Question Answering with Neural Module Networks. CoRR, abs/1511.02799.
Assylbekov, Z., Takhanov, R., Myrzakhmetov, B., & Washington, J. N. (2017). Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones. ArXiv e-prints.
Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A Neural Probabilistic Language Model. Journal of Machine Learning Research, 3, 1137--1155.
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. TACL, 5, 135--146.
Bojanowski, P., Joulin, A., & Mikolov, T. (2015). Alternative structures for character-level RNNs. CoRR, abs/1511.06303.
Boleda, G., Padó, S., Pham, N. T., & Baroni, M. (2017). Living a discrete life in a continuous world: Reference with distributed representations.
Deoras, A., Mikolov, T., Kombrink, S., & Church, K. (2013). Approximate inference: A sampling based modeling technique to capture complex dependencies in a language model. Speech Communication, 55(1), 162--177. doi:10.1016/j.specom.2012.08.004
Frome, A., Corrado, G. S., Shlens, J., Bengio, S., Dean, J., Ranzato, M. A., & Mikolov, T. (2013). DeViSE: A Deep Visual-Semantic Embedding Model. Paper presented at the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States. http://papers.nips.cc/paper/5204-devise-a-deep-visual-semantic-embedding-model
Glembek, O., Matejka, P., Burget, L., & Mikolov, T. (2008). Advances in phonotactic language recognition. Paper presented at the INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia, September 22-26, 2008. http://www.isca-speech.org/archive/interspeech_2008/i08_0743.html
Goldberg, Y. (2015). A Primer on Neural Network Models for Natural Language Processing. CoRR, abs/1510.00726.
Griffiths, T. L., Chater, N., Kemp, C., Perfors, A., & Tenenbaum, J. B. (2010). Probabilistic models of cognition: exploring representations and inductive biases. Trends in Cognitive Sciences, 14(8), 357--364. doi:10.1016/j.tics.2010.05.004
Hermann, K. M. (2014). Distributed Representations for Compositional Semantics. CoRR, abs/1411.3146.
Ororbia, A. G., II, Mikolov, T., & Reitter, D. (2017). Learning Simpler Language Models with the Delta Recurrent Neural Network Framework. CoRR, abs/1703.08864.
Jernite, Y., Grave, E., Joulin, A., & Mikolov, T. (2016). Variable Computation in Recurrent Neural Networks. CoRR, abs/1611.06188.
Johnson, J., Hariharan, B., van der Maaten, L., Fei-Fei, L., Zitnick, C. L., & Girshick, R. B. (2016). CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning. CoRR, abs/1612.06890.
Johnson, J., Hariharan, B., van der Maaten, L., Hoffman, J., Li, F.-F., Zitnick, C. L., & Girshick, R. B. (2017). Inferring and Executing Programs for Visual Reasoning. CoRR, abs/1705.03633.
Joulin, A., Grave, E., Bojanowski, P., Douze, M., Jégou, H., & Mikolov, T. (2016). FastText.zip: Compressing text classification models. CoRR, abs/1612.03651.
Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of Tricks for Efficient Text Classification. CoRR, abs/1607.01759.
Joulin, A., & Mikolov, T. (2015). Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets. Paper presented at the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada. http://papers.nips.cc/paper/5857-inferring-algorithmic-patterns-with-stack-augmented-recurrent-nets
Lai, S., Xu, L., Liu, K., & Zhao, J. (2015). Recurrent Convolutional Neural Networks for Text Classification. Paper presented at the Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. http://dl.acm.org/citation.cfm?id=2886521.2886636
Manning, C. (2015). Last Words. Computational Linguistics and Deep Learning. Association for Computational Linguistics.
McClelland, J. L., Botvinick, M. M., Noelle, D. C., Plaut, D. C., Rogers, T. T., Seidenberg, M. S., & Smith, L. B. (2010). Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends in Cognitive Sciences, 14(8), 348--356. doi:10.1016/j.tics.2010.06.002
Mesnil, G., Mikolov, T., Ranzato, M. A., & Bengio, Y. (2014). Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews. CoRR, abs/1412.5335.
Mikolov, T., Joulin, A., Chopra, S., Mathieu, M., & Ranzato, M. A. (2014). Learning Longer Memory in Recurrent Neural Networks. CoRR, abs/1412.7753.
Mikolov, T., Le, Q. V., & Sutskever, I. (2013). Exploiting Similarities among Languages for Machine Translation. CoRR, abs/1309.4168.
Mikolov, T., Yih, W.-t., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations. Paper presented at the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, June 9-14, 2013, Westin Peachtree Plaza Hotel, Atlanta, Georgia, USA. http://aclweb.org/anthology/N/N13/N13-1090.pdf
De Mulder, W., Bethard, S., & Moens, M.-F. (2015). A survey on the application of recurrent neural networks to statistical language modeling. Computer Speech and Language, 30, 61--98. doi:10.1016/j.csl.2014.09.005
Norouzi, M., Mikolov, T., Bengio, S., Singer, Y., Shlens, J., Frome, A., . . . Dean, J. (2013). Zero-Shot Learning by Convex Combination of Semantic Embeddings. CoRR, abs/1312.5650.
Pascanu, R., Mikolov, T., & Bengio, Y. (2013). On the difficulty of training recurrent neural networks. Paper presented at the Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21 June 2013. http://jmlr.org/proceedings/papers/v28/pascanu13.html
Rekabsaz, N., Mitra, B., Lupu, M., & Hanbury, A. (2017). Toward Incorporation of Relevant Documents in word2vec. ArXiv e-prints.
Rojas-Carulla, M., Baroni, M., & Lopez-Paz, D. (2017). Causal Discovery Using Proxy Variables.
Szepesvári, C. (Draft). Algorithms for Reinforcement Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning.
Wellman, H. M., & Gelman, S. A. (1998). Knowledge Acquisition in Foundational Domains. In W. Damon, R. M. Lerner, D. Kuhn, & R. S. Siegler (Eds.), Handbook of child psychology: Cognition, perception, and language (Vol. 2, pp. 523-573). Hoboken, NJ, US: John Wiley & Sons Inc.
Zaremba, W., Mikolov, T., Joulin, A., & Fergus, R. (2016). Learning Simple Algorithms from Examples. Paper presented at the Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016. http://jmlr.org/proceedings/papers/v48/zaremba16.html
Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). A Simple Way to Initialize Recurrent Networks of Rectified Linear Units. arXiv:1504.00941v2.
Pascanu, R., Mikolov, T., & Bengio, Y. (2012). Understanding the exploding gradient problem. CoRR, abs/1211.5063.
Cho, K., van Merrienboer, B., Gülçehre, Ç., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. CoRR, abs/1406.1078.
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. P. (2011). Natural Language Processing (almost) from Scratch. CoRR, abs/1103.0398.
Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A Convolutional Neural Network for Modelling Sentences. CoRR, abs/1404.2188.
Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. CoRR, abs/1408.5882.
Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., & McClosky, D. (2014). The Stanford CoreNLP Natural Language Processing Toolkit. Paper presented at the Association for Computational Linguistics (ACL) System Demonstrations. http://www.aclweb.org/anthology/P/P14/P14-5010
Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A. Y., & Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Paper presented at the Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA.
Weston, J., Bordes, A., Chopra, S., & Mikolov, T. (2015). Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks. CoRR, abs/1502.05698.
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural Machine Translation by Jointly Learning to Align and Translate. CoRR, abs/1409.0473.
Bengio, Y., Courville, A. C., & Vincent, P. (2012). Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives. CoRR, abs/1206.5538.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436--444. doi:10.1038/nature14539
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85--117. doi:10.1016/j.neunet.2014.09.003
Lenz, I., Lee, H., & Saxena, A. (2013). Deep Learning for Detecting Robotic Grasps. CoRR, abs/1301.3592.
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., . . . Zheng, X. (2016). TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. CoRR, abs/1603.04467.
Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow, I. J., Bergeron, A., . . . Bengio, Y. (2012). Theano: new features and speed improvements. CoRR, abs/1211.5590.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R. B., . . . Darrell, T. (2014). Caffe: Convolutional Architecture for Fast Feature Embedding. CoRR, abs/1408.5093.
Vedaldi, A., & Lenc, K. (2014). MatConvNet - Convolutional Neural Networks for MATLAB. CoRR, abs/1412.4564.
Dahl, G. E., Yu, D., Deng, L., & Acero, A. (2012). Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20(1), 30-42. doi:10.1109/TASL.2011.2134090
Graves, A., Mohamed, A.-r., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks. Paper presented at the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. http://dx.doi.org/10.1109/ICASSP.2013.6638947
Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., . . . Kingsbury, B. (2012). Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups. IEEE Signal Processing Magazine, 29(6), 82-97. doi:10.1109/MSP.2012.2205597
Shen, C.-H., Sung, J. Y., & Lee, H.-Y. (2017). Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data. ArXiv e-prints.
Taigman, Y., Wolf, L., Polyak, A., & Nachmani, E. (2017). Voice Synthesis for in-the-Wild Speakers via a Phonological Loop. ArXiv e-prints.
van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., . . . Kavukcuoglu, K. (2016). WaveNet: A Generative Model for Raw Audio. CoRR, abs/1609.03499.
Arandjelovic, R., & Zisserman, A. (2017). Look, Listen and Learn. CoRR, abs/1705.08168.
Aytar, Y., Vondrick, C., & Torralba, A. (2016). SoundNet: Learning Sound Representations from Unlabeled Video. CoRR, abs/1610.09001.
Harwath, D., Torralba, A., & Glass, J. R. (2016). Unsupervised Learning of Spoken Language with Visual Context. NIPS 2016.
Owens, A., Isola, P., McDermott, J. H., Torralba, A., Adelson, E. H., & Freeman, W. T. (2015). Visually Indicated Sounds. CoRR, abs/1512.08512.
Bergstra, J., & Bengio, Y. (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research, 13, 281--305.
Dean, J., Corrado, G. S., Monga, R., Chen, K., Devin, M., Le, Q. V., . . . Ng, A. Y. (2012). Large Scale Distributed Deep Networks. Paper presented at the Proceedings of the 25th International Conference on Neural Information Processing Systems, USA. http://dl.acm.org/citation.cfm?id=2999134.2999271
Girshick, R. B. (2015). Fast R-CNN. CoRR, abs/1504.08083.
Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. Paper presented at the Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Chia Laguna Resort, Sardinia, Italy. http://proceedings.mlr.press/v9/glorot10a.html
Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks. Paper presented at the Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA. http://proceedings.mlr.press/v15/glorot11a.html
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., . . . Bengio, Y. (2014). Generative Adversarial Networks. ArXiv e-prints.
Goodfellow, I. J., Warde-Farley, D., Mirza, M., Courville, A., & Bengio, Y. (2013). Maxout Networks. ArXiv e-prints.
He, K., Zhang, X., Ren, S., & Sun, J. (2014). Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. CoRR, abs/1406.4729.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. CoRR, abs/1512.03385.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. CoRR, abs/1502.01852.
Hinton, G., Vinyals, O., & Dean, J. (2014). Distilling the Knowledge in a Neural Network.
Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580.
Kingma, D. P., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. CoRR, abs/1412.6980.
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., & LeCun, Y. (2013). OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks. CoRR, abs/1312.6229.
Shelhamer, E., Long, J., & Darrell, T. (2016). Fully Convolutional Networks for Semantic Segmentation. CoRR, abs/1605.06211.
Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR, abs/1409.1556.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 15, 1929-1958.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., . . . Rabinovich, A. (2015). Going deeper with convolutions. Paper presented at the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). http://dx.doi.org/10.1109/CVPR.2015.7298594
Wan, L., Zeiler, M., Zhang, S., LeCun, Y., & Fergus, R. (2013). Regularization of Neural Networks using DropConnect. Paper presented at the Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA. http://proceedings.mlr.press/v28/wan13.html
Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). How transferable are features in deep neural networks? In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 3320--3328): Curran Associates, Inc.
Zeiler, M. D., & Fergus, R. (2013). Visualizing and Understanding Convolutional Networks. CoRR, abs/1311.2901.
Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. CoRR, abs/1502.03167.
Nguyen, A. M., Yosinski, J., & Clune, J. (2014). Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. CoRR, abs/1412.1897.
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., & Bengio, S. (2010). Why Does Unsupervised Pre-training Help Deep Learning? Journal of Machine Learning Research, 11, 625--660.
Coates, A., Ng, A., & Lee, H. (2011). An Analysis of Single-Layer Networks in Unsupervised Feature Learning. Paper presented at the Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA. http://proceedings.mlr.press/v15/coates11a.html
Hinton, G. E. (2012). A Practical Guide to Training Restricted Boltzmann Machines. In G. Montavon, G. B. Orr, & K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade: Second Edition (pp. 599--619): Springer Berlin Heidelberg.
Le, Q. V., Monga, R., Devin, M., Corrado, G., Chen, K., Ranzato, M. A., . . . Ng, A. Y. (2011). Building high-level features using large scale unsupervised learning. CoRR.
Rifai, S., Vincent, P., Muller, X., Glorot, X., & Bengio, Y. (2011). Contractive auto-encoders: Explicit invariance during feature extraction. Paper presented at the Proceedings of the Twenty-eighth International Conference on Machine Learning (ICML'11).
Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proceedings of the IEEE, 104(1), 148-175. doi:10.1109/JPROC.2015.2494218
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P.-A. (2010). Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research, 11, 3371--3408.
Ji, S., Xu, W., Yang, M., & Yu, K. (2013). 3D Convolutional Neural Networks for Human Action Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 221-231. doi:10.1109/TPAMI.2012.59
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014). Large-scale Video Classification with Convolutional Neural Networks. Paper presented at CVPR.
Lara, O. D., & Labrador, M. A. (2013). A Survey on Human Activity Recognition using Wearable Sensors. IEEE Communications Surveys and Tutorials, 15, 1192-1209.
Toshev, A., & Szegedy, C. (2013). DeepPose: Human Pose Estimation via Deep Neural Networks. CoRR, abs/1312.4659.
Wang, H., & Schmid, C. (2013). Action Recognition with Improved Trajectories. 2013 IEEE International Conference on Computer Vision. doi:10.1109/iccv.2013.441
Le, Q. V., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. CoRR, abs/1405.4053.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. CoRR, abs/1301.3781.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. CoRR, abs/1310.4546.
Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. Paper presented at EMNLP.
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to Sequence Learning with Neural Networks. CoRR, abs/1409.3215.
Turian, J., Ratinov, L., & Bengio, Y. (2010). Word Representations: A Simple and General Method for Semi-supervised Learning. Paper presented at the Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA. http://dl.acm.org/citation.cfm?id=1858681.1858721