A Collection of Important Papers from Thirty Years of Speech Recognition, with Download Links

Compiled by: WeChat official account 【深度學習每日摘要】


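Speech recognition research has a history of more than thirty years. It runs from the hidden Markov models of the 1980s, to frame-level deep neural network models at the start of the twenty-first century, to the CTC model in 2006, to deep recurrent neural network models in 2012, and then to the application of attention mechanisms to speech recognition in 2014. Speech recognition systems based on seq2seq models were proposed in 2015, and in 2016 deep convolutional neural networks were applied to large-scale speech recognition systems. Over this period, speech recognition systems have moved from hand-crafted feature extraction to today's end-to-end neural network models, and accuracy is now close to 97%.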

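This article lists papers in the speech recognition field from 1982 to the present, covering all of the models mentioned above, together with first-author information and a PDF download link for each.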

The list is sorted by publication year and then by first letter of the title. For the complete list and download links, visit:

https://github.com/zzw922cn/awesome-speech-recognition-papers

An Introduction to the Application of the Theory of Probabilistic Functions of a Markov Process to Automatic Speech Recognition(1982), S. E. Levinson et al. [pdf]

A Maximum Likelihood Approach to Continuous Speech Recognition(1983), Lalit R. Bahl et al. [pdf]

Heterogeneous Acoustic Measurements and Multiple Classifiers for Speech Recognition(1986), Andrew K. Halberstadt. [pdf]

Maximum Mutual Information Estimation of Hidden Markov Model Parameters for Speech Recognition(1986), Lalit R. Bahl et al. [pdf]

Hidden Markov Models for Speech Recognition(1991), B. H. Juang et al. [pdf]

Framewise phoneme classification with bidirectional LSTM and other neural network architectures(2005), Alex Graves et al. [pdf]

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks(2006), Alex Graves et al. [pdf]

Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition(2012), Ossama Abdel-Hamid et al. [pdf]

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition(2012), George E. Dahl et al. [pdf]

Deep Neural Networks for Acoustic Modeling in Speech Recognition(2012), Geoffrey Hinton et al. [pdf]

Sequence Transduction with Recurrent Neural Networks(2012), Alex Graves et al. [pdf]

Deep convolutional neural networks for LVCSR(2013), Tara N. Sainath et al. [pdf]

Improving deep neural networks for LVCSR using rectified linear units and dropout(2013), George E. Dahl et al. [pdf]

Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training(2013), Yajie Miao et al. [pdf]

Improvements to deep convolutional neural networks for LVCSR(2013), Tara N. Sainath et al. [pdf]

Machine Learning Paradigms for Speech Recognition: An Overview(2013), Li Deng et al. [pdf]

Recent advances in deep learning for speech research at Microsoft(2013), Li Deng et al. [pdf]

Speech recognition with deep recurrent neural networks(2013), Alex Graves et al. [pdf]

Convolutional deep maxout networks for phone recognition(2014), László Tóth et al. [pdf]

Convolutional Neural Networks for Speech Recognition(2014), Ossama Abdel-Hamid et al. [pdf]

Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition(2014), László Tóth. [pdf]

Deep Speech: Scaling up end-to-end speech recognition(2014), Awni Y. Hannun et al. [pdf]

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results(2014), Jan Chorowski et al. [pdf]

First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs(2014), Andrew L. Maas et al. [pdf]

Long short-term memory recurrent neural network architectures for large scale acoustic modeling(2014), Hasim Sak et al. [pdf]

Robust CNN-based speech recognition with Gabor filter kernels(2014), Shuo-Yiin Chang et al. [pdf]

Stochastic pooling maxout networks for low-resource speech recognition(2014), Meng Cai et al. [pdf]

Towards End-to-End Speech Recognition with Recurrent Neural Networks(2014), Alex Graves et al. [pdf]

Attention-Based Models for Speech Recognition(2015), Jan Chorowski et al. [pdf]

Analysis of CNN-based speech recognition system using raw speech as input(2015), Dimitri Palaz et al. [pdf]

Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks(2015), Tara N. Sainath et al. [pdf]

Deep convolutional neural networks for acoustic modeling in low resource languages(2015), William Chan et al. [pdf]

Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition(2015), Chao Weng et al. [pdf]

Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition(2015), Hasim Sak et al. [pdf]

Listen, Attend and Spell(2015), William Chan et al. [pdf]

Online Sequence Training of Recurrent Neural Networks with Connectionist Temporal Classification(2015), Kyuyeon Hwang et al. [pdf]

Advances in All-Neural Speech Recognition(2016), Geoffrey Zweig et al. [pdf]

Advances in Very Deep Convolutional Neural Networks for LVCSR(2016), Tom Sercu et al. [pdf]

Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention(2016), Dong Yu et al. [pdf]

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin(2016), Dario Amodei et al. [pdf]

End-to-end attention-based large vocabulary speech recognition(2016), Dzmitry Bahdanau et al. [pdf]

End-to-end attention-based distant speech recognition with Highway LSTM(2016), Hassan Taherian. [pdf]

Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning(2016), Suyoun Kim et al. [pdf]

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition(2016), William Chan et al. [pdf]

Latent Sequence Decompositions(2016), William Chan et al. [pdf]

Segmental Recurrent Neural Networks for End-to-End Speech Recognition(2016), Liang Lu et al. [pdf]

Towards better decoding and language model integration in sequence to sequence models(2016), Jan Chorowski et al. [pdf]

Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition(2016), Yanmin Qian et al. [pdf]

Very Deep Convolutional Networks for End-to-End Speech Recognition(2016), Yu Zhang et al. [pdf]

Very deep multilingual convolutional neural networks for LVCSR(2016), Tom Sercu et al. [pdf]

Wav2Letter: an End-to-End ConvNet-based Speech Recognition System(2016), Ronan Collobert et al. [pdf]

WaveNet: A Generative Model for Raw Audio(2016), Aäron van den Oord et al. [pdf]

An enhanced automatic speech recognition system for Arabic(2017), Mohamed Amine Menacer et al. [pdf]

A network of deep neural networks for distant speech recognition(2017), Mirco Ravanelli et al. [pdf]

An Unsupervised Speaker Clustering Technique based on SOM and I-vectors for Speech Recognition Systems(2017), Hany Ahmed et al. [pdf]

Building DNN acoustic models for large vocabulary speech recognition(2017), Andrew L. Maas et al. [pdf]

Deep LSTM for Large Vocabulary Continuous Speech Recognition(2017), Xu Tian et al. [pdf]

Direct Acoustics-to-Word Models for English Conversational Speech Recognition(2017), Kartik Audhkhasi et al. [pdf]

English Conversational Telephone Speech Recognition by Humans and Machines(2017), George Saon et al. [pdf]

ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA(2017), Song Han et al. [pdf]

Multichannel End-to-end Speech Recognition(2017), Tsubasa Ochiai et al. [pdf]

Multi-task Learning with CTC and Segmental CRF for Speech Recognition(2017), Liang Lu et al. [pdf]

Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition(2017), Tara N. Sainath et al. [pdf]

Residual Convolutional CTC Networks for Automatic Speech Recognition(2017), Yisen Wang et al. [pdf]
