[GitHub Share] Speech Interaction and NLP Resource Roundup

2021-02-14 語音雜談 (Speech Miscellany)

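Introduction: This article collects links to papers, corpora, code, projects, tutorials, and other resources on speech interaction and NLP. Reading time: about 10 minutes.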

The speech resources below were shared by the GitHub user mxer. Original address: https://github.com/mxer/awesome-speech#1.2

The table of contents follows. If an entry interests you, copy its link to jump to it:

1.page

Xingyu Na

Language Processing and Pattern Recognition in University of Aachen

Fernando de la Calle Silos

2.open source library/toolbox/code

HTK

Py2HTK

parallel-htk

HTK_C_MATLAB_tools

Kaldi:

Kaldi official documentation (Chinese version)

Kaldi models

Corpus Phonetics Tutorial

py-kaldi-asr

https://github.com/pykaldi/pykaldi

https://github.com/gooofy/py-kaldi-asr

https://github.com/UFAL-DSG/pykaldi

https://github.com/janchorowski/kaldi-python

Dan's DNN implementation:

pytorch-kaldi

kaldi-lstm

kaldi-ctc

keras-kaldi

python wrapper for kaldi-online-decoder

Kaldi+PDNN

tfkaldi

Kaldi_CNTK_AMI

kaldi-io-for-python

kaldi-pyio

kaldi-tree-conv

kaldi-ivector

kaldi-yesno-tutorial

Kaldi nnet3 tutorial

Josh Meyer's Website

Adapting your own Language Model for Kaldi

Some Kaldi Notes

http://jrmeyer.github.io/asr/2016/02/01/Kaldi-notes.html

http://sentiment-mining.blogspot.com/

http://pages.jh.edu/~echodro1/tutorial/kaldi/

kaldi_tutorial

Online decoder for Kaldi NNET2 and GMM speech recognition models with Python bindings

ResNet-Kaldi-Tensorflow-ASR

Kaldi ASR: Extending the ASpIRE model

FastCGI support for Kaldi ASR

alignUsingKaldi

kaldi-readers-for-tensorflow

kaldi-iot

lattice-info

lattice-char-to-word

lattice-word-length-distribution

kaldi-lattice-word-index

kaldi-decoders

lattice-remove-ctc-blank

kaldi-lattice-search

htk2kaldi

parallel-kaldi

Building an online Chinese speech recognition system with Kaldi

kaldi-docker

CSLT-Sparse-DNN-Toolkit

featxtra

Sphinx

https://cmusphinx.github.io/

https://github.com/cmusphinx

https://github.com/cmusphinx/pocketsphinx

OpenFst

http://www.openfst.org/twiki/bin/view/FST/WebHome

https://github.com/UFAL-DSG/openfst

https://github.com/benob/openfst-utils

https://github.com/vchahun/pyfst

MIT Spoken Language Systems

Julius

Bavieca

Simon

SIDEKIT

SRILM

https://www.sri.com/engage/products-solutions/sri-language-modeling-toolkit

http://www.speech.sri.com/projects/srilm/

https://github.com/nuance1979/srilm-python

https://github.com/njsmith/pysrilm

awd-lstm-lm

ISIP

MIT Finite-State Transducer (FST) Toolkit

MIT Language Modeling (MITLM) Toolkit

OpenGrm

RNNLM

http://www.fit.vutbr.cz/~imikolov/rnnlm/

https://github.com/IntelLabs/rnnlm

https://github.com/glecorve/rnnlm2wfst

faster-rnnlm

CUED-RNNLM Toolkit

Using RNNLM rescoring a sentence in Chinese ASR system

KenLM

rwthlm

word-rnn-tensorflow

tensorlm

SpeechRecognition

SpeechPy

Aalto

google-cloud-speech

apiai

https://pypi.org/project/apiai/

wit

Nabu

asr-study

dejavu

uSpeech

Juicer

PMLS

dragonfly

SPTK

https://github.com/r9y9/SPTK

https://github.com/sp-nitech/SPTK

http://sp-tk.sourceforge.net/

pysptk

RWTH ASR

Palaver

Praat

Speech Recognition Grammar Specification

Automatic_Speech_Recognition

speech-to-text-wavenet

tensorflow-speech-recognition

tensorflow_end2end_speech_recognition

tensorflow_speech_recognition_demo

AVSR-Deep-Speech

TTS and ASR

CTC + Tensorflow Example for ASR

tensorflow-ctc-speech-recognition

speechT

end2endASR

NADU

DTW (Dynamic Time Warping) python module

Various scripts and tools for speech recognition model building

A deep-learning-based speech recognition system: Chinese speech recognition implemented with CNN, LSTM, and CTC

tacotron_asr

ASR_Keras

Kaggle Tensorflow Speech Recognition Challenge

Speech recognition script for Asterisk that uses Google's speech engine

Libraries and scripts for manipulating and handling ASR output/n-bests/etc

Some scripts and commands for working with ASR

PySpeechGrammar

Python module for evaluating ASR hypotheses

edit-distance

3.dataset

VoxForge

http://www.voxforge.org/home

http://www.voxforge.org/zh

http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit/

ASR Audio Data Links

The CMU Pronouncing Dictionary

TIMIT

https://catalog.ldc.upenn.edu/LDC93S1

https://github.com/syhw/timit_tools

https://github.com/philipperemy/timit

GlobalPhone Language Models

1 Billion Word Language Model Benchmark

DaCiDian-Develop

AISHELL

CC-CEDICT

https://www.mdbg.net/chinese/dictionary?page=cc-cedict

TED-LIUM

open-asr-lexicon

4.Tutorial

University of Edinburgh ASR 2017-18

Stanford CS224S

NYU asr12

Speech Recognition with Neural Networks

page

CSTR-Edinburgh

open source library/toolbox

WORLD

HTS

http://hts.sp.nitech.ac.jp/

http://hts-engine.sourceforge.net/

https://github.com/shamidreza/HTS-demo_CMU-ARCTIC-SLT-Formant

https://github.com/MattShannon/HTS-demo_CMU-ARCTIC-SLT-STRAIGHT-AR-decision-tree

Tacotron

https://github.com/Kyubyong/tacotron

https://github.com/Kyubyong/expressive_tacotron

https://github.com/keithito/tacotron

https://github.com/GSByeon/multi-speaker-tacotron-tensorflow

https://github.com/r9y9/tacotron_pytorch

https://github.com/soobinseo/Tacotron-pytorch

Tacotron2

https://github.com/NVIDIA/tacotron2

https://github.com/riverphoenix/tacotron2

https://github.com/A-Jacobson/tacotron2

https://github.com/selap91/Tacotron2

https://github.com/LGizkde/Tacotron2_Tao_Shujie

https://github.com/rlawns1016/Tacotron2

https://github.com/CapstoneInha/Tacotron2-rehearsal

Merlin

mozilla TTS

Flite

Speect

Festival

eSpeak

nnmnkwii

Ossian

gTTS

gnuspeech

supercollider

sc3-plugins

Neural_Network_Voices

pggan-pytorch

cainteoir-engine

loop

TTS and ASR

musa_tts

marytts(JAVA)

1.open source library/toolbox

Alize

speaker-recognition-py3

openVP

2.Gender recognition by voice and speech analysis

page

NTU

Tsung-Hsien Wen

open source library/toolbox

PyDial

alex

ROS speech interaction system

Chinese speech interaction system built on the ROS framework

1.Speech Processing

madmom

pydub

kapre: Keras Audio Preprocessors

BTK

EspNet

Signal-Processing

pyroomacoustics

librosa

REAPER

MSD_split_for_tagging

VOICEBOX

liquid-dsp

ffts

mir_eval

aupyom

Pitch Detection

TFTB

maracas

SRMRpy

ssp

iss

asr_preprocessing

asrt

Audio superresolution using NN

RNN training for noise reduction in robust asr

RNN for audio noise reduction

muda

Efficient sample rate conversion in python

Smarc audio rate converter

Python scripts to compute f0s of a wave file

2.Audio I/O

PortAudio

audiolab

pytorch audio

Digital Speech Decoder

audioread

audacity.py

3.Sound Source Separation

HARK

Deep RNN for Source Separation

nussl

DNN for Music Source Separation in Tensorflow

Alexey Ozerov

University of Surrey CVSSP

source separation using CNN

4.Feature Extraction

openSMILE

veles.sound_feature_extraction

vamp-plugin-sdk

Yaafe

py_bank

AuditoryFilterbanks

python_speech_features

VAD

https://github.com/jtkim-kaist/VAD

https://github.com/jtkim-kaist/VAD_DNN

https://github.com/marsbroshok/VAD-python

https://github.com/shiweixingcn/vad

https://github.com/fedden/RenderMan

rVAD

Aurora 2 VAD

Israel Cohen

Python interface to the WebRTC Voice Activity Detector

1.code/tool/data

cmusphinx

julius-speech

OpenSLR

List of speech recognition software

KTH

VERBIO

timeview

Speech at CMU Web Page

CMU Robust Speech Group

Speech Software at CMU

Aalto Speech Research

CMU Festvox Project

CSTR

Xiph

Brno University of Technology Speech Processing Group

SoX

STRAIGHT

Idiap Research Institute

Transcriber

Amirsina Torfi

The Speech Recognition Virtual Kitchen

Sparse Representation & Dictionary Learning Algorithms with Applications in Denoising, Separation, Localisation and Tracking

Audacity

beetbox

CAQE

UCL Speech Filing System

Ryuichi Yamamoto

Kyubyong Park

Hideyuki Tachibana

Colin Raffel

Paul Dixon

smacpy

c4dm

Matt Shannon

Keunwoo Choi

ADASP

uchicago Speech and Language @ TTIC

justin salamon

COLEA

openAUDIO

Praat

librosa

Essentia

timmahrt

Lefteris Zafiris

audio-to-audio and audio-to-midi alignment

DNN based hotword and wake word detection toolkit

free-spoken-digit-dataset

Chinese Linguistic Data Consortium (中文語言資源聯盟)

Institute of Formal and Applied Linguistics – Dialogue Systems Group

https://github.com/UFAL-DSG

https://github.com/edobashira/speech-language-processing

https://github.com/andabi?tab=repositories

https://code.soundsoftware.ac.uk/projects

2.tutorial

DL for Computer Vision, Speech, and Language

NTU Introduction to Digital Speech Processing (臺大數位語音處理概論)

IISc Speech Information Processing

paper

https://arxiv.org/search/?query=speech&searchtype=all&source=header

https://www.isca-speech.org/iscaweb/index.php/archive/online-archive

https://www.aclweb.org/anthology/

https://github.com/zzw922cn/awesome-speech-recognition-speech-synthesis-papers

state of the art and recent results (bibliography) on speech recognition

Dan Povey

cmusphinx

CMU Language Technologies Institute

CMU SPEECH@SV

Mitsubishi Electric Research Laboratories

MIT Spoken Language Systems

Brno University of Technology Speech Processing Group

IISc

uchicago Speech and Language @ TTIC

RWTH Aachen University

TOKUDA and NANKAKU LABORATORY

Institute of Formal and Applied Linguistics – Dialogue Systems Group

Ohio State University speech separation

LEAP Laboratory

Hainan Xu

Mark Gales

Karen Livescu

Shubham Toshniwal

Adrien Ycart

Ron Weiss

Yajie Miao

Scott T Wisdom

Alan W Black

Amirsina Torfi

Liang Lu

Zhizheng WU

justin salamon

Keith Vertanen

Aviv Gabbay

Mehryar Mohri

Jonathan LE ROUX

Suyoun Kim

DeepSound

Lei Xie

The NLP resources below were shared by the GitHub user msgi. Original address: https://github.com/msgi/nlp-journey

The table of contents follows. If an entry interests you, copy its link to jump to it:

https://github.com/msgi/nlp-journey/blob/master/docs/tools.md

https://github.com/msgi/nlp-journey/blob/master/docs/alg.md

https://github.com/msgi/nlp-journey/blob/master/docs/basic.md

https://github.com/msgi/nlp-journey/blob/master/docs/fq.md

https://github.com/msgi/nlp-journey/blob/master/docs/notes.md

https://pan.baidu.com/share/init?surl=sE_20nHCfej6f9yRaisz7Q

http://www.ituring.com.cn/book/1605

https://www.deeplearningbook.org/

http://neuralnetworksanddeeplearning.com/

https://nndl.github.io/

http://web.stanford.edu/~jurafsky/slp3/ed3book.pdf

http://cs224d.stanford.edu/

1.Algorithm models and optimization

http://www.bioinf.jku.at/publications/older/2604.pdf

https://arxiv.org/pdf/1207.0580.pdf

https://arxiv.org/pdf/1512.03385.pdf

https://arxiv.org/pdf/1502.03167.pdf

2.Survey papers

https://arxiv.org/pdf/1812.08951.pdf

https://arxiv.org/pdf/1803.07133.pdf

3.Language models

https://www.researchgate.net/publication/221618573_A_Neural_Probabilistic_Language_Model

https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf

4.Text augmentation

https://arxiv.org/pdf/1901.11196.pdf

5.Text pre-training

https://arxiv.org/pdf/1301.3781.pdf

https://arxiv.org/pdf/1405.4053.pdf

Paper: https://arxiv.org/pdf/1607.04606.pdf

Explainer: https://www.sohu.com/a/114464910_465975

https://nlp.stanford.edu/projects/glove/

https://arxiv.org/pdf/1802.05365.pdf

https://arxiv.org/pdf/1810.04805.pdf

https://arxiv.org/pdf/1906.08101.pdf

https://arxiv.org/pdf/1906.08237.pdf

6.Text classification

https://arxiv.org/pdf/1510.03820.pdf

https://arxiv.org/pdf/1408.5882.pdf

https://www.aclweb.org/anthology/P16-2034

7.Text generation

https://arxiv.org/pdf/1805.06553.pdf

https://arxiv.org/pdf/1609.05473.pdf

https://arxiv.org/pdf/1605.05396.pdf

8.Text similarity

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.723.6492&rep=rep1&type=pdf

https://www.aclweb.org/anthology/W16-1617

9.Short-text matching

http://papers.nips.cc/paper/5019-a-deep-architecture-for-matching-short-texts.pdf

10.Question answering

https://arxiv.org/pdf/1801.08290.pdf

https://arxiv.org/pdf/1812.08989.pdf

https://arxiv.org/pdf/1702.01932.pdf

https://arxiv.org/pdf/1512.01337v1.pdf

https://arxiv.org/abs/1612.01627

https://arxiv.org/pdf/1806.09102.pdf

https://www.aclweb.org/anthology/P18-1103

11.Machine translation

https://arxiv.org/pdf/1406.1078v3.pdf

https://arxiv.org/pdf/1706.03762.pdf

https://arxiv.org/pdf/1901.02860.pdf

12.Automatic summarization

https://arxiv.org/pdf/1704.04368.pdf

13.Event extraction

https://pdfs.semanticscholar.org/ca70/480f908ec60438e91a914c1075b9954e7834.pdf

14.Recommender systems

https://arxiv.org/pdf/1905.06874.pdf

Must-read blog posts

https://jalammar.github.io/illustrated-transformer/

http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/

https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained

https://blog.keras.io/building-autoencoders-in-keras.html

https://nlpoverview.com/

https://github.com/msgi/nlp-journey

https://www.cnblogs.com/rucwxb/p/10277217.html

https://zhuanlan.zhihu.com/p/49271699

https://zhuanlan.zhihu.com/p/70257427

https://mp.weixin.qq.com/s?__biz=MzI4MDYzNzg4Mw==&mid=2247488287&idx=2&sn=aa7b045337940886d5a7767f95ab0128&chksm=ebb42bcbdcc3a2ddcfb73fb77bead9655d6608b1a951a8b429fb2c38d56ca92289e97e6decd1&mpshare=1&scene=24&srcid=0930GzGGm3m7uZfJyblgWV3k&key=5b1b221b044835abb8ce952ed69e6acdfe5f30700caa3c560c8fe663354916c6753858e4dbbf1b4d1c2eded3876c67c0983d3d51324c321458405b0cacec9103640c28a7a5c068729172703bf23c0348&ascene=14&uin=Mjk3NzQ2NDczMQ%3D%3D&devicetype=Windows+10&version=62060833&lang=zh_CN&pass_ticket=uQSzwn38HjOIK%2BZwFf5AXCp%2Fk0QiE7budc%2Bl5t1yBFtOXA%2BPvSaFwqUWEwEmyZEd

https://github.com/msgi/nlp-journey/tree/master/nlp/embedding

fasttext(skipgram+cbow)

gensim(word2vec)

https://github.com/msgi/nlp-journey/blob/master/nlp/similarity

bilstm+crf

https://github.com/CyberZHG/keras-gpt-2

https://github.com/jiangxinyang227/textClassifier

https://github.com/Lsdefine/attention-is-all-you-need-keras

https://github.com/miroozyx/BERT_with_keras

https://github.com/CyberZHG/keras-bert

https://github.com/iliaschalkidis/ELMo-keras

https://github.com/tyo-yo/SeqGAN

http://www.52nlp.cn/

https://kexue.fm/category/Big-Data

https://www.cnblogs.com/pinard/

https://tobiaslee.top/

https://github.com/msgi/nlp-journey

https://www.jiqizhixin.com/

https://colah.github.io/

https://zhpmatrix.github.io/

http://www.wildml.com/

http://www.shuang0420.com/

https://www.zybuluo.com/hanbingtao/note/433855

https://www.aclweb.org/portal/

https://www.emnlp-ijcnlp2019.org/

https://www.sheffield.ac.uk/dcs/research/groups/nlp/iccl/index#tab00

https://nips.cc/

https://www.aaai.org/

https://www.ijcai.org/

https://icml.cc/
