Microbiome Discovery: An Introductory Metagenomics Course

2021-02-14 生信菜鳥團

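I stumbled across Dan Knights's Microbiome Discovery introductory metagenomics course on YouTube. After skimming through it, I found that it builds up from the basics and covers both theory and practice very well. I only wish I had found it sooner QAQ. This course alone should be more than enough to get started with metagenomics~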

Course playlist: https://www.youtube.com/playlist?list=PLOPiWVjg6aTzsA53N19YqJQeZpSCH9QPc

RMarkdown example data and practice code: https://github.com/danknights/mice8992-2016

Video contents

1. Intro to the Microbiome

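• Introduction to the microbiome
• How microbiome research is carried out
• Some of the challenges (microbiome data are relatively unstable; biomarker discovery)

Link: https://youtu.be/6564K4-_DBI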

2. How microbiome data are generated

• How the data are generated
• Pros and cons of the two sequencing approaches
• Metagenomic sequencing
• Amplicon sequencing

Link: https://youtu.be/FWT1HBzlWOE

3. 16S Variable Regions

• Why the 16S region is chosen; the structure and function of 16S rRNA
• Where OTUs come from

Link: https://youtu.be/8Aa_mnyXm70

4. QIIME

• Overview of the QIIME analysis workflow

Link: https://youtu.be/iy0JWgzmM_A

4.5. (Optional) UNIX Command Line

• Introduction to UNIX commands and how to use Git

Link: https://youtu.be/u2IQQUMeWy8

5. Picking OTUs

• OTU clustering methods
• closed reference
• de novo
• UCLUST
• CD-HIT
• SUMACLUST
• mothur
• SWARM
• open reference

Link: https://youtu.be/Ok5h24KZbAE
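
One simple way to picture de novo OTU picking is a greedy, centroid-based pass over the reads. The sketch below is illustrative only and is not how any of the tools listed above is implemented. The names `identity` and `greedy_otus` are hypothetical, sequences are assumed to be pre-aligned and of equal length, and the 97% default threshold is a widely used convention rather than something stated in these notes.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes pre-aligned, equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches, else start a new OTU."""
    centroids = []    # one representative sequence per OTU
    assignment = []   # OTU index for each input sequence, in input order
    for seq in seqs:
        for idx, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                assignment.append(idx)
                break
        else:
            # No existing centroid is similar enough: this read seeds a new OTU.
            centroids.append(seq)
            assignment.append(len(centroids) - 1)
    return centroids, assignment


# Toy reads of length 10, clustered at a 90% threshold for illustration.
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC"]
centroids, assignment = greedy_otus(reads, threshold=0.9)
print(centroids, assignment)
```

Because the result depends on input order, real pipelines typically sort reads (for example by abundance) before a pass like this.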

6. Assigning Taxonomy

• How to assign taxonomy to the microbial community
• The Random Forests classifier seems to work better
• Nearest neighbor using optimal gapped alignment with large reference databases will probably win eventually

Link: https://youtu.be/HkwFdzFLZ0I

7. Alpha Diversity

• Alpha diversity measures diversity within communities
• Beta diversity measures diversity between communities
• Rarefaction determines saturation
• There is room for experimental validation
• Different ways to calculate alpha diversity
• species count
• phylogenetic diversity (PD)
• Chao1 estimator

Link: https://youtu.be/9ZvoR89HYP8
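
Two of the alpha diversity measures above, plus rarefaction, can be sketched in a few lines. This is a minimal illustration, not the QIIME implementation. The function names are hypothetical, the input is assumed to be a list of per-OTU read counts for one sample, and Chao1 is written in its bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singletons and doubletons.

```python
import random


def observed_richness(counts):
    """Species count: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed form, see lead-in)."""
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return observed_richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement and return new per-OTU counts."""
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rarefied = [0] * len(counts)
    for otu in random.Random(seed).sample(reads, depth):
        rarefied[otu] += 1
    return rarefied


sample = [10, 5, 1, 1, 2, 0, 1]
print(observed_richness(sample))  # 6
print(chao1(sample))              # 7.5
print(observed_richness(rarefy(sample, depth=10)))
```

Computing richness at increasing rarefaction depths and checking whether it levels off is the idea behind using rarefaction to judge saturation.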

8. Beta Diversity

• Beta diversity measures diversity between communities
• Different ways to calculate beta diversity
• Euclidean distance
• Chi-square distance; chi-square is usually best for gradients
• Bray-Curtis
• Most people use Bray-Curtis or UniFrac
• Visualization with PCoA

Link: https://youtu.be/lcbp6EecDg4
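
As a small worked example of two of the distances above, the sketch below computes Euclidean distance and Bray-Curtis dissimilarity between two samples. The function names are hypothetical, and each sample is assumed to be a list of counts over the same ordered set of taxa.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two abundance vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum of |a_i - b_i| divided by sum of (a_i + b_i)."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


sample_1 = [10, 0, 5]
sample_2 = [5, 5, 5]
print(euclidean(sample_1, sample_2))
print(bray_curtis(sample_1, sample_2))
```

Bray-Curtis ranges from 0 (identical counts) to 1 (no shared taxa). A matrix of such pairwise values is what PCoA takes as input.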

9. UniFrac

• Beta diversity using UniFrac

Link: https://youtu.be/M8ylvsS0MHg

10. Statistical testing part 1

• Statistics basics
• Linear models are not always appropriate
• Non-parametric tests (no distribution assumptions)
• Generalized linear models (better underlying distributions)

Link: https://youtu.be/_uDv7LRUUsY

11. Statistical testing part 2

• Statistics basics
• t-test: compare 2 groups
• ANOVA: compare three or more groups
• Correlation: compare to a continuous variable (e.g. age)
• Linear regression: similar to correlation, but you can regress on multiple variables at the same time
• NOTE: all of these assume normal distributions!
• When linear regression tests do not have normally distributed residuals, use a generalized linear model with the negative binomial distribution. This is in the edgeR package in R.
• Use false discovery rate (FDR) to correct for multiple hypothesis testing.
• If you don't need to control for confounders, non-parametric tests are very safe (although lower power than linear models or generalized linear models).
• Two-category test: Mann-Whitney U (Wilcoxon) test (like a t-test)
• Multi-category test: Kruskal-Wallis (like ANOVA)
• Continuous test: Spearman correlation (like Pearson correlation)

Link: https://youtu.be/tNxfYqa5Rtc
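
The FDR correction mentioned above is often done with the Benjamini-Hochberg procedure. The notes do not say which FDR method the course uses, so treat this as one common choice. The function name `benjamini_hochberg` is hypothetical, and the input is assumed to be one p-value per taxon tested.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.3f}  adjusted = {q:.3f}")
```

A taxon is then called significant when its adjusted value falls below the chosen FDR level, for example 0.05.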

12. Visualizing Microbiome Diversity, Ordination

• Visualization with R or QIIME
• PCA
• PCoA
• NMDS

Link: https://youtu.be/H-u2iyiTzj0

13. Detrending and detecting gradients

• Detrending with QIIME
• Detrending does not have strong statistical foundations
• Use detrending for visualizing a primary gradient
• Use detrending to test whether your ordination recovered the primary gradient in axis 1

Link: https://youtu.be/aNLPzdfivkM

14. Constrained Ordination

• CCA does direct gradient analysis
• Never use more than 3-4 variates
• More will simply overfit the data
• Measure success by ratio of constrained variance explained to unconstrained variance explained
• Canonical correspondence analysis == constrained correspondence analysis
• Not to be confused with canonical correlation analysis

Link: https://youtu.be/wHSECEI2tnQ

15. Clustering

• Use caution with supervised ordination - need to assess significance carefully
• Prediction strength >0.9 or Silhouette index >0.5
• Clusters can be useful ways to analyze high-dimensional data
• However, direct analysis is generally better when you have known gradients/groups
• Diagnostics based on direct supervised analysis generally better

Link: https://youtu.be/ORX968xJqiA
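
To make the silhouette threshold above concrete, here is a minimal sketch of the mean silhouette index. The function name `silhouette_index` is hypothetical, points are assumed to be coordinate tuples (for example positions on ordination axes), and Euclidean distance is an assumption; a beta diversity distance could be substituted.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Group distances from point i to every other point by cluster label.
        dists = {c: [] for c in clusters}
        for j, (q, lab) in enumerate(zip(points, labels)):
            if j != i:
                dists[lab].append(math.dist(p, q))
        if not dists[own]:
            scores.append(0.0)  # convention for a cluster with a single member
            continue
        a = sum(dists[own]) / len(dists[own])  # mean distance within own cluster
        b = min(sum(d) / len(d) for c, d in dists.items() if c != own and d)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(silhouette_index(points, [0, 0, 1, 1]))  # well separated clusters
print(silhouette_index(points, [0, 1, 0, 1]))  # clusters that ignore the structure
```

The first labeling scores well above the 0.5 guideline, while the second scores below zero, which is the kind of contrast the guideline is meant to detect.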

16. Supervised Learning Background

• Supervised learning tries to learn a model that will predict outcomes for novel samples
• Example: classify cancer patients to determine treatment path
• Models have to balance low complexity (underfitting) and high complexity (overfitting)
• Model accuracy should be assessed in separate test data that it has never seen
• 10-fold cross validation is standard

Link: https://youtu.be/-eXnrA_3xzA
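
The cross-validation idea above comes down to how samples are split. The sketch below shows only the splitting step, with no model attached. The function name `k_fold_splits` and the fixed random seed are assumptions for illustration.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        splits.append((train, test))
    return splits


for train, test in k_fold_splits(n_samples=20, k=10):
    # A model would be fit on `train` and scored on `test` here.
    print(len(train), sorted(test))
```

Every sample is held out exactly once, so the accuracy averaged over folds is always measured on data the model did not see during training.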

17. Supervised Learning Applications

• Random forest classification with QIIME

Link: https://youtu.be/ecz5SzP6Z_U

18. Source Tracking

• How source tracking works and how to apply SourceTracker
• Microbial source tracking can be done at the community-wide level
• SourceTracker uses Bayesian methods to deconvolute mixtures of communities
• Can identify contributions of individual species from each source environment
• Does not model changes after mixing (temporal dynamics)
• SourceTracker: github.com/danknights/sourcetracker/releases

Link: https://youtu.be/sDevHMuYJ28

19. Compositionality

• Compositionality can cause spurious and even opposite conclusions
• Dominant bugs can skew the relative abundance of minor bugs
• Correlation is hard to infer
• See SparCC, SPIEC-EASI
• Best to do analysis with absolute abundances when possible
• Spike-ins of foreign bugs and/or qPCR can circumvent this

Link: https://youtu.be/X60nFYpLWRs
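
A toy calculation shows how a dominant taxon can skew the relative abundance of the others. The numbers and taxon names below are invented purely for illustration, and `relative_abundance` is a hypothetical helper.

```python
def relative_abundance(counts):
    """Convert absolute counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Absolute abundances: only taxon A blooms; B and C do not change at all.
before = {"A": 100, "B": 100, "C": 100}
after = {"A": 700, "B": 100, "C": 100}

rel_before = relative_abundance(before)
rel_after = relative_abundance(after)

for taxon in before:
    print(taxon, before[taxon], after[taxon],
          round(rel_before[taxon], 3), round(rel_after[taxon], 3))
```

In relative terms B and C appear to drop to a third of their former share even though their absolute abundance is unchanged, which is why spike-ins or qPCR are suggested for recovering absolute scale.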

20. PICRUSt and predicting functions

• Shotgun metagenomics can describe the full functional repertoire of a metagenome, but it is expensive
• PICRUSt can produce 80-85% accurate metagenomes from 16S data sets
• Useful for mining published data
• Can be used to select a subset of 16S samples for shotgun sequencing
• Be sure to treat the results as "suggestive only" in publications
• Mostly useful on human gut samples

Link: https://youtu.be/mPQCl_cHCsM

21. Shotgun Taxonomy

• Shotgun metagenomics can be used for identifying species
• Far superior to 16S
• Approaches to shotgun taxonomy
• MetaPhlAn and MetaPhlAn2
• Pre-identify a set of marker genes
• Genes that are conserved within a species but not elsewhere
• Requires alignment, but uses small database
• Kraken, others
• Use all unique k-mers as markers
• Ultrafast, but large database

Link: https://youtu.be/DlQTXdb2rhg
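
The "unique k-mers as markers" idea can be sketched with a tiny in-memory index. This is a deliberately simplified illustration and not Kraken's actual algorithm or data structures: k-mers shared between references are simply discarded here, and the function names, toy sequences, and k = 4 are all assumptions.

```python
from collections import Counter


def kmers(seq, k):
    """Set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_unique_kmer_index(references, k):
    """Map each k-mer found in exactly one reference to that reference's name."""
    owner = {}
    shared = set()
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            if kmer in shared:
                continue
            if kmer in owner and owner[kmer] != name:
                del owner[kmer]  # seen in two references, so not a marker
                shared.add(kmer)
            else:
                owner[kmer] = name
    return owner


def classify_read(read, index, k):
    """Vote with the read's marker k-mers; None if it contains no markers."""
    votes = Counter(index[kmer] for kmer in kmers(read, k) if kmer in index)
    return votes.most_common(1)[0][0] if votes else None


references = {"species_A": "ACGTACGTTAGC", "species_B": "TTGACCGTTAGG"}
index = build_unique_kmer_index(references, k=4)
for read in ["ACGTACGT", "TTGACCG", "CGTTAG"]:
    print(read, classify_read(read, index, k=4))
```

Lookup is a plain dictionary access per k-mer, which hints at why this style of classifier is fast, and the index grows with the number of distinct k-mers, which hints at why the database is large.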

If you have read this far, congratulations on finding the hidden bonus~ I have mirrored the full series for you

Link: https://pan.baidu.com/s/194r0zs5WbcNFQKQrV0Nnkg Password: 0rjr

