Improving metagenome binning and assembly with deep variational autoencoders
Author: Xiaoke Robot
Published: 2021/1/5 16:19:03
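A recent study from Simon Rasmussen's group at the University of Copenhagen, Denmark, used deep variational autoencoders to improve metagenome binning and assembly. The findings were published in Nature Biotechnology on January 4, 2021.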
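The researchers developed variational autoencoders for metagenomic binning (VAMB), a program that uses deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering. They showed that a variational autoencoder can integrate these two distinct data types without any prior knowledge of the datasets. VAMB outperforms existing state-of-the-art binners, reconstructing 29–98% more near-complete (NC) genomes on simulated data and 45% more on real data.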
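The two input signals described above, k-mer composition and per-sample coabundance, can be illustrated with a minimal sketch. This is not VAMB's own preprocessing: the function names, the choice of k, the simple sum-to-one normalisation, and the example contig are all assumptions made for illustration. It shows only how one contig could be turned into a single feature vector that an encoder might take as input.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return the normalised k-mer frequency vector of a DNA sequence."""
    seq = seq.upper()
    # All 4**k possible k-mers over A, C, G, T, in a fixed order.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only unambiguous k-mers are counted; anything containing N is ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def normalise_abundance(depths):
    """Scale the per-sample read depths of one contig so they sum to 1."""
    total = sum(depths)
    if total == 0:
        return [0.0] * len(depths)
    return [d / total for d in depths]


def contig_features(seq, depths, k=4):
    """Concatenate composition and coabundance into one feature vector."""
    return kmer_frequencies(seq, k) + normalise_abundance(depths)


# Hypothetical contig with its mean read depth in three samples.
features = contig_features("ACGTACGTTTGACCA", [12.0, 0.0, 4.0], k=2)
print(len(features))  # 16 dinucleotide frequencies plus 3 abundances
```

In a pipeline of the kind the article describes, vectors like these would be encoded into a lower-dimensional latent space and the contigs clustered there. The encoder and the clustering step are outside the scope of this sketch.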
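In addition, VAMB can separate closely related strains with up to 99.5% average nucleotide identity (ANI). From a dataset of 1,000 human gut microbiome samples, it reconstructed 255 and 91 sample-specific NC genomes of Bacteroides vulgatus and Bacteroides dorei, respectively, as two distinct clusters. The researchers used 2,606 NC bins from this dataset to show that species of the human gut microbiome have different geographical distribution patterns. VAMB runs on standard hardware and is freely available at https://github.com/RasmussenLab/vamb.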
According to the authors, despite recent advances in metagenomic binning, reconstructing microbial species from metagenomics data remains challenging.
Appendix: original English text
Title: Improved metagenome binning and assembly using deep variational autoencoders
Author: Jakob Nybo Nissen, Joachim Johansen, Rosa Lundbye Allesøe, Casper Kaae Sønderby, Jose Juan Almagro Armenteros, Christopher Heje Grønbech, Lars Juhl Jensen, Henrik Bjørn Nielsen, Thomas Nordahl Petersen, Ole Winther, Simon Rasmussen
Issue&Volume: 2021-01-04
Abstract: Despite recent advances in metagenomic binning, reconstruction of microbial species from metagenomics data remains challenging. Here we develop variational autoencoders for metagenomic binning (VAMB), a program that uses deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering. We show that a variational autoencoder is able to integrate these two distinct data types without any previous knowledge of the datasets. VAMB outperforms existing state-of-the-art binners, reconstructing 29–98% and 45% more near-complete (NC) genomes on simulated and real data, respectively. Furthermore, VAMB is able to separate closely related strains up to 99.5% average nucleotide identity (ANI), and reconstructed 255 and 91 NC Bacteroides vulgatus and Bacteroides dorei sample-specific genomes as two distinct clusters from a dataset of 1,000 human gut microbiome samples. We use 2,606 NC bins from this dataset to show that species of the human gut microbiome have different geographical distribution patterns. VAMB can be run on standard hardware and is freely available at https://github.com/RasmussenLab/vamb.
DOI: 10.1038/s41587-020-00777-4
Source: https://www.nature.com/articles/s41587-020-00777-4