Scientists develop a method for efficiently assembling human genomes
Author: Xiaoke Robot
Published: 2020/5/6 14:22:10
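Benedict Paten's group at the University of California, Santa Cruz, used nanopore sequencing and the Shasta toolkit to efficiently assemble 11 human genomes de novo. The results were published online in Nature Biotechnology on May 4, 2020.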
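To enable rapid human genome assembly, the researchers developed Shasta, a de novo long-read assembler, along with two polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and the Shasta toolkit, they assembled 11 highly contiguous human genomes de novo in 9 days.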
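Using three flow cells per sample, the researchers obtained roughly 63× coverage, a read N50 of 42 kb, and 6.5× coverage in reads longer than 100 kb. On a single commercial compute node, Shasta produced a complete haploid human genome assembly in under 6 hours.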
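The read N50 and coverage figures quoted above are summary statistics over read lengths. As a rough illustration of how such numbers are commonly computed, here is a minimal Python sketch. The function names `read_n50` and `coverage`, the toy read lengths, and the toy genome size are all assumptions made for this example; they are not taken from the study or from the Shasta toolkit.

```python
def read_n50(lengths):
    """Return the read N50: the largest length L such that reads of
    length >= L together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def coverage(lengths, genome_size, min_length=0):
    """Average sequencing depth contributed by reads of at least
    min_length bases, for a genome of genome_size bases."""
    return sum(l for l in lengths if l >= min_length) / genome_size


# Toy numbers for illustration only (hypothetical, not data from the study).
toy_reads = [100, 80, 50, 40, 30]  # read lengths in bases
toy_genome = 100                   # hypothetical genome size in bases

print(read_n50(toy_reads))                              # read N50
print(coverage(toy_reads, toy_genome))                  # total coverage
print(coverage(toy_reads, toy_genome, min_length=80))   # long-read coverage
```

Under these definitions, a statement such as "6.5× coverage in reads longer than 100 kb" would correspond to calling `coverage` with `min_length` set to 100,000 and the genome size set to the size of the human genome.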
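Using nanopore reads alone, MarginPolish and HELEN polished the haploid assemblies to more than 99.9% identity (Phred quality score QV = 30). Adding proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. The researchers compared Shasta's assembly performance with existing methods on diploid, haploid, and trio-binned human samples and report superior accuracy and speed.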
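The pairing of 99.9% identity with QV = 30 follows from the standard Phred definition, QV = -10 · log10(error rate). The short Python sketch below works through that arithmetic. The function names are hypothetical, and treating 1 minus identity as the per-base error rate is a simplifying assumption of this example, not a description of how the authors evaluated their assemblies.

```python
import math


def identity_to_qv(identity):
    """Convert a sequence identity in [0, 1) to a Phred-scaled quality
    value, assuming error rate = 1 - identity (a simplification)."""
    if not 0.0 <= identity < 1.0:
        raise ValueError("identity must be in [0, 1)")
    return -10.0 * math.log10(1.0 - identity)


def qv_to_identity(qv):
    """Inverse conversion: Phred-scaled quality value to identity."""
    return 1.0 - 10.0 ** (-qv / 10.0)


print(round(identity_to_qv(0.999), 2))  # 99.9% identity, about QV 30
print(round(qv_to_identity(30), 4))     # QV 30, about 0.999 identity
```

On this scale each additional 10 QV points corresponds to a tenfold lower error rate, so QV = 30 means roughly one error per 1,000 bases.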
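By way of background, de novo assembly of a human genome from nanopore long reads had been reported previously, but it required more than 150,000 CPU hours and weeks of wall-clock time.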
Appendix: original English text
Title: Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes
Author: Kishwar Shafin, Trevor Pesout, Ryan Lorig-Roach, Marina Haukness, Hugh E. Olsen, Colleen Bosworth, Joel Armstrong, Kristof Tigyi, Nicholas Maurer, Sergey Koren, Fritz J. Sedlazeck, Tobias Marschall, Simon Mayes, Vania Costa, Justin M. Zook, Kelvin J. Liu, Duncan Kilburn, Melanie Sorensen, Katy M. Munson, Mitchell R. Vollger, Jean Monlong, Erik Garrison, Evan E. Eichler, Sofie Salama, David Haussler, Richard E. Green, Mark Akeson, Adam Phillippy, Karen H. Miga, Paolo Carnevali, Miten Jain, Benedict Paten
Issue&Volume: 2020-05-04
Abstract: De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
DOI: 10.1038/s41587-020-0503-6
Source: https://www.nature.com/articles/s41587-020-0503-6