At ISMB in Stockholm in July 2009, Bujie Zhan et al. of the Department of Genetics and Biotechnology at Aarhus University presented a poster comparing several de novo assembly algorithms, with the aim of outlining the most efficient de novo assembly strategy for bacterial genomes.
CLC bio’s de novo assembly algorithm was compared to Edena, SSAKE, Velvet, and 454’s Newbler, and was found to give the largest contig size with Solexa data. The performance of CLC bio’s assembler was especially good in hybrid assemblies of data from both 454 and Solexa instruments.
Click here to download the “Comparison of assembly strategies for high-throughput de novo sequencing of bacterial genomes” poster (80 KB) in PDF format.
This Monday, Stephen Quake et al. published an article in Nature Biotechnology on the first human genome sequenced with a Single-Molecule Sequencing approach using Helicos’ platform.
The publication has attracted considerable press coverage, including in The New York Times and The Independent, while the scientific community has blogged and tweeted extensively on the subject.
We have assembled a small collection of links on the subject that are definitely worth reading!
Bio-IT World:
Quake Sequences Personal Genome Using Helicos Single-Molecule Sequencing
The Single Life: Stephen Quake Q&A
Time Online:
Inflated claims for the $50,000 genome
Last week, Nicola Palmieri and Christian Schlötterer of the Institut für Populationsgenetik, Veterinärmedizinische Universität Wien, Vienna, Austria, published an article in PLoS ONE, Vol. 4, No. 7 (28 July 2009), e6323, titled “Mapping Accuracy of Short Reads from Massively Parallel Sequencing and the Implications for Quantitative Expression Profiling”.
These are the conclusions of the article:
In complex genomes, expression profiling by massively parallel sequencing could introduce a considerable bias due to incorrectly mapped sequence reads if the read length is short. Nevertheless, this bias could be accounted for if the genomic sequence is known. Furthermore, sequence polymorphisms and indels also affect the mapping accuracy and may cause a biased gene expression measurement. The choice of the mapping software is highly critical and the reliability depends on the presence/absence of indels and the divergence between reads and the reference genome. Overall, we found SSAHA2 and CLC to produce the most reliable mapping results.
We’re of course very happy to see our own internal findings confirmed in an independent study, even though they only had our command-line application CLC NGS Cell running at a fraction of the speed it’s capable of while still retaining full quality.
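For readers wondering why short reads are harder to place, here is a minimal Python sketch of the general idea. It is not the method used in the article or in any of the mappers mentioned; the toy genome, the repeat and the function name unique_fraction are all made up for illustration. It simply counts how many error-free reads of a given length occur exactly once in a small sequence that contains a repeated stretch.

```python
from collections import Counter


def unique_fraction(genome: str, read_len: int) -> float:
    """Fraction of read start positions whose read occurs exactly once in the genome."""
    # Every possible error-free read of this length, one per start position.
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    unique = sum(1 for read in reads if counts[read] == 1)
    return unique / len(reads)


# Hypothetical toy genome: the same 12-base stretch inserted twice
# between unrelated flanking sequences.
repeat = "ACGTTGCAAGCT"
genome = "GGATCCTA" + repeat + "TTAGGCAT" + repeat + "CCGATAAG"

for read_len in (4, 8, 16):
    print(read_len, round(unique_fraction(genome, read_len), 2))
```

Reads that fall entirely inside the repeated stretch match two locations, so a mapper cannot tell which copy they came from. Reads longer than the repeat carry some flanking sequence along, which is what usually resolves the ambiguity. Real data adds sequencing errors, polymorphisms and indels on top, which this sketch ignores.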
Via GenomeWeb, I had quite a laugh at Keith Robinson’s recent blog post on Metagenomic Analysis with Galaxy: Windshield Genomics and Beyond - a dataset he came across while browsing the NCBI Short Read Archive, with the following abstract:
When I drive through Pennsylvania in June my windshield gets quite dirty with all these bugs. Yet do I know what they are? How many beetles versus butterflies? Is there a difference between day and night? Is there a difference between Pennsylvania and Connecticut? So we scraped the windshield, isolated genomic DNA, and subjected it to 454 FLX sequencing. We then uploaded the data into Galaxy and attempted answering these questions. In the end Pennsylvania turned out to be different from Connecticut.
It’s certainly an alternative use of high-throughput sequencing capacity at this point, but a fun one!
On a related note, I might add that I would like to read a physics article on windshield dynamics, as I, through ongoing empirical observations, experience a dramatic increase in the number of bugs squashed on my windshield if my average commuting speed is raised by around 10%…
Research by a group of Montreal scientists calls into question one of the most basic assumptions of human genetics: that when it comes to DNA, every cell in the body is essentially identical to every other cell.
This discovery sprang from an investigation into the underlying genetic causes of abdominal aortic aneurysms (AAA) led by Dr. Morris Schweitzer, Dr. Bruce Gottlieb, Dr. Lorraine Chalifour and colleagues at McGill University and the affiliated Lady Davis Institute for Medical Research at Montreal’s Jewish General Hospital. The researchers focused on BAK, a gene that controls cell death.
What they found surprised them.
AAA is one of the rare vascular diseases where tissue samples are removed as part of patient therapy. When they compared them, the researchers discovered major differences between BAK genes in blood cells and tissue cells coming from the same individuals, with the suspected disease “trigger” residing only in the tissue. Moreover, the same differences were later evident in samples derived from healthy individuals.
Click here to read more at sciencedaily.com, including statements from Dr. Bruce Gottlieb from McGill University in Montreal, Canada.
Journal reference:
Gottlieb et al. BAK1 gene variation and abdominal aortic aneurysms. Human Mutation, 2009; 30 (7): 1043. DOI: 10.1002/humu.21046