Mar 25 2008

Launch of the Human Oral Microbiome Database

Tag: Research | rforsberg @ 21:33

To follow up on a previous post, researchers funded by the Human Microbiome Project have now launched the Human Oral Microbiome Database (HOMD). HOMD is intended as a service to researchers who are investigating the role of microbes in human health and disease, with particular emphasis on the oral environment. It is anticipated that the database can serve as a model for the gut, skin, and vaginal databases for the Human Microbiome Project. GenomeWeb has more.


Mar 18 2008

Sequence capture for next generation sequencing is offered commercially by NimbleGen

Tag: Technology | rforsberg @ 16:57

NimbleGen are now offering their high-density oligonucleotide microarrays as a programmable genomic selection technology that allows targeted sequencing of genomic regions such as exons.

Next-generation sequencing (NGS) technologies are based on random amplification of input DNA. This makes sample preparation easier but leaves the actual sequencing undirected. The idea of capture arrays is to insert a selection step before sequencing: the array is programmed to capture only the genomic regions of interest, so that users can devote the full capacity of the NGS machine to those regions.
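To put a rough number on why a selection step matters, here is a minimal sketch in Python. It assumes reads are sampled uniformly at random, so the share of reads landing in a target is simply the target size divided by the genome size. The genome size, target size, and post-capture on-target fraction are illustrative assumptions, not NimbleGen figures.

```python
def on_target_fraction(target_bp: int, genome_bp: int) -> float:
    """Expected share of reads in the target when reads are sampled uniformly at random."""
    return target_bp / genome_bp


def fold_enrichment(captured_fraction: float, target_bp: int, genome_bp: int) -> float:
    """How many times more on-target reads a capture step yields than undirected sequencing."""
    return captured_fraction / on_target_fraction(target_bp, genome_bp)


# Illustrative assumptions only (not vendor data):
GENOME_BP = 3_000_000_000   # approximate size of a human genome
TARGET_BP = 30_000_000      # hypothetical total size of the targeted regions
CAPTURED_FRACTION = 0.5     # hypothetical share of reads on target after capture

baseline = on_target_fraction(TARGET_BP, GENOME_BP)
enrichment = fold_enrichment(CAPTURED_FRACTION, TARGET_BP, GENOME_BP)

print(f"On-target share without capture: {baseline:.1%}")
print(f"Fold enrichment with capture:    {enrichment:.0f}x")
```

Under these assumed numbers, undirected sequencing would spend only about 1% of its reads on the target, which is the capacity that a capture step is meant to recover.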

Image from the NimbleGen website, illustrating the array capture process.


Mar 17 2008

When the manger is empty the horses bite each other - and so do bacteria!!

Tag: Research | rforsberg @ 17:41

An old Danish proverb says that when the manger is empty the horses bite each other. This idea has now been put to use by Anthony Sinskey’s research group at MIT and is described in a report by the MIT Technology Review. Sinskey’s group had previously produced the genome sequence of the soil-dwelling bacterium Rhodococcus fascians. Looking at the genome, they were surprised to find that this organism, not known for its antibiotic-producing powers, harbored a number of genes involved in the metabolism of antibiotic-like compounds. However, none of these genes seemed to be expressed when the bacterium was grown in the lab. To bring out the worst in it, the group decided to grow it in competition with a Streptomyces bacterium. After selection experiments, one strain of Rhodococcus was shown to excrete a novel antibiotic compound, dubbed rhodostreptomycin, which belongs to the same class of antibiotics as streptomycin, a tuberculosis drug.

The exact molecular mechanisms responsible for the new compound are still being worked out, but one fascinating preliminary finding is that the selected Rhodococcus strain seems to have assimilated a large chunk of DNA from the competing Streptomyces strain.

This is a fascinating example of the new picture of bacterial genomics that is emerging as a result of improved sequencing technology - for an introduction, I recommend this review by Raskin et al.


Mar 13 2008

Spanning the Great Wall with human genomic DNA - now for less than $60,000

Tag: Research, Technology | rforsberg @ 11:51

Applied Biosystems have released data to the public from the genome sequencing of a Yoruba Nigerian HapMap sample. In their press release, AB claim that the data were generated using only 7 runs of the SOLiD system, at a total sequencing cost of less than $60,000.

The data covered the genome 12-fold, and paired-end information provided a physical coverage of 100-fold, i.e. the coverage stemming from the inserted but unsequenced part of the paired-end reads. Millions of SNPs and a large number of structural variations were identified from the data.
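The two coverage figures can be related with a little arithmetic. The sketch below is a rough illustration: it takes the 36 billion bases quoted in AB's list of facts further down, assumes a genome of about 3 billion bases, and uses a made-up number of read pairs and unsequenced insert length to show how a 100-fold physical coverage could arise. The pair count and insert length are hypothetical and are not AB's actual library parameters.

```python
def sequence_coverage(total_bases: int, genome_bp: int) -> float:
    """Average number of times each genome position is covered by a sequenced base."""
    return total_bases / genome_bp


def physical_coverage(n_pairs: int, unsequenced_insert_bp: int, genome_bp: int) -> float:
    """Average number of read pairs whose unsequenced insert spans a genome position."""
    return n_pairs * unsequenced_insert_bp / genome_bp


GENOME_BP = 3_000_000_000      # assumption: approximate human genome size
TOTAL_BASES = 36_000_000_000   # figure quoted in AB's list of facts

# Hypothetical library, chosen only to illustrate the formula:
N_PAIRS = 200_000_000
UNSEQUENCED_INSERT_BP = 1_500

seq_cov = sequence_coverage(TOTAL_BASES, GENOME_BP)
phys_cov = physical_coverage(N_PAIRS, UNSEQUENCED_INSERT_BP, GENOME_BP)

print(f"Sequence coverage: {seq_cov:.0f}-fold")
print(f"Physical coverage: {phys_cov:.0f}-fold (hypothetical library)")
```

Physical coverage is much higher than sequence coverage because each pair spans far more bases than it actually reads, which is what makes paired-end data useful for detecting structural variation.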

As an amusing aside, AB gave these fun facts about the dataset:

  • If all 36 billion bases were spread out at 1 millimeter apart, they would extend 36,000 kilometers, or more than 4,000 times the height of Mt. Everest, which at 8,848 meters above sea level, is the highest mountain on Earth.
  • If all 36 billion bases were spread along the Great Wall of China at 1 millimeter apart, this would equate to spanning the 5,000 kilometer wall more than 7 times.
  • If a person were to proofread the 36 billion bases in this dataset at one letter per second for 24 hours per day, it would take 1,200 years to read the entire dataset.
  • If each base represented one individual in the world population, the dataset would account for more than 5 times the entire world population of 6.8 billion people.
  • This dataset, at 36 billion bases of DNA sequence, is equivalent to 360 times all of the 100 million visible stars in the Earth's galaxy.
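For the curious, the arithmetic behind these facts is easy to reproduce. The short Python sketch below uses only the figures quoted in the list, plus one assumption of mine (a year of 365.25 days). Most of the claims check out as stated; the proofreading time comes to roughly 1,140 years, so the quoted 1,200 years appears to be a generous rounding.

```python
BASES = 36_000_000_000          # dataset size quoted by AB
MM_PER_KM = 1_000_000

# One base per millimeter, expressed in kilometers.
length_km = BASES / MM_PER_KM

everest_multiples = length_km / 8.848        # Mt. Everest: 8,848 m
great_wall_spans = length_km / 5_000         # wall length as quoted: 5,000 km

# Assumption: a year of 365.25 days, reading one base per second nonstop.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
proofreading_years = BASES / SECONDS_PER_YEAR

population_multiples = BASES / 6_800_000_000  # world population as quoted
star_multiples = BASES / 100_000_000          # visible stars as quoted

print(f"Length at 1 mm per base: {length_km:,.0f} km")
print(f"Heights of Mt. Everest:  {everest_multiples:,.0f}")
print(f"Great Wall spans:        {great_wall_spans:.1f}")
print(f"Years of proofreading:   {proofreading_years:,.0f}")
print(f"World populations:       {population_multiples:.1f}")
print(f"Times the visible stars: {star_multiples:.0f}")
```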

Mar 11 2008

Evaluating the Illumina/Solexa Genome Analyzer for whole genome re-sequencing

Tag: Research, Technology | rforsberg @ 13:54

Does Solexa have problems with amplifying A/T-rich regions? I just read a really interesting paper by Hillier et al. in Nature Methods, which claims that this might be the case.

The paper is entitled “Whole-genome sequencing and variant discovery in C. elegans” and reports the use of Solexa technology to re-sequence two C. elegans specimens for variant discovery. The paper demonstrates that Solexa technology can be used for re-assembly of the C. elegans genome, especially when paired-end information is used.

However, it points to a general lack of coverage in A/T-rich regions (see figure 2 of the supplementary material), which leaves a number of zero-size gaps in the assembly: places where reads sit shoulder to shoulder but simply do not overlap. Having found these problematic A/T-rich regions, the authors went back and looked across the whole genome, where they found a general correlation between A/T content and read coverage. This correlation was stronger when examining a 200 bp window than when examining a 32 bp window. 200 bp corresponds to the size of the amplicons that are amplified during the cluster generation step prior to sequencing, and 32 bp corresponds to the number of cycles in the actual sequencing-by-synthesis procedure. This finding led Hillier et al. to conclude that failure to amplify A/T-rich regions during cluster generation is the cause of the low coverage (other explanations for the bias, such as hairpin formation, were also explored but discarded).
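The window comparison can be sketched in a few lines of Python. This is a toy illustration, not the authors' analysis: the genome and coverage are simulated, and the simulation deliberately makes coverage depend on the A/T content of each 200 bp block, so it shows only how such a comparison works and says nothing about real Solexa data. All names and parameters are my own.

```python
import math
import random


def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a sequence."""
    return sum(base in "AT" for base in seq) / len(seq)


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def at_vs_coverage(genome: str, coverage, window: int) -> float:
    """Correlate A/T fraction with mean coverage over non-overlapping windows."""
    at_values, mean_depths = [], []
    for start in range(0, len(genome) - window + 1, window):
        at_values.append(at_fraction(genome[start:start + window]))
        mean_depths.append(sum(coverage[start:start + window]) / window)
    return pearson(at_values, mean_depths)


# Simulated data (assumption): coverage drops with the A/T content of each
# 200 bp block, mimicking a bias introduced at the amplicon level.
rng = random.Random(0)
BLOCK = 200
blocks, coverage = [], []
for _ in range(500):
    p_at = rng.uniform(0.3, 0.9)
    block = "".join(
        rng.choice("AT") if rng.random() < p_at else rng.choice("CG")
        for _ in range(BLOCK)
    )
    depth = 30 * (1 - at_fraction(block))
    blocks.append(block)
    coverage.extend(depth + rng.gauss(0, 1) for _ in range(BLOCK))
genome = "".join(blocks)

r200 = at_vs_coverage(genome, coverage, 200)
r32 = at_vs_coverage(genome, coverage, 32)
print(f"200 bp windows: r = {r200:.2f}")
print(f" 32 bp windows: r = {r32:.2f}")
```

In this toy setup the correlation is stronger at 200 bp because that is the scale at which the bias was introduced; the A/T content of a 32 bp window is a noisier proxy for the A/T content of the amplicon it came from. That is the same logic the paper uses to point at cluster generation rather than the sequencing cycles.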

Unfortunately, the authors did not pursue a chemical explanation for the phenomenon and did not investigate other Solexa datasets for a similar trend. It is therefore premature to say whether this is a general property of the Solexa technology, but it definitely warrants the attention of people like us who design assembly and variant-detection algorithms.

