Mar 17 2008

When the manger is empty the horses bite each other - and so do bacteria!!

Tag: Research | rforsberg @ 5:41 pm

An old Danish proverb says that when the manger is empty the horses bite each other. This idea has now been put to use by Anthony Sinskey’s research group at MIT and is described in a report by the MIT Technology Review. Sinskey’s group had previously produced the genome sequence of the soil-dwelling bacterium Rhodococcus fascians. Looking at the genome, they were surprised to find that this organism, not known for its antibiotic-producing powers, harbored a number of genes involved in the metabolism of antibiotic-like compounds. However, none of these genes seemed to be expressed when the bacterium was grown in the lab. To bring out the worst in it, the group decided to grow the Rhodococcus in competition with a Streptomyces bacterium. After selection experiments, one Rhodococcus strain was shown to excrete a novel antibiotic compound, dubbed rhodostreptomycin, which belongs to the same class of antibiotics as streptomycin, a tuberculosis drug.

Work to infer the exact molecular mechanisms responsible for the new compound is still under way, but one fascinating preliminary finding is that the selected Rhodococcus strain seems to have assimilated a large chunk of DNA from the competing Streptomyces strain.

This is a fascinating example of the new picture of bacterial genomics that is emerging as a result of improved sequencing technology - for an introduction, I recommend this review by Raskin et al.


Mar 13 2008

Spanning the Great Wall with human genomic DNA - now for less than $60,000

Tag: Research, Technology | rforsberg @ 11:51 am

Applied Biosystems have released data to the public from the genome sequencing of a Yoruba Nigerian HapMap sample. In their press release, AB claim that the data were generated using only 7 runs of the SOLiD system and at a total sequencing cost of less than $60,000.

The data covered the genome 12-fold, and paired-end information provided a physical coverage of 100-fold, i.e. the coverage stemming from the part of each insert that is spanned by a read pair but not itself sequenced. Millions of SNPs and a large number of structural variations were identified from the data.

As an amusing aside, AB gave these funny facts about the dataset:

  • If all 36 billion bases were spread out at 1 millimeter apart, they would extend 36,000 kilometers, or more than 4,000 times the height of Mt. Everest, which, at 8,848 meters above sea level, is the highest mountain on Earth.
  • If all 36 billion bases were spread along the Great Wall of China at 1 millimeter apart, this would equate to spanning the 5,000 kilometer wall more than 7 times.
  • If a person were to proofread the 36 billion bases in this dataset at one letter per second for 24 hours per day, it would take 1,200 years to read the entire data set.
  • If each base represented one individual in the world population, the dataset would account for more than 5 times the entire world population of 6.8 billion people.
  • This dataset, at 36 billion bases of DNA sequence, is equivalent to 360 times all of the 100 million visible stars in the Earth's galaxy.
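
For readers who want to check the arithmetic, here is a minimal sketch that recomputes the comparisons from the figures quoted above. The variable names are my own, and the genome size of roughly 3 billion bases used for the coverage check is an assumption that does not come from the press release. With a 365-day year the proofreading time comes out closer to 1,140 years, so the 1,200 quoted by AB looks like a generous rounding.

```python
# Recompute the dataset comparisons using only the figures quoted in the post.
TOTAL_BASES = 36e9        # bases in the dataset
SPACING_M = 1e-3          # 1 millimeter between bases, in meters

length_km = TOTAL_BASES * SPACING_M / 1000        # total length in kilometers
everest_multiples = length_km * 1000 / 8848       # Mt. Everest: 8,848 m
great_wall_spans = length_km / 5000               # Great Wall: 5,000 km

seconds_per_year = 60 * 60 * 24 * 365             # reading around the clock
proofreading_years = TOTAL_BASES / seconds_per_year

population_multiples = TOTAL_BASES / 6.8e9        # world population: 6.8 billion
star_multiples = TOTAL_BASES / 100e6              # 100 million visible stars

# Assumption (not from the post): a human genome of roughly 3 billion bases.
ASSUMED_GENOME_SIZE = 3e9
fold_coverage = TOTAL_BASES / ASSUMED_GENOME_SIZE

print(f"length: {length_km:,.0f} km")
print(f"Everest heights: {everest_multiples:,.0f}")
print(f"Great Wall spans: {great_wall_spans:.1f}")
print(f"proofreading: {proofreading_years:,.0f} years")
print(f"world populations: {population_multiples:.1f}")
print(f"star multiples: {star_multiples:.0f}")
print(f"fold coverage (assumed genome size): {fold_coverage:.0f}")
```

Under that assumed genome size, 36 billion bases works out to the 12-fold coverage mentioned earlier.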

Mar 11 2008

Evaluating the Illumina/Solexa Genome Analyzer for whole genome re-sequencing

Tag: Research, Technology | rforsberg @ 1:54 pm

Does Solexa have problems with amplifying A/T rich regions? I just read a really interesting paper by Hillier et al. in Nature Methods which claims that this might be the case.

The paper is entitled “Whole-genome sequencing and variant discovery in C. elegans” and reports the use of Solexa technology to re-sequence two C. elegans specimens for variant discovery. The paper demonstrates the use of the Solexa technology for re-assembly of the C. elegans genome, especially when paired-end information is used.

However, it points to a general lack of coverage in A/T rich regions (see figure 2 of the supplementary material) which leaves a number of zero size gaps in the assembly - places where reads sit shoulder to shoulder but simply do not overlap. Having found these problematic A/T rich regions, the authors went back and took a look across the genome, where they found a general correlation between A/T content and read coverage. This correlation was stronger when examining a 200 bp window than when examining a 32 bp window. 200 bp corresponds to the size of the amplicons that are amplified during the cluster generation step prior to sequencing and 32 bp corresponds to the number of cycles in the actual sequencing by synthesis procedure. This finding made Hillier et al. conclude that failure to amplify A/T rich regions during cluster generation is the cause of the low coverage (other reasons for the bias such as hairpin formation were also explored but discarded).
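
To make the window comparison concrete, here is a minimal sketch of how one might correlate A/T content with read coverage for a given window size. This is not the method used by Hillier et al.; the function names, the use of non-overlapping windows and the toy data are all my own assumptions for illustration.

```python
def at_fraction(seq):
    """Fraction of A or T bases in a sequence string."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def at_coverage_correlation(seq, coverage, window):
    """Correlate A/T fraction with mean coverage over non-overlapping windows.

    `coverage` holds one read depth value per base of `seq`.
    """
    if len(seq) != len(coverage):
        raise ValueError("seq and coverage must have the same length")
    at_values, cov_values = [], []
    for start in range(0, len(seq) - window + 1, window):
        at_values.append(at_fraction(seq[start:start + window]))
        cov_values.append(sum(coverage[start:start + window]) / window)
    return pearson(at_values, cov_values)


# Toy data (made up): coverage drops as the A/T fraction of a window rises.
toy_seq = "ATATATAT" + "GCGCGCGC" + "ATATGCGC" + "AAAAAAAA"
toy_coverage = [2] * 8 + [10] * 8 + [6] * 8 + [2] * 8
r = at_coverage_correlation(toy_seq, toy_coverage, window=8)
print(round(r, 3))
```

On real data one would call `at_coverage_correlation` once with `window=200` and once with `window=32` and compare the two coefficients; the toy example only shows that a strongly negative value signals low coverage in A/T rich windows.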

Unfortunately, the authors did not pursue a chemical explanation for the phenomenon and did not investigate other Solexa datasets for a similar trend. It is therefore premature to say whether this is a general property of the Solexa technology, but it definitely warrants the attention of people like us who design assembly and variant detection algorithms.


Mar 07 2008

BGI to sequence Giant Panda genome in 6 months using next generation sequencing technology

Tag: Research | rforsberg @ 10:29 am

Our collaborators at Beijing Genomics Institute, Shenzhen recently announced the launch of the giant panda genome project. Through the use of next generation sequencing technology, the aim is to complete the panda genome within only six months.

The researchers involved will use the results to answer a number of questions regarding the animal's biology. These include the exact phylogenetic position of the panda and the genetics underlying the panda's extraordinary metabolism. Furthermore, the results will be used in the analysis of panda population genetics and conservation biology.


Mar 03 2008

George Church gets backed by Google in quest to sequence 100,000 humans

Tag: Research | rforsberg @ 10:06 am

I picked this up via Genome Technology. According to an article at Bloomberg.com, Church has launched an ambitious project to sequence the coding regions of 100,000 humans, a number that may even increase to a million.

The plan is to tie the genomic information to phenotypic information and health records of the sequenced individuals to create a unique data resource from which novel links between genetic variation and disease can be learned.

Google is one of the first companies to support the project and apparently plans to make its Google Health project a front-end to the collected data.

