Readings: "Steps toward broad-spectrum therapeutics: discovering virulence-associated genes present in diverse human pathogens"

Reading Stubben et al: "Steps toward broad-spectrum therapeutics: discovering virulence-associated genes present in diverse human pathogens"
BMC Genomics, Volume 10, Issue 501, 2009

I found this paper interesting for several reasons: first, it follows a comparative genomics approach; second, it involves data mining; and third, it is a step toward abstracting from individual pathogenic organisms to a more general understanding of virulence at the biomolecular level.

In the abstract, the authors summarize the objective of their work:
"New and improved antimicrobial countermeasures are urgently needed to counteract increased resistance to existing antimicrobial treatments and to combat currently untreatable or new emerging infectious diseases. We demonstrate that computational comparative genomics, together with experimental screening, can identify potential generic (i.e., conserved across multiple pathogen species) and novel virulence-associated genes that may serve as targets for broad-spectrum countermeasures."
It is interesting that the authors investigate virulence as a possible target in combating pathogens. The rationale is to avoid killing the bacteria outright, which fosters the development of antibiotic-resistant strains. Instead, the idea is to "disarm" the bacteria by targeting and eventually disabling their virulence factors.

Virulence: The damage a pathogen causes to the host during infection

The authors thus attempted to computationally identify virulence-associated proteins. The idea was to consider all proteins produced by a set of bacteria, both pathogenic and non-pathogenic, group the proteins by similarity across species, and then look for groups of proteins that are present in all pathogenic organisms, but in none of the non-pathogenic ones.
To this end, they obtained a collection of 617,000 proteins from all the 214 microbial genomes that have been completely sequenced to date. They then aligned each of these 617,000 proteins against all others, selected the 1000 best hits for each protein and grouped these hits using a clustering technique. They then recorded which clusters were represented in which organisms. With this information, it was possible to search for clusters that were associated with many organisms known to be pathogenic, but with only a few non-pathogenic organisms.
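
To make the pipeline more concrete, here is a minimal Python sketch of the last two steps: grouping proteins from pairwise hits, then filtering the groups by the organisms they occur in. It is not the authors' implementation. The clustering here is plain single-linkage (connected components via union-find), used as a stand-in because the paper's actual clustering technique is not described above. The protein and organism names, the toy hit list, and the two thresholds `min_path_frac` and `max_nonpath_frac` are all made up for illustration.

```python
from collections import defaultdict


def cluster_proteins(proteins, hits):
    """Group proteins into connected components of the hit graph (union-find).

    Assumption: single-linkage grouping stands in for the paper's clustering.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for p in proteins:
        clusters[find(p)].add(p)
    return list(clusters.values())


def enriched_clusters(clusters, protein_to_organism, pathogens, nonpathogens,
                      min_path_frac=0.5, max_nonpath_frac=0.1):
    """Keep clusters found in many pathogens but in few non-pathogens.

    The two thresholds are hypothetical, not values from the paper.
    """
    results = []
    for members in clusters:
        organisms = {protein_to_organism[p] for p in members}
        path_frac = len(organisms & pathogens) / len(pathogens)
        nonpath_frac = len(organisms & nonpathogens) / len(nonpathogens)
        if path_frac >= min_path_frac and nonpath_frac <= max_nonpath_frac:
            results.append((sorted(members), path_frac, nonpath_frac))
    return results


# Toy data (entirely hypothetical): six proteins from five organisms.
protein_to_organism = {
    "p1": "path_1", "p2": "path_2", "p3": "path_3",
    "p4": "comm_1", "p5": "comm_2", "p6": "path_1",
}
hits = [("p1", "p2"), ("p2", "p3"), ("p4", "p5"), ("p5", "p6")]
pathogens = {"path_1", "path_2", "path_3"}
nonpathogens = {"comm_1", "comm_2"}

clusters = cluster_proteins(list(protein_to_organism), hits)
candidates = enriched_clusters(clusters, protein_to_organism,
                               pathogens, nonpathogens)
print(candidates)
```

In this toy example only the cluster containing p1, p2 and p3 survives, since it covers all three pathogens and neither commensal. Setting `min_path_frac=1.0` and `max_nonpath_frac=0.0` gives the strict "all pathogens, no non-pathogens" criterion; the looser defaults correspond to the "many pathogens, few non-pathogens" search.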

In closing, the authors again underline the potential benefit of targeting virulence factors rather than deploying broad-spectrum antibiotics:
"An advantage of our approach is that commensal flora, which often play important roles in the well-being of humans, should be minimally affected. This is dramatically illustrated in the development of Clostridium difficile-associated colitis where the administration of broad-spectrum antibiotics significantly impacts the commensal gut flora producing an environment where the pathogenic C. difficile can proliferate [39]. Additional grounds for targeting virulence per se is furnished by recent metagenomic studies in humans, which suggest that the human metagenome contains several orders of magnitude more microbial genes than Homo sapiens genes and that our bodies themselves contain perhaps ten times as many microbial cells as “human” ones [40, 41]."

  • For background reading on clustering, an unsupervised machine learning technique, try Chapter 11, "Unsupervised learning and Clustering" in the book "Pattern Classification" by Duda, Hart and Stork
  • Basic Local Alignment Search Tool (BLAST), online version available at
  • Database of Essential Genes (DEG), a database documenting genes that are known to be critical to the viability of an organism. Available at

