Tag Archives: WGS

New paper out: metagenomics study of poultry production environments

I am happy to say that my collaborators in the Department of Occupational and Environmental Health here at the University of Iowa and I have had our recent work on the bacterial composition of poultry bioaerosols (i.e., the dust that poultry workers breathe during their tasks) published in Microbial Biotechnology.

The key figure from this work is the following heat map that illustrates the top taxa that are common to all 21 samples:


What is remarkable about whole-genome shotgun metagenomics is that we are not only surveying bacterial DNA, but also viral, fungal, archaeal, and eukaryotic DNA in one experiment.  You can see from the figure that certain viruses are found in all samples, but it is bacteria, particularly Lactobacillus and Salinicoccus, that are the most abundant.

Stay tuned, because we will have a paper coming out soon on the fungal composition of these samples as well. This paper and our next manuscript mark the first time whole-genome shotgun metagenomics has been applied to the field of environmental health in poultry environments.


Mutational signatures in cancer with DNA-Seq

A recent collaboration with a clinician here at UI Hospitals and Clinics introduced me to the idea of mutational signatures in cancer.  Characterizing mutational signatures is made possible by the falling cost and increasing accuracy of whole-genome sequencing methods.  Tumors are sequenced across the entire genome, and the catalog of somatic mutations (i.e., SNPs) is used to compute the mutational signatures of a tumor’s genome.

The idea is that the collection of somatic mutations found in a tumor is the result of a variety of defective DNA-repair or DNA-replication machinery combined with the action of known or unknown mutagens and environmental exposures.  These processes operate over time and leave a “footprint” in the tumor DNA that can be examined.  The sum of all of the mutational processes operating within a tumor cell is a distinct mutational “signature” that differs by tumor type.

For example, in lung cancer, the bulk of somatic mutations are C>A transversions resulting from chronic exposure to tobacco smoke.  In melanoma, the predominant mutation type is C>T and CC>TT at dipyrimidines, a mutation type associated with UV-light exposure.  And in colorectal cancer, defective DNA mismatch repair contributes the majority of the mutations.
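To make the mutation types above concrete, here is a minimal sketch of how single-base substitutions are usually collapsed into six classes (C>A, C>G, C>T, T>A, T>C, T>G) by always reporting the change from the pyrimidine strand. The function names and the toy input are my own illustrations, not taken from any published tool, and the mutation list is made up.

```python
from collections import Counter

# Watson-Crick complements, used to flip purine-reference mutations
# onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return the pyrimidine-based class (e.g. 'C>A') for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in "AG":  # purine reference: report the change on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def class_spectrum(mutations):
    """Count the six substitution classes in an iterable of (ref, alt) pairs."""
    return Counter(substitution_class(ref, alt) for ref, alt in mutations)


# Hypothetical toy input, not real tumor data.
toy_mutations = [("C", "A"), ("G", "T"), ("C", "T"), ("G", "A"), ("T", "C")]
spectrum = class_spectrum(toy_mutations)
```

With this convention a G>T call on the forward strand is counted as C>A, which is why a smoking-associated spectrum can be summarized with the single label C>A.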

Mutational signature of a cancer produced by the operation of several mutational processes over time. From http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990474/figure/fig0005/. Used under CC License BY 3.0.

A recent paper in Nature has formalized this notion of mutational signatures in tumors and provided a mathematical framework (written in MATLAB) for assessing how many and which signatures are operational within an uncharacterized tumor type (generally there are between 2 and 6 processes).

In the paper, the authors analyzed almost 5 million somatic cancer SNPs and identified 21 unique signatures of mutational processes through a mathematical process of deconvolution, followed by experimental validation.  A curated catalog of the most current signatures based on available sequence data can be found at the COSMIC database.
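As I understand it, the deconvolution step rests on non-negative matrix factorization (NMF): a matrix of mutation counts (mutation types by tumors) is approximated as the product of a signature matrix and an exposure matrix. The sketch below is a toy NumPy version using the classic multiplicative updates, not the authors' MATLAB code. The function name `nmf`, the six-channel toy signatures, and the exposure numbers are all assumptions made for illustration.

```python
import numpy as np


def nmf(V, n_signatures, n_iter=500, seed=0, eps=1e-9):
    """Factor V (channels x tumors) into W (channels x signatures)
    and H (signatures x tumors) with multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_channels, n_tumors = V.shape
    W = rng.random((n_channels, n_signatures)) + eps
    H = rng.random((n_signatures, n_tumors)) + eps
    for _ in range(n_iter):
        # Updates for the squared-error objective; they keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each signature (column of W) sums to 1; H absorbs the scale,
    # which leaves the product W @ H unchanged.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]


# Two made-up signatures over six mutation channels (each column sums to 1).
true_W = np.array([
    [0.70, 0.05],
    [0.10, 0.05],
    [0.05, 0.60],
    [0.05, 0.10],
    [0.05, 0.10],
    [0.05, 0.10],
])
# Made-up exposures (mutations contributed by each signature) for 8 tumors.
true_H = np.random.default_rng(1).integers(50, 500, size=(2, 8)).astype(float)
V = true_W @ true_H

W, H = nmf(V, n_signatures=2)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

This toy version fixes the number of signatures in advance. Choosing that number is the hard part of the real problem and is not attempted here.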

In part 2 of this post, I’ll go into more detail on the mutational signatures and link to some Python code I’ve written to help get flat-file lists of SNPs into the correct form for easy input into the MATLAB framework.
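In the meantime, here is a rough sketch of the general idea, separate from the code promised for part 2. Signature analyses commonly count substitutions in 96 channels: six pyrimidine-based substitution classes times the 16 possible flanking-base pairs. The flat-file layout assumed here (tab-separated sample name, reference trinucleotide, and alternate base) is hypothetical, as are the names `channel` and `build_catalog`. A real pipeline would look up the flanking bases in a reference genome.

```python
import csv
import io
from collections import Counter
from itertools import product

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The 96 channels in a fixed order, e.g. 'A[C>A]A', 'A[C>A]C', ...
CHANNELS = [
    f"{five}[{ref}>{alt}]{three}"
    for ref, alts in (("C", "AGT"), ("T", "ACG"))
    for alt in alts
    for five, three in product("ACGT", repeat=2)
]


def channel(context, alt):
    """Map a trinucleotide context and alt base to a label like 'A[C>T]G'.

    Assumes valid uppercase or lowercase A/C/G/T input; no validation is done.
    """
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":  # purine reference: use the reverse complement
        context = context.translate(COMPLEMENT)[::-1]
        alt = alt.translate(COMPLEMENT)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def build_catalog(handle):
    """Read (sample, context, alt) rows and return {sample: 96 counts}."""
    counts = {}
    for sample, context, alt in csv.reader(handle, delimiter="\t"):
        counts.setdefault(sample, Counter())[channel(context, alt)] += 1
    return {s: [c[ch] for ch in CHANNELS] for s, c in counts.items()}


# Hypothetical three-row flat file, not real data.
toy_file = io.StringIO("tumor1\tACG\tT\ntumor1\tCGT\tA\ntumor2\tTTA\tC\n")
catalog = build_catalog(toy_file)
```

Each sample ends up with a fixed-length count vector, and stacking those vectors gives the mutation-type-by-tumor matrix that a deconvolution method takes as input.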