Confounding in *-seq experiments

I see a lot of confounded experimental designs.  Please read this important essay to understand why confounding must be avoided if you want quality results from expensive *-seq experiments.

Confounded designs ruin experiments. Current batch effect removal methods will not save you. If you are designing a large genomics experiment, learn about randomization.
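
To make the idea of randomization concrete, here is a minimal sketch of one way to spread samples across batches so that condition and batch are not confounded. It is an illustration under my own assumptions, not a method taken from the essay: the function name `assign_batches`, the sample labels, and the choice of three batches are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def assign_batches(samples, n_batches, seed=0):
    """Randomly assign samples to batches, balancing each condition across batches.

    samples: list of (sample_id, condition) pairs (hypothetical labels).
    Returns a dict mapping sample_id -> batch index in range(n_batches).
    """
    rng = random.Random(seed)

    # Group sample ids by condition so each condition is shuffled on its own.
    by_condition = defaultdict(list)
    for sample_id, condition in samples:
        by_condition[condition].append(sample_id)

    assignment = {}
    offset = 0  # carries over between conditions so batch sizes stay even
    for condition in sorted(by_condition):
        ids = by_condition[condition]
        rng.shuffle(ids)  # the random step: order within a condition is arbitrary
        for i, sample_id in enumerate(ids):
            assignment[sample_id] = (i + offset) % n_batches
        offset += len(ids)
    return assignment


# Hypothetical design: 6 control and 6 treated samples, 3 sequencing batches.
samples = [(f"ctrl_{i}", "control") for i in range(6)]
samples += [(f"trt_{i}", "treated") for i in range(6)]

assignment = assign_batches(samples, n_batches=3, seed=1)

# Tabulate condition by batch to check that no batch holds a single condition.
table = Counter((condition, assignment[sample_id]) for sample_id, condition in samples)
for (condition, batch), count in sorted(table.items()):
    print(condition, "batch", batch, "n =", count)
```

The confounded alternative would be to run all controls in one batch and all treated samples in another, in which case a batch effect and a treatment effect cannot be told apart.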


Top 75 in Bioinformatics by

This blog was named a “Top 75 in Bioinformatics” by!

I made the list at #58.  I’m proud of that fact, but I want to push into the top 30 on the internet.  I plan to increase my rate of posting new articles and also up my game on content and analysis.   Stay tuned!


P values in practice

I found a great article by Andrew Gelman at Columbia on how to think about P values from a Bayesian perspective.  Although I don’t really understand Bayesian statistics at this point, the article still had some very nice explanations of how to think about traditional P values as they are used in practice.

Some key points:

“The P value is a measure of discrepancy of the fit of a model or “null hypothesis” H to data y. Mathematically, it is defined as Pr(T(yrep)>T(y)|H), where yrep represents a hypothetical replication under the null hypothesis and T is a test statistic (ie, a summary of the data, perhaps tailored to be sensitive to departures of interest from the model).”

“[…] the P value is itself a statistic and can be a noisy measure of evidence. This is a problem not just with P values but with any mathematically equivalent procedure, such as summarizing results by whether the 95% confidence interval includes zero.”

“[…] we cannot interpret a nonsignificant result as a claim that the null hypothesis was true or even as a claimed probability of its truth. Rather, nonsignificance revealed the data to be compatible with the null hypothesis;”

“we accept that sample size dictates how much we can learn with confidence; when data are weaker, it can be possible to find reliable patterns by averaging.”

“The focus on P values seems to have both weakened [the] study (by encouraging the researcher to present only some of his data so as to draw attention away from nonsignificant results) and to have led reviewers to inappropriately view a low P value (indicating a misfit of the null hypothesis to data) as strong evidence in favor of a specific alternative hypothesis […] rather than other, perhaps more scientifically plausible, alternatives such as measurement error and selection bias.”
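
The definition in the first quote, Pr(T(yrep)>T(y)|H), can be approximated by simulation: draw many replicated data sets yrep under the null hypothesis, compute T on each, and count how often it exceeds the observed T(y). Below is a minimal sketch of that idea. The null model (independent standard normal values), the test statistic (absolute value of the mean), and all function names are my own assumptions for illustration; they are not taken from Gelman's article.

```python
import random
import statistics


def monte_carlo_p_value(y, simulate_null, test_statistic, n_rep=10_000, seed=0):
    """Estimate Pr(T(yrep) > T(y) | H) by simulating replications under H."""
    rng = random.Random(seed)
    t_obs = test_statistic(y)
    exceed = 0
    for _ in range(n_rep):
        y_rep = simulate_null(rng, len(y))  # one hypothetical replication under H
        if test_statistic(y_rep) > t_obs:
            exceed += 1
    return exceed / n_rep


def standard_normal_null(rng, n):
    """Assumed null hypothesis H: n independent draws from Normal(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def abs_mean(values):
    """Assumed test statistic T: sensitive to a shift of the mean away from 0."""
    return abs(statistics.fmean(values))


# Hypothetical observed data: 20 values generated with a true mean of 0.5.
data_rng = random.Random(42)
y = [data_rng.gauss(0.5, 1.0) for _ in range(20)]

p = monte_carlo_p_value(y, standard_normal_null, abs_mean)
print("estimated P value:", p)
```

Because the estimate is computed from the data, rerunning this with a fresh sample y gives a different P value each time, which is the sense in which the second quote calls the P value a noisy measure of evidence.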