The one-sample design has been used in many ChIP-seq experiments because it allows more biological contexts to be analyzed within a fixed sequencing budget. To study the merits and limitations of this design, we analyzed ChIP-seq data for two additional transcription factors, Oct4 and Nanog, which are crucial regulators of self-renewal and pluripotency in embryonic stem cells. Again, there was good agreement between one-sample and two-sample analyses after postprocessing, with 96% concordance in the case of Oct4 and 83% in the case of Nanog. These examples suggest that, under certain conditions, a one-sample experiment can provide a cost-effective alternative to the two-sample experiment, albeit perhaps at the expense of some specificity.
To gain a better understanding of the limitations of one-sample analysis, we applied it to negative control samples. Although no peaks were expected, a small number of peaks were reported at the 10% FDR level. This was caused by residual background variation that the negative binomial model could not explain. Systematic evaluation using simulated spike-in data showed that, although the one-sample analysis can provide reasonable FDR estimates when the overall binding signal is strong, it may significantly underestimate the real FDR when the overall binding in the sample is weak.
Fortunately, poor peak reliability and problematic FDR estimation can often be diagnosed through several criteria, such as highly repeat-rich predictions, predictions covering a low percentage of reads, and lack of motif enrichment. We recommend using two-sample experiments whenever it is affordable or when little is known about the transcription factor. When cost constraints necessitate one-sample analyses, a negative binomial rather than Poisson background model should be used to exclude background noise, and prediction quality should be evaluated using multiple criteria as described above. CisGenome is designed to support these types of analyses.
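The difference between the two background models can be illustrated with a small numerical sketch. This is not CisGenome's implementation. It assumes that reads are counted in fixed-width windows, that the background mean and variance are already known, and that the negative binomial is fitted by matching those two moments. The function names and the example numbers are hypothetical.

```python
import math


def poisson_sf(k, mean):
    """Return P(X >= k) for a Poisson background with the given mean."""
    term = math.exp(-mean)  # pmf at 0
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= mean / (i + 1)  # pmf(i + 1) from pmf(i)
    return max(0.0, 1.0 - cdf)


def negbin_sf(k, mean, var):
    """Return P(X >= k) for a negative binomial background.

    The distribution is matched to the given mean and variance
    (method of moments), which requires var > mean.
    """
    if var <= mean:
        raise ValueError("negative binomial requires var > mean")
    r = mean * mean / (var - mean)  # dispersion (size) parameter
    p = r / (r + mean)              # success probability
    term = p ** r                   # pmf at 0
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= (i + r) / (i + 1) * (1.0 - p)  # pmf(i + 1) from pmf(i)
    return max(0.0, 1.0 - cdf)


def estimated_fdr(counts, threshold, tail_prob):
    """Estimate the FDR at a read-count threshold for one-sample data.

    The estimate is the number of windows expected to reach the threshold
    under the background model divided by the number observed to reach it.
    """
    observed = sum(1 for c in counts if c >= threshold)
    if observed == 0:
        return 0.0
    expected = len(counts) * tail_prob
    return min(1.0, expected / observed)


if __name__ == "__main__":
    # Hypothetical overdispersed background: mean 2 reads, variance 6.
    print("Poisson tail at 10 reads:", poisson_sf(10, 2.0))
    print("Negative binomial tail at 10 reads:", negbin_sf(10, 2.0, 6.0))
```

With the same mean, the overdispersed negative binomial puts far more probability on high-count windows than the Poisson does. A Poisson background would therefore expect fewer false windows above a threshold and report a lower, more optimistic FDR for the same data.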