Introduction

Here we provide a complete workflow for the preprocessing and analysis of DNA methylation array data. The workflow combines best practices in the field with methodology developed in-house, and is geared towards large-scale studies, including epigenome-wide association studies (EWAS). Its development was informed by our research analysing BIOS consortium data, which contains multi-omics measurements from six Dutch biobanks comprising ~4000 individuals (Jansen et al. 2016).

The DNAmArray package contains a series of convenient functions for the quality control, normalization, and analysis of methylation array data. The workflow has been thoroughly tested on the Illumina 450k array but is similarly applicable to the newer 850k EPIC array.

This workflow makes extensive use of other Bioconductor packages. For example, the DNAmArray function read.metharray.exp.par() converts IDAT files to an RGset by combining functions from minfi (Aryee et al. 2014) with BiocParallel, so that the files can be read in parallel. The required packages are usually installed automatically; if they are not, please refer to the relevant package’s documentation.

We have also developed packages that are used by this workflow. MethylAid (Iterson et al. 2014) provides a web application for interactive sample quality control, and bacon (Iterson et al. 2017) corrects for bias and inflation in ome-wide association studies, such as EWAS.


Example Data

The example data (Cobben et al. 2019) used in this workflow is available from the NCBI Gene Expression Omnibus (GEO), a public repository of microarray data. It contains genome-wide DNA methylation data from whole blood, obtained using the Illumina 450k microarray. The participants comprise 46 fetal alcohol spectrum disorder (FASD) cases and 92 controls, drawn from a discovery cohort and a replication cohort.


Aryee, Martin J., Andrew E. Jaffe, Hector Corrada-Bravo, Christine Ladd-Acosta, Andrew P. Feinberg, Kasper D. Hansen, and Rafael A. Irizarry. 2014. “Minfi: A Flexible and Comprehensive Bioconductor Package for the Analysis of Infinium DNA Methylation Microarrays.” Bioinformatics 30 (10) (May): 1363–1369. doi:10.1093/bioinformatics/btu049. http://www.ncbi.nlm.nih.gov/pubmed/24478339 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4016708.

Cobben, Jan M., Izabela M. Krzyzewska, Andrea Venema, Adri N. Mul, Abeltje Polstra, Alex V. Postma, Robert Smigiel, et al. 2019. “DNA-Methylation Abundantly Associates with Fetal Alcohol Spectrum Disorder and Its Subphenotypes.” Epigenomics (March): epi-2018-0221. doi:10.2217/epi-2018-0221. https://www.futuremedicine.com/doi/10.2217/epi-2018-0221.

Iterson, Maarten van, Elmar W. Tobi, Roderick C. Slieker, Wouter Den Hollander, René Luijk, P. Eline Slagboom, and Bastiaan T. Heijmans. 2014. “MethylAid: Visual and Interactive Quality Control of Large Illumina 450k Datasets.” Bioinformatics 30 (23) (December): 3435–3437. doi:10.1093/bioinformatics/btu566. https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btu566.

Iterson, Maarten van, Erik W. van Zwet, Bastiaan T. Heijmans, Peter A. C. ’t Hoen, Joyce van Meurs, Rick Jansen, Lude Franke, et al. 2017. “Controlling Bias and Inflation in Epigenome- and Transcriptome-Wide Association Studies Using the Empirical Null Distribution.” Genome Biology 18 (1) (December): 19. doi:10.1186/s13059-016-1131-9. http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1131-9.

Jansen, Rick, Lude Franke, Coen D. A. Stehouwer, Hailiang Mei, Cornelia M. van Duijn, Jan H. Veldink, André G. Uitterlinden, et al. 2016. “Blood Lipids Influence DNA Methylation in Circulating Cells.” Genome Biology 17 (1) (December): 138. doi:10.1186/s13059-016-1000-6. http://www.ncbi.nlm.nih.gov/pubmed/27350042 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4922056 http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1000-6.