Acknowledgements

Contributors to the development of the package and workflow include:

Lucy Sinke, Tom Kuipers, Jazmin Taubert, Yunfeng Liu, Manhoor Sulaiman, Thomas Jonkman, Elmar W. Tobi, Roderick Slieker, Wouter den Hollander, Rene Luijk, Koen F. Dekkers, and Bas Heijmans (PI) from Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.

Development of the workflow and generation of the previously used LLS example data were funded by BBMRI-NL, a research infrastructure financed by the Dutch government (NWO 184.021.007).

Lucy Sinke and Maarten van Iterson are the main authors of the package and associated workflow. Lucy Sinke is the current author and maintainer and should be contacted with any questions: l.j.sinke@lacdr.leidenuniv.nl

We are grateful to Paul Hop, Jenny van Dongen, and Helena Rasche for testing and improving the pipeline.

Please cite *Sinke, L., van Iterson, M., Cats, D., BIOS Consortium, Kuipers, T. & Heijmans, B. (2019) ‘DNAmArray: Streamlined workflow for the quality control, normalization, and analysis of Illumina methylation array data’ http://doi.org/10.5281/zenodo.3355292* when using the DNAmArray package or workflow.

This website was generated with R Markdown on 20 July 2025 using the following R version and packages:

## R version 4.4.3 (2025-02-28)
## Platform: x86_64-pc-linux-gnu
## Running under: Rocky Linux 8.10 (Green Obsidian)
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.15.so;  LAPACK version 3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Amsterdam
## tzcode source: system (glibc)
## 
## attached base packages:
##  [1] grid      parallel  stats4    stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] bacon_1.34.0                           
##  [2] ellipse_0.5.0                          
##  [3] limma_3.62.2                           
##  [4] sva_3.54.0                             
##  [5] genefilter_1.88.0                      
##  [6] mgcv_1.9-3                             
##  [7] nlme_3.1-168                           
##  [8] methyLImp2_1.2.0                       
##  [9] ChAMPdata_2.38.0                       
## [10] ComplexHeatmap_2.22.0                  
## [11] circlize_0.4.16                        
## [12] reshape2_1.4.4                         
## [13] lubridate_1.9.4                        
## [14] forcats_1.0.0                          
## [15] stringr_1.5.1                          
## [16] dplyr_1.1.4                            
## [17] purrr_1.1.0                            
## [18] readr_2.1.5                            
## [19] tidyr_1.3.1                            
## [20] tibble_3.3.0                           
## [21] ggplot2_3.5.2                          
## [22] tidyverse_2.0.0                        
## [23] knitr_1.50                             
## [24] BiocParallel_1.40.2                    
## [25] DNAmArray_2.0.0                        
## [26] FDb.InfiniumMethylation.hg19_2.2.0     
## [27] org.Hs.eg.db_3.20.0                    
## [28] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
## [29] GenomicFeatures_1.58.0                 
## [30] AnnotationDbi_1.68.0                   
## [31] minfi_1.52.1                           
## [32] bumphunter_1.48.0                      
## [33] locfit_1.5-9.12                        
## [34] iterators_1.0.14                       
## [35] foreach_1.5.2                          
## [36] Biostrings_2.74.1                      
## [37] XVector_0.46.0                         
## [38] SummarizedExperiment_1.36.0            
## [39] Biobase_2.66.0                         
## [40] MatrixGenerics_1.18.1                  
## [41] matrixStats_1.5.0                      
## [42] GenomicRanges_1.58.0                   
## [43] GenomeInfoDb_1.42.3                    
## [44] IRanges_2.40.1                         
## [45] S4Vectors_0.44.0                       
## [46] BiocGenerics_0.52.0                    
## [47] BiocManager_1.30.26                    
## [48] rmarkdown_2.29                         
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.4.3             BiocIO_1.16.0            
##   [3] bitops_1.0-9              preprocessCore_1.68.0    
##   [5] methylumi_2.52.0          XML_3.99-0.18            
##   [7] lifecycle_1.0.4           edgeR_4.4.2              
##   [9] doParallel_1.0.17         vroom_1.6.5              
##  [11] lattice_0.22-7            MASS_7.3-65              
##  [13] base64_2.0.2              scrime_1.3.5             
##  [15] magrittr_2.0.3            sass_0.4.10              
##  [17] jquerylib_0.1.4           yaml_2.3.10              
##  [19] doRNG_1.8.6.2             askpass_1.2.1            
##  [21] DBI_1.2.3                 RColorBrewer_1.1-3       
##  [23] lumi_2.58.0               abind_1.4-8              
##  [25] zlibbioc_1.52.0           quadprog_1.5-8           
##  [27] RCurl_1.98-1.17           GenomeInfoDbData_1.2.13  
##  [29] ggrepel_0.9.6             rentrez_1.2.4            
##  [31] annotate_1.84.0           DelayedMatrixStats_1.28.1
##  [33] codetools_0.2-20          DelayedArray_0.32.0      
##  [35] xml2_1.3.8                tidyselect_1.2.1         
##  [37] shape_1.4.6.1             UCSC.utils_1.2.0         
##  [39] farver_2.1.2              beanplot_1.3.1           
##  [41] illuminaio_0.48.0         GenomicAlignments_1.42.0 
##  [43] jsonlite_2.0.0            GetoptLong_1.0.5         
##  [45] multtest_2.62.0           survival_3.8-3           
##  [47] tools_4.4.3               Rcpp_1.1.0               
##  [49] glue_1.8.0                SparseArray_1.6.2        
##  [51] xfun_0.52                 HDF5Array_1.34.0         
##  [53] withr_3.0.2               fastmap_1.2.0            
##  [55] rhdf5filters_1.18.1       openssl_2.3.3            
##  [57] digest_0.6.37             timechange_0.3.0         
##  [59] R6_2.6.1                  colorspace_2.1-1         
##  [61] RSQLite_2.4.1             generics_0.1.4           
##  [63] corpcor_1.6.10            pls_2.8-5                
##  [65] data.table_1.17.8         rtracklayer_1.66.0       
##  [67] httr_1.4.7                S4Arrays_1.6.0           
##  [69] pkgconfig_2.0.3           gtable_0.3.6             
##  [71] blob_1.2.4                siggenes_1.80.0          
##  [73] htmltools_0.5.8.1         clue_0.3-66              
##  [75] scales_1.4.0              png_0.1-8                
##  [77] tzdb_0.5.0                rjson_0.2.23             
##  [79] curl_6.4.0                cachem_1.1.0             
##  [81] rhdf5_2.50.2              GlobalOptions_0.1.2      
##  [83] KernSmooth_2.23-26        restfulr_0.0.16          
##  [85] GEOquery_2.74.0           pillar_1.11.0            
##  [87] reshape_0.8.10            vctrs_0.6.5              
##  [89] cluster_2.1.8             xtable_1.8-4             
##  [91] evaluate_1.0.4            cli_3.6.5                
##  [93] compiler_4.4.3            Rsamtools_2.22.0         
##  [95] rlang_1.1.6               crayon_1.5.3             
##  [97] rngtools_1.5.2            labeling_0.4.3           
##  [99] nor1mix_1.3-3             mclust_6.1.1             
## [101] affy_1.84.0               plyr_1.8.9               
## [103] stringi_1.8.7             nleqslv_3.3.5            
## [105] htm2txt_2.2.2             Matrix_1.7-3             
## [107] hms_1.1.3                 sparseMatrixStats_1.18.0 
## [109] bit64_4.6.0-1             Rhdf5lib_1.28.0          
## [111] KEGGREST_1.46.0           statmod_1.5.0            
## [113] memoise_2.0.1             affyio_1.76.0            
## [115] bslib_0.9.0               bit_4.6.0

References

1 Dekkers, K.F., van Iterson, M., Slieker, R.C., et al. Blood lipids influence DNA methylation in circulating cells. Genome Biol. 17, 138 (2016).

2 Slieker, R.C., van Iterson, M., Luijk, R. et al. Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms. Genome Biol. 17, 191 (2016).

3 Bonder, M., Luijk, R., Zhernakova, D. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat Genet. 49, 131-138 (2017).

4 Luijk, R., Wu, H., Ward-Caviness, C.K. et al. Autosomal genetic variation is associated with DNA methylation in regions variably escaping X-chromosome inactivation. Nat Commun. 9, 3738 (2018).

5 van Rooij, J., Mandaviya, P.R., Claringbould, A., et al. Evaluation of commonly used analysis strategies for epigenome- and transcriptome-wide association studies through replication of large-scale population studies. Genome Biol. 20, 1:235 (2019).

6 Hop, P.J., Luijk, R., Daxinger, L., et al. Genome-wide identification of genes regulating DNA methylation using genetic anchors for causal inference. Genome Biol. 21, 220 (2020).

7 Dekkers, K.F., Slieker, R.C., Ioan-Facsinay, A. et al. Lipid-induced transcriptomic changes in blood link to lipid metabolism and allergic response. Nat Commun. 14, 544 (2023).

8 Liu, Y., Sinke, L., Jonkman, T.H., et al. The inactive X chromosome accumulates widespread epigenetic variability with age. Clin Epigenetics. 15, 135 (2023).

9 Aryee, M.J., Jaffe, A.E., Corrada-Bravo, H. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 30, 10 (2014).

10 van Iterson, M., Tobi, E.W., Slieker, R.C., et al. MethylAid: visual and interactive quality control of large Illumina 450k datasets. Bioinformatics 30, 23 (2014).

11 van Iterson, M., van Zwet, E.W., BIOS Consortium, et al. Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution. Genome Biol. 18, 19 (2017).

12 van Iterson, M., Cats, D., Hop, P. et al. omicsPrint: detection of data linkage errors in multiple omics studies. Bioinformatics. 34, 12 (2018).

13 Curtis, S.W., Cobb, D.O., Kilaru, V. et al. Exposure to polybrominated biphenyl (PBB) associates with genome-wide DNA methylation differences in peripheral blood. Epigenetics. 14, 1:52-66 (2019).

14 Davis, S. and Meltzer, P.S. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics. 23, 14 (2007).

15 Zhou, W., Laird, P.W., and Shen, H. Comprehensive characterization, annotation, and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 45, 4:e22 (2017).

16 Fortin, J-P., Labbe, A., Lemire, M. et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol. 15, 12:503 (2014).

17 Heiss, J.A. and Brenner, H. Between-array normalization for 450k data. Front Genet. 6 (2015).

18 Min, J.L., Hemani, G., Davey-Smith, G., et al. Meffil: efficient normalization and analysis of very large DNA methylation datasets. Bioinformatics. 34, 23:3983-3989 (2018).

19 Ori, A.P.S., Lu, A.T., Horvath, S., et al. Significant variation in the performance of DNA methylation predictors across data preprocessing and normalization strategies. Genome Biol. 23, 225 (2022).

20 Plaksienko, A., Di Lena, P., Nardini, C., et al. methyLImp2: faster missing value estimation for DNA methylation data. Bioinformatics. 40, 1 (2024).

21 Zheng, S.C., Breeze, C.E., Beck, S., et al. Identification of differentially methylated cell-types in Epigenome-Wide Association Studies. Nat Methods. 15, 1059-1066 (2018).

22 Amemiya, H.M., Kundaje, A., and Boyle, A.P. The ENCODE Blacklist: Identification of Problematic Regions of the Genome. Sci Rep. 9, 9354 (2019).

23 Leek, J.T. and Storey, J.D. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 3, 9:1724-35 (2007).

24 Johnson, W.E., Li, C., and Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 8, 1:118-27 (2006).

25 Leek, J.T. svaseq: removing batch effects and other unwanted noise from sequencing data. Nucleic Acids Res. 42, 21:e161 (2014).

26 Leek, J.T., Scharpf, R.B., Bravo, H.C. et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet. 11, 10:733-9 (2010).

27 Ritchie, M.E., Phipson, B., Wu, D., et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, 7 (2015).

28 Roadmap Epigenomics Consortium, Kundaje, A., Meuleman, W. et al. Integrative analysis of 111 reference human epigenomes. Nature. 518, 7539:317-30 (2015).

29 Guo, X., Sulaiman, M., Neumann, A., et al. Unified high-resolution immune cell fraction estimation in blood tissue from birth to old age. Genome Med. 17, 63 (2025).

30 Pidsley, R., Wong, C.C.Y., Volta, M., et al. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics. 14, 293 (2013).

31 Wang, Y., Hannon, E., Grant, O.A., et al. DNA methylation-based sex classifier to predict sex and identify sex chromosome aneuploidy. BMC Genomics. 22, 484 (2021).

32 Heinz, S., et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 38, 4:576-89 (2010).