The DNAmArray package can be installed in several ways, and has been successfully tested with R >= 3.2.0 on various Linux builds and with R >= 3.5.3 on Windows.
First install the devtools package, then use the install_github() function to fetch the DNAmArray package.
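As a minimal sketch of this step, the snippet below installs devtools only when it is missing and then installs DNAmArray from GitHub. The repository path molepi/DNAmArray is taken from the clone URL given in the git/R instructions; installing devtools from CRAN is an assumption about your setup.

```r
# Install devtools from CRAN if it is not already available (assumed source)
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Fetch and install DNAmArray from the molepi GitHub repository
devtools::install_github("molepi/DNAmArray")
```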
git/R
Using git, you can clone our repository and then build and install the package, changing _x.y.z to the version you cloned.
git clone git@github.com:molepi/DNAmArray.git
R CMD build DNAmArray
R CMD INSTALL DNAmArray_x.y.z.tar.gz
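If you are unsure which value to substitute for x.y.z, one option is to read it from the DESCRIPTION file of the clone, since R CMD build names the tarball after the Package and Version fields. The sketch below is illustrative only: tarball_name() is a hypothetical helper, not part of DNAmArray, and it assumes the repository was cloned into ./DNAmArray.

```r
# Hypothetical helper: derive the tarball name that `R CMD build` produces
# from the DESCRIPTION file of a cloned package directory.
tarball_name <- function(pkg_dir) {
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"),
                   fields = c("Package", "Version"))
  sprintf("%s_%s.tar.gz", desc[1, "Package"], desc[1, "Version"])
}

# Assumes the repository was cloned into ./DNAmArray
if (file.exists(file.path("DNAmArray", "DESCRIPTION"))) {
  print(tarball_name("DNAmArray"))
}
```

The printed file name is the one to pass to R CMD INSTALL.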
The first step of any analysis of microarray data involves importing the raw intensity files into your software program. In this example, we show how to import raw IDAT files from GEO into R, but similar strategies can be employed for all Illumina methylation array files when using R.
Using GEOquery [@Davis2007], you can download the example data [@Cobben2019]. This package contains functions that bridge the gap between Bioconductor tools and GEO, a public repository of microarray data.
First, we use getGEOSuppFiles() to download the supplementary data to the current working directory. These consist of an archive containing the raw IDATs alongside some documentation. We extract these and store them in our chosen directory, ready for decompression.
getGEOSuppFiles("GSE113018")
untar(tarfile="../GSE113018/GSE113018_RAW.tar", exdir="../GSE113018/IDATs")
After creating a list of the compressed files, we use the gunzip() function to decompress them.
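The decompression step could look roughly like the sketch below. It assumes the archive was extracted to ../GSE113018/IDATs as in the untar() call above, that the compressed files end in .idat.gz, and that gunzip() is the one exported by GEOquery. The variable names are illustrative.

```r
library(GEOquery)  # exports gunzip()

# Assumed location of the extracted archive, matching the untar() call
idat_dir <- "../GSE113018/IDATs"

# List every compressed IDAT file in that directory
idat_files <- list.files(idat_dir, pattern = "\\.idat\\.gz$", full.names = TRUE)

# Decompress each file next to its archive, overwriting earlier extractions;
# by default gunzip() removes the .gz file once it has been unpacked
for (f in idat_files) {
  gunzip(f, overwrite = TRUE)
}
```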
We then use getGEO() to import SOFT format microarray data into R as a large list. This contains a great deal of information, from which we extract the phenotypic and meta-data of interest.
## Found 1 file(s)
## GSE113018_series_matrix.txt.gz
## Parsed with column specification:
## cols(
##   .default = col_double(),
##   ID_REF = col_character()
## )
## See spec(...) for full column specifications.
## File stored at:
## /tmp/RtmpTxAC2D/GPL16304.soft
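The getGEO() step described above can be sketched as follows. The series-matrix messages in the output suggest the default GSEMatrix = TRUE was used, in which case getGEO() returns a list of ExpressionSet objects. The object names gse and targets are illustrative assumptions, and the call needs network access.

```r
library(GEOquery)
library(Biobase)  # provides pData()

# Download and parse the series matrix for the example data set;
# with GSEMatrix = TRUE the result is a list of ExpressionSet objects
gse <- getGEO("GSE113018", GSEMatrix = TRUE)

# Extract the per-sample phenotypic and meta-data from the first entry
targets <- pData(gse[[1]])

# Inspect which annotation columns are available
head(colnames(targets))
```

Each row of targets describes one sample, which makes it a convenient starting point for matching samples to their IDAT files.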