DNA methylation of shelf, shore and open sea CpG positions distinguish high microsatellite instability from low or stable microsatellite status colon cancer stem cells

Aim: To investigate the genome-wide methylation of genetically characterized colorectal cancer stem cell (CR-CSC) lines. Materials & methods: Eight CR-CSC lines were isolated from primary colorectal cancer (CRC) tissues, cultured and characterized for aneuploidy, mutational status of CRC-related genes and microsatellite instability (MSI). Genome-wide DNA methylation was assessed by MethylationEPIC microarray. Results: We describe a distinctive methylation pattern that is maintained following in vivo passages in immune-compromised mice. We identified an epigenetic CR-CSC signature associated with MSI. We noticed that the preponderance of the differentially methylated positions do not reside at CpG islands, but spread to shelf and open sea regions. Conclusion: Given that CRCs with MSI-high status have a lower metastatic potential, the identification of a MSI-related methylation signature could provide new insights and possible targets into metastatic CRC. Our results show that: seven out of eight CR-CSC lines display an intragenetic homogeneity; each cell lines disclose a distinctive epigenetic proﬁle; the methylation status of 262 CpG can distinguish the microsatellite instability (MSI) status of the cell lines and that these CpGs are mostly located on shelf, shore and open sea genomic positions. CRCs and hypomethylation in non-neoplastic mucosa.

epithelial-mesenchymal transition (EMT). All these pathways are normally activated in healthy stem cells at the base of the colonic crypts and are associated with the invasive/metastatic phenotype of CRC [13]. Among the markers that are specifically expressed by CR-CSCs there are CD44, CD166, CD133 antigens, and leucinerich repeat-containing G-protein-coupled receptor 5 (LGR5), together with functional markers such as aldehyde dehydrogenase 1 family member A1 (ALDH1) and β-catenin (CTNNB1) [14,15].
In addition, CR-CSCs resist conventional chemo-and radio-therapy. This is attributed to several survival mechanisms, including: relatively quiescent nature, meaning that they have proliferative capacity but are not often cycling; greater ability to repair DNA and higher expression of the ATP-binding cassette (ABC) membrane transporters [16][17][18][19]. Therefore, even though CR-CSCs account for less than 1% of the tumor bulk, they are thought to represent the major source of recurrent disease after treatment, and to be capable of reconstituting the tumor at the original and/or distant sites, ultimately leading to therapeutic failure.
Several studies suggest that aberrant DNA methylation is one of the mechanisms driving tumor onset, development, progression and recurrence [20][21][22][23][24]. In CRC, hypermethylation is observed at CpG-rich promoter regions and results in transcriptional gene silencing [21]. On the other hand, hypomethylated regions are often localized in open sea areas of the genome and are linked with chromosomal instability (CIN), gene activation and loss of imprinting [25,26]. However, these data mostly derive from studies performed on primary tumors, characterized by intrinsic cellular heterogeneity [27], which dampens their biological relevance.
Here we report the epigenetic profiles of eight CR-CSC lines and compare these profiles to those publicly available for non neoplastic colorectal mucosa, adenomas, CRCs and matching metastases. Our results show that: seven out of eight CR-CSC lines display an intragenetic homogeneity; each cell lines disclose a distinctive epigenetic profile; the methylation status of 262 CpG can distinguish the microsatellite instability (MSI) status of the cell lines and that these CpGs are mostly located on shelf, shore and open sea genomic positions.

Materials & methods
Isolation & culture of CR-CSCs CRC specimens were obtained from patients undergoing CRC resection, in accordance with the ethical standards of the institutional committee of the University of Palermo. CR-CSCs were isolated as previously described [28][29][30]. Briefly, tumor samples were mechanically and enzymatically digested with collagenase and hyaluronidase for 1 h at 37 • C. Then the obtained digests have been cultured in ultralow adhesion flasks to select CSCs that do not undergo anoikis. The culture medium is based on serum-free advanced Dulbecco's Modified Eagle Medium: Nutrient Mixture F-12 (DMEM/F12), supplemented with EGF and bFGF. Cells are cultured in these conditions for several passages (3-4 months depending on the parental tumor histological grading and the number of CSCs in tumor specimens) and expanded by both enzymatic and mechanical dissociation. Then cells are characterized and eventually sorted for the expression of putative CSC markers. Finally, CR-CSCs are tested for their tumorigenic potential by subcutaneous injection in non-obese diabetic (NOD) severe combined immunodeficiency (SCID) gamma mice (NSG). Cells were maintained at 37 • C in a 5% CO 2 humidified incubator. Short tandem repeat (STR) DNA profiling was carried out for the authentication of CR-CSC lines using a 24-locus STR kit (GlobalFiler™ STR kit, Thermo Fisher Scientific, MA, USA). STR analysis was performed on the ABIPRISM 3130 genetic analyzer (Thermo Fisher Scientific) following the manufacturer's instructions. DNA profiles of isolated CR-CSC lines were matched with their relative patient tumor tissues to ensure no contamination with standard cell lines and their identity.

Animal procedures
All the in vivo experiments have been performed in accordance with the Animal Care Committee Guidelines of the University of Palermo, using three mice per condition. The tumorigenic potential of CR-CSCs was assessed by harvesting 5 × 10 5 dissociated tumor cells, resuspending them into 100 μl of 1:1 Matrigel (BD Matrigel Matrix Growth Factor Reduced, Becton, Dickinson and Company, NJ, USA) and subcutaneously injecting cells in NOD/SCID mice. Tumor growth was monitored and measured twice per week following cell injection using an electronic caliper.
Tumor volume was calculated using the formula: (largest measured diameter × [smallest measured diameter] 2 × π/6) for up to 15 weeks. At the end points, when subcutaneous tumor reached 2 cm in diameter, or when mice showed the first signs of suffering (Italian Ministry of Health, authorization no. 154/2017-PR), animals were sacrificed accordingly to Directive 2010/63/EU Guidelines (D.lgs 26/2016). Xenograft tumors were collected and used for paraffin-embedded tissues, DNA/RNA/protein extraction and cell isolation.

Cell viability assay
The cell viability assay was assessed for up to 96 h, using the CellTiter 96 R AQueous One Solution Cell Proliferation Assay kit (MTS) according to the manufacturer's instruction (Promega, WI, USA). To evaluate the CR-CSCs chemosensitivity to oxaliplatin, 3000 cells were plated into ultralow attachment 96-well plates (Corning, NY, USA) and exposed to vehicle or 50 μM oxaliplatin (cat. 09512, Sigma Aldrich, MO, USA). The results were analyzed by using a multiwall plate reader (GDV programmable MPT Reader DV 990BV6).

Reverse transcription quantitative PCR
Total RNA from CR-CSCs was purified by using the RNeasy Mini Kit (Qiagen) and retro-transcribed with the iScript™ gDNA Clear cDNA Synthesis Kit as recommended by the manufacturer (Bio-Rad, CA, USA). Relative expression level of selected genes was assessed by real-time PCR using the PrimePCR Custom Panel (Bio-Rad) according to manufacturer's instructions. Each sample was normalized to GAPDH and HPRT1 reference genes to obtain the Ct [(2-[Ctgene -CtGAPDH)] Ct)]. Data, collected from QuantStudio 7 Flex Real-Time PCR System (Thermo Fisher Scientific) were analyzed with the PrimePCR Analysis software (Bio-Rad).

Methylation EPIC bead CHip analysis
A total of 1 μg of extracted DNA was bisulphite-converted with the EZ-96 DNA Methylation-Gold™ Kit, used according to the manufacturer's protocol (Zymo Research, CA, USA). Next, the 850 K DNA methylation array by Infinium MethylationEPIC BeadChip (Illumina, CA, USA) was performed on 4 μg of bisulphite-converted DNA, following the Illumina Infinium HD Methylation protocol. This array includes 853.307 cytosine positions of the human genome (CpG sites, non-CpG sites and random SNPs). The methylation score for each CpG was represented as a β-value according to the fluorescent intensity ratio representing any value between 0 (unmethylated) and 1 (completely methylated). Arrays were scanned by HiScan (Illumina). DNA methylation data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (GEO) and are accessible through GEO Series accession number GSE123367 (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE123367).

Statistical analyses
Raw signal intensities generated using the Infinium MethylationEPIC BeadChip were extracted from .idat files using the minfi Bioconductor package. The following filtering criteria were applied: samples with more than 5% of sites with a detection p-value greater than 0.05 were removed (0 samples removed); probes with a detection p-value greater than 0.05 in more than 1% of the samples were removed (5059 probes removed). For each probe, β values (Meth/[Meth + Unmeth]) were calculated. Exploratory analysis of DNA methylation profiles was performed using RnBeads platform (https://rnbeads.org/index.html) [31]. Statistical analyses were performed using R software. For differentially methylated probes (DMPs) identification, the ANOVA test was applied and probes having Benjamini-Hochberg (BH) p-value < 0.05 were selected. For differentially methylated regions (DMRs) identification, we applied the analytical pipeline described in Bacalini et al. [32]. Briefly, we used the Infinium EPIC annotation to group the CpG probes located on the same island, shore or shelf associated to a gene, and for each of region we evaluated the differences in DNA methylation between the groups under investigation using a multivariate analysis of variance (MANOVA). In the comparison between Gr1 CR-CSCs and Gr2 CR-CSCs, DMRs with a BH-corrected p-value < 0.05 were selected. In the comparison between CR-CSCs and CRC samples from datasets GSE42752 and GSE53051, a more stringent selection criteria for DMRs was adopted (Bonferroni corrected p-value < 0.01, at least two adjacent CpG sites with >0.2 concordant difference of methylation between CRCs and CR-CSCs), in order to minimize potentially confounding batch effects. Hierarchical clustering was performed using the heatmap.2 function in R gplots package, using default settings (euclidean distance function, complete agglomeration method). K-means clustering was performed using the R function kmeans, while the adjusted rand index was calculated using the randIndex function in the flexclust R package.

Sequencing data analysis & variant identification
Sequencing data analysis was conducted by using the Torrent Suite software v. 5.4.0 (Thermo Fisher Scientific). Briefly, low-quality reads were removed, adapter sequences trimmed and alignment against a reference genome (hg19) performed by using the Torrent Mapping Alignment Program. The Variant Caller plugin was used to identify variations from the reference sequence. The annotation of the variants was performed with ANNOVAR [33]. Insertions and deletions belonging to homopolymeric regions were removed, because sequencing error rate is high in these regions. The mutation waterfall plot was generated using the Bioconductor package GenVisR [34].

Copy number variation (CNV)
Copy number variation (CNV) was performed by reverse transcription quantitative PCR (RT-qPCR) using both the TaqMan technology (STUB1 ID: Hs02061495 cn; COX19 ID: Hs02372251 cn; JAG2 ID: Hs03069093; HES5 ID: Hs02713003 cn; Thermo Fisher Scientific) and the Roche Assay Design Center software to identify primers and relative UPL probes (https://lifescience.roche.com/en it/brands/universal-probe-library.html; KRAS F TGTATGGGCTGTGACATTGC -KRAS R CCACCTGTTCTTCCACCATC, UPL #68; TP53 F TGTTCTTGCAGTTAAGGGTTAGTTT -TP53 R TGAAGTGGGCCCCTACCTA, UPL#05; DLEU2 F ACGTTGTGCAGAAACTTGAGAC -DLEU2 R TCTAAGCAACCTGGATTTCACA, UPL#68). Briefly, 10 ng of DNA was amplified using the Universal Mastermix (Roche, Basel, CH), specific primers, fluorescent probes and the Human TaqMan R Copy Number Reference Assays RNase P as internal standard reference (Thermo Fisher Scientific). Each sample was analyzed in triplicate on CFX96 Touch™ Real-Time PCR Detection System (Bio-Rad). Each well was normalized to RNase P to obtain the Ct (2-[FAM dye Ct -VIC RNAseP dye Ct]). All samples were then normalized to a calibrator (three DNA samples from healthy donors) to determine Ct. Samples were considered to be carrying deletion or amplification if loss/acquisition of genetic material at the corresponding chromosome region was ≥25% compared with the average values of normal controls.

Genetic characterization of CR-CSC lines
We sequenced the hotspot mutation regions of 20 CRC-related genes (Supplementary Table 1) in eight primary CR-CSC lines obtained as described before [9,29,30] (Table 1). We focused on the most relevant genes associated with CRC development, including TP53, APC, KRAS and SMAD4 [35,36], which respectively showed 8, 13, 5 and 6 nucleotide variations (NVs) affecting the coding sequences in 6, 8, 5 and 4 cell lines. Of note is the missense SNP rs459552 in APC whose minor allele 'T' (MAF, 0.2019/24418-ExAC, NCBI-dbSNP Short Genetic Variations) was present in all the CR-CSC lines. However, the association of APC rs459552 with cancer remains unclear [37,38]. The allele frequencies of the NVs in TP53, APC, KRAS and SMAD4 were mostly around 50 or 100%, hinting at hetero-or homozygosity and clonal homogeneity. However, in four cell lines, the genes mutations showed peculiar allele frequencies of 33% or 61-70%, pointing to the presence of three allele sets ( Figure 1A; Supplementary Table 2). To verify this, we focused on KRAS, which revealed three non-synonymous NVs with a frequency of approximately 67% in three cell lines. We examined all the KRAS NVs (synonymous and non-synonymous) in those cell lines. In one of them, CR-CSC#12, we found the rs1137282 synonymous polymorphism at a frequency of 36.6%, in other words, close to 33%, the frequency expected for three allele copies (Supplementary Table 2). This suggests that, at least in CR-CSC#12, there are three KRAS alleles. We then considered the frequencies of all the NVs found in the investigated genes in the eight cell lines ( Figure 1B). Altogether, CR-CSC#5, #12 and #14 showed NV frequencies mostly around 50% ( Figure 1B), pointing to near-diploid status, while CR-CSC#8, #11, #13 and #9 displayed allele frequencies suggesting amplification to three (33 and 66%) or four allele copies (25 and 75%). Only CR-CSC#21 presented a more complicated pattern of allele frequencies, which could reflect genetic and clonal heterogeneity.
To verify if the distribution pattern of the NVs frequencies reflected aneuploidy, we analyzed the CNVs of seven chromosomal regions known to be amplified or deleted in CRC ( Figure 1C). The results of NVs frequency analysis were confirmed by CNV analysis. In fact, compared with the controls (healthy donors, HDs), CR-CSC#5, #12, and #14 (Group 1; Gr1) showed a moderate level of aneuploidy; CR-CSC#8, #11, #13 and #19 (Group 2; Gr2) displayed several DNA copy number alterations and CR-CSC#21 (Group 3; Gr3), showed a pattern intermediate between Gr1 and Gr2.
These results indicate that seven out of eight CR-CSC lines are genetically homogeneous, while one, CR-CSC#21, could comprise multiple subclones, in agreement with the fact that its TP53 mutation presents a frequency of 70%, with no variation in gene copy number.
Given that metastatic capacity is associated with CIN [39][40][41] and with CD44v6 expression on CR-CSCs [30], we measured CD44v6 levels on the CR-CSC cell lines. As expected Gr2 and Gr3, that had higher rates of aneuploidy, also showed higher CD44v6 surface expression compared with Gr1 ( Figure 1D). Overall, differences in CD44v6 expression were not significant, but higher CD44v6 expression correlated with increased levels of CRC stemness markers ( Figure 1E). Indeed, critical genes involved in embryonic development, EMT and invasiveness, including DKK1, FN1 and TWIST, showed higher expression in Gr2 compared with Gr1 ( Figure 1E    the density distribution of the methylation β values, we noticed that only the Gr1 CR-CSCs had a peak between 0.3 and 0.5 ( Figure 2B). To rule out the possibility that this peculiar distribution was the result of a technical artifact, we performed the same analysis after normalization of raw data using the method proposed by Fortin et al. [42]. Gr1 samples maintained an evident peak of intermediate methylation values respect to non-normalized data (Supplementary Figure 1A). To gain insight on this pattern, we divided the probes according to genomic localization: CpG islands; shores, in other words, regions up to 2 kb from CpG island; shelves, in other words, regions from 2 to 4 kb from CpG island and open sea, in other words, the rest of the genome [43,44]. We found that the peak at intermediated DNA methylation values in Gr1 was due to CpG probes mapping in shores, shelves and open sea, but not in CpG islands ( Figure 2C). Given that Gr1 displayed molecular features associated to favorable prognosis when found in tumor sample patients (MSI-H; Table 1), we investigated the differences in DNA methylation between Gr1 and Gr2 in order to identify genes whose aberrant epigenetic regulation could be linked to more aggressive tumor phenotype.
First, we used the ANOVA test to compare Gr1 and Gr2, and identified 262 DMPs (BH corrected p-value < 0.05; Figure 3A and Supplementary Table 3). A hierarchical clustering heatmap confirmed that Gr1 and Gr2 were clearly separated ( Figure 3B Thereafter, we used an alternative analytical pipeline specifically designed to identify DMRs, in other words, regions where multiple adjacent CpG sites were differentially methylated among the compared groups [32]. Focusing on CpG islands, shores and shelves associated to genes, we identified 282 DMRs (BH corrected p-value < 0.05;  Supplementary Table 4). Among the genes, MERTK, CPE, HEY2 and FOXO3 stood out because of their known association with cancer aggressiveness and metastatic potential [45][46][47][48][49].

Genome-wide methylation analysis of CR-CSC xenografts
As demonstrated by unsupervised clustering, the CR-CSCs and the CSCs derived from the xenografts formed after their engraftment into NOD-SCID (CR-XenoCSCs) tended to maintain similar genome-wide DNA methylation profiles ( Figure 4A). Accordingly, the peculiar distribution of β values of Gr1 and Gr2 samples was also maintained in CR-XenoCSCs (Supplementary Figure 1B). Paired t-test identified 3176 DMPs (BH corrected p < 0.05; Supplementary Table 5) between the CR-CSC and CR-XenoCSCs pairs, but intrapair methylation differences were low, mostly >-0.1 or <0.1 ( Figure 4B).

In vitro & in vivo characterization of CR-CSC lines
We characterized the eight CR-CSCs for their proliferative and tumorigenic capacity through in vitro and in vivo experiments. The proliferative potential was assessed in vitro in presence or absence of oxaliplatin, to establish whether chromosomal or MSI status could correlate with proliferation and chemoresistance. Despite the low number of cell lines tested, our results show a less proliferative phenotype in the Gr1/MSI-H subset compared with the Gr2/MSS. Interestingly, CR-CSC #21 (Gr3) showed very low growth rate coupled with resistance to oxaliplatin ( Supplementary Figure 2A-C). The little differences observed in vitro disappeared in vivo when the CR-CSCs were injected subcutaneously into NOD/SCID mice. No significant differences in tumor growth or survival rate of the xenografted mice were identified (Supplementary Figure 2D).

Comparison between the DNA methylation signatures of the CR-CSC lines & of patient-derived normal & neoplastic colorectal tissues
Two Infinium450k datasets were downloaded from the Gene Expression Omnibus -GEO repository. The first dataset (GSE42752) [50] included 22 primary CRCs, their matched healthy mucosa samples and 19 cancer-  unrelated colon tissues. The second dataset (GSE53051) [25] included 18 normal colon tissues, 10 colon adenomas, 9 primary CRCs and 16 CRC lung or liver metastases. Only common probes (intersection between datasets: 450626 probes) were considered to compare the β value distributions in the Infinium450k and EPIC datasets. The DNA methylation landscape of the CRCs and of the CR-CSCs was largely different, with the CSCs generally hypomethylated compared with the CRCs ( Figure 5A). We cannot exclude a priori that the observed differences are at least in part the result of a batch effect, as the CR-CSCs and the CRCs datasets have been generated by different laboratories using different versions of the same microarray platform. However, two considerations suggest that the distinct patterns observed in CR-CSCs and CRCs have a true biological basis. First of all, a small dataset (GSE98990) containing two normal samples and two CRC samples, processed using the EPIC microarray, had a distribution of β values similar to that of the samples in GSE42752 and GSE53051 datasets and completely distinct from the CR-CSCs ( Figure 5A). Secondly, hypomethylation of CR-CSCs was not generalized, as it was specific for shores, shelves and open sea CpGs, while a trend toward hypermethylation was observed for the distribution of the β values in CpG islands (Supplementary Figure 3).
To gain further insights into the DNA methylation differences between CRCs and CR-CSCs, we compared our dataset of CR-CSCs with CRCs samples from datasets GSE42752 and GSE53051. To minimize the potentially confounding batch effect discussed above and to maximize the identification of biologically relevant signals, we focused on the DMRs mapping in regions associated to genes [32]. Using stringent criteria (Bonferroni corrected p-value < 0.01, at least two adjacent CpG sites with >0.2 concordant difference of methylation between CRCs and CR-CSCs) we selected 2967 DMRs mapping in 2688 genes (Supplementary Table 6). Compared with CRCs, we found that in CR-CSCs DNA methylation increased at CpG islands (hypermethylation in 78% of the DMRs) but decreased in shores and shelves (hypomethylation in 84 and 97% of the DMRs, respectively).
We used the DMRs associated to genes to cluster the samples of normal mucosa, adenoma, primary CRC and metastases included in the GSE42752 and in GSE53051 datasets. Cluster analysis using the most significant CpG site within each DMR suggested that the epigenetic profile of the selected DMRs tended to accompany CRC progression ( Figure 5B). Figure 6 shows examples of hypermethylation of gene promoters in CR-CSCs, intermediate methylation in CRCs and hypomethylation in non-neoplastic mucosa.  Table 6), including miR-124-3 and miR-129-2, both found hypermethylated in CR-CSCs in this study and already described as hypermethylated in CRCs compared with normal tissue [51,52]. Moreover, we found 31 DMRs mapping in imprinted loci (Supplementary Table 6), including the IGF2/H19 locus and the KCNQ1 gene both at 11p15, associated in several studies with CRC [53][54][55].
We plotted the CpG methylation status of the genes associated to cell stemness [6]. We found these genes, with few exceptions, having complete promoter demethylation in CR-CSCs (Supplementary Figure 4), which could attest promoter accessibility of these genes to sustain stemness and/or the reprogramming of differentiated CRC cells toward the CR-CSC phenotype [56][57][58].
Finally, we analyzed the GSE42752 and GSE53051 datasets to evaluate the DNA methylation values of 262 DMPs that distinguish Gr1 and Gr2 CR-CSCs (Supplementary Table 3). Of these, 135 were included in both datasets. Unsupervised clustering using these 135 probes showed that the Gr1 CR-CSCs clustered separately from the other samples ( Figure 7). Furthermore, the GEO samples tended to cluster according to CRC progression from normal mucosa to CRC. We performed the same analyses for the 83 DMPs that had methylation values between 0.3 and 0.5 in Gr1 but not in Gr2, 38 of which were included in GSE42752 and GSE53051 datasets (Supplementary File 1). Clustering of GEO samples according to CRC progression tended to be recapitulated also using this small set of probes (Supplementary Figure 5; K-means clustering with k = 4, adjusted rand index: 0.38).

CR-CSC Gr3
Color key Value 0 0.2 0.6 By using three different techniques, we were able to discriminate the MSI-H CR-CSCs (Group 1) from those with MSS/MSI-L features and CIN phenotype (Group 2). These two groups were distinguishable based on: frequencies of genetic variations in 20 CRC associated genes investigated by next generation sequencing (NGS), CNV analysis of seven chromosomal regions frequently deleted or amplified in CRC or methylation status of 262 CpG dinucleotides. Moreover, we saw a weak increase in surface level of the metastatic marker CD44v6 [30] in Group 2 relative to Group 1.
The MSI-H CRCs rarely metastasize [60]. In this respect MSI-H status is mutually exclusive with CIN, which has been associated with metastatic potential [39]. Here, we identified a CR-CSC DNA methylation signature associated to CIN. However, in vivo experiments did not confirm greater aggressiveness of the CIN relative to the MSI CR-CSCs, since no differences in tumorigenic potential were identified between the two groups. This result could reflect the fact that increased immunogenic potential [59,[61][62][63], a hallmark of MSI-H CRCs, cannot be evaluated in in vivo models based on immune-deficient mice. Furthermore, it is worth to note that the most significant DMR between Gr1 and Gr2 fits with the DNA regulative elements of the MERTK gene, methylated in the MSI-H and demethylated in the MSI-L/MSS CR-CSCs, respectively. MERTK is a tyrosine kinase receptor that possesses several roles in cellular biology, including immune suppression through PDL-1 expression [64]. Thus, our data suggest a possible role of epigenetics in immune escape, independently of neo-antigen expression. Another significant difference between MSI-H and MSS CRCs is the lack of benefit of adjuvant 5-FU-based chemotherapy in stage II colon MSI-H cancer patients [65]. We did not find DMRs in genes associated with 5-FU resistance [66,67], but we registered miR-193a hypermethylation in Gr1 versus Gr2 (Supplementary Table 4), accordingly with its methylation and involvement in 5-FU resistance in hepatocellular carcinoma cells [68]. However, this does not exclude that our DMR regions, which are mostly shore, shelves and open sea, could be involved in tumor drug resistance.
One of the most relevant epigenetic differences between the MSI-H (Gr1) and MSS/MSI-L (Gr2) subsets is that most of the significant DMPs are localized in shelf, shore and open sea CpG positions. This is evident in the light of the Gr1 specific-peak around 0.4 of β values, composed mostly of shelve, shore and open sea CpGs. Due to the homogeneity of our cell lines, we assume that this value identifies CpGs that are monoallelically methylated, although we cannot exclude a specific CpGs hemimethylated pattern in CR-CSCs [69]. In the view of mono-allelic methylation, since Irizarry et al. show that most of the cancer-and tissue-specific CpG methylations occurs at shore positions and not at CpG islands of gene promoter [43], we acknowledge the possibility that the MSS/MSI-L (Gr2) could have lost its tissue-specific epigenetic footprint when compared with the MSI group (Gr1) and/or gained other methylation hallmarks.
Other studies correlated epigenetics to aneuploidy, but whether CIN could be caused [70,71] from or be the cause [26,72] of aberrant DNA methylation is still unresolved. Our specific epigenetic signatures separating CR-CSCs with or without CIN could provide insights into this unsolved issue by indicating new genomic regulative elements for the chromosomal stability.
Next, we compared the methylation profiles of the CR-CSCs with those of non-neoplastic mucosa, adenoma, primary and metastatic CRC and found a progression in methylation alterations from normal mucosa to CR-CSCs. The overall hypomethylation of the CR-CSCs does not reflect the methylation status of the different genomic compartments. In the CR-CSCs, CGIs were essentially hypermethylated, whereas shelves and shores were hypomethylated. However, a more specific analysis identified demethylation of several gene promoters controlling cell stemness. This suggests that demethylation of these genes is not necessary for cellular differentiation but relevant/required for possible reprogramming of differentiated cells to a stem status. Therefore, as already stated for intestinal stem cell differentiation, gene regulation by DNA methylation might play a role in the differentiation steps of colorectal tumorigenesis [22,73]. However, given the molecular complexity of gDNA methylation, it is unclear how this could occur. For example, an unexpected molecular mechanism was recently described by Roulois et al., who demonstrated that the demethylating agent 5-aza-2-deoxycytidine targets CRC-initiating cells by inducing viral mimicry through activation of endogenous double-strand RNAs [74].

Conclusion
In conclusion, our findings provide further characterization of CR-CSCs and could help in the identification of novel targets to be used for the eradication of resistant stem-like cancer cells in preclinical and clinical settings. Moreover, the genetic and epigenetic homogeneity of the CR-CSCs derived from both primary and xenograft tumors suggests that these models could be used to test the effectiveness of therapies against the CSC compartment in vitro and in vivo.

Future perspective
CR-CSCs are implicated in tumor development, metastasis and therapy resistance; however, little is known about their epigenetic features. The results presented in this manuscript pave the way for further researches that will functionally characterize the involvement of CR-CSC DNA methylation in the regulation of cellular stemness behavior and of metastatic potential, taking into account the differences that we have described between CR-CSCs with or without MSI. It will be challenging to understand if these epigenetics alterations are affecting, consequence or both of the CR-CSC phenotype, with particular regard to those CpGs that do not reside within CpG islands. Finally, future research should address the prognostic potential of the methylation profiles of CR-CSC lines, especially when purified from primary tumors of early-stages CRC patients.

Supplementary data
To view the supplementary data that accompany this paper please visit the journal website at: www.futuremedicine.com/doi/full/10.2217/epi-2018-0158