REVIEW

Modern methods for analysis of changes to epigenetic landscape caused by exposure to environmental pollutants

Zanyatkin IA, Titova AG, Bayov AV

Centre for Strategic Planning and Management of Biomedical Health Risks of the Federal Medical Biological Agency, Moscow, Russia

Correspondence should be addressed: Ivan A. Zanyatkin
Shchukinskaya, 5, str. 6, k. 323, Moscow, 123182; IZanyatkin@cspmz.ru


Author contribution: Zanyatkin IA systematized literature data and wrote the manuscript; Titova AG provided additional literature for the review and edited the manuscript; Bayov AV edited the manuscript.

Received: 2020-12-23 Accepted: 2021-01-26 Published online: 2021-02-10

A pollutant is a natural or synthetic chemical that causes environmental pollution when present in the environment at levels exceeding background values. The organs and systems that have direct contact with the pollutant sustain the most damage. Gases and suspended particulate matter affect the respiratory tract. Pollutants ingested with food or drinks are harmful to the gastrointestinal tract. Blood cells are affected because blood is the main transport system of the body. The liver and kidneys can be damaged because of their leading role in the metabolism and excretion of toxic substances.

Systemic effects of pollutants on the human body include irritation; disrupted mucociliary clearance, which increases the permeability of the bronchial epithelium to allergens and infectious agents and raises the risk of asthma; neurogenic inflammation; activation of lipid peroxidation and depression of the ROS metabolism system; hyperactivity of neutrophil elastase, which causes lung tissue damage; and increased production of inflammatory mediators, such as metabolites of arachidonic acid, cytokines and adhesion molecules.

Basic concepts of epigenetics

Epigenetics studies the rules and patterns of epigenetic inheritance, i.e. changes in gene expression and cell phenotypes caused by mechanisms other than changes in DNA sequences. When exploring environmental effects on the epigenome, the primary focus is placed on the regulatory mechanisms of gene expression. The most common mechanisms are listed in tab. 1.

DNA methylation at cytosine residues is the most prevalent epigenetic mark. The most abundant form of methylated cytosine is 5-methylcytosine (5-mC), which is found at CpG sites that cluster in GC-rich sequences known as CpG islands. These regions are typically located in the regulatory areas of the genome. In the absence of external influences, the pattern of DNA methylation is inherited by offspring from their parent. The inability to maintain this pattern leads to the death of the organism. Methylation of cytosine residues is carried out by a family of DNA-(cytosine-C5)-methyltransferases (DNMT) [8], the enzymes that transfer methyl groups from the donor S-adenosyl methionine to cytosine. DNMT1 maintains the level of methylation inherited from a parent. When complexed with UHRF1 (a chromatin protein), it can recognize methylated sites in a parental chromosome and reproduce a “methylation mark” at the equivalent locus on the new DNA. DNMT3a and DNMT3b establish methylation patterns de novo. DNMT3b is responsible for the hypermethylation of genes encoding DNA repair enzymes, which is believed to play the key role in malignant transformation in some cancer types [9]. Mutations in the DNMT3a gene are found in about one-fifth of patients with acute myeloid leukemia. Demethylation of cytosine bases occurs through iterative oxidation of 5-mC to 5-formylcytosine (5-fC) and 5-carboxylcytosine (5-caC), followed by the excision of these modified residues and their substitution with unmodified cytosine; this process is mediated by thymine DNA glycosylase (TDG) and enzymes participating in the base excision repair (BER) mechanism [10].
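CpG-rich regions of the kind described above are usually flagged computationally before methylation is analyzed. The following is a minimal Python sketch, assuming the commonly cited Gardiner-Garden and Frommer thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6); these thresholds and the function names are not taken from this review and serve only as an illustration.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count if C and G were distributed independently: C * G / N
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC-content and obs/exp thresholds to one window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice such a test would be applied to sliding windows along a chromosome; the sketch evaluates a single window only.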

Histone modifications constitute the second most common type of epigenetic marks. Histones are highly conserved proteins responsible for packaging and ordering DNA into nucleosomes. Histone modifications that modulate gene expression include lysine acetylation, which induces transcriptional activation, and lysine methylation, which, depending on the methylation site, can act as either an activating or a repressing mechanism [11]. Lysine acetylation is regulated by 2 families of enzymes: histone acetyltransferases (HATs) and histone deacetylases (HDACs). HDACs are categorized into 4 classes. Class I comprises HDAC 1, 2, 3 and 8 expressed in the nucleus; class IIA includes HDAC 4, 5, 7 and 9, which shuttle between the cytoplasm and the nucleus; class IIB encompasses HDAC 6 and 10, which remain in the cytoplasm; class III is represented by the NAD+-dependent sirtuins; class IV is constituted by HDAC 11.

Both DNA methylation and histone modifications (methylation and acetylation) can be affected by exogenous factors. For example, the activity of the NAD+-dependent HDAC sirtuin 1 can be modulated by a number of bioactive compounds, including resveratrol. HDAC inhibitors relieve transcriptional repression and gene silencing; this may result in untimely gene activation and trigger pathology. By contrast, HAT inhibitors restore epigenetic control, preventing unwanted gene transcription.

Summing up, epigenetic mechanisms of gene regulation per se constitute a complex multi-tiered system that remains understudied to this day. Environmental factors only add to its complexity, creating extra challenges for the analysis.

Methods for epigenetic landscape analysis

At present, two major types of epigenetic inheritance are known. With direct inheritance, epigenetic modifications are acquired at the germinal or embryonic stages [12]. They are manifested in phenotypes as early as the first generation and persist into the second or third generation of offspring. With indirect inheritance, phenotypic changes reveal themselves in the second or third generation of offspring, long after the causative epimutagen has been removed from the organism. If an epimutation is severe and affects critical genes, its consequences can manifest themselves during the lifetime of the organism.

Currently, there are a few methods for rapid methylation measurement in individual genes. Peripheral blood DNA methylation profiles hold promise as biomarkers of multiple small metastases [13]. Abnormal cellular content of certain proteins may be associated with cancer: levels of glycolytic and mitochondrial proteins (alpha-enolase, glyceraldehyde-3-phosphate dehydrogenase, ATP synthase) are substantially elevated in human breast cancer induced by exposure to benzo[a]pyrene [14]. However, information about the proteome has value only when it is analyzed together with transcriptome data. Besides, cells can change their proteome to compensate for the effects elicited by the pollutant. For example, MCF-7 cells exposed to benzo[a]pyrene, dibenzo[a,i]pyrene or coal tar extract were shown to overexpress the heat shock proteins HSP70 and HSP27 [14]. Also, antibodies specific for the native protein may fail to recognize its mutant variant. An experimental study tested the reactivity of p53 with the conformation-specific monoclonal antibodies PAb1620 and PAb240 in MCF-7 cells treated with cadmium salts. Exposure to cadmium resulted in the incorrect folding of the protein, disrupted its conformational structure and affected its recognition by antibodies [15]. Such analysis can be carried out using two-dimensional polyacrylamide gel electrophoresis.

In the simplest model, a gene would have only 3 distinct levels of methylation: 0% (no methylation), 50% (methylation of 1 allele) and 100% (methylation of both alleles). In practice, this is not the case because samples collected from real populations are heterogeneous; most studies estimate DNA methylation at only 10–30%. Only quantitative methods are suitable for this type of analysis.
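The gap between the three-level model and the intermediate values measured in real samples can be illustrated with a small calculation. The sketch below assumes a hypothetical sample that mixes cells carrying zero, one or two methylated alleles at a locus; the function name and the example fractions are illustrative and do not come from any cited study.

```python
def bulk_methylation(frac_none, frac_one, frac_both):
    """Percent methylation measured at one locus in a mixed cell population.

    Each argument is the fraction of cells with 0, 1 or 2 methylated alleles.
    """
    total = frac_none + frac_one + frac_both
    if abs(total - 1.0) > 1e-9:
        raise ValueError("cell fractions must sum to 1")
    # Per-cell levels in the simple model: 0%, 50% and 100%
    return 0.0 * frac_none + 50.0 * frac_one + 100.0 * frac_both


# A homogeneous sample reproduces one of the three discrete levels
print(bulk_methylation(0.0, 1.0, 0.0))
# A heterogeneous sample yields an intermediate value
print(bulk_methylation(0.7, 0.2, 0.1))
```

Because many different mixtures give the same bulk value, a single percentage cannot be decoded back into allele states, which is one reason quantitative measurement is needed.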

At present, there are two closely related groups of methods suitable for the analysis of genomes and transcriptomes (tab. 2). The first is DNA-RNA hybridization, in which short DNA molecules are immobilized on a microarray, the studied DNA/RNA is hybridized to the immobilized DNA and then used as a template for DNA synthesis with fluorescently tagged nucleotides. Fluorescence intensity measured during DNA synthesis correlates with the amount of the analyzed DNA/RNA. This rapid analytical method for measuring gene transcription is, however, not free of errors associated with faulty hybridization.

Chromatin immunoprecipitation is another common analytical method. It consists of a few stages: formation of DNA-protein complexes, DNA purification, elution and sequencing. It is used to determine the proportion of DNA fragments with the target sequences in the mixture. The main constraint of massively parallel sequencing (MPS) is associated with the length of DNA fragments subject to sequencing: during immunoprecipitation, DNA is normally cut into short 100–500 bp fragments because longer fragments can give rise to sequencing errors. If the level of gene expression and the level of modification differ between the epimutated and the intact sites by only 10–20%, they will not be detected by chromatin immunoprecipitation. Interestingly, benign tumors are usually characterized by 10–20% difference in the levels of methylation at a studied locus [16]. At the same time, MPS can be employed to sequence both individual genes and whole genomes; the procedure can be sped up by using automated MPS. Unlike data from microarrays, MPS can be used to identify allelic variants, detect alternative splicing events, study DNA methylation at single-base resolution, and obtain information about previously unsequenced genomic regions, which makes MPS data only more valuable over time. The advantage of this method stems from its potential for further development: MPS is becoming faster and cheaper, whereas microarray-based sequencing has almost exhausted its potential. Besides, sequencing ensures higher accuracy of methylation measurements than microarrays.

Methylome sequencing is performed using the same approaches. However, in order to be applied to methylomes, sequencing techniques have been modified. Classically, unmodified cytosine is converted to uracil through sodium bisulfite-mediated covalent modification; in contrast, the methylated form of cytosine (5-mC) does not react with sodium bisulfite. Differences in the obtained sequences allow identifying cytosine methylation sites. Novel luminometric methylation assays are based on DNA cleavage by methylation-sensitive restriction enzymes and subsequent DNA pyrosequencing accompanied by fluorescence detection. One of the platforms exploiting this technique is Pyrosequencer by Qiagen [26]. Pyrosequencing is a quantitative, reproducible and scalable method that does not require any genomic DNA modification and is, therefore, time-saving. Besides, it works with as little as 200–500 ng of genomic DNA and includes internal controls to trace errors associated with differences in the amounts of initial DNA. Pyrosequencing has a few downsides: only relatively short DNA sequences can be sequenced without errors, and the probability of error increases for sequences with repeated bases.
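The bisulfite read-out described above can be sketched in code. The example assumes an idealized, complete conversion on a single strand, followed by amplification so that uracil is read as thymine; the function names and the toy sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate idealized bisulfite treatment followed by amplification.

    Unmethylated C is read as T (C -> U -> T); methylated C stays C.
    `methylated_positions` holds 0-based indices of 5-mC in `seq`.
    """
    methylated = set(methylated_positions)
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_methylated_sites(reference, converted):
    """Return 0-based positions where a reference C is still C after conversion."""
    if len(reference) != len(converted):
        raise ValueError("sequences must be aligned and of equal length")
    pairs = zip(reference.upper(), converted.upper())
    return [i for i, (ref, conv) in enumerate(pairs) if ref == "C" and conv == "C"]


reference = "ACGTCGACCG"
treated = bisulfite_convert(reference, methylated_positions=[1, 8])
print(treated)
print(call_methylated_sites(reference, treated))
```

Real data add incomplete conversion, sequencing errors and strand-specific alignment, none of which are modeled here.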

The search for possible associations between the effects exerted by pollutants and genetic/epigenetic marks relies on the analysis of genome-wide, epigenomic and transcriptomic data. For the purpose of systematization, epigenomic data are arranged into databases (tab. 3), like ENCODE and Roadmap Epigenomics. Challenges facing epigenomic data analysis pertain to the choice of the reference epigenome: even within one organism, the epigenome varies across tissues [27], changing over time and at different phases of the cell cycle [28].

Epigenomic databases will continue to expand as new data are accumulated. In the future, epigenomic databases will become an effective tool for uncovering the pathogenesis of human diseases associated with pollutants.

Challenges facing epigenomic data analysis

The diversity of epigenetic alterations caused by a pollutant is a serious obstacle in the development of models simulating the effects of the pollutant on the organism. It is reported that exposure to dioxin derivatives leads to the hypermethylation of CpG islands located in the imprinting control region of the murine Igf2 gene, whereas differential histone retention sites located upstream of the adjacent noncoding regions of the H19 gene are hypomethylated in comparison with the control group [29].

The second challenge pertains to the way epigenetic modifications are interpreted by the organism depending on tissue type, age, and the context in which the modification occurs. For example, histone 3 lysine 9 trimethylation (H3K9me3) is recognized by the transcription system as repressive when it is not merely bound to heterochromatin at individual sites but affects chromatin packaging globally within a cell [30], or when it is located in a gene promoter. However, H3K9me3 is also found in the bodies of actively transcribed genes [31]. DNA methylation inhibits transcription when it occurs in a gene promoter and has the opposite effect when it occurs in the gene body, which is characteristic of actively transcribed genes [20]. Besides, patterns of nucleosome positioning [32] and DNA methylation detected at intron-exon boundaries are different [33]. So, epigenetic modifications can affect the choice of splicing pathways and modulate the functions of the synthesized protein. Thus, transcriptome analysis is essential in developing a model of epigenetic modifications.

The dynamic nature of the epigenetic landscape, which transforms throughout the cell cycle, makes the analysis more complicated. At the same time, epigenomic signatures can be retained long after the causative factor has been removed [34]. This property of epigenomic signatures has given rise to an intrauterine growth restriction (IUGR) paradigm: a past event induces epigenetic changes that transform cellular memory into phenotypic consequences. The increased risk of morbidity and type 2 diabetes at older age long after the exposure to a toxic agent speaks in favor of this hypothesis [16, 35]. A caloric deficit in the uterus is presumed to evoke an adaptive response, causing the embryo to reorganize its metabolism in order to accumulate more calories; this adaptation becomes harmful once the baby is born and has access to a balanced diet [36].

Another problem that complicates the analysis arises from the non-linear interplay between several metabolic pathways affected by a pollutant. For instance, bisphenol A directly interacts with S-adenosyl-methionine and at the same time modulates miRNA-29 expression via estrogen receptors [37]. This results in the decreased expression of DNA methyltransferases and the elevated expression of histone methyltransferase EZH2 implicated in repressive histone modification [38]. This means that the cumulative effect of all changes happening to the methylome is hard to predict.

Outside the laboratory, organisms are exposed to a medley of pollutants, which produce an unpredictable interplay of effects, complicating the analysis of real populations vs. model objects. This problem can be solved by using data on the epigenetic modifications that are caused by known pollutants and produce known effects [39].

Biological models for genomic and epigenomic analysis

A high-quality study of the epigenome must adhere to the fundamental principles of toxicologic research, including proper dosing, injection routes and the duration of toxic exposure [40, 41].

Epigenetic deregulation events are traditionally considered to be somatic; therefore, epigenome studies should be carried out on cells in which genetic, epigenetic and phenotypic changes are detectable and distinct. This poses a serious difficulty for human studies because they can only rely on small biopsy specimens. Besides, even within one tissue specimen collected from a living organism, cells may be in different states and affect each other.

The available biological models can be classified into three major groups. The first group is represented by cell cultures. Primary cell cultures collected during animal/human biopsies are very close to living organism cells in terms of their epigenome and transcriptome; however, primary cell cultures are fastidious and can undergo only a limited number of passages. Besides, the stability of their methylome cannot be maintained without synthetic organoids that require a lot of time and resources to grow. Cancer cells are less capricious and can survive over 200 passages [18]. However, their epigenome, transcriptome and sometimes genome (unstable number of chromosomes) significantly differ from those of in vivo healthy tissue; so, extrapolating the characteristics of healthy cell methylomes from the methylome of cancer cells is unlikely to be reliable. Primary cultures immortalized by viruses [42] are a tradeoff: they do not differ drastically from conventional primary cultures in their metabolism, can undergo an infinite number of passages and are easy to maintain. Another solution lies in the use of primary cultures obtained from embryonic or inducible stem cells.

The second group includes animal models. Epigenetic modifications are known to bring about the same effects in model mammals and humans [43], i.e. the results of a murine study can be extrapolated to humans. Advantageously, animal models allow exploring the inherited effects of pollutants [44–47]. Yellow agouti (Avy) mice are a great example of an animal line whose phenotype correlates directly with DNA methylation levels. However, sometimes these animals do not respond to a known epimutagen used as positive control [48, 49]. The zebrafish (Danio rerio) is another popular model object: it breeds rapidly, allowing researchers to study the inheritance of epigenetic marks within a short time [50]. The genome of Danio rerio has been fully sequenced, so its changes are easy to track. Mechanisms underlying epigenetic regulation in these fish differ only slightly from those in mammals [51]. Zebrafish embryos are a successful model for studying the toxic effects of pollutants at early developmental stages.

The third group comprises cell cultures that are generated by animals throughout their lives and can be obtained without killing the animal. For example, the methylome of parental reproductive cells can be used to assess susceptibility to disease in offspring [52].

Candidate drugs against diseases caused by pollutants

The main therapeutic strategy against epigenetic disorders includes the following steps: removing the detrimental factor and neutralizing its residual effects. Often, prescribing a therapeutic diet is enough. For instance, the demethylating effect of BPA can be compensated for by ingesting foods rich in methyl donors (folic acid and vitamin B12). Natural and synthetic chemotherapeutics are being increasingly used to reverse epigenetic modifications associated with cancer [53]. They usually act as inhibitors of DNA methyltransferases and histone deacetylases. For example, green tea polyphenols (GTP) and epigallocatechin gallate (EGCG) were shown to inhibit DNMT activity and expression; thus, GSTP1 [54] and the tumor suppressor gene RARβ2 were reactivated, which led to the inhibition of proliferation of esophageal cancer cells [55], breast cancer cells [56] and lung cancer cells [57] in model cell cultures and mice. On the one hand, the anti-cancer effects of the listed compounds have been proved; on the other hand, genome-wide demethylation may reactivate genes whose activity per se may have serious side effects.

Improved selectivity of synthetic drugs targeting the enzymes implicated in epigenomic regulation is an important research goal. N-hydroxy-N'-phenyloctanediamide, which has been approved in the USA for treating cutaneous T-cell lymphoma [58] and thyroid cancer [59] and is available on the Russian market as Vorinostat or Zolinza, inhibits class I and II HDACs but does not affect class III HDACs. Romidepsin, also known as Istodax, has a similar effect. Another promising chemical is DIM (3,3'-diindolylmethane), which selectively inhibits class I HDACs and thereby leads to the increased transcription of p21 and p27 (genes coding for cyclin-dependent kinase inhibitors) [60], cell cycle arrest at the G2/M phase, inhibition of papillomavirus-associated neoplastic growth [61], induction of apoptosis in breast cancer cells [62], and inhibition of prostate cancer growth [63]. DIM has the potential to prevent acute radiation syndrome caused by man-made disasters and radiation therapy and to alleviate its symptoms [64]. DIM precursors have therapeutic potential, too. For example, indole-3-carbinol (I3C) can regulate methylation levels in the promoter region of the p16INK4a gene in a dose-dependent manner [65] and arrest cell division. I3C suppresses production of estrogen mediators and therefore can be used to mitigate the course of some autoimmune diseases [66]. On the other hand, the overuse of I3C poses a risk of endocrine disorders. Another group of drugs that are currently undergoing clinical trials is represented by histone methyltransferase (HMT) inhibitors. One of them, tazemetostat, blocks the methyltransferase EZH2 [67]. Pinometostat, another member of this group, inhibits the HMT DOT1L [68], and GSK3326595 inhibits protein arginine methyltransferase 5 (PRMT5) [69].

Summing up, drugs that target the methylome and have already been approved for use in a clinical setting modulate the level of DNA methylation either across the entire genome [70, 71] or by inhibiting one particular enzyme [72] involved in methylation. They are intended for symptomatic treatment of progressing diseases but cannot correct epimutations.

Prospects of genetic and epigenetic therapy

Development of de novo drugs that can penetrate into the cell nucleus, selectively bind to a specific DNA locus and recruit or carry enzymes regulating DNA methylation is the most promising area of drug research. Systems for targeted genome editing are thought to have the greatest potential. Initially, hopes were pinned on endonucleases with engineered DNA recognition domains, namely zinc finger nucleases (ZFN) and TAL effector nucleases (TALEN) [73]. Later it became clear that each target site requires a unique protein to be synthesized, resulting in increased costs. The more versatile CRISPR/Cas9 system is based on the immune system of bacteria that specifically recognizes nucleotide sequences typical of viruses. This protein complex can be modified to disable its endonuclease activity and incorporate an RNA molecule responsible for the recognition of the target site, which yields an RNA-guided DNA-binding protein. Using genetic engineering techniques, the modified complex can be equipped with an enzyme exerting an intended effect on the epigenetic mark. With short Cas9 molecules it becomes possible to package the enzymatic complex into adeno-associated viral particles and thus integrate it into a recipient’s genome. Cpf1 is another promising endonuclease: it is smaller than the CRISPR/Cas9 complex but exerts similar activity. However, its potential is yet to be investigated.
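Recognition of a target site by the guide RNA can be illustrated with a short search routine. This is a minimal sketch, assuming the widely used Streptococcus pyogenes Cas9, which requires a 20-nt protospacer immediately followed by an NGG motif (PAM). These specifics are not given in this review, only the forward strand is scanned, and the function name is hypothetical.

```python
import re


def find_cas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for forward-strand sites followed by NGG."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


example = "A" * 20 + "TGG" + "C"
for start, protospacer, pam in find_cas9_targets(example):
    print(start, protospacer, pam)
```

A practical design tool would also scan the reverse complement and score off-target matches across the genome; both steps are omitted here.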

Conclusion

Genetic and epigenetic changes are interrelated. Under certain conditions, replication/transcription enzymes recognize an epigenetic mark as a different nucleotide, which poses a risk of mutations. Epigenetic modifications can interfere with DNA repair by suppressing the expression of proteins involved in this process. In turn, genetic aberrations can disrupt the normal functioning of epigenome editing systems.

The analysis of epigenetic effects of pollutants poses a more serious challenge than genetic analysis due to the varied nature of epigenetic tags, their plasticity, the context in which they occur and the complex interplay of transcriptional regulatory pathways. Applied epigenetics requires a systemic approach. Bioinformatic projects may be very useful in systematizing epigenomic data.

Most methods of studying epigenetic marks rely on the analysis of the most common covalent modifications of DNA (methylation) and histones (methylation and acetylation); the methods of DNA isolation and epigenome analysis are modifications of similar methods used in genomic studies and involve detection of modified sites.

There are many limitations impeding the study of pollutant-associated effects on human genomes and epigenomes. The list of model objects exploited to investigate and predict the detrimental effects of pollutants includes cell cultures from organs and tissues, embryonic stem cells, embryonic tissues and model animals, like mice, rats and Danio rerio fish.

Most of the currently available epigenetic drugs only alleviate the symptoms of epigenetic disorders. Research focus is placed on the targeted editing of pathogenic epigenetic sites.
