Abstract
How cells partition the genome into active and inactive genes and how that information is established and propagated during embryonic development are fundamental to maintaining the normal differentiated state. The molecular mechanisms of epigenetic action and cellular memory are increasingly amenable to study primarily as a result of the rapid progress in the area of chromatin biology. Methylation of DNA and modification of histones are critical epigenetic marks that establish active and silent chromatin domains. During development of the kidney, DNA-binding factors such as Pax2/8, which are essential for the intermediate mesoderm and the renal epithelial lineage, could provide the locus and tissue specificity for histone methylation and chromatin remodeling and thus establish a kidney-specific fate. The role of epigenetic modifications in development and disease is under intense investigation and has already affected our view of cancer and aging.
Implicit in the structure of the DNA double helix proposed by Watson and Crick was the heritable nature of the genome during cell division, for one complementary strand served as a template to replicate the other. Once the genetic code was revealed as triplets of nucleotides transcribed into messenger RNA and translated into proteins, the central dogma of molecular biology was complete. Thus, genetics, which predated the discovery of DNA and concerned itself with the abstract concepts of heritability and phenotype, was explainable in more biochemical terms as genes were assigned to blocks of DNA sequence and alleles became variants of such sequences. Despite the remarkable progress made in deciphering the meaning of the genome, certain phenomena remained outside the realm of classical genetics and the DNA-centric view of inheritance. These epigenetic phenomena have received much attention in recent years because they have a direct impact on our understanding of genome regulation, the differentiated state, and the stability of cellular identity in many biologic systems.
WHAT IS EPIGENETICS?
The term epigenetics, most broadly defined, refers to a type of inheritance that is not encoded directly within the DNA sequences of genes. Such heritability can be through the germ line, as from parents to offspring, or from a single mother cell to its progeny during mitosis. In humans, classic examples of epigenetic phenomena include maternal and paternal imprinting, in which a disease phenotype is differentially expressed depending on whether the allele is inherited from the mother or the father. For example, deletions near chromosome 15q11 are associated with Prader-Willi syndrome, when inherited from the father, yet manifest a very different spectrum of phenotypes, Angelman syndrome, when inherited through the mother. These types of epigenetic imprinting effects likely arose with the evolution of placental mammals1 and underscore a fundamental difference between maternal and paternal genomes. That haploid genomes are not equivalent was also addressed in an elegant series of nuclear transplant experiments by Solter (for review see reference2). Mouse zygotes containing two maternal genomes developed to midgestation but lacked extraembryonic tissues, whereas zygotes with two paternal genomes generated mostly extraembryonic tissues. These classic experiments demonstrate the need for both maternal and paternal genomes in mammalian development. Furthermore, the early embryo must be able to distinguish two different haploid genomes through some epigenetic mechanism and subsequently erase or reprogram these epigenetic recognition marks once the genomes pass through the new germline.
More recently, the study of epigenetics has concerned itself with inheritance not just through the germline but also in somatic cells during embryonic development. Development can be thought of as a series of decisions that restrict the fate of rapidly dividing cells in response to positional information. Although embryonic stem cells and cells of the epiblast are truly pluripotent, once gastrulation occurs, the three germ layers become restricted in their eventual fates. Even though the cells are still proliferating rapidly and do not express many differentiated markers, such restriction is generally stable and inherited in all progeny cells. As cells of the neural ectoderm, mesoderm, and endoderm continue their differentiation, cell fates become increasingly restricted. The stability of the terminally differentiated genome is best illustrated by the difficulty of cloning by somatic cell nuclear transfer. When a nucleus from a specialized adult cell type is introduced into the enucleated zygote, the epigenetic marks accumulated within that differentiated nucleus are not easily erased. Although Dolly the sheep3 first proved that epigenetic marks can be erased and the somatic genome reprogrammed within the cytoplasmic environment of a fertilized egg, such reprogramming is a rare and inefficient process.
The loss of pluripotency and the heritability of a differentiated state imply the genome is modified to remember patterns of gene expression in all progeny cells. Dosage compensation by inactivation of the X chromosome in female cells is a well-studied example of a mitotically inherited epigenetic phenomenon that controls gene expression patterns.4 Strikingly, inactivation of the X chromosome is coincident with the loss of pluripotency as the mammalian epiblast undergoes gastrulation. Although the initial decision to inactivate the maternal or paternal X seems to be random, once the decision is made, all daughter cells from a single founder will have the same X chromosome inactivated. The inactive X chromosome is tightly packaged and easily identifiable as a Barr body, yet in subsequent rounds of mitosis, it must be unpackaged, replicated, and repackaged once again, suggesting that the silent X is marked through an epigenetic mechanism that distinguishes it from its active counterpart.
Because the pattern of gene expression defines the differentiated state, the heritability of such patterns must be well regulated. On the autosomes, active and inactive genes are present, sometimes in close proximity, and must be recognized by positive and negative trans-acting factors. Does the complement of regulatory proteins define the differentiated state, or are these active and inactive genes somehow marked, like the X chromosomes? As a result of many recent breakthroughs in the fields of chromatin biochemistry and structure, these questions and more are now being addressed in a variety of experimental systems.
BIOCHEMISTRY OF EPIGENETICS
The study of epigenetic phenomena has presupposed that genes can be marked somehow by modifications that are independent of the primary nucleotide coding sequence. Such marks include the covalent modification of DNA nucleotides directly, specifically the methylation of cytosine in CpG di-nucleotides. Modifications are also found in the proteins tightly associated with DNA. Eukaryotic chromatin consists of the double-stranded DNA helix and associated proteins. Histones are small, highly basic proteins that are tightly bound to the DNA. As a family, the histones are extremely well conserved from yeast to humans and are among the most abundant proteins in the cell. The basic unit of chromatin is the nucleosome, which consists of approximately 147 bp of the DNA helix wrapped around a histone octamer containing two copies each of histones H2A, H2B, H3, and H4 (Figure 1). Individual nucleosomes are separated by a spacer, or linker, region that is bound by histone H1 or H5. The primary structure of chromatin is often referred to as “beads on a string” because of the appearance of the nucleosome and spacer sequences under the electron microscope; however, this relatively unfolded form of chromatin can assume higher order structures to compact the genome even further (Figure 1). The folding and unfolding of chromatin is a dynamic and regulated process that is necessary for DNA replication but can also have a direct impact on the accessibility of genes to transcription machinery.
Figure 1. The dynamic structure of chromatin. The basic unit of chromatin is the nucleosome, a histone octamer represented by the tan-colored balls on the 11-nm string. The DNA helix is wrapped around the nucleosome, and a spacer region, bound with histone H1 or H5, separates adjacent nucleosomes. The histone tails extend out of the nucleosome and can be modified by methylation or acetylation. The beads-on-a-string can condense into a solenoid, a theoretical structure thought to be approximately 30 nm in diameter. Increased condensation leads to a chromatin fiber that is several hundred nm in diameter. The mitotic chromosome is the most compact form of chromatin in the cell. Chromatin remodeling refers to the sliding of nucleosomes along the string, the compaction of nucleosomes into higher order structures, and the unfolding of higher order structures into accessible chromatin.
The methylation of DNA at CpG di-nucleotides was among the first genome modifications described and remains an attractive mechanism for epigenetic inheritance of cell identity.5–7 During embryogenesis, the genome undergoes dramatic changes in CpG methylation and requires the function of de novo DNA methyltransferases Dnmt3a and Dnmt3b for postimplantation development. Hypermethylation is found on the inactive X chromosome and generally correlates with inactive genes on the autosomes. Many regulated promoters have CG-rich sequences upstream of and around the transcription start site. In general, these so-called CpG islands show hypomethylation when genes are expressed yet can convert to a hypermethylated state upon inactivation. Because DNA replication is semiconservative, the pattern of CpG methylation can be inherited, whereby maintenance DNA methyltransferases modify the newly replicated strand on the basis of the pattern of CpG methylation in the template. DNA methylation as a gene-silencing mechanism was further supported by identification of proteins with methyl-CpG binding domains. Such methyl-CpG binding proteins can act as adaptors linking the methylated CpG islands to nucleosome and chromatin remodeling factors such as histone deacetylases.8–10
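The logic of maintenance methylation after semiconservative replication can be illustrated with a toy model. The Python sketch below is an illustration only and makes no claim about real enzyme kinetics or fidelity: the function names (`replicate`, `maintain`, `divide`) and the encoding of a strand as a list of booleans, one per CpG site, are assumptions introduced here and do not come from the cited studies.

```python
from typing import List, Tuple

# Hypothetical encoding: one boolean per CpG site, True = methylated.
Strand = List[bool]
# A duplex is (template strand, partner strand).
Duplex = Tuple[Strand, Strand]


def replicate(duplex: Duplex) -> List[Duplex]:
    """Semiconservative replication: each parental strand becomes the
    template of one daughter duplex and is paired with a newly
    synthesized strand that carries no methylation."""
    strand_a, strand_b = duplex
    return [
        (list(strand_a), [False] * len(strand_a)),
        (list(strand_b), [False] * len(strand_b)),
    ]


def maintain(duplex: Duplex) -> Duplex:
    """Idealized maintenance methyltransferase: methylate the new strand
    at every site where the template strand is methylated."""
    template, nascent = duplex
    restored = [t or n for t, n in zip(template, nascent)]
    return (template, restored)


def divide(duplex: Duplex) -> List[Duplex]:
    """One cell division: replication followed by maintenance."""
    return [maintain(d) for d in replicate(duplex)]


if __name__ == "__main__":
    # Arbitrary example pattern, symmetric on both strands.
    parent: Duplex = ([True, False, True, True], [True, False, True, True])
    for daughter in divide(parent):
        print(daughter)
```

In this idealized model, both daughter duplexes carry the parental pattern on both strands, whereas skipping `maintain` leaves each daughter hemimethylated, so the mark would be diluted over successive divisions.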
The complexity and types of modifications that can affect the histone octamer have been studied in great detail and form the basis of the histone code hypothesis.11 More recently, the dynamic nature of many histone modifications (Figure 2), whose mechanisms of heritability are still unclear, has prompted a more careful reexamination of the significance and impact of individual histone marks with respect to gene function.12 From an epigenetic viewpoint, the most important modifications are methylation and acetylation of the histone tails, which extend out of the nucleosome and are thus accessible to chromatin remodeling and/or transcription regulatory proteins. Acetylation of histones H2A, H2B, H3, and H4 strongly correlates with transcription activation in almost all experimental systems. Furthermore, histone deacetylases mediate gene repression and compaction into heterochromatin.13,14 In contrast, methylation of histones can correlate with gene activity or gene silencing, depending on which specific residues are modified. Within the amino-terminal tails, lysine residues available for methylation include K4, K9, K27, and K36 of histone H3 and K20 of histone H4. For example, methylation of H3 K9 by the mammalian histone methyltransferase (HMT) Suv39h15 correlates with silent chromatin,16 whereas actively expressed genes are enriched in di- and trimethylated H3K4.17 The conserved SET domain, first described in the yeast protein Set1,18 is the catalytic domain for methyltransferases and is found in more than 50 potential mammalian proteins. Among these are the mammalian ALL-1,19 ALR,20 and MLL21 genes, the homologues of the Drosophila trithorax group of epigenetic regulators.
Figure 2. Summary of histone modifications. The amino acid sequences of the core histone tails are shown, and the residues that are subject to epigenetic modifications are indicated. Of particular significance are mono-, di-, or trimethylation (M) at arginine or lysine residues; acetylation (A) at lysine residues; and phosphorylation (P) at serine residues. Some amino acids, such as lysine 9 of histone H3, can be subject to either acetylation or methylation, but not both.
How does methylation at specific residues alter chromatin structure? One hypothesis is that these modified histone tails interact with specific proteins. Indeed, the highly conserved chromodomain, found in some chromatin remodeling factors, is able to recognize methylated lysine residues to promote heterochromatin formation and gene suppression.22–24 Conversely, the WDR5 protein binds to dimethyl-K4 of histone H3 and is responsible for promoting trimethylation as a mark for active gene expression.25 Furthermore, histone H3 methylation at K4 can provide docking sites for chromatin remodeling proteins to establish and maintain open chromatin configurations and potential access for other transcription factors and RNA polymerase.26
HISTONE MODIFICATION IN DEVELOPMENT
The realm of biomedical research encompasses a multitude of disciplines and organisms, spanning everything from prokaryotic biochemistry to human genetics and clinical practice. Thus, it is particularly satisfying when seemingly disparate fields converge to define new paradigms that revolutionize how we think about an old problem. That is precisely what happened when the field of chromatin biochemistry met developmental genetics.
Development and differentiation require the partitioning of the genome into active and inactive domains, or loci. This genomic organization must be stably inherited through many rounds of cell division such that cellular memory is preserved during development. Genes that control such epigenetic memory in a complex metazoan were first discovered in the fruit fly, Drosophila, as enhancers or suppressors of homeotic gene expression during early pattern formation.27,28 Genes that prevent inappropriate activation or suppression of homeotic genes fall into the polycomb group of epigenetic regulators, whereas the trithorax group functions to maintain homeotic gene activity. Although many individual genes were identified in screens for modifiers of homeotic gene activity, the biochemical function of the respective proteins and their effects on chromatin were unknown. Meanwhile, investigators using such diverse model systems as Tetrahymena and yeast discovered histone methylation as an essential component of gene silencing and heterochromatin formation. Histone H3K9 methylation correlated with the inactive chromatin in yeast15,29 and was later found as an early epigenetic mark on the mammalian X chromosome.30 The importance of histone methylation as a developmental epigenetic mark became clear when a polycomb protein complex was shown to have H3K9 methyltransferase activity and contain not only the SET domain enzymes but also numerous co-factors whose activity had been described in yeast. Enhancer of Zeste and Extra Sex Combs are polycomb group proteins that form a complex capable of methylating histone H3K9 and H3K27 to silence expression.31,32 These studies linked the genetics of polycomb group genes to biochemical activity on silent chromatin.
Similarly, the trithorax group proteins TRX and Ash1 contain SET domains and have histone methyltransferase activity that methylates H3K4, a mark that correlates with gene expression.19,21,33 Epistasis experiments (one gene inhibiting another) in the fly using mutants in TRX and the polycomb group (PcG) suggested that the trithorax group genes function not as classical activators but rather as suppressors of silencing by the polycomb genes.34 Although many issues still need to be addressed, one interpretation of such experiments is that repression is a default state that will ultimately win out if active epigenetic marks are not maintained.
If active or repressed chromatin requires positive or negative histone methylation marks, then what is the status of chromatin in undifferentiated, pluripotent cells? This question is being addressed systematically in embryonic stem cells. So far, the results have been surprising. In undifferentiated mammalian embryonic stem cells, many important regulatory genes seem to possess neither fully active nor fully inactive histone modifications; rather, they contain some elements of both. This so-called bivalent epigenetic mark is thought to provide plasticity in the stem cell, because these genes are poised to assume either active or inactive chromatin states depending on which direction the cells differentiate.35,36 Thus, as stem cells differentiate along lineage pathways, more and more genes assume epigenetic marks specific for a particular cell type. Such a model would require interactions of epigenetic imprinting enzymes with DNA binding proteins that recognize specific genes at precise times and respond to positional information in the embryo. Unfortunately, few such factors have been described. Thus, linking changes in cell type–specific epigenetic patterning to the essential regulatory proteins that determine cell lineages, as defined genetically through mutant analysis, remains to be fully realized.
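The idea of a bivalent mark can be expressed as a simple decision rule. The following Python sketch is a hedged illustration, not an analysis method from the studies cited above: it assumes that each gene is summarized by two yes/no observations, the presence of an activating H3K4 methylation mark and of a repressive H3K27 methylation mark, and the function name `chromatin_state` and the gene labels are hypothetical.

```python
def chromatin_state(h3k4_methylated: bool, h3k27_methylated: bool) -> str:
    """Classify a locus from two histone marks (simplified assumption:
    each mark is either present or absent)."""
    if h3k4_methylated and h3k27_methylated:
        return "bivalent"   # elements of both active and inactive chromatin
    if h3k4_methylated:
        return "active"
    if h3k27_methylated:
        return "repressed"
    return "unmarked"


if __name__ == "__main__":
    # Hypothetical loci; the labels and values are placeholders, not data.
    loci = {
        "gene_A": (True, True),
        "gene_B": (True, False),
        "gene_C": (False, True),
    }
    for name, (k4, k27) in loci.items():
        print(name, chromatin_state(k4, k27))
```

Under this simplification, differentiation corresponds to a bivalent locus losing one of its two marks and resolving to either the active or the repressed state.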
Genetic studies of flies first pointed to the stability and heritability of epigenetic modifications during development. The principle of genome stability in differentiated cells was further underscored by the difficulty in cloning mammalian embryos by somatic cell nuclear transfer; however, recent advances in histone biochemistry and stem cell technology suggest that, given the right circumstances, the genome may ultimately be more amenable to epigenetic reprogramming than previously thought. Several key breakthroughs have altered prevailing dogma. First, the identification of histone demethylases clearly suggests that, like acetylation, specific nucleosome methylation marks can be erased.37–39 Second, the ability to reprogram somatic cells into embryonic stem cells by introducing a limited set of genes that were known to maintain pluripotency dramatically shifts both the scientific and ethical boundaries in the stem cell field.40,41 In fact, histone demethylation of the repressive H3K9 epigenetic mark is a key event in stem cell self-renewal,42 suggesting that general inhibition of gene suppression mechanisms is critical for maintaining pluripotency.
KIDNEY DEVELOPMENT
During the past 40 years, progress in understanding the genetic basis of kidney development has been remarkable.43,44 Early kidney development, which began as a model system for epithelial-mesenchymal inductive interactions, has now become a paradigm for organogenesis, epithelial cell differentiation, branching morphogenesis, and complex patterning events. The adult kidney is composed of many specialized epithelial, endothelial, and stromal cell types; however, the functional components of the kidney, the renal epithelial cells, share a common lineage that is specified quite early in development. This lineage is first apparent shortly after gastrulation, in a region of mesoderm called intermediate, because it lies between the paraxial, or somitic, mesoderm and the lateral plate mesoderm along the mediolateral axis (Figure 3).
Figure 3. The kidney is specified from intermediate mesoderm. (A) A cross-section through a mammalian embryo shortly after gastrulation. The mesoderm becomes specified into paraxial, intermediate, and lateral plate. The paraxial mesoderm makes somites, whereas the intermediate mesoderm will make the nephric duct and metanephric mesenchyme. Bone morphogenetic protein (BMP) signals derived from the lateral plate are thought to promote expression of intermediate mesodermal markers, such as Pax2/8 and Osr1. Axial signals (AS) may counteract the BMP effect to suppress intermediate markers in the paraxial mesoderm. (B) A longitudinal view of the intermediate mesoderm shortly before kidney induction; anterior is at the top. The nephric duct is a bilateral epithelial tube from which mesonephric tubules are induced more rostrally and the ureteric bud grows out at the posterior aspect. Adjacent to the posterior nephric duct is the metanephric mesenchyme, an aggregate of cells already programmed to generate renal epithelia. (C) As the ureteric bud invades the mesenchyme and undergoes branching morphogenesis, inductive WNT signals from the bud induce the mesenchyme to aggregate around the tips, the cap mesenchyme, and become polarized to form a primitive epithelial renal vesicle. Each renal vesicle will generate a single nephron and reconnect to the branching ureteric bud, which generates the collecting duct system.
What defines the intermediate mesoderm and how does it arise? This question is fundamental to our understanding of renal epithelial cell lineage determination, because the fate decisions made in the mesoderm as it becomes more specialized may be irreversible. For example, by embryonic day 10.5 in the mouse, the posterior aspect of the intermediate mesoderm consists of an aggregate of cells called metanephric mesenchyme. This mesenchyme is awaiting inductive WNT signals from the ureteric bud,45 which will promote aggregation and mesenchyme-to-epithelial conversion; however, these inductive signals, normally provided by the ureteric bud, can come from many other tissues, such as spinal cord, WNT1-expressing cells, or even LiCl, an activator of the canonical WNT signaling pathway. The idea that metanephric mesenchyme is already programmed to make only renal epithelial precursors prompted Saxen44 to propose the concept of a permissive inductive signal, rather than an instructive signal that reprograms the fate of the mesenchyme. The metanephric mesenchyme may be epigenetically programmed to respond to WNT signals in a limited way, such that only a certain type of epithelia and stromal derivatives can arise. Thus, cell fate decisions for making renal epithelia have already begun and can likely be traced back to specialization of the intermediate mesoderm after gastrulation.
Genetic mutations in the mouse and tissue manipulation in the chick embryo have helped to define some critical components of intermediate mesoderm identity and, by inference, metanephric mesenchyme specification. The DNA-binding proteins Pax2, Lim1, and Osr1 (odd skipped related) all are required and in some cases sufficient to specify intermediate mesoderm from surrounding lateral plate and paraxial mesoderm. The epistatic interactions among these three genes are still unclear. In the mouse, Pax2 is still expressed in Lim146 or Osr147 mutants, although Lim1 is not expressed in Pax2/8 double mutants.48 At the time of mesodermal specialization, ectopic expression of Pax2 can expand the domain of intermediate mesoderm in the chick embryo.48 In the frog embryo, Pax2 together with Lim1 can also make pronephric tissue,49 although Osr1 has a similar activity all by itself.50 The activation of Pax2/8, Lim1, and Osr1 genes must be position dependent and require local environmental cues. James and Schultheiss51,52 suggested that low concentrations of bone morphogenetic proteins diffusing from the lateral plate mesoderm promote Osr1 and Pax2 expression, hence intermediate mesoderm formation, whereas axial signals may inhibit bone morphogenetic proteins to suppress the intermediate phenotype. Thus, the Pax2/8 expression domain could be initiated at the boundary between axial and lateral plate mesoderm by the actions of two opposing gradients.
The intermediate mesoderm is initially defined by its position along the mediolateral axis; however, as development proceeds, it is clear that anterior-posterior (A-P) patterning is necessary for the derivatives of the intermediate mesoderm to assume their appropriate fates. For example, more anterior intermediate mesoderm generates the mesonephric tubules, whereas the more posterior metanephric mesenchyme generates the definitive kidney. Such A-P patterning is likely to involve the homeotic or Hox genes, which are known to specify segment identity in the fly and A-P position in the mouse. Indeed, loss of the Hox11 group of genes results in complete renal agenesis as a result of the suppression of more posterior markers such as GDNF and Six2.53 The intersection among genes specifying the mediolateral axis and the A-P axis may provide a unique molecular address to position the metanephric kidney. This idea is supported by recent data on the regulation of Six2 by a complex of proteins that includes Hox11, Pax2, and Eya1 in the metanephric mesenchyme.54
The metanephric mesenchyme is already programmed to generate renal epithelia. Whether this restriction in fate is determined in part by epigenetic changes at specific loci that act to limit the developmental potential of these renal stem cells remains to be determined. Recently, the Pax2 protein was linked to a histone methyltransferase complex containing the trithorax homologues ALR/Mll2 and Mll3 through interactions with the adaptor protein, PTIP.55 Originally identified because of its interaction with Pax proteins,56 PTIP contains multiple BRCT domains, at least one of which is a phospho-serine–binding domain.57 Several laboratories have independently shown that PTIP is part of a histone H3K4 methyltransferase complex.58,59 Because Pax2 can be serine/threonine phosphorylated in response to Wnt signals,60,61 the interaction with PTIP may promote H3K4 methylation at kidney-specific loci in response to inductive signals. Although PTIP is not kidney specific, it may serve as an adaptor protein linking Pax2 and other tissue-specific DNA-binding proteins to the epigenetic machinery as important cell fate decisions are being made (Figure 4). Consistent with this idea, PTIP mutants are postgastrulation lethal and show a global reduction in H3K4 methylation.55,62 These data suggest that Pax2/8 function to provide locus and tissue specificity for epigenetic cues to restrict the developmental potential of the intermediate mesoderm to the renal lineage.
Figure 4. A model of Pax2-mediated gene activation in the metanephric mesenchyme. (A) The Pax2 protein recognizes a DNA-binding site and becomes phosphorylated (*), perhaps in response to WNT signals. (B) The nuclear adaptor protein PTIP localizes to the Pax2 protein and recruits the Mll2 histone methyltransferase complex to promote trimethylation at H3K4. (C) Trimethylation of H3K4 promotes nucleosome remodeling through interactions with BPTF, a subunit of the NURF complex. Pax2 and the Mll2 complex are no longer required at the binding site. Remodeling attracts the RNA polymerase II complex, and gene activation commences.
EPIGENETICS AND RENAL DISEASE
Although epigenetic phenomena in disease have been widely investigated, I believe there are two specific areas in which chromatin modifications play a particularly important role. In aging and cancer, the stability of the differentiated state must be determined, at least in part, by the inherent memory of the genome within the mature cell. The hypermethylation of promoter regions in cancer cells can lead to the inactivation of tumor suppressor genes.63 To counteract this effect, inhibitors of DNA methylation have been developed for treating leukemias and other hematologic malignancies; however, a longstanding issue in this field has always been whether DNA methylation is the cause of gene inactivation or is merely the result of inactivation. At least in one case, chromatin inactivation as measured by changes in histone methylation patterns seems to precede DNA methylation at an important tumor suppressor locus.64 There are many other indications that aberrant histone methylation is oncogenic, including translocations of the H3K4 methyltransferase ALL/MLL1 in acute lymphocytic leukemia65 and the overexpression of the H3K27 methyltransferase EZH2 in prostate cancer.66 Furthermore, overexpression of the histone demethylases LSD1, the Jumonji domain–containing protein 2, and Jumonji interacting protein PLU-1 correlates with prostate, esophageal, and breast cancer.67 Thus, the balance between histone methylation and demethylation is likely to be critical for maintaining the differentiated state in many adult tissues.
Chronic renal disease is in part a function of age. In the glomerulus, aging podocytes exhibit altered morphology and patterns of gene expression.68,69 In mammals, reduced levels of DNA methylation correlate with age,70 suggesting that repression of gene expression may be lost over time. Loss of epigenetic repression has also been observed in aging mice at X-linked and imprinted loci.71 The potential maintenance functions of histone and DNA methyltransferases in aging cells are an area of research that will need to be explored further if the stability of the epigenome in aging cells is to be fully understood.
CONCLUSIONS
The study of epigenetics is proving to be fertile ground. How the genome is packaged and replicated in different cell types is fundamental to maintaining cell lineage restriction and differentiated phenotypes in developing and adult mammals. The complex machinery that methylates DNA, modifies histones, and alters chromatin structure is being characterized at the biochemical level. A major challenge ahead is to link this machinery to locus- and tissue-specific factors that drive development and disease in mammalian organ systems, including the kidney.
DISCLOSURES
None.
Acknowledgments
This work was funded in part by National Institutes of Health grants DK073722 and DK054740 to G.R.D.
I am especially grateful to Sanj Patel for sharing preliminary data and many stimulating discussions.
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
© 2008 American Society of Nephrology