Abstract
Mutation-based molecular diagnostics of autosomal dominant polycystic kidney disease (ADPKD) is complicated by genetic and allelic heterogeneity, large multi-exon genes, duplication of PKD1, and a high level of unclassified variants (UCV). Present mutation detection levels are 60 to 70%, and PKD1 and PKD2 UCV have not been systematically classified. This study analyzed the uniquely characterized Consortium for Radiologic Imaging Study of PKD (CRISP) ADPKD population by molecular analysis. A cohort of 202 probands was screened by denaturing HPLC, followed by direct sequencing using a clinical test of 121 with no definite mutation (plus controls). A subset was also screened for larger deletions, and reverse transcription–PCR was used to test abnormal splicing. Definite mutations were identified in 127 (62.9%) probands, and all UCV were assessed for their potential pathogenicity. The Grantham Matrix Score was used to score the significance of the substitution and the conservation of the residue in orthologs and defined domains. The likelihood for aberrant splicing and contextual information about the UCV within the patient (including segregation analysis) was used in combination to define a variant score. From this analysis, 44 missense plus two atypical splicing and seven small in-frame changes were defined as probably pathogenic and assigned to a mutation group. Mutations were thus defined in 180 (89.1%) probands: 153 (85.0%) PKD1 and 27 (15.0%) PKD2. The majority were unique to a single family, but recurrent mutations accounted for 30.0% of the total. A total of 190 polymorphic variants were identified in PKD1 (average of 10.1 per patient) and eight in PKD2. Although nondefinite mutation data must be treated with care in the clinical setting, this study shows the potential for molecular diagnostics in ADPKD that is likely to become increasingly important as therapies become available.
Autosomal dominant polycystic kidney disease (ADPKD) is the most common inherited kidney disease, with an incidence of 1 in 400 to 1000 (accounting for approximately 5% of ESRD), and is characterized by the development and progressive enlargement of cysts in the kidney. ADPKD is genetically heterogeneous, with two genes identified: PKD1 (16p13.3) and PKD2 (4q21).1–4 In linkage-characterized populations, PKD1 accounts for approximately 85% of cases and PKD2 accounts for most of the remainder,5,6 but further heterogeneity is possible.7 PKD1 has an average age at ESRD of 54.3 yr, compared with 74.0 yr for PKD2.8 PKD1 and PKD2 encode polycystin-1 (PC1) and polycystin-2 (PC2), respectively. PC2 is a TRP channel that may be involved in regulating intracellular Ca2+.9,10 PC1 and PC2 interact and, similar to other cystogenic proteins, have been localized to primary cilia.11,12 This complex may act as a flow-dependent mechanosensor that regulates the differentiated state of tubular epithelial cells.13
The diagnosis of ADPKD is typically determined by renal imaging with age-related cyst number criteria established for a diagnosis by ultrasound.14 Computed tomography and magnetic resonance imaging can also quantify renal cystic disease, and a recent trial, the Consortium for Radiologic Imaging Study of PKD (CRISP), showed that magnetic resonance imaging is a reliable means to monitor disease progression through renal enlargement.15,16 Image-based diagnostics is highly reliable in older individuals (>30 yr) but less certain in young adults and can be equivocal in the case of the young, living-related kidney donor. Identification of the ADPKD genes has allowed molecular diagnostics. Linkage studies are of limited utility because of the genetic heterogeneity and common cases with a negative family history. Diagnostics by mutation analysis has been challenging because of the large size and multi-exon structures of the genes, genomic duplication of PKD1, marked allelic heterogeneity, and common missense variants.17
PKD1 has 46 exons and a coding region of approximately 13 kb; the area that contains exons 1 to 33 is duplicated six times more proximally on chromosome 16 (the HG loci).1,2 Because of the high level of homology between PKD1 and the HG, protocols for locus-specific amplification of PKD1 are required for analysis.17 PKD2 has 15 exons and a coding region of approximately 3 kb. Because of these complications, few complete screens of both genes have been described, and detection levels are at 60 to 70%.17 A total of 270 different PKD1 and 73 PKD2 mutations are described in the Human Gene Mutation Database (HGMD), illustrating the high level of allelic heterogeneity; most mutations are unique to a single family.17–19 The majority of changes are predicted to truncate the proteins, although a significant number of missense changes have been described.
Whereas the pathogenicity of frame-shifting, nonsense, typical splicing, or large re-arrangements is usually clear, missense variants, atypical splicing changes, or small in-frame deletions (unclassified variants [UCV]), need further evaluation. The Grantham Matrix Score (GMS)20 has been used to score the significance of amino acid substitutions (the Grantham Distance [GD]) and the conservation of the residue in a multisequence alignment (MSA) of orthologous proteins (Grantham Variation [GV]).21–24 This evaluation, integrated with family, population (contextual), and histopathologic data, has been used to score UCV at BRCA1 and BRCA2 for diagnostics.25 Similar approaches have been applied to the autosomal recessive PKD gene.26
Here, we describe a comprehensive mutation screen and evaluation of UCV in the CRISP ADPKD population. These data are essential to analyze the results of the study,15,27 and similar approaches will be required to interpret future ADPKD therapeutic trials. Although the demand for molecular diagnostics in ADPKD is limited, the development of therapies is likely to transform the situation as a firm diagnosis in young adults before significant renal changes have occurred will be required.10
RESULTS
Mutation Screening
DNA for mutation screening was available from 202 families who had at least one individual enrolled in CRISP. Thirty-two families were multiplexed within CRISP, giving a total of 239 patients. The initial PKD1 and PKD2 screening of the 202 probands by denaturing HPLC (DHPLC) identified 81 definite mutations (nonsense, frame-shifting deletions or insertions, typical splicing, or in-frame changes of five or more amino acids [aa]; defined as mutation group [MG] = A in Table IA). A second round of screening by direct sequencing (DS) of both genes in all patients in whom no definite mutation was detected, plus 29 control subjects (a total of 150), revealed an additional 37 definite mutations. In addition, analysis of a subset by field inversion gel electrophoresis (FIGE) and visualization of PKD1 long-range PCR (LR-PCR) products revealed four large deletion mutations, giving 122 probands with MG = A (Table IA).
In four atypical splicing cases, RNA was available to test the splicing predictions, and in a fifth, IVS43+14del20, skipping of exon 43 had been previously demonstrated.28 In two patients, substitutions close to the end of IVS15 generated novel AG dinucleotides predicted to form novel splice acceptors and cause frame-shifting mutations. Reverse transcription–PCR (RT-PCR) and sequencing confirmed this abnormal splicing (Figure 1A). A similar mutation, IVS37-10C→A, was predicted to abolish the normal acceptor site by weakening the poly-pyrimidine tract (Figure 1B) and RT-PCR showed skipping of exon 38, plus a product that included the final 180 bp of IVS37. In the final case, PKD2 IVS4+3delAAGT, the splice donor site was predicted to be destroyed (Figure 1C). No abnormality was seen with RT-PCR across the region, but amplification with a primer in IVS4 showed that all or part of IVS4 is incorporated into the sequence. These mutations were therefore classed as MG = A, giving a total of 127 families with definite mutations.
Figure 1. Examples of atypical splicing events in PKD1 (A and B) and PKD2 (C). (A, top) Diagram of wild-type and mutant cDNA sequence illustrating the effects of two IVS15 substitutions. Novel sequence included in the transcript is boxed in red. (Bottom) Genomic sequence from these cases. Exonic sequence is in boldface, intron/exon boundaries are indicated with vertical lines, and polypyrimidine tracts are underlined. Substituted nucleotides are in red, and novel acceptor sites and polypyrimidine tracts are shown with red vertical and horizontal lines, respectively. Novel exonic sequence is in blue boldface. (B) Wild-type and mutated sequence of IVS37-10C→A. This substitution weakens the normal polypyrimidine tract (7/10; dashed black line) and eliminates the normal acceptor site. Reverse transcription–PCR (bottom right: M, mutant; N, normal) shows that exon 38 is skipped (the skipped product is also found at a low level normally) and a product including part of IVS37 is also seen. A diagram shows the wild-type and mutant transcripts (bottom left); the genomic sequence of the novel donor site in IVS37 is in red. (C) Diagram of the wild-type and mutant sequence of the PKD2 mutant IVS4+3del4 (top) and mutant cDNA sequence (bottom). This deletion destroys the normal donor site (underlined), and cDNA sequence (bottom) shows that a product including part of IVS4 is present in the mutant.
Scoring of UCV
All UCV that were not previously described as polymorphisms were analyzed and evaluated as possible disease-associated mutations. Those variants that ultimately were scored as probably pathogenic are summarized in Table II. The detailed scoring of UCV is shown in Table II (probably pathogenic), Table III (indeterminate), and Table IV (likely neutral polymorphisms). The algorithm that was used to evaluate the UCV is described in detail in the Materials and Methods section. To illustrate how this algorithm works, two examples are provided here. The PKD1 change G381S is moderately conservative but at a highly conserved site, giving a GD/GV score of +5. The change is not in a conserved domain, and no abnormal splicing was predicted (+0). This change was found in three different families, two of which had no other likely mutation and one had the indeterminate variant R3063C (+5). Segregation was demonstrated in one family with nine affected individuals (+4); and another mutation, G381C, has been described at this residue (+1) giving a variant score (VS) = +15. This is a high score (MG = B) reflecting the recurrent nature of the change, demonstrated segregation, and highly conserved residue that is mutated. In contrast, the same substitution at another site, G3651S, is less clearly pathogenic. The GD/GV score is the same (+5), but it is not in a conserved domain, it is not predicted to change splicing, and segregation has not been demonstrated (+0). No other variant was found in this individual (+2), giving a VS = +7 (MG = C). This lower level of certainty reflects that the substitution is relatively conservative and that the mutation is novel.
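The two worked examples above can be reproduced as a simple sum of component scores. The following is a minimal sketch in Python; the dictionary keys are illustrative labels chosen here, not names used in the study, and the component values are those quoted in the text.

```python
# Component scores taken from the two worked examples in the text.
# The keys are illustrative labels (an assumption), not the authors' terminology.
G381S = {
    "gd_gv_matrix": 5,           # moderately conservative change at a highly conserved site
    "domain_and_splicing": 0,    # not in a conserved domain; no abnormal splicing predicted
    "co_occurring_variants": 5,  # contextual score for the three families, as given in the text
    "segregation": 4,            # segregates in a family with nine affected individuals
    "other_reports": 1,          # G381C described at the same residue
}

G3651S = {
    "gd_gv_matrix": 5,                 # same substitution type, same GD/GV score
    "domain_splicing_segregation": 0,  # none of these contribute
    "co_occurring_variants": 2,        # no other variant found in this individual
}


def variant_score(components):
    """Sum the component scores to give the variant score (VS)."""
    return sum(components.values())


print("G381S VS =", variant_score(G381S))
print("G3651S VS =", variant_score(G3651S))
```

The totals fall in the MG = B and MG = C ranges, respectively, as stated in the text.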
As a result of this analysis, 44 missense, two atypical splicing, and seven small in-frame deletions were defined as probably pathogenic, giving a total of 180 (89.1%) of 202 probands with a probable mutation characterized. In 37 probands, a highly likely mutation (MG = B) was defined, with a likely mutation (MG = C) in another 16 (Table II). The pathologic significance of 22 additional variants was not clear, and they were defined as indeterminate (I; Table III). The remainder of the novel variants were defined as likely neutral polymorphisms and are shown in Table IV.
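The detection levels quoted in this study follow directly from the proband counts. As a quick arithmetic check (a sketch only; the variable names are chosen here for illustration), the counts reported in the text can be tallied as follows:

```python
# Counts reported in the text; variable names are illustrative.
probands = 202
definite = 127        # MG = A
highly_likely = 37    # MG = B
likely = 16           # MG = C

probable = highly_likely + likely   # UCV scored as probably pathogenic
total_defined = definite + probable


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


print(probable, total_defined)
print(pct(definite, probands), pct(probable, probands), pct(total_defined, probands))
```

The three percentages correspond to the definite, probable (UCV-based), and overall detection levels.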
The most important factor in determining whether a missense change was likely pathogenic was the degree to which it was conserved in orthologs and in other proteins with the same domain, with the GD almost always substantially greater than the GV (Table II). This contrasted with the 48 newly described neutral variants (Table IV), where the residue was rarely conserved and the GV was often higher than the GD. Recurrence of a variant in two or more patients with no other clear mutation also strongly supported a pathogenic likelihood. A factor that was important in excluding UCV as likely pathogenic was the finding of a more likely mutation in the same patient (Table IV). Four PKD1 UCV that previously were described as mutations were defined as indeterminate, R1340W and R4276W, or neutral polymorphisms, R324L and R2200C (Tables III and IV). The most interesting polymorphism was a 3-aa deletion (PKD1: 2894delANS) that did not segregate with the disease in a family with a segregating PKD2 missense change, R325Q. Analysis of orthologs in this area (Figure 2A) shows that this region is present only in the human sequence, not in the HG pseudogenes or other mammals. The presence of these three residues is, therefore, the result of a recent duplication in the human sequence. Segregation analysis also helped to identify likely mutations; for instance, in case 380166, two possible changes were found (F3168L and R2477C), but only the former segregated with the disease.
Figure 2. Multisequence alignment illustrating how comparative analysis can inform about the likely pathogenicity of variants. (A) Polycystin-1 (PC1) sequence from CRISP case 187456 (pedigree 120010), mammalian orthologs, as indicated, and an example of the HG sequence, showing the position of the 2894delANS variant. This 3–amino acid region is present just in the human sequence. (B) Alignment of human PC1 and PC2 homologs and sea urchin REJ sequences, as indicated, in the region of the polycystin-A motif (PC-A).29 Four missense mutations, two PC1 (red; R3753W and R3753Q) and two PC2 (blue; R322W and R325Q), affect highly conserved basic residues in this motif. (C) C-type lectin domains of human PC1, low-affinity Ig epsilon Fc receptor (FCER2), macrophage mannose receptor 1 (MRC1), asialoglycoprotein receptor 2 (ASGR2), and C-lectin consensus (Pfam). Mutations are shown in red (A432V, C436Y, and C508R), and indeterminate changes are shown in green (V466L and V466M). (D) Alignment of the PKD repeats of human PC1. The positions of pathogenic changes are shown in red (W967R, Y1412D, G1503R, N1870H, G1999S, and G1999V), indeterminate changes are shown in green (N970K, H1093D, R1340W, S1352N, R1411C, and R1698W), and neutral changes are shown in blue (S903G, N1034S, H1093Y, L1106V, V1339M, R1351W, A1422T, A1516T, H1777P, A1790V, E1811D, and A1871T). The position of the residue changed is shown colored.
Characteristics of Mutations
Mutation analysis of the CRISP population defined 85.0% as PKD1 and 15.0% as PKD2. These values correspond closely to previous studies,5,6 although these are the first data that are based on mutation detection. The breakdown of mutation type in the two genes is summarized in Table 1. Overall, >70% of mutations were predicted to truncate the protein, >80% in PKD2. Although the majority of mutations were novel, 27 had previously been described and seven others were seen more than once in the study; 30% of families had a recurrent mutation (Table 1). The most common mutation in the study was 5014delAG (five [2.5%] families), followed by Q2556X (four [2.0%] families); two truncating and two missense changes were found three times, and six mutations were found twice (see Table I for details).
Table 1. Summary of mutations in the CRISP families
Important Functional Residues in PC1 and PC2
The large number of missense changes identified several residues or regions that were disrupted more than once, suggesting particular functional significance. In the PC-A domain (in the first extracellular loop of PC2 and third extracellular loop of PC129), two basic residues were substituted a total of four times (two PC1 and two PC2; Figure 2B) and four additional times in the literature (see Table I). Several mutations were also seen in the C-type lectin domain, with three highly conserved residues substituted with a nonconservative or, in one case, a relatively conservative change (Figure 2C). Two other interesting substitutions, L4137P and L4139P, are predicted to disrupt the helical structure of the G-protein binding region.30 A clear indication of how conservation within a domain helped determine whether a substitution is likely pathogenic is shown in the PKD repeats (Figure 2D). Likely pathogenic changes were usually nonconservative in nature and at well-conserved residues, whereas the likely neutral or indeterminate changes were most often found at nonconserved residues and were often conservative changes (in some cases matching the residue found in other PKD repeats).
Mutation Location
Mutations were spread throughout both the PKD1 and PKD2 genes with at least three mutations found in each 5% interval across PKD1 (except one with a single event) and at least one in each 12.5% interval across PKD2 (Figure 3). Of the 46 PKD1 exons (including immediate flanking regions), 40 harbored mutations, as did nine of the 15 in PKD2. Analysis of the distributions shows that they did not differ significantly from uniform in PKD1 (P > 0.05) or PKD2 (P > 0.10). The median mutation position in PKD1 was 6873nt (median gene position 6454.5nt) with 55.6% of mutations in the 3′ half of the gene. Missense changes were relatively evenly distributed in PKD1 (P > 0.10), but truncating changes were nonuniform (P < 0.01; Figure 3A). In particular, there was a cluster of truncations at the end of exon 15 to exon 19, corresponding to the junction of the PKD repeats and the REJ domain. This clustering was not due to recurrent mutations (Table I). IVS21 has previously been suggested as a possible hotspot for mutations as a result of a long polypyrimidine tract,31 and although this could be a factor, the peak did not correspond precisely to that region. Too few missense changes were seen in PKD2 to determine the significance of the distribution (Figure 3B).
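The uniformity comparison described above is a χ2 goodness-of-fit test against equal expected counts per interval. The sketch below illustrates the calculation in Python for 10 equal intervals. The interval counts are hypothetical values invented for illustration (chosen to sum to 107, so that the expected count per interval is 10.7); they are not the study data. The critical values are the standard tabulated χ2 values for 9 degrees of freedom.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts per interval."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# HYPOTHETICAL counts of truncating mutations in 10 equal gene intervals.
# These are invented for illustration and are NOT the study data.
counts = [6, 8, 9, 7, 25, 11, 9, 11, 10, 11]

# Standard chi-square critical values for df = 10 - 1 = 9.
CRITICAL_P05 = 16.919
CRITICAL_P01 = 21.666

statistic = chi_square_uniform(counts)
print(f"expected per interval = {sum(counts) / len(counts):.1f}")
print(f"chi-square = {statistic:.2f}")
print("nonuniform at P < 0.01" if statistic > CRITICAL_P01 else "not significant at P < 0.01")
```

With these invented counts, a single overrepresented interval is enough to push the statistic past the P < 0.01 threshold, which is the pattern a cluster of truncating mutations would produce.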
Figure 3. Positions of mutations in PKD1 (A) and PKD2 (B). Graph of number of mutations, truncating (light blue) or in-frame (dark blue), in intervals of the genes (PKD1, 10 divisions; PKD2, eight divisions). Corresponding exonic regions of the transcript are shown below each graph (PKD1, green; PKD2, red) with exon number indicated. Protein structures with domain regions indicated are shown above the graphs for PC1 (A) and PC2 (B). Only truncating mutations in PKD1 were distributed significantly differently from uniform (expected per interval 10.7 mutations).
Mutation-Negative Cases
Twenty-two families did not have a probable mutation defined (Table V). In 10 of these, no variants other than known polymorphisms were detected, whereas in five others, only a single unique synonymous change was detected, which was not predicted to change splicing significantly. Two cases had a substitution of the same residue, V466, suggesting significance; however, both changes (to leucine or methionine) were highly conservative, and this residue is not well conserved (leucine in frog) and variable in the C-type lectin (Figure 2C). The unique change S1352N was similarly not scored as a mutation, because the substitution matched that found in fish, whereas another, L2696P, was rejected because this site is poorly conserved in orthologs (see Table IIIA). Two cases, 103227 (V609G and N970K) and 118641 (V690G and R1340W), had two indeterminate changes, and a third had a single such change (476972; H1093D). V690G is at a residue previously described as the site of a pathogenic change (V690D), whereas H1093D is a nonconservative change at the site of a likely polymorphism (H1093Y). Although both are suggestive of pathogenic changes, they did not meet the threshold that we set in this article (Table IIIA). Some cases may have larger deletions because not all were screened by FIGE, and, indeed, suggestive changes (dosage or possible aberrant fragments) or regions of apparent homozygosity (hemizygosity?) were seen in four cases, but the DNA changes have not been characterized (Table V).
Polymorphic Variants
PKD1 is a highly polymorphic gene with 190 different neutral or indeterminate variants detected within the coding and immediate IVS flanking regions. The corresponding figure for PKD2 was eight (see Table IV for novel nonsynonymous changes). The average number of variant alleles per patient in PKD1 was 10.1, with a range from 0 to 55, whereas the corresponding figures for PKD2 were 0.8 (range 0 to 3). Only one polymorphic variant in PKD2 had a minor allele frequency >5% (R28P, 30.9% of alleles). In PKD1, the majority of variants were rare; 94 (51.1%) were found just once within the study, with 24 (12.6%) having a minor allele frequency >5%. The greatest variability in PKD1 was in the 26 black patients, in whom an average of 20.6 variant alleles was found. The high level of polymorphisms at PKD1 compared with PKD2 is likely explained by the larger coding region (four to five times greater) and GC richness of the region, resulting in a much higher level of hypermutable CpG dinucleotides.18,32 Additional special mechanisms are probably not required to explain the higher polymorphism rate in PKD1.31,33
DISCUSSION
We describe here a scheme for comprehensive mutation analysis in ADPKD. A definite mutation was detected in 62.9% of cases; in an additional 26.2%, a UCV was found to be probably pathogenic, giving a total detection level of 89.1%. These values compare favorably with previous mutation studies using DHPLC, with a mutation detection level of approximately 67%,17 or other screenings that found PKD1 truncating mutations in <50% of cases.18 The higher detection level was achieved because of DS; DHPLC alone detected mutations in 63.9% (similar to previous studies), but DS identified an additional 51 mutations, 28.3% of the total cases. These variants did not generate abnormal DHPLC profiles under the conditions used or mimicked one of the many polymorphisms in PKD1 and so were not characterized.34 Although the majority of missed mutations were in PKD1, seven were in PKD2. It is worth noting, however, that two control mutations were not found by DS, and one other required specific re-analysis. In two additional cases, changes were initially seen as homozygous but later shown to be heterozygous with different primers. These findings illustrate the complexity of mutation analysis in ADPKD, especially in PKD1 with locus-specific LR-PCR, and suggest that allele dropout can be a problem, even with carefully designed primers.
Screening for larger deletions revealed four mutations (2.0% of the total), consistent with the level (1 to 3%) detected in other studies,1,35 with suggestions that some missed mutations may also be larger rearrangements (Table V). However, FIGE is time-consuming, and a rapid screening method, such as multiplex ligation-dependent probe amplification,36 is required to detect these mutations, although the duplicated area of PKD1 will prove a challenge to establishing multiplex ligation-dependent probe amplification at this locus.
A major factor in achieving this high detection level was establishing a framework for scoring UCV. Building on experience gained in analysis of the BRCA1 and 2 genes,22,25 we used a combination of scoring the potential detrimental effect of the change with contextual information about the variant. Although it would be ideal to test the functional significance of UCV, this is impractical for large, complex proteins such as PC1 and PC2 in a routine diagnostic setting (and at least for PC1, no such functional test is yet available). However, data obtained in the research setting—for instance, channel activity of a PC2 mutant,37 or effect on GPS cleavage of a PC1 mutant,38—may be useful to include in calculations of likely pathogenicity.
The value of an ADPKD molecular test depends on the reliability with which the pathogenicity of UCV can be determined. Factors that were found to be critical were the inclusion of distantly related vertebrates (chicken, frog, and fish) in the comparative analysis and the analysis of conservation in domain structures. There is scope to improve these predictions as more three-dimensional structures become available.24 Contextual information was gained by complete analysis of both genes; when a UCV was the only variant, it was a strong candidate to be pathogenic. No patient was found in this study with two definitely pathogenic mutations. One previous example has been described of a PKD1 patient with two nonsense mutations in cis17 (either occurring simultaneously or a second mutation on an already mutant allele), and rare similar examples have been described in other diseases. However, because these cases are rare, the finding of a UCV in a patient with a definite mutation is highly suggestive that the change is not pathogenic. Data regarding the contextual finding of variants are enhanced by accumulated knowledge on disease-associated and neutral variants being collected in ADPKD databases (HGMD and ADPKD Mutation Database).
The high level of novel variants limits the number of cases in which previous data are useful, but additional information can be obtained by segregation analysis (lack of segregation proves that it is not pathogenic). In this study population, segregation analysis was limited by the family samples that were available, although segregation of 20 UCV was demonstrated (Table II). Furthermore, segregation was crucial for showing that some PKD1 changes (R2477C and 2894delANS) were not pathogenic (Figure 2A and Table IV). Some variants were classed indeterminate, with a possibility that they may be pathogenic; segregation analysis would be helpful to define their status, especially in families with more than one such variant. When possible, the family should be the unit of analysis for molecular diagnostics in ADPKD.
From this study, approximately 11% of cases had no probable mutation defined. Further analysis is needed to screen for larger DNA rearrangements (see above). Studies of other large, multi-exon genes have illustrated how exonic and intronic variants can have unpredicted consequences on splicing by influencing the normal splice junction or branch point or by creating a cryptic site39; our RT-PCR data have illustrated some examples in ADPKD. A category of changes that has not been analyzed in this study is those that alter splice enhancer and silencing sites.40 These are difficult to predict because the consensus sequences are short and not highly conserved41; RT-PCR analysis in patients in whom no mutation has been defined is likely to be productive and may also identify mutations that are embedded deep within introns that can influence splicing.42 Other possible sites of missed mutations are the promoter regions.43 Once this next level of mutation analysis has been completed, it will be possible to assess the representation of unlinked “PKD3” families in this cohort. The level of such families is likely to be low (certainly <10%), but they cannot be excluded altogether at this stage.
Recurrent mutations account for 30% of the families and have diagnostic implications because prescreening regions that contain these changes may simplify molecular diagnostics; ultimately, a focused approach that uses a DNA array may be productive. The most common mutation, 5014delAG (also the most commonly described in the literature; Table I), is not flanked by sequences that are likely to promote recurrent deletion and raises the possibility that it is an ancestral change. Haplotype analysis would be helpful to determine whether a common origin is likely. Of the other five mutations found at least three times in this study, three have been described elsewhere (see Table I for details) and two represent C→T (or G→A) changes at CpG dinucleotides (PKD1:E2771K and PKD2:R872X), suggesting that they may be recurrent.44 Others may be population specific; all three 3814delG cases were identified at the same center and may reflect unknown relatedness. The finding of some missense changes multiple times, notably E2771K (seven times in total), giving it a high VS, and to a lesser extent G381S (three times), suggests that a group of missense changes with almost certain pathogenicity will be defined over time.
Our studies have important implications for molecular diagnostics of ADPKD. Sequencing is the most reliable means to identify base-pair variants, but screens for larger changes have revealed mutations in a few additional cases. Definite truncating mutations can be determined in 60 to 65% of cases, and careful evaluation of UCV in the remainder can push the level of detection of probable mutations to approximately 90%. It is particularly important that the pathogenic potential of all UCV be assessed and explained in a mutation report so that the likely significance of the described variants can be judged. Although using the mutation information on the majority of patients with definite mutations in the clinical arena should be straightforward, caution is still required when using information from MG = B or C patients in the setting of transplantation decisions or starting therapy. ADPKD mutation databases will be a guide for those previously described UCV, but many novel changes can be expected so that analysis of a family, especially if large enough to determine the gene responsible, will allow the use of more nondefinite mutations in the clinic. It is likely that improved mutation screening will become an increasingly important part of obtaining a firm diagnosis in the patient with ADPKD.
CONCISE METHODS
Details of the CRISP Study Cohort
The CRISP study cohort consists of 239 patients who have ADPKD, are aged 15 to 46 yr, and have a GFR of >70 ml/min at enrollment.15,16 For each proband, a blood sample was obtained for DNA extraction, and Epstein-Barr virus (EBV)-transformed lymphoblast cell lines were established. Samples are stored at the National Institute of Diabetes and Digestive and Kidney Diseases Center for Genetic Studies.
Mutation Analysis of PKD1 and PKD2 by DHPLC
DHPLC was performed as described previously.17 The duplicated region of PKD1 was amplified as five PKD1-specific fragments by LR-PCR, and exons were amplified from these fragments after dilution 1:1000 to avoid genomic DNA carryover. LR-PCR products were checked on 0.8% agarose gels before the nested PCR to detect large deletions/insertions. DHPLC was performed using a Wave System HT (Transgenomic, Omaha, NE), as described previously.17 Each amplicon was run at two temperatures, typically the melting temperature and +2°C, as determined by Wavemaker version 4.0.32 (Transgenomic). Chromatograms were subjectively grouped, depending on the differences in the profile from normals and known polymorphic variants.45 Novel profiles were sequenced, plus a representative profile that resembled a known variant; the remainder were identified by signature-based genotyping.34
DS of PKD1 and PKD2 and Confirmation of Mutations
DS was performed on a fee-for-service basis using a commercial diagnostic test (Athena Diagnostics, Worcester, MA), and a subset was analyzed at Emory for PKD2. These sequence data are now stored at the CRISP web site.
All likely pathogenic changes were double-checked in a different aliquot of the same sample, and, when samples were available, the segregation of variants was tested by DS. A subset of samples were also analyzed for large genomic rearrangements using FIGE, as described previously.1 The precise mutation in these cases was determined by amplifying across the deletion and sequencing the aberrant product. The designation of all variants in both genes was numbered from their translational start. The PKD1 sequence also included the alternatively spliced codon at the start of exon 32 so that the total length of the protein was 4303 aa.
RT-PCR to Analyze Atypical Splicing
RNA was isolated from lymphoblast cell lines using TRIzol (Invitrogen, Carlsbad, CA), and cDNA was generated with the SuperScript III cDNA synthesis kit (Invitrogen). For the IVS15 variants, LR-RT-PCR was performed with the SPEC-4 primers,18 followed by nested PCR with primers in exons 15 and 18. The IVS37 mutation was analyzed with primers in exons 36 and 39, and the PKD2 IVS4 mutation was analyzed with primers in exons 3, 7, and IVS4 (primer sequences available on request). All PCR products were sequenced by standard methods.
Classification of UCV
The GMS20 was used to determine the GD of each substitution. An MSA of PC1 orthologs from human, rat, mouse, chicken, frog (Xenopus tropicalis), and fish (Fugu rubripes) was generated using ClustalW (MacVector Inc, Cary, NC). Other fish species, Tetraodon nigroviridis and Danio rerio, were also used in some cases. For PC2, human, dog, mouse, chicken, Danio, and Fugu were used, plus in some cases Drosophila melanogaster, sea urchin, and Caenorhabditis elegans. Sequences were obtained from NCBI or Ensembl. Predicted sequences for chicken, frog, and fish species were edited as necessary. The GV was determined as the largest GMS between vertebrate orthologs. Conservation within defined motifs was assessed using published MSA: WSC46; PKD repeat2,3,47; REJ domain48; GPS38,46; PLAT domain49; and polycystin motifs, PC-A and B29; plus MSA of PC1 and PC2 homologs (PC1L1, PC1L2, PC1L3, PKDREJ, PC2L1, and PC2L2) and sea urchin REJ1 to 4 homologs. Changes within transmembrane regions and the G-protein binding region were also assessed for disruption of secondary structure. Predicted splicing changes were scored using the Splicing Predictor by Neural Network (http://www.fruitfly.org/seq_tools/splice.html).50 Details of previous descriptions of the UCV were obtained from the HGMD (http://www.hgmd-cf.ac.uk) and/or the ADPKD Mutation Database (http://pkdb.mayo.edu).
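As an illustration of the GV definition above (the largest GMS between vertebrate orthologs), the following minimal Python sketch takes the residues seen at one alignment column and returns the largest pairwise score. The function name `grantham_variation`, the `distance` callback, and the small `TOY_SCORES` table are hypothetical and serve only to make the example run; a real analysis would look up scores in the full published Grantham matrix and would need a rule for alignment gaps, which is not covered here.

```python
from itertools import combinations
from typing import Callable, Iterable


def grantham_variation(residues: Iterable[str],
                       distance: Callable[[str, str], int]) -> int:
    """Largest pairwise score among the distinct residues in one alignment column.

    `residues` holds one amino acid letter per ortholog; `distance` returns the
    Grantham score for a pair of different residues. An invariant column gives 0.
    """
    observed = sorted(set(residues))
    return max((distance(a, b) for a, b in combinations(observed, 2)), default=0)


# Toy lookup for three residues only (illustrative values, an assumption of this
# sketch); replace with the complete published matrix for real use.
TOY_SCORES = {
    frozenset(("L", "I")): 5,
    frozenset(("L", "V")): 32,
    frozenset(("I", "V")): 29,
}


def toy_distance(a: str, b: str) -> int:
    return TOY_SCORES[frozenset((a, b))]


# Column with Leu in most orthologs, Ile in one and Val in another.
column_gv = grantham_variation("LLLLIV", toy_distance)
```

Passing the score lookup in as a function keeps the sketch independent of how the matrix is stored.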
For all of the UCV, a VS was calculated. A matrix comparing the GD and GV was used with the GMS divided into six groups (0, 1 to 35, 36 to 80, 81 to 130, 131 to 179, and ≥180) and scores from +8 (e.g., GD >131, GV <35) to −4 (e.g., GD <35, GV >81) given. An additional −4 was allotted when the substitution matched the residue in an ortholog. Conservation of the residue in a domain scored +2 to +4 (highly conserved to invariant), with −2 when the substitution matched the residue in an ortholog. Splicing changes were scored from +20 when an abnormal splice product was demonstrated by RT-PCR to −5 when an intronic change was not predicted to alter splicing. Additional points were given when a variant segregated in siblings or a family (+2 to +4); failure to segregate = −20. Other descriptions of the change as a mutation added +2 per time described. If another likely mutation (groups A through C) or an indeterminate (I) change was found with the variant, −5 (A) to −1 (I) was subtracted; otherwise +2 was added. The sum of these scores gave the VS. VS ≥11 were classed as very likely mutations (MG = B); 5 to 10 as likely mutations (MG = C); −4 to 4 as variants of indeterminate (I) pathogenicity; and ≤−5 as neutral polymorphisms (P). The χ2 test was used to examine whether the mutation location in each gene was uniform.
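The summation and class boundaries described above can be sketched in Python as follows. This is a minimal illustration, not the scoring software used in the study. The function names and the component labels are hypothetical. The full GD-versus-GV matrix is not given in the text, so its value is supplied as an input rather than computed. The sketch also assumes that the top GMS group starts at 180 and that the polymorphism class is VS ≤ −5, so that the four classes cover all integer scores without overlap.

```python
from typing import Mapping


def gms_group(score: int) -> int:
    """Index (0 to 5) of the GMS group: 0, 1-35, 36-80, 81-130, 131-179, 180 and above."""
    upper_bounds = [0, 35, 80, 130, 179]
    for index, upper in enumerate(upper_bounds):
        if score <= upper:
            return index
    return 5  # assumption: scores of 180 or more form the top group


def variant_score(components: Mapping[str, int]) -> int:
    """The VS is the plain sum of the individual component scores."""
    return sum(components.values())


def mutation_group(vs: int) -> str:
    """Map a VS to its class: B (very likely), C (likely), I (indeterminate), P (polymorphism)."""
    if vs >= 11:
        return "B"
    if vs >= 5:
        return "C"
    if vs >= -4:
        return "I"
    return "P"  # assumption: VS of -5 or lower


# Hypothetical missense variant; every number below is made up for illustration.
example = {
    "gd_gv_matrix": 8,          # from the GD/GV matrix (value supplied, not derived)
    "domain_conservation": 4,   # invariant residue in a defined domain
    "segregation": 2,           # segregates in siblings
    "previous_reports": 2 * 1,  # +2 per earlier description as a mutation
    "no_other_change": 2,       # no other likely mutation or indeterminate change
}
example_vs = variant_score(example)
example_group = mutation_group(example_vs)
```

With these made-up inputs the components sum to 18, which falls in group B. Keeping the components in a named mapping makes it easy to see which line of evidence drove a given classification.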
DISCLOSURES
None.
Appendix
Details of probable mutations in the CRISP cohort: Definite, mutation group A
Details of probable mutations in the CRISP cohort: Highly likely, mutation group B
Details of probable mutations in the CRISP cohort: Likely, mutation group C
Classification of probable pathogenic mutations: Missense
Classification of probable pathogenic mutations: Atypical splicing changes
Classification of probable pathogenic mutations: Small deletions
Classification of indeterminate variants: Missense
Classification of indeterminate variants: Splicing variants
Classification of newly described likely polymorphic changes: Missense
Classification of newly described likely polymorphic changes: Splicing variants
Classification of newly described likely polymorphic changes: Deletion
Details of molecular analysis of mutation negative cases
Acknowledgments
The study was supported by National Institute of Diabetes and Digestive and Kidney Diseases cooperative agreements (DK56934, DK56956, DK56957, and DK56961), with additional support from an ancillary study for genetic analysis (DK56957-S1) and grant DK58816.
The study coordinators who collected samples are thanked: Jody Mahan, Beth Stafford, Lorna Stevens, Kristin Cornwell, Vickie Kubly, Diane Watkins, Sharon Langley, and Pam Trull. We also thank Mary Virginia Gaines for management support. Jim Calvet is thanked for discussions of specific mutations. John McAuliffe, William Seltzer, Lynne Leclair, and Mark Smith at Athena Diagnostics are thanked for assistance in the fee-for-service DS mutation analysis.
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
- © 2007 American Society of Nephrology