Genome-Editing Technologies: Principles and Applications
- 1Department of Bioengineering, University of California, Berkeley, California 94720
- 2Department of Chemical Engineering, Stanford University, Stanford, California 94305
- 3Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China
- Correspondence: gaj{at}berkeley.edu; liujia{at}shanghaitech.edu.cn
4 These authors contributed equally to this work.
Abstract
Targeted nucleases have provided researchers with the ability to manipulate virtually any genomic sequence, enabling the facile creation of isogenic cell lines and animal models for the study of human disease, and promoting exciting new possibilities for human gene therapy. Here we review three foundational technologies—clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein 9 (Cas9), transcription activator-like effector nucleases (TALENs), and zinc-finger nucleases (ZFNs). We discuss the engineering advances that facilitated their development and highlight several achievements in genome engineering that were made possible by these tools. We also consider artificial transcription factors, illustrating how this technology can complement targeted nucleases for synthetic biology and gene therapy.
In recent years, the emergence of highly versatile genome-editing technologies has provided investigators with the ability to rapidly and economically introduce sequence-specific modifications into the genomes of a broad spectrum of cell types and organisms. The core technologies now most commonly used to facilitate genome editing, shown in Figure 1, are (1) clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein 9 (Cas9), (2) transcription activator-like effector nucleases (TALENs), (3) zinc-finger nucleases (ZFNs), and (4) homing endonucleases or meganucleases.
Figure 1. Genome-editing technologies. Cartoons illustrating the mechanisms of targeted nucleases. From top to bottom: homing endonucleases, zinc-finger nucleases (ZFNs), transcription activator-like effector (TALE) nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein 9 (Cas9). Homing endonucleases generally cleave their DNA substrates as dimers, and do not have distinct binding and cleavage domains. ZFNs recognize target sites that consist of two zinc-finger binding sites that flank a 5- to 7-base pair (bp) spacer sequence recognized by the FokI cleavage domain. TALENs recognize target sites that consist of two TALE DNA-binding sites that flank a 12- to 20-bp spacer sequence recognized by the FokI cleavage domain. The Cas9 nuclease is targeted to DNA sequences complementary to the targeting sequence within the single guide RNA (gRNA) located immediately upstream of a compatible protospacer adjacent motif (PAM). DNA and protein are not drawn to scale.
In particular, the ease with which CRISPR-Cas9 and TALENs can be configured to recognize new genomic sequences has driven a revolution in genome editing that has accelerated scientific breakthroughs and discoveries in disciplines as diverse as synthetic biology, human gene therapy, disease modeling, drug discovery, neuroscience, and the agricultural sciences.
The diverse array of genetic outcomes made possible by these technologies is the result, in large part, of their ability to efficiently induce targeted DNA double-strand breaks (DSBs). These DNA breaks then drive activation of cellular DNA repair pathways and facilitate the introduction of site-specific genomic modifications (Rouet et al. 1994; Choulika et al. 1995). This process is most often used to achieve gene knockout via random base insertions and/or deletions that can be introduced by nonhomologous end joining (NHEJ) (Fig. 2A) (Bibikova et al. 2002). Alternatively, in the presence of a donor template with homology to the targeted chromosomal site, gene integration or base correction via homology-directed repair (HDR) can occur (Fig. 2B) (see Fig. 2 for an overview of other possible genome-editing outcomes) (Bibikova et al. 2001, 2003; Porteus and Baltimore 2003; Urnov et al. 2005). Indeed, the broad versatility of these genome-modifying enzymes is evidenced by the fact that they also serve as the foundation for artificial transcription factors, a class of tools capable of modulating the expression of nearly any gene within a genome.
Figure 2. Genome-editing outcomes. Targeted nucleases induce DNA double-strand breaks (DSBs) that are repaired by nonhomologous end joining (NHEJ) or, in the presence of donor template, homology-directed repair (HDR). (A) In the absence of a donor template, NHEJ introduces small base insertions or deletions that can result in gene disruption. When two DSBs are induced simultaneously, the intervening genomic sequence can be deleted or inverted. (B) In the presence of donor DNA (plasmid or single-stranded oligonucleotide), recombination between homologous DNA sequences present on the donor template and a specific chromosomal site can facilitate targeted integration. Lightning bolts indicate DSBs.
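As a rough illustration of the NHEJ outcomes summarized in Figure 2A, the following Python sketch treats a chromosome as a plain string and applies the two-break deletion and inversion described above. The function names and the 0-based cut coordinates are assumptions made purely for illustration. The frameshift check rests on the standard triplet reading frame, which the text does not spell out, and real repair outcomes are far more heterogeneous than this model suggests.

```python
# Illustrative sketch only: models repair outcomes as simple string edits.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def delete_between(seq: str, cut1: int, cut2: int) -> str:
    """Join the outer ends after two DSBs, dropping seq[cut1:cut2]."""
    return seq[:cut1] + seq[cut2:]


def invert_between(seq: str, cut1: int, cut2: int) -> str:
    """Reinsert the excised segment seq[cut1:cut2] in the opposite orientation."""
    return seq[:cut1] + reverse_complement(seq[cut1:cut2]) + seq[cut2:]


def is_frameshift(length_change: int) -> bool:
    """An indel shifts the reading frame unless its net length is a multiple of 3."""
    return length_change % 3 != 0
```

Under these assumptions, a 1-bp deletion within a coding exon would be flagged as a frameshift, whereas a 3-bp deletion would remove a single codon and leave the downstream frame intact.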
Here we review key principles of genome editing, emphasizing many of the engineering advances that have laid the groundwork for the creation, refinement, and implementation of the current suite of genome-modifying tools. We also provide an overview of the achievements made possible by genome editing, illustrating how this technology can enable advances throughout the life sciences.
TARGETED NUCLEASES
Zinc-Finger Nucleases
ZFNs, which are fusions between a custom-designed Cys2-His2 zinc-finger protein and the cleavage domain of the FokI restriction endonuclease (Kim et al. 1996), were the first targeted nuclease to achieve widespread use (Porteus and Carroll 2005; Urnov et al. 2010). ZFNs function as dimers, with each monomer recognizing a specific “half site” sequence—typically nine to 18 base pairs (bps) of DNA—via the zinc-finger DNA-binding domain (Fig. 1). Dimerization of the ZFN proteins is mediated by the FokI cleavage domain, which cuts DNA within a five- to seven-bp spacer sequence that separates two flanking zinc-finger binding sites (Smith et al. 2000). Each ZFN is typically composed of three or four zinc-finger domains, with each individual domain composed of ∼30 amino acid residues that are organized in a ββα motif (Pavletich and Pabo 1991). The residues that facilitate DNA recognition are located within the α-helical domain and typically interact with three bps of DNA, with occasional overlap from an adjacent domain (Wolfe et al. 2000). Using methods such as phage display (Choo and Klug 1994; Jamieson et al. 1994; Wu et al. 1995), a large number of zinc-finger domains recognizing distinct DNA triplets have been identified (Segal et al. 1999; Dreier et al. 2001, 2005; Bae et al. 2003). These domains can be fused together in tandem using a canonical linker peptide (Liu et al. 1997) to generate polydactyl zinc-finger proteins that can target a wide range of possible DNA sequences (Beerli et al. 1998, 2000a; Kim et al. 2009). In addition to this “modular assembly” approach to zinc-finger construction, selection-based methods for constructing zinc-finger proteins have also been reported (Greisman and Pabo 1997; Isalan et al. 2001; Hurt et al. 2003; Magnenat et al. 2004), including those that consider context-dependent interactions between adjacent zinc-finger domains, such as oligomerized pool engineering (OPEN) (Maeder et al. 2008). 
In addition, specialized sets of validated two-finger, zinc-finger modules have been used to assemble zinc-finger arrays (Kim et al. 2009; Bhakta et al. 2013), including those that take context-dependent effects into account (Sander et al. 2011b; Gupta et al. 2012).
One major concern associated with the use of ZFNs for genome editing (as with all targeted nucleases) is off-target mutations (Gabriel et al. 2011; Pattanayak et al. 2011). As a result, several approaches have been undertaken to enhance their specificity. Among the most successful of these has been the creation of obligate heterodimeric ZFN architectures that rely on charge–charge repulsion to prevent unwanted homodimerization of the FokI cleavage domain (Miller et al. 2007; Doyon et al. 2011), thereby minimizing the potential for ZFNs to dimerize at off-target sites. Additionally, protein-engineering methods have been used to enhance the cleavage efficiency of the FokI cleavage domain (Guo et al. 2010). One particularly promising approach for improving ZFN specificity is to deliver ZFNs into cells as protein. Because of the intrinsic cell-penetrating activity of zinc-finger domains (Gaj et al. 2014a), ZFN proteins themselves are inherently cell-permeable and can facilitate gene editing with fewer off-target effects when applied directly onto cells as purified protein compared to when expressed within cells from nucleic acids (Gaj et al. 2012). Modified ZFN proteins endowed with improved cell-penetrating activity have since been described (Liu et al. 2015a). ZFNickases can also facilitate gene correction in the absence of a DSB (Kim et al. 2012; Ramirez et al. 2012; Wang et al. 2012). These enzymes, which consist of one catalytically inactivated ZFN monomer in combination with a second native ZFN monomer, can stimulate HDR by nicking or cleaving one strand of DNA and are derived from a concept first illustrated by Stoddard and colleagues using homing endonucleases (McConnell Smith et al. 2009).
Unlike TALENs and CRISPR-Cas9, ZFNs have not been widely adopted in unspecialized laboratories because of the difficulty associated with constructing zinc-finger arrays. In particular, it remains challenging to create zinc-finger domains that can effectively recognize all DNA triplets, especially those of the 5′-CNN-3′ and 5′-TNN-3′ variety. As a result, ZFNs lack the target flexibility inherent to more recent genome-editing platforms. Nevertheless, the potential for ZFNs to mediate specific and efficient genome editing is evidenced by ongoing clinical trials based on ZFN-mediated knockout of the human immunodeficiency virus (HIV)-1 coreceptor CCR5 for treatment of HIV/acquired immune deficiency syndrome (AIDS) (Tebas et al. 2014) and a planned clinical trial based on site-specific integration of the factor IX gene into the albumin locus to treat hemophilia B (Clinical Trial ID: NCT02695160) (Sharma et al. 2015).
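The site architecture described in this section (two half sites of three bps per finger, separated by a five- to seven-bp spacer) can be sketched as a simple enumeration. The Python example below is a bookkeeping aid only: the names `ZfnSite`, `split_triplets`, and `zfn_sites` are hypothetical, the right-hand monomer is assumed to read its half site on the opposite strand, and no attempt is made to check whether a validated zinc-finger domain exists for any triplet. It merely counts triplets of the 5′-CNN-3′ and 5′-TNN-3′ variety, which the text identifies as difficult to target.

```python
from typing import Iterator, NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


class ZfnSite(NamedTuple):
    start: int             # 0-based start of the left half site (top strand)
    left_triplets: tuple   # triplets read 5'->3' on the top strand
    spacer: str            # sequence between the two half sites
    right_triplets: tuple  # triplets read 5'->3' on the bottom strand
    hard_triplets: int     # count of 5'-CNN-3' and 5'-TNN-3' triplets


def split_triplets(half_site: str) -> tuple:
    """Split a half site into consecutive 3-bp subsites, one per finger."""
    return tuple(half_site[i:i + 3] for i in range(0, len(half_site), 3))


def zfn_sites(seq: str, fingers: int = 3,
              spacer_lengths=(5, 6, 7)) -> Iterator[ZfnSite]:
    """Yield every window that fits the half site / spacer / half site layout."""
    seq = seq.upper()
    half = 3 * fingers  # each finger is assumed to contact three bps
    for spacer_len in spacer_lengths:
        total = 2 * half + spacer_len
        for start in range(len(seq) - total + 1):
            left = split_triplets(seq[start:start + half])
            spacer = seq[start + half:start + half + spacer_len]
            right_top = seq[start + half + spacer_len:start + total]
            right = split_triplets(reverse_complement(right_top))
            hard = sum(t[0] in "CT" for t in left + right)
            yield ZfnSite(start, left, spacer, right, hard)
```

Ranking the resulting windows by `hard_triplets` would be one crude way to prioritize candidate sites, although context-dependent effects between adjacent fingers are ignored here.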
TALE Nucleases
TALE proteins are bacterial effectors. In 2009, the code used by TALE proteins to recognize DNA was uncovered (Boch et al. 2009; Moscou and Bogdanove 2009). In a matter of months, this discovery enabled the creation of custom TALENs capable of modifying nearly any gene. Like ZFNs, TALENs are modular in form and function, composed of an amino-terminal TALE DNA-binding domain fused to a carboxy-terminal FokI cleavage domain (Christian et al. 2010; Miller et al. 2011). Also like ZFNs, dimerization of TALEN proteins is mediated by the FokI cleavage domain, which cuts within a 12- to 19-bp spacer sequence that separates each TALE binding site (Fig. 1) (Miller et al. 2011). TALEs are typically assembled to recognize between 12 and 20 bps of DNA, with more bases typically leading to higher genome-editing specificity (Guilinger et al. 2014a). The TALE-binding domain consists of a series of repeat domains, each ∼34 residues in length. Each repeat contacts DNA via the amino acid residues at positions 12 and 13, known as the repeat variable diresidues (RVDs) (Boch et al. 2009; Moscou and Bogdanove 2009). Unlike zinc fingers, which recognize DNA triplets, each TALE repeat recognizes only a single bp, with little to no target site overlap from adjacent domains (Deng et al. 2012; Mak et al. 2012). The most commonly used RVDs for assembling synthetic TALE arrays are: NI for adenine, HD for cytosine, NG for thymine, and NN or HN for guanine or adenine (Boch et al. 2009; Moscou and Bogdanove 2009; Cong et al. 2012; Streubel et al. 2012). TALE DNA-binding domains can be constructed using a variety of methods, with the most straightforward approach being Golden Gate assembly (Cermak et al. 2011). However, high-throughput TALE assembly methods have also been developed, including FLASH assembly (Reyon et al. 2012), iterative capped assembly (Briggs et al. 2012), and ligation independent cloning (Schmid-Burgk et al. 2013), among others.
More recent advances in TALEN assembly, though, have focused on the development of methods that can enhance their performance, including specificity profiling to uncover nonconventional RVDs that improve TALEN activity (Guilinger et al. 2014a; Yang et al. 2014; Juillerat et al. 2015; Miller et al. 2015), directed evolution as a means to refine TALE specificity (Hubbard et al. 2015), and even fusing TALE domains to homing endonuclease variants to generate chimeric nucleases with extended targeting specificity (discussed in more detail below) (Boissel et al. 2014).
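Because each TALE repeat reads a single bp, the RVD code summarized above lends itself to a direct lookup. The following Python sketch translates a target sequence into the ordered list of commonly used RVDs. The function name `rvd_array` and its parameters are hypothetical, the 12- to 20-bp length check simply mirrors the typical range quoted in the text, and the choice between NN and HN for guanine is left to the caller; it is not a substitute for a validated TALEN design tool.

```python
# One-to-one RVD code for the bases with a single commonly used RVD.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG"}
# The text lists two RVDs for guanine (both may also recognize adenine).
GUANINE_RVDS = ("NN", "HN")


def rvd_array(target: str, guanine_rvd: str = "NN",
              min_len: int = 12, max_len: int = 20) -> list:
    """Return the RVD for each base of `target`, in 5'->3' order."""
    target = target.upper()
    if guanine_rvd not in GUANINE_RVDS:
        raise ValueError(f"guanine_rvd must be one of {GUANINE_RVDS}")
    if not min_len <= len(target) <= max_len:
        raise ValueError(f"target must be {min_len} to {max_len} bp long")
    rvds = []
    for base in target:
        if base == "G":
            rvds.append(guanine_rvd)
        elif base in RVD_FOR_BASE:
            rvds.append(RVD_FOR_BASE[base])
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return rvds
```

For a TALEN pair, the same function would be applied once to each half site, with the second half site read on the opposite strand.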
Compared to ZFNs, TALENs offer two distinct advantages for genome editing. First, no selection or directed evolution is necessary to engineer TALE arrays, dramatically reducing the amount of time and experience needed to assemble a functional nuclease. Second, TALENs have been reported to show improved specificity and reduced toxicity compared to some ZFNs (Mussolino et al. 2014), potentially because of their increased affinity for target DNA (Meckler et al. 2013) or perhaps a greater energetic penalty for associating with base mismatches. However, TALENs are substantially larger than ZFNs, and have a highly repetitive structure, making their efficient delivery into cells through the use of lentivirus (Holkers et al. 2013) or a single adeno-associated virus (AAV) particle challenging. Methods for overcoming these limitations have emerged as TALENs can be readily delivered into cells as mRNA (Mahiny et al. 2015; Mock et al. 2015) and even protein (Cai et al. 2014; Liu et al. 2014a), although alternative codon usage and amino acid degeneracy can also be leveraged to express RVD arrays that might be less susceptible to recombination (Kim et al. 2013a). In addition, adenoviral vectors have also proven particularly useful for mediating TALEN delivery to hard-to-transfect cell types (Holkers et al. 2014; Maggio et al. 2016).
CRISPR-Cas9
The CRISPR-Cas9 system, which has a role in adaptive immunity in bacteria (Horvath and Barrangou 2010; Marraffini and Sontheimer 2010), is the most recent addition to the genome-editing toolbox. In bacteria, the type-II CRISPR system provides protection against DNA from invading viruses and plasmids via RNA-guided DNA cleavage by Cas proteins (Wiedenheft et al. 2012; Sorek et al. 2013). Short segments of foreign DNA are integrated within the CRISPR locus and transcribed into CRISPR RNA (crRNA), which then anneals to trans-activating crRNA (tracrRNA) to direct sequence-specific degradation of pathogenic DNA by the Cas9 protein (Jinek et al. 2012). In 2012, Charpentier, Doudna, and co-workers reported that target recognition by the Cas9 protein only requires a seed sequence within the crRNA and a conserved protospacer-adjacent motif (PAM) adjacent to the crRNA binding site (Jinek et al. 2012). This system has since been simplified for genome engineering (Cho et al. 2013; Cong et al. 2013; Jinek et al. 2013; Mali et al. 2013b) and now consists of only the Cas9 nuclease and a single guide RNA (gRNA) containing the essential crRNA and tracrRNA elements (Fig. 1). Because target site recognition is mediated entirely by the gRNA, CRISPR-Cas9 has emerged as the most flexible and user-friendly platform for genome editing, eliminating the need for engineering new proteins to recognize each new target site. The only major restriction for Cas9 target site recognition is that the PAM, which is recognized by the Cas9 nuclease and is essential for DNA cleavage, be located immediately downstream of the gRNA target site. The PAM sequence for the Streptococcus pyogenes Cas9, for example, is 5′-NGG-3′ (although in some cases 5′-NAG-3′ can be tolerated) (Hsu et al. 2013; Jiang et al. 2013; Mali et al. 2013a).
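The PAM requirement described above can be sketched as a simple sequence scan. The Python example below lists candidate S. pyogenes Cas9 target sites, that is, stretches of DNA followed immediately by 5′-NGG-3′, on both strands of an input sequence. The function name `find_spcas9_sites` is hypothetical, and the 20-nucleotide targeting length is a common convention that is not stated in the text, so it is exposed as a parameter. The tolerated 5′-NAG-3′ PAM and any scoring of candidate sites are deliberately left out.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_sites(seq: str, protospacer_len: int = 20) -> list:
    """Return (strand, start, protospacer, pam) for every NGG-adjacent site.

    `start` is 0-based and counted on the strand being scanned, so for "-"
    hits it refers to the reverse complement of the input sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Each reported protospacer corresponds to the DNA sequence that the targeting portion of the gRNA would be designed to match.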
Several studies have now shed light on the structural basis of DNA recognition by Cas9, revealing that the heteroduplex formed by the gRNA and its complementary strand of DNA is housed in a positively charged groove between the two nuclease domains (RuvC and HNH) within the Cas9 protein (Nishimasu et al. 2014), and that PAM recognition is mediated by an arginine-rich motif present in Cas9 (Anders et al. 2014). Doudna and colleagues have since proposed that DNA strand displacement induces a structural rearrangement within the Cas9 protein that directs the nontarget DNA strand into the RuvC active site, which then positions the HNH domain near target DNA (Jiang et al. 2016), enabling Cas9-mediated cleavage of both DNA strands.
The Cas9 nuclease and its gRNA can be delivered into cells for genome editing on the same or separate plasmids, and numerous resources have been developed to facilitate target site selection and gRNA construction, including E-CRISP (Heigwer et al. 2014), among others. Although Cas9 boasts the highest ease of use among the targeted nuclease platforms, several reports have indicated that it could be prone to inducing off-target mutations (Cradick et al. 2013; Fu et al. 2013). To this end, considerable effort has been devoted to improving the specificity of this system, including using paired Cas9 nickases (Mali et al. 2013a; Ran et al. 2013), which increase gene-editing specificity by requiring the induction of two sequential and adjacent nicking events for DSB formation, or truncated gRNAs that are more sensitive to mismatches at the genomic target site than a full-length gRNA (Fu et al. 2014). Off-target cleavage has also been reduced by controlling the dosage of either the Cas9 protein or gRNA within the cell (Hsu et al. 2013), or even by using Cas9 variants configured to enable conditional genome editing, such as a rapamycin-inducible split-Cas9 architecture (Zetsche et al. 2015b) or a Cas9 variant that contains a strategically placed small-molecule-responsive intein domain (Davis et al. 2015). Nucleofection (Kim et al. 2014) or transient transfection (Zuris et al. 2015) of a preformed Cas9 ribonucleoprotein complex has also been shown to reduce off-target effects, enabling DNA-free gene editing in primary human T cells (Schumann et al. 2015), embryonic stem cells (Liu et al. 2015b), Caenorhabditis elegans gonads (Paix et al. 2015), mouse (Menoret et al. 2015; Wang et al. 2015a) and zebrafish embryos (Sung et al. 2014), and even plant protoplasts (Woo et al. 2015). The incorporation of specific chemical modifications known to protect RNA from nuclease degradation and stabilize secondary structure can further enhance Cas9 ribonucleoprotein activity (Hendel et al. 2015; Rahdar et al. 2015).
In a clever marriage of genome-editing platforms, the FokI cleavage domain has even been fused to an inactivated Cas9 variant to generate hybrid nucleases that require protein dimerization for DNA cleavage (Guilinger et al. 2014b; Tsai et al. 2014), theoretically increasing CRISPR-Cas9 specificity. Similarly, fusing Cas9 to DNA-binding domains has also proven effective at improving its specificity (Bolukbasi et al. 2015). Finally, several studies have recently shown that protein engineering can broadly enhance Cas9 specificity (Kleinstiver et al. 2016; Slaymaker et al. 2016) and even alter its PAM requirements (Kleinstiver et al. 2015), the latter having the potential to enable creation of customized variants of Cas9 for allele-specific gene editing, although Cas9 orthologs (Cong et al. 2013; Esvelt et al. 2013; Hou et al. 2013; Ran et al. 2015) or alternative CRISPR systems (Zetsche et al. 2015a) with unique PAM specificities have been uncovered in nature.
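Since off-target activity arises at genomic sites that resemble the intended target, a first-pass way to reason about it is to count mismatches between the gRNA targeting sequence and every window of a reference sequence. The Python sketch below does exactly that and nothing more. The function names and the default threshold of three mismatches are assumptions for illustration; only the top strand is scanned, no PAM is required, and mismatch position is not weighted, all of which dedicated off-target prediction tools handle in far more detail.

```python
def mismatches(guide: str, site: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def near_matches(guide: str, reference: str, max_mismatches: int = 3) -> list:
    """Return (start, window, n_mismatches) for windows close to the guide.

    `start` is the 0-based position of the window on the top strand of
    `reference`. A perfect match is reported with n_mismatches == 0.
    """
    guide = guide.upper()
    reference = reference.upper()
    hits = []
    for start in range(len(reference) - len(guide) + 1):
        window = reference[start:start + len(guide)]
        n = mismatches(guide, window)
        if n <= max_mismatches:
            hits.append((start, window, n))
    return hits
```

Lowering `max_mismatches` gives a stricter definition of a potential off-target site and shortens the list of reported windows.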
Homing Endonucleases
Homing endonucleases, also known as meganucleases, represent the final member of the targeted nuclease family. These enzymes have been reviewed at length elsewhere (Silva et al. 2011; Stoddard 2014) but, briefly, members of the LAGLIDADG family of endonucleases (so named for the conserved amino acid motif present within these enzymes that interacts with DNA) are a collection of naturally occurring enzymes that recognize and cleave long DNA sequences (14–40 bps) (Fig. 1). These enzymes make extensive sequence-specific contacts with their DNA substrate (Stoddard 2011), and thus typically show exquisite specificity. However, unlike ZFNs and TALENs, the binding and cleavage domains in homing endonucleases are not modular. This overlap in form and function makes their repurposing challenging and limits their utility for more routine applications of genome editing. More recently, megaTALs (fusions of a rare-cleaving homing endonuclease to a TALE-binding domain) have been reported to induce highly specific gene modifications (Boissel et al. 2014; Lin et al. 2015a). These enzymes have enabled integration of antitumor and anti-HIV factors into the human CCR5 gene in both primary T cells and hematopoietic stem/progenitor cells (Sather et al. 2015), as well as disruption of endogenous T-cell receptor elements in T cells (Osborn et al. 2016), indicating their potential for enabling and enhancing immunotherapies.
GENOME-EDITING APPLICATIONS
Engineering Cell Lines and Organisms
Before the emergence of engineered nucleases, genetically modifying mammalian cell lines was labor intensive, costly, and often limited to laboratories with specialized expertise. However, with the advent of cost-effective and user-friendly gene-editing technologies, custom cell lines carrying nearly any genomic modification can now be generated in a matter of weeks. Examples of some of the outcomes that have become routine because of the emergence of targeted nucleases include gene knockout (Santiago et al. 2008; Mali et al. 2013b), gene deletion (Lee et al. 2010), gene inversion (Xiao et al. 2013), gene correction (Urnov et al. 2005; Ran et al. 2013), gene addition (Moehle et al. 2007; Hockemeyer et al. 2011; Hou et al. 2013), and even chromosomal translocation (Fig. 2) (Torres et al. 2014). In addition to cell line engineering, targeted nucleases have also expedited the generation of genetically modified organisms, facilitating the rapid creation of transgenic zebrafish (Doyon et al. 2008; Sander et al. 2011a; Hwang et al. 2013), mice (Cui et al. 2011; Wang et al. 2013; Wu et al. 2013), rats (Geurts et al. 2009; Tesson et al. 2011; Li et al. 2013), monkeys (Liu et al. 2014c), and livestock (Hauschild et al. 2011; Carlson et al. 2012), which together have the capacity to accelerate human disease modeling and the discovery of new therapeutics.
Targeted nucleases have also emerged as powerful tools for plant engineering (Baltes and Voytas 2015). Both TALENs and CRISPR-Cas9 have been used to modify multiple alleles within hexaploid bread wheat to confer heritable resistance to powdery mildew (Wang et al. 2014b). In another study, TALENs were used to knock out nonessential genes in the fatty acid metabolic pathway in soybean to generate a simplified plant cell with reduced metabolic components (Haun et al. 2014). Of special note, two recent reports showed that purified nuclease proteins can be introduced directly into plant protoplasts, enabling the introduction of germline-transmissible modifications that are virtually indistinguishable from naturally occurring ones (Luo et al. 2015; Woo et al. 2015). This technical advance could help to overcome certain regulatory hurdles associated with the use of transgenic crops. Finally, targeted nucleases have also been used to inactivate pathogenic genes to prevent viral (Lin et al. 2014) or parasitic (Ghorbal et al. 2014) infection, as well as to knock in specific factors capable of imparting pathogen resistance (Wu et al. 2015).
Intriguingly, targeted nucleases could also serve as conduits for curbing mosquito- or insect-borne diseases through a technique known as gene drive (Burt 2003; Sinkins and Gould 2006), which harnesses genome editing to facilitate the introduction of a specific gene or mutation that can then confer a particular phenotype into a host and also be transmitted to its progeny (Windbichler et al. 2011). Gene drives have now been tested in the malaria vector mosquitos Anopheles stephensi (Gantz et al. 2015) and Anopheles gambiae (Hammond et al. 2016) as a means to prevent disease transmission and to achieve population control, respectively. However, owing to the ease with which CRISPR-Cas9 can be programmed (Gantz and Bier 2015), debate has ignited on the potential societal and environmental impact of this technology (Esvelt et al. 2014; Akbari et al. 2015), spurring the development of safeguard elements that could help to minimize the risk of gene-edited organisms escaping from the laboratory (DiCarlo et al. 2015).
Synthetic Biology and Genome-Scale Engineering
Targeted nucleases also offer a facile means for generating modified bacterial and yeast strains for synthetic biology, including metabolic pathway engineering. Bacterial species of the order Actinomycetales, for instance, are one of the most important sources of industrially relevant secondary metabolites. However, many Actinomycetales species are recalcitrant to genetic manipulation, a fact that has severely hampered their use for metabolic engineering. CRISPR-Cas9 has been used to inactivate multiple genes in actinomycetes (Tong et al. 2015), indicating its ability to enable the creation of designer bacterial strains with enhanced metabolite production capabilities. CRISPR has also facilitated multiplexed metabolic pathway engineering in yeast at high efficiencies (Jakociunas et al. 2015a,b), as well as random mutagenesis of yeast chromosomal DNA for phenotypic screening of desired mutants (Ryan et al. 2014). Indeed, genome-wide CRISPR-based knockout screens hold tremendous potential for functional genomics (Hilton and Gersbach 2015), having facilitated the discovery of genomic loci that confer drug resistance to cells (Koike-Yusa et al. 2014; Shalem et al. 2014; Wang et al. 2014a; Zhou et al. 2014), uncovered how cells can control induction of the host immune response (Parnas et al. 2015), provided new insights into the genetic basis of cellular fitness (Hart et al. 2015; Wang et al. 2015b), and even shed light on how certain viruses induce cell death (Ma et al. 2015). Genome-wide CRISPR screens can also facilitate the discovery of functional noncoding elements (Kim et al. 2013b; Korkmaz et al. 2016), and provide a means for studying the structure and evolution of the human genome. In a remarkable example of the latter, Shendure and colleagues used Cas9 to mediate integration of short randomized DNA sequences into the BRCA1 and DBR1 genes (Findlay et al. 2014). 
They then measured the functional consequences of these mutations on fitness, achieving an unprecedented look at some of the factors driving genome and disease evolution. Finally, CRISPR screens have even proven effective in vivo, enabling the identification of factors involved in zebrafish development (Shah et al. 2015) and disease progression in mice (Chen et al. 2015).
Therapeutic Genome Editing
Genome editing itself also holds tremendous potential for treating the underlying genetic causes of certain diseases (Cox et al. 2015; Porteus 2015; Maeder and Gersbach 2016). In one of the most successful examples of this to date, ZFN-mediated disruption of the HIV coreceptor CCR5 was used to engineer HIV resistance into both CD4+ T cells (Perez et al. 2008) and CD34+ hematopoietic stem/progenitor cells (HSPCs) (Holt et al. 2010), proving safe and well-tolerated in a phase I clinical trial that infused these gene-modified T cells into individuals with HIV/AIDS (Tebas et al. 2014). In addition to enabling the introduction of gene modifications that can enhance autologous cell therapies, targeted nucleases can also be combined with viral vectors, including AAV, to mediate genome editing in situ (Gaj et al. 2016). For instance, delivery of an AAV vector encoding a ZFN pair designed to target a defective copy of the factor IX gene, along with its repair template, led to efficient gene correction in mouse liver, increasing factor IX protein production in both neonatal (Li et al. 2011) and adult (Anguela et al. 2013) models of the disease. In vivo genome editing also recently enabled the restoration of dystrophin gene expression and the rescue of muscle function in mouse models of Duchenne muscular dystrophy (Long et al. 2015; Nelson et al. 2015; Tabebordbar et al. 2015). Therapeutic gene editing in a mouse model of human hereditary tyrosinemia has also been reported, both by hydrodynamic injection of plasmid DNA encoding CRISPR-Cas9 (Yin et al. 2014) and by combining nanoparticle-mediated delivery of Cas9-encoding mRNA with AAV-mediated delivery of the DNA template for gene correction (Yin et al. 2016).
More recently, a dual particle AAV system, wherein one AAV vector carried the Cas9 nuclease and a second harbored the gRNA and donor repair template, was able to mediate correction of a disease-causing mutation in the ornithine transcarbamylase gene in the liver of a neonatal model of the disease (Yang et al. 2016). This work, in particular, showed that therapeutic levels of gene correction could be achieved in a regenerating tissue even when using multiple AAV particles. Although highly promising, numerous hurdles still need to be overcome for in vivo applications of genome editing to reach their full potential. Chief among these are methods that can facilitate nuclease delivery or expression to only diseased cells or tissues, and the development of new strategies that can enhance HDR in disease-associated postmitotic cells in vivo.
TARGETED TRANSCRIPTION FACTORS
Tools for Modulating Gene Expression
The modular qualities of zinc-finger and TALE proteins, in addition to the highly flexible DNA recognition ability of CRISPR-Cas9, also provide investigators with the ability to modulate the expression of nearly any gene from its promoter or enhancer sequences via their fusion to transcriptional activator and repressor protein domains. Among the first fully synthetic transcriptional effector proteins to be generated (Beerli et al. 1998) were those based on the fusion of engineered zinc-finger proteins with either the Herpes simplex–derived transactivation domain (Sadowski et al. 1988) or the Krüppel-associated box (KRAB) repression protein (Margolin et al. 1994). Over the course of the next 15 years, zinc-finger-based transcriptional modulators were expanded and featured several other types of effector domains (Beerli and Barbas 2002), including, for example, the Dnmt3a methyltransferase domain (Rivenbark et al. 2012; Siddique et al. 2013) and the ten-eleven translocation methylcytosine dioxygenase 1 (TET1) (Chen et al. 2014), which can modulate transcription via targeted methylation or demethylation, respectively. As a natural extension of zinc-finger transcription factors, and further drawing on the parallels with zinc-finger proteins, TALE transcription factors have also emerged as an especially effective platform for achieving targeted transcriptional modulation (Miller et al. 2011; Zhang et al. 2011). Effector domains are generally fused to the carboxyl terminus of the synthetic TALE array and, contrary to the longer sequence typically required for efficient modulation by zinc-finger transcription factors, TALEs have been reported to regulate gene expression with as few as 10.5 repeats (Boch et al. 2009). Like zinc fingers, TALEs are also compatible with numerous epigenetic modifiers, including the TET1 hydroxylase catalytic domain (Maeder et al. 2013b) and the lysine-specific histone demethylase 1 (LSD1) (Mendenhall et al. 2013) domains, which have been used for targeted CpG demethylation and histone demethylation, respectively.
In particular, the ease with which a large number of TALEs can be constructed has enabled the discovery that tiling a promoter sequence with combinations of synthetic transcription factors can lead to a synergistic increase in gene expression (Maeder et al. 2013b; Perez-Pinera et al. 2013). And, like zinc fingers (Beerli et al. 2000b; Pollock et al. 2002; Magnenat et al. 2008; Polstein and Gersbach 2012), TALE activators have also been successfully engineered to regulate gene expression in response to external (Mercer et al. 2014) or endogenous (Li et al. 2012) chemical stimuli, optical signals (Konermann et al. 2013), and even proteolytic cues (Copeland et al. 2016; Lonzaric et al. 2016).
Because of the ease with which it can be programmed, the CRISPR-Cas9 system has also been adapted for transcriptional modulation through fusion of specific effector domains to a catalytically inactivated variant of the Cas9 protein (Qi et al. 2013). Deactivation is achieved by introducing two amino acid substitutions, D10A and H840A, into the RuvC and HNH endonuclease domains of Cas9, respectively. Although unable to cleave DNA, this mutant, referred to as dCas9, retains its ability to bind DNA in an RNA-directed manner. Effector domains are fused to the carboxyl terminus of the dCas9 protein and can modulate gene expression from either strand of the targeted DNA sequence (Farzadfard et al. 2013; Maeder et al. 2013a; Perez-Pinera et al. 2013). Genome-scale activation studies have indicated that the most robust levels of activation are generally observed when dCas9 activators are targeted to a window spanning -400 to -50 bp relative to the transcriptional start site (Gilbert et al. 2014; Hu et al. 2014). Additionally, dCas9 can inhibit gene expression by simply blocking transcriptional initiation or elongation through a process known as CRISPR interference (Qi et al. 2013), although fusing dCas9 to transcriptional repressor domains can also lead to efficient silencing from the promoter (Gilbert et al. 2013; Zalatan et al. 2015). Much like zinc fingers and TALEs, methods for achieving conditional gene modulation using dCas9 have also been reported, including the fusion of a dihydrofolate reductase destabilization domain to dCas9, which can provide chemical control over activation, enabling cellular reprogramming or differentiation (Balboa et al. 2015). Light-inducible dCas9-based systems offer another means of conditional control, in this case through optical regulation of gene expression (Nihongaki et al. 2015; Polstein and Gersbach 2015).
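As a rough illustration of how the reported activation window might be applied when choosing dCas9 activator target sites, the following Python sketch scans a promoter sequence for candidate sites that lie entirely within -400 to -50 bp of the transcriptional start site. Several things here are assumptions rather than details from the studies cited above: the names `GuideSite` and `find_activator_sites` are hypothetical, the 20-nt targeting sequence and the NGG PAM are those commonly used with Streptococcus pyogenes Cas9, and only the supplied strand is scanned even though dCas9 activators can act from either strand.

```python
from typing import List, NamedTuple, Tuple


class GuideSite(NamedTuple):
    """Hypothetical record for one candidate dCas9 target site."""
    protospacer: str      # target sequence immediately upstream of the PAM
    pam: str              # the 3-nt PAM that follows the protospacer
    offset_from_tss: int  # first protospacer base relative to the TSS (negative = upstream)


def find_activator_sites(
    promoter: str,
    tss_index: int,
    window: Tuple[int, int] = (-400, -50),  # activation window from the text
    spacer_len: int = 20,                   # assumed targeting-sequence length
) -> List[GuideSite]:
    """Return sites on the given strand with an NGG PAM (assumed) whose
    protospacer lies entirely inside the window relative to the TSS."""
    seq = promoter.upper()
    sites = []
    for start in range(len(seq) - spacer_len - 3 + 1):
        pam = seq[start + spacer_len : start + spacer_len + 3]
        if pam[1:] != "GG":
            continue  # not an NGG PAM
        rel_start = start - tss_index                   # first protospacer base
        rel_end = start + spacer_len - 1 - tss_index    # last protospacer base
        if window[0] <= rel_start and rel_end <= window[1]:
            sites.append(GuideSite(seq[start : start + spacer_len], pam, rel_start))
    return sites


if __name__ == "__main__":
    # Toy promoter: one site 200 bp upstream of the TSS, one too close to it.
    spacer = "ACGTACGTACGTACGTACGT"
    toy = "A" * 300 + spacer + "TGG" + "A" * 157 + spacer + "TGG" + "A" * 100
    for site in find_activator_sites(toy, tss_index=500):
        print(site)
```

A real design workflow would also scan the reverse complement and score candidates for off-target potential; the sketch only shows the positional filter.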
Although flexible, first-generation dCas9 activators were routinely found to display suboptimal levels of activation. As a result, the development of second-generation CRISPR activators quickly emerged as a highly active area of research. One particularly elegant approach for overcoming the low activation levels inherent in first-generation systems is to insert an RNA aptamer within a functionally inert region of the gRNA. This aptamer recruits specific activation helper proteins that work in concert with a dCas9 activator to enhance transcription (Konermann et al. 2015; Zalatan et al. 2015). Other strategies based on directly fusing additional helper activation domains to dCas9 have also been shown to enhance transcription (Chavez et al. 2015). Targeted acetylation of histone proteins within a promoter or enhancer sequence via epigenome editing using the catalytic core of the human acetyltransferase p300 fused to dCas9 can also lead to robust levels of gene activation (Hilton et al. 2015). Similarly, dCas9 repressor proteins targeted to distal regulatory elements have been found to facilitate chromatin remodeling and gene repression via epigenomic modification (Thakore et al. 2015). Finally, by simply reducing the length of the gRNA, catalytically active variants of Cas9 can stimulate transcription without inducing DNA breaks (Dahlman et al. 2015; Kiani et al. 2015), enabling orthogonal gene knockout and activation with the same Cas9 variant in a single cell.
Applications of Targeted Transcriptional Regulation
Early work on the use of engineered zinc-finger transcription factors revealed that synthetic transcriptional modulators are effective tools for a broad range of applications, enabling such tasks as inhibiting viral replication (Papworth et al. 2003; Reynolds et al. 2003; Segal et al. 2004; Eberhardy et al. 2006), modulating the expression of disease-associated loci (Graslund et al. 2005; Wilber et al. 2010), inducing angiogenesis for accelerated wound healing (Rebar et al. 2002), and genomic screening of cellular targets for cancer progression and drug resistance (Park et al. 2003; Blancafort et al. 2005, 2008). Facilitated by many of the insights gained from zinc-finger transcription factor technology, both TALEs and CRISPR-Cas9 have now further expanded the possibilities of engineered transcriptional activators and repressors. For example, TALEs and CRISPR-Cas9 have enabled rapid construction of custom genetic circuits and logic gates (Gaber et al. 2014; Lebar et al. 2014; Liu et al. 2014b) and complex gene regulation networks (Nielsen and Voigt 2014; Nissim et al. 2014), and have even facilitated cellular reprogramming (Gao et al. 2013) and the differentiation of mouse embryonic fibroblasts to skeletal myocytes (Chakraborty et al. 2014). dCas9 transcriptional effectors have also been used to efficiently mediate repression and activation of endogenous genes in Drosophila (Lin et al. 2015b) and in plant cells (Piatek et al. 2015). Both TALE and Cas9 activators have also been configured to stimulate transcription of latent HIV (Zhang et al. 2015; Ji et al. 2016; Limsirichai et al. 2016; Perdigao et al. 2016; Saayman et al. 2016), indicating their potential to work in concert with antiretroviral therapy for eradicating HIV infection. Importantly, because of the ease with which the CRISPR-Cas9 system can be used, genome-wide screens using Cas9 transcriptional activators (Gilbert et al. 2014; Konermann et al. 2015) and repressors (Gilbert et al. 2014) can be easily implemented to discover genes involved in a number of diverse processes, including drug resistance and cancer metastasis. In particular, CRISPR-based genome-scale screening methods have the potential to overcome many of the technical hurdles associated with other contemporary screening technologies, such as cDNA libraries and RNAi, indicating their potential for facilitating drug discovery and basic biological research.
CONCLUSIONS
Despite the successes already achieved, many challenges remain before the full potential of genome editing can be realized. First and foremost is the development of new tools capable of introducing genomic modifications in the absence of DNA breaks. Targeted recombinases (Akopian et al. 2003; Mercer et al. 2012), which can be programmed to recognize specific DNA sequences (Gaj et al. 2013; Sirk et al. 2014; Wallen et al. 2015) and even integrate therapeutic factors into the human genome (Gaj et al. 2014b), are one such option. More recent work has indicated that single-base editing without DNA breaks can be achieved using an engineered Cas9 nickase complex (Komor et al. 2016), although it remains unknown how effective this technology is in therapeutically relevant settings. By linking genomic modifications induced by targeted nucleases to their own self-degradation, self-inactivating vectors are also poised to improve the specificity of genome editing, especially because the frequency of off-target modifications can be directly proportional to the duration of cellular exposure to a nuclease (Pruett-Miller et al. 2009). In addition, much of the knowledge behind genome engineering has been obtained in immortalized cell lines. However, in the case of regenerative medicine, it is highly desirable to genetically manipulate progenitor or stem-cell populations, both of which can differ markedly from transformed cell lines with respect to their epigenome or the three-dimensional organization of their genomic DNA. These differences can have profound effects on the ability of genome-editing tools to either modify a specific sequence or regulate gene expression. Although the union between genome engineering and regenerative medicine is still in its infancy, realizing the full potential of these technologies in stem/progenitor cells requires that their functional landscape be fully explored in these genetic backgrounds. Only then will genome-editing technologies truly be able to reprogram cell fate and behavior for the next generation of advances in synthetic biology and gene therapy.
ACKNOWLEDGMENTS
We gratefully acknowledge the support and mentorship of the late Carlos F. Barbas, III (1964–2014). This work is supported by the National Institutes of Health (F32GM113446 to T.G.) and ShanghaiTech University (to J.L.). We apologize to those whose important contributions were not cited because of space constraints.
Footnotes
- Editors: Daniel G. Gibson, Clyde A. Hutchison III, Hamilton O. Smith, and J. Craig Venter
- Additional Perspectives on Synthetic Biology available at www.cshperspectives.org
- Copyright © 2016 Cold Spring Harbor Laboratory Press; all rights reserved