Instructions for using ZiFiT (V4.2)
CRISPR/Cas Assembly
- CRISPR RFNs (RNA-guided FokI Nucleases)
TALE Repeat Array Assembly
Zinc Fingers
- Zinc Finger Arrays
- Zinc Finger Nucleases
- Cloning
CoDA (Context Dependent Assembly)
- Input
OPEN (Oligomerized Pool ENgineering)
- Input
- Scoring
Interface With ZiFDB
CRISPR/Cas:
In bacteria and archaea, clustered, regularly interspaced, short palindromic repeats
(CRISPR) and CRISPR-associated (Cas) systems provide immunity against foreign DNA.
The CRISPR/Cas9 system recognizes and cleaves double-stranded DNA complementary
to a guide RNA (gRNA) sequence as shown in the schematic below.
The CRISPR/Cas9 system consists of two parts:
1. The Cas9 protein (in blue) that creates the double-stranded break.
2. The gRNA that directs the Cas9 to a specific genomic locus.
ZiFiT provides guidance for the construction of CRISPR/Cas9 RNA-Guided Nucleases
(RGNs) as described in:
1. Hwang et al., Nat Biotechnol. 2013 (doi:10.1038/nbt.2501)
2. Fu et al., Nat Biotechnol. 2013 (doi: 10.1038/nbt.2623)
ZiFiT currently provides the following functionality:
1. Identification of target sites.
2. Design of oligos required for the construction of the gRNA constructs driven
by either a U6 or a T7 promoter (MLM3636 and DR274, respectively)
3. Orthogonality information
a. ZiFiT identifies all potential off-target sites that are "off by 3" from the intended
target sites. If the target site is < 18 bases, ZiFiT will report all potential off-target sites
that are "off by 2" from the intended target sites.
b. This option is not available when querying in batch.
c. The orthogonality is calculated using Bowtie 1.0.0. For more information about
Bowtie, please refer to the Bowtie Manuals.
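The "off by 3" and "off by 2" criteria can be illustrated with a short Python sketch. It reads "off by N" as "differs from the intended target at no more than N positions", which is an assumption about ZiFiT's definition. The scan is a brute-force illustration over one strand and is not how Bowtie performs its search; the function names are hypothetical.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def max_mismatches_for(target: str) -> int:
    """Threshold from the text: 'off by 3', or 'off by 2' for targets < 18 bases."""
    return 2 if len(target) < 18 else 3


def find_off_targets(target: str, genome: str) -> list[tuple[int, str, int]]:
    """Scan one strand of `genome` for windows within the mismatch threshold.

    Returns (0-based position, window, mismatches); exact matches are skipped
    because they are the intended site, not an off-target.
    """
    limit = max_mismatches_for(target)
    n = len(target)
    hits = []
    for i in range(len(genome) - n + 1):
        window = genome[i:i + n]
        mismatches = count_mismatches(target, window)
        if 0 < mismatches <= limit:
            hits.append((i, window, mismatches))
    return hits
```

A real search would also have to cover the reverse strand and a whole genome, which is why an indexed aligner such as Bowtie is used in practice.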
CRISPR RFNs (RNA-guided FokI Nucleases):
CRISPR RFNs (RNA-guided FokI Nucleases) combine the simplicity of the CRISPR/Cas9
system with the specificity of dimerization-dependent FokI fusion nuclease systems
such as TALENs and ZFNs. The CRISPR RFN system recognizes and cleaves double-stranded
DNA complementary to a guide RNA (gRNA) sequence as shown in the schematic below.
The CRISPR RFN system consists of three parts:
1. The inactive dCas9 protein (in blue).
2. The sgRNA that directs the dCas9 to a specific genomic locus (in red).
3. The FokI nuclease domain that creates a double-stranded break upon dimerization
(in yellow).
Protocol for cloning gRNA sequences into the Csy4-based multiplex expression vector
as described in Tsai et al.
Overview
Middle oligoduplex sequence
Synthesize oSQT875 and oSQT876 as phosphorylated, desalted ‘Ultramer’ oligonucleotides (IDT).
Name | Sequence
oSQT875 middle oligo F | /5Phos/ AGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTTCACTGCCGTATA
oSQT876 middle oligo R | /5Phos/ TGCCTATACGGCAGTGAACGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCT
Design Multiplex gRNA Target Site Oligos
1. Visit http://zifit.partners.org/ZiFiT/ to find RNA-guided FokI nuclease target
sites in your sequence of interest.
2. Order two pairs of oligos (corresponding to left and right “half-sites”) for
each full dimeric target site.
Target site oligos require the following overhangs for ordered ligation into pSQT1313:
Name | Sequence
left target oligoduplex F | 5’-GCAGNNNNNNNNNNNNNNNNNNNGTTTTAG-3’
left target oligoduplex R | 5’-AGCTCTAAAACNNNNNNNNNNNNNNNNNNN-3’
right target oligoduplex F | 5’-GGCAGNNNNNNNNNNNNNNNNNNN-3’
right target oligoduplex R | 5’-AAACNNNNNNNNNNNNNNNNNNNC-3’
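Filling in the N positions of the four templates above can be sketched in Python. This is a minimal illustration, not ZiFiT's own output routine: it assumes the forward oligos carry the variable sequence as written and the reverse oligos carry its reverse complement, which is a reading of the table rather than something it states. The function does not enforce a length, so the supplied sequences must match the number of N positions in the table. `design_target_oligos` is a hypothetical name.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_target_oligos(left_site: str, right_site: str) -> dict[str, str]:
    """Fill the N run of each template with the given variable sequences.

    Assumption: forward oligos take the sequence as given, reverse oligos
    take its reverse complement. Fixed flanks are copied from the table.
    """
    for name, site in (("left", left_site), ("right", right_site)):
        if not site or set(site.upper()) - set("ACGT"):
            raise ValueError(f"{name} site must contain only A, C, G, T")
    left, right = left_site.upper(), right_site.upper()
    return {
        "left target oligoduplex F": "GCAG" + left + "GTTTTAG",
        "left target oligoduplex R": "AGCTCTAAAAC" + reverse_complement(left),
        "right target oligoduplex F": "GGCAG" + right,
        "right target oligoduplex R": "AAAC" + reverse_complement(right) + "C",
    }
```

Under this reading, each annealed pair is complementary everywhere except the first four bases of each oligo, which remain single-stranded as the ligation overhangs.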
Prepare Digested Vector Backbone
1. Digest 3-5 µg of pSQT1313 (available via Addgene) with BsmBI (NEB) overnight
at 55°C.
2. Heat inactivate for 20 minutes at 80°C.
3. Purify digested pSQT1313 with 1X volume of Ampure XP beads (Agencourt) according
to manufacturer’s instructions, or gel purify.
4. Dilute digested backbone plasmid pSQT1313 to 10 ng/µl.
10X Oligoduplex Annealing Buffer (STE)
Component | Volume (µl) | Final concentration (mM)
1 M Tris-HCl | 5 | 100
5 M NaCl | 5 | 500
0.5 M EDTA | 1 | 10
dH2O | 39 |
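The final concentrations in the buffer table follow from the usual dilution relation (stock concentration x stock volume = final concentration x total volume). The short Python check below recomputes them from the listed volumes; the variable names are illustrative only.

```python
def final_concentration_mM(stock_mM: float, volume_ul: float, total_ul: float) -> float:
    """Dilution relation C1*V1 = C2*V2, solved for C2."""
    return stock_mM * volume_ul / total_ul


# component: (stock concentration in mM, volume in µl), copied from the table
recipe = {
    "Tris-HCl": (1000.0, 5.0),
    "NaCl": (5000.0, 5.0),
    "EDTA": (500.0, 1.0),
}
water_ul = 39.0

total_ul = water_ul + sum(volume for _, volume in recipe.values())
finals = {
    name: final_concentration_mM(stock, volume, total_ul)
    for name, (stock, volume) in recipe.items()
}

for name, conc in finals.items():
    print(f"{name}: {conc:g} mM in {total_ul:g} µl")
```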
Anneal oligos
1. Anneal each of the oligonucleotide pairs separately at 10 µM.
Component | Volume (µl) | Final concentration
Oligo 1 (100 µM) | 10 | 10 µM
Oligo 2 (100 µM) | 10 | 10 µM
10X STE Buffer | 10 | 1X
dH2O | 70 |
Run the following annealing protocol in a thermocycler:
a. 95°C, 5 minutes
b. Ramp down temperature to 25°C at 1°C/30 s.
c. 4°C, ∞
2. Dilute 10 µM annealed oligoduplexes 1:1000 to a final concentration of 0.01 µM.
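The numbers in the annealing steps can be checked with a few lines of Python. This is only an arithmetic sketch using the values given above; it assumes the ramp is linear from 95°C to 25°C, and the names are illustrative.

```python
def ramp_seconds(start_c: int, end_c: int, seconds_per_degree: int) -> int:
    """Duration of a linear temperature ramp."""
    return abs(start_c - end_c) * seconds_per_degree


# Annealing reaction: 10 µl of each 100 µM oligo in a 100 µl reaction
reaction_ul = 10 + 10 + 10 + 70
annealed_uM = 100.0 * 10 / reaction_ul

# Thermocycler program: 5 minute hold, then ramp at 1°C per 30 s
ramp_s = ramp_seconds(95, 25, 30)
program_minutes = 5 + ramp_s / 60

# Working stock: 1:1000 dilution of the annealed duplex
working_uM = annealed_uM / 1000

print(f"annealed duplex: {annealed_uM:g} µM")
print(f"ramp: {ramp_s} s; program before the 4°C hold: {program_minutes:g} min")
print(f"working stock: {working_uM:g} µM")
```

With these values the ramp takes 35 minutes, so the program runs for about 40 minutes before the 4°C hold.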
Cloning into multiplex gRNA vectors
left oligoduplex + middle oligoduplex + right oligoduplex + pSQT1313/BsmBI
1. Assemble the following 10 µl ligation reaction.
Component | Volume (µl)
Left oligoduplex (0.01 µM) | 2
Middle oligoduplex (0.01 µM) | 2
Right oligoduplex (0.01 µM) | 2
pSQT1313/BsmBI (~10 ng/µl) | 2
10X T4 Ligase Buffer (NEB) | 1
T4 Polynucleotide Kinase (NEB) | 0.5
T4 DNA Ligase (NEB) | 0.5
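As a quick sanity check on the ligation table, the sketch below confirms that the listed volumes add up to the stated 10 µl and converts the oligoduplex and vector inputs into absolute amounts. It uses only the values in the table; the helper name `femtomoles` is illustrative.

```python
# Volumes in µl, copied from the ligation table
ligation_ul = {
    "Left oligoduplex (0.01 µM)": 2.0,
    "Middle oligoduplex (0.01 µM)": 2.0,
    "Right oligoduplex (0.01 µM)": 2.0,
    "pSQT1313/BsmBI (~10 ng/µl)": 2.0,
    "10X T4 Ligase Buffer": 1.0,
    "T4 Polynucleotide Kinase": 0.5,
    "T4 DNA Ligase": 0.5,
}
total_ul = sum(ligation_ul.values())


def femtomoles(conc_uM: float, volume_ul: float) -> float:
    """1 µM equals 1 pmol/µl, so µM x µl gives pmol; x 1000 gives fmol."""
    return conc_uM * volume_ul * 1000.0


duplex_fmol = femtomoles(0.01, 2.0)   # each oligoduplex
vector_ng = 10.0 * 2.0                # approximate, from ~10 ng/µl

print(f"total volume: {total_ul:g} µl")
print(f"each oligoduplex: {duplex_fmol:g} fmol; vector: ~{vector_ng:g} ng")
```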
2. Incubate sample at 16°C for 30 minutes, then at 4°C overnight.
3. The following day, transform 5 µl of ligation into 50 µl of chemically competent
cells. (Incubate 15-30 minutes on ice, heat shock at 42°C for 45 s, incubate 2-5
minutes on ice. Recover with 150-500 µl of SOC for 1 hour.)
4. Plate on an LB agar plate containing 100 µg/ml carbenicillin.
5. Pick ~3-6 colonies per target site, grow overnight cultures in LB medium supplemented
with 50 µg/ml carbenicillin, and prepare miniprep plasmid DNA.
6. Sequence verify multiplex gRNA constructs with primer oSQT379 (5’-AGGGTTATTGTCTCATGAGCGG-3’).
TALE repeat array:
Transcription Activator-Like Effectors (TALEs) are DNA binding proteins used by
Xanthomonas bacteria to facilitate colonization of plants. TALEs typically contain
arrays of highly conserved, 33-35 amino acid TALE repeat domains that mediate sequence-specific
DNA-binding. Individual TALE repeat domains each bind to a single base pair of DNA
with specificity primarily determined by two variable amino acid residues (known
as repeat variable di-residues or RVDs) within the conserved repeat. TALE repeats
can be joined together into extended arrays to create customized DNA-binding domains
capable of binding to longer DNA sequences of interest.
ZiFiT v4.2 provides guidance for construction of engineered TALE repeat arrays based
on a framework that has been shown to produce highly active DNA-binding proteins
that function in C. elegans (Wood et al., Science 2011), zebrafish (Sander et al.,
Nat Biotechnol. 2011), rats (Tesson et al., Nat Biotechnol. 2011), and human somatic
(Miller et al., Nat Biotechnol. 2011) and pluripotent stem cells (Hockemeyer et
al., Nat Biotechnol. 2011). TALE repeat arrays constructed with this platform consist
of 3 parts:
1. A 102 amino acid domain derived from a naturally occurring TALE that is required
for DNA-binding and positioned just amino-terminal to the TALE repeat array
2. A series of 34 amino acid TALE repeat domains with different repeat variable
di-residues (RVDs) with a single 20 amino acid “half-repeat” at the carboxy-terminal
end (see figure).
3. A 63 amino acid domain derived from a naturally occurring TALE that is required
for DNA-binding and positioned just carboxy-terminal to the TALE repeat array
Note that the TALE repeat array framework described above requires the presence
of a T (thymine) just 5’ to the sequence bound by the TALE repeats (see Figure above).
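The 5’ T requirement translates directly into a simple scan: a candidate binding sequence is only usable if the base immediately before it is a T. The Python sketch below shows this for one strand. The `site_length` parameter is hypothetical, since the text does not fix the number of repeats, and ZiFiT's actual search applies further criteria not modeled here.

```python
def find_tale_sites(sequence: str, site_length: int) -> list[tuple[int, str]]:
    """Return (0-based start, site) for every window preceded by a T.

    Only the given strand is scanned; the T itself is not part of the
    returned site, matching the description of a T just 5' to the bound
    sequence.
    """
    seq = sequence.upper()
    sites = []
    for start in range(1, len(seq) - site_length + 1):
        if seq[start - 1] == "T":
            sites.append((start, seq[start:start + site_length]))
    return sites
```

For example, `find_tale_sites("ATGCATCC", 3)` returns only the window that begins right after the first T.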
ZiFiT v4.2 provides support for construction of TALE nucleases (TALENs) that consist
of an engineered TALE repeat array domain as described above fused to the non-specific
nuclease domain from the FokI restriction enzyme. TALENs can introduce targeted
double-strand DNA breaks for inducing repair by non-homologous end-joining (NHEJ)
or homologous recombination (HR). TALENs function as dimers and therefore a pair
of nucleases must be designed and constructed for each required target site.
To design TALENs for a target sequence of interest, users enter sequence into the
ZiFiT program. With its default settings, ZiFiT will return as many as five target
sites for which TALENs can be designed. For each target site, users can generate
easy-to-follow graphical overviews that can be used to guide the assembly of plasmids
encoding the desired TALEN. These guides are cross-referenced to plasmids that can
be obtained from a dedicated website hosted by the non-profit plasmid distribution
service Addgene (http://www.addgene.org/talengineering).
Additional details for using ZiFiT v4.2 can be found on the “Examples” page.
Zinc Fingers (ZFs):
A ZF is a protein motif consisting of two beta strands and an alpha helix, stabilized
by coordination of a zinc ion by pairs of conserved cysteine and histidine residues.
Residues -1 to 6 of the alpha-helix (numbered relative to the start of the helix)
recognize specific DNA triplet sequences, primarily by forming base-specific contacts
in the major groove of the double-stranded target DNA. ZFs are often referred to
according to "recognition" residues in the alpha helix, listed in N- to C-terminal
direction; other residues in the ZF are referred to as the backbone.
As illustrated below, ZFs bind target DNA sites with amino acids of the recognition
alpha helix (shown in the top line from the amino (N) to carboxyl (C)
terminus) contacting consecutive nucleotides in DNA in the 3' to 5'
direction. This can be confusing because the DNA target site is always referred
to in the 5' to 3' direction, whereas amino acid sequences are referred
to from N to C terminus. Therefore, in the ZF shown below, N-QSSNLVR-C,
the R actually recognizes the G in the triplet 5'-GAA-3'.
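The reversed reading direction is easier to see when written out. The Python sketch below pairs helix positions with bases for the QSSNLVR / 5'-GAA-3' example. The text only states that R contacts the G; assigning positions -1, 3 and 6 to the 3', middle and 5' bases is the commonly described canonical model and is used here as an assumption. `canonical_contacts` is a hypothetical name.

```python
# Helix positions for a seven-residue recognition helix (there is no position 0)
HELIX_POSITIONS = (-1, 1, 2, 3, 4, 5, 6)


def canonical_contacts(helix: str, triplet: str) -> list[tuple[int, str, str]]:
    """Pair helix positions -1, 3 and 6 with the 3', middle and 5' bases.

    `helix` is written N- to C-terminal and `triplet` 5' to 3', so the
    pairing runs in opposite directions along the two sequences.
    """
    if len(helix) != 7 or len(triplet) != 3:
        raise ValueError("expected a 7-residue helix and a 3-base triplet")
    residue_at = dict(zip(HELIX_POSITIONS, helix.upper()))
    five_prime, middle, three_prime = triplet.upper()
    return [
        (-1, residue_at[-1], three_prime),
        (3, residue_at[3], middle),
        (6, residue_at[6], five_prime),
    ]


for position, residue, base in canonical_contacts("QSSNLVR", "GAA"):
    print(f"position {position:>2}: {residue} -> {base}")
```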
Zinc Finger Arrays:
Multiple ZFs can be linked together to recognize a specific and preferably unique
sequence in double-stranded genomic DNA. When multiple ZFs are joined together,
the resulting multi-finger array recognizes a longer target DNA sequence. Multi-finger
arrays can be fused to and thereby target transcriptional activation domains, repressors,
or nucleases to specific locations in the genome.
Zinc Finger Nucleases (ZFNs):
ZFNs consist of two zinc finger arrays, each fused to a nuclease domain. The FokI
nuclease is only active as a dimer, and dimerization occurs when both zinc finger
arrays bind their target sequence. The requirement for dimerization ensures that
two zinc finger nucleases must bind to DNA in a particular configuration in order
to cleave DNA. Multi-finger arrays are typically designed to recognize and bind
sites that are separated by a “spacer” sequence of several base pairs (usually 5,
6, or 7 bp in length). We use the naming convention “Left Array” and “Right Array”
to refer to the array that binds the sequence at the 5’ bottom strand and the 3’
top strand, respectively, of the target DNA site (diagram shown below).
Zinc Finger Cloning (ZFNs):
Cloning ZFNs: Each ZFN consists of two zinc finger arrays, each fused to a monomeric
nuclease domain. The preference of a ZFN dimer for spacers of different lengths
can be altered by varying the linker between the zinc finger array and the nuclease
domain. Changes to this linker affect both the cloning method and the target plasmids.
ZF arrays generated within standard Zinc Finger Consortium plasmids can be cloned
into ZFN expression vectors using unique XbaI/BamHI restriction sites. The resulting
vector will produce a ZFN designed to recognize sites with a 5 or 6 bp spacer (Foley
2009). However, ZFNs with a 7 bp spacer require PCR amplification with primers that
introduce XbaI/NotI sites for insertion into alternative Zinc Finger Consortium
plasmids that encode a ZFN capable of recognizing target sites with a 7 bp spacer.
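The cloning rule described above reduces to a lookup on spacer length, sketched here in Python. The function name and the returned tuple layout are hypothetical; the restriction sites and the PCR requirement are taken from the paragraph above.

```python
def cloning_strategy(spacer_bp: int) -> tuple[str, bool]:
    """Return (restriction sites, PCR amplification needed) for a spacer length."""
    if spacer_bp in (5, 6):
        # Direct subcloning from standard Zinc Finger Consortium plasmids
        return ("XbaI/BamHI", False)
    if spacer_bp == 7:
        # Sites are introduced by PCR primers before insertion
        return ("XbaI/NotI", True)
    raise ValueError("only 5, 6 and 7 bp spacers are described")
```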
CoDA (Context Dependent Assembly):
CoDA is a design-based ZF assembly method that utilizes zinc finger domains pre-selected
to work well together. Novel three-finger ZFPs are generated by combining Finger
1 domains and Finger 3 domains that have been preselected to bind their cognate
target sites in the context of the same Finger 2 domain (see figure below).
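The combination rule can be sketched as a lookup keyed on the shared Finger 2 domain. Everything in the Python example below is hypothetical: the data layout, the unit identifiers and the function name are illustrative and do not reflect the actual CoDA archive. It also assumes that Finger 1 binds the 3'-most triplet of the 9 bp site and Finger 3 the 5'-most.

```python
from itertools import product


def coda_arrays(site, f1_units, f2_units, f3_units):
    """List (F1, F2, F3) combinations for a 9 bp site written 5' to 3'.

    f2_units maps an F2 identifier to (triplet, helix). f1_units and
    f3_units are lists of (triplet, helix, f2_id), where f2_id names the
    F2 domain in whose context the unit was selected.
    """
    site = site.upper()
    if len(site) != 9:
        raise ValueError("expected a 9 bp site")
    t3, t2, t1 = site[0:3], site[3:6], site[6:9]  # F3, F2, F1 subsites
    arrays = []
    for f2_id, (f2_triplet, f2_helix) in f2_units.items():
        if f2_triplet != t2:
            continue
        # Only units selected alongside this same F2 domain may be combined
        f1s = [h for trip, h, ctx in f1_units if trip == t1 and ctx == f2_id]
        f3s = [h for trip, h, ctx in f3_units if trip == t3 and ctx == f2_id]
        arrays.extend((f1, f2_helix, f3) for f1, f3 in product(f1s, f3s))
    return arrays
```

The key point is the `ctx == f2_id` filter: an F1 or F3 unit that recognizes the right triplet is still rejected if it was selected next to a different F2 domain.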
CoDA - Input
Sequence: The DNA sequence for which
a user wishes to identify potential target sites is pasted into the Sequence Window.
Sequences should be in FASTA format. White space and numbers will be ignored.
Exon/Intron Case Sensitivity: This option
allows users to distinguish between intron and exon sequences by denoting exons
as uppercase and introns as lowercase. This information is used to generate a model
of the gene as a graphic in the output. Additionally, this information can be used
to filter out targets that occur in introns from the output.
Spacer size: This parameter is only available
when designing zinc finger nucleases. The user specifies the number of nucleotides
between the ZF arrays. The appropriate distance is determined by the length of the
amino acid linker between the ZF array and the associated nuclease domain. For standard
linkers, the spacer is five or six bp, which provides the proper spacing between
the zinc finger nuclease monomers so that they can interact to create a functional
enzyme.
Triplets (Position 1, Position 2, Position 3):
This parameter allows users to select which module pools to consider for target
site identification. Default parameters include all pools currently available from
the Zinc Finger Consortium.
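The input conventions described above (ignored white space and numbers, uppercase exons, lowercase introns) can be sketched in Python as follows. This is an illustration of the conventions, not ZiFiT's parser. Treating a target as exonic only when it lies entirely within one uppercase run is an assumption, and all function names are hypothetical.

```python
import re


def clean_sequence(raw: str) -> str:
    """Drop FASTA header lines, white space and digits; keep letter case."""
    lines = [line for line in raw.splitlines() if not line.startswith(">")]
    return re.sub(r"[\s\d]", "", "".join(lines))


def exon_intervals(seq: str) -> list[tuple[int, int]]:
    """Half-open (start, end) intervals of uppercase (exon) runs."""
    return [match.span() for match in re.finditer(r"[A-Z]+", seq)]


def site_in_exon(start: int, length: int, exons: list[tuple[int, int]]) -> bool:
    """True if the site lies entirely within a single exon interval."""
    end = start + length
    return any(s <= start and end <= e for s, e in exons)
```

A target-site list could then be filtered with `site_in_exon` to drop sites that overlap introns.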
CoDA - Advanced Options
Triplet Composition:
The user can specify the composition of nucleotide triplets desired in target DNA
sequences. For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets
are desired, the user can choose to exclude CNN and TNN triplets from consideration
by setting the max value for these triplets to zero. As another example, the user
can specify search parameters that require at least 3 GNN triplets in the target.
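The triplet-composition filter amounts to counting triplets by their first base and comparing the counts with user-set minimum and maximum values. The Python sketch below shows the idea; the function names and the `min_counts`/`max_counts` parameters are hypothetical and do not correspond to ZiFiT's form fields.

```python
from collections import Counter


def triplet_classes(site: str) -> Counter:
    """Count triplets by first base, e.g. 'GAA' counts towards 'GNN'."""
    site = site.upper()
    if len(site) % 3:
        raise ValueError("site length must be a multiple of 3")
    return Counter(site[i] + "NN" for i in range(0, len(site), 3))


def passes_composition(site, min_counts=None, max_counts=None) -> bool:
    """Apply per-class minimum and maximum triplet counts to a target site."""
    counts = triplet_classes(site)  # missing classes count as 0
    min_counts = min_counts or {}
    max_counts = max_counts or {}
    return (
        all(counts[cls] >= n for cls, n in min_counts.items())
        and all(counts[cls] <= n for cls, n in max_counts.items())
    )
```

Setting `max_counts={"CNN": 0, "TNN": 0}` reproduces the first example in the text, and `min_counts={"GNN": 3}` the second.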
OPEN (Oligomerized Pool ENgineering):
This approach to ZFP production uses pools of zinc fingers, each of which consists
of numerous unique solutions that recognize a particular DNA triplet at one of three
positions (amino-terminal, middle, and carboxy-terminal) in a three-finger protein.
Pools are recombined to generate hundreds of thousands of unique solutions for each
target site. Optimal combinations of fingers that work well together are identified
using genetic selections in which binding of the protein to the target site upstream
of a pair of selectable marker genes activates expression and confers a growth advantage
to E. coli bacteria.
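The size of a recombined library is the product of the three pool sizes, which is how modest pools reach hundreds of thousands of combinations. The Python sketch below illustrates the combinatorics only. The pool contents are placeholders, not real OPEN pools, and the actual recombination and selection are carried out experimentally rather than by enumeration.

```python
from itertools import product
from math import prod


def library_size(pools) -> int:
    """Number of distinct three-finger combinations from per-position pools."""
    return prod(len(pool) for pool in pools)


def recombine(pools):
    """Yield every (finger 1, finger 2, finger 3) combination."""
    yield from product(*pools)


# Placeholder pools for the three positions (hypothetical identifiers)
pools = [["a1", "a2"], ["b1", "b2", "b3"], ["c1"]]
print(library_size(pools))
print(next(recombine(pools)))
```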
OPEN - Input
Sequence: The DNA sequence for which
a user wishes to identify potential ZF target sites is pasted into the Sequence
Window. Sequences should be in FASTA format. White space and numbers will be ignored.
Spacer size: This parameter is only available
when designing zinc finger nucleases. The user specifies the number of nucleotides
present in the spacer sequence between the “half-sites” bound by each ZF array.
Note: The DNA sequences encoding ZF arrays that are provided as output contain the
appropriate flanking XbaI/BamHI or XbaI/NotI restriction sites.
Triplets (Position 1, Position 2, Position 3): This parameter allows users
to select which OPEN pools to consider for target site identification. Default parameters
include all pools currently available from the Zinc Finger Consortium.
OPEN - Advanced Options
Triplet Composition:
The user can specify the composition of nucleotide triplets desired in target DNA
sequences. For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets
are desired, the user can choose to exclude CNN and TNN triplets from consideration
by setting the max value for these triplets to zero. As another example, the user
can specify search parameters that require at least 3 GNN triplets in the target.
OPEN - Scoring
Active/Inactive Scoring (ZiFOpT): Active/Inactive predictions and their corresponding
confidence values reflect the likelihood that an OPEN selection for a given 9 bp
sequence will successfully yield a zinc finger array capable of activating transcription
of a reporter gene by three-fold or more in the well-established bacterial two-hybrid
assay (Wright et al., Nat. Protocols 2006; Maeder et al., Nat. Protocols 2009).
Previous work has shown that when a pair of zinc finger arrays both activate over
the three-fold cut-off, these arrays have a high probability (>50%) of functioning
when used as zinc finger nucleases. Predictions are based on success rates and sequence
composition for target sequences previously assayed in the OPEN system (Sander
& Reyon, BMC Bioinformatics 2010).
Interface with ZiFDB:
To enhance its utility, ZiFiT is interfaced directly with ZiFDB, a web-accessible database
of zinc fingers and engineered zinc finger arrays. Target Site hyperlinks within
the ZiFiT output directly query ZiFDB to determine if any previously constructed
arrays exist that bind to completely or partially matched target sequences. In addition,
ZiFiT users can query ZiFDB for finger information for a specific triplet subsite
by clicking on the triplet. Thus, ZiFiT and ZiFDB work synergistically to aid in
ZFP design.