ZiFiT Targeter Version 4.2


Instructions for using ZiFiT (V4.2)


CRISPR/Cas Assembly
- CRISPR RFNs (RNA-guided FokI Nucleases)

TALE Repeat Array Assembly

Zinc Fingers
- Zinc Finger Arrays
- Zinc Finger Nucleases
- Cloning

CoDA (Context Dependent Assembly)
- Input

OPEN (Oligomerized Pool ENgineering)
- Input
- Scoring

Interface With ZiFDB

CRISPR/Cas:


In bacteria and archaea, clustered, regularly interspaced, short palindromic repeats (CRISPR) and CRISPR-associated (Cas) systems provide immunity against foreign DNA. The CRISPR/Cas9 system recognizes and cleaves double-stranded DNA complementary to a guide RNA (gRNA) sequence, as shown in the schematic below.


The CRISPR/Cas9 system consists of two parts:

1. The Cas9 protein (in blue) that creates the double-stranded break.
2. The gRNA that directs the Cas9 to a specific genomic locus.

ZiFiT provides guidance for the construction of CRISPR/Cas9 RNA-Guided Nucleases (RGNs) as described in:
1. Hwang et al., Nat Biotechnol. 2013 (doi:10.1038/nbt.2501)
2. Fu et al., Nat Biotechnol. 2013 (doi: 10.1038/nbt.2623)

ZiFiT currently provides the following functionality:
1. Identification of target sites.
2. Design of oligos required for the construction of the gRNA constructs driven by either a U6 or a T7 promoter (MLM3636 and DR274, respectively).
3. Orthogonality information.
a. ZiFiT identifies all potential off-target sites that are "off by 3" from the intended target sites. If the target site is shorter than 18 bases, ZiFiT reports all potential off-target sites that are "off by 2" from the intended target sites.
b. This option is not available when querying in batch.
c. Orthogonality is calculated using Bowtie 1.0.0. For more information about Bowtie, please refer to the Bowtie manuals.
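
The "off by 3" criterion can be pictured as counting mismatches between the target site and every window of the same length in a reference sequence. The Python sketch below illustrates only that idea; it is not how ZiFiT or Bowtie performs the search. It assumes that "off by N" means "at most N mismatches", scans the forward strand only, and uses hypothetical function names.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def allowed_mismatches(target_length: int) -> int:
    """Threshold described above: 3 mismatches, or 2 for sites under 18 bases."""
    return 3 if target_length >= 18 else 2


def find_near_matches(target: str, reference: str):
    """Return (position, window, mismatch count) for every near match.

    Naive forward-strand scan. The intended site itself appears in the
    result with 0 mismatches, so the caller should discard that entry.
    """
    limit = allowed_mismatches(len(target))
    n = len(target)
    hits = []
    for i in range(len(reference) - n + 1):
        window = reference[i:i + n]
        d = mismatches(target, window)
        if d <= limit:
            hits.append((i, window, d))
    return hits
```

A real genome-scale search needs an indexed aligner such as Bowtie; the scan above is only practical for short test sequences.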

CRISPR RFNs (RNA-guided FokI Nucleases):


CRISPR RFNs (RNA-guided FokI Nucleases) combine the simplicity of the CRISPR/Cas9 system with the specificity of dimerization-dependent FokI fusion nuclease systems such as TALENs and ZFNs. The CRISPR RFN system recognizes and cleaves double-stranded DNA complementary to a guide RNA (gRNA) sequence, as shown in the schematic below.


The CRISPR RFN system consists of three parts:

1. The inactive dCas9 protein (in blue).
2. The sgRNA that directs the dCas9 to a specific genomic locus (in red).
3. The FokI nuclease domain that creates a double-stranded break upon dimerization (in yellow).

Protocol for cloning gRNA sequences into the Csy4-based multiplex expression vector as described in Tsai et al.

Overview



Middle oligoduplex sequence

Synthesize oSQT875 and oSQT876 as phosphorylated, desalted ‘Ultramer’ oligonucleotides (IDT).

Name Sequence
oSQT875 middle oligo F /5Phos/ AGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGTTCACTGCCGTATA
oSQT876 middle oligo R /5Phos/ TGCCTATACGGCAGTGAACGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATTTTAACTTGCTATTTCT


Design Multiplex gRNA Target Site Oligos

1. Visit http://zifit.partners.org/ZiFiT/ to find RNA-guided FokI nuclease target sites in your sequence of interest.
2. Order two pairs of oligos (corresponding to left and right “half-sites”) for each full dimeric target site.

Target site oligos require the following overhangs for ordered ligation into pSQT1313:
Name Sequence
left target oligoduplex F 5’-GCAGNNNNNNNNNNNNNNNNNNNGTTTTAG–3’
left target oligoduplex R 5’-AGCTCTAAAACNNNNNNNNNNNNNNNNNNN–3’
right target oligoduplex F 5’-GGCAGNNNNNNNNNNNNNNNNNNN–3’
right target oligoduplex R 5’-AAACNNNNNNNNNNNNNNNNNNNC–3’
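
The four templates above can be filled in programmatically. The sketch below is an illustration, not part of ZiFiT: it assumes that the 19 N positions on each forward (F) oligo are the variable bases of the half-site as written 5' to 3', and that the N positions on the matching reverse (R) oligo are their reverse complement, which is what the two strands need in order to anneal. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def target_oligos(left_n19: str, right_n19: str) -> dict:
    """Fill the four oligo templates with two 19-nt variable regions."""
    for name, seq in (("left", left_n19), ("right", right_n19)):
        if len(seq) != 19 or set(seq.upper()) - set("ACGT"):
            raise ValueError(f"{name} variable region must be 19 nt of A/C/G/T")
    left, right = left_n19.upper(), right_n19.upper()
    return {
        # 5'-GCAG N19 GTTTTAG-3'
        "left F": "GCAG" + left + "GTTTTAG",
        # 5'-AGCTCTAAAAC N19(rc)-3'
        "left R": "AGCTCTAAAAC" + reverse_complement(left),
        # 5'-GGCAG N19-3'
        "right F": "GGCAG" + right,
        # 5'-AAAC N19(rc) C-3'
        "right R": "AAAC" + reverse_complement(right) + "C",
    }
```

In each pair, the first four bases of either oligo are left single-stranded as the ligation overhang, and the remainder of the F oligo is the reverse complement of the remainder of the R oligo.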


Prepare Digested Vector Backbone

1. Digest 3-5 µg of pSQT1313 (available via Addgene) with BsmBI (NEB) overnight at 55°C.
2. Heat inactivate for 20 minutes at 80°C.
3. Purify digested pSQT1313 with 1X volume of Ampure XP beads (Agencourt) according to manufacturer’s instructions, or gel purify.
4. Dilute digested backbone plasmid pSQT1313 to 10 ng/µl.

10X Oligoduplex Annealing Buffer (STE)

Component Volume (µl) Final concentration (mM)
1 M Tris-HCl 5 100
5 M NaCl 5 500
0.5 M EDTA 1 10
dH2O 39
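
The final concentrations in the recipe follow from C1 x V1 = C2 x V2 with a total volume of 50 µl. The short Python check below reproduces them; the variable and function names are chosen for the example only.

```python
def final_concentration(stock_mM: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration (mM) after diluting a stock into total_ul."""
    return stock_mM * volume_ul / total_ul


# (stock concentration in mM, volume added in µl), taken from the recipe above
recipe = {
    "Tris-HCl": (1000.0, 5.0),
    "NaCl": (5000.0, 5.0),
    "EDTA": (500.0, 1.0),
}
water_ul = 39.0

total_ul = water_ul + sum(volume for _, volume in recipe.values())
concentrations = {
    name: final_concentration(stock, volume, total_ul)
    for name, (stock, volume) in recipe.items()
}

for name, conc in concentrations.items():
    print(f"{name}: {conc:g} mM in {total_ul:g} µl")
```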


Anneal oligos

1. Anneal each of the oligonucleotide pairs separately at 10 µM.
Component Volume (µl) Final concentration
Oligo 1 (100 µM) 10 10 µM
Oligo 2 (100 µM) 10 10 µM
10X STE Buffer 10 1X
dH2O 70


Run the following annealing protocol in a thermocycler:
a. 95°C, 5 minutes
b. Ramp down temperature to 25°C at 1°C/30 s.
c. 4°C, ∞

2. Dilute 10 µM annealed oligoduplexes 1:1000 to a final concentration of 0.01 µM.
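
For planning purposes, the duration of the ramp step and the concentration after dilution can be worked out as follows. This is a small arithmetic sketch with hypothetical function names; it assumes the ramp is a constant 1°C every 30 s, as stated in step b.

```python
def ramp_duration_s(start_c: float, end_c: float,
                    step_c: float = 1.0, seconds_per_step: float = 30.0) -> float:
    """Time in seconds needed to ramp from start_c to end_c."""
    if step_c <= 0:
        raise ValueError("step_c must be positive")
    return abs(start_c - end_c) / step_c * seconds_per_step


def dilute(concentration: float, fold: float) -> float:
    """Concentration after a 1:fold dilution (same unit as the input)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return concentration / fold


ramp_minutes = ramp_duration_s(95, 25) / 60   # 70 steps of 30 s each
working_uM = dilute(10.0, 1000)               # 10 µM stock diluted 1:1000
```

The ramp from 95°C to 25°C therefore takes 35 minutes, so the program runs for about 40 minutes before reaching the 4°C hold.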

Cloning into multiplex gRNA vectors

left oligoduplex + middle oligoduplex + right oligoduplex + pSQT1313/BsmBI
1. Assemble the following 10 µl ligation reaction.

Component Volume (µl)
Left oligoduplex (0.01 µM) 2
Middle oligoduplex (0.01 µM) 2
Right oligoduplex (0.01 µM) 2
pSQT1313/BsmBI (~10 ng/µl) 2
10X T4 Ligase Buffer (NEB) 1
T4 Polynucleotide Kinase (NEB) 0.5
T4 DNA Ligase (NEB) 0.5


2. Incubate sample at 16°C for 30 minutes, then at 4°C overnight.
3. The following day, transform 5 µl of ligation into 50 µl of chemically competent cells. (Incubate 15-30 minutes on ice, heat shock at 42°C for 45 s, incubate 2-5 minutes on ice. Recover with 150-500 µl of SOC for 1 hour.)
4. Plate on an LB Agar plate containing 100 µg/ml carbenicillin.
5. Pick ~3-6 colonies per target site, grow overnight cultures in LB medium supplemented with 50 µg/ml carbenicillin, and prepare miniprep plasmid DNA.
6. Sequence verify multiplex gRNA constructs with primer oSQT379 (5’-AGGGTTATTGTCTCATGAGCGG-3’).

TALE repeat array:


Transcription Activator-Like Effectors (TALEs) are DNA-binding proteins used by Xanthomonas bacteria to facilitate colonization of plants. TALEs typically contain arrays of highly conserved, 33-35 amino acid TALE repeat domains that mediate sequence-specific DNA binding. Individual TALE repeat domains each bind to a single base pair of DNA with specificity primarily determined by two variable amino acid residues (known as repeat variable di-residues or RVDs) within the conserved repeat. TALE repeats can be joined together into extended arrays to create customized DNA-binding domains capable of binding to longer DNA sequences of interest.

ZiFiT v4.2 provides guidance for construction of engineered TALE repeat arrays based on a framework that has been shown to produce highly active DNA-binding proteins that function in C. elegans (Wood et al., Science 2011), zebrafish (Sander et al., Nat Biotechnol. 2011), rats (Tesson et al., Nat Biotechnol. 2011), and human somatic (Miller et al., Nat Biotechnol. 2011) and pluripotent stem cells (Hockemeyer et al., Nat Biotechnol. 2011). TALE repeat arrays constructed with this platform consist of 3 parts:


1. A 102 amino acid domain derived from a naturally occurring TALE that is required for DNA-binding and positioned just amino-terminal to the TALE repeat array

2. A series of 34 amino acid TALE repeat domains with different repeat variable di-residues (RVDs), with a single 20 amino acid “half-repeat” at the carboxy-terminal end (see figure).

3. A 63 amino acid domain derived from a naturally occurring TALE that is required for DNA-binding and positioned just carboxy-terminal to the TALE repeat array

Note that the TALE repeat array framework described above requires the presence of a T (thymine) just 5’ to the sequence bound by the TALE repeats (see Figure above).
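
The 5' T requirement and the one-repeat-per-base rule can be illustrated with a short Python sketch. The RVD code used here (NI for A, HD for C, NN for G, NG for T) is the commonly cited one from the TALE literature and is not stated on this page, so treat it as an assumption; the function name and the 0-based indexing are likewise hypothetical.

```python
# Commonly cited RVD code (assumption; not specified in these instructions)
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_site(sequence: str, start: int, length: int) -> list:
    """Return one RVD per base for the site sequence[start:start + length].

    start is a 0-based index into sequence. The base just 5' of the site
    (sequence[start - 1]) must be T, as required by the framework above.
    The last RVD in the list belongs to the carboxy-terminal half-repeat.
    """
    seq = sequence.upper()
    if start < 1 or length < 1 or start + length > len(seq):
        raise ValueError("site and its 5' flanking base must lie within the sequence")
    if seq[start - 1] != "T":
        raise ValueError("the base immediately 5' of the site must be T")
    return [RVD_FOR_BASE[base] for base in seq[start:start + length]]
```

For example, `rvds_for_site("TACGT", 1, 4)` describes the four-base site ACGT preceded by a T.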

ZiFiT v4.2 provides support for construction of TALE nucleases (TALENs) that consist of an engineered TALE repeat array domain as described above fused to the non-specific nuclease domain from the FokI restriction enzyme. TALENs can introduce targeted double-strand DNA breaks for inducing repair by non-homologous end-joining (NHEJ) or homologous recombination (HR). TALENs function as dimers and therefore a pair of nucleases must be designed and constructed for each required target site.

To design TALENs for a target sequence of interest, users enter the sequence into the ZiFiT program. With its default settings, ZiFiT will return as many as five target sites for which TALENs can be designed. For each target site, users can generate easy-to-follow graphical overviews that can be used to guide the assembly of plasmids encoding the desired TALEN. These guides are cross-referenced to plasmids that can be obtained from a dedicated website hosted by the non-profit plasmid distribution service Addgene (http://www.addgene.org/talengineering). Additional details for using ZiFiT v4.2 can be found on the “Examples” page.

Zinc Fingers (ZFs):

A ZF is a protein motif consisting of two beta strands and an alpha helix, stabilized by coordination of a zinc ion by pairs of conserved cysteine and histidine residues. Residues -1 to 6 of the alpha-helix (numbered relative to the start of the helix) recognize specific DNA triplet sequences, primarily by forming base-specific contacts in the major groove of the double-stranded target DNA. ZFs are often referred to according to "recognition" residues in the alpha helix, listed in N- to C-terminal direction; other residues in the ZF are referred to as the backbone.

As illustrated below, ZFs bind target DNA sites with amino acids of the recognition alpha helix (shown in the top line from the amino (N) to carboxyl (C) terminus) contacting consecutive nucleotides in DNA in the 3' to 5' direction. This can be confusing because the DNA target site is always referred to in the 5' to 3' direction, whereas amino acid sequences are referred to from N to C terminus. Therefore, in the ZF shown below, N-QSSNLVR-C, the R actually recognizes the G in the triplet 5'-GAA-3'.




Zinc Finger Arrays:

Multiple ZFs can be linked together to recognize a specific and preferably unique sequence in double-stranded genomic DNA. When multiple ZFs are joined together, the resulting multi-finger array recognizes a longer target DNA sequence. Multi-finger arrays can be fused to and thereby target transcriptional activation domains, repressors, or nucleases to specific locations in the genome.
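
The same N-to-C versus 5'-to-3' reversal described for a single finger applies to an array: the amino-terminal finger contacts the 3'-most triplet of the target site. The Python sketch below makes that bookkeeping explicit. The function name and the F1, F2, F3 labels (numbered from the amino terminus) are conventions chosen for the example.

```python
def assign_fingers(site: str) -> list:
    """Pair each finger with the triplet it contacts.

    site is written 5' to 3'. Fingers are numbered from the amino
    terminus, so F1 is paired with the 3'-most triplet.
    """
    site = site.upper()
    if not site or len(site) % 3 != 0:
        raise ValueError("site length must be a non-zero multiple of 3")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    return [(f"F{n}", triplet)
            for n, triplet in enumerate(reversed(triplets), start=1)]
```

For the 9 bp site 5'-GAAGCTGGG-3', this pairs F1 with GGG, F2 with GCT, and F3 with GAA.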



Zinc Finger Nucleases (ZFNs):

ZFNs consist of two zinc finger arrays, each fused to a nuclease domain. The FokI nuclease is only active as a dimer, and dimerization occurs when both zinc finger arrays bind their target sequences. The requirement for dimerization ensures that two zinc finger nucleases must bind to DNA in a particular configuration in order to cleave it. Multi-finger arrays are typically designed to recognize and bind sites that are separated by a “spacer” sequence of several base pairs (usually 5, 6, or 7 bp in length). We use the naming convention “Left Array” and “Right Array” to refer to the array that binds the sequence at the 5’ bottom strand and the 3' top strand, respectively, of the target DNA site (diagram shown below).
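
The Left Array / Right Array convention can be sketched in Python as follows. This reflects one reading of the description above and is an assumption: the full site is given as the top strand written 5' to 3', the Left Array binds the bottom strand across the 5' half-site (so its target is the reverse complement of that half), and the Right Array binds the top strand across the 3' half-site. The function names and the 9 bp default half-site length (three fingers) are chosen for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def split_zfn_site(site: str, spacer_len: int, half_site_len: int = 9) -> dict:
    """Split a full ZFN site (top strand, 5' to 3') into its parts.

    Returns the sequence bound by each array, written 5' to 3' on the
    strand that array binds, plus the spacer as it appears on the top strand.
    """
    site = site.upper()
    if len(site) != 2 * half_site_len + spacer_len:
        raise ValueError("site length does not match half-site and spacer lengths")
    left_top = site[:half_site_len]
    spacer = site[half_site_len:half_site_len + spacer_len]
    right_top = site[half_site_len + spacer_len:]
    return {
        "left": reverse_complement(left_top),  # bottom strand, 5' to 3'
        "spacer": spacer,
        "right": right_top,                    # top strand, 5' to 3'
    }
```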

Zinc Finger Cloning (ZFNs):

Cloning ZFNs: Each ZFN consists of two zinc finger arrays each fused to monomeric nuclease domains. The preference of a ZFN dimer for spacers of different lengths can be altered by varying the linker between the zinc finger array and the nuclease domain. Changes to this linker affect both the cloning method and the target plasmids. ZF arrays generated within standard Zinc Finger Consortium plasmids can be cloned into ZFN expression vectors using unique XbaI/BamHI restriction sites. The resulting vector will produce a ZFN designed to recognize sites with a 5 or 6 bp spacer (Foley 2009). However, ZFNs with a 7 bp spacer require PCR amplification with primers that introduce XbaI/NotI sites for insertion into alternative Zinc Finger Consortium plasmids that encode a ZFN capable of recognizing target sites with a 7 bp spacer.





CoDA (Context Dependent Assembly):

CoDA is a design-based ZF assembly method that utilizes zinc finger domains pre-selected to work well together. Novel three-finger ZFPs are generated by combining Finger 1 domains and Finger 3 domains that have been pre-selected to bind their cognate target sites in the context of the same Finger 2 domain (see figure below).


CoDA - Input

Sequence: The DNA sequence for which a user wishes to identify potential target sites is pasted into the Sequence Window. Sequences should be in FASTA format. White space and numbers will be ignored.

Exon/Intron Case Sensitivity: This option allows users to distinguish between intron and exon sequences by denoting exons as uppercase and introns as lowercase. This information is used to generate a model of the gene as a graphic in the output. Additionally, this information can be used to filter out targets that occur in introns from the output.

Spacer size: This parameter is only available when designing zinc finger nucleases. The user specifies the number of nucleotides between the ZF arrays. The appropriate distance is determined by the length of the amino acid linker between the ZF array and the associated nuclease domain. For standard linkers, the spacer is five or six bp, which provides the proper spacing between the zinc finger nuclease monomers so that they can interact to create a functional enzyme.

Triplets (Position 1, Position 2, Position 3): This parameter allows users to select which module pools to consider for target site identification. Default parameters include all pools currently available from the Zinc Finger Consortium.
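
The input conventions described above (FASTA input, white space and numbers ignored, exons in uppercase and introns in lowercase) can be mimicked when preparing a sequence. The Python sketch below is not ZiFiT's own parser; the function names, and the choice to drop any line starting with ">" as a FASTA header, are assumptions.

```python
import re


def clean_sequence(fasta_text: str) -> str:
    """Drop FASTA header lines, then remove white space and digits.

    Letter case is preserved because it encodes exon/intron status.
    """
    lines = [ln for ln in fasta_text.splitlines() if not ln.startswith(">")]
    return re.sub(r"[\s\d]", "", "".join(lines))


def exon_intervals(sequence: str) -> list:
    """Return (start, end) pairs of uppercase runs, 0-based and end-exclusive."""
    return [(m.start(), m.end()) for m in re.finditer(r"[A-Z]+", sequence)]
```

For example, pasting a numbered, wrapped sequence through `clean_sequence` and then `exon_intervals` gives the coordinates of the uppercase (exon) stretches, which could be used to discard target sites that fall in introns.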


CoDA - Advanced Options

Triplet Composition:

The user can specify the composition of nucleotide triplets desired in target DNA sequences. For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets are desired, the user can choose to exclude CNN and TNN triplets from consideration by setting the max value for these triplets to zero. As another example, the user can specify search parameters that require at least 3 GNN triplets in the target.
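
The triplet composition filter can be pictured as counting triplets by their first base and comparing the counts against minimum and maximum values. The Python sketch below shows the idea under that assumption; it is not ZiFiT's implementation, and the function names and dictionary-based parameters are hypothetical.

```python
from collections import Counter


def triplet_classes(site: str) -> Counter:
    """Count triplets by class (ANN, CNN, GNN, TNN) for a site written 5' to 3'."""
    site = site.upper()
    if not site or len(site) % 3 != 0:
        raise ValueError("site length must be a non-zero multiple of 3")
    return Counter(site[i] + "NN" for i in range(0, len(site), 3))


def passes_composition(site: str, min_counts=None, max_counts=None) -> bool:
    """Check a site against per-class minimum and maximum triplet counts."""
    counts = triplet_classes(site)
    for cls, minimum in (min_counts or {}).items():
        if counts[cls] < minimum:
            return False
    for cls, maximum in (max_counts or {}).items():
        if counts[cls] > maximum:
            return False
    return True
```

Excluding CNN and TNN triplets corresponds to `max_counts={"CNN": 0, "TNN": 0}`, and requiring at least three GNN triplets corresponds to `min_counts={"GNN": 3}`.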




OPEN (Oligomerized Pool ENgineering):

This approach to ZFP production uses pools of zinc fingers, each of which consists of numerous unique solutions that recognize a particular DNA triplet at one of three positions (amino-terminal, middle, and carboxy-terminal) in a three-finger protein. Pools are recombined to generate hundreds of thousands of unique solutions for each target site. Optimal combinations of fingers that work well together are identified using genetic selections in which binding of the protein to the target site upstream of a pair of selectable marker genes activates expression and confers a growth advantage to E. coli bacteria.


OPEN - Input

Sequence: The DNA sequence for which a user wishes to identify potential ZF target sites is pasted into the Sequence Window. Sequences should be in FASTA format. White space and numbers will be ignored.

Spacer size: This parameter is only available when designing zinc finger nucleases. The user specifies the number of nucleotides present in the spacer sequence between the “half-sites” bound by each ZF array. Note: The DNA sequences encoding ZF arrays that are provided as output contain the appropriate flanking XbaI/BamHI or XbaI/NotI restriction sites.

Triplets (Position 1, Position 2, Position 3): This parameter allows users to select which OPEN pools to consider for target site identification. Default parameters include all pools currently available from the Zinc Finger Consortium.


OPEN - Advanced Options

Triplet Composition:

The user can specify the composition of nucleotide triplets desired in target DNA sequences. For example, if ANN (i.e., A followed by any 2 nucleotides) and GNN triplets are desired, the user can choose to exclude CNN and TNN triplets from consideration by setting the max value for these triplets to zero. As another example, the user can specify search parameters that require at least 3 GNN triplets in the target.


OPEN - Scoring

Active/Inactive Scoring (ZiFOpT): Active/Inactive predictions and their corresponding confidence values reflect the likelihood that an OPEN selection for a given 9 bp sequence will successfully yield a zinc finger array capable of activating transcription of a reporter gene by three-fold or more in the well-established bacterial two-hybrid assay (Wright et al., Nat. Protocols 2006; Maeder et al., Nat. Protocols 2009). Previous work has shown that when a pair of zinc finger arrays both activate over the three-fold cut-off, these arrays have a high probability (>50%) of functioning when used as zinc finger nucleases. Predictions are based on success rates and sequence composition for target sequences previously assayed in the OPEN system (Sander & Reyon, BMC Bioinformatics 2010).




Interface with ZiFDB:


To enhance its utility, ZiFiT is interfaced directly with ZiFDB, a web-accessible database of zinc fingers and engineered zinc finger arrays. Target Site hyperlinks within the ZiFiT output directly query ZiFDB to determine if any previously constructed arrays exist that bind to completely or partially matched target sequences. In addition, ZiFiT users can query ZiFDB for finger information for a specific triplet subsite by clicking on the triplet. Thus, ZiFiT and ZiFDB work synergistically to aid in ZFP design.