Members of this family have been found in more than 40 bacterial species. Their GC content ranges from 70% in the mycobacterial examples to 25% in those isolated from Mycoplasma species. In spite of this enormous variation, they are strikingly similar in many respects and form an extremely coherent and highly related family. Most show a similar GC content to their host organism. Several family members have been shown to be part of compound transposons. These include IS3411, which flanks genes for citrate utilisation (220), IS4521, which flanks a heat-stable enterotoxin gene in enterotoxigenic Escherichia coli (203), and IS1706 (208), which flanks genes of the Clp protease/chaperone family.
Members are characterised by lengths of between 1200 and 1550 bp (one exception previously attributed to this family, IS481, 1045 bp, has now been placed in a separate family; see IS481 family), and inverted terminal repeats in the range of 20-40 bp. These repeats are variable but clearly related (Fig). The majority of the elements terminate with 5'-TG-----CA-3' and present an internal block of G/C residues of variable length. IS3-family members generally have two consecutive and partially overlapping reading frames, orfA and orfB, in relative translational reading phases 0 and -1, respectively (Fig. A). It has been demonstrated in at least three cases (IS150 (495), IS3 (429), and IS911 (376)) that, in addition to the product of the upstream frame, OrfA, a fusion protein, OrfAB, is generated by programmed translational frameshifting (see (79)). However, in contrast to IS1, the product of the downstream frame, OrfB, is also detected. The frequency of frameshifting varies from element to element. It is approximately 50% in the case of IS150 (495) and only 15% for IS911 (376).
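As an illustration only, the length range and terminal features described above can be expressed as a simple sequence check. The following Python sketch is not taken from the cited work: the function names, the default repeat length and the idea of scoring identity between the two ends are assumptions made for illustration. Because the terminal repeats are variable, the identity score is a rough indication and not a classifier.

```python
# Hypothetical helpers illustrating the hallmarks described in the text:
# length of 1200-1550 bp, 5'-TG...CA-3' termini, and related terminal IRs.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_family_length_and_termini(seq, min_len=1200, max_len=1550):
    """True if seq lies in the stated length range and has TG...CA termini."""
    s = seq.upper()
    return min_len <= len(s) <= max_len and s.startswith("TG") and s.endswith("CA")


def terminal_repeat_identity(seq, ir_len=20):
    """Fraction of positions at which the left end matches the reverse
    complement of the right end over ir_len bases (ungapped comparison)."""
    s = seq.upper()
    if ir_len <= 0 or 2 * ir_len > len(s):
        raise ValueError("ir_len must be positive and at most half the sequence length")
    left = s[:ir_len]
    right_rc = reverse_complement(s[-ir_len:])
    matches = sum(a == b for a, b in zip(left, right_rc))
    return matches / ir_len
```

An ungapped comparison is used to keep the sketch short; real terminal repeats may require a gapped alignment.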
Several members exhibit an organisation which does not apparently conform to the generic IS3 member. In IS120, for example, the relationship between the reading phases of the upstream and downstream orfs appears to be +1 rather than -1, while in ISNg1 and ISYe1 the characteristic motifs of OrfB (see below) are distributed between reading phases. Other members, such as IS1076, IS1138, IS1221, and IS1141, exhibit only one long open reading frame. Although these may be true variants, it cannot at present be ruled out that the variations are simply due to errors in sequence determination.
Family members from Mycoplasma species merit special attention. Not only does the host use a non-universal genetic code in which the opal termination codon TGA directs insertion of tryptophan (see (358)), but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS3 family members have been observed in Mycoplasma. Of these, only IS1138 (and IS1138b) has been demonstrated directly to undergo autonomous transposition (40). All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS3 family characteristics have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called IS1221. As indicated above, one of these carries a single long reading frame (representing orfA + orfB) instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of orfA, orfB, or both. Expression studies in E. coli indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of IS1221. Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single orfAB frame by translational frameshifting, an expression pattern "inverted" relative to that of the majority of the family members (532). Although this appears not to be a general rule for IS3 family members originating from Mycoplasma hosts, the presence of a similar single-frame arrangement in a second member, IS1138, indicates that it might not be rare. Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. Only direct experiment will therefore be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.
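The abundance of candidate frameshift windows in AT-rich sequences can be illustrated with a short scan. The following Python sketch is an assumption-laden illustration and not a method from the cited studies: it reads "A6G(/C)" as six A residues followed by G or C, and "A7" as seven consecutive A residues; the function name is hypothetical. As noted above, a match only marks a potential window and says nothing about whether it is used.

```python
import re

# Six A residues followed by A (A7), G (A6G) or C (A6C).
# The lookahead lets overlapping windows be reported separately.
WINDOW = re.compile(r"(?=(A{6}[AGC]))")


def potential_frameshift_windows(seq):
    """Return (0-based position, motif) for every A7, A6G or A6C window."""
    s = seq.upper()
    return [(m.start(), m.group(1)) for m in WINDOW.finditer(s)]
```

For example, `potential_frameshift_windows("GGG" + "A" * 6 + "G")` reports a single A6G window starting at position 3.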
Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements (120, 135, 238, 242, 404) have provided insights into structure/function relationships of the proteins. The predicted primary amino acid sequences of the various OrfA proteins from different members of the family exhibit a relatively strong α-helix-turn-α-helix motif (Fig. A), suggesting that they might provide sequence-specific binding to the terminal IRs of their particular IS (425). The OrfB products, on the other hand, carry a DD(35)E motif and share additional identities with retroviral integrases and various other Tpases (120, 135, 192, 238, 242, 404, 375). These include two amino acids located 4 and 7 residues downstream from the glutamate residue. Interestingly, many members carry a putative leucine zipper located at the end of OrfA (and sometimes extending into the OrfB region of the OrfAB protein) (see (486), (532), (52)). Studies with IS911 and IS2 strongly suggest that this represents one domain of multimerization of the proteins (191, 193, 282).
The IS3 family can be divided into several subgroups (Table 2) (Table 1) defined by deep branching in the alignment of the various OrfB sequences (Fig) (310). We have designated these the IS2 and IS407 subgroups (which appear closely related), and the IS3, IS51, and IS150 subgroups. Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein (310). Moreover, there is a strong correlation between the members of each subgroup and the number of base pairs of target DNA duplicated on insertion: for those elements in the IS2 subgroup, insertion invariably leads to a 5 bp direct target repeat; for the IS407 subgroup a 4 bp repeat is observed; while for the other subgroups a 3 bp repeat is obtained. In the latter cases some of the elements, e.g. IS911, have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal IRs (Fig) and is particularly marked in the IS2, IS51 and IS407 subgroups. It can also be observed in the primary sequence details of the putative leucine zipper (data not shown).
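The correlation between subgroup and target duplication length can be summarised as a small lookup. This Python sketch simply restates the values given above; the dictionary and function names are hypothetical, and the table should be read as the usual case only, since elements such as IS911 occasionally generate 4 bp repeats.

```python
# Usual direct target repeat length (bp) per subgroup, as stated in the text.
TARGET_REPEAT_BP = {"IS2": 5, "IS407": 4, "IS3": 3, "IS51": 3, "IS150": 3}


def expected_target_repeat(subgroup):
    """Return the usual target duplication length (bp) for a subgroup."""
    return TARGET_REPEAT_BP[subgroup]


def candidate_subgroups(observed_bp):
    """Return the subgroups whose usual repeat length equals observed_bp."""
    return sorted(name for name, bp in TARGET_REPEAT_BP.items() if bp == observed_bp)
```

An observed 3 bp duplication is therefore compatible with three subgroups, so the repeat length alone narrows but does not fix the assignment.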
Several members carry GATC methylation sites within 50 bp of their ends, which have been shown in one case, IS3, to modulate transposition activity (450); however, this is not a general characteristic of the family, nor is it restricted to any particular subgroup.
Little is known concerning the insertion specificity of members of the family. IS2 exhibits a preference for a region of bacteriophage P1 but the basis of this preference is at present unknown (433). Both IS911 (378) and IS150 (507) have been found next to sequences which resemble their IRs and IS1397 is invariably located within intergenic repeated sequences in E. coli (bacterial interspersed mosaic elements or BIMEs, (18)).
Finally, an element isolated from the ECOR collection of E. coli and closely related to IS3411 carries a group II intron (136). The implications of this for the regulation of transposition of this element have not been investigated.
One of the best characterized members of the IS3 family is IS911. OrfA and OrfAB bind specifically to the terminal IRs (373, 191, 193). Constitutive production of OrfAB from a contiguous orfAB frame (generated by mutation and eliminating production of OrfA) leads to only a modest increase in intermolecular transposition activity. In contrast, high production of OrfAB in this way either in cis or in trans stimulates production of excised circular transposon copies whose formation is best explained by a single-ended attack by one IS extremity on the other (377). This is thought to occur in a two-step process (Fig. 8) in which one end of the transposon is cleaved to generate a free 3'OH which, in turn, is used as a nucleophile to attack the opposite end. This results in the circularization of a single transposon strand visualised as a figure-eight molecule in which the transposon ends are joined by a single-strand bridge (Fig. 8; (374)). It leaves a 3'OH group on the vector backbone at the point of insertion. Kinetic studies suggest that the figure-eight species is processed into transposon circles.
The figure-eight recombination reaction has been reproduced using a cell-free system but no transposon circles, excised linear transposon species or indeed any product resulting from double-strand cleavage at the transposon ends was detected in the in vitro system. It is not known how the figure-eight form might be processed into transposon circles although host factors which promote either replication or second strand recombination are thought to be involved (379). Using a purified figure-eight substrate the cell-free system was found to support a reaction equivalent to "disintegration" characteristic of retroviral integrases (86) (see Reaction mechanism). Moreover, this activity is also exhibited by the OrfB region of the protein alone, demonstrating that the DD(35)E domain carries out catalysis (379).
Although it is not yet clear whether transposon circles are "natural" transposition intermediates, they are efficient Tpase substrates for intermolecular transposition in vivo ((477), and see (378)). Simultaneous high-level expression of OrfA with the OrfAB protein greatly reduces or eliminates formation of excised transposon circles (and the figure-eight species) and stimulates intermolecular transposition. It also stimulates intramolecular transposition of a plasmid carrying a cloned circle junction. The development of an intermolecular transposition system in vitro (478) has demonstrated highly efficient integration of transposon circles in a reaction which requires both OrfAB and OrfA. Integration does not require a supercoiled donor molecule and is optimal when the abutted IRs are separated by 3 bp, as occurs in the circle junction. More recently, linear derivatives of IS911 have been observed in vivo. They appear to be derived from the transposon circle rather than resulting from direct excision from the donor plasmid molecule. Moreover, while they undergo integration in vitro, the efficiency of this reaction is significantly reduced compared to that of the transposon circles (479).
One striking feature of transposon circularization is that it creates a strong promoter at the circle junction in which IRR contributes a -35 hexamer and IRL a -10 hexamer (411). This has suggested a novel mechanism for autoregulation of transposition (Fig). Transposon circles are proposed to be generated at low frequencies by a combination of low Tpase levels (from the weak endogenous promoter) and host functions (to assure processing of the second strand). Once formed, the junction promoter assures high levels of Tpase which is capable of binding to the abutted ends, introducing two single-strand breaks (one at each end) generating an opened transposon molecule and transferring both ends to a suitable target. Insertion results in the destruction of the efficient junction promoter. This model does not require double strand cleavage and therefore takes into account the observation that the only activity detected for the OrfAB protein is cleavage and transfer of a single DNA strand. It also assures that a suitable substrate is present before high levels of Tpase are produced. The results do not rule out alternative pathways involving simple excision (Fig).
Several other members of this family are also being analysed in detail. These include IS2, IS3, and IS150. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase (287), (429), (507).
IS3 also generates adjacent deletions (429) but, unlike IS911, appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate (430) (Fig). Such forms may be equivalent to the linear IS911 species derived from transposon circles. In addition, IS3-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition (450). Insertion of IS3 generally creates 3 bp and sometimes 4 bp direct target repeats. It is significant that plasmids in which the IRs are separated by 4 bp are more active than those separated by 8 bp. In these studies the authors were unable to engineer derivatives with two complete tandem IS3 elements. This may be the result of the formation of a strong hybrid promoter which, as described for IS911 and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction, and suggests that abutted ends of IS3 are also efficient substrates in transposition.
IS2 generates direct target duplications of 5 bp on insertion (156) although transposon circles generated with this element carry only a single base pair separating IRL and IRR (287). Moreover, while IS2 carries a conserved terminal 5'-CA-3' at its right end, the left end terminates with 5'-TG-3'. Further analysis of IS2 circles has demonstrated that the atypical IRL does not act as a strand donor but uniquely as a target in the circularization reaction. Functional studies indicate that the product of the upstream orfA may inhibit transposition (202). It has been shown to bind specifically to IRL at a sequence which overlaps the -10 hexamer of the resident Tpase promoter and repress expression of OrfA. It does not appear to bind IRR (note that in the original article the authors invert the standard definition of IRL and IRR; (202)). Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. IS2-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition (463), (287) and, like IS911, the circle junction of IS2 also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS3-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.
Mahillon J. and
and Molecular Biology
62 : 725-774
Chandler, M. and Mahillon, J. (2002) Insertion Sequences Revisited. In Mobile DNA II, edited by N.L. Craig et al., ASM Press, 305-366
with permission of the American Society for Microbiology, 10-26-01.
Last modification: December 20, 2001