BMC Cell Biology BioMed Central

Background The mechanism involved in the maintenance and differentiation of embryonic stem (ES) cells is incompletely understood. Results To address this issue, we have developed a retroviral gene trap vector that can target genes expressed in undifferentiated ES cells. This gene trap vector harbors both GFP and Neo reporter genes. G-418 drug resistance was used to select ES clones in which the vector was integrated into transcriptionally active loci. This was then followed by GFP FACS profiling to identify ES clones with reduced GFP fluorescence and, hence, reduced transcriptional activity when ES cells differentiate. Reduced expression of the GFP reporter in six of three hundred ES clones in our pilot screening was confirmed to be down-regulated by Northern blot analysis during ES cell differentiation. These six ES clones represent four different genes. Among the six integration sites, one was at Zfp-57 whose gene product is known to be enriched in undifferentiated ES cells. Three were located in an intron of a novel isoform of CSL/RBP-Jkappa which encodes the key transcription factor of the LIN-12/Notch pathway. Another was inside a gene that may encode noncoding RNA transcripts. The last integration event occurred at a locus that may harbor a novel gene. Conclusion Taken together, we demonstrate the use of a novel retroviral gene trap vector in identifying genes preferentially expressed in undifferentiated ES cells.


Background
CtBP proteins were originally identified as C-terminal binding proteins of type 2/5 adenovirus E1A proteins [1]. They function primarily in the nucleus as transcriptional co-repressors, modulating the activity of a large number of transcriptional repressors via recruitment of chromatin modifiers such as histone deacetylases, histone methyltransferases and polycomb proteins [2][3][4], and sequestration of histone acetyltransferases [5]. CtBP proteins also play a role in the cytoplasm in regulating mitotic Golgi membrane fissioning [6,7], and also associate with centrosomes during mitosis [8]. CtBP proteins have been implicated in tumorigenesis, as their interaction with the C-terminus of E1A is essential for immortalisation of primary rodent cells, and also negatively regulates E1A-mediated transformation, tumorigenicity and metastasis [1,9,10]. In addition, many transcriptional repressors regulated by CtBPs are involved in pathways associated with tumorigenesis, including TGF-β and Wnt signalling pathways and cell cycle regulators such as RB/p130 and HDM2 [11][12][13][14][15]. Presumably, as a consequence of disruption of some of these critical functions, inhibition of CtBP expression in cancer cells can result in apoptosis [16]; reviewed in [17].
Humans possess two CtBP gene loci, CTBP1 and CTBP2. CtBP1 and CtBP2 proteins share 78% amino acid identity and 83% similarity [18]. Alternate promoter usage and gene splicing from the CTBP2 locus generates RIBEYE, a retina-specific CtBP2 variant [19]. The CTBP1 locus also similarly encodes a CtBP1 variant with an alternate N-terminus, variously described as CtBP3, BARS or CtBP1-S [20]. The primary protein products, CtBP1 and CtBP2, both contain a conserved N-terminal domain involved in the binding of transcription factors possessing a consensus PxDLS peptide motif, and a central dehydrogenase homology domain that has a number of functions, including dimerisation. CtBP1 and CtBP2 appear to function interchangeably, at least in terms of their role as transcriptional co-repressors, but evidence is emerging that they are subject to differential transcriptional and posttranslational regulation (reviewed in [21]).
Control of subcellular localisation is emerging as an important mechanism whereby CtBP1 function is regulated. For example, phosphorylation of CtBP1 at Ser158 by p21-activated kinase 1 (PAK1) results in cytoplasmic localisation and inhibition of its corepressor activity under certain growth conditions [22]. Certain PxDLS-containing transcriptional repressors are able to recruit CtBP1 to the nucleus, such as Ets family member NET [23] and the tumour suppressor protein HIC1 [24]. CtBP1 is modified by sumoylation at K428, which, in conjunction with protein-protein interactions involving its C-terminal PDZ-binding domain [25,26], regulates its nuclear localisation [26]. CtBP2 lacks both this sumoylation site and the PDZ-binding domain, indicating that its subcellular localisation is likely to be regulated in a different manner to CtBP1. We therefore examined the primary sequence of CtBP2 to look for alternative sequence motifs that could be involved in the regulation of its localisation. This analysis identified a putative, evolutionarily conserved nuclear localisation signal (NLS), which has recently been shown to be functional in promoting the nuclear accumulation of CtBP2 [27,28], though one of these studies concluded that it functions in nuclear retention rather than nuclear import [27]. In the present study, we have undertaken a detailed analysis of the role played by this N-terminal sequence of CtBP2 in regulating CtBP protein localisation.

Structure and phylogenetics of CTBP loci
The CtBP2 protein sequence was subjected to an in silico search for potential nuclear localisation signals [29] and a potential NLS (KxKRQR) was identified at amino acids (a.a.) 8-13. Because this sequence is located within the non-conserved N-terminus of CtBP2, and because of the differential promoter usage and alternative splicing of the variant CtBP proteins, we first clarified the genomic structures of the 5' regions of the CTBP1 and CTBP2 loci (Fig. 1a). Fig. 1b shows the N-terminus of human CtBP2 and its homology to other known CtBP proteins. The putative NLS in human CtBP2 is conserved completely in mouse and zebrafish CtBP2, and contains a single amino acid substitution in quail CtBP2. CtBP1, as well as the single CtBP in Drosophila and Xenopus, do not contain this sequence, though Drosophila CtBP does contain a short lysine-arginine rich sequence (KRSR) that is not present in CtBP1 proteins. It is also interesting to note that a.a. 1-20 in CtBP2, including the putative NLS, is encoded by a short exon 1 that is located more than 30 kb upstream of the rest of the gene.
Figure 1. CtBP gene structure and sequence comparison. (a) Genomic structure of the 5' end of human CTBP1 and CTBP2 genes. Published cDNA sequences were compared to human genome sequences using BLAST analysis. Solid lines show splicing of the major CTBP1 and CTBP2 transcripts, and dotted lines indicate the alternate splicing to generate CtBP1-S and RIBEYE. (b) ClustalW alignment of CtBP sequences from multiple higher organisms. Putative nuclear localisation signals are highlighted in bold type. The residues deleted in the Δ4-14 constructs are marked. Zebrafish have two ctbp2/ribeye loci [36].
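The motif search described above can be illustrated with a minimal sketch. It does not reproduce the prediction tool cited as [29]; it only scans a sequence for the KxKRQR pattern named in the text, where x is any residue. The function name `find_motif` and the toy sequence are assumptions made for illustration, and the toy sequence is deliberately artificial rather than the real CtBP2 sequence.

```python
import re

# Pattern taken from the text: K, any residue (x), then KRQR.
NLS_PATTERN = re.compile(r"K.KRQR")


def find_motif(sequence, pattern=NLS_PATTERN):
    """Return (start, end, match) for each hit, using 1-based inclusive positions."""
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(sequence)]


# Hypothetical toy sequence (NOT the real CtBP2 sequence): alanine padding
# with the motif placed at residues 8-13, as reported for human CtBP2.
toy_sequence = "M" + "A" * 6 + "KVKRQR" + "A" * 7

print(find_motif(toy_sequence))  # [(8, 13, 'KVKRQR')]
```

Reporting 1-based inclusive coordinates keeps the output directly comparable with the residue numbering used in the text (a.a. 8-13).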

Amino acids 4-14 of CtBP2 promote its localisation to the nucleus
In order to establish whether the unique N-terminal region of CtBP2 is important in determining CtBP2 subcellular distribution, we expressed various CtBP2-EGFP fusion proteins in HEK 293 cells (Fig. 2). 48 hours after transfection, cells were fixed and counterstained with DAPI, and analysed by fluorescence microscopy. The control, EGFP alone, localised to the nucleus and cytoplasm (Fig. 2a). Both full-length CtBP2-EGFP (Fig. 2b) and a truncated version containing a.a. 8-13 and the N-terminal PxDLS-binding domain, CtBP2(1-119)-EGFP (Fig. 2c), were detectable exclusively in the nucleus. Deletion of eleven amino acids encompassing a.a. 8-13 in full-length CtBP2 (CtBP2(1-445)Δ4-14-EGFP) resulted in a partial redistribution of the protein to the cytoplasm, although it was still predominantly nuclear (Fig. 2d). The Δ4-14 mutation was also made in the context of CtBP2(1-119)-EGFP (Fig. 2e), and resulted in a more pronounced redistribution to the cytoplasm compared to its effect on the full-length protein, though again EGFP fluorescence was still detectable in the nucleus. Substitution of a.a. 4-14 with a bona fide NLS from SV40 large tumour antigen at the N-terminus of the truncated CtBP2 mutant (CtBP2(1-119)NLS-EGFP) also resulted in exclusive nuclear localisation (Fig. 2f). Thus, residues 4-14 of CtBP2 are important for maintaining its nuclear localisation, although other regions within the CtBP2 protein are clearly also involved.
Figure 2. Subcellular localisation of EGFP-tagged CtBP2 proteins in HEK 293 cells.
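The Δ4-14 construct nomenclature can be made concrete with a short sketch of an in-frame deletion applied to a protein sequence. This is an illustration of the residue arithmetic only, not the cloning procedure used here. The helper `delete_residues` and the toy sequence are hypothetical; the toy sequence is artificial alanine padding around the KVKRQR motif, not the real CtBP2 sequence.

```python
def delete_residues(sequence, first, last):
    """Remove residues first..last (1-based, inclusive) from a protein sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("deletion range lies outside the sequence")
    return sequence[: first - 1] + sequence[last:]


# Hypothetical toy sequence (NOT the real CtBP2 sequence), 20 residues long,
# with the motif KVKRQR at positions 8-13.
toy_sequence = "M" + "A" * 6 + "KVKRQR" + "A" * 7

# Analogue of the delta-4-14 deletion: eleven residues spanning the motif.
deleted = delete_residues(toy_sequence, 4, 14)

print(len(toy_sequence) - len(deleted))  # 11
print("KVKRQR" in deleted)               # False
```

The deletion removes eleven residues (4 to 14 inclusive), which is why it fully encompasses the six-residue motif at a.a. 8-13.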
To ensure that the above results were not affected by the presence of a large EGFP tag, we cloned various CtBP constructs into a vector containing a smaller myc-his tag (mh). Expressed proteins were detected using a 6xHis-specific primary antibody. Nuclear localisation in HEK 293 cells was confirmed for exogenous full-length CtBP2(1-445)mh (Fig. 3b). The deletion mutant (CtBP2(1-445)Δ4-14mh) showed a similar nuclear and cytoplasmic localisation to its corresponding EGFP fusion protein (Fig. 3c). We also cloned full-length CtBP1 into this expression vector, in order to compare the results with those for CtBP2, and with other studies. Consistent with previous studies on other cell lines, exogenous CtBP1(1-440)mh localises primarily to the nucleus, with some cytoplasmic staining (Fig. 3a).
Cell type-specific differences have been observed in the degree of nuclear and cytoplasmic localisation of overexpressed CtBP1 [23,26,30]. We wanted to investigate the behaviour of CtBP2 following over-expression in different cell lines, and specifically whether it would still be localised to the nucleus in cells in which CtBP1 is cytoplasmic. As expected [26], CtBP1(1-440)mh is distributed in both the nucleus and cytoplasm of HeLa cells, with staining being strongest in the nucleus (Fig. 4a). Similar to a previous report [23], we found that CtBP1(1-440)mh localises predominantly to the cytoplasm in over 60% of Cos-7 cells, with some cells showing a nuclear and cytoplasmic distribution (Fig. 4d). CtBP1(1-440)mh is nuclear and cytoplasmic in MCF-7 cells, similar to HeLa cells (Fig. 4g). CtBP2(1-445)mh localises exclusively to the nucleus of all three cell lines (Figs. 4b,e,h). In the absence of a.a. 4-14, CtBP2(1-445)Δ4-14mh remains primarily nuclear in all three cell lines, though with a clear increase in cytoplasmic staining similar to our findings in HEK 293 cells (Figs. 4c,f,i). These experiments, particularly those with Cos-7 cells, confirm that the presence of a.a. 4-14 in CtBP2 confers upon it an almost exclusively nuclear distribution. This is in contrast to CtBP1, which shows cell type-dependent variations in its localisation.

Binding of PxDLS-containing proteins is not required for the a.a. 4-14-independent nuclear localisation of CtBP2
Our experiments show that even when a.a. 4-14 are absent, a large proportion of the CtBP2 still localises to the nucleus. As a previous study has shown that binding of a PxDLS-containing protein to CtBP1 promotes its nuclear localisation [25], we decided to investigate whether such an interaction may also drive the nuclear localisation of CtBP2. We introduced a point mutation (V72R) into the PxDLS-binding motif of the CtBP2(1-445)mh constructs to generate CtBP2(1-445)V72Rmh and CtBP2(1-445)Δ4-14V72Rmh. This mutation renders CtBPs defective in their interaction with PxDLS-containing proteins [31]. Full-length CtBP2(1-445)mh with the V72R mutation localises to the nucleus in both Cos-7 and MCF-7 cells (Fig. 5a,c). CtBP2(1-445)Δ4-14V72Rmh localises to both the nucleus and cytoplasm (Fig. 5b,d), with no further increase in cytoplasmic distribution compared to the CtBP2(1-445)Δ4-14mh protein (compare Figs. 5b,d with Figs. 4f,i). We therefore conclude that the a.a. 4-14-independent nuclear localisation of CtBP2 in these cells also occurs independently of its PxDLS binding ability.

CtBP2 influences CtBP1 subcellular localisation
We, and others, have shown that the unique N-terminal region of CtBP2 is a major factor driving its accumulation […] mechanisms determining its subcellular distribution remain an important question for understanding the regulation of its function. To this end, we asked whether heterodimerisation with CtBP2 might be able to recruit CtBP1 to the nucleus. To examine this, we analysed the effects of coexpressing both mhCtBP1 and various CtBP2-EGFP constructs in Cos-7 cells. CtBP1(1-440)mh, when transfected individually (Fig. 4d) or with EGFP-N1 control (Fig. 6a), is predominantly cytoplasmic. Co-expression of CtBP1(1-440)mh with CtBP2(1-445)-EGFP results in a striking relocalisation of CtBP1(1-440)mh to the nucleus in a high proportion of the cells (Fig. 6b). Quantification of this by counting stained cells showed over-expressed CtBP1(1-440)mh to be primarily cytoplasmic in 75% of the cells, with a mixed nuclear/cytoplasmic localisation in 25%. When co-transfected with CtBP2(1-445)-EGFP, this changes to 45% nuclear/cytoplasmic and 55% primarily nuclear. This effect is dependent on CtBP2 being correctly localised to the nucleus, as demonstrated by the effects of co-expressing CtBP1(1-440)mh and CtBP2(1-445)Δ4-14-EGFP (Fig. 6c). Finally, we demonstrate that the recruitment of CtBP1 to the nucleus by CtBP2 requires a.a. 120-445 of CtBP2, as co-expression of EGFP-CtBP2(1-119) does not alter the localisation of CtBP1(1-440)mh (Fig. 6d), and the two proteins fail to co-localise. As a.a. 120-445 contain the dimerisation domain, this finding is consistent with heterodimerisation with CtBP2 being a mechanism whereby CtBP1 can be recruited to the nucleus.
Figure 3. Subcellular localisation of CtBP proteins in HEK 293 cells, using myc-his-tagged constructs.
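The quantification by cell counting can be sketched as a simple tally of per-cell localisation scores converted to percentages. The number of cells counted is not stated in the text, so the counts below are hypothetical and were chosen only so that the output reproduces the reported percentages. The function name and the category labels are likewise assumptions made for illustration.

```python
from collections import Counter


def localisation_percentages(scores):
    """Convert a list of per-cell localisation scores into percentages per category."""
    counts = Counter(scores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells were scored")
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical counts (200 cells per condition), chosen to match the
# percentages reported in the text; the real cell numbers are not given.
ctbp1_alone = ["cytoplasmic"] * 150 + ["nuclear/cytoplasmic"] * 50
ctbp1_with_ctbp2 = ["nuclear/cytoplasmic"] * 90 + ["nuclear"] * 110

print(localisation_percentages(ctbp1_alone))
# {'cytoplasmic': 75.0, 'nuclear/cytoplasmic': 25.0}
print(localisation_percentages(ctbp1_with_ctbp2))
# {'nuclear/cytoplasmic': 45.0, 'nuclear': 55.0}
```

Scoring each cell into one mutually exclusive category keeps the percentages for a condition summing to 100, which is the form in which the results are reported.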

Discussion
We show firstly that the deletion of eleven amino acids (Δ4-14) encompassing the putative NLS sequence KVKRQR at a.a. 8-13 of CtBP2 results in a shift in the localisation of detectable CtBP2 from exclusively nuclear, to nuclear and partly cytoplasmic. This effect was observed in a number of cell lines of diverse origin. These initial findings confirm a recently published study which identified a role for a.a. 1-21 in the nuclear localisation and retention of CtBP2 [27], as well as another study which was published whilst this manuscript was being submitted [28]. Our data further localises the critical sequence elements to a.a. 4-14. One of the cell lines, Cos-7, was chosen because a previous study had shown that transfected CtBP1 primarily localises to the cytoplasm in these cells, making them a useful experimental model [23]. Amino acids 4-14 also direct nuclear accumulation of CtBP2 in Cos-7 cells. Therefore, whatever the mechanism that underlies the cytoplasmic localisation of CtBP1 expressed in Cos-7 cells, it does not affect the ability of a.a. 4-14 to localise CtBP2 to the nucleus. As a primary function of CtBP proteins is as nuclear transcriptional corepressors, this sequence in CtBP2 is likely to play a key role in maintaining nuclear CtBP activity in cells where CtBP2 is expressed.

Influence of CtBP2 on CtBP1 subcellular localisation
Before considering the role of a.a. 4-14 further, it is important to note that we, as well as Zhao et al [27] who examined the effects of an a.a. 1-21 deletion, observed that CtBP2 with these N-terminal sequences deleted still retained a predominantly nuclear localisation, with only a partial redistribution to the cytoplasm. Interestingly, Verger et al [28] found that a CtBP2Δ1-25 mutant localised almost exclusively in the cytoplasm in Cos-1 cells. This different result could be due to the slightly larger deletion that they used, the cell type, or experimental differences. However, our experiments clearly show that domains other than a.a. 4-14 can be important in defining the nuclear-cytoplasmic distribution of CtBP2. These could potentially become important under conditions whereby the N-terminal sequences of CtBP2 might be masked, such as following binding of another protein or through post-translational modification. As the localisation of CtBP2(1-445)Δ4-14mh in all four cell lines closely resembled that of transfected CtBP1 in HEK 293, HeLa and MCF-7 cells, it is possible that this a.a. 4-14-independent nuclear localisation occurs through the same mechanisms that regulate CtBP1. In studies on Cos-7 cells, Criqui-Philipe et al [23] showed that CtBP1 could be recruited to the nucleus through an association with PxDLS-containing transcription factors. Structural studies on rat CtBP1-S (BARS) have characterised the PxDLS-binding interface, and identified mutations (e.g. V55R) that disrupt CtBP-binding to the C-terminal domain of E1A. We therefore examined the effects of generating the CtBP2 equivalent of the CtBP1-S V55R mutation. Our finding that this V72R mutation of CtBP2(1-445)Δ4-14mh does not affect its subcellular localisation excludes the PxDLS-binding interface as the major determinant for the a.a. 4-14-independent localisation of CtBP2 to the nucleus in these experiments.
The N-terminal 119 a.a. of CtBP2 are able to engage in other protein-protein interactions through less well-defined interfaces, e.g. [12], and a role for these interactions in CtBP2 subcellular distribution cannot yet be excluded.
Our experiments showed that CtBP2(1-119)Δ4-14mh has a markedly more cytoplasmic distribution than CtBP2(1-445)Δ4-14mh. This identifies a.a. 120-445 as having a role in nuclear localisation. When compared to domains in this region of CtBP1 with a known role in subcellular localisation, CtBP2 lacks the PDZ-binding motif present at the extreme C-terminus of CtBP1 [25], as well as the equivalent of the sumoylation site at K428 [32]. Conserved sequences that are good candidates for a role in CtBP2 nuclear localisation are the Pak1 phosphorylation site at Ser164, given that the phosphorylation status of the corresponding site in CtBP1 (Ser158) regulates its subcellular localisation [22], and possibly the dimerisation domain. In the intact CtBP2 protein, the N-, C- and core domains do not function independently [31]. It is quite possible that whilst a.a. 120-445 are necessary for optimal nuclear localisation, a functional interaction between this region of the molecule and other sequences within a.a. 1-119 is required for a.a. 4-14-independent nuclear accumulation.
The experimental data obtained from our analysis of truncations and mutants of CtBP2 also provides additional insight into the mechanism of regulation of subcellular localisation by a.a. 4-14 containing the putative CtBP2 NLS. Zhao et al [27] demonstrated that this region does not, in fact, function as a classical NLS, but rather that it is necessary for lysines within it, primarily lysine 10, to be acetylated for it to direct localisation in the nucleus. Specifically they showed that lysines in this sequence are acetylated in vivo and that this is likely to be through the actions of the p300 acetyltransferase, a known CtBP binding protein. Their experiments using a non-acetylatable K10R mutant of CtBP2 showed that this acetylation is required for retention of CtBP2 in the nucleus, and that this mutant actually enhances CtBP2 nuclear export. Analogous to our experiments with the V72R mutant, they also showed that a different mutant in the PxDLS binding domain (A58E) does not affect acetylation by p300, and therefore p300 must bind to different sequences on CtBP2 than other PxDLS transcription factors. In contrast, Verger et al [28] concluded that this N-terminal lysine-arginine rich region functions as a classical nuclear localisation signal, with a role in nuclear import, rather than retention.
The experiments that we have performed do not distinguish between these two alternative mechanisms. Our finding that a.a. 1-119 of CtBP2 is sufficient to drive efficient nuclear accumulation of EGFP provides an important advance in our understanding of the regions that regulate the subcellular distribution of CtBP2. However, the data are consistent with a role of a.a. 4-14 in either nuclear retention or import. Both previous studies demonstrated that the N-terminal 25 amino acids cannot, alone, target a heterologous protein to the nucleus. This could be due either to an NLS or nuclear retention signal not being correctly presented to its target binding proteins in the context of these molecules, or to the requirement for docking of acetyltransferases to a separate part of the molecule in order to achieve activation of the nuclear retention signal by acetylation. In CtBP2(1-119)-EGFP, either the sequences may simply be sufficiently spaced from the EGFP for a.a. 4-14 to be correctly presented as an NLS, or this region may include the p300 binding site, allowing activation of the nuclear retention signal by acetylation. CtBP2(1-119)-EGFP is small enough to enter the nucleus by passive diffusion, and therefore the presence of a nuclear retention signal would be sufficient for nuclear accumulation. Expression of a.a. 1-119 in the context of a fusion with 2XEGFP would generate a larger protein that could only accumulate in the nucleus if it were actively imported through nuclear pores. However, further experimentation would be required to determine conclusively whether this was due to a.a. 4-14 functioning as an NLS, or interaction of a.a. 1-119 with other actively imported proteins such as PxDLS-containing transcription factors [23], HDM2 [12], or possibly p300 acetyltransferase [33].
Finally, we have identified heterodimerisation with CtBP2 as a novel mechanism that can promote the nuclear localisation of CtBP1. This interplay between the two proteins has also, very recently, been demonstrated by other investigators [28]. It will be interesting to determine the extent to which this heterodimerisation contributes to CtBP1 subcellular distribution in different cell types, compared with the other mechanisms that have been described previously. It is important to note that CtBP2 expression is clearly not an absolute requirement for nuclear CtBP1 activity in many cell types [34]. The contrasting subcellular localisations of over-expressed CtBP1 and CtBP2 in Cos-7 cells add weight to the growing argument that the two proteins are regulated differently. Indeed, studies on the role of CtBP1 and CtBP2 during murine development revealed a more severe and lethal phenotype in Ctbp2-/- mice compared to Ctbp1-/- mice [35]. This has been attributed to temporal and spatial differences in the expression of Ctbp1 and Ctbp2 during development [35]. Alternatively, it could be explained by the different modes of regulation of protein localisation and function between these two proteins, implying that perturbation of the constitutive nuclear function of CtBPs is responsible for the embryonic lethality of Ctbp2-/- animals. CtBP2 with an N-terminal motif that promotes its nuclear localisation is present in mice, man and fish. It is not yet known whether the smaller KRSR sequence in Drosophila CtBP is functional, and Xenopus CtBP does not contain any such motif in its N-terminus. Therefore, CtBP in Xenopus, and possibly Drosophila, will likely be dependent upon other protein-protein interactions for its recruitment to the nucleus. It is tempting to speculate, therefore, that this is an indicator of an increased importance of the nuclear activities of CtBP proteins in the regulation of the complex patterns of gene expression in higher organisms.

Conclusion
CtBP1 and CtBP2 show a high degree of similarity at the sequence and functional level. Differential control of their subcellular localisation is likely to provide mechanisms to regulate critical nuclear and cytoplasmic functions of CtBPs in the cells of higher organisms. Here we have identified distinct regions in CtBP2 that play a key role in regulating the subcellular distribution of both CtBP2 and CtBP1 proteins.

Cell culture and transfection
All cells were maintained in Dulbecco's modified Eagle medium (Invitrogen), supplemented with 10% foetal bovine serum (Autogen Bioclear) and penicillin (100 U/ml), streptomycin (100 μg/ml) and L-glutamine (292 μg/ml) (Invitrogen). Cells were seeded at the required density on glass coverslips 24 h prior to transfection. Effectene transfection reagent (Qiagen) was used to transfect 0.2 μg (0.4 μg for dual localisations) DNA into cells, as per the manufacturer's instructions. Cells were incubated for 48 h, and fixed for fluorescence microscopy.

Immunofluorescence analysis
Cells were fixed with 4% paraformaldehyde in phosphate-buffered saline (PBS) for 10 min, and permeabilised using 0.1% Triton X-100/PBS for 20 min. They were blocked for 30 min in 3% bovine serum albumin (BSA)/PBS, and incubated as required with primary antibody (Anti-6xHis antibody, Abcam) for 1 h in 0.6% BSA/PBS, followed by Alexa594-conjugated species-specific secondary antibody (Molecular Probes). Cells were counterstained with 1 μg/ml DAPI during the secondary antibody incubation. When EGFP-fusion proteins only were visualised, all antibody incubation steps were omitted, and fixed and permeabilised cells were incubated with DAPI in 3% BSA/PBS for 10 min. Coverslips were mounted on slides with fluorescent mounting media (DakoCytomation). Cells were visualised using a Zeiss Axiovert 200 fluorescence microscope and images were collected using an Orca-ER digital camera (Hamamatsu), and processed using Openlab 3.5.1 Software (Improvision).