<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>1471-2148-8-226.fm</title>
<meta name="Author" content="abdulkadir.sufi"/>
<meta name="Creator" content="FrameMaker 8.0"/>
<meta name="Producer" content="Acrobat Distiller 8.1.0 (Windows)"/>
<meta name="CreationDate" content=""/>
</head>
<body>
<pre>
BMC Evolutionary Biology
BioMed Central
Open Access
Research article
Identification and characterization of novel human tissue-specific
RFX transcription factors
Syed Aftab, Lucie Semenec, Jeffrey Shih-Chieh Chu and Nansheng Chen*
Address: Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Email: Syed Aftab - saaftab@sfu.ca; Lucie Semenec - lucie.semenec@gmail.com; Jeffrey Shih-Chieh Chu - jeff.sc.chu@gmail.com;
Nansheng Chen* - chenn@sfu.ca
* Corresponding author
Published: 1 August 2008
BMC Evolutionary Biology 2008, 8:226
doi:10.1186/1471-2148-8-226
Received: 1 April 2008
Accepted: 1 August 2008
This article is available from: http://www.biomedcentral.com/1471-2148/8/226
© 2008 Aftab et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Five regulatory factor X (RFX) transcription factors (TFs), RFX1-5, have been
previously characterized in the human genome. They have been shown to be critical for
development and are associated with an expanding list of serious human disease conditions,
including major histocompatibility complex (MHC) class II deficiency and ciliopathies.
Results: In this study, we have identified two additional RFX genes–RFX6 and RFX7–in the current
human genome sequences. Both RFX6 and RFX7 are demonstrated to be winged-helix TFs and
have well conserved RFX DNA binding domains (DBDs), which are also found in winged-helix TFs
RFX1-5. Phylogenetic analysis suggests that the RFX family in the human genome has undergone at
least three gene duplications in evolution and the seven human RFX genes can be clearly
categorized into three subgroups: (1) RFX1-3, (2) RFX4 and RFX6, and (3) RFX5 and RFX7. Our
functional genomics analysis suggests that RFX6 and RFX7 have distinct expression profiles. RFX6
is expressed almost exclusively in the pancreatic islets, while RFX7 has high ubiquitous expression
in nearly all tissues examined, particularly in various brain tissues.
Conclusion: The identification and further characterization of these two novel RFX genes hold
promise for gaining critical insight into development and many disease conditions in mammals,
potentially leading to identification of disease genes and biomarkers.
Background
The regulatory factor X (RFX) gene family transcription
factors (TFs) were first detected in mammals as the regulatory factor that binds to a conserved cis-regulatory element
called the X-box motif about 20 years ago [1]. The X-box
motifs, which are typically 14-mer DNA sequences, were
initially identified as a result of alignment and inspection
of the promoter regions of major histocompatibility complex (MHC) class II genes for conserved DNA elements
[2,3]. Further investigations revealed that the X-box motif
is highly conserved in the promoter regions of various
MHC class II genes [4]. The first RFX gene (RFX1) was later
characterized as a candidate major histocompatibility
complex (MHC) class II promoter binding protein [5].
RFX1 was later found to function also as a transactivator
of the hepatitis B virus enhancer [6]. Subsequent studies
revealed that RFX1 is not alone. Instead, it became the
founding member of a novel family of homodimeric and
heterodimeric DNA-binding proteins, which also
includes RFX2 and RFX3 [7]. More members of this gene
family were subsequently identified. A fourth RFX gene
(RFX4) was discovered in a human breast tumor tissue [8]
and the fifth, RFX5, was identified as a DNA-binding regulatory factor that is mutated in primary MHC class II
deficiency (bare lymphocyte syndrome, BLS) [9]. The
identification of RFX1-5 and of RFX genes in other genomes,
including those of the lower eukaryotes Saccharomyces cerevisiae [10] and Schizosaccharomyces pombe [11]
and of the nematode Caenorhabditis
elegans [12], helped elucidate both the evolution of the
RFX gene family and the DNA binding domains [13].
Notably, while previous studies reported five RFX genes
(RFX1-5) in human, only one RFX gene has been identified in most invertebrate animals and yeast. In contrast,
the fruit fly (Drosophila melanogaster) genome has been
found to have two RFX genes, dRFX [14] and dRFX2 [15].
All of these RFX genes are transcription factors possessing
a novel and highly conserved DNA binding domain
(DBD) called RFX DNA binding domain [13], the defining feature of all members belonging to the RFX gene family, suggesting that these RFX TFs all bind to the X-box
motifs.
In addition to the defining DBD domains in all of these
RFX genes, most of these previously identified RFX genes
also contain other conserved domains including B, C, and
D domains [13]. The D domain is also called the dimerization domain [13]. The B and C domains also play a role
in dimerization and are thus called the extended dimerization domains [16]. Another important domain found
in many members of the RFX family is the RFX activation
domain (AD). For instance, RFX1 contains a well defined
AD [16]. However, AD is not found in many other members of the RFX family including the human RFX5 and C.
elegans DAF-19 [13]. Outside of these conserved domains,
RFX genes from different species, or even from the same species, show little similarity in other regions, which is
consistent with their diverse functions and distinct expression profiles.
In humans, RFX1 is primarily found in the brain with high
expression in cerebral cortex and Purkinje cells [17]. RFX2
[18] and RFX4 [19] are found to be heavily expressed in
the testis. RFX4 is also expressed in the brain [20]. RFX3 is
expressed in ciliated cells, including pancreatic endocrine cells [21],
ependymal cells [22], and neuronal cells [23], and is required for the growth and
function of cilia. RFX3-deficient mice show left-right (L-R) asymmetry defects [23],
developmental defects, diabetes [21], and congenital
hydrocephalus [22]. RFX5 is the most extensively
studied RFX gene so far, primarily because it serves as a transcription activator of the clinically important MHC II
genes [24] and mediates enhanceosome formation,
which results in a complex containing RFXANK (also
known as RFX-B), RFXAP, CREB, and CIITA [25]. Mutation in any one of these complex members leads to bare
lymphocyte syndrome (BLS) [25]. In C. elegans and S. cerevisiae
only one copy of the RFX gene exists. In C. elegans it
is called DAF-19 and in S. cerevisiae it is called Crt1. DAF-19 is involved in the regulation of the sensory neuron cilium,
whereas Crt1 is involved in regulating DNA replication
and damage checkpoint pathways [10,12]. In D. melanogaster, two RFX genes have been identified:
dRFX and dRFX2. dRFX is
expressed in the spermatid and brain and is necessary for
ciliated sensory neuron differentiation [14,26]. dRFX2 has
not been studied extensively, so its function in
Drosophila remains unclear; however, there is evidence suggesting that dRFX2 plays a role in the cell cycle of
the eye imaginal discs [15].
In this project, we have identified and characterized two
novel RFX genes in genomes of human and many other
mammals, which have now been sequenced, annotated,
and analyzed.
Results and discussions
With the current version of the human genome [27,28],
we explored whether additional members of the RFX TF
family could be identified and characterized in the human
genome. We applied a Hidden Markov Model (HMM)-based
search method [29], using the DBD
sequences of known human RFX TFs to search the entire
human proteome. In addition to retrieving all known
human RFX genes (RFX1-5), we identified two additional
genes in the human genome that contain well conserved
RFX DBDs. These two genes were previously designated
RFXDC1 and RFXDC2 by the HUGO Gene Nomenclature
Committee (HGNC, http://www.genenames.org/); this
nomenclature was based solely on initial bioinformatic analyses. There are no previous publications
describing these two genes. Here, we demonstrate that
these two genes are also RFX gene family members closely
related to RFX1-5, and our phylogenetic analysis suggests
that two separate recent gene duplications led to the generation of these two genes. We therefore proposed the new gene
names RFX6 and RFX7, respectively (Table 1).
Our proposal has been accepted by the HGNC.
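The idea behind a profile-based search can be illustrated with a much simpler model than the one used here. The sketch below is not a profile HMM (it has no insert or delete states) and is not the HMMER implementation; it only builds a position-specific log-odds profile from an ungapped alignment and slides it along a protein sequence. The toy alignment, the uniform background, and the names build_profile and best_hit are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy alignment of a 5-residue motif; NOT real RFX DBD sequences.
TOY_ALIGNMENT = ["KRWLE", "KRWIE", "RRWLE"]


def build_profile(aligned_seqs, pseudocount=1.0):
    """Build per-column log-odds scores against a uniform background."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)  # missing residues count as 0
        profile.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_hit(profile, protein):
    """Slide the profile along a protein; return (best score, start index)."""
    width = len(profile)
    windows = range(len(protein) - width + 1)
    return max(
        (sum(profile[i][protein[start + i]] for i in range(width)), start)
        for start in windows
    )


profile = build_profile(TOY_ALIGNMENT)
score, start = best_hit(profile, "MASTKRWLEGG")
print(start, round(score, 2))
```

In a proteome-wide search, every protein would be scored in this way and those above a chosen threshold retained as candidates; a real profile HMM additionally models insertions and deletions relative to the alignment.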
Because all known human RFX genes–RFX1-5–are well
conserved and have been identified in other mammalian
genomes, we hypothesized that orthologs of RFX6 and
RFX7 also exist in other mammalian genomes. As
expected, we retrieved all seven RFX genes in the
genomes of five other mammalian species, namely
chimpanzee (Pan troglodytes), monkey (Macaca mulatta),
dog (Canis familiaris), mouse (Mus musculus), and rat (Rattus norvegicus), with one exception: in the rat genome,
RFX2 was not found despite extensive searches
(Additional file 1). Most identified RFX genes are
expressed and their transcripts can be found in existing
EST libraries. Interestingly, existing EST evidence suggests
that RFX6 and RFX7 have no or very few alternative isoforms, similar to RFX1. In contrast, RFX2-4 usually have
more alternative isoforms (Additional file 1).
Table 1: Names and protein IDs of representative RFX genes.
Gene   RefSeq accession   Ensembl protein ID   Chromosome   Start       End         Strand   Protein length (aa)   Exons   Isoforms
RFX1   NM_002918          ENSP00000254325      19           13933353    13978097    -1       979                   21      1
RFX2   NM_000635          ENSP00000306335      19           5944175     6061554     -1       723                   18      2
RFX3   NM_134428          ENSP00000371434      9            3208297     3515983     -1       749                   18      8
RFX4   NM_213594          ENSP00000350552      12           105501163   105680710   1        744                   18      4
RFX5   NM_000449          ENSP00000357864      1            149581060   149586457   -1       616                   11      3
RFX6   NM_173560          ENSP00000332208      6            117305068   117351384   1        928                   19      2
RFX7   NM_022841          ENSP00000373793      15           54166958    54222377    -1       1281                  7       1
To confirm that the two novel human RFX genes, RFX6
and RFX7, are indeed RFX TFs, we further examined their
DBDs by aligning them with DBDs from RFX1-5 protein
sequences. As expected, the DBDs of RFX6 and RFX7 align
well with those of RFX1-5 (Figure 1). RFX TFs belong to
the winged-helix family of DNA binding proteins because
their DBDs are related in structure and function to the
helix-turn-helix bacterial transcriptional regulatory proteins [30]. DBDs from RFX6 and RFX7 each contain one
wing (W1), which is the same as DBDs from RFX1-5. W1
interacts with the major groove and another conserved
fold H3 (helix 3) interacts with the minor groove of DNA.
In particular, the nine residues in DBDs (Figure 1, indicated with arrow heads) that make direct or water-mediated DNA contacts [31] are almost entirely conserved in
RFX6 and RFX7 (Figure 1) with a couple of minor exceptions. Of the nine residues, the human RFX7 DBD has two
residues different from most of the other RFX DBDs. The
first different residue is the first of the nine indicated residues. It is Lys in RFX7 DBD and RFX5 DBD, compared to
Arg in DBDs of other RFX genes. Thus this difference is
shared with the RFX5 DBD. The other different residue is
the third of the nine residues. It is Lys in RFX7, compared
to Arg at this site for DBDs of all other RFX genes. Because
both Lys and Arg are basic amino acids, such substitutions
are not expected to have dramatic impacts on the binding
between the DBDs and their cognate binding sites. This
high degree of conservation suggests that RFX6 and RFX7
may bind to similar if not identical cis-regulatory elements, i.e., the X-box motif [1]. Hence RFX6 and RFX7 are
new members of the human RFX gene family with conserved DBDs.
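The argument that the Arg-to-Lys differences are conservative can be made explicit with a small check. The sketch below records only the two contact positions discussed in the text (the first and third of the nine contact residues) and asks whether each substitution stays within the same physicochemical class. The four-class grouping is one common simplification and is an assumption, as are the names RESIDUE_CLASS, CONTACT_RESIDUES, and substitutions.

```python
# Simplified physicochemical classes (an assumed, commonly used grouping).
RESIDUE_CLASS = {
    **dict.fromkeys("KRH", "basic"),
    **dict.fromkeys("DE", "acidic"),
    **dict.fromkeys("STNQ", "polar"),
    **dict.fromkeys("AVLIMFWYCGP", "nonpolar"),
}

# Residues at contact positions 1 and 3, as described in the text:
# position 1 is Lys in RFX5 and RFX7 and Arg elsewhere;
# position 3 is Lys in RFX7 and Arg elsewhere.
CONTACT_RESIDUES = {
    "RFX1": {1: "R", 3: "R"},
    "RFX2": {1: "R", 3: "R"},
    "RFX3": {1: "R", 3: "R"},
    "RFX4": {1: "R", 3: "R"},
    "RFX5": {1: "K", 3: "R"},
    "RFX6": {1: "R", 3: "R"},
    "RFX7": {1: "K", 3: "K"},
}


def substitutions(reference, gene):
    """List (position, reference residue, residue, same class?) for differences."""
    found = []
    for position in sorted(CONTACT_RESIDUES[gene]):
        ref = CONTACT_RESIDUES[reference][position]
        res = CONTACT_RESIDUES[gene][position]
        if res != ref:
            conservative = RESIDUE_CLASS[ref] == RESIDUE_CLASS[res]
            found.append((position, ref, res, conservative))
    return found


for gene in ("RFX5", "RFX6", "RFX7"):
    print(gene, substitutions("RFX1", gene))
```

Under this grouping, both RFX7 differences relative to RFX1 are flagged as conservative, which is the basis for expecting little impact on DNA binding.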
In addition to the highly conserved DBDs, other domains
including ADs and the B, C, and D domains (the D domain is also known as
the dimerization domain) [13] have been described in
human RFX1-3 (Figure 2). Among these functional
domains, ADs have been identified in RFX1-3 but
not in RFX4-5. The B and C
domains, which are usually called extended dimerization
domains, play supporting roles in dimerization [16]. B, C,
and D domains have also been identified in RFX4 but are
missing from RFX5. Using InterProScan [32] and HMMER
[29], we have found that RFX6 possesses B, C, and D
domains, but not an AD (Figure 2). The domain composition
of RFX6 is thus similar to that of RFX4, which also has B, C, and D
domains but lacks an AD. In contrast, we failed to identify B,
C, or D domains or an AD in RFX7. None of these domains
can be found in RFX5 either. Because the C-terminal
B, C, and D domains have been shown to
mediate dimerization as well as transcriptional repression
[33], RFX6, which contains the B, C, and D domains, and RFX7,
which does not, may
play different roles in transcriptional regulation.
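The domain compositions described above can be summarized as a small lookup table, from which genes sharing the same composition are grouped. This is a minimal sketch; the names DOMAINS and group_by_composition are illustrative assumptions, and the domain sets simply restate the text.

```python
from collections import defaultdict

# Domain composition of each human RFX TF, as described in the text.
DOMAINS = {
    "RFX1": {"AD", "DBD", "B", "C", "D"},
    "RFX2": {"AD", "DBD", "B", "C", "D"},
    "RFX3": {"AD", "DBD", "B", "C", "D"},
    "RFX4": {"DBD", "B", "C", "D"},
    "RFX5": {"DBD"},
    "RFX6": {"DBD", "B", "C", "D"},
    "RFX7": {"DBD"},
}


def group_by_composition(domains):
    """Group genes that carry exactly the same set of domains."""
    groups = defaultdict(list)
    for gene in sorted(domains):
        groups[frozenset(domains[gene])].append(gene)
    return sorted(groups.values())


for members in group_by_composition(DOMAINS):
    print(members, sorted(DOMAINS[members[0]]))
```

Grouping by exact domain set yields three groups: RFX1-3, RFX4 with RFX6, and RFX5 with RFX7.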
Characterization of the functional domain composition
of RFX genes will provide insights into how different RFX
TFs function. In particular, how do RFX6 and RFX7, as
well as RFX4 and RFX5, function in transcription considering that they do not have identified ADs? There are two
possible mechanisms. First, because RFX TFs are known to
form dimers and bind to the same or similar binding sites
(the X-box motifs) in DNA [31], they may function
together with the RFX TFs (RFX1-3) that do have ADs.
Figure 1
Mammalian RFX DBDs are highly conserved. DBDs from the RFX genes of six mammalian species were aligned using ClustalW. The
conservation of amino acids is depicted by a color gradient from yellow, which indicates low conservation, to red,
which indicates high conservation. The nine residues that make direct or water-mediated DNA contacts are indicated with arrowheads.
The species names in this figure are abbreviated as follows: Mus, mouse (Mus musculus); Rno, rat (Rattus norvegicus); Cfa, dog (Canis familiaris); Ptr, chimpanzee (Pan troglodytes); Mmu, monkey (Macaca mulatta); and Hsa, human (Homo
sapiens).
Figure 2
Functional domains in the known and novel human RFX genes. The functional domains AD, DBD, B, C, and D are
indicated using color-coded boxes. Genes are represented using horizontal lines, which are proportional to the protein
lengths. The domain lengths and positions are also proportional to their actual lengths. The graphs are aligned based on the
position of the DBDs.
Examination of a recently available proteome-scale map
of the human protein-protein interaction network [34],
which was constructed using the yeast two-hybrid technique,
has shown that RFX6 and RFX1-4 interact with each other
and also interact with many other genes (Figure 3). RFX6
interacts directly with RFX2 and RFX3, the latter of which
has been shown to be expressed and to function in the
pancreas [21], as well as many other tissues. The interaction between RFX6 and other RFX TFs provides further
supporting evidence that RFX6 is indeed a member of the
RFX gene family. Interactions between RFX7 and other
genes were not observed, which is likely due to the incomplete coverage of the human protein-protein interactions
analyzed in this study. Second, RFX TFs may function by
interacting with many other non-RFX TFs. For example, it
has been demonstrated that mammalian RFX5 forms a
complex ("enhanceosome") with RFXANK (also known
as RFX-B), RFXAP, CREB, and CIITA to regulate expression
of MHC class II genes [25]. Notably, of the five genes
shown to interact with RFX6 (DTX1, DTX2, FHL3, CCNK,
and SS18L1) (Figure 3), all but SS18L1 are also
putative TFs.
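The direct interactions of RFX6 named in the text can be represented as an undirected graph and queried for partners inside and outside the RFX family. This is a minimal sketch: only the RFX6 edges stated above are included (the full network in Figure 3 contains more), and the names EDGES, build_graph, and partners are illustrative assumptions.

```python
from collections import defaultdict

RFX_FAMILY = {"RFX1", "RFX2", "RFX3", "RFX4", "RFX5", "RFX6", "RFX7"}

# Direct RFX6 interactions named in the text; not the complete network.
EDGES = [
    ("RFX6", "RFX2"), ("RFX6", "RFX3"),
    ("RFX6", "DTX1"), ("RFX6", "DTX2"), ("RFX6", "FHL3"),
    ("RFX6", "CCNK"), ("RFX6", "SS18L1"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of gene pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def partners(graph, gene, within_family):
    """Return sorted partners of a gene inside or outside the RFX family."""
    neighbours = graph[gene]
    if within_family:
        return sorted(neighbours.intersection(RFX_FAMILY))
    return sorted(neighbours.difference(RFX_FAMILY))


graph = build_graph(EDGES)
print(partners(graph, "RFX6", within_family=True))
print(partners(graph, "RFX6", within_family=False))
```

Separating family and non-family partners mirrors the two mechanisms discussed: cooperation with AD-containing RFX TFs, and interaction with non-RFX TFs.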
To explore the relationship between RFX6 and RFX7 and
the known RFX family members RFX1-5, we have constructed a phylogenetic tree that contains all mammalian
RFX genes described above (Additional file 1, Figure 1), as
well as DAF-19, the extensively studied product of the C. elegans RFX gene daf-19 [12],
for comparison. We
used the DBD sequence of the yeast Saccharomyces cerevisiae RFX gene product Crt1 [10] as an outgroup in the phylogenetic tree construction. In the phylogenetic tree (Figure
4), all seven genes show perfect one-to-one orthologous
relationships between the different mammalian genomes. It
is clear that the seven mammalian RFX genes fall into
three subgroups (Figure 4). The first subgroup contains
RFX1-3; the second RFX4 and RFX6; while the third RFX5
and RFX7. It is likely that RFX4 and RFX6 resulted from
one gene duplication that predated the split of these
mammalian species, while RFX5 and RFX7 resulted from
another similar independent duplication. This hypothesis
is generally consistent with the gene models of these RFX
genes (Additional file 2). RFX6 has 19 exons, which is
similar to the number of exons contained in RFX4 (18
exons); while RFX7 has 6 exons, which is similar to the
number of exons contained in RFX5 (9 exons). The C. elegans RFX gene daf-19 clusters together with the RFX1-3
genes, supporting a previously proposed hypothesis that
the divergence of the RFX1-3 subgroup from the other two
subgroups likely predated the divergence between mammals and nematodes [13]. This hypothesis predicts
that C. elegans should have RFX TFs orthologous to RFX4-7 [35]. However, only one C. elegans RFX gene, daf-19, has
been reported so far, and our extensive search indicates that daf-19 is the only RFX gene in C. elegans. One
possible explanation is that additional RFX TFs were lost
in evolution. Alternatively, RFX4-7 may have undergone
positive selection in mammals to accommodate additional functional complexity in mammalian gene regulation, while RFX1-3 and daf-19 remained highly conserved
due to purifying selection. Interestingly, although the
phylogenetic tree was constructed based only on DBDs,
the grouping of these mammalian RFX genes is also consistent with the composition of other conserved domains.
In particular, RFX1-3 all contain DBDs, ADs, Bs, Cs and
Ds, while RFX4 and RFX6 have all of these domains except
ADs, and RFX5 and RFX7 have only DBDs (Figures 2 and
4).
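Figure 4 reports bootstrap values from 100 replicates. The resampling step that underlies such values can be sketched as follows: alignment columns are drawn with replacement to form pseudo-alignments of the original length, and a tree would then be rebuilt from each one (tree building is not shown). The toy alignment is hypothetical and does not contain RFX DBD sequences; the function names and the fixed seed are assumptions for illustration, not the procedure or software used in this study.

```python
import random

# Hypothetical toy alignment; NOT real RFX DBD sequences.
TOY_ALIGNMENT = {
    "seqA": "KRWLE",
    "seqB": "KRWIE",
    "seqC": "RRWLD",
}


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the length."""
    n_columns = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {
        name: "".join(sequence[i] for i in picks)
        for name, sequence in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Generate pseudo-alignments; the seed makes the sketch reproducible."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


replicates = bootstrap_replicates(TOY_ALIGNMENT)
print(len(replicates), replicates[0])
```

The bootstrap value at an internal node is then the number of replicate trees (out of 100) in which the same grouping of sequences is recovered.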
Figure 3
RFX interactome. Circles depict gene products and lines depict protein-protein interactions. The interactions between
RFX6 and its direct interactors were obtained using the yeast two-hybrid method in a large-scale human protein-protein interaction study [34]. Additional interactions were constructed by Rhodes et al. [46]. The network was generated using the program available at the HiMap website, http://www.himap.org/ [46].
To gain insight into the function of these two newly identified RFX genes, we explored the expression profiles of
RFX6 and RFX7 and compared them to those of RFX1-5.
We analyzed two independent datasets. First, we searched
the dbEST database in GenBank (http://www.ncbi.nlm.nih.gov/dbEST/) [36] to examine which
EST libraries contain transcripts of these RFX genes. The
results indicate that the expression profile of RFX1-5
matches well with previously published data (see Background): RFX1 is found in many different tissue types
including white blood cells, heart, eye, testis, and cancerous cells; RFX2 appears to be expressed in testis and brain;
RFX3 appears to be expressed in the placenta and brain
(i.e., medulla); RFX4 is found in the brain, as well as in
testis, like RFX2; and RFX5 expression has been observed in
various tissues including thymus, T-cells, kidney,
brain, and lymph. The consistency between the expression of RFX1-5
obtained from the dbEST database and previous observations suggests that dbEST provides good estimates of
the expression profiles of RFX genes. Using the same method,
we found that RFX6 is primarily expressed in pancreas,
with minor expression in liver, while RFX7 is widely and
heavily expressed in many different tissue types including
kidney (tumor tissues), thymus, brain, and placenta.
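The EST-based observations listed above can be inverted into a tissue-to-gene index, which makes it easy to see which RFX genes share a tissue. This is a minimal sketch: the tissue lists restate only the examples given in the text and are not exhaustive, and the names EST_TISSUES and genes_by_tissue are illustrative assumptions.

```python
from collections import defaultdict

# Tissues in which EST evidence was reported for each gene, as listed in
# the text (examples only; the underlying dbEST results are broader).
EST_TISSUES = {
    "RFX1": ["white blood cells", "heart", "eye", "testis", "cancerous cells"],
    "RFX2": ["testis", "brain"],
    "RFX3": ["placenta", "brain"],
    "RFX4": ["brain", "testis"],
    "RFX5": ["thymus", "T-cells", "kidney", "brain", "lymph"],
    "RFX6": ["pancreas", "liver"],
    "RFX7": ["kidney", "thymus", "brain", "placenta"],
}


def genes_by_tissue(est_tissues):
    """Invert the gene-to-tissue mapping into a tissue-to-gene index."""
    index = defaultdict(list)
    for gene in sorted(est_tissues):
        for tissue in est_tissues[gene]:
            index[tissue].append(gene)
    return dict(index)


index = genes_by_tissue(EST_TISSUES)
for tissue in ("pancreas", "brain", "testis"):
    print(tissue, index[tissue])
```

In this index, pancreas is associated with RFX6 alone, whereas brain is shared by several family members.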
Second, to gain a quantitative understanding of the
expression of RFX genes, we took advantage of the recent
availability of serial analysis of gene expression (SAGE)
libraries constructed by the Mouse Atlas of Gene Expression Project (http://www.mouseatlas.org/) [37]. To start
with, we tested the hypothesis that the expression of
mouse RFX TFs approximates the expression of human
Figure 4
Phylogenetic analysis of mammalian RFX genes. This phylogenetic tree was constructed based on the DBDs of RFX genes
from six mammalian species and C. elegans, using the yeast RFX gene product Crt1 as the outgroup. The phylogenetic tree was bootstrapped 100 times; the number at each internal node is the bootstrap value. Each ortholog group is colored differently. The species names in this figure are abbreviated as follows: Mus, mouse (Mus musculus); Rno, rat (Rattus
norvegicus); Cfa, dog (Canis familiaris); Ptr, chimpanzee (Pan troglodytes); Mmu, monkey (Macaca mulatta); and Hsa, human (Homo
sapiens).
[Figure: bar charts of SAGE tag frequency (y-axis) by tissue (x-axis), one panel each for RFX1, RFX2, RFX3, RFX4, RFX5, RFX6, and RFX7. Tissues shown: Adrenal, Bladder, Brain, Branchial, Embryo, Endoderm, Heart, Intestine, Kidney, Limb, Liver, Lung, Lymph, Mammary, Muscle, Neural, Ovaries, Pancreas, Pituitary, Placenta, Prostate, Skin, Spleen, Stomach, Testis, Thymus, Urogenital, Uterus.]