<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>1471-2164-7-39.fm</title>
<meta name="Author" content="petere"/>
<meta name="Creator" content="FrameMaker 7.0"/>
<meta name="Producer" content="Acrobat Distiller 7.0 (Windows)"/>
<meta name="CreationDate" content=""/>
</head>
<body>
<pre>
BMC Genomics
BioMed Central
Open Access
Research article
A dehydration-inducible gene in the truffle Tuber borchii identifies a
novel group of dehydrins
Simona Abba', Stefano Ghignone and Paola Bonfante*
Address: Dipartimento di Biologia Vegetale dell'Università degli Studi di Torino and IPP-CNR-Sezione di Torino, Viale Mattioli 25, 10125 Torino,
Italy
Email: Simona Abba' - simona.abba@unito.it; Stefano Ghignone - stefano.ghignone@unito.it; Paola Bonfante* - p.bonfante@ipp.cnr.it
* Corresponding author
Published: 02 March 2006
BMC Genomics 2006, 7:39
doi:10.1186/1471-2164-7-39
Received: 10 October 2005
Accepted: 02 March 2006
This article is available from: http://www.biomedcentral.com/1471-2164/7/39
© 2006 Abba' et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: The expressed sequence tag M6G10 was originally isolated from a screening for
differentially expressed transcripts during the reproductive stage of the white truffle Tuber borchii.
mRNA levels for M6G10 increased dramatically during fruiting body maturation compared to the
vegetative mycelial stage.
Results: Bioinformatics tools, phylogenetic analysis and expression studies were used to support
the hypothesis that this sequence, named TbDHN1, is the first dehydrin (DHN)-like coding gene
isolated in fungi. Homologs of this gene, all defined as "coding for hypothetical proteins" in public
databases, were exclusively found in ascomycetous fungi and in plants. Although complete (or
almost complete) fungal genomes and EST collections of some Basidiomycota and Glomeromycota
are already available, DHN-like proteins appear to be represented only in Ascomycota. A new and
previously uncharacterized conserved signature pattern was identified and proposed to the UniProt
database as the main distinguishing feature of this new group of DHNs. Expression studies provide
experimental evidence of a transcript induction of TbDHN1 during cellular dehydration.
Conclusion: Expression pattern and sequence similarities to known plant DHNs indicate that
TbDHN1 is the first characterized DHN-like protein in fungi. The high similarity of TbDHN1 with
homolog coding sequences implies the existence of a novel fungal/plant group of LEA Class II
proteins characterized by a previously undescribed signature pattern.
Background
Hyperosmotic conditions and low temperatures cause cellular dehydration, i.e. removal of water from the cytoplasm into the extracellular space, resulting in the
reduction of cytosolic volumes and the alteration of cellular mechanisms. Dehydrins (DHNs) are a group of heat-stable plant proteins believed to play a protective role during cellular dehydration [1,2]. They accumulate during
dehydrative stress caused by or associated with low or
freezing temperatures, drought, salinity, embryo desiccation
and abscisic acid synthesis. Dehydrins are very rich
in glycine residues, while cysteine and tryptophan are
lacking or under-represented [3]. They are characterized
by highly conserved 15-mer lysine-rich sequences, called K-segments, which may be present one or several times, and by one
or more Y-segments (DEYGNP) and/or S-segments (serine clusters) [2]. The K-segment can form a putative
amphipathic α-helix structure, with the potential for both
hydrophilic and hydrophobic interactions [4]. Due to this
property, dehydrins potentially have a chaperone-like
function in stabilizing partially denatured proteins or
membranes, coating them with a cohesive water layer and
preventing their coagulation during desiccation [3]. Rinne
et al [5] demonstrated that dehydrins could help hydrolytic enzymes maintain their activity even in desiccating
environmental conditions, such as freezing. This result
confirms the general belief that dehydrins help the cell to
survive desiccation, probably creating local pools of water
that are required for survival and re-growth.
Dehydrins were initially found in flowering plants, but
immunological studies and screenings of cDNA and
genome libraries revealed that dehydrins are widely distributed in the plant kingdom [6]. In fact, they were found
in the brown algae Fucus spiralis, F. vesiculosus, and F. evanescens [7], in the spikemoss Selaginella lepidophylla [3] as well
as in the cyanobacterium Anabaena sp. [8]. Dehydrin-homolog sequences are also present in Escherichia coli [6]
and Chlamydia trachomatis [9], and even in Drosophila melanogaster [10]. To our knowledge, dehydrins have never
been reported in fungi, even though some fungal proteins are
classified as 'late embryogenesis-abundant' (LEA) or LEA-like proteins. Dehydrins in fact belong to this larger protein family. The LEA protein classification proposed by
Dure [4] and Bray [11] was recently revised by Wise [12]
on the basis of the Kyte and Doolittle hydrophobicity metric, predicted secondary structures, expression patterns
and sequence features. Dehydrins are now classified in
Class IIa and Class IIb of LEA proteins, corresponding to
the previous D11 family or Group 2.
LEA proteins belonging to different classes do not share
any evident sequence similarity, even if Garay-Arroyo et
al. [13] found that they are characterized by high
hydrophilicity and high percentage of glycines, leading to
their denomination as "hydrophilins". They are synthesised in the later stages of plant embryogenesis, when
seeds are maturing and their water content is decreasing
and, in vegetative tissues, in response to water stress [14].
Their precise function is still unknown, but it has been
suggested that they are involved in protecting cellular or
molecular structures from the damaging effects of water
loss by sequestration of ions, replacement of hydrogen
bonding function of water or renaturation of unfolded
proteins [11,15]. Although primarily found in plants, a
number of putative LEA genes have been found in non-plant species, including bacteria [16,17], nematodes [18]
and fungi. The first study on a LEA-like protein in fungi
was carried out by Mtwisha et al. [14] who suggested that
HSP12 from Saccharomyces cerevisiae should be considered
as a LEA-like protein on the basis of its expression pattern
and amino acid composition. Also GRE1 from S. cerevisiae
[19] and CON6 from Neurospora crassa [20] can be
ascribed to the family of LEA proteins, because they
exhibit a high content of hydrophilic amino acids and
their corresponding transcripts accumulate respectively in
response to hyperosmosis and desiccation. Moreover, 12
fungal proteins are already classified as 'LEA 4' (named as
LEA Class III proteins by Wise [12]) under the Pfam
domain family PF02987 on the basis of the presence of at
least one IPR004238 (InterPro ID) domain.
In the framework of an expressed sequence tag project
aimed at identifying key regulators and master genes controlling the fruiting body formation in the white truffle
Tuber borchii Vittad. [21], we found that the EST called
M6G10 was the most up-regulated gene of the reproductive stage compared to the vegetative stage. Truffles are
ectomycorrhizal fungi, producing ascocarps which are
highly appreciated and commercialised for their organoleptic properties [22]. Since truffle fruiting bodies cannot yet be obtained under controlled conditions, most
studies on truffle primary and secondary metabolism are
based on vegetative mycelium cultivated in axenic conditions.
In this study, bioinformatics tools and expression studies
were used to support the hypothesis that M6G10 can be
considered not only as a LEA protein coding gene, but as
the first DHN-like coding gene isolated in fungi. In addition, homologs of this gene, all still defined as "coding for
hypothetical proteins" in public databases, were found in
other ascomycetous fungal and plant genomes. On the
basis of some physicochemical similarities to known plant
dehydrins, the identification of a new conserved signature
pattern and the expression profile in osmotic and cold
stress, we support the classification of these "hypothetical
proteins" as dehydrins belonging to Class II LEA proteins.
Results
Sequence analyses
The 771-bp-long M6G10 fragment [GenBank:DN601500] was one of the fruiting body-regulated
Expressed Sequence Tags (ESTs) retrieved from a gene
expression profiling study conducted in the ascomycetous
truffle T. borchii [21]. A blastx search against NCBI
(National Center for Biotechnology Information, NIH,
Bethesda) databases [23] using as a query the M6G10 EST
revealed the existence of a conserved, previously uncharacterised group of DHN proteins. This result was then
confirmed by using the complete TbDHN1 mRNA
sequence. Significant similarities (E-value < e-10) to other
proteins (Table 1), including a plant dehydrin from Hordeum vulgare ([GenBank: AAD02257]; blastx E = 4e-25)
(Fig. 1), were found only when the masking of repeated or low
complexity regions was switched off. Dehydrins, like the majority of LEA
proteins, are low complexity proteins; in fact, applications
masking low complexity regions, like SEG [24], masked
between 30% and 71% of the amino acids of a LEA protein [12]. Since the main effect of masking low complexity
regions is to reduce the number of amino acids available
for alignment, blast searches were all performed with the
filter for low complexity regions switched off.

Table 1: Fungal hits found by homology search using TbDHN1 as query sequence. Blast searches were conducted with TbDHN1
nucleotide and amino acid sequences against a local database including records from the NCBI non-redundant, the Swiss-Protein and
the Broad Institute fungal databases. All records are intended to be NCBI Protein Database entries, with exception of SNU00161 (The
Broad Institute Accession Number) and Q7S6B0 (Swiss Protein Database Accession Number).

Accession number  Biological role                  Organism               Identities (%)  E-value  Sequence found by
SNU00161          predicted protein                Stagonospora nodorum   41              6e-50    blastp
EAA70367          hypothetical protein FG10051     Gibberella zeae        40              4e-47    blastx
CAD70810          putative protein                 Neurospora crassa      39              9e-44    blastx
EAA55454          hypothetical protein MG09261.4   Magnaporthe grisea     35              7e-39    blastx
EAA62484          hypothetical protein AN5324.2    Aspergillus nidulans   37              1e-37    blastx
EAL89059          conserved hypothetical protein   Aspergillus fumigatus  31              3e-35    blastx
EAL84332          conserved hypothetical protein   Aspergillus fumigatus  38              2e-27    blastx
EAK94850          DHN6*                            Candida albicans       32              5e-21    blastx
EAK94909          DHN*                             Candida albicans       39              8e-21    blastx
EAA54716          hypothetical protein MG05507.4   Magnaporthe grisea     28              1e-17    blastx
EAL87020          hypothetical protein Afu7g04520  Aspergillus fumigatus  28              2e-16    blastx
Q7S6B0            Predicted protein                Neurospora crassa      26              0.009    blastp

*Putative identifications provided by automatic annotation of C. albicans genome.

A blastp search on the Stagonospora nodorum protein database
identified the predicted protein SNU00161 (The
Broad Institute Accession Number) as another possible
fungal homolog of TbDHN1. A tblastx search was performed querying all those genomic projects in which a
proteomic annotation was not yet available. Other possible fungal homologs were thus identified in Botrytis cinerea
[The Broad Institute Accession Number:00133947_B.cinerea_19866915837883], Fusarium verticillioides [The Broad Institute Accession
Number:0031597_F.verticillioides_19866917223803],
Chaetomium globosum [GenBank:AAFU01000048], Coccidioides immitis [GenBank:AAEC01000142] and C. posadasii
[TIGR Accession Number:gnl|TIGR_222929|contig:3250:c_posadasii]. A search in the EST databases
showed that most of these fungal proteins are present as
expressed sequences. Additionally, other DHN-like members were found within cDNA sequences of the fungi Verticillium dahliae [GenBank:BQ110173] and Trichoderma
reesei [GenBank:CF872473], and the plants Tortula ruralis
[GenBank:CN2013881], Picea engelmannii x Picea sitchensis [GenBank:DR464575], Malus x domestica [GenBank:CV091950], Oryza sativa [GenBank:CA765427] and
Saccharum officinarum [GenBank:CA105075]. For each
different species we reported only one entry, because
sequences from unfinished/unannotated genome projects
and from EST databases are always redundant. Although
blast searches were performed also on sequences from
Basidiomycota and Glomeromycota, all the retrieved fungal sequences belong exclusively to ascomycetous fungi.
Only DHN-like proteins from T. borchii, Gibberella zeae,
Neurospora crassa, Magnaporthe grisea, Aspergillus nidulans,
A. fumigatus, S. nodorum and Candida albicans were further
analysed. All the other potential fungal and plant DHN
proteins were excluded because of poor sequence quality
(especially for ESTs), i.e. high percentage of Ns or too
short fragments, and because all tblastx searches were performed on an unfinished genome project.
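
The exclusion criteria above (a high percentage of Ns or fragments that are too short) can be expressed as a simple filter. The sketch below is illustrative only: the thresholds max_n_fraction and min_length, the function name and the toy sequences are assumptions, since the text does not state the cut-offs actually used.

```python
def passes_quality(seq: str, max_n_fraction: float = 0.05, min_length: int = 100) -> bool:
    """Return True if a nucleotide sequence is long enough and has few ambiguous bases.

    Both thresholds are hypothetical defaults, not values taken from the paper.
    """
    seq = seq.upper()
    # Reject empty sequences and fragments shorter than the minimum length.
    if not seq or min_length > len(seq):
        return False
    # Fraction of ambiguous bases (N) over the whole sequence.
    n_fraction = seq.count("N") / len(seq)
    return max_n_fraction >= n_fraction


# Toy sequences (hypothetical), not real EST records.
candidates = {
    "est_ok": "ATGC" * 50,       # 200 nt, no N
    "est_many_n": "ATGN" * 50,   # 200 nt, 25% N
    "est_short": "ATGCATGC",     # 8 nt
}

kept = [name for name, seq in candidates.items() if passes_quality(seq)]
print(kept)
```

Only the first toy sequence survives both checks; the other two are dropped for too many Ns and for insufficient length, respectively.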
Besides sequence similarities, additional evidence supported the relatedness between TbDHN1, the twelve uncharacterized fungal proteins and the large family of LEA
proteins (Table 2): a) the amino acid composition
(Pepinfo analysis), e.g. richness in Gly and polar amino
acids such as Thr and Ser and lack of both Cys and Trp; b)
the high percentages of low complexity regions (SEG analysis); c) more than 50% of the polypeptide is predicted to
be structured as a random coil (Predictprotein analysis);
d) the high hydrophilicity (ProtScale analysis), e.g. maximum hydrophobicity for all sequences is 0.7 in GenBank:EAL87020 from A. fumigatus. GenBank:EAA54716
from M. grisea is the sole sequence in which cysteine and
tryptophan are present, although in a very low percentage.
Low complexity regions represented a high percentage of
each protein, with 7 sequences showing more than 30%
of their amino acids masked by SEG. The exceptions are
Swiss-Prot:Q7S6B0 from N. crassa (0%) and GenBank:EAL87020 from A. fumigatus (5%) which also
shared the lowest sequence similarity with TbDHN1.
On the basis of these physicochemical characteristics,
TbDHN1 and its fungal homologs can be assigned to the
large family of LEA.
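
The amino acid composition criterion (richness in Gly, Ser and Thr, and lack of Cys and Trp) amounts to counting residues and expressing them as percentages of the sequence length. The following is a minimal sketch of that calculation, not the Pepinfo implementation; the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def composition_percent(protein: str, residues: str = "CGSTW") -> dict:
    """Return the percentage of each requested residue in a protein sequence."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    counts = Counter(protein)
    total = len(protein)
    # Counter returns 0 for residues that never occur.
    return {aa: round(100.0 * counts[aa] / total, 1) for aa in residues}


# Hypothetical 20-residue toy sequence, not TbDHN1.
toy = "MGGTTSGGKTEGGTSHGGAT"
comp = composition_percent(toy)
print(comp)
```

For this toy sequence the glycine fraction is 40% and cysteine and tryptophan are absent, the same qualitative profile reported for the fungal proteins in Table 2.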
Figure 1
Panel A, TbDHN1 deduced amino acid sequence and dehydrin 6 from Hordeum vulgare. New CSPs of TbDHN1
(see text for description) [GenBank:DQ308610] are highlighted with a red background; Prosite plant dehydrin signature patterns are
highlighted with a green background in dehydrin 6 [GenBank:AAD02257]: S-segment in light green and K-segments in dark green.
Sequence fragments aligned by blastx are in upper case (see panel B). Panel B, Basic local alignment between TbDHN1
and dehydrin 6 from Hordeum vulgare. The best basic local alignment (E-value = 4e-25) was built by blastx between amino
acids 75 and 305 of TbDHN1 and amino acids 82 and 324 of dehydrin 6 [GenBank:AAD02257]. New CSPs of TbDHN1 are
highlighted with a red background.
Classification according to Wise's rules
The next step in the characterization of TbDHN1 and its
homologs was their classification within the big family of
LEA proteins. TbDHN1 and the other twelve fungal proteins were classified according to a set of rules defined by
Wise [12]. Each LEA Class is defined by a range of percentages for all the physicochemical features which usually
characterize LEA proteins: hydrophilicity, predicted secondary structure and amino acid composition. According
to this rule set, G. zeae, A. nidulans, A. fumigatus and C.
albicans proteins were placed into Class II; while the aromatic amino acid percentage of N. crassa and M. grisea
proteins and the values of minimum hydrophobicity for
T. borchii and S. nodorum proteins were out of the established ranges for this Class. It is important to point out
that, especially for N. crassa, M. grisea and T. borchii proteins, values of aromatic amino acid percentage and minimum hydrophobicity are very close to the limit of Class
II.

Table 2: Physicochemical analysis of TbDHN1 and its homolog sequences. Amino acid compositions, percentages of low complexity
regions, percentages of unstructured polypeptide and hydropathy profiles for TbDHN1 and its homologs. All figures are intended as
percentages. (Hydropathy profile column: plots, not reproduced.)

Sequence               Cys   Gly    Ser    Thr    Trp   L.C.R.*  Random coil
TbDHN1 T. borchii      0.0   19.7   10.5   22.8   0.0   57       76.1
CAD70810 N. crassa     0.0   17.1   9.8    12.8   0.0   28       86.2
Q7S6B0 N. crassa       0.0   8.5    9.3    5.4    0.0   0        89.2
EAA62484 A. nidulans   0.0   14.6   11.8   14.6   0.0   22       88.6
EAA55454 M. grisea     0.0   16.5   12.9   14.1   0.0   44       82.3
EAA54716 M. grisea     0.4   16.3   12.6   9.1    0.2   25       87.0
SNU00161 S. nodorum    0.0   20.6   7.8    18.8   0.0   52       87.2
EAL89059 A. fumigatus  0.0   12.0   8.7    13.1   0.0   24       94.7
EAL84332 A. fumigatus  0.0   15.0   11.8   13.4   0.0   37       90.2
EAL87020 A. fumigatus  0.0   12.6   11.6   11.6   0.0   5        95.6
EAK94909 C. albicans   0.0   23.5   16.0   14.2   0.0   56       70.8
EAK94850 C. albicans   0.0   23.6   15.8   14.3   0.0   55       63.9
EAA70367 G. zeae       0.0   16.0   14.2   12.9   0.0   45       92.5

*L.C.R., Low Complexity Regions

The six fungal proteins not assigned to Class II clustered
with Class IV, but Wise concluded that members of Class
IV should be more appropriately housed in Class II and
Class III. According to Wise's analysis of LEA amino acid
composition, glycine is highly represented in Class II,
while in Class III glycine is found only marginally more
than expected by chance. Moreover, Class III LEA proteins
have high helix content. On the basis of a high percentage
of glycines in the six proteins and/or a lower percentage of
helix content than the expected one for Class III, also proteins from N. crassa, M. grisea, T. borchii and S. nodorum
can be assigned to Class II, the group of plant dehydrins.
Furthermore, blastp results obtained with TbDHN1 as
query against the nr protein database shared no common
hits with the ones obtained using as queries the fungal
proteins already classified in Class III (Pfam PF02987).
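
The hydrophobicity values that enter this kind of classification are derived from a hydropathy scale averaged over a sliding window. The sketch below uses the published Kyte and Doolittle scale with an arbitrary window of 9 residues; it is only an illustration of the principle, it does not reproduce the normalization applied by ProtScale or the class ranges defined by Wise, and the function name and test peptides are hypothetical.

```python
# Kyte and Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Return the mean hydropathy of every window along the sequence.

    Non-standard residue codes raise KeyError; the window size is an assumption.
    """
    protein = protein.upper()
    if 1 > window or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in protein]
    last_start = len(scores) - window
    return [sum(scores[i:i + window]) / window for i in range(last_start + 1)]


# Hypothetical peptide: three isoleucines followed by three lysines.
profile = hydropathy_profile("IIIKKK", window=3)
print([round(value, 2) for value in profile])
```

The maximum and minimum of such a profile correspond to the "maximum hydrophobicity" and "minimum hydrophobicity" descriptors mentioned in the text, up to the scaling used by the original tool.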
A new common signature pattern
Members of plant DHN family are characterized by the
presence of two PROSITE signature patterns: the S-segment S(5)-[DE]-x-[DE]-G-x(1,2)-G-x(0,1)-[KR](4) (with
the exception of pea dehydrins, Arabidopsis thaliana
COR47 and XERO2 and wheat cold-shock proteins) and
the K-segment [KR]-[LIM]-K-[DE]-K-[LIM]-P-G. These
common signature patterns are not present in the fungal
sequences. However, a new repeated conserved signature pattern (CSP) was identified in sequences from filamentous fungi: [GRK]-[PV]-H-x-[ST]-x-x-x-N-[nonpolar
amino acid]-[nonpolar amino acid]-D-P-[RTP]-V-D-[SN].
This deduced signature pattern, identified at the C-terminus, represents the first block of amino acids aligned by T-Coffee [25] and it is repeated from one to nine times
within each sequence with slight variations (Fig. 2). The
different number of repetitions of the CSP poses a challenging problem to the alignment of these regions, but T-Coffee aligned all the first and the last repetitions
together, because further conserved residues flank these
two repetitions. Not only TbDHN1 and the analysed fungal proteins, but all the previously cited fungal and plant
sequences retrieved from blast searches show this CSP. It
has to be underlined that sequences from C. albicans do
not show this CSP but, in any case, they do share common
physicochemical features with the other fungal sequences
(Table 2).
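
The PROSITE-style patterns quoted above can be searched for with ordinary regular expressions. The following minimal sketch converts the subset of the PROSITE syntax used in this section (single residues, bracketed alternatives, x wildcards and repeat counts) into a Python regex and counts non-overlapping matches. It is an assumption-laden illustration: the set of residues treated as "nonpolar" (AVLIMFWPG) is a choice made here, not one stated in the text, and the test sequences are synthetic.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    element_re = re.compile(r"(\[[A-Z]+\]|[A-Za-z])(?:\((\d+)(?:,(\d+))?\))?")
    tokens = []
    for element in pattern.split("-"):
        match = element_re.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        # In PROSITE syntax a lowercase x stands for any residue.
        token = "." if residue == "x" else residue
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        tokens.append(token)
    return "".join(tokens)


def count_matches(pattern: str, protein: str) -> int:
    """Count non-overlapping occurrences of a PROSITE-style pattern."""
    return len(re.findall(prosite_to_regex(pattern), protein.upper()))


K_SEGMENT = "[KR]-[LIM]-K-[DE]-K-[LIM]-P-G"
S_SEGMENT = "S(5)-[DE]-x-[DE]-G-x(1,2)-G-x(0,1)-[KR](4)"
NONPOLAR = "[AVLIMFWPG]"  # assumed definition of "nonpolar amino acid"
CSP = f"[GRK]-[PV]-H-x-[ST]-x(3)-N-{NONPOLAR}(2)-D-P-[RTP]-V-D-[SN]"

# Synthetic sequence containing two CSP-like repeats (not a real protein).
toy = "MM" + "GPHKSAAANLLDPRVDS" + "TT" + "KVHETGGGNVMDPTVDN"
print(prosite_to_regex(CSP))
print(count_matches(CSP, toy))
```

Counting matches per sequence in this way gives the kind of repeat tally summarized in Fig. 2, although the slight variations between repeats mentioned in the text mean that a strict pattern may miss some divergent copies.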
Figure 2
Distribution of the conserved signature pattern (CSP) within TbDHN1 and its homolog sequences. CSP is indicated with red boxes.
Phylogeny, structure and localization prediction
The sequence alignment among TbDHN1 and its fungal
homologs, selected plant LEA Class I, Class II, Class III
and fungal Class III proteins (Table 3) generated by T-Coffee was used to infer the phylogenetic relationship. NJ
(data not shown) and Bayesian posterior probability analyses (Fig. 3) showed a congruent tree topology. Bayesian
analyses yielded a phylogenetic tree in which clades corresponding to LEA Class I, Class II, Class III proteins are recognizable, although the LEA Class III sequence group is
not well defined. A first clade, supported by a posterior
probability value of 0.74, comprises all LEA Class I proteins and a Class III protein from P. sativum. A second
clade, supported by a posterior probability value of 0.63,
includes all LEA 2 proteins from plants, TbDHN1 and its
fungal homologs. Fungal LEA Class II proteins clustered
together with a posterior probability value of 0.78 and
61% NJ bootstrap value. Despite the lack of the CSP, C.
albicans proteins are also included in the same cluster. Notably, Swiss-Prot:Q9ZTR5_HORVU, the best hit of known
function provided by blastx searches, falls in this clade.
Plant and fungal Class III proteins clustered together but
did not form a well separated clade supported by posterior
probability and MP bootstrap values higher than 50%.
A multiple alignment among TbDHN1 and its fungal
homologs was generated with T-Coffee and used to build
a Hidden Markov Model profile. This was used to query
Swiss-Prot and Pfam databases, but no hit with a significant score was retrieved: the first hit was an Ice nucleation
protein (Swiss-Prot:P09815), E-value = 0.076, from Pseudomonas fluorescens.
WoLF PSORTII predicted for TbDHN1 and the other
twelve fungal proteins a probable nuclear/cytoplasmic
localization, but no nuclear localization signals were
found by the PredictNLS program.
As in plant DHNs, in TbDHN1, GenBank:EAA55454 from
M. grisea, GenBank:EAA70367 from G. zeae, SNU00161
from S. nodorum, GenBank:CAD70810 from N. crassa and
both proteins from C. albicans, there are repetitions of
amino acid strings which are identical within the same
sequence but different from one sequence to the other
(data not shown).
The case study of TbDHN1: gene structure
The entire sequence of TbDHN1 gene was obtained from
a T. borchii cDNA and genomic library screening. TbDHN1
mRNA was a 2300-bp-long open reading frame coding for
a predicted 351 amino acid long polypeptide with a predicted molecular mass of 34.8 kDa. The genomic
sequence revealed the presence of a 79-bp-long intron
after 178 nucleotides from the translation initiator ATG
and a putative TATA-box located at position -35 nt from
the transcription start site (Fig. 4). Besides the presence of
4 repetitions of the CSP, another peculiar feature has to be
pointed out: a block of 46 amino acids is repeated within
the sequence, separated only by a glutamic acid residue.
The corresponding nucleotide sequences are almost identical with slight variations only in the third nucleotide of
some codons (Fig. 4).
The genomic region amplified by DHNf/DHNr primers
contains one KpnI restriction site but no site for HindIII
and SmaI. As revealed by DNA gel blot data reported in
Figure 5A, a single-band pattern was produced by hybridization of the TbDHN1 probe to genomic DNA digested
with HindIII and SmaI and a double-band was produced
by KpnI, thus indicating that TbDHN1 is encoded by a single copy gene in T. borchii genome.
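
The reasoning behind the single-copy conclusion can be made explicit: for a single-copy locus, an enzyme that does not cut inside the probed region should give one hybridizing band, while each internal site adds one band. The sketch below counts recognition sites and applies this simplified rule. It assumes complete digestion and that the probe covers the whole region on both sides of every internal site; the function names and the toy sequence are hypothetical, and only the three recognition sequences (HindIII AAGCTT, SmaI CCCGGG, KpnI GGTACC) are standard facts.

```python
# Recognition sequences of the three enzymes used in the DNA gel blot.
# All three are palindromic, so scanning one strand is sufficient.
SITES = {"HindIII": "AAGCTT", "SmaI": "CCCGGG", "KpnI": "GGTACC"}


def count_sites(dna: str, site: str) -> int:
    """Count occurrences of a recognition sequence, allowing overlaps."""
    dna = dna.upper()
    count = 0
    start = dna.find(site)
    while start != -1:
        count += 1
        start = dna.find(site, start + 1)
    return count


def expected_bands(probe_region: str, enzyme: str, gene_copies: int = 1) -> int:
    """Simplified expectation: each copy yields (internal sites + 1) bands."""
    return gene_copies * (count_sites(probe_region, SITES[enzyme]) + 1)


# Hypothetical probed region with a single KpnI site (not the TbDHN1 sequence).
toy_region = "ATGCGT" + "GGTACC" + "TTAGCA"
bands = {enzyme: expected_bands(toy_region, enzyme) for enzyme in SITES}
print(bands)
```

With one KpnI site and no HindIII or SmaI site, the rule predicts two bands for KpnI and one band for each of the other two enzymes, which is the pattern described for a single-copy gene.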
Stress-regulated expression of TbDHN1
Previous experiments showed that TbDHN1 transcript
was strongly upregulated in T. borchii fruiting bodies compared to vegetative mycelium grown on control medium
[21], but, since fruit bodies can be collected only in the field,
we chose to further investigate TbDHN1 expression using
vegetative mycelium grown in axenic conditions.
RNA gel-blot experiments were conducted to determine
whether TbDHN1 transcription levels increase in the same
conditions affecting plant DHNs transcription, e.g. low
temperature stress and salinity stress. The RNA gel-blot
data reported in Figure 5B showed that the TbDHN1 messenger increased immediately following transfer on NaCl
medium for 30 min and such up-regulation is still evident
in a NaCl treatment prolonged to 24 h. The same strong
up-regulation response is also evident after a 48 h-long
cold treatment. TbDHN1 messenger rapidly returned to
basal levels when mycelia cultivated for 24 h on NaCl
medium were shifted for 5 h on control medium, while in
mycelia continually cultivated for 29 h on NaCl medium
the up-regulation was still evident.
Discussion
Almost 40 genomes of fungi have been completely
sequenced or are currently in the pipeline according to the
NCBI website. In any newly sequenced eukaryotic
genome, more than 30–40% of the genes usually do not
have an assigned function [26]. Remarkably, a relatively
small fraction of the uncharacterized genes is species- or
genus-specific; the majority of such "hypothetical" genes
have a wider phyletic distribution and therefore are usually referred to as "conserved hypothetical" genes [27,28].
Although it appears that the central pathways of information processing and metabolism are already known,
important signalling and stress response mechanisms
remain to be studied [29].
Table 3: LEA proteins used for phylogenetic analysis. Amino acid
sequences corresponding to LEA proteins were retrieved from
the Swiss-Protein database.

Accession number (a)  Organism                       Classified as
P04568                Triticum aestivum              LEA 1
P09443                Gossypium hirsutum             LEA 1
P11573                Raphanus sativus               LEA 1
P17639                Daucus carota                  LEA 1
P22701                Triticum aestivum              LEA 1
P46514                Helianthus annuus              LEA 1
P46517                Zea mays                       LEA 1
P46520                Oryza sativa                   LEA 1
Q02973                Arabidopsis thaliana           LEA 1
Q05190                Hordeum vulgare                LEA 1
Q05191                Hordeum vulgare                LEA 1
Q07187                Arabidopsis thaliana           LEA 1
O04232                Solanum tuberosum              LEA 2
O22623                Vaccinium corymbosum           LEA 2
O48622                Spinacia oleracea              LEA 2
O65216                Triticum aestivum              LEA 2
P12253                Oryza sativa                   LEA 2
P12950                Zea mays                       LEA 2
P22239                Craterostigma plantagineum     LEA 2
P22240                Lycopersicon esculentum        LEA 2
P28639                Pisum sativum                  LEA 2
P42758                Arabidopsis thaliana           LEA 2
Q07322                Daucus carota                  LEA 2
Q9ZTR5*               Hordeum vulgare                LEA 2
O49816                Cicer arietinum                LEA 3
P13934                Brassica napus                 LEA 3
P13939                Gossypium hirsutum             LEA 3
P14928                Hordeum vulgare                LEA 3
P20075                Daucus carota                  LEA 3
P23283                Craterostigma plantagineum     LEA 3
Q39058                Arabidopsis thaliana           LEA 3
Q39873                Glycine max                    LEA 3
Q40696                Oryza sativa                   LEA 3
Q40869                Picea glauca                   LEA 3
Q40929                Pseudotsuga menziesii          LEA 3
Q41060                Pisum sativum                  LEA 3
Q00002                Alternaria alternata (b)       LEA 3
Q5UUY7                Antonospora locustae (b)       LEA 3
Q6FVF6                Candida glabrata (b)           LEA 3
Q6BSI4                Debaryomyces hansenii (b)      LEA 3
Q8SVY9                Encephalitozoon cuniculi (b)   LEA 3
Q6CP34                Kluyveromyces lactis (b)       LEA 3
Q7S1U2                Neurospora crassa (b)          LEA 3
Q7SH94                Neurospora crassa (b)          LEA 3
Q9Y7X6                Schizosaccharomyces pombe (b)  LEA 3
Q6C0I4                Yarrowia lipolytica (b)        LEA 3
Q6C052                Yarrowia lipolytica (b)        LEA 3
Q6C800                Yarrowia lipolytica (b)        LEA 3

(a) Swiss Protein Database Accession Number; (b) Fungal organisms; *This
is DHN6 from Hordeum vulgare (corresponding to GenBank:
AAD02257) which represents the first hit of known biological role
found by blastx using TbDHN1 as query sequence.

In our previous work on key regulators and master genes
controlling the morphogenetic events of truffle fruiting
body formation, 16 out of the 55 transcripts preferentially
expressed in fruit bodies showed significant homology to
other fungal hypothetical proteins of unknown function
[21]. The availability of experimental data on the upregulation of TbDHN1 (M6G10 EST) in T. borchii reproductive
stage, the presence of TbDHN1 homologs in other ascomycetous fungi and the absence in other fungal lineages
(especially in Basidiomycota and Glomeromycota) made
this gene a priority target for investigations on ascomycetous fruiting body formation. Conserved genes having a
patchy phyletic distribution could be functional determinants of particular phenotypes and let us hypothesize that
this DHN-like gene may have been acquired by some
organisms after the separation of the main fungal evolutionary lineages which was estimated at 1400 Myr [30].
On the other hand, TbDHN1 and its homologs are supposed to be a part of the larger family of LEA proteins, so
we can state that this protein family has a nearly ubiquitous phyletic distribution: its members have already been found in
plants, algae, invertebrates, bacteria and in ascomycetous
fungi. Significant positive correlation between the
phyletic spread of a gene and the likelihood that it is
essential for cell growth has been demonstrated [31].
Since LEA proteins belonging to different classes have no
evident sequence similarity but only common structural
and physicochemical characteristics, the hypothesis of an
independent evolution in different organisms under
dehydration conditions is more likely than the hypothesis
that they share a common ancestor [13]. Defence against
water deficit caused by low temperature, salinity stress or
drought is, in fact, a crucial point in preventing cell damage, especially for those organisms, like plants and microorganisms, that are unable to escape from critical
environmental conditions.
Another point of interest raised by TbDHN1 and its
homologs bears upon the presence of a repeated conserved signature pattern (CSP) which groups together all
the sequences found in filamentous fungi and plants. We
suppose that the two proteins from C. albicans do not
show this CSP because they are from a yeast-like Ascomycete, but we do not exclude that they could share a not-yet
identified CSP with other Saccharomycetales.
The different number of repetitions of the common signature pattern in various organisms could be explained as a
possible consequence of multiple events of internal duplication. Repetitions of the CSP within the same sequence
are not exactly identical; therefore, after an initial event of
internal duplication, a differentiation of the CSP may
have occurred. The fungal CSP, as well as other repeated
modules of proteins [32], could be a structural/functional
domain.
[Figure 3: phylogenetic tree; tip labels (species with Swiss-Prot accession numbers, fungal sequences marked *) and node support values not reproduced. Labelled clade: LEA Class III.]