<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes</title>
<meta name="Subject" content="BMC Genomics 2010, 11:702. doi:10.1186/1471-2164-11-702"/>
<meta name="Keywords" content=" "/>
<meta name="Author" content="Eduard D Akhunov"/>
<meta name="Creator" content="Arbortext Advanced Print Publisher 10.0.1082/W Unicode"/>
<meta name="Producer" content="Acrobat Distiller 9.0.0 (Windows)"/>
<meta name="CreationDate" content=""/>
</head>
<body>
<pre>
Akhunov et al. BMC Genomics 2010, 11:702
http://www.biomedcentral.com/1471-2164/11/702
RESEARCH ARTICLE
Open Access
Nucleotide diversity maps reveal variation in
diversity among wheat genomes and chromosomes
Eduard D Akhunov1,2, Alina R Akhunova1,2, Olin D Anderson3, James A Anderson4, Nancy Blake5, Michael T Clegg6,
Devin Coleman-Derr3, Emily J Conley4, Curt C Crossman3, Karin R Deal1, Jorge Dubcovsky1, Bikram S Gill7, Yong Q Gu3,
Jakub Hadam7, Hwayoung Heo5, Naxin Huo3, Gerard R Lazo3, Ming-Cheng Luo1, Yaqin Q Ma1,8, David E Matthews9,
Patrick E McGuire1, Peter L Morrell4, Calvin O Qualset1, James Renfro3, Dindo Tabanao4,10, Luther E Talbert5, Chao Tian1,
Donna M Toleno6, Marilyn L Warburton11,12, Frank M You1, Wenjun Zhang1, Jan Dvorak1*
Abstract
Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the
inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their
respective genomes. The same requirements complicate the development and deployment of single nucleotide
polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and
deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild
tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication
in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a
panel of SNP markers for polyploid wheat.
Results: Nucleotide diversity was estimated in 2,114 wheat genes and was similar between the A and B genomes
and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity
was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1,791 wheat genes.
Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid
ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific
primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in
one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed
on wheat deletion-bin maps. The agreement between the maps was assessed.
Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic
diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous
chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions.
The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes,
which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older
polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the
genome.
Background
While nucleotide diversity studies and the development
and deployment of single nucleotide polymorphism
(SNP) markers are straightforward in diploid and paleopolyploid species, such as maize or soybean [1-3], they
are complicated in recently evolved polyploid species by
high levels of orthologous gene similarity. Sequence
similarity makes sequencing of single genes and allocation of sequences into respective genomes difficult. Special strategies are therefore required for nucleotide
diversity studies and the development of SNP markers
for young polyploid species, which include wheat and
other economically important plants.
* Correspondence: jdvorak@ucdavis.edu
1 Department of Plant Sciences, University of California, Davis, CA 95616, USA
Full list of author information is available at the end of the article
© 2010 Akhunov et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.
Wheat forms an allopolyploid series at three ploidy
levels: diploid (2x = 14), tetraploid (4x = 28), and hexaploid
(6x = 42). Wild tetraploid emmer wheat (Triticum turgidum ssp. dicoccoides, henceforth shortened to T. dicoccoides, genomes AABB) evolved between 0.2 and 0.5
million years ago [4,5] via hybridization of wild T. urartu
(genomes AA) and an extinct or undiscovered species in
the lineage of Aegilops speltoides (genomes SS, where S is
closely related but not identical to the wheat B genome)
[4,6-9]. Hexaploid wheat (T. aestivum, genomes AABBDD)
evolved about 8,500 years ago [10] via hybridization of
T. turgidum with diploid Ae. tauschii (genomes DD) [11,12].
A possible strategy for nucleotide diversity studies and
SNP discovery in young polyploid species, such as
wheat, is to find diverged regions in orthologous genes
and use them for the design of polymerase chain reaction (PCR) primers that anneal to only a single DNA
target. These genome-specific primers (GSPs) amplify
DNA from only a single genome and facilitate gene
sequencing and SNP discovery [13]. An alternative strategy is to shotgun-sequence cDNAs and then allocate
each sequence to a genome. Both approaches have been
used in polyploid wheat [13-15] although those studies
were of limited scope and genome coverage [14-17] and
none mapped the markers.
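The first strategy depends on locating positions at which one genome differs from the others. The following is a minimal sketch of that search, assuming the homoeologous sequences have already been aligned to equal length. The function name and the example sequences are hypothetical illustrations and are not data or software from this study.

```python
def genome_specific_sites(aligned):
    """Return {genome: [0-based positions]} where that genome's base is
    unique and all remaining genomes share a single, different base."""
    genomes = list(aligned)
    length = len(aligned[genomes[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("sequences must be aligned to equal length")
    sites = {g: [] for g in genomes}
    for pos in range(length):
        column = {g: aligned[g][pos] for g in genomes}
        if "-" in column.values():
            continue  # skip alignment gaps
        for g in genomes:
            others = {column[o] for o in genomes if o != g}
            if len(others) == 1 and column[g] not in others:
                sites[g].append(pos)
    return sites

# Hypothetical aligned fragments of one gene from the A, B, and D genomes.
example = {
    "A": "ATGCCGTTAGC",
    "B": "ATGCTGTTAGC",
    "D": "ATGCCGTTGGC",
}
print(genome_specific_sites(example))
```

Each reported position is a candidate divergent nucleotide on which a genome-specific primer could be anchored; primer design constraints such as melting temperature are outside the scope of this sketch.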
A domestication bottleneck at the tetraploid level and
a polyploidy bottleneck during the transition from the
tetraploid to hexaploid level are expected to have
reduced the diversity of polyploid wheat compared to
wild emmer. Nucleotide diversity θπ [18] was reported
to be 2.7 × 10⁻³ in 24 A- and B-genome wild emmer
genes [16]. For comparison, θπ was estimated to be 9.7
× 10⁻³ in teosinte genes (Zea mays ssp. parviglumis) [3]
and 7.7 to 8.1 × 10⁻³ in wild barley genes (Hordeum vulgare ssp. spontaneum) [19,20]. The diversity of emmer
was reduced by the domestication bottleneck but, curiously, no further diversity loss took place in the A and
B genomes during the polyploidy bottleneck accompanying the evolution of T. aestivum from domesticated tetraploid wheat [16]. Levels of diversity in the T. aestivum
D genome are unknown.
Genetic evidence suggests that wild emmer was
domesticated in the Diyarbakir region in southeastern
Turkey [21,22]. The result was hulled domesticated
emmer (T. turgidum ssp. dicoccon), which was then the
primary source of free-threshing tetraploid wheat, such
as durum (T. turgidum ssp. durum, henceforth
T. durum). Transcaucasia and northwestern Caspian
Iran appear to be the primary sites of the evolution of
T. aestivum [23]. Gene flow from wild to domesticated
tetraploid wheat and from tetraploid wheat and
Ae. tauschii to T. aestivum has been experimentally
documented [23-27] but its impact on the evolution of
the T. aestivum A, B, and D genomes is not clear.
We report here the development of GSPs for T. aestivum and their use in sequencing of T. aestivum genes
with the goal of characterizing the nucleotide diversity
of the wheat genomes and discovering SNPs. To make
the GSP development possible, a set of primers
anchored in conserved exons flanking one or several
introns was developed and is also reported. We refer to
these as conserved primers (CPs), as in [13]. Primers of
this type have also been known as conserved orthologous sets (COS) [28]. A map of genes bearing SNPs
constructed in diploid Ae. tauschii is presented and
compared with wheat deletion-bin gene maps [29].
Nucleotide diversity in individual chromosomes in a
wild emmer population from the Diyarbakir region in
Turkey and in T. aestivum was computed and the distribution of diversity among and within wild emmer and
T. aestivum genomes was used to analyze the early
stages of polyploid evolution.
Results
GSP development and SNP discovery
The process of GSP and SNP development is summarized in Figure 1. A total of 6,045 wheat ESTs was downloaded from the wEST database into the pipeline and
CPs anchored in exons and flanking one or two introns
were developed. The Southern hybridization profiles of
the ESTs were examined in the wEST database
(http://wheat.pw.usda.gov/cgi-bin/westsql/map_locus.cgi) and
CPs for those that showed a complex profile were eliminated. Amplicons were obtained with CPs for 1,599
T. urartu genes, 1,583 Ae. speltoides genes, and 1,574
Ae. tauschii genes and were sequenced. A total of 1,442
genes was cloned and sequenced from Langdon durum
wheat. A total of 11,764 GSPs was designed and tested
for genome specificity by PCR amplification of T. aestivum nullisomic-tetrasomic (N-T) lines. GSPs for 2,114 loci derived
from 1,102 EST unigenes (705 loci in the A genome, 703 in
the B genome, and 706 in the D genome) were validated
by PCR with N-T lines.
Target DNA was PCR amplified in 32 wheat lines
(Tables 1 and 2) using GSPs. A total of 41,065,555 bp of
the amplicons was sequenced (14,734,124 bp in the A
genome, 14,554,737 bp in the B genome, and 11,776,694
bp in the D genome) using GSP pairs as sequencing primers, and 5,471 SNPs at 1,791 loci were discovered.
Figure 1 Project flow chart.
CP design
  Download bin-mapped ESTs from the wEST database into the ConservedPrimers 2.0 pipeline (6,045 ESTs downloaded).
  Design primers with the pipeline and manually (primer pairs for 2,111 unigenes designed).
GSP design
  PCR amplify and sequence T. urartu, Ae. tauschii, and Ae. speltoides amplicons and at least 12 clones of each 'Langdon' durum amplicon.
  Align sequences, allocate them to the A, B, and D genomes, and find divergent nucleotides.
  Use a divergent nucleotide to design a genome-specific primer and pair it with one of the CP primers (11,764 GSPs designed and tested).
GSP validation
  Perform PCR using GSPs with relevant wheat nullisomic-tetrasomic lines and validate primer genome specificity (GSPs for 2,114 loci belonging to 1,102 unigenes validated).
SNP discovery
  Amplify gene targets using GSPs in the panel of 32 wheat lines (2,114 genes amplified).
  Align sequences and search for SNPs (5,471 SNPs discovered at 1,791 gene loci).
Database construction
  Submit the alignments with SNPs to the project database (http://probes.pw.usda.gov:8080/snpworld/Search).

SNP database
An online SNP database (http://probes.pw.usda.gov:8080/snpworld/Search)
was constructed. It contains sequences
of GSPs for the amplification and sequencing of 2,114
loci and other relevant information about the ESTs and
SNPs, such as deletion-bin mapping of each EST, top
ten BLAST hits of each EST, alignments of nucleotide
sequences generated with primers derived from each
EST, a reference sequence for a locus and its source,
and graphical and numerical displays of each SNP.
Reference sequences were used to specify the positions
of SNPs. For the majority of the loci, the cv ‘Chinese
Spring’ (code Ta21, Table 1) sequence was used as a
reference sequence because of the central position of
Chinese Spring in the unrooted phylogenetic tree of 468
T. aestivum lines (Additional file 1, Figure S1). If the
sequence from Chinese Spring was unavailable, the next
most complete sequence for the locus was used. SNPs
can be viewed in the context of the entire reference
sequence in the expanded view window for each EST.
The database also contains data for portions of 1,651
genes amplified and sequenced with CPs in T. urartu,
Ae. speltoides, and Ae. tauschii
(http://probes.pw.usda.gov:8080/snpworld/Search). The accession used as a
reference sequence for a locus is indicated for each
species. Data in the database include 488 polymorphic
loci containing 1,271 SNPs for T. urartu, 463 polymorphic loci containing 1,218 SNPs for Ae. speltoides,
and 641 polymorphic loci containing 2,203 SNPs for
Ae. tauschii. Additional SNPs for Ae. tauschii can be
found in the database for the D genomes of the synthetic wheats.
Table 1 Lines of tetraploid and hexaploid wheat used for SNP discovery
Species                     Database code  Line                  Origin
T. turg. ssp. dicoccoides   Td01           PI 428020             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td02           PI 428027             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td03           PI 428053             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td04           PI 428073             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td05           PI 428064             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td06           PI 428082             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td07           PI 428083             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td08           PI 428086             Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td09           G 2844                Diyarbakir, Turkey
T. turg. ssp. dicoccoides   Td10           G 2040                Diyarbakir, Turkey
T. aest. ssp. aestivum      Ta11           PI 166698             Turkey
T. aest. ssp. aestivum      Ta13           PI 166792             Turkey
T. aest. ssp. aestivum      Ta16           PI 622268             Iran
T. aest. ssp. aestivum      Ta17           Yangxian Yangqianmai  Shaanxi (5660)
T. aest. ssp. aestivum      Ta18           Yecora Rojo           California
T. aest. ssp. aestivum      Ta19           PI 119325             Turkey
T. aest. ssp. aestivum      Ta20           PI 622233             Iran
T. aest. ssp. aestivum      Ta21           Chinese Spring        China
T. aest. ssp. aestivum      Ta23           Opata 85              CIMMYT
T. aest. ssp. compactum     Ta12           PI 166305             Turkey
T. aest. ssp. compactum     Ta14           PI 350731             Austria
T. aest. ssp. compactum     Ta15           PI 410595             Pakistan
T. aest. ssp. spelta        Ta22           405a (DV1132)         Iran

Diversity maps
A single Ae. tauschii EST linkage map [30] was used as
the backbone of the diversity maps. The Ae. tauschii
map backbone contained 870 loci (Table 3). Cosegregating genes were allocated into “recombination blocks”
which were sequentially numbered (Additional file 2,
Table S1). The order of orthologous genes in rice was
used to order genes within a recombination block.
Synteny of the Ae. tauschii genetic map with the rice
genome sequence [30] was exploited in mapping additional
loci for which the parents of the Ae. tauschii mapping
population were not polymorphic (Table 3) and which met
the conditions detailed in Materials and Methods (Map
construction). In a few cases, in which an ambiguity was
encountered in the rice genome sequence, sorghum and
Brachypodium distachyon genome sequences were
employed [30,31]. Consider, for example, locus BG313769
located on the short arm of chromosome 1D (Additional
file 2, Table S1). This locus was mapped to bins 1AS1,
1BS9, and 1DS1 [32] (http://wheat.pw.usda.gov/cgi-bin/westsql/map_locus.cgi).
The locus with the highest
sequence similarity in rice is on pseudomolecule Os5 starting at nucleotide 1,903,106. Os5 is homoeologous with
1DS and mapping of the locus in the 1AS1, 1BS9, and
1DS1 bins is consistent with the position of the locus in
Os5 (Additional file 2, Table S1). PCR using genomic DNA
of N1A-T1B and N1A-T1D as templates with the
BG313769 A-genome GSPs showed that the locus used for
the diversity study was on chromosome 1A
(http://probes.pw.usda.gov:8080/snpworld/Search). Inserting locus
BG313769 into the map on the basis of synteny of 1AS1,
1BS9, and 1DS1 with Os5 placed it between recombination
block 36 (locus BE445121, which is at 56.85 cM on the Ae.
tauschii map and at nucleotide 1,679,201 in the Os5 pseudomolecule) and recombination block 37 (locus BF291549,
which is at 57.06 cM on the Ae. tauschii map and at
nucleotide 1,954,380 in the Os5 pseudomolecule). Locus
BG313769 and its diversity data were therefore placed
between loci BE445121 and BF291549. No cM value was
attached to the locus but its coordinates on Os5 were
given (Additional file 2, Table S1).
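The placement of locus BG313769 can be viewed as an interval lookup on rice coordinates. The sketch below uses only the two flanking backbone loci and the Os5 coordinates quoted above. The function name and data layout are hypothetical, and the sketch assumes that the backbone is sorted by rice coordinate and colinear with it, which is a simplification of the conditions described in Materials and Methods.

```python
from bisect import bisect_left

# (recombination block, locus, cM on the Ae. tauschii map, Os5 start nucleotide)
backbone = [
    (36, "BE445121", 56.85, 1_679_201),
    (37, "BF291549", 57.06, 1_954_380),
]

def flanking_blocks(rice_start, blocks):
    """Return the backbone entries immediately left and right of rice_start.
    Assumes blocks are sorted by rice coordinate and colinear with it."""
    coords = [b[3] for b in blocks]
    i = bisect_left(coords, rice_start)
    left = blocks[i - 1] if i else None
    right = blocks[i] if i != len(blocks) else None
    return left, right

left, right = flanking_blocks(1_903_106, backbone)  # locus BG313769
print(left[1], right[1])
```

A locus inserted in this way receives no cM value of its own; only its flanking recombination blocks and its rice coordinate are recorded.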
Loci corresponding to 484 ESTs were inserted into the
diversity maps on the basis of this process (Additional
file 2, Table S1), bringing the total number of loci on the
map to 1,354 (Table 3). Diversity was estimated from at
least one genome for 987 EST loci on the map. From
348,938 to 351,542 bp were sequenced and mapped on
the diversity maps for each genome × taxon combination
(Table 4). The numbers of discovered SNPs ranged from
377 in the T. aestivum D genome to 1,979 in the wild
emmer B genome. The highest average number of haplotypes per gene and highest average haplotype diversity
were in the D genome of synthetic wheats whereas the
lowest number of haplotypes per gene and lowest haplotype diversity were in the D genome of T. aestivum (Table
4). In wild emmer and T. aestivum, the average numbers
of haplotypes per gene and haplotype diversity did not
significantly differ between the A and B genomes (Table
4). However, both variables were significantly higher in
the genomes of wild emmer than in the corresponding
genomes of T. aestivum (Table 4).
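For readers unfamiliar with the two per-gene summaries, the sketch below computes the number of distinct haplotypes (H) and haplotype diversity (h) for one locus. It assumes Nei's unbiased estimator, h = n/(n-1) × (1 - Σp²); the estimator actually used for Table 4 is not stated in this section, and the haplotype labels and counts are hypothetical.

```python
from collections import Counter

def haplotype_stats(haplotypes):
    """Return (H, h): number of distinct haplotypes and Nei's unbiased
    haplotype diversity, h = n/(n-1) * (1 - sum(p_i**2))."""
    n = len(haplotypes)
    if n in (0, 1):
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return len(counts), n / (n - 1) * (1 - sum_sq)

# Hypothetical locus sampled in 10 lines: 7, 2, and 1 copies of three haplotypes.
sample = ["hapA"] * 7 + ["hapB"] * 2 + ["hapC"]
H, h = haplotype_stats(sample)
print(H, round(h, 3))
```

Averaging H and h over all genes of a genome would give per-genome values of the kind compared in Table 4.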
Table 2 Synthetic wheats used for SNP discovery
Name*                 Database code  Source of AB genomes  Source of the D genome
ITMI synthetic (1)    Sn24           T. t. durum 'Altar'   CIMMYT W7984, CIGM86.940, Ae. tauschii ssp. strangulata (collected by H. Kihara near Mazadaran, Iran)
RL5402 (2)            Sn25           TetraCantach          RL5261, Ae. tauschii ssp. typica
RL5403 (2)            Sn26           TetraCantach          RL5266, Ae. tauschii ssp. anathera
RL5405 (2)            Sn27           TetraCantach          RL5288, Ae. tauschii ssp. strangulata (originally supplied by M. Tanaka as KUSE 2144)
RL5406 (2)            Sn28           TetraCantach          RL5289, Ae. tauschii ssp. meyeri
62052_4 (1)           Sn29           T. durum 'Croc_1'     CIMMYT 205, Ae. tauschii ssp. tauschii (PI452130, Hunan, China)
62056_4 (1)           Sn30           T. durum 'Croc_1'     CIMMYT 224 = CLAE 25, Ae. tauschii ssp. tauschii (collected by H. Kihara near Gilan, Iran)
161725_0 (1)          Sn31           T. durum 'Ceta'       CIMMYT 372 = CIGM86.940, Ae. tauschii ssp. strangulata (collected by H. Kihara near Kabul, Afghanistan)
Sears' synthetic (3)  Sn32           Unknown               Unknown
* Accessions designated with (1) were developed and supplied by A. Mujeeb-Kazi at CIMMYT, those designated with (2) were developed and supplied by E.R.
Kerber, Agriculture Canada, Winnipeg, Manitoba, and that designated with (3) was developed and supplied by E.R. Sears, USDA-ARS, Columbia, Missouri.

Superimposition of diversity maps on the deletion-bin maps
Wheat EST deletion-bin maps are an important resource for
the use of ESTs in wheat comparative mapping, map-based
cloning of wheat genes, comparative genomics, and other
genetic and genomic applications. To facilitate cross-referencing of EST diversity data developed here with EST
deletion-bin maps, the wheat diversity maps were superimposed on the deletion-bin maps (Additional file 2, Table S1).
The Ae. tauschii linkage map [30] and wheat deletion-bin maps share large numbers of loci, which facilitated
comparison of the two sets of maps. Only loci mapped by
linkage were used for these comparisons. Totals of 534,
654, and 646 ESTs on the wheat A-, B- and D-genome
deletion-bin maps were compared, respectively. The bin
location of a locus was considered incongruent between
the genetic and deletion-bin maps if it disagreed with the
order of recombination blocks (Additional file 2, Table
S1); the order of loci within recombination blocks was
disregarded. The known translocation differences involving chromosome 4A and chromosome arms 5AL and
7BS [33,34] were not considered. Because the genetic
maps of Ae. tauschii chromosomes are highly colinear
with the rice pseudomolecules (Additional file 2,
Table S1) most of the disagreements between the linkage
maps and deletion-bin maps would have to be due to
structural differences between wheat and Ae. tauschii
chromosomes or due to incompleteness or inconsistencies in the deletion-bin maps.
The Ae. tauschii linkage map portion of the diversity
maps (Additional file 2, Table S1) is expected to be
more consistent with the D-genome deletion-bin map
than the A- and B-genome deletion-bin maps because
the Ae. tauschii chromosomes are phylogenetically more
closely related to those of the wheat D genome than to
those of the wheat A and B genomes, and this was
indeed observed. While the locations of only 8.8% of the
loci on the D-genome deletion-bin maps were incongruent with the linkage map, 10.8 and 12.4% of the A- and
B-genome loci were incongruent (Table 5). The greatest
discrepancies relative to gene order in Ae. tauschii and
rice were encountered in chromosome arms 1AL, 5AS,
7AL, 1BL, 5BS, 4DL, and 7DS, and none were found in
chromosome arms 2BS, 2DS, 2DL, 3DS, and 5DL (Table
5 and Additional file 2, Table S1).
Table 3 Loci mapped on the basis of linkage and synteny and the total number of EST loci with estimated diversity (Div. loci) on the map
Chromosome  Length (cM)  Total loci mapped  Linkage mapped loci  Synteny mapped loci  Div. loci
1           180.5        212                148                  64                   125
2           186.9        171                92                   79                   142
3           197.0        240                186                  54                   151
4           127.4        198                107                  91                   155
5           171.2        133                86                   47                   113
6           149.3        205                119                  86                   145
7           154.5        195                132                  63                   156
Total       1166.8       1354               870                  484                  987

Table 4 Nucleotides sequenced, SNPs discovered, average number of haplotypes (H), and haplotype diversity (h)
Species             Genome  n   Nucl.   SNPs  Nucl./SNP  H       h
T. aestivum         A       13  351542  966   364        1.82c*  0.22c
T. aestivum         B       13  348938  1008  346        1.72c   0.21c
T. aestivum         D       13  349748  377   927        1.23d   0.06d
T. dicoccoides      A       10  351542  1516  232        2.05ab  0.31b
T. dicoccoides      B       10  348938  1979  176        2.22a   0.33b
Synthetic 6x wheat  D       9   349029  1727  202        2.39a   0.47a
* Means followed by the same letter are not significantly different at the 5%
probability level. Individual chromosome means (Table 9) were used as
variables in ANOVA.

Nucleotide diversity
From 609 to 704 genes with estimated diversity were
mapped in a genome × species combination (Table 6).
However, some of the loci were excluded from diversity
analyses because of small sample size or because of
unreasonably high diversity indicating the possibility of
orthologous or paralogous sequences being included in
a diversity estimate. The numbers of loci used for analyses of diversity were therefore lower (Table 6). Of the
analyzed loci, 305 (52%) and 296 (51%) were polymorphic in the A and B genomes of T. aestivum,
respectively, and 316 (54%) and 338 (59%) were polymorphic in the A and B genomes of wild emmer,
respectively (Table 6). Only 138 (20%) loci of the 679
analyzed in the T. aestivum D genome were polymorphic (Table 6). Because the same GSPs resulted in
the discovery of 477 (74%) SNP-bearing loci in the D
genome of synthetic wheats (Table 6), the low number
of polymorphic loci in the wheat D genome must be an
attribute of wheat, not of Ae. tauschii, its diploid source.
Genome-wide θw and θπ were similar between the
T. aestivum A and B genomes (Table 7). Both estimates
were higher than those in the T. aestivum D genome
(Table 7). The estimates were also similar between the
A and B genomes in wild emmer, which showed higher
diversity than the corresponding genomes in T. aestivum (Table 7).
Tajima’s D contrasts θw and θπ to detect differences
in the distribution of diversity relative to neutral expectations. The expectation for a neutral locus in a population is a Tajima’s D of zero. Positive values of Tajima’s
D indicate a paucity of rare alleles and a preponderance
of intermediate frequency alleles while negative values
indicate a preponderance of rare alleles and a paucity of
intermediate frequency alleles. Average Tajima’s D was
near zero in the A and B genomes of T. aestivum and
wild emmer but was negative in the T. aestivum D genome and positive in the Ae. tauschii genome present in
synthetic wheats (Table 7). The positive value of Tajima’s D in the D genome of synthetic wheats is very
likely due to strong subdivision of Ae. tauschii into two
major subpopulations. This subdivision has been
acknowledged taxonomically by elevating individuals of
the two subpopulations to subspecies, Ae. tauschii ssp.
strangulata and Ae. tauschii ssp. tauschii [35]. Estimates
of diversity at the replacement to silent codon sites in
the D genome were similar to those in Ae. tauschii and
differed in both genomes from those in the A and B
genomes of T. aestivum and wild emmer (Table 7).
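The relationship among θw, θπ, and Tajima's D can be illustrated with a short calculation. The sketch below uses the standard Watterson estimator, the mean number of pairwise differences, and the variance constants of Tajima's test; it is not the software used in this study, and the four aligned sequences are hypothetical. Both polymorphisms in the example are singletons, so the statistic comes out negative, in line with the interpretation of negative values given above.

```python
from itertools import combinations
from math import sqrt

def diversity_stats(seqs):
    """Return (theta_w, theta_pi, tajimas_d); thetas are per site.
    seqs: equal-length aligned sequences without gaps (assumption)."""
    n, length = len(seqs), len(seqs[0])
    # S: number of segregating (polymorphic) sites
    S = sum(1 for col in zip(*seqs) if len(set(col)) != 1)
    # k: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    # D is undefined when there are no segregating sites
    d = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1)) if S else float("nan")
    return S / a1 / length, k / length, d

# Hypothetical alignment: two segregating sites, each a singleton.
seqs = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "TCGTACGTAC"]
theta_w, theta_pi, d = diversity_stats(seqs)
print(round(theta_w, 4), round(theta_pi, 4), round(d, 2))
```

Because θw weights all segregating sites equally while θπ is dominated by alleles of intermediate frequency, an excess of rare alleles lowers θπ relative to θw and drives D below zero.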
Diversity among individual chromosomes
In the A genome of wild emmer and T. aestivum, diversity was lower in chromosome 4A than in the remaining
chromosomes (Table 8). This was true for diversity in
coding sequences and in replacement and silent codon
Table 5 Agreement between the locations of EST loci on the Ae. tauschii linkage map and wheat deletion-bin maps
             A genome                       B genome                       D genome
Chrom. arm   Total loci  % discordant loci  Total loci  % discordant loci  Total loci  % discordant loci
1S           62          6.5                60          20.0               73          6.8
1L           38          31.6               49          20.4               48          16.7
2S           24          8.3                36          0.0                26          0.0
2L           39          2.6                36          5.6                37          0.0
3S           48          2.1                48          14.6               45          0.0
3L
82
1.2
92
8.7
97