<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>1471-2148-8-45.fm</title>
<meta name="Author" content="Mosud.Ali"/>
<meta name="Creator" content="FrameMaker 8.0"/>
<meta name="Producer" content="Acrobat Distiller 8.1.0 (Windows)"/>
<meta name="CreationDate" content=""/>
</head>
<body>
<pre>
BMC Evolutionary Biology
BioMed Central
Open Access
Research article
Mitochondrial DNA structure in the Arabian Peninsula
Khaled K Abu-Amero1, José M Larruga2, Vicente M Cabrera2 and
Ana M González*2
Address: 1Mitochondrial Research Laboratory, Department of Genetics, King Faisal Specialist Hospital and Research Center, Riyadh, Saudi Arabia
and 2Department of Genetics, Faculty of Biology, University of La Laguna, Tenerife 38271, Spain
Email: Khaled K Abu-Amero - abuamero@gmail.com; José M Larruga - jlarruga@ull.es; Vicente M Cabrera - vcabrera@ull.es;
Ana M González* - amglez@ull.es
* Corresponding author
Published: 12 February 2008
BMC Evolutionary Biology 2008, 8:45
doi:10.1186/1471-2148-8-45
Received: 17 September 2007
Accepted: 12 February 2008
This article is available from: http://www.biomedcentral.com/1471-2148/8/45
© 2008 Abu-Amero et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Two potential migratory routes followed by modern humans to colonize Eurasia
from Africa have been proposed. These are the two natural passageways that connect the two
continents: the northern route through the Sinai Peninsula and the southern route across the Bab
al Mandab strait. Recent archaeological and genetic evidence has favored a single southern
coastal route. Under this scenario, studying the population genetic structure of the Arabian
Peninsula, the first step out of Africa, to search for primary genetic links between Africa and
Eurasia is crucial. The haploid and maternally inherited mitochondrial DNA (mtDNA) molecule has
been the most widely used genetic marker to identify and relate lineages with clear geographic
origins, such as the African L lineages and the Eurasian M and N lineages, which share a common
root with the African L3.
Results: To assess the role of the Arabian Peninsula in the southern route, we genetically analyzed
553 Saudi Arabs using partial (546) and complete (7) mtDNA sequencing, and compared the
lineages obtained with those present in Africa, the Near East, central, east and southeast Asia and
Australasia. The results showed that the Arabian Peninsula has received substantial gene flow from
Africa (20%), detected by the presence of L, M1 and U6 lineages, and that 18% of the Arabian
Peninsula lineages have a clear eastern provenance, mainly represented by U lineages but also by
Indian M lineages and rare M links with Central Asia, Indonesia and even Australia. However, the
bulk (62%) of the Arabian lineages has a northern source.
Conclusion: Although there is evidence of Neolithic and more recent expansions in the Arabian
Peninsula, mainly detected by (preHV)1 and J1b lineages, the lack of primitive autochthonous M and
N sequences suggests that this area has been more a receptor of human migrations, including
historic ones, from Africa, India, Indonesia and even Australia, than a demographic expansion
center along the proposed southern coastal route.
Background
The hypothesis that modern humans originated in Africa
and later migrated out to Eurasia, replacing archaic
humans there [1,2], has continued to gain support from genetic
contributions [3-6]. Anthropologically, the most ancient
presence of modern humans outside Africa has been documented in the Levant about 95–125 kya [7,8], and in Australia about 50–70 kya [9]. Based on archaeological [10]
and classic genetic studies [11,12], two dispersals from
Africa were proposed: a northern route that reached western and central Asia through the Near East, and a southern
route that, following the Asian coast, reached Australia. However, the ages
of these dispersals were very tentative. The first phylogeographic analysis using complete mtDNA genomic
sequences dated the out-of-Africa migrations to around
55–70 kya, when two branches, named M and N, of the
African macrohaplogroup L3 radiation supposedly began
the Eurasian colonization [5,6]. A more recent analysis,
based on a greater number of sequences, pushed back the
lower bound of the out-of-Africa migration, signaled by the
L3 radiation, to around 85 kya [13]. This date is not far
from the above-mentioned presence of modern humans
in the Levant about 100–125 kya. Interestingly, this
migration also fits with the putative presence of
modern humans on the Eritrean coast [14], and corresponds
to an interglacial period (OIS 5), when African faunas
expanded to the Levant [15]. After that, it seems that, at
least in the Levant, there was a long population
bottleneck, as there is no evidence of modern humans in the
area until 50 kyr later, again in a relatively warm period
(OIS 3). This contraction phase might be reflected in the
basal roots of the M and N lineages by the accumulation of 4
and 5 mutations, respectively, before their next radiation around 60 kya
[13].
Paradoxically, this expansion began in a glacial period
(OIS 4). During glacial stages, aridity in the Levant is thought to have been
a strong barrier to human expansions, so that
an alternative southern coastal route, crossing the Bab al
Mandab strait to Arabia, could have been preferred. Consequently, based on the phylogeographic distribution of the M
and N mtDNA clusters, with the latter prevalent in western Eurasia and the former more frequent in southern and
eastern Asia, it was proposed that two successive migrations out of Africa occurred, with M and N being the mitochondrial signals of the southern and northern routes
respectively [6]. Furthermore, the star radiation found for
the Indian and East Asian M lineages was taken as indicative of a very fast southern dispersal [6]. However, subsequent studies revealed the presence of autochthonous M
and N lineages all along the southern route, from South
Asia [16-21], through Malaysia [13], to Near Oceania
and Australia [22-26]. Accordingly, it was hypothesized
that both lineages were carried in a single migration
[27,28] and, moreover, that the southern coastal trail was
the only route, the western Eurasian colonization being
the result of an early offshoot of the southern radiation in
India [29,13]. Under these suppositions, the Arabian
Peninsula, as an obligatory step between East Africa and
South Asia, has gained crucial importance, and indeed
several mtDNA studies have recently been published for
this region [30-32]. However, it seems that the bulk of the
Arab mtDNA lineages have northern Neolithic or more
recent Asian or African origins. Although a newly defined
clade L6 in Yemenis, with no close matches in extant
African populations, could suggest an ancient migration
from Africa to Yemen [30], the lack of autochthonous N and/or M lineages left the southern route without
genetic support. It could be that unfavorable climatic conditions forced a fast migration through Arabia without
leaving a permanent track, but it is also possible that sample sizes have been insufficient to detect ancient residual
lineages in present-day Arab populations. To address
this last possibility we have enlarged our previous sample
of 120 Saudi Arabs [31] to 553, covering the main regions
of this country (Figure 1). In this sample we sequenced the
non-coding HVSI and HVSII mtDNA regions and unequivocally assigned the resulting haplotypes to haplogroups by analyzing diagnostic coding-region positions by
restriction fragment length polymorphisms (RFLP) or
fragment sequencing. Furthermore, when rare haplotypes
were found, we carried out genomic mtDNA sequencing
on them. In addition, the regional subdivision of the
Saudi samples and the analysis of the recently published
mtDNA data for Yemen [30] and for Yemen, Qatar, UAE
and Oman [32] allowed us to assess the population structure of the Arabian Peninsula and its relationships with
surrounding populations.
Results
A total of 365 different mtDNA haplotypes were observed
in 553 Saudi Arab sequences. Of these, 299 (82%) could
be distinguished using the HVSI sequence information alone, and the remaining 66 (18%) only when the HVSII information was
also taken into account. Additional analysis of diagnostic
positions allowed the unequivocal assignment of the
majority (96%) of the haplotypes to subhaplogroups
[see Additional file 1]. However, 11 haplotypes were classified only at the HV/R level, 3 were assigned to macrohaplogroups
L3*, M* and N* respectively, and only one was left
unclassified [see Additional file 1]. The most probable origin of these Saudi haplotypes deserves a more detailed
analysis.
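The haplotype counts quoted in this paragraph can be cross-checked with a few lines of
arithmetic. The following is only an illustrative sketch: the variable names are ours, and
it assumes that the 11, 3 and 1 haplotypes not resolved to subhaplogroup level are counted
out of the same 365 haplotypes.

```python
# Counts quoted in the Results text.
haplotypes_hvs1_only = 299      # distinguishable with HVSI alone
haplotypes_needing_hvs2 = 66    # resolved only when HVSII is added
total_haplotypes = haplotypes_hvs1_only + haplotypes_needing_hvs2

# Haplotypes not resolved to subhaplogroup level (assumed to be part of the 365).
hv_r_level = 11
macrohaplogroup_level = 3       # L3*, M* and N*
unclassified = 1
unresolved = hv_r_level + macrohaplogroup_level + unclassified


def pct(part, whole):
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / whole)


hvs1_share = pct(haplotypes_hvs1_only, total_haplotypes)
hvs2_share = pct(haplotypes_needing_hvs2, total_haplotypes)
resolved_share = pct(total_haplotypes - unresolved, total_haplotypes)

print(total_haplotypes, hvs1_share, hvs2_share, resolved_share)
```

Under these assumptions the totals agree with the reported 365 haplotypes and with the
82%, 18% and 96% figures.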
Macrohaplogroup L lineages
Sub-Saharan African L lineages in Saudi Arabia account for
10% of the total. χ² analyses showed that there is no significant regional differentiation within this country. However,
there is significant heterogeneity (p < 0.001) when all the
Arabian Peninsula countries are compared. This is mainly
due to the comparatively high frequency of sub-Saharan
lineages in Yemen (38%) compared to Oman-Qatar
(16%) and to Saudi Arabia-UAE (10%). Most probably,
the higher frequencies in the southern countries reflect
their greater proximity to Africa, from which they are separated only by the Bab
al Mandab strait. However, with regard to the relative
contribution of the different L haplogroups, Qatar, Saudi
Arabia and Yemen are highly similar in their L3 (34%),
L2 (36%) and L0 (21%) frequencies, whereas in Oman
and UAE the bulk of L lineages belongs to L3 (72%). In
this enlarged sample of Saudi Arabs, representatives of all
the recently defined East African haplogroups L4 [30], L5
[33], L6 [30] and L7 [34] have been found. The only L4
Saudi haplotype belongs to the L4a1 subclade defined by
the 16207T/C transversion. Although it has no exact matches,
its most closely related types are found in Ethiopia [30]. Four L5
lineages have been found in Saudi Arabia, but all have the
same haplotype, which belongs to the L5a1 subclade defined
in the HVSI region by the 16355–16362 motif [30]. It has
matches in Egypt and Ethiopia. L6 was found to be the most
abundant clade in Yemen [30]. It has now been detected
in Saudi Arabia, but only once. This haplotype (16048-16223-16224-16243-16278-16311) differs from all the
previous L6 lineages by the presence of mutation 16243.
In addition, it lacks the 16362 transition that is carried by
all L6 lineages from Yemen but has the ancestral 16048
mutation, which is absent in only one Yemeni lineage [30]. This
Saudi type adds L6 variability to Arabia, because until
now L6 was represented only by one very abundant and one
rare haplotype in Yemen. As for the most probable
geographic origin of the sub-Saharan African lineages in
Saudi Arabia, 33 (61%) have matches in East Africa and 7
(13%) in Central or West Africa, whereas the remaining 14
(26%) have not yet been found in Africa. Nevertheless,
half of the latter belong to haplogroups of West African
origin and the other half to haplogroups of East
African ascription [35,30]. It is supposed that the bulk of
these African lineages reached the area as a consequence of the
slave trade, but more ancient historic contacts with northeast Africa are also well documented [36,30,31].
Figure 1
Map of the Arabian Peninsula showing the Saudi regions and Arabian countries studied.
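As a consistency check on the figures given for the sub-Saharan lineages, the three origin
counts can be summed and compared with the overall sample. This is a sketch only: it
assumes that the three counts together cover all L lineages in the 553 Saudi sequences,
which the text implies but does not state outright, and the dictionary keys are our own labels.

```python
total_sequences = 553

# Origin counts for the Saudi L lineages, as quoted in the text.
origin_counts = {
    "East Africa": 33,
    "Central or West Africa": 7,
    "not yet found in Africa": 14,
}

# Assumption: these three categories together make up every L lineage sampled.
l_total = sum(origin_counts.values())

# Share of each origin among the L lineages, rounded as in the text.
origin_pct = {name: round(100 * n / l_total) for name, n in origin_counts.items()}

# Share of L lineages in the whole Saudi sample.
l_share_of_sample = round(100 * l_total / total_sequences)

print(l_total, origin_pct, l_share_of_sample)
```

Under that assumption the 54 lineages reproduce both the 61%, 13% and 26% breakdown and
the overall 10% frequency of L lineages.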
Macrohaplogroup M lineages
M lineages in Saudi Arabia account for 7% of the total.
Half of them belong to the African M1 clade. There is no
significant heterogeneity among regions within Saudi Arabia or
among Arabian Peninsula countries for the total M frequency. However, when we compared the frequency of
the African clade M1 against that of the other M clades of
Asian provenance, it was significantly greater in the western
Arabian Peninsula regions than in the east (χ² = 12.53, d.f. = 4,
p < 0.05).
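The reported significance level can be checked from the test statistic alone. The sketch
below is not the authors' procedure: it simply evaluates the upper tail of a chi-square
distribution using the closed form that holds for an even number of degrees of freedom,
so that only the Python standard library is needed. The function name is ours.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df < 2 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Statistic and degrees of freedom quoted in the text.
p_value = chi2_sf_even_df(12.53, 4)
print(f"p = {p_value:.4f}")
```

The resulting value is about 0.014, which agrees with the reported p < 0.05.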
Inclusion of rare Saudi and other published African M1 sequences
into the M1 genomic phylogenetic tree
Recent phylogenetic and phylogeographic analyses of this
haplogroup [30,37,38] have suggested that the M1a1 subclade (following the nomenclature of Olivieri et al. [37])
is particularly abundant and diverse in Ethiopia, and M1b
in northwest Africa and the European and African Mediterranean areas. Other M1a subclades have a more generalized African distribution. Half of the M1 lineages in
Saudi Arabia belong to the Ethiopian M1a1 subclade, and
the same proportion holds for other Arabian Peninsula
countries [30,32]. However, as a few M1 haplotypes did
not fit in the M1a1 cluster, we carried out genome sequencing for
two of them (Figure 2). Lineage 471 proved to be a member of the North African clade M1b, more specifically of
the M1b1a branch. As we have detected another M1b lineage in Jordan [38], it is possible that the Saudi one
reached Arabia from the Levant or from northwest
African areas. The second Saudi lineage (522) belongs to
a subcluster (M1a4) that is also frequent in East Africa
[37]. Recently, Tanzanian lineages have been studied by
means of complete mtDNA sequences [39]. Three of these
sequences also fall into the M1 haplogroup. Two of them
belong to the Ethiopian M1a1 subclade (God 626 and
God 635), and the third (God 637) shares the entire motif
that characterizes lineage M1a5 [37] with the exception of
transition 10694. Therefore, this mutation should define
a new subcluster M1a5a (Figure 2). The lineages found in
Tanzania further expand, southeastwards, the geographic
range of M1 in sub-Saharan Africa. Inspecting the M1 phylogeny of Olivieri et al. [37], we realized that our lineage
957 [38] has the diagnostic positions 13637, which defines
M1a3, and 6463, which defines the M1a3a branch. Therefore,
we have placed it as an M1a3a lineage with an 813 retromutation (Figure 2). It seems that, as with the L lineages, the
M1 presence in the Arabian Peninsula signals a predominant East African influence with possible minor introductions from the Levant.
Inclusion of rare Saudi Asiatic M sequences into the
macrohaplogroup M tree
The majority (12) of the 19 M lineages found in the Arabian Peninsula that do not belong to M1 [see Additional
file 1] have matches in or are related to Indian clades, which
confirms previous results [30,31]. In addition, in this
expanded Saudi sample, we have found some sequences
with geographic origins far away from the studied area.
For instance, lineage 569 [see Additional file 1] has been
classified in the East Asian subclade G2a1a [40], but
it probably reached Saudi Arabia from Central Asia,
where this branch is rather common and diverse [41].
The four sequences 196, 479, 480 and 494
are undoubtedly Q1 members and must have originated in Indonesia. In fact, their most closely related haplotypes were found in
West New Guinea [42]. All these sequences could have
arrived in Arabia as a result of recent gene flow. The preferential
migration of Indonesian women to Saudi Arabia as domestic workers is particularly well documented [43]. Five undefined M lineages were genome sequenced (Figure 3). It is
confirmed that 5 of the 6 Saudi lineages analyzed
also have Indian roots. Lineage 691 falls into the Indian M33
clade because it has the diagnostic 2361 transition. In
addition, it shares 7 transitions (462, 5423, 8562, 13731,
15908, 16169, 16172) with the Indian lineage C182 [20],
which allows the definition of a new subclade M33a. Lineage 287 is a member of the Indian M36 clade because it
possesses its three diagnostic mutations (239, 7271,
15110). As it also shares 8 additional positions with the
Indian lineage T135 [20], the two form an M36a branch
(Figure 3). Saudi 514 belongs to the Indian clade M30, as
it has its diagnostic motif (195A-514dCA-12007-15431).
Lineage 633 also belongs to the related Indian clade M4b,
defined by transitions 511, 12007 and 16311. In addition,
it shares mutation 8865 with the C51 Indian lineage [20]
that could define a new M4b2 subclade. We have classified sequence 551 as belonging to a new Indian clade M48,
defined by a motif of four transitions (1598-5460-10750-16192), which is shared with the Indian M lineage R58
(Figure 3). The Australian clade M42 [44] and the New Britain
clade M29 [24] also have the 1598 transition as a basal mutation. However, they are more closely related to the
East Asian clade M10 [40] and to the Melanesian Q clade
[27], respectively, as their additional shared basal mutations are less
recurrent than transition 1598 [45]. All these Indian M
sequences have been found in Arabia as isolated lineages
that belong to clusters with deep roots and high diversity
in India. Therefore, their presence in Arabia is better
explained by recent backflow from India than by supposing
that these lineages are footprints of an ancestral M
migration across Arabia.
Figure 2
Phylogenetic tree based on complete M1 sequences. Numbers along links refer to nucleotide positions. A, C, G indicate
transversions; "d" deletions and "i" insertions. Recurrent mutations are underlined. Regions not analyzed are in parentheses.
Star differs from rCRS [62,63] at positions: 73, 263, 303i, 311i, 489, 750, 1438, 2706, 4769, 7028, 8701, 8860, 9540, 10398,
10400, 10873, 11719, 12705, 14766, 14783, 15043, 15301, 15326, 16223 and 16519. GenBank accession numbers of the subjects
retrieved from the literature are: 43 Oli (EF060354), 30 Oli (EF060341) and 40 Oli (EF060351) from Olivieri et al. [37];
626 God (EF184626), 635 God (EF184635) and 637 God (EF184637) from Gonder et al. [39]; and 957 Goz (DQ779926) from
González et al. [38].
Figure 3
Phylogenetic tree based on complete M sequences. Numbers along links refer to nucleotide positions. A, C, T indicate
transversions; "d" deletions and "i" insertions. Recurrent mutations are underlined. Regions not analyzed are in parentheses.
Star differs from rCRS [62,63] at positions: 73, 263, 303i, 311i, 489, 750, 1438, 2706, 4769, 7028, 8701, 8860, 9540, 10398,
10400, 10873, 11719, 12705, 14766, 14783, 15043, 15301, 15326, 16223 and 16519. GenBank accession numbers of the subjects
retrieved from the literature are: C182 Sun (AY922276), T135 Sun (AY922287), R58 Sun (AY922299), C56 Sun
(AY922274), and C51 Sun (AY922261) from Sun et al. [20]; ON96 Tan (AP008599) from Tanaka et al. [40]; 45 VHP
(DQ404445) from van Holst Pellekaan et al. [25]; DQ07 Mer (DQ137407), DQ99 Mer (DQ137399) from Merriwether et al.
[24]; AY85 Ing (AY289085) from Ingman and Gyllensten [22]; Au38 Hud (EF495222) from Hudjashov et al. [26]; and 201 Abu
(DQ904234) from Abu-Amero et al. [31].
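The assignment of the Saudi M lineages to Indian clades rests on checking whether a
sequence carries every mutation of a clade's diagnostic motif. A minimal sketch of that
check is given below. It is a simplification, not the procedure used in this study: the
motifs are those quoted in the text, the function name and the example haplotype are
hypothetical, and real assignments must also weigh the phylogeny and recurrent mutations.

```python
# Diagnostic motifs as quoted in the text (differences from rCRS).
DIAGNOSTIC_MOTIFS = {
    "M33": {"2361"},
    "M36": {"239", "7271", "15110"},
    "M30": {"195A", "514dCA", "12007", "15431"},
    "M4b": {"511", "12007", "16311"},
    "M48": {"1598", "5460", "10750", "16192"},
}


def candidate_clades(mutations):
    """Return the clades whose complete diagnostic motif occurs in `mutations`."""
    observed = set(mutations)
    return sorted(
        clade
        for clade, motif in DIAGNOSTIC_MOTIFS.items()
        if motif.issubset(observed)
    )


# Hypothetical haplotype for illustration only (not a sequence from the study).
example = ["239", "7271", "15110", "16311"]
print(candidate_clades(example))
```

A sequence carrying only part of a motif, such as 12007 and 16311 without 511, is left
unassigned by this check, which is why sequences of that kind call for complete sequencing.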
The Saudi sequence 201 deserves special mention (Figure
3). It was previously tentatively related to the Indian M34
clade because both share the 3010 transition. However, it
was stated that, owing to the high recurrence of 3010, the 201 sequence most
probably belongs to an as yet undefined clade [31]. The recent study of new Australian lineages [26] has allowed us to identify an interesting link
between their Australian M14 lineage and our Saudi 201
sequence (Figure 3). The authors related M14 to the Melanesian clade M28 [24] because both share the