<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://jenkins.tei-c.org/job/TEIP5/lastSuccessfulBuild/artifact/P5/release/xml/tei/odd/p5.nvdl" type="application/xml" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:sch="http://purl.oclc.org/dsdl/schematron">
<teiHeader>
<fileDesc>
<titleStmt>
<title>An Introduction to TEI simplePrint</title>
</titleStmt>
<publicationStmt>
<p>For distribution from TEI </p>
</publicationStmt>
<sourceDesc>
<p>This document was derived from the tutorial for TEI Lite, with substantial additions and
modifications carried out within the TEI Simple project. It was also revised considerably
for approval by the TEI Council.</p>
</sourceDesc>
</fileDesc>
<encodingDesc>
<p>The following definitions provide support for the style of rendition URI recommended by the
Simple documentation. For them to be effective in a given document, it will be necessary to
include in that document a prefix definition which specifies the location of the Simple
documentation (i.e. the present document) in its <att>replacementPattern</att> attribute. A
suggested URI is given below.</p>
<listPrefixDef>
<prefixDef ident="simple" matchPattern="([a-z]+)"
replacementPattern="http://www.tei-c.org/Simple/tei_simple.odd/#$1"/>
</listPrefixDef>
<tagsDecl>
<rendition xml:id="allcaps">text-transform: uppercase;</rendition>
<rendition xml:id="blackletter">font-family: fantasy;</rendition>
<rendition xml:id="bold">font-weight: bold;</rendition>
<rendition xml:id="bottombraced">padding-bottom: 2pt; border-bottom: dashed gray
2pt;</rendition>
<rendition xml:id="block">display:block;</rendition>
<rendition xml:id="boxed">padding: 2pt; border: solid black 1pt;</rendition>
<rendition xml:id="centre">text-align: center;</rendition>
<rendition xml:id="cursive">font-family: cursive;</rendition>
<rendition xml:id="doublestrikethrough">text-decoration: line-through; color:
red;</rendition>
<rendition xml:id="doubleunderline">text-decoration: underline; color: red;</rendition>
<rendition xml:id="dropcap">font-size : 6em; font-family: cursive; font-weight : bold;
vertical-align: top; height: 1em; line-height: 1em; float : left; width : 1em; color :
#c00; margin: 0em; padding: 0px;</rendition>
<rendition xml:id="float">float:right; display: block; font-size: smaller; clear: right;
padding: 4pt; width: 15%; </rendition>
<rendition xml:id="hyphen"/>
<rendition xml:id="inline">display:inline;</rendition>
<rendition xml:id="italic">font-style: italic;</rendition>
<rendition xml:id="justify">text-align: justify;</rendition>
<rendition xml:id="larger">font-size: larger;</rendition>
<rendition xml:id="left">text-align: left;</rendition>
<rendition xml:id="leftbraced">padding-left: 2pt; border-left: dotted gray 2pt; </rendition>
<rendition xml:id="letterspace">letter-spacing: 0.5em;</rendition>
<rendition xml:id="literal">font-family:monospace; white-space:pre;</rendition>
<rendition xml:id="normalstyle">font-style: normal;</rendition>
<rendition xml:id="normalweight">font-weight:normal;</rendition>
<rendition xml:id="right">text-align: right;</rendition>
<rendition xml:id="rightbraced">padding-right: 2pt; border-right: dotted gray 2pt; </rendition>
<rendition xml:id="rotateleft">-webkit-transform: rotate(90deg); transform:
rotate(90deg);</rendition>
<rendition xml:id="rotateright">-webkit-transform: rotate(-90deg); transform:
rotate(-90deg);</rendition>
<rendition xml:id="rules">border: 1px solid black; padding:
2px;border-collapse:collapse;border-spacing:0;</rendition>
<rendition xml:id="smallcaps">font-variant: small-caps;</rendition>
<rendition xml:id="smaller">font-size: smaller;</rendition>
<rendition xml:id="strikethrough">text-decoration: line-through;</rendition>
<rendition xml:id="subscript">vertical-align: bottom; font-size: smaller;</rendition>
<rendition xml:id="superscript">vertical-align: super; font-size: smaller;</rendition>
<rendition xml:id="topbraced">padding-top: 2pt; border-top: dotted gray 2pt; </rendition>
<rendition xml:id="typewriter">font-family:monospace;</rendition>
<rendition xml:id="underline">text-decoration: underline;</rendition>
<rendition xml:id="wavyunderline">text-decoration: underline; text-decoration-style:
wavy;</rendition>
</tagsDecl>
</encodingDesc>
<revisionDesc>
<change when="2022-06-17" who="ebleeker">Removed elements charName and glyphName from
charDecl, since these two elements have been removed as from the 4.4.0
release.</change>
<change when="2019-06-28" who="mholmes sbauman">Eliminate cases of multiple *Spec elements for
the same <att>ident</att> by combining them manually</change>
<change when="2016-11-17">Implement name change finally agreed by Council</change>
<change when="2016-10-26">Remove authority,sponsor,funder,principal; unicodeName localName;
add discussion of charDecl</change>
<change when="2016-10-12">More last minute changes to verbiage; checking examples for
validity</change>
<change when="2016-07-28">Check procmodtab; do att classes; re-insert valLists; add some
graphics; proofread first half</change>
<change when="2016-07-23">Rerun checks: remove space</change>
<change when="2016-07-23">Beef up facsimile</change>
<change when="2016-07-18">Add some prose at front; add particDesc listPerson and person </change>
<change when="2016-06-29">Add JC hack for rendition; drastically revise proc mod
section</change>
<change when="2016-06-28">Add hdr sections on listPrefixDef, tagUsage, abstract, postscript,
xenodata</change>
<change when="2016-06-22">Continue adding discussions for all textual elements</change>
<change when="2016-06-01"> Various minor tweaks to former Lite material. </change>
<change when="2016-05-31">Rewrite Jane Eyre discussion a bit. Revise mentions of resourceLike
elts. Remove weird PIs in valLists. Revise q/quote section to make sense.</change>
<change when="2016-05-19">Remove initial project desc, XML intro, Simple propaganda, etc. Roll
up sleeves. </change>
</revisionDesc>
</teiHeader>
<text>
<front>
<titlePage>
<docTitle>
<titlePart type="main">An Introduction to TEI simplePrint</titlePart>
</docTitle>
<docAuthor>Lou Burnard</docAuthor>
<docAuthor>Martin Mueller</docAuthor>
<docAuthor>Sebastian Rahtz</docAuthor>
<docAuthor>James Cummings</docAuthor>
<docAuthor>Magdalena Turska</docAuthor>
<docDate>January 2017</docDate>
</titlePage>
<div>
<head>Preface</head>
<p>This document is the formal specification for TEI simplePrint, an entry-level
customization of the Text Encoding Initiative (TEI) Guidelines, intended to be generally
useful to a large variety of encoders attempting to cope with the standardized
representation of a variety of documents in digital form.</p>
<p>Like every other TEI customization, TEI simplePrint was designed for use with a
particular type of material. If the material you are planning to encode matches the
following criteria, then TEI simplePrint is for you. If it does not, it may not be. <list>
<item>You are encoding print material, rather than manuscript: simplePrint provides no
way of encoding manuscript features such as correction, deletion, or scribal
variation.</item>
<item>You are encoding material from the Early Modern period (i.e., up to the end of the
nineteenth century): some of the features for which simplePrint provides encodings are
rarely found in modern materials.</item>
<item>You are encoding material written, broadly speaking, within the Western European
tradition, using largely Western European characters. simplePrint does provide
facilities for encoding short passages in non-Western European languages, but many
features needed to cope with Asian or ancient scripts are missing. </item>
<item>Your intention is to provide a relatively simple encoding for a large amount of
material, rather than a rich encoding of a small amount of material: simplePrint is
intended to help libraries and archives wishing to go beyond basic digital facsimiles,
rather than to support specialist research. It does not, for example, include features
for detailed linguistic tagging beyond simple word-level tagging, nor for specialised
text types such as dictionaries, historical or biographical databases, etc. </item>
</list> If your needs go beyond those summarized here, simplePrint may still be a good
point of departure, and may be very useful as a basis for the creation of your own TEI
customization. We do not, however, discuss the creation of a TEI customization in this
document: the TEI website provides a number of links to tutorial material and tools which
may assist in this process. </p>
<p>The present document is intended to be generally comprehensible and accessible, but does
assume some knowledge of XML (the encoding language used by the TEI), and of the way it is
used by the TEI. Further information on both these topics is available from many places,
not least the TEI's own web site at <ptr target="http://www.tei-c.org"/>.</p>
<p>The TEI simplePrint schema was first elaborated as a part of the TEI Simple project
funded by the <ref target="https://mellon.org/">Andrew W. Mellon Foundation</ref>
(2012-2014). The project sought to define a new <soCalled>highly-constrained and
prescriptive subset</soCalled> of the Text Encoding Initiative (TEI) Guidelines suited
to the representation of early modern print materials, a formally-defined set of
processing rules which permit modern web applications to easily present and analyze the
encoded texts, mapping to other ontologies, and processes to describe the encoding status
and richness of a TEI digital text. Its choice of elements reflected the practices
followed in the encoding of large-scale literary archives, notably those produced by the
Text Creation Partnership. Practice of other comparable archives such as the German Text
Archive was also taken into account.</p>
<p>The most distinctive feature of TEI simplePrint is its use of the TEI Processing Model,
which provides explicit and recommended options for the display or processing of every
textual element. Programmers developing systems to handle texts encoded with TEI
simplePrint do not have to look beyond this when building stylesheets or other components.
This greatly reduces the complexity of developing applications that will work reliably and
consistently for many users and across large corpora of documents.</p>
<p>The TEI simplePrint schema and the TEI Processing Model were first defined by a working
group led by Martin Mueller (Northwestern University) and Sebastian Rahtz (Oxford
University). Major contributions to the project were made by Magdalena Turska (Oxford
University), James Cummings (Oxford University), and Brian Pytlik Zillig. The changes to
the TEI scheme needed to support the TEI Processing Model were reviewed and approved by
the TEI Technical Council for inclusion in release 3.0.0 of TEI P5 in February 2016. The
present document was extensively revised and extended by Lou Burnard in July 2016 for
submission to the TEI Technical Council. </p>
</div>
</front>
<body>
<!-- <xi:include href="out/elementList.xml"/> -->
<div xml:id="Simple-eg">
<head>A Short Example</head>
<p>We begin with a short example. How should we go about transferring into a computer a
passage of prose, such as the start of the last chapter of Charlotte Brontë's novel
<title>Jane Eyre</title>? We might start by simply copying what we see on the printed
page, typing it in such a way that what appears on the screen looks as similar as
possible, for example, by retaining the original line breaks, by introducing blanks to
represent the layout of the original headings, page breaks, and paragraphs, and so forth.
Of course, the possibilities are limited by the nature of the computer program we use to
capture the text: it may not be possible for example to reflect accurately the typographic
characteristics of our source with all such software. Some characters in the printed text
(such as the accented letter <mentioned>a</mentioned> in <mentioned>faàl</mentioned> or
the long dash) may not be available on the keyboard; some typographic distinctions (such
as that between small capitals and full capitals) may not be readily accessible. Our first
attempt tries to mimic the appearance of the former, and simply ignores the latter.</p>
<p>
<eg xml:space="preserve">
CHAPTER 38
READER, I married him. A quiet wedding we had: he and I, the par-
son and clerk, were alone present. When we got back from church, I
went into the kitchen of the manor-house, where Mary was cooking
the dinner, and John cleaning the knives, and I said --
'Mary, I have been married to Mr Rochester this morning.' The
housekeeper and her husband were of that decent, phlegmatic
order of people, to whom one may at any time safely communicate a
remarkable piece of news without incurring the danger of having
one's ears pierced by some shrill ejaculation and subsequently stunned
by a torrent of wordy wonderment. Mary did look up, and she did
stare at me; the ladle with which she was basting a pair of chickens
roasting at the fire, did for some three minutes hang suspended in air,
and for the same space of time John's knives also had rest from the
polishing process; but Mary, bending again over the roast, said only --
'Have you, miss? Well, for sure!'
A short time after she pursued, 'I seed you go out with the master,
but I didn't know you were gone to church to be wed'; and she
basted away. John, when I turned to him, was grinning from ear to
ear.
'I telled Mary how it would be,' he said: 'I knew what Mr Ed-
ward' (John was an old servant, and had known his master when he
was the cadet of the house, therefore he often gave him his Christian
name) -- 'I knew what Mr Edward would do; and I was certain he
would not wait long either: and he's done right, for aught I know. I
wish you joy, miss!' and he politely pulled his forelock.
'Thank you, John. Mr Rochester told me to give you and Mary
this.'
I put into his hand a five-pound note. Without waiting to hear
more, I left the kitchen. In passing the door of that sanctum some time
after, I caught the words --
'She'll happen do better for him nor ony o' t' grand ladies.' And
again, 'If she ben't one o' th' handsomest, she's noan faa\l, and varry
good-natured; and i' his een she's fair beautiful, onybody may see
that.'
I wrote to Moor House and to Cambridge immediately, to say what
I had done: fully explaining also why I had thus acted. Diana and
474
JANE EYRE 475
Mary approved the step unreservedly. Diana announced that she
would just give me time to get over the honeymoon, and then she
would come and see me.
'She had better not wait till then, Jane,' said Mr Rochester, when I
read her letter to him; 'if she does, she will be too late, for our honey-
moon will shine our life long: its beams will only fade over your
grave or mine.'
How St John received the news I don't know: he never answered
the letter in which I communicated it: yet six months after he wrote
to me, without, however, mentioning Mr Rochester's name or allud-
ing to my marriage. His letter was then calm, and though very serious,
kind. He has maintained a regular, though not very frequent correspond-
ence ever since: he hopes I am happy, and trusts I am not of those who
live without God in the world, and only mind earthly things.
</eg>
</p>
<p>This transcription suffers from a number of shortcomings: <list>
<item>the page numbers and running titles are intermingled with the text in a way which
makes it difficult for software to distinguish them;</item>
<item>no distinction is made between single quotation marks and apostrophe, so it is
difficult to be certain exactly which passages are in direct speech;</item>
<item>the preservation of the copy text's hyphenation means that simple-minded search
programs will not find words broken across a line;</item>
<item>the accented letter in <mentioned>faàl</mentioned> and the long dash have been
rendered by ad hoc keying conventions (<mentioned>faa\l</mentioned>) which follow no
standard pattern and will be processed correctly only if the transcriber remembers to
mention them in the documentation;</item>
<item>paragraph divisions are marked only by the use of white space, and hard carriage
returns have been introduced at the end of each line. Consequently, if the size of
type used to display the text changes, reformatting will be problematic.</item>
</list></p>
<p>We now present the same passage, as it might be encoded in TEI simplePrint. As we shall
see, there are many ways in which this encoding could be extended, but as a minimum, the
TEI approach allows us to represent the following distinctions in a standardized way: <list>
<item>Paragraph and chapter divisions are now marked explicitly by means of tags rather
than implicitly by white space.</item>
<item>Apostrophes are retained, but the quotation marks indicating direct speech have
been removed, and direct speech is now marked explicitly by means of a tag. </item>
<item>The accented letter and the long dash are accurately represented, using the
appropriate Unicode character.</item>
<item>Page divisions have been marked with an empty <gi>pb</gi> tag, which records the
page number; the running title has been suppressed. </item>
<item>The lineation of the original has also been suppressed and words broken by
typographic accident at the end of a line have been re-assembled without
comment.</item>
<item>For convenience of proof reading, a new line has been introduced at the start of
each paragraph, but the indentation is removed.</item>
</list>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<pb n="474"/>
<div type="chapter" n="38">
<p>Reader, I married him. A quiet wedding we had: he and I, the parson and clerk, were
alone present. When we got back from church, I went into the kitchen of the
manor-house, where Mary was cooking the dinner, and John cleaning the knives, and I
said —</p>
<p><q>Mary, I have been married to Mr Rochester this morning.</q> The housekeeper and
her husband were of that decent, phlegmatic order of people, to whom one may at any
time safely communicate a remarkable piece of news without incurring the danger of
having one's ears pierced by some shrill ejaculation and subsequently stunned by a
torrent of wordy wonderment. Mary did look up, and she did stare at me; the ladle
with which she was basting a pair of chickens roasting at the fire, did for some
three minutes hang suspended in air, and for the same space of time John's knives
also had rest from the polishing process; but Mary, bending again over the roast,
said only —</p>
<p><q>Have you, miss? Well, for sure!</q></p>
<p>A short time after she pursued, <q>I seed you go out with the master, but I didn't
know you were gone to church to be wed</q>; and she basted away. John, when I
turned to him, was grinning from ear to ear. <q>I telled Mary how it would be,</q>
he said: <q>I knew what Mr Edward</q> (John was an old servant, and had known his
master when he was the cadet of the house, therefore he often gave him his Christian
name) — <q>I knew what Mr Edward would do; and I was certain he would not wait long
either: and he's done right, for aught I know. I wish you joy, miss!</q> and he
politely pulled his forelock.</p>
<p><q>Thank you, John. Mr Rochester told me to give you and Mary this.</q></p>
<p>I put into his hand a five-pound note. Without waiting to hear more, I left the
kitchen. In passing the door of that sanctum some time after, I caught the words
—</p>
<p><q>She'll happen do better for him nor ony o' t' grand ladies.</q> And again, <q>If
she ben't one o' th' handsomest, she's noan faàl, and varry good-natured; and i'
his een she's fair beautiful, onybody may see that.</q></p>
<p>I wrote to Moor House and to Cambridge immediately, to say what I had done: fully
explaining also why I had thus acted. Diana and <pb n="475"/> Mary approved the step
unreservedly. Diana announced that she would just give me time to get over the
honeymoon, and then she would come and see me.</p>
<p><q>She had better not wait till then, Jane,</q> said Mr Rochester, when I read her
letter to him; <q>if she does, she will be too late, for our honeymoon will shine
our life long: its beams will only fade over your grave or mine.</q></p>
<p>How St John received the news I don't know: he never answered the letter in which I
communicated it: yet six months after he wrote to me, without, however, mentioning
Mr Rochester's name or alluding to my marriage. His letter was then calm, and though
very serious, kind. He has maintained a regular, though not very frequent
correspondence ever since: he hopes I am happy, and trusts I am not of those who
live without God in the world, and only mind earthly things.</p>
</div></egXML>
</p>
<p>This encoding is expressed in TEI XML, a very widely used and standardized method of
representing information about a document within the document itself. The transcribed
words are complemented by special flags within angle brackets, called <term>tags</term>,
which both characterise and mark the beginning and end of a string of characters. For
example, each paragraph is marked by a tag <tag>p</tag> at its start, and a corresponding
<tag>/p</tag> at its end. We don't elaborate further on the syntax of TEI XML here.
<note place="foot">Many introductory tutorials on XML are available on the web, for
example at <ptr target="http://www.w3schools.com/xml/"/>. The way the TEI uses XML is
fully documented in <ref
target="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/SG.html">section v of the
TEI Guidelines</ref>; a very basic introduction to TEI XML is also available at <ptr
target="http://www.ultraslavonic.info/intro-to-xml"/>. The formal specification of the
XML language is at <ptr target="http://www.w3.org/TR/xml"/>.</note>
</p>
<p>Aside from its syntax, it is important to note that this particular encoding represents a
set of choices or priorities. We have chosen to prioritize and simplify the representation
of the words of the text over the representation of the typographic layout associated with
them in this source document. This makes it easier for a computer to answer questions
about the words in the document than about its typesetting, reflecting our research
priorities. This priority also leads us to suppress end-of-line hyphenation. Conceivably
Brontë (or her printer) intended the word <q>honeymoon</q> to appear as <q>honey-moon</q>
on its second appearance, though this seems unlikely: our decision to focus on Brontë's
text, rather than on the printing of it in this particular edition, makes it impossible to
be certain. Similarly, our decision makes it impossible to use this transcription as a
means of statistically analysing hyphenation practice. An encoding makes explicit all and
only those textual features of importance to the encoder.</p>
<p>It is not difficult to think of ways in which the encoding of even this short passage
might readily be extended to address other research priorities. For example: <list
type="simple">
<item>a regularized form of the passages in dialect could be provided; </item>
<item>footnotes glossing or commenting on any passage could be added;</item>
<item>pointers linking parts of this text to others could be added;</item>
<item>proper names of various kinds could be distinguished from the surrounding
text;</item>
<item>names could be classified as personal, geographical, or institutional;</item>
<item>detailed bibliographic information about the text's provenance and context could
be prefixed to it;</item>
<item>a linguistic analysis of the passage into sentences, clauses, words, etc., could
be provided, each unit being associated with appropriate category codes;</item>
<item>the text could be segmented into narrative or discourse units;</item>
<item>systematic analysis or interpretation of the text could be included in the
encoding, with potentially complex alignment or linkage between the text and the
analysis, or between the text and one or more translations of it;</item>
<item>passages in the text could be linked to images or sound held on other
media.</item>
</list></p>
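<p>As an illustration of the first and fourth of these possibilities, the following sketch
regularizes one dialect word and marks two place names. It assumes that the elements
<gi>choice</gi>, <gi>orig</gi>, <gi>reg</gi>, and <gi>name</gi> are available in the schema
in use; the regularized reading <mentioned>any</mentioned> and the value <val>place</val>
are our own illustrative choices rather than anything found in the source: <egXML
xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<p><q>She'll happen do better for him nor <choice><orig>ony</orig><reg>any</reg></choice>
o' t' grand ladies.</q></p>
<p>I wrote to <name type="place">Moor House</name> and to <name type="place">Cambridge</name>
immediately, to say what I had done <!-- [ remainder of paragraph ... ] --></p>
</egXML></p>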
<p>In the remainder of this document, we present a number of TEI-recommended ways of
supporting these and other encoding requirements. These ways generally involve the
application of specific TEI XML elements, selected from the full range of possibilities
documented in the TEI <title>Guidelines</title>. Like every other TEI project, TEI simplePrint
proposes a view of the TEI Guidelines. This document defines and documents that view. </p>
</div>
<div xml:id="Simple-struc">
<head>The Structure of a TEI simplePrint Document</head>
<p>A TEI-conformant text contains (a) a <term>TEI header</term> (marked up as a
<gi>teiHeader</gi> element) and (b) one or more representations of a text. These
representations may be of three kinds: a transcribed text, marked up as a <gi>text</gi>
element; a collection of digital images representing the text, marked up using a
<gi>facsimile</gi> element; or a literal transcription of one or more documents
instantiating the text, marked up using the <gi>sourceDoc</gi> element.</p>
<p>These elements are combined together to form a single <gi>TEI</gi> element, which must be
declared within the TEI <term>namespace</term>, and therefore usually takes the form
<tag>TEI xmlns="http://www.tei-c.org/ns/1.0"</tag>
<note place="foot">A <term>namespace</term> is an XML concept. Its function is to identify
the vocabulary from which a group of element names are drawn, using a standard
identifier resembling a web address. The namespace for TEI elements is
<val>http://www.tei-c.org/ns/1.0</val></note>.</p>
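<p>As a sketch of how these parts might fit together, the following outline combines a header
with both a <gi>facsimile</gi> and a <gi>text</gi>. The image file name is hypothetical, and
the namespace declaration is omitted for brevity: <egXML
xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><TEI>
<teiHeader><!-- [ TEI Header information ] --></teiHeader>
<facsimile>
<!-- hypothetical image of one page of the source -->
<graphic url="page474.png"/>
</facsimile>
<text>
<body>
<!-- [ transcription of the text ... ] -->
</body>
</text>
</TEI></egXML>
</p>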
<p>Some aspects of the TEI header are described in more detail in section <ptr
target="#Simple-header"/>. In what follows, we will focus chiefly on the use of the
<gi>text</gi> element, though we describe one way of using the <gi>facsimile</gi>
element in combination with it or alone in section <ptr target="#Simple-fax"/>. We do not
consider the <gi>sourceDoc</gi> element further, since it is mainly used in very
specialised applications for which TEI simplePrint would not be appropriate. </p>
<p>A text may be <term>unitary</term> (a single work) or <term>composite</term> (a
collection of single works, such as an anthology). In either case, the text may have
optional <term>front</term> or <term>back</term> matter such as title pages, prefaces,
appendixes etc. We use the term <term>body</term> for whatever comes between these in the
source document. We discuss various kinds of composite text in section <ptr
target="#Simple-composite"/> below.</p>
<p>A unitary text will be encoded using an overall structure like this: <egXML
xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><TEI>
<teiHeader><!-- [ TEI Header information ] --></teiHeader>
<text>
<front><!-- [ front matter ... ] -->
</front>
<body>
<!-- [ body of text ... ] -->
</body>
<back><!-- [ back matter ... ] -->
</back>
</text>
</TEI></egXML>
</p>
<p>In each of the following sections we include a short list of the TEI
<term>elements</term> under discussion, along with a brief description, and in most
cases an example of how they are used. Throughout the text, element names are linked to
their detailed reference documentation, as given in the TEI Guidelines. Note that most of
the examples provided by the reference documentation, and all of the links, are not
specific to TEI simplePrint. </p>
<p>For example, here are the elements discussed so far: <specList>
<specDesc key="TEI"/>
<specDesc key="teiHeader"/>
<specDesc key="text"/>
<specDesc key="facsimile"/>
</specList></p>
</div>
<div xml:id="Simple-body">
<head>Encoding the Body</head>
<p>As indicated above, a unitary text is encoded by means of a <gi>text</gi> element, which
may contain the following elements: <specList>
<specDesc key="front"/>
<specDesc key="group"/>
<specDesc key="body"/>
<specDesc key="back"/>
</specList> Elements specific to front and back matter are described below in section <ptr
target="#Simple-fronbac"/>. In this section we discuss the elements making up the body
of a text. A text must always have a body.</p>
<div xml:id="divs">
<head>Text Division Elements and Global Attributes</head>
<p>The body of a prose text may be just a series of paragraphs or similar blocks of text,
or these may be grouped together into chapters, sections, subsections, etc. The
<gi>div</gi> element is used to represent any such grouping of blocks. <specList>
<specDesc key="div" atts="type"/>
</specList>
</p>
<p>The <att>type</att> attribute on the <gi>div</gi> element may be used to supply a
conventional name for the category of text division concerned, in order to distinguish one
kind of division from another.
Typical values might be <val>book</val>, <val>chapter</val>, <val>section</val>,
<val>part</val>, <val>poem</val>, <val>song</val>, etc. TEI simplePrint does not
constrain the range of values that may be used here. </p>
<p>A <gi>div</gi> element may itself contain further, nested, <gi>div</gi>s, thus
mimicking the traditional structure of a book, which can be decomposed hierarchically
into units such as parts, containing chapters, containing sections, and so on. TEI texts
in general conform to this simple hierarchic model.</p>
<p>Here as elsewhere the <att>xml:id</att> attribute may be used to supply a unique
identifier for the division, which may be used for cross references or other links to
it, such as a commentary, as further discussed in section <ptr target="#Simple-ptrs"/>.
It is good practice to provide an <att>xml:id</att> attribute for every major structural
unit in a text, and to derive its values in some systematic way, for example by
appending a section number to a short code for the title of the work in question, as in
the examples below. </p>
<p>The <att>n</att> attribute may be used to supply (additionally or alternatively) a
short mnemonic name or number for a division, or any other element. If a conventional
form of reference or abbreviation for the parts of a work already exists (such as the
book/chapter/verse pattern of Biblical citations), the <att>n</att> attribute is the
place to record it; unlike the identifier supplied by the <att>xml:id</att> attribute,
it does not need to be unique.</p>
<p>The <att>xml:lang</att> attribute may be used to specify the language of the division.
Languages are identified by an internationally defined code, as further discussed in
section <ptr target="#z636"/> below.</p>
<p>The <att>rendition</att> attribute may be used to supply information about the
rendition (appearance) of a division, or any other element, as further discussed in
section <ptr target="#Simple-hilites"/> below. Note that this attribute is used to
describe the appearance of the <emph>source</emph> text, rather than the appearance of
any intended output when the encoded text is displayed. The two may of course be
similar, or identical, but the TEI does not assume or require this.</p>
<p>These four attributes, <att>xml:id</att>, <att>n</att>, <att>xml:lang</att>, and
<att>rendition</att> are so widely useful that they are allowed on any element in any
TEI schema: they are called <term>global attributes</term>. Other attributes defined in
the TEI simplePrint schema are discussed in section <ptr target="#xatts"/>.</p>
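<p>As a minimal sketch of how the four global attributes might be combined on a single
division, consider the following. The identifier, number, and language shown here are
invented for illustration, and the <val>simple:italic</val> value assumes a division
printed in italic throughout: <egXML xmlns="http://www.tei-c.org/ns/Examples"
valid="feasible"><div xml:id="EX01" n="1" xml:lang="fr" rendition="simple:italic"
type="chapter">
<!-- ... -->
</div></egXML>
</p>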
<p>As noted above, the value of every <att>xml:id</att> attribute must be unique within a
document. One simple way of ensuring this is to make it reflect the hierarchic structure
of the document. For example, Smith's <title>Wealth of Nations</title> as first
published consists of five books, each of which is divided into chapters, while some
chapters are further subdivided into parts. We might define <att>xml:id</att> values for
this structure as follows: <egXML xmlns="http://www.tei-c.org/ns/Examples"
valid="feasible"><body>
<div xml:id="WN1" n="I" type="book">
<div xml:id="WN101" n="I.1" type="chapter">
<!-- ... -->
</div>
<div xml:id="WN102" n="I.2" type="chapter">
<!-- ... -->
</div>
<!-- ... -->
<div xml:id="WN110" n="I.10" type="chapter">
<div xml:id="WN1101" n="I.10.1" type="part">
<!-- ... -->
</div>
<div xml:id="WN1102" n="I.10.2" type="part">
<!-- ... -->
</div>
</div>
<!-- ... -->
</div>
<div xml:id="WN2" n="II" type="book">
<!-- ... -->
</div>
</body></egXML>
</p>
<p>A different numbering scheme may be used for <att>xml:id</att> and <att>n</att>
attributes: this is often useful where a canonical reference scheme is used which does
not tally with the structure of the work. For example, in a novel divided into volumes
each containing chapters, where the chapters are numbered sequentially through the whole
work, rather than within each volume, one might use a scheme such as the following: <egXML
xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><body>
<div xml:id="TS01" n="1" type="volume">
<div xml:id="TS011" n="1" type="chapter">
<!-- ... -->
</div>
<div xml:id="TS012" n="2" type="chapter">
<!-- ... --></div>
</div>
<div xml:id="TS02" n="2" type="volume">
<div xml:id="TS021" n="3" type="chapter">
<!-- ... --></div>
<div xml:id="TS022" n="4" type="chapter">
<!-- ... --></div>
</div>
</body></egXML> Here the work has two volumes, each containing two chapters. The
chapters are numbered conventionally 1 to 4, but the <att>xml:id</att> values specified
allow them to be regarded additionally as if they were numbered 1.1, 1.2, 2.1, 2.2.</p>
</div>
<div xml:id="h25">
<head>Headings and Closings</head>
<p>Every <gi>div</gi> may have a title or heading at its start, and (less commonly) a
trailer such as <q>End of Chapter 1</q> at its end. The following elements may be used
to transcribe them: <specList>
<specDesc key="head"/>
<specDesc key="trailer"/>
</specList> Some other elements which may be found at the beginning or ending of text
divisions are discussed below in section <ptr target="#h52"/>.</p>
<p>Whether or not headings and trailers are included in a transcription is a matter for
the individual transcriber to decide. Where a heading is completely regular (for example
<q>Chapter 1</q>) or may be automatically constructed from attribute values (e.g.
<tag>div type="chapter" n="1"</tag>), it may be omitted; where it contains otherwise
unrecoverable text it should always be included. For example, the start of Hardy's
<title>Under the Greenwood Tree</title> might be encoded as follows: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><div xml:id="UGT1" n="Winter" type="part">
<div xml:id="UGT101" n="1" type="chapter">
<head>Mellstock-Lane</head>
<p>To dwellers in a wood almost every species of tree ... </p>
</div>
</div></egXML>
</p>
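<p>Where a division ends with a trailer, it might be transcribed along the following
lines. This is a schematic sketch only: the identifier, heading, and trailer text are
invented rather than taken from a real source. <egXML
xmlns="http://www.tei-c.org/ns/Examples"><div xml:id="EX101" n="1" type="chapter">
<head>Chapter 1</head>
<p> ... </p>
<trailer>End of Chapter 1</trailer>
</div></egXML>
</p>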
</div>
<div xml:id="vedr">
<head>Textual Components</head>
<p>In prose texts such as the Brontë example above, the divisions are generally composed
of paragraphs, represented as <gi>p</gi> elements, though in some circumstances it may
be preferred to use the <soCalled>anonymous block</soCalled> element <gi>ab</gi>. In
poetic or dramatic texts different elements are used, representing stanzas and verse
lines in the first case, and individual speeches or stage directions in the second: <specList>
<specDesc key="p"/>
<specDesc key="ab"/>
<specDesc key="l"/>
<specDesc key="lg"/>
<specDesc key="sp"/>
<specDesc key="speaker"/>
<specDesc key="stage"/>
</specList>
</p>
<p>We discuss each of these kinds of component separately below.</p>
<div>
<head>Verse</head>
<p>Here, for example, is the start of a poetic text in which verse lines and stanzas are
tagged: <egXML xmlns="http://www.tei-c.org/ns/Examples"><lg n="I">
<l>I Sing the progresse of a deathlesse soule,</l>
<l>Whom Fate, with God made, but doth not controule,</l>
<!-- ... -->
<l>A worke t'out weare Seths pillars, bricke and stone,</l>
<l>And (holy writs excepted) made to yeeld to none,</l>
</lg></egXML>
</p>
<p>Note that the <gi>l</gi> element marks verse lines, not typographic lines: here, as
elsewhere, the original lineation of the source text is therefore not preserved by this
encoding. The <gi>lb</gi> element described in section <ptr target="#Simple-pln"/>
might additionally be used to mark typographic lines if so desired. </p>
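<p>For instance, if the second verse line of the example above were printed across two
typographic lines, the break might be recorded as follows. The position of the break is
an assumption made purely for illustration and is not taken from any particular edition:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><lg n="I">
<l>I Sing the progresse of a deathlesse soule,</l>
<!-- the lb marks the start of a new typographic line within one verse line -->
<l>Whom Fate, with God made, <lb/>but doth not controule,</l>
<!-- ... -->
</lg></egXML>
</p>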
<p>In a poetic text it may also be considered useful to identify the rhymes, for which
the following element may be used: <specList>
<specDesc key="rhyme" atts="label"/>
</specList> The following example shows how this element might be used both to
identify rhyming words or word parts and to assign each rhyme to a part of a rhyming
pattern by means of its <att>label</att> attribute. The rhyming pattern here is
specified by the <att>rhyme</att> attribute supplied on the <gi>lg</gi> representing
the stanza within which the pattern operates: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><lg rhyme="AABCCBBA">
<l>The sunlight on the <rhyme label="A">garden</rhyme></l>
<l><rhyme label="A">Harden</rhyme>s and grows <rhyme label="B">cold</rhyme>,</l>
<l>We cannot cage the <rhyme label="C">minute</rhyme></l>
<l>Wi<rhyme label="C">thin it</rhyme>s nets of <rhyme label="B">gold</rhyme>
</l>
<l>When all is <rhyme label="B">told</rhyme>
</l>
<l>We cannot beg for <rhyme label="A">pardon</rhyme>.</l>
</lg>
</egXML> The <att>rhyme</att> attribute may be used independently of the
<gi>rhyme</gi> element, or in combination with it, as above. </p>
</div>
<div>
<head>Drama</head>
<p>A dramatic text contains speeches, which may be in prose or verse, and will also
contain stage directions. The <gi>sp</gi> element is used to represent each identified
speech. It contains an optional speaker indication, marked with the <gi>speaker</gi>
element, which can be followed by one or more <gi>l</gi> or <gi>p</gi> elements,
depending on whether the speech is considered to be in verse or in prose. Stage
directions, whether within or between speeches, are marked using the <gi>stage</gi>
element. </p>
<p>For example: <egXML xmlns="http://www.tei-c.org/ns/Examples">
<sp>
<speaker>Vladimir</speaker>
<p>Pull on your trousers.</p>
</sp>
<sp>
<speaker>Estragon</speaker>
<p>You want me to pull off my trousers?</p>
</sp>
<sp>
<speaker>Vladimir</speaker>
<p>Pull <hi>on</hi> your trousers.</p>
</sp>
<sp>
<speaker>Estragon</speaker>
<p><stage>(realizing his trousers are down)</stage>. True</p>
</sp>
<stage>He pulls up his trousers</stage>
<sp>
<speaker>Vladimir</speaker>
<p>Well? Shall we go?</p>
</sp>
<sp>
<speaker>Estragon</speaker>
<p>Yes, let's go.</p>
</sp>
<stage>They do not move.</stage>
</egXML>
</p>
<p>In a verse drama, it is quite common to find that verse lines are split between
speakers. The easiest way of encoding this is to use the <att>part</att> attribute to
indicate that the lines so fragmented are incomplete: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><div type="Act" n="I">
<head>ACT I</head>
<div type="Scene" n="1">
<head>SCENE I</head>
<stage rendition="#italic">Enter Barnardo and Francisco, two Sentinels, at
several doors</stage>
<sp>
<speaker>Barn</speaker>
<l part="Y">Who's there?</l>
</sp>
<sp>
<speaker>Fran</speaker>
<l>Nay, answer me. Stand and unfold yourself.</l>
</sp>
<sp>
<speaker>Barn</speaker>
<l part="I">Long live the King!</l>
</sp>
<sp>
<speaker>Fran</speaker>
<l part="M">Barnardo?</l>
</sp>
<sp>
<speaker>Barn</speaker>
<l part="F">He.</l>
</sp>
<sp>
<speaker>Fran</speaker>
<l>You come most carefully upon your hour.</l>
</sp>
<!-- ... -->
</div>
</div></egXML> The value of the <att>part</att> attribute may indicate just that the
element bearing it is a fragment in some (unspecified) respect, rather than a complete
verse line (<code>part="Y"</code>); alternatively it may indicate whether this is an
initial (<val>I</val>), medial (<val>M</val>), or final (<val>F</val>) fragment. </p>
<p>The same mechanism may be applied to stanzas which are divided between two speakers:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div>
<sp>
<speaker>First voice</speaker>
<lg type="stanza" part="I">
<l>But why drives on that ship so fast</l>
<l>Withouten wave or wind?</l>
</lg>
</sp>
<sp>
<speaker>Second Voice</speaker>
<lg type="stanza" part="F">
<l>The air is cut away before.</l>
<l>And closes from behind.</l>
</lg>
</sp>
<!-- ... -->
</div></egXML>
</p>
<p>The <gi>sp</gi> element can also be used for dialogue presented in a prose work as if
it were drama, as in the next example, which also demonstrates the use of the
<att>who</att> attribute to bear a code identifying the speaker of the piece of
dialogue concerned: <egXML xmlns="http://www.tei-c.org/ns/Examples"><div>
<sp who="#OPI">
<speaker>The Reverend Doctor Opimian</speaker>
<p>I do not think I have named a single unpresentable fish.</p>
</sp>
<sp who="#GRM">
<speaker>Mr Gryll</speaker>
<p>Bream, Doctor: there is not much to be said for bream.</p>
</sp>
<sp who="#OPI">
<speaker>The Reverend Doctor Opimian</speaker>
<p>On the contrary, sir, I think there is much to be said for him. In the first
place....</p>
<p>Fish, Miss Gryll -- I could discourse to you on fish by the hour: but for the
present I will forbear.</p>
</sp>
</div></egXML> Here the <att>who</att> attribute values (<val>#OPI</val> etc.) are
links, pointing to items in a list of the characters in the novel. In the case of a
play, this list of characters might appear in the original source as a cast list or
dramatis personae, which might be marked up using the <gi>castList</gi> element
described in section <ptr target="#specfronbac"/> below. Such a list would not, of
course, be an appropriate place for descriptive information about each character
which does not appear in the original source. Instead a <gi>particDesc</gi>
(participant description) element should be provided in the TEI header, as further
discussed in section <ptr target="#profDesc"/> below. </p>
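<p>As a rough sketch, the targets of the <att>who</att> values used above might be
supplied in the header along the following lines. We assume here that the
<gi>listPerson</gi> and <gi>person</gi> elements are available in the schema being used,
and the content given for each person is deliberately minimal: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><particDesc>
<listPerson>
<!-- each xml:id value is the target of a who attribute in the text -->
<person xml:id="OPI">
<p>The Reverend Doctor Opimian</p>
</person>
<person xml:id="GRM">
<p>Mr Gryll</p>
</person>
</listPerson>
</particDesc></egXML>
</p>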
</div>
<div xml:id="abseg">
<head>Other Kinds of Text Block</head>
<p>As mentioned above, the <gi>ab</gi> element may also be used in preference to the
<gi>p</gi> element. It should be used for blocks of text which are not clearly
paragraphs, verse lines, or dramatic speeches. Typical examples include the canonical
verses of the Bible, and the textual blocks of other ancient documents which predate
the invention of the paragraph, such as Greek inscriptions or Egyptian hieroglyphs.
The element is also useful as a means of encoding more specialized kinds of textual
block, such as the question and answer structure of a catechism, or the highly
formalized substructure of a legal document (if <gi>div</gi> is not considered
appropriate for these). In more modern documents, it can be used to encode
semi-organized or fragmentary materials such as an artist's notebook or work in
progress; or to faithfully capture the substructure of a file produced by an OCR
system. </p>
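<p>For example, two canonical verses of the Bible might be encoded as follows. This is a
sketch only: the <att>type</att> values are our own choice, since the schema does not
prescribe them, and the <att>n</att> attribute carries the conventional verse number:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="chapter" n="1">
<ab type="verse" n="1">In the beginning God created the heaven and the earth.</ab>
<ab type="verse" n="2">And the earth was without form, and void; and darkness was
upon the face of the deep. And the Spirit of God moved upon the face of the
waters.</ab>
<!-- ... -->
</div></egXML>
</p>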
</div>
</div>
<div xml:id="Simple-pln">
<head>Page and Line Numbers</head>
<p>Page and line breaks etc. may be marked with the following elements:<specList>
<specDesc key="pb"/>
<specDesc key="lb"/>
<specDesc key="cb"/>
<specDesc key="milestone"/>
<specDesc key="fw"/>
</specList> The <gi>pb</gi>, <gi>lb</gi>, and <gi>cb</gi> elements are special cases of
a general class of elements known as <term>milestone</term>s because they mark reference
points within a text. The generic <gi>milestone</gi> element can mark any kind of
reference point: for example, a column break, the start of a new kind of section not
otherwise tagged, a change of author or style, or in general any significant change in
the text not enclosed by an XML element. Unlike other elements, milestone elements do
not enclose a piece of text and make an assertion about it; instead they indicate a
point in the text where something changes, as indicated by a change in the values of the
milestone's attributes <att>unit</att>, which indicates the <q>something</q> concerned,
and <att>n</att> which indicates the new value. </p>
<p>The <gi>pb</gi>, <gi>lb</gi>, and <gi>cb</gi> elements are shortcuts or <term>syntactic
sugar</term> for <tag>milestone unit="page"/</tag>,
<tag>milestone unit="line"/</tag>, and <tag>milestone unit="column"/</tag> respectively. </p>
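<p>As a sketch of the generic element in use, suppose that a new section of the source
begins at a point which is not otherwise tagged, for example because it is signalled only
by extra white space. The <att>unit</att> value and the number given here are
illustrative assumptions: <egXML xmlns="http://www.tei-c.org/ns/Examples">
<p> ... </p>
<!-- the second section starts here, although no element encloses it -->
<milestone unit="section" n="2"/>
<p> ... </p>
</egXML>
</p>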
<p>When working from a paginated original, it is often useful to record its pagination,
whether to simplify later proof-reading, or to align the transcribed text with a set of
page images, as further discussed below.</p>
<p>Because <gi>pb</gi> and other milestone elements are empty, they may be placed freely
within or between other elements. However, it is recommended practice always to put them
at the <emph>beginning</emph> of whatever unit it is that their presence implies, and
not to nest them within elements contained by that unit. For example, in the following
example a page break occurs between two lines of a poem: <egXML
xmlns="http://www.tei-c.org/ns/Examples">
<l>Mary had a little lamb</l>
<pb n="13"/>
<l>Its fleece was white as snow</l>
</egXML> The <gi>pb</gi> element should be placed ahead of all the text encoded on the
13th page. Contrast this with the following less accurate encoding: <egXML
xmlns="http://www.tei-c.org/ns/Examples">
<l>Mary had a little lamb</l>
<l>
<pb n="13"/>Its fleece was white as snow</l>
</egXML> This is less accurate because it implies that the second verse line actually
begins before the page break. </p>
<p>Similar considerations apply to line breaks (<gi>lb</gi>), though these are less
frequently considered useful when encoding modern printed textual sources. When
transcribing manuscripts or early printed books, however, it is often helpful to retain
them in an encoding, if only to facilitate alignment of transcription and original. Like
<gi>pb</gi>, the <gi>lb</gi> element should appear <emph>before</emph> the text of the
line whose start it signals.</p>
<p>If features such as pagination or lineation are marked for more than one edition, the
edition in question may be specified by the <att>ed</att> attribute. For example, in the
following passage we indicate where the page breaks occur in two different editions
(<val>ED1</val> and <val>ED2</val>): <egXML xmlns="http://www.tei-c.org/ns/Examples"
><p>I wrote to Moor House and to Cambridge immediately, to say what I had done:
fully explaining also why I had thus acted. Diana and <pb ed="ED1" n="475"/> Mary
approved the step unreservedly. Diana announced that she would <pb ed="ED2" n="485"
/>just give me time to get over the honeymoon, and then she would come and see
me.</p></egXML>
</p>
<p>When transcribing from a paginated source, the encoder must decide whether to suppress
such features as running titles, page signatures, catch words etc., to replace them by a
simplified representation using the <gi>pb</gi> element, perhaps using the <att>n</att>
attribute to preserve some of the information, or to preserve them entirely using the
<gi>fw</gi> element. The last of these strategies is appropriate in encodings which aim to
retain as much information as possible about the original typography; it will, however,
complicate the processing of the source for other purposes. It is used in the following
example: <egXML xmlns="http://www.tei-c.org/ns/Examples"><l>He also fix'd the wandering
QUEEN OF NIGHT,</l>
<fw type="sig">Ii 2</fw>
<fw type="catch">Whether</fw>
<pb n="244"/>
<l>Whether she wanes into a scanty orb</l>...<!-- Thomson, Seasons, 1730-->
</egXML>
</p>
<p>The <gi>pb</gi> element is also used to align parts of a transcription with a digital
image of the page concerned. This may be done in a very simple but inflexible way by
using the <att>facs</att> attribute to point to each page image concerned: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><p>I wrote to Moor House and to Cambridge
immediately, to say what I had done: fully explaining also why I had thus acted.
Diana and <pb ed="ED1" n="475" facs="ed1p475.png"/> Mary approved the step
unreservedly... </p></egXML> The <att>facs</att> attribute can supply (as here) a
filename, or any other form of URI, if for example the page image is stored remotely.
One drawback of this simplistic approach is that there must be exactly one image file
per page of text. It is not therefore suitable in the case where the available page
images represent double page spreads, or where there are multiple images of the same
page (for example at different resolutions). </p>
<p>A more powerful approach, discussed in section <ptr target="#Simple-fax"/> below, is to
use the <gi>facsimile</gi> element to define the organisation of the set of images
representing the text, and then use the <att>facs</att> attribute to point to individual
components of that representation. </p>
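<p>In outline, and assuming that the <gi>surface</gi> and <gi>graphic</gi> elements are
used in the way described in that section, such an encoding might look like the
following sketch. The identifier and the second file name are invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<facsimile>
<!-- one surface per page, here with two images of the same page -->
<surface xml:id="ED1-475">
<graphic url="ed1p475.png"/>
<graphic url="ed1p475-large.png"/>
</surface>
</facsimile>
<!-- ... -->
<p> ... Diana and <pb ed="ED1" n="475" facs="#ED1-475"/> Mary approved the step
unreservedly... </p>
</egXML> Because the <att>facs</att> attribute now points to a <gi>surface</gi> rather
than to a single file, the number of images available for each page is no longer
constrained. </p>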
</div>
<div xml:id="Simple-hilites">
<head>Marking Highlighted Phrases</head>
<div xml:id="faces">
<head>Changes of Typeface, etc.</head>
<p>Highlighted words or phrases are those made visibly different from the rest of the
text, typically by a change of type font, handwriting style, ink colour etc., which is
intended to draw the reader's attention to some associated change.</p>
<p>The global <att>rendition</att> attribute can be attached to any element, and used
wherever necessary to specify details of the highlighting used for it in the source.
For example, a heading rendered in bold might be tagged <tag>head
rendition="simple:bold"</tag>, and one in italic <tag>head
rendition="simple:italic"</tag>.</p>
<p>The values used for the <att>rendition</att> attribute point to definitions provided
for the formatting concerned. These definitions are typically provided by a
<gi>rendition</gi> element in the document's header, as further discussed in section
<ptr target="#hdr-rend"/>. </p>
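<p>For example, a locally defined rendition might be declared and then referenced as in
the following sketch. The identifier <val>sc</val>, the style rule, and the heading text
are illustrative assumptions rather than part of the TEI simplePrint schema: <egXML
xmlns="http://www.tei-c.org/ns/Examples">
<tagsDecl>
<!-- the definition, supplied once in the header -->
<rendition xml:id="sc" scheme="css">font-variant: small-caps;</rendition>
</tagsDecl>
<!-- ... -->
<!-- a heading in the text which points to that definition -->
<head rendition="#sc">Chapter the First</head>
</egXML>
</p>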
<p>It is not always possible or desirable to interpret the reasons for such changes of
rendering in a text. In such cases, the element <gi>hi</gi> may be used to mark a
sequence of highlighted text without making any claim as to its status. <specList>
<specDesc key="hi"/>
</specList></p>
<p>In the following example, the use of a distinct typeface for the subheading and for
the included name are recorded but not interpreted: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><p><hi rendition="simple:blackletter">And
this Indenture further witnesseth</hi> that the said <hi
rendition="simple:italic">Walter Shandy</hi>, merchant, in consideration of the
said intended marriage ...</p></egXML>
</p>
<p>Alternatively, where the cause for the highlighting can be identified with
confidence, a number of other, more specific, elements are available. <specList>
<!-- <specDesc key="emph"/>-->
<specDesc key="foreign"/>
<!--<specDesc key="gloss"/>-->
<specDesc key="label"/>
<!--<specDesc key="mentioned"/>-->
<!--<specDesc key="term"/>-->
<specDesc key="title"/>
</specList></p>
<p>Some features (notably quotations, <!--and glosses--> titles, and foreign words) may
be found in a text either marked by highlighting, or with quotation marks. In either
case, the element <gi>q</gi>
<!--and
<gi>gloss</gi>--> (as discussed in the following section) should be used.
Again, the global <att>rendition</att> attribute can be used to record details of the
highlighting used in the source if this is thought useful. </p>
<p>As an example of the elements defined here, consider the following sentence: <q
rendition="simple:display">On the one hand the <hi rendition="simple:italic"
>Nibelungenlied</hi> is associated with the new rise of romance of twelfth-century
France, the <hi rendition="simple:italic">romans d'antiquité</hi>, the romances of
Chrétien de Troyes, and the German adaptations of these works by Heinrich van
Veldeke, Hartmann von Aue, and Wolfram von Eschenbach.</q> Interpreting the role of
the highlighting, the sentence might be encoded as follows: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><p>On the one hand the
<title>Nibelungenlied</title> is associated with the new rise of romance of
twelfth-century France, the <foreign>romans d'antiquité</foreign>, the romances of
Chrétien de Troyes, ...</p></egXML> Describing only the appearance of the
original, it might be encoded like this: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><p>On the one hand the <hi
rendition="simple:italic">Nibelungenlied</hi> is associated with the new rise of
romance of twelfth-century France, the <hi rendition="simple:italic">romans
d'antiquité</hi>, the romances of Chrétien de Troyes, ...</p></egXML>
</p>
</div>
<div xml:id="z635">
<head>Quotations and Related Features</head>
<p>Like changes of typeface, quotation marks are conventionally used to denote several
different features within a text, of which the most frequent is quotation, though many
other features are possible. The full TEI Guidelines provide additional elements such
as <gi>mentioned</gi> or <gi>said</gi> to distinguish some of these features, but
these more specialised elements are not included in TEI simplePrint. Instead, we
use the <gi>quote</gi> element for quotation only, and the <gi>q</gi>
element for all other material found within quotation marks in the text.<specList>
<specDesc key="q"/>
<specDesc key="quote"/>
<!--specDesc key="said"/-->
<!--specDesc key="mentioned"/>-->
<!--specDesc key="soCalled"/>-->
<!--<specDesc key="gloss"/>-->
</specList>
</p>
<p>Here is a simple example of a quotation: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><p>Few dictionary makers are likely to
forget Dr. Johnson's description of the lexicographer as <quote>a harmless
drudge.</quote></p></egXML>
</p>
<p>As elsewhere, the way that a citation or quotation was printed (for example,
<term>in-line</term> or set off as a <hi>display</hi> or <hi>block quotation</hi>),
may be represented using the <att>rendition</att> attribute. This may also be used to
indicate the kind of quotation marks used.</p>
<p>Direct speech interrupted by a narrator can be represented simply by ending the
<gi>q</gi> element and beginning it again after the interruption, as in the
following example: <egXML xmlns="http://www.tei-c.org/ns/Examples"><p><q>Who-e debel
you?</q> — he at last said — <q>you no speak-e, damme, I kill-e.</q> And so
saying, the lighted tomahawk began flourishing about me in the dark.</p></egXML>
If it is important to convey the idea that the two <gi>q</gi> elements together make
up a single speech, the linking attributes <att>next</att> and <att>prev</att> may be
used, as described in section <ptr target="#xatts"/>.</p>
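<p>A minimal sketch of such linking, applied to the same passage, follows; the
<att>xml:id</att> values are invented for illustration: <egXML
xmlns="http://www.tei-c.org/ns/Examples"><p><q xml:id="EXq1" next="#EXq2">Who-e debel
you?</q>
<!-- the narrator's interruption, as above -->
<q xml:id="EXq2" prev="#EXq1">you no speak-e, damme, I kill-e.</q> ... </p></egXML>
Each fragment points to its neighbour, so that a processor can reassemble the complete
speech if required.</p>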
<p>Direct speech may be accompanied by a reference to the source or speaker, using the
<att>who</att> attribute, whether or not this is explicit in the text, as in the
following example: <egXML xmlns="http://www.tei-c.org/ns/Examples"><q who="#Wilson"
>Spaulding, he came down into the office just this day eight weeks with this very
paper in his hand, and he says:—<q who="#Spaulding">I wish to the Lord, Mr.
Wilson, that I was a red-headed man.</q></q></egXML> This example also
demonstrates how quotations may be embedded within other quotations: one speaker
(Wilson) quotes another speaker (Spaulding).</p>
<p>The creator of the electronic text must decide whether quotation marks are replaced
by the tags or whether the tags are added and the quotation marks kept. If the
quotation marks are removed from the text, the <att>rendition</att> attribute may be
used to record the way in which they were rendered in the copy text.</p>
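<p>The two policies might be sketched as follows, using the quotation from Johnson given
above. We assume, for the sake of the example, that the source prints it within double
quotation marks; the pointer <val>#dq</val> is hypothetical, and stands for a rendition
definition which would be supplied in the header: <egXML
xmlns="http://www.tei-c.org/ns/Examples">
<!-- policy 1: quotation marks kept in the transcription -->
<p> ... the lexicographer as <quote>“a harmless drudge.”</quote></p>
<!-- policy 2: quotation marks removed, their form recorded by rendition -->
<p> ... the lexicographer as <quote rendition="#dq">a harmless drudge.</quote></p>
</egXML> Whichever policy is chosen should be applied consistently throughout a
text.</p>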
<!-- example please -->
</div>
<div xml:id="z636">
<head>Foreign Words or Expressions</head>
<p>Words, phrases, or longer stretches of text that are not in the main language of the
text may be tagged as such in one of two ways. The global <att>xml:lang</att>
attribute may be attached to any element to show that it uses some other language than
that of the surrounding text. Where there is no applicable element, the element
<gi>foreign</gi> may be used, again using the <att>xml:lang</att> attribute. For
example: <egXML xmlns="http://www.tei-c.org/ns/Examples"><p>John has real <foreign
xml:lang="fr">savoir-faire</foreign>.</p><p>Have you read <title xml:lang="de"
>Die Dreigroschenoper</title>?</p></egXML>
</p>
<p>As these examples show, the <gi>foreign</gi> element should not be used to tag
foreign words if some other more specific element, such as <gi>title</gi> or
<gi>div</gi>, applies. </p>
<p>The value of the <att>xml:lang</att> attribute on an element applies hierarchically
to everything contained by that element, unless overridden:</p>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<div xml:lang="la">
<p>Pars haec Latine composita est.</p>
<p xml:lang="en">Except that this sentence is in English.</p>
<p>Vita brevis, ars longa.</p>
</div>
</egXML>
<p>Here we specify that the whole <gi>div</gi> element uses the language with the coded
identifier <val>la</val>, i.e. Latin. Since it is contained by that <gi>div</gi>, there
is no need to supply this information again for the first <gi>p</gi> element. The
second <gi>p</gi> element, however, overrides this value, and indicates that its content
is in English (the language with identifier <val>en</val>). The third <gi>p</gi>
element is again in Latin.</p>
<p>The codes used to identify languages, supplied on the <att>xml:lang</att> attribute,
are defined by an international standard<note place="foot">The relevant Internet
standard is <title>Best Current Practice 47</title> (<ptr
target="http://tools.ietf.org/html/bcp47"/>). The authoritative list of registered
subtags is maintained by IANA and is available at <ptr