Exact history was lost before Sept. 18th, 2000, but old source code comments show that Writer core dates back to at least November 1990.
inc
: headers available to all source files inside the module
qa
: unit, slow and subsequent tests
sdi
source
: see below
uiconfig
: user interface configuration
util
: UNO passive registration config
core
: Writer core (document model, layout, UNO API implementation)
filter
: Writer internal filters
ascii
: plain text filter
basflt
docx
: wrapper for the UNO DOCX import filter (in writerfilter) for autotext purposes
html
: HTML filter
inc
: include files for filters
rtf
: thin copy&paste helper around the UNO RTF import filter (in writerfilter)
writer
ww8
: DOC import, DOC/DOCX/RTF export
xml
: ODF import/export, subclassed from xmloff (where most of the work is done)
uibase
: user interface (those parts that are linked into sw & always loaded)
ui
: user interface (optional parts that are loaded on demand (swui))
There is good overview documentation of the basic architecture of Writer core in the OOo wiki:
- https://wiki.openoffice.org/wiki/Writer/Core_And_Layout
- https://wiki.openoffice.org/wiki/Writer/Text_Formatting
Writer specific WhichIds are defined in sw/inc/hintids.hxx.
The sections below mainly cover details that are missing from the wiki pages.
The central class is SwDoc, which represents a document.
A lot of the functionality is split out into separate Manager classes,
each of which implements some IDocument* interface; there are
SwDoc::getIDocument*() methods to retrieve the managers.
However, there are still too many members and methods in this class, many of which could be moved to some Manager or other...
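The manager split described above can be pictured with a minimal sketch. Every name in it (IToyDocumentStatistics, ToyStatisticsManager, ToyDoc) is hypothetical and chosen only for illustration; none of them is a real sw class, and the real managers are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for one IDocument* interface; not a real sw class.
class IToyDocumentStatistics {
public:
    virtual ~IToyDocumentStatistics() = default;
    virtual void addWords(std::size_t n) = 0;
    virtual std::size_t wordCount() const = 0;
};

// Hypothetical manager that implements the interface.
class ToyStatisticsManager final : public IToyDocumentStatistics {
public:
    void addWords(std::size_t n) override { m_words += n; }
    std::size_t wordCount() const override { return m_words; }
private:
    std::size_t m_words = 0;
};

// Toy document: owns the manager, exposes only the interface.
class ToyDoc {
public:
    ToyDoc() : m_stats(new ToyStatisticsManager) {}
    IToyDocumentStatistics& getIToyDocumentStatistics()
    {
        assert(m_stats);
        return *m_stats;
    }
private:
    std::unique_ptr<ToyStatisticsManager> m_stats;
};
```

Callers in this sketch only ever see the interface returned by the getter, so the manager's implementation can change without touching the document class.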
SwNodes is basically a (fancy) array of SwNode pointers. There are special
subclasses of SwNode (SwStartNode and SwEndNode) which are used to encode a
nested tree structure into the flat array; the range of nodes from a
SwStartNode to its corresponding SwEndNode is sometimes called a "section"
(but is not necessarily what the high-level document model calls a "Section";
that is just one of the possibilities).
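To make the start/end encoding concrete, here is a small sketch of a flat array in which nested sections are delimited by start and end markers. ToyNode, ToyKind, makeSampleNodes and findMatchingEnd are hypothetical names; the real classes may well store the link between a start node and its end node directly, so the linear scan is for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node kinds; the real SwNode hierarchy has more of them.
enum class ToyKind { Start, End, Text };

struct ToyNode {
    ToyKind kind;
};

// Returns the index of the End node that closes the Start node at `start`,
// or nodes.size() if the array is unbalanced.
std::size_t findMatchingEnd(const std::vector<ToyNode>& nodes, std::size_t start)
{
    assert(start < nodes.size() && nodes[start].kind == ToyKind::Start);
    std::size_t depth = 0;
    for (std::size_t i = start; i < nodes.size(); ++i) {
        if (nodes[i].kind == ToyKind::Start)
            ++depth;
        else if (nodes[i].kind == ToyKind::End && --depth == 0)
            return i;
    }
    return nodes.size();
}

// Flat encoding of: section( text, section( text ) )
std::vector<ToyNode> makeSampleNodes()
{
    return {{ToyKind::Start}, {ToyKind::Text}, {ToyKind::Start},
            {ToyKind::Text},  {ToyKind::End},  {ToyKind::End}};
}
```

In the sample, the outer section spans indices 0 to 5 and the inner one spans indices 2 to 4, so a "section" is simply an index range in the flat array.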
The SwNodes contains the following top-level sections:
- Empty
- Footnote content
- Frame / Header / Footer content
- Deleted Change Tracking content
- Body content
The Undo/Redo information is stored in a sw::UndoManager member of SwDoc,
which implements the IDocumentUndoRedo interface.
Its members include a SwNodes array containing the document content that
is currently not in the actual document but required for Undo/Redo, and
a stack of SwUndo actions, each of which represents one user-visible
Undo/Redo step.
There are also ListActions which internally contain several individual SwUndo
actions; these are created by the StartUndo/EndUndo wrapper methods.
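The grouping of several actions into one user-visible step can be sketched as follows. ToyUndoStack and ToyUndoStep are hypothetical, and the way nested start/end calls are folded into the outermost group is an assumption of this sketch, not a statement about sw::UndoManager.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One user-visible undo step; may bundle several individual actions.
struct ToyUndoStep {
    std::string description;
    std::vector<std::string> actions;
};

class ToyUndoStack {
public:
    // Opens a group; nested calls are folded into the outermost group
    // (an assumption made for this sketch).
    void startUndo(const std::string& description)
    {
        if (m_depth++ == 0)
            m_open = ToyUndoStep{description, {}};
    }
    // Closes a group; the outermost close pushes one step onto the stack.
    void endUndo()
    {
        assert(m_depth > 0);
        if (--m_depth == 0)
            m_steps.push_back(m_open);
    }
    // Records an action: inside a group it joins the group,
    // otherwise it becomes a step of its own.
    void append(const std::string& action)
    {
        if (m_depth > 0)
            m_open.actions.push_back(action);
        else
            m_steps.push_back(ToyUndoStep{action, {action}});
    }
    std::size_t stepCount() const { return m_steps.size(); }
    const ToyUndoStep& top() const
    {
        assert(!m_steps.empty());
        return m_steps.back();
    }
private:
    std::vector<ToyUndoStep> m_steps;
    ToyUndoStep m_open;
    int m_depth = 0;
};
```

With this shape, a "replace" that consists of a delete followed by an insert shows up as a single step on the stack while still keeping both actions available for replay.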
The sub-structure of paragraphs is stored in the SwpHintsArray member
SwTextNode::m_pSwpHints. There is a base class SwTextAttr with numerous
subclasses; the SwTextAttr has a start and end index and a SfxPoolItem
to store the actual formatting attribute.
There are several sub-categories of SwTextAttr:
- formatting attributes: Character Styles (SwTextCharFormat, RES_TXTATR_CHARFMT) and Automatic Styles (no special class, RES_TXTATR_AUTOFMT): these are handled by SwpHintsArray::BuildPortions and MergePortions, which create non-overlapping portions of formatting attributes.
- nesting attributes: Hyperlinks (SwTextINetFormat, RES_TXTATR_INETFMT), Ruby (SwTextRuby, RES_TXTATR_CJK_RUBY) and Meta/MetaField (SwTextMeta, RES_TXTATR_META/RES_TXTATR_METAFIELD): these maintain a properly nested tree structure. The Meta/MetaField are "special" because they have both start/end and a dummy character at the start.
- misc. attributes: Reference Marks, ToX Marks
- attributes without end: Fields, Footnotes, Flys (AS_CHAR). These all have a corresponding dummy character in the paragraph text, which is a placeholder for the "expansion" of the attribute, e.g. field content.
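The idea of turning overlapping formatting attributes into non-overlapping portions can be illustrated with a generic boundary-splitting sketch. This is not the algorithm used by BuildPortions or MergePortions; ToyHint, ToyPortion and buildPortions are hypothetical names, and ranges are assumed to be half-open [start, end).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical attribute covering the half-open range [start, end).
struct ToyHint {
    int start;
    int end;
    std::string name;
};

// A portion is a range over which the set of active attributes is constant.
struct ToyPortion {
    int start;
    int end;
    std::vector<std::string> names;
};

std::vector<ToyPortion> buildPortions(const std::vector<ToyHint>& hints)
{
    // Every attribute start or end is a potential portion boundary.
    std::set<int> bounds;
    for (const ToyHint& h : hints) {
        assert(h.start <= h.end);
        bounds.insert(h.start);
        bounds.insert(h.end);
    }
    std::vector<ToyPortion> portions;
    if (bounds.empty())
        return portions;
    auto it = bounds.begin();
    int prev = *it;
    for (++it; it != bounds.end(); ++it) {
        ToyPortion p{prev, *it, {}};
        // Collect all attributes that fully cover this boundary interval.
        for (const ToyHint& h : hints)
            if (h.start <= p.start && p.end <= h.end)
                p.names.push_back(h.name);
        if (!p.names.empty())
            portions.push_back(p);
        prev = *it;
    }
    return portions;
}
```

For example, "bold" on [0, 10) and "italic" on [5, 15) yield three portions: [0, 5) with bold, [5, 10) with both, and [10, 15) with italic.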
There are multiple model classes involved for fields:
- enum SwFieldIds enumerates the different types of fields.
- SwFieldType contains some shared stuff for all fields of a type. There are many subclasses of SwFieldType, one for each different type of field. For most types of fields there is one shared instance of this per type, which is created in DocumentFieldsManager::InitFieldTypes(), but for some there are more than one, and they are dynamically created, see DocumentFieldsManager::InsertFieldType(). An example for the latter are variable fields (SwFieldIds::GetExp/SwFieldIds::SetExp), with one SwFieldType per variable.
- SwXFieldMaster is the UNO wrapper of a field type. It is a SwClient registered at the SwFieldType. Its life-cycle is determined by UNO clients outside of sw; it will get disposed when the SwFieldType dies.
- SwFormatField is the SfxPoolItem of a field. The SwFormatField is a SwClient registered at its SwFieldType. The SwFormatField owns the SwField of the field.
- SwField contains the core logic of a field. The SwField is owned by the SwFormatField of the field. There are many subclasses of SwField, one for each different type of field. Note that there are not many places that can Expand the field to its correct value, since for example page number fields require a View with an up to date layout; therefore the correct expansion is cached.
- SwTextField is the text attribute of a field. It owns the SwFormatField of the field (like all text attributes).
- SwXTextField is the UNO wrapper object of a field. It is a SwClient registered at the SwFormatField. Its life-cycle is determined by UNO clients outside of sw; it will get disposed when the SwFormatField dies.
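The ownership chain among the core field classes (text attribute owns pool item, pool item owns field, field refers to a shared type) can be sketched with toy classes. All Toy* names are hypothetical; the SwClient registration and the UNO wrappers are deliberately left out, and the cached expansion is modelled as a plain string.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Shared per-type data; one instance can serve many fields.
struct ToyFieldType {
    std::string name;
};

// Core logic of one field; keeps the last known expansion cached.
class ToyField {
public:
    explicit ToyField(const ToyFieldType& type) : m_type(&type) {}
    const ToyFieldType& type() const { return *m_type; }
    void setCachedExpansion(std::string text) { m_cache = std::move(text); }
    const std::string& cachedExpansion() const { return m_cache; }
private:
    const ToyFieldType* m_type; // not owned
    std::string m_cache;
};

// Stand-in for the pool item: owns the field.
class ToyFormatField {
public:
    explicit ToyFormatField(const ToyFieldType& type) : m_field(new ToyField(type)) {}
    ToyField& field()
    {
        assert(m_field);
        return *m_field;
    }
private:
    std::unique_ptr<ToyField> m_field;
};

// Stand-in for the text attribute: owns the pool item and knows its position.
class ToyTextField {
public:
    ToyTextField(int position, const ToyFieldType& type)
        : m_position(position), m_format(new ToyFormatField(type)) {}
    int position() const { return m_position; }
    ToyFormatField& formatField()
    {
        assert(m_format);
        return *m_format;
    }
private:
    int m_position;
    std::unique_ptr<ToyFormatField> m_format;
};
```

In this sketch, destroying the text attribute destroys the pool item and the field with it, while the shared type object outlives them; code that cannot compute a fresh value (for instance because no layout is available) would read the cached expansion instead.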
- SwNumFormat (subclass of SvxNumFormat) determines the formatting of a single numbering level.
- SwNumRule (NOT a subclass of SvxNumRule) is a list style, containing one SwNumFormat per list level. SwNumRule::maTextNodeList is the list of SwTextNode that have this list style applied.
- SwNumberTreeNode is a base class that represents an abstract node in a hierarchical tree of numbered nodes.
- SwNodeNum is the subclass of SwNumberTreeNode that connects it with an actual SwTextNode and also with a SwNumRule; SwTextNode::mpNodeNum points back in the other direction.
- SwList represents a list, which is (mostly) a vector of SwNodeNum trees, one per SwNodes top-level section (why that?).
- IDocumentListsAccess, sw::DocumentListsManager owns all SwList instances, and maintains mappings:
  - from list-id to SwList
  - from list style name to SwList (the "default" SwList for that list style)
- IDocumentListItems, sw::DocumentListItemsManager contains a set of all SwNodeNum instances, ordered by SwNode index
- the special Outline numbering rule: SwDoc::mpOutlineRule
- IDocumentOutlineNodes, sw::DocumentOutlineNodesManager maintain a list (which is actually stored in SwNodes::m_pOutlineNodes) of SwTextNodes that either have the Outline numrule applied, or have the RES_PARATR_OUTLINELEVEL item set (note that in the latter case, the SwTextNode does not have a SwNodeNum and is not associated with the SwDoc::mpOutlineRule).
- SwTextNodes and paragraph styles have items/properties:
  - RES_PARATR_OUTLINELEVEL/"OutlineLevel" to specify an outline level without necessarily having the outline SwNumRule assigned
  - RES_PARATR_NUMRULE/"NumberingStyleName" the list style to apply; may be empty "" which means no list style (to override inherited value)

  Only SwTextNode has these items:
  - RES_PARATR_LIST_ID/"ListId" determines the SwList to which the node is added
  - RES_PARATR_LIST_LEVEL/"NumberingLevel" the level at which the SwTextNode will appear in the list
  - RES_PARATR_LIST_ISRESTART/"ParaIsNumberingRestart" restart numbering sequence at this SwTextNode
  - RES_PARATR_LIST_RESTARTVALUE/"NumberingStartValue" restart numbering sequence at this SwTextNode with this value
  - RES_PARATR_LIST_ISCOUNTED/"NumberingIsNumber" determines if the node is actually counted in the numbering sequence; these are different from "phantoms" because there's still a SwTextNode.
Note that there is no UNO service to represent a list.
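How the restart and counted items interact can be shown with a toy, single-level numbering pass. ToyPara and numberParagraphs are hypothetical; the sketch assumes that a restart without an explicit value starts again at 1 and uses 0 to mean "no number shown", neither of which is taken from the real implementation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-paragraph list settings, loosely mirroring the
// ISCOUNTED, ISRESTART and RESTARTVALUE items described above.
struct ToyPara {
    bool isCounted;   // false: paragraph is in the list but gets no number
    bool isRestart;   // true: restart the sequence at this paragraph
    int restartValue; // value to restart with; < 0 means "not set"
};

// Computes the number for each paragraph of a single-level list.
// 0 in the result means "no number shown" (a convention of this sketch).
std::vector<int> numberParagraphs(const std::vector<ToyPara>& paras)
{
    std::vector<int> numbers;
    int next = 1;
    for (const ToyPara& p : paras) {
        if (p.isRestart)
            next = (p.restartValue >= 0) ? p.restartValue : 1;
        if (p.isCounted)
            numbers.push_back(next++);
        else
            numbers.push_back(0);
    }
    assert(numbers.size() == paras.size());
    return numbers;
}
```

An uncounted paragraph keeps its slot in the result but does not advance the sequence, which matches the description that the node still exists even though it is not counted.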
The layout is a tree of SwFrame subclasses. The following relationships are
possible between frames:
- You can visit the tree by following the upper, lower, next and previous pointers.
- The functionality of flowing of a frame across multiple parents (e.g. pages) is implemented in SwFlowFrame, which is not an SwFrame subclass. The logical chain of such frames can be visited using the follow and precede pointers. ("Leaf" is a term that refers to such a relationship.)
- In case a frame is split into multiple parts, then the first one is called master, while the others are called follows.
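The master/follow relationship can be pictured as a doubly linked chain. The sketch below uses a hypothetical ToyFlowFrame with plain precede and follow pointers and a text fragment per part; it says nothing about how the real layout allocates or splits frames.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical frame part: one piece of content that was split,
// for example across pages.
struct ToyFlowFrame {
    explicit ToyFlowFrame(std::string t)
        : text(std::move(t)), precede(nullptr), follow(nullptr) {}
    std::string text;
    ToyFlowFrame* precede; // previous part in the chain, or nullptr
    ToyFlowFrame* follow;  // next part in the chain, or nullptr
};

// Links `second` as the follow of `first`.
void linkFollow(ToyFlowFrame& first, ToyFlowFrame& second)
{
    assert(first.follow == nullptr && second.precede == nullptr);
    first.follow = &second;
    second.precede = &first;
}

// Walks the precede pointers back to the first part (the master).
const ToyFlowFrame& findMaster(const ToyFlowFrame& frame)
{
    const ToyFlowFrame* p = &frame;
    while (p->precede)
        p = p->precede;
    return *p;
}

// Walks the follow pointers and concatenates the text of all parts.
std::string collectText(const ToyFlowFrame& master)
{
    std::string all;
    for (const ToyFlowFrame* p = &master; p; p = p->follow)
        all += p->text;
    return all;
}
```

Starting from any part, the precede pointers lead back to the master, and the follow pointers from the master visit every part in order.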