Version 0.60.0 Changes
In many cases, the extent of the changes and optimizations made it impossible to take the slower route of marking code `@Deprecated` and removing it later. Manual changes will be required to migrate code.
ℹ️ If you encounter difficulty in migrating some code constructs please open an issue with details of the construct to be migrated.
Major improvements include:
- New implementation of `SegmentedSequence` using a binary offset tree, with efficient access, storage and instantiation.
- New `SequenceBuilder` class, used to create segmented sequences of arbitrary content without concern for segment ordering or whether the segments share a common base sequence.

  A segment which cannot be converted to an offset range from the base sequence is converted to out-of-base characters, preserving the expected character sequence result.

  The builder optimizes literal characters when they match the corresponding base sequence characters, with special handling of spaces and EOL characters. This means that adding literal spaces and EOL characters instead of using a subsequence results in them being efficiently replaced by segments from the original base sequence.

  For convenience, an instance of `SequenceBuilder` can be obtained from any based sequence through the `BasedSequence.getBuilder()` method.
- New `LineAppendable` implementation used for rendering text. Internally, the class builds a list of lines and keeps track of each line's prefix portion, allowing efficient access and manipulation of lines and prefixes in the rendered result.

  The generated `BasedSequence` result is a `SegmentedSequence` with offsets into the source sequence preserved, allowing offsets in the result to be mapped to the original sequence.

  The result lines are stored as separate `BasedSequences` to maximize preservation of original sequence offset information when the rendering rearranges the lines of the source, as in the case of formatting with reference definition sorting or `MarkdownTable` sorting.
- Formatting module is now part of the core library with additional features:
  - Paragraph text wrapping to fit within the set right margin.
  - Source offset tracking from formatted to original markdown.
  - Table formatting includes sorting by columns and transpose table methods.
  - Major reorganization and code cleanup of implementation.
- Formatter implementation is now part of the core implementation in the `flexmark` module.
- `Formatter` improved with more options, including wrapping text to margins.
  - Added the ability to track and map source offset(s) to their index in the formatted sequence. This feature allows the editor caret position to be preserved across a formatting operation.
  - Offset tracking unified using `TrackedOffset`. It is used by `MarkdownParagraph` for text wrapping and by `MarkdownTable` for table formatting, and is able to handle caret position during typing and backspace editing operations which are immediately followed by formatting of the edited source.
- Tests cleaned up to eliminate duplication and hacks.
- `flexmark-test-util` made reusable for other projects. Having markdown as the source code for tests is too convenient to be used only in `flexmark-java` tests.
- Optimized `SegmentedSequence` implementation using binary trees for searching segments and byte-efficient segment packing. Parser performance is either slightly improved or not affected, but the new implementation allows using `SegmentedSequences` for collecting `Formatter` and `HtmlRenderer` output to track the source location of all text with minimal overhead and double the performance of the old implementation.
- New implementation of `LineAppendable` replaces `LineFormattingAppendable` used for text generation in rendering:
  - Uses `SequenceBuilder` to generate a `BasedSequence` result with original source offsets for those character segments which come from the source. This allows round trip source tracking from Source -> AST -> Formatted Source -> Source throughout the library.

    As an added bonus, using the appendable makes formatting to it 40% faster than the previous implementation and 160 times more efficient in memory use. For the tests below, the old implementation allocated 6GB worth of segmented sequences, the new implementation 37MB. The % overhead for the new implementation is about four times greater than before, but that is after a roughly 42-fold reduction in total overhead bytes: the old implementation needed 342MB of overhead, the new implementation 8MB.

    As a result of the increased efficiency, two additional files of about 600kB each can be included in the test run and only add 0.6 sec to the formatter run time.

    Tests were run on 1141 markdown files from GitHub projects and some other user samples. The largest was 256k bytes.

    | Description                | Old SegmentedSequence | New SegmentedSequence | New LineAppendable |
    |:---------------------------|----------------------:|----------------------:|-------------------:|
    | Total wall clock time      |            13.896 sec |             9.672 sec |          8.344 sec |
    | Parse time                 |             2.402 sec |             2.335 sec |          2.297 sec |
    | Formatter appendable       |             0.603 sec |             0.602 sec |          0.831 sec |
    | Formatter sequence builder |             7.264 sec |             3.109 sec |          1.772 sec |

    The overhead difference is significant. The totals are for all segmented sequences created during the test run of 1141 files. Parser statistics show requirements during parsing and formatting.

    | Description                                     | Old Parser |  Old Formatter | New Parser | New Formatter | New LineAppendable |
    |:------------------------------------------------|-----------:|---------------:|-----------:|--------------:|-------------------:|
    | Bytes for characters of all segmented sequences |    917,016 |  6,029,774,526 |    917,016 | 6,029,774,526 |         37,663,196 |
    | Bytes for overhead of all segmented sequences   |  1,845,048 | 12,060,276,408 |     93,628 |   342,351,155 |          8,204,796 |
    | Overhead %                                      |     201.2% |         200.0% |      10.2% |          5.7% |              21.8% |
- Break: split out generic AST utilities from the `flexmark-util` module into separate smaller modules. `com.vladsch.flexmark.util` no longer contains any files, only separate utility modules, with the `flexmark-utils` module being an aggregate of all utility modules, similar to `flexmark-all`.
  - `ast/` classes to `flexmark-util-ast`
  - `builder/` classes to `flexmark-util-builder`
  - `collection/` classes to `flexmark-util-collection`
  - `data/` classes to `flexmark-util-data`
  - `dependency/` classes to `flexmark-util-dependency`
  - `format/` classes to `flexmark-util-format`
  - `html/` classes to `flexmark-util-html`
  - `mappers/` classes to `flexmark-util-sequence`
  - `options/` classes to `flexmark-util-options`
  - `sequence/` classes to `flexmark-util-sequence`
  - `visitor/` classes to `flexmark-util-visitor`
- Break: delete deprecated properties, methods and classes
- Add: `org.jetbrains:annotations:15.0` dependency to have `@Nullable`/`@NotNull` annotations added for all parameters. When using IntelliJ IDEA for development, it helps to have these annotations for analysis of potential problems, and they make it easier to use the library with Kotlin.
- Break: refactor and clean up tests to eliminate duplicated code and allow easier reuse of test cases with spec example data.
- Break: move formatter tests to the `flexmark-core-test` module to allow sharing of formatter base classes in extensions without causing dependency cycles in the formatter module.
- Break: move formatter module into `flexmark` core. This module is almost always included anyway because most extensions have a dependency on formatter for their custom formatting implementations. Having it as part of the core allows relying on its functionality in all modules.
- Break: move `com.vladsch.flexmark.spec` and `com.vladsch.flexmark.util` in `flexmark-test-util` to `com.vladsch.flexmark.test.spec` and `com.vladsch.flexmark.test.util` respectively, to respect the naming convention between modules and their packages.
- Break: `NodeVisitor` implementation details have changed. If you were overriding `NodeVisitor.visit(Node)` in the previous version, it is now `final` to ensure a compile time error is generated. You will need to change your implementation. See the javadoc comment in the `NodeVisitor` class for instructions.

  ℹ️ `com.vladsch.flexmark.util.ast.Visitor` is only needed for implementation of `NodeVisitor` and `VisitHandler`. If all anonymous implementations of `VisitHandler` are converted to lambdas, then imports for `Visitor` can be eliminated.
  - Fix: remove old visitor-like adapters and implement ones based on generic classes not linked to flexmark AST node.
    - remove old base classes:
      - `com.vladsch.flexmark.util.ast.NodeAdaptedVisitor` see javadoc for class
        `com.vladsch.flexmark.util.ast.NodeAdaptingVisitHandler`
        `com.vladsch.flexmark.util.ast.NodeAdaptingVisitor`
- The IntelliJ-IDEA migration `migrate flexmark-java 0_50_x to 0_60_0.xml` can be used to assist in migrating from version 0.50.40 to version 0.60 of the library. It migrates class name and package changes only.

Changes to method arguments and other method changes have to be addressed manually.
`LineFormattingAppendable` is renamed to `LineAppendable`. Its implementation and subclasses are similarly renamed to remove `Formatting` from the class name.

All formatting flags are now prefixed with `F_` and, when present, select the given modification of appended text. Previously, `ALLOW_LEADING_WHITESPACE` and `ALLOW_LEADING_EOL` were inverted: setting them disabled the text modification.
- `ALLOW_LEADING_WHITESPACE` is now `F_TRIM_LEADING_WHITESPACE` and has inverted meaning.
- `ALLOW_LEADING_EOL` is now `F_TRIM_LEADING_EOL` and has inverted meaning.
- `CONVERT_TABS` is now `F_CONVERT_TABS`
- `COLLAPSE_WHITESPACE` is now `F_COLLAPSE_WHITESPACE`
- `TRIM_TRAILING_WHITESPACE` is now `F_TRIM_TRAILING_WHITESPACE`
- `PASS_THROUGH` is now `F_PASS_THROUGH`
- `TRIM_LEADING_WHITESPACE` is now `F_TRIM_LEADING_WHITESPACE`
- `PREFIX_PRE_FORMATTED` is now `F_PREFIX_PRE_FORMATTED`
- `FORMAT_ALL` is now `F_FORMAT_ALL`
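
The inverted meaning of the two `ALLOW_LEADING_*` flags is the easiest part of this migration to get wrong. The sketch below shows one way to translate an old option mask into a new one. It is a minimal illustration using only the JDK: the class name, the `OLD_` prefix and all bit values are made up for the example and are not the library's constants. It also assumes that, in the old API, trimming happened whenever the corresponding `ALLOW_` flag was not set.

```java
// Illustrative sketch only. Bit values are hypothetical, not the library's.
final class FlagMigrationSketch {
    // Old-style flags (hypothetical bit positions).
    static final int OLD_CONVERT_TABS = 1;
    static final int OLD_COLLAPSE_WHITESPACE = 1 << 1;
    static final int OLD_ALLOW_LEADING_WHITESPACE = 1 << 2;
    static final int OLD_ALLOW_LEADING_EOL = 1 << 3;

    // New-style flags (hypothetical bit positions).
    static final int F_CONVERT_TABS = 1;
    static final int F_COLLAPSE_WHITESPACE = 1 << 1;
    static final int F_TRIM_LEADING_WHITESPACE = 1 << 2;
    static final int F_TRIM_LEADING_EOL = 1 << 3;

    /** Translates an old option mask into the equivalent new mask. */
    static int migrate(int oldFlags) {
        int newFlags = 0;

        // Flags that were only renamed carry over unchanged.
        if ((oldFlags & OLD_CONVERT_TABS) != 0) newFlags |= F_CONVERT_TABS;
        if ((oldFlags & OLD_COLLAPSE_WHITESPACE) != 0) newFlags |= F_COLLAPSE_WHITESPACE;

        // Inverted flags: the old ALLOW_ flag being absent means the
        // new TRIM_ flag must be present, and vice versa.
        if ((oldFlags & OLD_ALLOW_LEADING_WHITESPACE) == 0) newFlags |= F_TRIM_LEADING_WHITESPACE;
        if ((oldFlags & OLD_ALLOW_LEADING_EOL) == 0) newFlags |= F_TRIM_LEADING_EOL;

        return newFlags;
    }

    public static void main(String[] args) {
        int migrated = migrate(OLD_CONVERT_TABS | OLD_ALLOW_LEADING_EOL);
        System.out.println((migrated & F_TRIM_LEADING_WHITESPACE) != 0); // true
        System.out.println((migrated & F_TRIM_LEADING_EOL) != 0);        // false
    }
}
```

When migrating real code, check each call site against the actual constants rather than relying on a mechanical rename, since a plain rename of the two inverted flags silently reverses the behavior.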
This interface and its implementation classes were refactored and reworked for efficient use with `SequenceBuilder`.
- `CharPredicate` class is now used to provide character sets instead of `CharSequence`, giving consistent and efficient character tests. Method arguments of type `CharSequence` which were used for selecting character sets are now `CharPredicate`.

  The simplest way to change the method call is to use `CharPredicate.anyOf(CharSequence)` to convert a character sequence to a predicate.
- Some methods were renamed to better reflect their operation. In these cases the old method names are deprecated and their default implementation invokes the new methods.
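
To illustrate why a predicate is a better fit than a `CharSequence` for character set tests, here is a small stand-in built only on the JDK. It is not the library's `CharPredicate`: the class `CharSetSketch` and its methods are hypothetical names for this example. The set is built once, and every subsequent test is a constant-time lookup rather than a scan of the character sequence.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

// Illustrative stand-in, not the library's CharPredicate.
final class CharSetSketch {
    /** Builds a predicate matching any character contained in chars. */
    static IntPredicate anyOf(CharSequence chars) {
        BitSet set = new BitSet();
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        // Each test is a single bit lookup, independent of the set size.
        return set::get;
    }

    /** Counts leading characters of text that match the predicate. */
    static int countLeading(CharSequence text, IntPredicate predicate) {
        int i = 0;
        while (i < text.length() && predicate.test(text.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        IntPredicate whitespace = anyOf(" \t");
        System.out.println(countLeading("  \tabc", whitespace)); // 3
    }
}
```

With the library itself, the equivalent migration step is to wrap the old `CharSequence` argument in `CharPredicate.anyOf(CharSequence)`, as described above.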
The old `SegmentedSequence` class was renamed to `SegmentedSequenceFull`, which contains the old, inefficient implementation. Using the old class is not recommended due to its inefficient and, in some cases, buggy implementation.

The new `SegmentedSequence` is an abstract class with concrete implementations in `SegmentedSequenceFull` and `SegmentedSequenceTree`. The latter is an efficient implementation using a binary search tree.
The right way to create an instance of `SegmentedSequence` is to use an instance of `SequenceBuilder` to build the sequence, then call `SequenceBuilder.toSequence()`. It returns an instance of `SegmentedSequenceTree` if the result requires a segmented sequence, or a subsequence of the underlying `BasedSequence` if the result is a single segment.
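
The idea behind a segmented sequence with source offset tracking can be sketched without the library. The toy class below is not `SegmentedSequenceTree` and does not use the library's data layout. It uses hypothetical names, and a sorted array with binary search stands in for the binary tree. Each segment is either a range of the base text or literal (out-of-base) text, and a result index is resolved to its segment in O(log n). It assumes that every segment is non-empty.

```java
import java.util.Arrays;

// Toy model of a segmented sequence; NOT the library's SegmentedSequenceTree.
final class SegmentLookupSketch {
    private final int[] baseStart;   // offset of each segment in the base text, -1 for literal text
    private final String[] text;     // characters of each segment
    private final int[] resultStart; // index in the result at which each segment begins
    private final int length;        // total length of the result

    /** Segment i is literals[i] if non-null, otherwise base[starts[i], ends[i]). */
    SegmentLookupSketch(String base, int[] starts, int[] ends, String[] literals) {
        int n = starts.length;
        baseStart = new int[n];
        text = new String[n];
        resultStart = new int[n];
        int offset = 0;
        for (int i = 0; i < n; i++) {
            boolean isLiteral = literals[i] != null;
            text[i] = isLiteral ? literals[i] : base.substring(starts[i], ends[i]);
            baseStart[i] = isLiteral ? -1 : starts[i];
            resultStart[i] = offset;
            offset += text[i].length();
        }
        length = offset;
    }

    /** Finds the segment containing the result index by binary search. */
    private int segmentAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int pos = Arrays.binarySearch(resultStart, index);
        // Exact hit: index is the first character of segment pos.
        // Otherwise pos is -(insertionPoint) - 1 and the segment is insertionPoint - 1.
        return pos >= 0 ? pos : -pos - 2;
    }

    int length() {
        return length;
    }

    char charAt(int index) {
        int seg = segmentAt(index);
        return text[seg].charAt(index - resultStart[seg]);
    }

    /** Returns the offset in the base text for a result index, or -1 for literal text. */
    int sourceOffset(int index) {
        int seg = segmentAt(index);
        return baseStart[seg] < 0 ? -1 : baseStart[seg] + (index - resultStart[seg]);
    }

    /** Rearranges "Hello World" into "World, Hello" using two base ranges and one literal. */
    static SegmentLookupSketch demo() {
        return new SegmentLookupSketch("Hello World",
                new int[] {6, 0, 0}, new int[] {11, 0, 5}, new String[] {null, ", ", null});
    }

    public static void main(String[] args) {
        SegmentLookupSketch s = demo();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) sb.append(s.charAt(i));
        System.out.println(sb);                // World, Hello
        System.out.println(s.sourceOffset(0)); // 6
        System.out.println(s.sourceOffset(5)); // -1
        System.out.println(s.sourceOffset(7)); // 0
    }
}
```

The literal `", "` has no source offset, which mirrors the out-of-base characters mentioned earlier, while the rearranged base ranges still map back to their original positions.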