Having trouble with syntax highlighting? Welcome to SyntaxColor, a simple yet powerful syntax highlighter. It's the last one you'll ever need.
The SyntaxColor parser requires rules that tell it what to color. A rule is an object whose properties help the parser decide what to color.
Each rule must contain the regex and token properties, and may contain the next and/or caseSensitive properties.
The regex property contains either a regular expression to match, or a regular expression in the form of a string.
For example, /Hello world/ and "Hello world" are both valid values, but 23 is not.
Any flags except i on the regular expression will be removed.
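A minimal sketch of how a regex value could be normalized under the rules above. The helper name normalizeRegex is hypothetical and is not part of SyntaxColor's documented API, and throwing on an invalid value is an assumption; the sketch only illustrates accepting a RegExp or a string and dropping every flag except i.

```javascript
// Hypothetical helper, not part of SyntaxColor: illustrates the rules above.
function normalizeRegex(value) {
  if (typeof value === "string") {
    // A string is treated as the source of a regular expression.
    return new RegExp(value);
  }
  if (value instanceof RegExp) {
    // Keep only the "i" flag; every other flag is dropped.
    const flags = value.flags.includes("i") ? "i" : "";
    return new RegExp(value.source, flags);
  }
  // Assumption: invalid values such as 23 are rejected with an error.
  throw new TypeError("regex must be a RegExp or a string");
}

console.log(normalizeRegex(/Hello world/gi).flags); // i
console.log(normalizeRegex("Hello world").source); // Hello world
```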
The token property can contain either a string, an array, or a function.
If the token property is a string, the result matched by regex will be wrapped in a <span> whose class is the value of token.
To add several classes, use class names separated by a period (.). Note that classes will have a prefix applied to them, so with a prefix of sc_, abc becomes sc_abc.
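To make the string form concrete, here is a rough sketch of how a dot-separated token could become a prefixed class list. The helpers tokenToClasses and wrap are hypothetical names used for illustration only, and the sc_ prefix is simply the one used in the example above.

```javascript
// Hypothetical helper: turn "a.b" into "<prefix>a <prefix>b".
function tokenToClasses(token, prefix) {
  return token
    .split(".")
    .filter((name) => name.length > 0) // ignore empty segments
    .map((name) => prefix + name)
    .join(" ");
}

// Hypothetical helper: wrap already-escaped text in a <span>.
function wrap(match, token, prefix) {
  return '<span class="' + tokenToClasses(token, prefix) + '">' + match + "</span>";
}

console.log(wrap("if", "keyword.control", "sc_"));
// <span class="sc_keyword sc_control">if</span>
```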
If the token property is an array, each group in the regex will have the classes specified in the array added to it, in order.
WARNING: Make sure that you have as many groups in your regex as there are elements in the array, otherwise you may get unexpected results.
WARNING: Make sure that every character in your regex is contained in a group, and that groups are non-nested, otherwise you may get unexpected results.
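The two warnings are easier to follow with an example. The rule below is illustrative: every character of the pattern sits in one of three non-nested groups, and the token array has three entries. The pairGroups helper is hypothetical and only shows the in-order pairing; it is not how SyntaxColor is confirmed to work internally.

```javascript
// Example rule: three non-nested groups, three tokens.
const declarationRule = {
  regex: /(var)(\s+)([a-z]+)/,
  token: ["keyword", "whitespace", "variable"],
};

// Hypothetical illustration of pairing groups with tokens in order.
function pairGroups(rule, text) {
  const match = rule.regex.exec(text);
  if (match === null) return [];
  // match[0] is the whole match; capture groups start at index 1.
  return match.slice(1).map((group, i) => ({ text: group, token: rule.token[i] }));
}

console.log(pairGroups(declarationRule, "var total"));
```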
If the token property is a function, SyntaxColor will call the function, passing the match result as the browser returned it.
SyntaxColor will then take the returned value and process it as described above.
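A function token might look like the sketch below. It assumes the function receives the match array produced by RegExp.prototype.exec, which is one reading of "the way the browser returned it"; the keyword list and the resolveToken helper are hypothetical.

```javascript
const keywords = ["if", "else", "while"];

const wordRule = {
  regex: /[a-z]+/,
  // Assumption: the argument is the array returned by RegExp.prototype.exec.
  token: function (match) {
    return keywords.includes(match[0]) ? "keyword" : "identifier";
  },
};

// Hypothetical helper: run the rule and resolve its token for one input.
function resolveToken(rule, text) {
  const match = rule.regex.exec(text);
  return match === null ? null : rule.token(match);
}

console.log(resolveToken(wordRule, "while")); // keyword
console.log(resolveToken(wordRule, "count")); // identifier
```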
The next property is NOT REQUIRED, and defaults to the current state.
It specifies the next state for the parser to go to. More on states later.
The caseSensitive property is NOT REQUIRED, and defaults to true.
It specifies whether the regular expression should be matched as case-sensitive.
WARNING: If the i flag is set on the regular expression, it overrides the caseSensitive option.
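The interaction between the i flag and caseSensitive can be sketched as a small decision function. The name isCaseInsensitive is hypothetical; the logic follows the description above, where caseSensitive defaults to true and an explicit i flag wins.

```javascript
// Hypothetical helper: decide whether a rule matches case-insensitively.
function isCaseInsensitive(rule) {
  // An "i" flag on a RegExp overrides the caseSensitive option.
  const hasIFlag = rule.regex instanceof RegExp && rule.regex.flags.includes("i");
  // caseSensitive defaults to true when it is not given.
  const caseSensitive = rule.caseSensitive === undefined ? true : rule.caseSensitive;
  return hasIFlag || !caseSensitive;
}

console.log(isCaseInsensitive({ regex: /abc/i, caseSensitive: true })); // true
console.log(isCaseInsensitive({ regex: "abc" })); // false
console.log(isCaseInsensitive({ regex: "abc", caseSensitive: false })); // true
```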
The SyntaxColor parser groups rules into different states, which the parser moves through based on the rules specified.
The parser always starts in the "start" state, so that state is where you should put your beginning rules.
Then, as the parser evaluates your rules, if it encounters a match and the next property is set, it goes to the state specified by the next property.
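The following sketch traces state changes for a two-state rule set. It is an illustration under assumptions, not SyntaxColor's actual algorithm: the step and trace helpers are hypothetical, rules are tried in order at the current position, and a character that no rule matches is skipped.

```javascript
const stringRules = {
  start: [
    { regex: /"/, token: "string", next: "string" },
    { regex: /[a-z]+/, token: "identifier" },
  ],
  string: [
    { regex: /"/, token: "string", next: "start" },
    { regex: /[^"]+/, token: "string" },
  ],
};

// Hypothetical: try each rule of the current state at the start of the text.
function step(rules, state, text) {
  for (const rule of rules[state]) {
    const match = new RegExp("^(?:" + rule.regex.source + ")").exec(text);
    if (match !== null && match[0].length > 0) {
      // next defaults to the current state when it is not set.
      return { matched: match[0], token: rule.token, state: rule.next || state };
    }
  }
  return null;
}

// Hypothetical: record "token:stateAfterMatch" for every match.
function trace(rules, text) {
  let state = "start"; // the parser always begins in "start"
  let rest = text;
  const log = [];
  while (rest.length > 0) {
    const result = step(rules, state, rest);
    if (result === null) {
      rest = rest.slice(1); // assumption: skip unmatched characters
      continue;
    }
    log.push(result.token + ":" + result.state);
    state = result.state;
    rest = rest.slice(result.matched.length);
  }
  return log;
}

console.log(trace(stringRules, 'ab"cd"'));
// [ 'identifier:start', 'string:string', 'string:string', 'string:start' ]
```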
A rule is a JavaScript object containing the regex, token, next?, and caseSensitive? properties.
Here's the format for a rule:
{
    regex: RegExp or String,
    token: String, Array, or Function
    [,next: String]
    [,caseSensitive: Boolean]
}
A rule set is a JavaScript array containing several rules.
Here's the format for a rule set:
[
    rule1,
    rule2,
    ...
]
A complete rule set is a JavaScript object containing several states and their rule sets.
Here's the format for a complete rule set:
{
    state1: ruleSet1,
    state2: ruleSet2,
    ...
}
The highlight() function highlights the given text with the given rules. If rules is an array, it is converted to {start: rules}.
It also applies the prefix given by the prefix option to each class, so that the classes do not clash with other classes. The prefix option defaults to "zsnout_".
It returns the HTML string with the classes applied.
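The array-to-object conversion described above can be sketched in a few lines. The helper name toCompleteRuleSet is hypothetical; the argument order and exact signature of highlight() are not stated here, so the sketch deliberately does not call it.

```javascript
// Hypothetical helper mirroring the described conversion:
// a bare rule set (array) becomes a complete rule set with one "start" state.
function toCompleteRuleSet(rules) {
  return Array.isArray(rules) ? { start: rules } : rules;
}

const bareRuleSet = [{ regex: /[0-9]+/, token: "number" }];
console.log(Object.keys(toCompleteRuleSet(bareRuleSet))); // [ 'start' ]
```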
The escapeText() function escapes common HTML characters (<>'"&) into their &xxx; forms.
The escapeChar() function escapes the text provided if it is a single character; otherwise it returns the text unchanged.
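As a rough illustration of the two escaping functions, here is a sketch with the suffix Sketch added to the names to make clear these are not the library's own implementations. The specific entities chosen, for example &#39; for the apostrophe, are assumptions.

```javascript
// Assumed entity table; the exact forms SyntaxColor emits may differ.
const HTML_ENTITIES = new Map([
  ["<", "&lt;"],
  [">", "&gt;"],
  ["'", "&#39;"],
  ['"', "&quot;"],
  ["&", "&amp;"],
]);

// Escape every occurrence of the five characters in a string.
function escapeTextSketch(text) {
  return text.replace(/[<>'"&]/g, (ch) => HTML_ENTITIES.get(ch));
}

// Escape only when the input is exactly one escapable character.
function escapeCharSketch(text) {
  return text.length === 1 && HTML_ENTITIES.has(text) ? HTML_ENTITIES.get(text) : text;
}

console.log(escapeTextSketch('<a href="x">')); // &lt;a href=&quot;x&quot;&gt;
console.log(escapeCharSketch("<")); // &lt;
console.log(escapeCharSketch("<>")); // <>
```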
The parseRules() function converts the rules provided into a more machine-readable format. It converts all strings to regular expressions, converts aliases to their original forms, removes all flags from regular expressions, and re-adds the i flag when case-insensitive matching was requested.
The addClasses() function wraps the match provided in a <span> tag with classes based on token, each prefixed with prefix.
The compress() function merges adjacent <span>s that share the same class, like so: <span class="abc">a</span><span class="abc">b</span><span class="abc">c</span> --> <span class="abc">abc</span>.
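One way to get the merging behavior shown above is a repeated regular-expression replace. This is a sketch under assumptions, not the library's code: the name compressSketch is hypothetical, and it assumes span contents are already escaped, so they contain no < characters.

```javascript
// Hypothetical sketch: merge adjacent spans that carry an identical class attribute.
function compressSketch(html) {
  // \1 requires the second span to have the same class value as the first.
  const pattern = /<span class="([^"]*)">([^<]*)<\/span><span class="\1">/g;
  let previous;
  do {
    previous = html;
    // Drop the closing tag and the next opening tag, joining the contents.
    html = html.replace(pattern, '<span class="$1">$2');
  } while (html !== previous); // repeat until nothing more can be merged
  return html;
}

console.log(
  compressSketch('<span class="abc">a</span><span class="abc">b</span><span class="abc">c</span>')
);
// <span class="abc">abc</span>
```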
The changeSpaces() function replaces a series of at least two spaces with a string of "‌ "s.
The highlightLine() function highlights a single line of code based on the current state.