This repository has been archived by the owner on Feb 5, 2022. It is now read-only.

Parse fenced code blocks from markdown and do stuff with them

shawnbot/code-blocks

code-blocks

Parse fenced code blocks from Markdown with useful metadata.

npm install [--save | --save-dev] code-blocks

Usage

// CommonJS
const codeBlocks = require('code-blocks')
// or, as an ES module (ES2015/ES6/Babel, etc.):
// import codeBlocks from 'code-blocks'

// fromFile() returns a promise that resolves with the parsed blocks
codeBlocks.fromFile('README.md')
  .then(blocks => {
    // do stuff with blocks here
  })

See the API documentation for more examples.

How it works

This library uses remark to parse Markdown into a unist tree, then finds all of the fenced code blocks. Those with a language identifier after the opening ``` or ~~~ get some additional properties.
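To make the tree-walking step concrete, here is a minimal sketch that collects code nodes from a hand-written unist-style tree. The `collectCodeNodes` helper and the sample tree are hypothetical and written only for illustration; the library relies on remark for parsing and may traverse the tree differently. The sketch also does not distinguish fenced blocks from indented ones.

```javascript
// Hypothetical helper: depth-first walk over a unist-style tree,
// collecting every node whose type is 'code'.
function collectCodeNodes(node, found = []) {
  if (node.type === 'code') {
    found.push(node)
  }
  // Parent nodes in unist carry their children in a `children` array.
  for (const child of node.children || []) {
    collectCodeNodes(child, found)
  }
  return found
}

// A hand-written tree, roughly what a Markdown parser might produce for a
// heading, a list item containing a code block, and a top-level code block.
const tree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Usage' }] },
    {
      type: 'list',
      children: [
        {
          type: 'listItem',
          children: [{ type: 'code', lang: 'js', value: 'console.log(1)' }]
        }
      ]
    },
    { type: 'code', lang: null, value: 'plain text' }
  ]
}

const codeNodes = collectCodeNodes(tree)
console.log(codeNodes.map(node => node.lang)) // [ 'js', null ]
```

Because the walk is recursive, code blocks nested inside lists or blockquotes are found in document order alongside top-level ones.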

Code block info

According to the CommonMark Spec:

The first word of the info string is typically used to specify the language of the code sample, and rendered in the class attribute of the code tag. However, this spec does not mandate any particular treatment of the info string.

In other words, CommonMark-compliant parsers should safely ignore everything after the language identifier. That's where we can attach additional key/value pairs, which are parsed as each code block's info, as in:

```html title="A dumb example" foo=bar "x.y.z"="1 2 3"
<h1>Hello, world!</h1>
```

When parsed with code-blocks, this would yield an array with one object:

[{
  type: 'code',
  lang: 'html',
  value: '<h1>Hello, world!</h1>',
  info: {
    title: 'A dumb example',
    foo: 'bar',
    'x.y.z': '1 2 3'
  },
  title: 'A dumb example',
  source: {
    file: 'README.md',
    line: 1
  },
  position: {
    // see https://github.com/syntax-tree/unist#position
  }
}]
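For a rough idea of how an info string like the one above can be split into a language and key/value pairs, consider the sketch below. `parseInfoString` is a hypothetical function, not the library's actual parser, and it assumes that keys and values are either bare words or double-quoted strings with no escaped quotes.

```javascript
// Hypothetical sketch: split an info string into a language identifier and
// key/value pairs. Keys and values may be bare words or "double-quoted".
function parseInfoString(infoString) {
  const trimmed = infoString.trim()
  // The first whitespace-delimited word is the language identifier.
  const firstSpace = trimmed.search(/\s/)
  const lang = firstSpace === -1 ? trimmed : trimmed.slice(0, firstSpace)
  const rest = firstSpace === -1 ? '' : trimmed.slice(firstSpace)

  // Each pair is key=value, where either side is quoted or a bare token.
  const pairPattern = /(?:"([^"]*)"|([^\s="]+))=(?:"([^"]*)"|([^\s"]+))/g
  const info = {}
  for (const match of rest.matchAll(pairPattern)) {
    const key = match[1] !== undefined ? match[1] : match[2]
    const value = match[3] !== undefined ? match[3] : match[4]
    info[key] = value
  }
  return { lang: lang || null, info }
}

const parsed = parseInfoString('html title="A dumb example" foo=bar "x.y.z"="1 2 3"')
console.log(parsed.lang)          // html
console.log(parsed.info.title)    // A dumb example
console.log(parsed.info['x.y.z']) // 1 2 3
```

Anything in the remainder that is not a `key=value` pair is simply skipped in this sketch; a real parser may treat such tokens differently.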

Node properties

The unist node objects returned by all of the block parsing functions are "enhanced" with the following properties:

  • lang contains only the first "word" of the info string
  • info is an object of key/value pairs parsed from the remainder of the info string
  • title is the title of the code block, as determined by the algorithm described under "Code block titles"
  • source is an object with two keys:
    • file is the path provided as the first argument to fromFile() and fromFileSync(), or as the last argument to fromString() or fromAST(). (If no file is provided, this value will be buffer.)
    • line is the line on which the code block starts in the Markdown input.

See the mdast documentation for more info about the Code nodes generated by remark, and the unist documentation for more on the underlying structures.
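As a small illustration of consuming these properties, the sketch below groups blocks by `lang` and labels each one with its `title` and `source`. The sample nodes are made up to match the documented shape, and `labelsByLang` is a hypothetical helper rather than part of the library.

```javascript
// Sample nodes shaped like the documented output; values are illustrative.
const blocks = [
  {
    type: 'code',
    lang: 'html',
    value: '<h1>Hello, world!</h1>',
    info: { title: 'A dumb example', foo: 'bar' },
    title: 'A dumb example',
    source: { file: 'README.md', line: 1 }
  },
  {
    type: 'code',
    lang: 'js',
    value: 'console.log("hi")',
    info: {},
    title: 'Usage',
    source: { file: 'README.md', line: 9 }
  },
  {
    type: 'code',
    lang: 'js',
    value: 'console.log("bye")',
    info: {},
    title: 'Usage (2)',
    source: { file: 'README.md', line: 14 }
  }
]

// Hypothetical helper: list "title (file:line)" labels per language.
function labelsByLang(blocks) {
  const groups = {}
  for (const block of blocks) {
    const label = `${block.title} (${block.source.file}:${block.source.line})`
    if (!groups[block.lang]) groups[block.lang] = []
    groups[block.lang].push(label)
  }
  return groups
}

const labels = labelsByLang(blocks)
console.log(labels.js) // [ 'Usage (README.md:9)', 'Usage (2) (README.md:14)' ]
```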

Code block titles

Because code blocks are often meaningless without at least some context, every block parsed gets a title according to the following algorithm:

  1. If a title key is found in the block's info object, use that.

  2. Otherwise, find the previous heading in the Markdown and use its text.

    1. If two or more code blocks share the same heading, add a numeric suffix: (2) for the second, (3) for the third, and so on.

  3. If no previous heading is found, provide a title that describes where the block comes from, in the form:

    Code block {n} from {filename}:{line}

    where {n} is the 1-based index of the code block in the parsed file, {filename} is the parsed file (or buffer), and {line} is the line at which the code block starts.
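The three rules can be sketched roughly as follows. This is not the library's implementation: `assignTitles` and the simplified node shapes are hypothetical, and the sketch assumes that only blocks titled from a heading count toward that heading's numeric suffix (blocks with an explicit title are skipped in the count).

```javascript
// Hypothetical sketch of the title rules. `nodes` is a flat, document-ordered
// list of simplified nodes: headings ({ type: 'heading', text }) and code
// blocks ({ type: 'code', info, line }).
function assignTitles(nodes, filename = 'buffer') {
  let heading = null    // text of the most recent heading, if any
  let usesOfHeading = 0 // code blocks titled from that heading so far
  let blockIndex = 0    // 1-based index of the code block in the file
  const titled = []

  for (const node of nodes) {
    if (node.type === 'heading') {
      heading = node.text
      usesOfHeading = 0
      continue
    }
    if (node.type !== 'code') continue

    blockIndex += 1
    let title
    if (node.info && node.info.title) {
      // Rule 1: an explicit title in the info object wins.
      title = node.info.title
    } else if (heading !== null) {
      // Rule 2: fall back to the previous heading, with a numeric suffix
      // for the second and later blocks under the same heading.
      usesOfHeading += 1
      title = usesOfHeading === 1 ? heading : `${heading} (${usesOfHeading})`
    } else {
      // Rule 3: no heading yet, so describe where the block comes from.
      title = `Code block ${blockIndex} from ${filename}:${node.line}`
    }
    titled.push({ ...node, title })
  }
  return titled
}

const titles = assignTitles([
  { type: 'code', info: {}, line: 1 },
  { type: 'heading', text: 'Usage' },
  { type: 'code', info: {}, line: 7 },
  { type: 'code', info: { title: 'Custom' }, line: 12 },
  { type: 'code', info: {}, line: 17 }
], 'README.md').map(node => node.title)

// titles, in order:
//   'Code block 1 from README.md:1'
//   'Usage'
//   'Custom'
//   'Usage (2)'
console.log(titles)
```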
