-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH pandoc parser minor improvements and support older pandoc #284
ENH pandoc parser minor improvements and support older pandoc #284
Conversation
Tested and works with the following pandoc versions: - 2.0 - 2.2 - 2.5 - 2.19 - 3.0 Main change was with (afaict) pandoc 2.5, which changed handling of tables. Now there are two table implementations, old and new, which differ slightly. Minimum pandoc version has been decreased to 2.0.
Ready for review @skops-dev/maintainers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
@@ -63,7 +63,7 @@ jobs: | |||
else pip install "scikit-learn~=${{ matrix.sklearn_version }}"; | |||
fi | |||
if [ ${{ matrix.os }} == "ubuntu-latest" ]; | |||
then wget -q https://github.com/jgm/pandoc/releases/download/2.19.2/pandoc-2.19.2-1-amd64.deb && sudo dpkg -i pandoc-2.19.2-1-amd64.deb; | |||
then sudo apt install pandoc && pandoc --version; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woohoo :D
skops/card/_markup.py
Outdated
@@ -185,7 +189,16 @@ def _code(item: tuple[Any, str]) -> str: | |||
_, txt = item | |||
return f"`{txt}`" | |||
|
|||
def _table_cols(self, items) -> list[str]: | |||
def _table_cols_old(self, items) -> list[str]: # pragma: no cover | |||
# pandoc < 2.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any idea when we can drop support for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess as soon as apt updates to a more recent pandoc version. I have no idea when that will be.
|
||
columns = self._table_cols(thead_body) | ||
|
||
columns = self._table_cols_new(thead_body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected these lines to be covered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the table change actually happens after 2.9, not 2.5 as I first suspected. Therefore, given that we now install pandoc via apt on CI, which uses 2.9, the old table implementation is covered, not the new one. I thus moved around the pragmas to the new implementation.
So the table change actually happens after 2.9, not 2.5 as I first expected. Therefore, given that we now install pandoc via apt on CI, which uses 2.9, the old table implementation is covered, not the new one. I thus moved around the pragmas to the new implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we basically tested the new stuff, and now also the old stuff. This is good to go. Thanks @BenjaminBossan
Description
Add support for
LineBreak
itemThis one was a bit obscure, it requires the model card to contain a line break with trailing whitespace to be triggered.
Add support for older pandoc versions
This one came up when working on a HF space. There, non-Python dependencies can be installed, but only as debian packages via
apt
. However, theapt
repo only has an old pandoc version, which doesn't work with the current implementation.The solutions are either to support older pandoc versions or to use a Docker space, which seems like overkill. Fortunately, supporting older pandoc versions wasn't as hard as expected. The main change was introduced with (afaict) pandoc 2.5, which changed handling of tables. Now there are two table implementations, old and new, which differ slightly.
Tested and works with the following pandoc versions:
Minimum pandoc version has been decreased to 2.0. CI has been adopted to install pandoc via apt, resulting in version 2.9.2.1.