Curated list of languages #168

Open
vors opened this issue May 29, 2018 · 11 comments

Comments

@vors
Contributor

vors commented May 29, 2018

Extracting from #113 into its own issue.

Many people have expressed a desire to add support for new languages.
The current story is unclear because the list of syntaxes comes from https://github.com/sublimehq/Packages/ and its README says:

Pull requests for new packages won't be accepted at this stage, as new packages can cause issues for users who have a package with the same name installed via Package Control. There are some planned changes that will address this in the future.

Decoupling the list of supported languages from sublimehq/Packages would allow us to move forward.
List of already discussed options:

In #113 (comment) @trishume said

If someone's willing to do the work to automatically convert all those tmBundles to sublime-syntaxes and add a cargo feature to bundle the extra syntaxes, I'm probably willing to accept that. Might need to set up Git LFS for the packdumps for that so they don't bloat the Git repo too much, but that's fine.

I'm totally okay with some curation though, especially since we can use sublime-syntaxes. I bet there are some higher quality syntaxes for certain languages than the ones GitHub uses.

I'm opening this issue in the hope of continuing this conversation and starting some work.

@sharkdp
Contributor

sharkdp commented May 30, 2018

I would definitely be interested in this as well. I currently also maintain a (small) list of syntaxes for bat: bat/assets/syntaxes.

I think a separate repository might be useful such that the syntect repository will not be polluted with issues/PRs for new syntaxes(?).

Concerning linguist, note that a lot of these are atom syntaxes. Still, this could be a useful resource.

Regarding logistics, I think Git submodules are a great way of bundling different repositories since we keep the link to the upstream source. Unfortunately, a lot of Sublime Syntax repositories do not contain a .sublime-syntax file. Ideally, I think, the .tmBundle => .sublime-syntax conversion should be done by some script when the syntax-bundle repository is "built".
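
As a rough illustration of the "convert at build time" idea above, a build step first has to work out which grammars still lack a .sublime-syntax file. The sketch below only does that detection on a list of paths; the function names are made up for this example, the check is limited to .tmLanguage files, and the actual conversion is deliberately left out.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical helper: true if `path` ends in the given extension.
fn has_extension(path: &Path, ext: &str) -> bool {
    path.extension().and_then(|e| e.to_str()) == Some(ext)
}

/// Returns the `.tmLanguage` files that have no `.sublime-syntax` sibling,
/// i.e. the grammars a build step would still need to convert.
fn needs_conversion(files: &[PathBuf]) -> Vec<PathBuf> {
    // Paths (minus extension) that already have a .sublime-syntax file.
    let converted: HashSet<PathBuf> = files
        .iter()
        .filter(|p| has_extension(p, "sublime-syntax"))
        .map(|p| p.with_extension(""))
        .collect();

    files
        .iter()
        .filter(|p| has_extension(p, "tmLanguage"))
        .filter(|p| !converted.contains(&p.with_extension("")))
        .cloned()
        .collect()
}

fn main() {
    // Example paths only; a real build step would walk the submodule directories.
    let files = vec![
        PathBuf::from("Swift/Swift.tmLanguage"),
        PathBuf::from("Zig/Zig.tmLanguage"),
        PathBuf::from("Zig/Zig.sublime-syntax"),
    ];
    for path in needs_conversion(&files) {
        println!("would convert: {}", path.display());
    }
}
```

Keeping the detection as a pure function over paths makes it easy to test without touching the file system.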

@sharkdp
Contributor

sharkdp commented May 30, 2018

Also, I just found this fork of sublimehq/Packages which is used by syntect_server. It has a lot more syntaxes than the default Packages repo.

@emidoots
Contributor

emidoots commented Jun 1, 2018

Just saw this. I am indeed also interested in how we could make adding more language syntaxes a more official process (there was some prior discussion of this at sourcegraph/syntect_server#3 but the issue is a bit stale unfortunately).

I saw you found my fork of sublimehq/Packages @sharkdp :)

In that repo all I've really done is taken existing .tmLanguage files and converted them to .sublime-syntax via ST3's builtin command (and added some SOURCE and VERSION files to keep track of where they came from), e.g. https://github.com/slimsag/Packages/tree/master/Swift

But one thing to consider is: how can we also override what is provided by sublimehq/Packages? For example, for JavaScript/JSX files we've found https://github.com/babel/babel-sublime to be superior to the built-in JavaScript syntax from sublimehq/Packages, but using it requires either replacing the built-in syntax or having some kind of package disablement feature like ST3 has.
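
To make the override question concrete, here is a minimal sketch of one possible resolution rule: entries are keyed by name, later sources replace earlier ones, and a disable list removes names entirely. The `SyntaxSource` type and the `resolve` function are hypothetical and not part of syntect or ST3; this only illustrates the bookkeeping, not the loading of actual syntax files.

```rust
use std::collections::BTreeMap;

/// Hypothetical record describing where a syntax definition came from.
#[derive(Debug, Clone, PartialEq)]
struct SyntaxSource {
    name: String,   // e.g. "JavaScript"
    origin: String, // e.g. "sublimehq/Packages"
}

impl SyntaxSource {
    fn new(name: &str, origin: &str) -> Self {
        Self { name: name.to_string(), origin: origin.to_string() }
    }
}

/// Merges `base` with `overrides` (an override with the same name wins),
/// then drops every name listed in `disabled`. Output is sorted by name.
fn resolve(
    base: &[SyntaxSource],
    overrides: &[SyntaxSource],
    disabled: &[&str],
) -> Vec<SyntaxSource> {
    let mut by_name: BTreeMap<String, SyntaxSource> = BTreeMap::new();
    // Overrides are inserted after the base entries, so they replace them.
    for src in base.iter().chain(overrides) {
        by_name.insert(src.name.clone(), src.clone());
    }
    by_name
        .into_values()
        .filter(|s| !disabled.contains(&s.name.as_str()))
        .collect()
}

fn main() {
    let base = [
        SyntaxSource::new("JavaScript", "sublimehq/Packages"),
        SyntaxSource::new("Rust", "sublimehq/Packages"),
    ];
    let overrides = [SyntaxSource::new("JavaScript", "babel/babel-sublime")];
    for s in resolve(&base, &overrides, &[]) {
        println!("{} <- {}", s.name, s.origin);
    }
}
```

Whether replacement by name is the right rule is an open design choice; it is simply the smallest rule that covers both the "replace" and the "disable" cases mentioned above.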

@chriskrycho

I'm late to this discussion, but have reason to be looking for just such a thing right now, and it seems like a shared repo which we can all contribute to (combined with a relatively simple config file and build.rs to allow subsetting?) would be a better long-term solution than everyone maintaining their own forks and repositories.

If folks are interested, and @Keats is up for it, the work they've done with Zola (formerly Gutenberg) seems like it's pretty ready-made for this as a well-set-up starting point.

Thoughts?

@trishume
Owner

Yup that sounds reasonable. Would be nice if it was just a crate that people could include that depended on syntect and just provided functions kind of like the existing ones for loading default syntaxes. Maybe with Cargo features to disable really obscure languages or something.

If someone makes such a crate/repo I will link to it prominently in the readme and docs.
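
For what it's worth, the feature-gating part of that idea could look roughly like the sketch below. Everything in it is an assumption for illustration: the `extra-languages` feature name, the constant names, and the placeholder language lists. A real crate would return syntax data rather than names and would declare the feature in its Cargo.toml.

```rust
/// Names of the syntaxes that are always bundled (placeholder values).
const CORE: &[&str] = &["Rust", "Python", "JavaScript"];

/// Extra syntaxes, compiled in only when the hypothetical
/// `extra-languages` Cargo feature is enabled.
#[cfg(feature = "extra-languages")]
const EXTRA: &[&str] = &["Zig", "Elixir"];

/// With the feature disabled, the extra list is empty and costs nothing.
#[cfg(not(feature = "extra-languages"))]
const EXTRA: &[&str] = &[];

/// Lists every syntax name bundled into this build: core first, then extras.
fn bundled_syntaxes() -> Vec<&'static str> {
    CORE.iter().chain(EXTRA).copied().collect()
}

fn main() {
    for name in bundled_syntaxes() {
        println!("{name}");
    }
}
```

Compiled without the feature, only the core list is present, so applications that do not need the extra languages do not pay for them in binary size.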

@daurnimator

How would I go about getting mdcat to highlight zig files? https://github.com/ziglang/sublime-zig-language/

@trishume
Owner

@daurnimator probably mdcat would need to use a syntax dump created from the list of syntaxes of bat, zola or syntect_server. Then just add the Zig language file to the repo of syntaxes you're using.

@jrappen
Contributor

jrappen commented Jan 5, 2021

Current blockers by syntect using sublimehq/Packages

Currently, syntect neither supports branching in ST4050+ (compare #271) nor v2 related changes of ST4075+.

Compare the new docs at:

Recent changes at sublimehq/Packages

Major re-writes for ST4xxx+:

Missing languages at sublimehq/Packages

SublimeHQ would like to have support for Swift. Old related PRs that have stalled:

There are currently no plans to accept languages other than Swift to https://github.com/sublimehq/Packages as far as I know.

@Canop
Contributor

Canop commented Jan 7, 2022

This looks like something the syntect project really needs.
Broot deals with the problem by using the work done by the bat project, which looks like a very good starting point. An alternative would still be convenient for everybody, and it shouldn't be based on git submodules, which don't really make sense as a packaging solution.
A crate with features looks like a good solution.

@Nezteb

Nezteb commented Mar 29, 2023

Have there been any decisions or progress made on this in the last year? I'm particularly interested in Elixir support (a la #134).

If there've been no updates, that's totally okay. I just wanted to check! 😄

@CosmicHorrorDev
Contributor

👋 Shameless plug: I maintain the two-face crate, which bundles bat's syntaxes/themes/acknowledgements and provides them through the following functions (syntect's loader on the left, the two-face counterpart on the right):

  • syntect::parsing::SyntaxSet::load_defaults_newlines() -> two_face::syntax::extra_newlines()
  • syntect::parsing::SyntaxSet::load_defaults_nonewlines() -> two_face::syntax::extra_no_newlines()
  • syntect::highlighting::ThemeSet::load_defaults() -> two_face::theme::extra()

There are some small extras included like acknowledgements for the embedded assets, something akin to bat's LazyThemeSet, and an enum enumerating all of the embedded themes

Re: feature flags

The only feature flags I opted for are for toggling on syntect's default features [1] or for indicating the regex backend [2].

I did investigate splitting up the set of languages further, but the asset handling starts becoming much more complex, and the largest languages are all decently popular. Instead I've been focusing on trying to reduce the size of embedded syntaxes within syntect itself

New syntax/theme inclusion

For the time being I'm only including the assets that bat contains, but it should be possible to extend things to include assets from other sources too. I don't personally want to start another source of assets, but being an aggregate of a handful of sources seems reasonable enough.

If people want to ship a slimmer set for their application then you can always use a build script or some similar mechanism to build a minimal set containing only the syntaxes/themes you need

Footnotes

  1. Out of convenience as syntect is provided as a re-export

  2. Corresponding to either syntect/regex-onig or syntect/regex-fancy. It's very important to correctly match the regex backend to keep highlighting from panicking internally since fancy-regex doesn't support all of the features used in the set. See two-face's README for more details
