Add an option to map an extension to a language #67

Open
xpayn opened this issue Sep 21, 2016 · 23 comments

@xpayn
Contributor

xpayn commented Sep 21, 2016

It would be very convenient to have such an option when an extension is unknown for a given language or when an extension is ambiguous (e.g. .cgi, .inc).
It could be used to override a default mapping, or even discard a mapping. For instance, I have a .pro file which is a Qt Creator project and not a Prolog file. But maybe in the latter case it would be cleaner to have a dedicated option to ignore a given extension.
I'll try to submit a PR, but any guidance would be greatly appreciated :)

@xpayn
Contributor Author

xpayn commented Sep 21, 2016

In order to check whether a language exists, I thought about impl<'a> From<&'a str> for LanguageType, but From isn't supposed to fail...
I thought about implementing TryFrom, but it's tagged as unstable.
What would be a reasonable solution?
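
One stable option for a fallible string-to-language conversion is the standard `FromStr` trait, which returns a `Result`. The sketch below is only an illustration: the `LanguageType` variants shown are a hypothetical subset, and `UnknownLanguage` is an assumed error type, not something from tokei.

```rust
use std::str::FromStr;

// Hypothetical subset of a language enum, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LanguageType {
    Lisp,
    Perl,
    Prolog,
    Rust,
}

// Assumed error type: carries the name that matched no known language.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownLanguage(pub String);

impl FromStr for LanguageType {
    type Err = UnknownLanguage;

    // Case-insensitive lookup; unlike `From`, this conversion may fail.
    fn from_str(name: &str) -> Result<Self, Self::Err> {
        match name.to_ascii_lowercase().as_str() {
            "lisp" => Ok(LanguageType::Lisp),
            "perl" => Ok(LanguageType::Perl),
            "prolog" => Ok(LanguageType::Prolog),
            "rust" => Ok(LanguageType::Rust),
            _ => Err(UnknownLanguage(name.to_string())),
        }
    }
}
```

With this in place, `"Perl".parse::<LanguageType>()` yields `Ok(LanguageType::Perl)`, and an unknown name yields an `Err` that the caller can report.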

@XAMPPRocky
Owner

I have thought about this feature for a few months. I don't think it should be done through flags, because then you'd have to remember to set the correct flags every time. Instead, I think it should work like .gitignore: tokei reads a .tokeirc and applies it recursively, unless a subfolder has its own .tokeirc, in which case that file applies to that subfolder and everything below it, and so on.
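
The lookup rule described here (the nearest .tokeirc above a file wins) could be sketched roughly as follows. This is an assumption-laden illustration, not tokei's implementation: `nearest_rc_dir` is a hypothetical helper, and the set of directories containing a .tokeirc is passed in rather than read from disk so that the logic stays easy to test.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

// Hypothetical helper: walks upward from `file` and returns the closest
// directory listed in `rc_dirs` (directories assumed to hold a `.tokeirc`).
pub fn nearest_rc_dir(file: &Path, rc_dirs: &HashSet<PathBuf>) -> Option<PathBuf> {
    file.ancestors()
        .skip(1) // skip the file itself; start at its parent directory
        .find(|dir| rc_dirs.contains(*dir))
        .map(Path::to_path_buf)
}
```

A file under `/repo/vendor/lib` would pick up `/repo/vendor` if that directory has its own rc file, and fall back to `/repo` otherwise, which mirrors the .gitignore-style scoping.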

@xpayn
Contributor Author

xpayn commented Sep 21, 2016

I agree that an rc file would be really nice, but being able to use this feature on the CLI can be useful when you want something quick.
For recurring use of tokei it's definitely not enough.
As support for an rc file isn't available yet, I think it won't hurt to have something like

tokei --map foo=Perl -m bar=Lisp

What do you think?
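
For illustration, parsing one `EXT=LANGUAGE` argument of the proposed form might look like the sketch below. The `--map` flag is only the proposal from this comment, and `parse_mapping` is a hypothetical helper; validating the language name against the known languages is left out.

```rust
// Hypothetical helper: splits one `EXT=LANGUAGE` argument into its two halves.
// Leading dots on the extension are dropped, so `.foo=Perl` and `foo=Perl`
// are treated the same; the extension is lowercased for lookup.
pub fn parse_mapping(arg: &str) -> Result<(String, String), String> {
    let (ext, lang) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected EXT=LANGUAGE, got `{}`", arg))?;
    let ext = ext.trim().trim_start_matches('.');
    let lang = lang.trim();
    if ext.is_empty() || lang.is_empty() {
        return Err(format!("empty extension or language in `{}`", arg));
    }
    Ok((ext.to_lowercase(), lang.to_string()))
}
```

Returning a `Result` lets the CLI reject malformed arguments such as `foo` or `=Perl` with a clear message instead of silently ignoring them.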

@XAMPPRocky
Owner

@xpayn Well, I think this should be implemented on the road to rc, since I don't want to write an implementation that will have to be rewritten soon. rc is the next feature I want to implement.

@xpayn
Contributor Author

xpayn commented Sep 22, 2016

ok, seems fair to me.
i'll try to keep an eye on the project and help if i can.

@XAMPPRocky
Owner

@xpayn Please do. And of course if there are any other problems or feature requests please do make an issue for them too!

@xpayn
Contributor Author

xpayn commented Sep 26, 2016

@Aaronepower Maybe you can have a look at what @BurntSushi did for managing extensions and file types in ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/src/types.rs
He already handles the mapping: rg --type-add 'foo:*.foo,*.foobar'
Maybe a common crate for both projects could be considered.

@BurntSushi

FYI, I'm working on splitting out the ignore/gitignore/filetype logic from ripgrep into a separate crate. It should be done in the next couple weeks.

@remexre

remexre commented Dec 4, 2016

A heads-up for contributors, looks like the .*ignore crate split for ripgrep got finished a while back:

https://crates.io/crates/ignore
https://github.com/BurntSushi/ripgrep/tree/master/ignore

@XAMPPRocky
Owner

XAMPPRocky commented Dec 4, 2016

@remexre Tokei has already integrated the ignore crate. The filetype functionality isn't being used yet however. I'm still trying to figure out the best solution.

@remexre

remexre commented Dec 4, 2016

Forgive my ignorance, but what exactly is the problem that needs to be solved? ignore's types module vs tokei's? Or is there something else?

@XAMPPRocky
Owner

@remexre Having the same functionality as .gitignore, but as it relates to mapping extensions against languages.

@remexre

remexre commented Dec 4, 2016

Doesn't ignore have that functionality in https://docs.rs/ignore/0.1.5/ignore/types/index.html ?

@XAMPPRocky
Owner

@remexre ignore's types aren't 1 to 1 with tokei's. Those are for ignoring file types, whereas tokei wants to map them to a different language.

@BurntSushi

BurntSushi commented Dec 11, 2016 via email

@ghost

ghost commented Dec 22, 2016

A modeline (vim) or file-local-vars (emacs) at the beginning or end of a file are also often used to enable a certain language mode and can be used for better accuracy. There's also the problem that languages have conventions for certain files which are not merely the extension but the whole filename. For example, in Erlang you have <app_name>.app, sys.config, <app_name>.rel, etc. which (with or without a modeline) can be considered to be Erlang syntax by mapping filename patterns to languages in .tokeirc. Of course, if one edits such files regularly, it's either mapped in your shared editor config or via a modeline in the file, so we cannot assume either one or the other, and thus both .tokeirc and the possibility to look for ex: ft=rust or -*- mode:rust -*- would be useful in order to avoid the need to set up projects for tokei.

@XAMPPRocky
Owner

@Tuncer I'm not really familiar with modeline, or file-local-vars, could you provide a few examples?

@ghost

ghost commented Dec 24, 2016

The relevance for tokei is that it's common to select a filetype (vim) or mode (emacs) via a file local var and modeline in files that do not have a uniquely mappable file extension (main.c) or unique name (Makefile). This can be used to detect a file's type and hence Tokei language.
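
As a rough illustration of the idea, the sketch below pulls a language hint out of a single line containing either a Vim-style modeline (`ft=` or `filetype=`) or an Emacs-style `-*- mode: ... -*-` header. It is deliberately simplified and does not implement the full modeline grammar of either editor; `editor_hint` and `value_after` are hypothetical names, and mapping the returned hint to a tokei language is not shown.

```rust
// Hypothetical helper: returns the value following `key` on a line,
// cut at the first whitespace, ':' or ';'.
fn value_after<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let start = line.find(key)? + key.len();
    let rest = line[start..].trim_start();
    let end = rest
        .find(|c: char| c.is_whitespace() || c == ':' || c == ';')
        .unwrap_or(rest.len());
    let value = &rest[..end];
    if value.is_empty() { None } else { Some(value) }
}

// Simplified detection; real editors apply stricter rules than this.
pub fn editor_hint(line: &str) -> Option<&str> {
    // Emacs: the hint sits between a pair of `-*-` markers.
    if let Some(open) = line.find("-*-") {
        let inner = &line[open + 3..];
        if let Some(close) = inner.find("-*-") {
            return value_after(&inner[..close], "mode:");
        }
    }
    // Vim: only consider lines that carry a `vim:` or `ex:` prefix.
    if line.contains("vim:") || line.contains("ex:") {
        return value_after(line, "filetype=").or_else(|| value_after(line, "ft="));
    }
    None
}
```

A caller would typically run this over only the first and last few lines of a file, since that is where such hints are conventionally placed.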

@Armavica

Would it be an option to apply heuristics on the files with ambiguous extensions to try and guess the language?

@XAMPPRocky
Owner

I'm going to close this issue, moving everything to #195 as I don't think I'm going to implement a solution that isn't a configuration file.

@LunarLambda

What about files with no extension? tokei completely ignores those currently.

For example my zsh configuration is made out of modular files with no extension (example: rc.d/aliases). Currently tokei returns 0 lines for the entire directory, including the plain-text README.

Would it be possible to specify a "fallback" language used if the language can't be determined, or be able to map full glob patterns or similar to languages, rather than only extensions?

@XAMPPRocky
Owner

> What about files with no extension? tokei completely ignores those currently.

It doesn't. tokei just currently requires a well-known file name before it will count them; you can see this for Dockerfiles, for example.

"filenames": ["dockerfile"],
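
To make the lookup order under discussion concrete, here is a minimal sketch that tries a well-known file name first, then the extension, then an optional fallback. The `LanguageMap` type and its fields are hypothetical, and the fallback is the feature requested above rather than something tokei is known to provide; lowercasing the file name before lookup is likewise an assumption based on the lowercase `"dockerfile"` entry.

```rust
use std::collections::HashMap;
use std::path::Path;

// Hypothetical lookup tables; values are language names.
pub struct LanguageMap {
    pub filenames: HashMap<String, String>,
    pub extensions: HashMap<String, String>,
    pub fallback: Option<String>, // assumed feature, not part of tokei
}

impl LanguageMap {
    // Resolution order: exact (lowercased) file name, then extension, then fallback.
    pub fn resolve(&self, path: &Path) -> Option<&str> {
        let name = path.file_name()?.to_str()?.to_lowercase();
        if let Some(lang) = self.filenames.get(&name) {
            return Some(lang.as_str());
        }
        let by_ext = path
            .extension()
            .and_then(|e| e.to_str())
            .and_then(|e| self.extensions.get(&e.to_lowercase()));
        by_ext.or(self.fallback.as_ref()).map(String::as_str)
    }
}
```

With `"dockerfile"` registered as a file name, `docker/Dockerfile` resolves without an extension, while an extensionless file such as `rc.d/aliases` only resolves if a fallback is configured.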
