YAML file for options (--opts my_opts.yaml) #622
If that's going to be a thing, might it be worthwhile to check for a standardised filename (probably hidden, something like .bdfr.yml) in the current dir or the base/destination dir and use those options? |
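A minimal sketch of the lookup suggested above, assuming the default name `.bdfr.yml` from that comment. The function name `find_default_opts` and the search order (current dir first, then destination dir) are hypothetical and not part of BDFR:

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical default name, taken from the suggestion in the comment above.
DEFAULT_NAME = ".bdfr.yml"


def find_default_opts(search_dirs: Iterable[Path], name: str = DEFAULT_NAME) -> Optional[Path]:
    """Return the first existing options file among search_dirs, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

It could be called as `find_default_opts([Path.cwd(), Path("./output")])`, so that a file in the current dir wins over one in the destination dir. Whether that order is the right one is a design choice the thread leaves open.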
@sinclairkosh Not sure if I understand the rationale. This file (unlike While there might be a default name (e.g. |
The rationale would be the same as, or similar to, the first two points of your general rationale: repeat downloads in specific locations with the same set of options. It was a suggestion, nothing more, nothing less. More user QOL than anything. |
Question from me (the developer): what are the advantages of this over a bash script? That is the method that I have been recommending (and personally use) for repetitive, identical commands. What are the advantages of this method over simple scripting? To me, it seems to add more points of possible failure and redundancy, as well as introduce more complexity, albeit minor, in the form of the YAML file and format. |
@Serene-Arc The answer to that depends on who your audience is. Who do you, the developers, see as the users of this program? If, as I seem to recall from somewhere in the text, you're talking about using it for research (amongst other things), then someone using it for NLP or ML research might have no issues at all with a shell script, but another researcher using it for political/sociological etc. research might not be anywhere near as computer savvy or familiar with things such as shell scripts. Then it's a matter of whether you want to "support" your users with QOL features that make life easier for them, or whether it's a case of "well, it works for us, sort your own crap out". This is open source, so both paths are reasonable and generally acceptable, if not necessarily enjoyable when you're on the receiving end of "sort your own crap out" and don't know how :) There are, to be honest, likely no real technical advantages to this approach, although I'm sure if you thought about it hard enough you might come up with some. In the end, what going down this path will do is make life easier for a subset of those using the program, especially those with fewer programming-related skills. |
First of all, hello Piotr, one of the first contributors to the project :) I hope you are well. Our biggest concern about new features and additions is that they might bloat the program. We want to keep it as simple as possible. There were many other things which would make the life of the users "easier", but we had to refuse them solely because there were other ways to achieve them than implementing them in the source code. As far as I can see, this YAML option feature can be achieved as such:
# custom_downloader.sh
python -m bdfr downloader ./output \
--skip mp4, avi, mov \
--file_scheme "{UPVOTES}_{REDDITOR}_{POSTID}_{DATE}" \
--limit 10 \
--sort top \
--time all \
--no_dupes \
--subreddit EarthPorn+CityPorn
Syntax and options might be a little different, but to run this program you only need to execute:
> ./custom_downloader.sh
Instead of
Please correct me if I am missing something. However, I do think that not everyone is as computer literate as us. So there should be a guide/explanation on how to use bdfr in this kind of workflow, and this PR might be a documentation contribution instead. |
@sinclairkosh It is up to someone's workflow. If bash for everything is the one that works best for you, awesome. For me (well, I come from an ML background), while I can certainly do bash, overly long bash commands are neither pretty nor convenient. In fact, most ML and DS frameworks I know provide YAML (or JSON, TOML) configs.
@aliparlakci I understand your point. Still, I wrote the YAML part for myself. As:
|
Just to answer your questions:
I don't understand this part. You would still have different
This is also the same for .sh files compared to .yaml. You can create .sh scripts running other, small .sh scripts. .yaml does not introduce any advantage over .sh files.
Below syntax is also valid:
Just have some .sh file without the parameter and use it as such
However, I don't think this will make too much clutter in the codebase and it is not so strange that a CLI tool accepts a .yaml config and it does not seem to be against our design choices. Let us discuss it if you still think that the feature is needed. |
My position is that this feature is not needed. I don't see any difference in skill level between writing a bash script and writing a YAML file, and there is little difference in the actual format. That being said, I did specifically design the BDFR to accommodate things like this, which is why the Configuration class is how it is. I don't see any use for this, but there's no reason for it not to be implemented, as it should be a very low-maintenance module. However, I would request that any changes, and any code related to YAML processing, be moved to a separate module. I can do this refactoring myself if you prefer. |
I understand that you, @Serene-Arc and @aliparlakci, are skeptical about whether it offers any improvement. I wrote it for my personal use, as it fits my workflow/preferences better. I shared it as a PR because I believe some others would prefer this way. Since it does not change the CLI style, it would make no difference at all for anyone not intending to use YAML. And, as you @Serene-Arc noticed, it is designed to be a low-maintenance module. There is a general trend of moving towards exchangeable data formats used as configs. See e.g.:
Of course, bash scripts can do more. But following the rule of least power, YAML configs offer a more predictable way for many protocols than any executable (or potentially executable) files.
Not sure how you would like to refactor it, so it would be wonderful if you could take it from here. One note, which is not related to YAML per se. I used in a few places |
I added an option to load options from a YAML file. So, we can run
A file looks like
General rationale:
If more options are provided from the CLI (e.g. -L 20), they take priority over the file (--opts my_opts.yaml).
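The priority rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: `merge_options` is a hypothetical helper, the file options are assumed to be already parsed into a dict (for instance with PyYAML's `yaml.safe_load`), and `None` is assumed to mean "not given on the command line".

```python
from typing import Any, Dict, Optional


def merge_options(file_opts: Dict[str, Any], cli_opts: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Combine options so that values given on the CLI override the file."""
    merged = dict(file_opts)  # start from the values read from the file
    for key, value in cli_opts.items():
        if value is not None:  # None is assumed to mean "not given on the CLI"
            merged[key] = value
    return merged


# Hypothetical values: the file sets limit and sort, the CLI only passes -L 20.
file_opts = {"limit": 10, "sort": "top"}
cli_opts = {"limit": 20, "sort": None}
print(merge_options(file_opts, cli_opts))  # {'limit': 20, 'sort': 'top'}
```

Using `None` as the "unset" marker keeps the merge simple, but it would need care for boolean flags whose default is `False` rather than `None`.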