Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to read IDs from files #482

Merged
merged 2 commits into from
Sep 12, 2021
Merged

Add ability to read IDs from files #482

merged 2 commits into from
Sep 12, 2021

Conversation

Serene-Arc
Copy link
Owner

Implements #481 which people have been asking for for a while.

Copy link
Collaborator

@aliparlakci aliparlakci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this feature since it is same as running multiple commands back to back. Maybe we can make this specific to Reddit GDPR file templates.

@Serene-Arc
Copy link
Owner Author

Kind of, this makes the BDFR more efficient over large series of IDs. There's a limit to how many you can fit on the command line before the OS errors out, and then you have to repeat the setup over and over again. That adds up over time. This makes it so that you can find IDs whichever way and then use it in a single command. Also makes it easier to use a variety of Bash tools and scripts like uniq and diff to sort and do more complex things with the lists.

The one thing I think we shouldn't do is tie it to the GPDR templates. That's narrowing the focus of a broadly useful feature to a single use case when it should be trivial to extract the data from the CSV in the first place. I feel that the BDFR should be as data-agnostic as possible. It shouldn't matter where we got the list of IDs, just that there is a list. Then the BDFR can download them after whatever preprocessing the user desires has been done.

@Serene-Arc Serene-Arc closed this Jul 17, 2021
@Serene-Arc Serene-Arc deleted the branch Serene-Arc:development July 17, 2021 11:23
@Serene-Arc Serene-Arc reopened this Jul 21, 2021
@Serene-Arc
Copy link
Owner Author

@aliparlakci are we good to merge this PR?

@@ -17,6 +17,7 @@
click.option('--authenticate', is_flag=True, default=None),
click.option('--config', type=str, default=None),
click.option('--disable-module', multiple=True, default=None, type=str),
click.option('--include-id-file', multiple=True, default=None),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--from-id-file more suitable

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current name is better, it provides an opposite to the option --exclude-id-file. Since the latter already exists, I think the same convention for the former would be less confusing.

@aliparlakci
Copy link
Collaborator

I have no other comment, it can be merged after the parameter name change.

@aliparlakci aliparlakci merged commit 483f179 into Serene-Arc:development Sep 12, 2021
@Serene-Arc Serene-Arc deleted the enhancement_481 branch September 12, 2021 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants