Add ability to read IDs from files #482

Serene-Arc · 2021-07-05T07:07:03Z

Implements #481, which people have been requesting for a while.

aliparlakci

I am not sure about this feature, since it is the same as running multiple commands back to back. Maybe we could make this specific to the Reddit GDPR file templates.

Serene-Arc · 2021-07-11T00:29:28Z

Kind of, but this makes the BDFR more efficient over large series of IDs. There's a limit to how many you can fit on the command line before the OS errors out, and then you have to repeat the setup over and over again, which adds up over time. With this, you can collect IDs however you like and then use them all in a single command. It also makes it easier to use Bash tools and scripts like uniq and diff to sort the lists and do more complex things with them.
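To illustrate the idea of feeding IDs from files rather than the command line, here is a minimal sketch of what reading such files might look like. It is not the code from this PR: the function name `read_id_files` and the one-ID-per-line file layout are assumptions made for the example.

```python
from pathlib import Path


def read_id_files(file_locations):
    """Collect IDs from text files, one ID per line (hypothetical helper)."""
    ids = []
    seen = set()
    for location in file_locations:
        id_file = Path(location).expanduser().resolve()
        if not id_file.exists():
            # Skip missing files rather than aborting the whole run.
            continue
        with id_file.open("r", encoding="utf-8") as handle:
            for line in handle:
                submission_id = line.strip()
                # Ignore blank lines and IDs that were already collected.
                if submission_id and submission_id not in seen:
                    seen.add(submission_id)
                    ids.append(submission_id)
    return ids
```

Because the IDs live in plain text files, the number of IDs is not bounded by the OS limit on command-line length, and the files can be pre-processed with any external tool before they are read.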

The one thing I think we shouldn't do is tie it to the GDPR templates. That narrows a broadly useful feature to a single use case, when it should be trivial to extract the data from the CSV in the first place. I feel that the BDFR should be as data-agnostic as possible. It shouldn't matter where we got the list of IDs, just that there is a list. Then the BDFR can download them after the user has done whatever preprocessing they want.
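As a rough illustration of the "extract the data from the CSV first" step, the sketch below copies one column of a CSV export into a plain text file with one ID per line. The function name `extract_ids_from_csv` and the column name `id` are assumptions for the example; the actual column layout of a GDPR export is not specified in this thread.

```python
import csv
from pathlib import Path


def extract_ids_from_csv(csv_path, output_path, column="id"):
    """Write one CSV column to a text file, one value per line (hypothetical helper)."""
    ids = []
    with Path(csv_path).open("r", encoding="utf-8", newline="") as source:
        for row in csv.DictReader(source):
            # Rows with a missing or empty value in the column are skipped.
            value = (row.get(column) or "").strip()
            if value:
                ids.append(value)
    Path(output_path).write_text("".join(f"{value}\n" for value in ids), encoding="utf-8")
    return len(ids)
```

A file produced this way could then be handed to the BDFR through the `--include-id-file` option, which keeps the downloader itself unaware of where the IDs came from.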

Serene-Arc · 2021-08-03T04:13:25Z

@aliparlakci are we good to merge this PR?

aliparlakci · 2021-08-03T11:43:24Z

bdfr/__main__.py

@@ -17,6 +17,7 @@
    click.option('--authenticate', is_flag=True, default=None),
    click.option('--config', type=str, default=None),
    click.option('--disable-module', multiple=True, default=None, type=str),
+    click.option('--include-id-file', multiple=True, default=None),


--from-id-file would be more suitable.

I think the current name is better, since it provides an opposite to the option --exclude-id-file. Because the latter already exists, following the same convention for the former would be less confusing.

aliparlakci · 2021-08-03T11:45:02Z

I have no other comments; it can be merged after the parameter name change.

Serene-Arc requested a review from aliparlakci July 5, 2021 07:07

aliparlakci requested changes Jul 10, 2021

Serene-Arc closed this Jul 17, 2021

Serene-Arc deleted the branch Serene-Arc:development July 17, 2021 11:23

Serene-Arc reopened this Jul 21, 2021

Serene-Arc added 2 commits July 21, 2021 17:32

Add ability to read IDs from files (1a4ff07)

Update README (7a1663d)

aliparlakci requested changes Aug 3, 2021

aliparlakci merged commit 483f179 into Serene-Arc:development Sep 12, 2021

Serene-Arc deleted the enhancement_481 branch September 12, 2021 23:29
