Add CLIs for managing workers. #11

rob-dalton · 2019-03-28T23:50:43Z

Added commands for:

Blocking workers.
Unblocking workers.
Sending notification messages to workers.
Associating qualifications with workers.
Disassociating qualifications from workers.

Added new utils file for worker management functions.

chaitanyamalaviya

Looks good. Can you also add descriptions for the new commands to the main README.md file?

chaitanyamalaviya · 2019-04-09T00:59:30Z

amti/clis/associate.py

+    worker_ids = list(ids)
+
+    # read ids from file (adds to provided ids)
+    if file is not None:


Seems like file might be more suitable as a default argument, as might some of the other arguments that can take the value None.

chaitanyamalaviya · 2019-04-09T01:00:30Z

amti/clis/associate.py

+    is_flag=True,
+    help='View the status of HITs from the live MTurk site.')
+def associate_qual(file, ids, qual, name, value, notify, live):
+    """Associate workers with a qualification.


Would be helpful if you could provide a more complete description of the args as well as what the method returns.

chaitanyamalaviya · 2019-04-09T01:01:11Z

amti/clis/associate.py

+def associate_qual(file, ids, qual, name, value, notify, live):
+    """Associate workers with a qualification.
+
+    Given a space seperated list of WorkerIds and/or a path to


"separated" :)

nalourie-ai2

Thanks so much for submitting this PR! It implements a few much-needed features.

I left some comments inline with the code, but something I'm generally not sure about is how to handle the UX differences between performing these actions with one worker ID versus with several. It seems useful to have batch and one-off modes because it's much more performant to batch the requests. Would it make sense to implement these in separate commands though? Also, in the batch mode, can things like the qual to assign or take away / the reason for a block be passed per worker by using a CSV format with those columns as well (instead of just worker IDs)?
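
To make the per-worker CSV idea concrete, here is a minimal sketch using only the standard library. The column names worker_id and reason, the delimiter parameter, and the helper name read_worker_rows are assumptions for illustration; none of them come from the PR.

```python
import csv
import io
from typing import Dict, Iterable, List


def read_worker_rows(lines: Iterable[str], delimiter: str = ',') -> List[Dict[str, str]]:
    """Parse per-worker rows from delimited text that starts with a header row.

    The 'worker_id' and 'reason' column names are hypothetical.
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    rows = []
    for row in reader:
        worker_id = (row.get('worker_id') or '').strip()
        if not worker_id:
            continue  # skip rows that carry no worker ID
        rows.append({
            'worker_id': worker_id,
            'reason': (row.get('reason') or '').strip(),
        })
    return rows


# An in-memory file stands in for a real CSV on disk.
example = io.StringIO(
    'worker_id,reason\n'
    'A1EXAMPLE,spam answers\n'
    'A2EXAMPLE,ignored instructions\n'
)
rows = read_worker_rows(example)
```

Each returned row could then drive one API call, so a different reason (or qualification) can be applied per worker within a single batch run.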

nalourie-ai2 · 2019-04-09T00:13:05Z

amti/clis/associate.py

+    is_flag=True,
+    help='View the status of HITs from the live MTurk site.')
+def associate_qual(file, ids, qual, name, value, notify, live):
+    """Associate workers with a qualification.


Documentation conventions for the code base are implicit at the moment, and should probably be put in a document somewhere, but the gist is that:

click commands and command groups

Doc strings should begin with 1 line summarizing the function of the script.

After the summary line, optionally several paragraphs can explain the arguments and usage of the script in more detail.

Since arguments are described in the main text, they don't need a separate section describing them.

Arguments should be referred to in all caps with underscores (matching how they'll appear in help output).

Options are documented with their own help strings and should not be referred to in the main text (unless there's a really good reason).

python functions and classes (that are not click commands)

First follow numpy doc string conventions.

Then follow pep257 (python doc string conventions).

Then pep8 (python style guide).

The click command doc strings here mostly just need the arguments described in the running text, the parameters sections removed, and the argument names put in all caps.

nalourie-ai2 · 2019-04-09T01:16:57Z

amti/utils/workers.py

+import csv
+from typing import List, Optional
+
+def create_batches(items: List, n=100) -> List:


create_batches might be easy to confuse with the create_batch command, which can be avoided by naming this function something like batch_list or chunk_list.

Similarly, for the n argument, naming it batch_size and having no default forces people to explicitly pass it and could make the loops much more readable. Since the right batch size varies a lot from situation to situation, it's probably worth forcing callers to explicitly consider it.

Lastly, since the function is pretty general it might be better to put it in a new module (amti.utils.misc for example).
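
A minimal sketch of what these suggestions could look like together, assuming the name chunk_list and a required batch_size argument. The NumPy-style docstring layout and the ValueError check are illustrative choices, not the code that was merged.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar('T')


def chunk_list(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield successive batches of at most ``batch_size`` items.

    Parameters
    ----------
    items : Sequence[T]
        The items to split into batches.
    batch_size : int
        The maximum number of items per batch. There is deliberately
        no default, so callers must choose a size.

    Returns
    -------
    Iterator[Sequence[T]]
        An iterator over consecutive slices of ``items``.
    """
    if batch_size < 1:
        raise ValueError('batch_size must be a positive integer')
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


batches = list(chunk_list(['w1', 'w2', 'w3', 'w4', 'w5'], batch_size=2))
```

With the keyword spelled out at the call site, a loop such as `for batch in chunk_list(worker_ids, batch_size=100)` states its batch size explicitly.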

nalourie-ai2 · 2019-04-09T01:20:34Z

amti/utils/workers.py

+from typing import List, Optional
+
+def create_batches(items: List, n=100) -> List:
+    """ Create generator that splits items into batches of size n. """


Please remove the leading and trailing space from the first line of these doc strings, and add Parameters and Returns sections (see this function for example).

nalourie-ai2 · 2019-04-09T01:31:53Z

amti/utils/workers.py

+    for i in range(0, len(items), n):
+        yield items[i:i + n]
+
+def read_workerids_from_file(file: click.Path) -> List:


It looks like the expected file format for this function is one worker id per line, possibly with a header. If that's the case, it might be simpler to just require these files not have headers and read the file directly:

    with open(worker_ids_path, 'r') as worker_ids_file:
        worker_ids = [ln.strip() for ln in worker_ids_file]

If it is important to read CSVs, I'd suggest having read_workerids_from_file take the file-like object as an argument. The advantage over passing around the path is that the CLI commands can then open the files using click.open_file, which supports things like using - to represent standard input (or standard output when writing). This feature enables amti to be used in unix pipelines.
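
As a rough sketch of the file-like variant, assuming one worker ID per line and no header: the helper name read_worker_ids and the choice to skip blank lines are assumptions made for illustration.

```python
import io
from typing import IO, List


def read_worker_ids(worker_ids_file: IO[str]) -> List[str]:
    """Read one worker ID per line from an already-open text file.

    Blank lines are skipped; no header row is expected.
    """
    worker_ids = []
    for ln in worker_ids_file:
        worker_id = ln.strip()
        if worker_id:
            worker_ids.append(worker_id)
    return worker_ids


# Any text file object works, including an in-memory one.
worker_ids = read_worker_ids(io.StringIO('A1EXAMPLE\nA2EXAMPLE\n\n'))
```

Because the function only iterates over an open file object, a command could hand it whatever click.open_file returns, or sys.stdin, without the helper needing to know where the IDs came from.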

nalourie-ai2 · 2019-04-09T01:33:56Z

amti/utils/workers.py

+
+    return worker_ids
+
+def get_qual_by_name(client: boto3.client, qual_name: str) -> Optional[dict]:


I think this utility can go in amti.utils.mturk.

rob-dalton · 2019-04-10T18:11:53Z

For UI changes - we could break out commands into groups. For example:

amti batch <command>
amti workers <command>

I think click allows you to group commands pretty easily.

rob-dalton · 2019-04-10T21:49:58Z

Also I can't think of a better way to handle single vs multiple WorkerIds. It's a bit clunky, but I feel like you should be able to handle them with the same command, since it's the same action.

This way, you can provide any number of WorkerIds on the command line (makes it easy to handle one or two workers). And if you want to do a large batch, you can leave the IDS arg empty and just provide a file path.

nalourie-ai2 · 2019-04-11T19:49:16Z

Sounds good to me.

I really like the idea of having amti batch and amti workers command groups, but I think we could handle that in a follow-up PR to keep things moving. Changing the UI like that might also justify a larger refactor of the code base, with subpackages for the different command groups.

Having one command per action makes sense to me UI-wise. We should handle the additional columns for other arguments in the case that it takes CSV input (like --reason for example). Similarly, we should allow the delimiter to be specified by the user, since at a minimum people will probably have both CSV and TSV files. Also, with two input modes, we should have more thorough validation, i.e.

if --file is passed, we should check that ids is None and --reason is not used.
if ids is passed, we should validate that --file is not passed and --reason is present.

Or something like that. This issue discusses implementing mutually exclusive options in click, though I think the simplest and probably best approach is to just put a little validation logic at the beginning of the command rather than trying to fit it into click's callbacks / parameter validation.
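
The checks above could be sketched as a small function called at the top of a command. This is only an illustration: the function name validate_input_mode is hypothetical, it assumes ids arrives as a possibly empty sequence, and it raises ValueError to stay free of dependencies (a click command might raise click.UsageError instead).

```python
from typing import Optional, Sequence


def validate_input_mode(
        ids: Sequence[str],
        file: Optional[str],
        reason: Optional[str]
) -> None:
    """Check that exactly one input mode (IDs or file) is in use.

    Raises ValueError with a user-facing message when the options
    are combined in an unsupported way.
    """
    if file is not None:
        # File mode: per-worker values come from the file's columns.
        if ids:
            raise ValueError(
                'Pass worker IDs on the command line or via --file, not both.')
        if reason is not None:
            raise ValueError(
                '--reason cannot be used with --file; put reasons in the file.')
    else:
        # Command-line mode: IDs and a shared reason are both required.
        if not ids:
            raise ValueError('Provide at least one worker ID or a --file.')
        if reason is None:
            raise ValueError(
                '--reason is required when worker IDs are passed directly.')


# A valid command-line-mode call passes silently.
validate_input_mode(ids=['A1EXAMPLE'], file=None, reason='spam answers')
```

Keeping the checks in one plain function keeps them out of click's callback machinery and makes them easy to unit test.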

nalourie-ai2

Thanks for the changes, let's get this merged 😄!

chaitanyamalaviya

LGTM!

rob-dalton added 5 commits March 27, 2019 16:41

Added block script.

9b262ee

Added block-workers, notify-workers commands.

8d4506c

Added unblock-workers

a361ae0

Added associate-qual

51d89e2

Fixed ordering of argument names

fceea61

rob-dalton requested review from csbhagav and nalourie-ai2 March 28, 2019 23:50

nalourie-ai2 requested review from chaitanyamalaviya and removed request for csbhagav April 2, 2019 05:22

chaitanyamalaviya suggested changes Apr 9, 2019


nalourie-ai2 suggested changes Apr 9, 2019


Updated comments, structure.

d71c86c

nalourie-ai2 previously approved these changes Apr 26, 2019


chaitanyamalaviya previously approved these changes Apr 26, 2019


Merge branch 'master' into add-worker-management

1419ff7

rob-dalton dismissed stale reviews from chaitanyamalaviya and nalourie-ai2 via 1419ff7 May 1, 2019 18:12

rob-dalton merged commit 158a9a1 into allenai:master May 1, 2019
