Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add examples telemetry #17552

Merged
merged 6 commits into from
Jun 7, 2022
Merged

Add examples telemetry #17552

merged 6 commits into from
Jun 7, 2022

Conversation

sgugger
Copy link
Collaborator

@sgugger sgugger commented Jun 3, 2022

What does this PR do?

This PR adds a function to send telemetry to help us track the examples usage and uses it in the current examples. For now, I've just added in the PyTorch run_glue.py, but will paste it in all other examples if you agree with the format/data tracked.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jun 3, 2022

The documentation is not available anymore as the PR was closed or merged.

Comment on lines 218 to 227
# Sending telemetry. Tracking the example usage helps us better allocate resources to maintain them. The
# information sent is the one passed as arguments along with your Python/PyTorch versions.
model_name = None if os.path.isdir(model_args.model_name_or_path) else model_args.model_name_or_path
if data_args.task_name is not None:
dataset_name = f"glue-{data_args.task_name}"
elif data_args.dataset_name is not None:
dataset_name = data_args.dataset_name
else:
dataset_name = None
send_example_telemetry("run_glue", model_name=model_name, dataset_name=dataset_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have as few lines of code as possible for the telemetry, so as to not trouble understanding. Here I understand all of that block is just for the send_example_telemetry thanks to the diff, but if I was reading the example in order to understand it I think I would think that model_name and dataset_name are re-used afterwards.

How about passing the model_args/data_args directly to the method? Or if you think there is too much information in there, how about defining a local method so that we understand the scope of these variables?

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's perfect! Thanks for iterating, @sgugger

@@ -399,6 +399,10 @@ def main():
else:
model_args, data_args, training_args = parser.parse_args_into_dataclasses()

# Sending telemetry. Tracking the example usage helps us better allocate resources to maintain them. The
# information sent is the one passed as arguments along with your Python/PyTorch versions.
send_example_telemetry("flax_run_summarization", model_args, data_args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this, I was wondering if we shouldn't add the framework in front of all examples: flax_run_summarization, pytorch_run_summarization, etc.

But I wonder if we can't do it better by doing it with pytorch/run_summarization -> would that lead to better analysis with kibana? In any case I think having the framework name is super useful, so I'd personally add it to all examples

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

@sgugger sgugger merged commit 3cab902 into main Jun 7, 2022
@sgugger sgugger deleted the telemetry branch June 7, 2022 15:57
@patrickvonplaten
Copy link
Contributor

Thanks a lot for working on this @sgugger - that's super useful!

elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Jun 16, 2022
* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants