Compute metric per image and handler mutex in DistributedDataParallel #1563
Hi @Nic-Ma, as far as I understand your question, it is about evaluating a pretrained model on images and outputting the metric per image. For a single process, IMO, this can be done like that:

```python
import torch

from ignite.engine import Engine, Events
from ignite.metrics import Accuracy, BatchWise

num_classes = 10
batch_size = 4

data = [
    {
        "x": torch.rand(batch_size, 3, 32, 32),
        "y": torch.randint(0, num_classes, size=(batch_size, 32, 32)),
        "filenames": [f"file_{i + 0 * batch_size}" for i in range(batch_size)]
    },
    {
        "x": torch.rand(batch_size, 3, 32, 32),
        "y": torch.randint(0, num_classes, size=(batch_size, 32, 32)),
        "filenames": [f"file_{i + 1 * batch_size}" for i in range(batch_size)]
    },
]


def infer_step(engine, batch):
    # dummy inference step: returns random predictions together with the targets
    x = batch["x"]
    y = batch["y"]
    return {
        "y_pred": torch.rand(batch_size, num_classes, 32, 32),
        "y": y,
    }


evaluator = Engine(infer_step)
acc_metric = Accuracy()


@evaluator.on(Events.ITERATION_COMPLETED)
def compute_metric_per_image(engine):
    print("compute_metric_per_image")
    output = engine.state.output
    batch = engine.state.batch
    assert len(output["y_pred"]) == len(output["y"])
    assert len(batch["filenames"]) == len(output["y"])
    evaluator.state.metrics = []
    # compute the metric image by image and keep the corresponding filename
    for y_pred, y, filename in zip(output["y_pred"], output["y"], batch["filenames"]):
        acc_metric.reset()
        acc_metric.update((y_pred.unsqueeze(0), y.unsqueeze(0)))
        o = {
            "filename": filename,
            "accuracy": acc_metric.compute(),
        }
        evaluator.state.metrics.append(o)


@evaluator.on(Events.ITERATION_COMPLETED)
def save_files():
    print("append data to CSV:", evaluator.state.metrics)


evaluator.run(data)
```
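To actually persist the rows instead of printing them, a handler like `save_files` could append to a file with the standard `csv` module. A minimal single-process sketch continuing the snippet above, assuming the `filename`/`accuracy` record format built in `compute_metric_per_image` (the file name `summary.csv` is just a placeholder):

```python
import csv


@evaluator.on(Events.ITERATION_COMPLETED)
def save_files_to_csv():
    # append one row per image using the records cached in evaluator.state.metrics
    with open("summary.csv", "a", newline="") as f:
        writer = csv.writer(f)
        for record in evaluator.state.metrics:
            writer.writerow([record["filename"], record["accuracy"]])
```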
Maybe, the idea is to gather all data from all participating processes to a single one and then use a barrier (…).
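A rough sketch of that gather-then-barrier idea, assuming `ignite.distributed` (`idist`) is used for the collective calls (`idist.all_gather` accepts strings, so each process can serialize its rows first); `gather_and_save` is a hypothetical helper written for this thread, not an existing ignite API:

```python
import csv
import io

import ignite.distributed as idist


def gather_and_save(records, path="summary.csv"):
    # each process serializes its per-image records into one CSV chunk
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for r in records:
        writer.writerow([r["filename"], r["accuracy"]])

    # gather the chunks from all processes, then let only rank 0 write the file
    gathered = idist.all_gather(buffer.getvalue())
    if idist.get_rank() == 0:
        with open(path, "a", newline="") as f:
            f.write("".join(gathered))

    # other processes wait here until rank 0 has finished writing
    idist.barrier()
```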
Hi @vfdev-5, thanks very much for your detailed example! Thanks.
Yes, that's true. As we need to compute the metric per image of the batch, we cannot currently simply attach the metric to an engine and compute it.

Yes, this makes sense and we could put such a handler into the contrib module!

Yes, this can also be a possible solution 👍
@sdesrozis maybe we can think about another …
Thanks, let me do more experiments tomorrow.
Let's think about it. From my side, I have a similar need, so I'm very interested in such a feature.
Hi @vfdev-5 and @sdesrozis, I think at least it's valuable to add … Thanks.
@Nic-Ma thanks for the idea. Which use-case do you have in mind for using the engine inside Metric.update/compute? Thanks
Hi @vfdev-5, I think context information is widely used in most system SW architectures, like a data bus for every component to produce/consume. Your … Thanks.
I see, thanks for explaining!

In my mind, update/compute methods are sort of independent of Engine etc. But we could store a reference in `attach`:

```python
def attach(self, engine: Engine, name: str, usage: Union[str, MetricUsage] = EpochWise()) -> None:
    self.engine = engine
    ...
```

thus, it can be accessible in all methods. We discussed that previously and it seems like it was not retained... @sdesrozis what do you think? On the other hand, out-of-the-box metrics do not use the engine.
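For illustration, a rough sketch of what storing `self.engine` in `attach` would enable; `PerImageRecorder` is a hypothetical custom metric written for this thread, not an existing ignite class, and it assumes the proposed `attach` change plus the batch dict layout (`filenames` key) from the example above:

```python
from ignite.metrics import Metric


class PerImageRecorder(Metric):
    # hypothetical metric: relies on `self.engine` being set in `attach` (as proposed above)
    # so that `update` can read the filenames of the current batch from the engine state
    def reset(self):
        self.records = []

    def update(self, output):
        y_pred, y = output
        filenames = self.engine.state.batch["filenames"]
        for p, t, name in zip(y_pred, y, filenames):
            # per-image pixel accuracy for a segmentation prediction of shape (C, H, W)
            acc = (p.argmax(dim=0) == t).float().mean().item()
            self.records.append({"filename": name, "accuracy": acc})

    def compute(self):
        return self.records
```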
I agree that having … In the case of …, I even think we could go further in the separation and adopt a composite pattern rather than inheritance. But that's irrelevant here. Keep a reference of …
@sdesrozis the problem here is that we already have an existing API for …
Yes, there would be collateral effects to control...
Hi @vfdev-5 and @sdesrozis, thanks for your discussion. … Thanks.
Sounds good! Thanks for the update @Nic-Ma
❓ Questions/Help/Support
Hi @vfdev-5,

I am writing an ignite handler to write the segmentation metrics of every image into 1 CSV file as the summary, for example:

The problems are that:

1. I want to use `metrics.update()` to cache every record and write to CSV in `metrics.complete()`, but ignite.metrics only accepts `output_transform`, so I can't extract the filenames from `engine.state.batch`.
2. With `engine.state.metrics`, it is not easy for a handler to get every metric value corresponding to every image.
3. For `DistributedDataParallel`, when I run the handler in multi-processing, how do you usually use a multi-processing lock to save content into 1 CSV on both unix and windows OS?

Thanks.
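On the unix/windows locking question, one common cross-platform approach (independent of ignite) is an OS-level file lock around the write. A minimal sketch, assuming the third-party `filelock` package and the `filename`/`accuracy` record format discussed above:

```python
import csv

from filelock import FileLock  # third-party: pip install filelock, works on unix and windows


def append_records(records, path="summary.csv"):
    # serialize concurrent appends from several processes with a lock file next to the CSV
    with FileLock(path + ".lock"):
        with open(path, "a", newline="") as f:
            writer = csv.writer(f)
            for r in records:
                writer.writerow([r["filename"], r["accuracy"]])
```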