-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concat dataset #1354
concat dataset #1354
Conversation
@@ -13,21 +14,29 @@ public static partial class data | |||
/// <summary> | |||
/// Map-style data set | |||
/// </summary> | |||
public abstract class Dataset : Dataset<Dictionary<string, torch.Tensor>> | |||
public abstract class Dataset : Dataset<Dictionary<string, Tensor>>, | |||
IDataset<IReadOnlyDictionary<string, Tensor>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented in the codes below, due to covariation, Dataset
should naturally be IDataset<IReadOnlyDictionary<string, Tensor>>
(because it is IDataset<Dictionary<string, Tensor>>
). However FSharp.Examples
cannot be complied without this. I don't know why this happens.
I would suggest to remove this line, because it influences the detection of torch.utils.data.ConcatDataset
(IReadOnlyDictionary
or Dictionary
are both ok, so it can't automatically choose one), but I failed to make the fsharp program work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more radical idea is to remove Dataset
and IterableDataset
at all. Actually we don't need them. DataLoader
s could work on any IDataset<IReadOnlyDictionary<string, torch.Tensor>>
and IDataset<IEnumerable<Tensor>>
.
By the way, our IterableDataset
is occupying the position of PyTorch IterableDataset
. #1353 That's also part of the reason why I suggest to remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F# samples have to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I suppose so... However it doesn't https://dev.azure.com/dotnet/TorchSharp/_build/results?buildId=107828&view=results
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So. just to check -- with the current commits, the F# examples build, correct? I don't see any errors in the latest build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I could be built by adding : IDataset<IReadOnlyDictionary<string, Tensor>>
.
/// <param name="index">Index for tensor</param> | ||
/// <returns>Tensors of index. DataLoader will catenate these tensors into batches.</returns> | ||
[IndexerName("DatasetItems")] | ||
T this[long index] { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's using indexers instead of GetTensor
in the new interface. I'm not really sure about its IndexerName
.
And... shall we remove GetTensor
at the same time? Currently Dataset<T>
is keeping an abstract GetTensor
to ensure compatibility. (And its indexer is calling GetTensor
.)
Ready to merge? |
yes |
@LittleLittleCloud, @yueyinqiu -- After merge, builds are failing: https://dev.azure.com/dotnet/TorchSharp/_build/results?buildId=107925&view=results |
This was reverted due to breaking changes preventing a build. We're eager to make a new release with support for Ubuntu 20.04. |
#1348 (comment)
torch.utils.data.ConcatDataset
DataLoader
s has been relaxed toIDataset<out T>
(a new covariant interface), thus accepting the concated datasetDataset<T>
implementsIDataset<T>
Dataset
implements bothIDataset<Dictionary<string, Tensor>>
andIDataset<IReadOnlyDictionary<string, Tensor>>
IterableDataset
implementsIDataset<IList<string, Tensor>>
andIDataset<IEnumerable<string, Tensor>>
IReadOnlyList