N-Dim Blobs vs Datum #2152

Closed
jyegerlehner opened this issue Mar 18, 2015 · 5 comments

@jyegerlehner
Contributor

So I see the Datum message is still hard-wired for channels/width/height, and a fair bit of code is still coupled to that, such as the data transformer. I'd expect the way forward is to update Datum to also have a repeated axis_dim property, and to enforce in code that either channels/width/height or the axis_dim values are specified, but not both.
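
To make the "one or the other, but not both" rule concrete, here is a minimal Python sketch. `DatumSketch` and `resolve_shape` are hypothetical names for illustration only; this is not the real caffe.proto schema or any existing Caffe function, and the repeated field is spelled `axis_dims` here purely as an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatumSketch:
    """Hypothetical stand-in for a Datum message (not the real schema)."""
    channels: Optional[int] = None
    height: Optional[int] = None
    width: Optional[int] = None
    # Proposed repeated field; empty means "not specified".
    axis_dims: List[int] = field(default_factory=list)


def resolve_shape(d: DatumSketch) -> List[int]:
    """Return the shape, rejecting messages that set both conventions."""
    legacy = [d.channels, d.height, d.width]
    has_legacy = any(v is not None for v in legacy)
    has_axes = len(d.axis_dims) > 0
    if has_legacy and has_axes:
        raise ValueError("specify channels/height/width or axis_dims, not both")
    if has_axes:
        return list(d.axis_dims)
    if all(v is not None for v in legacy):
        return legacy
    raise ValueError("shape is missing or incomplete")
```

With this sketch, a legacy image Datum resolves to a 3-axis shape, an N-dim one resolves to whatever `axis_dims` holds, and a message that sets both is rejected up front rather than silently preferring one.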

This question arose as I was looking at following up on shelhamer's suggestion here that extract_features should be updated for N-dim blobs.

Does anyone have some perspective on this? Perhaps a Brewer could weigh in before I start down a path that is doomed because I don't understand something. Thanks in advance.

@erictzeng erictzeng added the JL label Mar 18, 2015
@longjon
Contributor

longjon commented Mar 27, 2015

Yes, Datum needs an update both for n-d blobs and decoupling of data/labels.

@shelhamer
Member

Actually, I vote for deprecating Datum, or keeping it in its current form, while porting everything else to direct use of Blob / BlobProto / BlobProtoVector. Blob is more general and is the real currency for data in Caffe, so life could be simpler with Blob alone.

A Blob<uint8> could be used to store data at lower precision, which is then cast to Blob<float> for computation. This takes care of the use case of Datum for int / less-than-float precision data storage.
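
As a rough illustration of the store-narrow, compute-wide idea, the Python sketch below uses the standard-library `array` module in place of Blob. The function name `to_float` is hypothetical; nothing here is Caffe API.

```python
from array import array


def to_float(stored: array) -> array:
    """Widen 8-bit storage to single-precision floats for computation."""
    return array("f", (float(v) for v in stored))


# Storage side: one byte per element (typecode "B" is unsigned 8-bit).
stored = array("B", [0, 127, 255])

# Compute side: the same values, widened to float (typecode "f").
compute = to_float(stored)

print(stored.itemsize, compute.itemsize)
```

The stored form takes a quarter of the space per element on typical platforms, and the values survive the cast exactly because every 8-bit integer is representable as a float.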

The one responsibility of Datum that is distinct from Blob is holding encoded image data. For this purpose, the current Datum dimensions and implementation are fine.

Fleshing out these ideas will take a bit of work. We'll try to come up with a coherent plan and dedicate an issue to it to guide our own and community development. Thanks for your interest in making this better.

@jeffdonahue @longjon

@longjon
Contributor

longjon commented Mar 27, 2015

Right, I did not mean to suggest that we should necessarily keep around Datum as an n-d thing, just that we should have a path toward working with generic n-d data. Deprecating or special-casing Datum and switching to (usually?) reading Blobs (potentially with some type upgrades) seems like a good approach.

@ducha-aiki
Contributor

Datum, which contains encoded data, is the best fit for OpenCV-based online image augmentation, so keeping it is a good idea :)

@jyegerlehner
Contributor Author

Thanks all for the replies. Seems like a sensible direction to me.

So, based on the discussion above, I think I see a way forward to move extract_features over to n-dim blobs. Since Datum looks to be kept around for image blobs and their transformations, I would also support non-image n-dim feature blobs by doing a bit of protobuf reflection on the message to determine whether it is a Datum or a Blob, and then handling it accordingly.
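
A minimal Python sketch of that dispatch follows. `DatumLike`, `BlobLike`, and `feature_shape` are hypothetical stand-ins so the example runs without protobuf installed; with real protobuf messages the type name would presumably come from the message descriptor (for example `message.DESCRIPTOR.name` in the Python protobuf API) rather than from the Python class name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatumLike:
    """Hypothetical stand-in for an image Datum (fixed 3 axes)."""
    channels: int
    height: int
    width: int


@dataclass
class BlobLike:
    """Hypothetical stand-in for an N-dim blob message."""
    shape: List[int]


def feature_shape(message) -> List[int]:
    """Inspect the message type by name and return its shape."""
    kind = type(message).__name__  # stands in for a descriptor lookup
    if kind == "DatumLike":
        return [message.channels, message.height, message.width]
    if kind == "BlobLike":
        return list(message.shape)
    raise TypeError(f"unsupported message type: {kind}")
```

Keeping the type check in one function means the rest of a feature-extraction loop could stay agnostic about which message kind it was handed.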

Since shelhamer says a separate issue will be dedicated to fleshing this all out, I'll close this one.


5 participants