[io] Add option to strip padding bytes while dumping and add padding while reading `PCLPointCLoud2` #1566

stefanbuettner · 2016-03-08T14:00:17Z

Hey there,

during investigating the test failures of Set default alpha value 255 I was mainly wondering why there is so much code which appears to be duplicate and stumbled over the following:
Point clouds saved and loaded with different functions lead to different internal representations.

If creating a PointCloud<PointT> cloud, converting it to a PCLPointCloud2 object, saving that to a file, loading it back into another object, then the data vector of the two don't have the same length.
Code for reproducing this behavior can be found here.
Actually one of them loads the data with and the other without padding bytes (like the w value of the position). This makes sense to me in order to save disk space, but after loading the cloud again the data should be equal to the original, shouldn't it?
So I don't think this behavior is intended.

Also I have the impression that there is some duplicate code.
For example there are 4 implementations of generateHeader which basically all do the same and could be merged into one.
There are also 8 versions of writing pcd data compared to effectivly 1 read method (read(PCLPointCloud2)). The read method reads a PCLPointCloud2 and the others just use that along with from PCLPointCloud2 to read the templated types. Of course this read function contains reading ASCII, binary and compressed binary files which is split for the writing methods.
in addition there are methods for saving subsets of PointCloud<PointT> types but none PCLPointCloud2 types. Probably it doesn't make much sense for the latter ones. But if they existed one could reduce the implementation to

writeASCII(PCLPointCloud2, vector<int>[, bool useIndices])
writeBinary(PCLPointCloud2, vector<int>[, bool useIndices])
writeBinaryCompressed(PCLPointCloud2, vector<int>[, bool useIndices])

What do you think?

Cheers,
Stefan

The text was updated successfully, but these errors were encountered:

SergioRAgostinho · 2016-03-26T02:12:07Z

Started to dig into this

If creating a PointCloud cloud, converting it to a PCLPointCloud2 object, saving that to a file, loading it back into another object, then the data vector of the two don't have the same length.
Code for reproducing this behavior can be found here.
Actually one of them loads the data with and the other without padding bytes (like the w value of the position). This makes sense to me in order to save disk space, but after loading the cloud again the data should be equal to the original, shouldn't it?
So I don't think this behavior is intended.

I could confirm everything you mentioned and the different data size is being caused by the different padding/alignment byte(s) that are inserted in each PointT fields. When you convert from PointCloud<PointT> to PCLPointCloud2 those extra bytes are kept because it allows to simply copy data using a single memcpy instruction, which is fast. Nevertheless, for the PointXYZI we do this at the expense of having data of twice the size that is actually needed. These extra bytes are ignored when saving to a PCD file except when you ask to save a PCLPointCloud2 to a binary file.
Upon reading a PCD file to a PCLPointCloud2, these extra byte paddings are not added which leads to the size mismatch.

In this situation I think that best solution might arguably be to introduce the padding upon reading the PCD files.

jefft255 · 2018-08-08T14:27:29Z

Hi,

I'm commenting on the quite old issue because it think it highlights another problem: the padding bytes are saved when saving a PCLPointCloud2 as binary. I noticed the problem while using CloudCompare. I get that it is a lot faster and simpler to dump the binary data array in the file when saving but with big point clouds this can lead to a significant waste of disk space. For example, one of my point cloud goes from 400MB to 800MB after opening it and saving it back from PCLPointCloud2.

SergioRAgostinho · 2018-08-31T10:03:32Z

@taketwo What's your feeling on providing an extra parameter during saving operations to allow the user to decide on the trade-off speed vs file size during a save procedure?

taketwo · 2018-08-31T15:36:35Z

Actually I'm quite skeptical about speed benefits. Disk I/O is a true bottleneck and I doubt that single memcpy vs. multiple will change the total saving time in any noticeable way. Thus I'd go all-in for stripping padding bytes. (But of course I'm ready to abolish my skepticism if shown otherwise.)

stale · 2020-05-19T07:59:13Z

Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs.

jefft255 mentioned this issue Aug 8, 2018

Save as PCD adding random scalar fields CloudCompare/CloudCompare#776

Closed

SergioRAgostinho added module: io changelog: enhancement Meta-information for changelog generation labels Aug 31, 2018

taketwo removed the changelog: enhancement Meta-information for changelog generation label Feb 22, 2020

stale bot added the status: stale label May 19, 2020

kunaltyagi added the kind: todo Type of issue label May 20, 2020

stale bot removed the status: stale label May 20, 2020

kunaltyagi changed the title ~~PCD IO~~ [io] Add option to strip padding bytes while dumping and add padding while reading PCLPointCLoud2 May 20, 2020

mvieth mentioned this issue Sep 8, 2021

[conversions] toPCLPointCloud2 or toROSMsg<PointCloud2> wastes memory with alignment bytes #4939

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[io] Add option to strip padding bytes while dumping and add padding while reading `PCLPointCLoud2` #1566

[io] Add option to strip padding bytes while dumping and add padding while reading `PCLPointCLoud2` #1566

stefanbuettner commented Mar 8, 2016

SergioRAgostinho commented Mar 26, 2016

jefft255 commented Aug 8, 2018

SergioRAgostinho commented Aug 31, 2018

taketwo commented Aug 31, 2018

stale bot commented May 19, 2020

[io] Add option to strip padding bytes while dumping and add padding while reading PCLPointCLoud2 #1566

[io] Add option to strip padding bytes while dumping and add padding while reading PCLPointCLoud2 #1566

Comments

stefanbuettner commented Mar 8, 2016

SergioRAgostinho commented Mar 26, 2016

jefft255 commented Aug 8, 2018

SergioRAgostinho commented Aug 31, 2018

taketwo commented Aug 31, 2018

stale bot commented May 19, 2020

[io] Add option to strip padding bytes while dumping and add padding while reading `PCLPointCLoud2` #1566

[io] Add option to strip padding bytes while dumping and add padding while reading `PCLPointCLoud2` #1566