Request for dataset preprocessing code #28

Open
HarshWinterBytes opened this issue Jul 15, 2024 · 6 comments

Comments

@HarshWinterBytes commented Jul 15, 2024

Thank you for your excellent work!

I noticed that you used four large datasets (Hypersim, Replica, 3D Ken Burns, and Objaverse) and filtered them before training. I believe this filtering step is crucial to the final training results, so could you release this part of the code, or the filtered filename lists? That would help a lot.

Looking forward to your early reply.
Best wishes!

@Guirassy43

Hi,

Would it also be possible to share the processing code you used for the surface normal estimation test datasets (or even to release the processed data)? I saw that you mentioned some filtering/processing to reduce the noise in the GT.

Thank you very much!

@fuxiao0719 (Owner) commented Jul 16, 2024

Hi, thanks for your interest!

(1) To HarshWinterBytes: Many of our training datasets are sourced from the Metric3D V2 server, including Hypersim, Replica, and 3D Ken Burns. It would be difficult to prepare specific item lists for all of the training data, but here are the main training components and potential alternatives:

1. Hypersim: filter to the 191 scenes without tilt-shift photography (see here).
2. Replica: exclude samples with fewer than 50 invalid pixels (a filtering sketch follows this comment).
3. 3D Ken Burns: all samples.
4. Synthetic Urban Dataset (Supp. B.2): these data are owned by DJI Auto and may not be publicly available due to company licensing. However, StableNormal (another great work) uses MatrixCity as an alternative, which you may also consider.
5. Objaverse: similar to this filter.

(2) To Guirassy43: The evaluation surface normal test sets are also sourced from the Metric3D V2 server and have been further manually processed, with the company's help, to fix "over-smooth" (potentially erroneous) regions. We will follow up on this part later. (Similar to item 4 above, this is restricted by DJI company licensing, so releasing the data is not as easy as releasing code/weights.)
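
A minimal sketch of the Replica rule in item 2, assuming depth maps load as NumPy arrays and that "invalid" means non-finite or non-positive depth (that definition is an assumption; only the 50-pixel threshold comes from the list above):

```python
import numpy as np

def count_invalid_pixels(depth: np.ndarray) -> int:
    # Treat non-finite or non-positive depth as invalid (assumed definition).
    return int((~np.isfinite(depth) | (depth <= 0)).sum())

def keep_replica_sample(depth: np.ndarray, min_invalid: int = 50) -> bool:
    # Item 2 as stated in the thread: exclude samples with fewer than 50
    # invalid pixels, i.e. keep a sample only when it has at least
    # `min_invalid` of them.
    return count_invalid_pixels(depth) >= min_invalid
```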

@HarshWinterBytes (Author) commented Jul 17, 2024


Thank you for your prompt reply! I'm sorry to bother you again.

I have not worked with simulators before, and right now I'm having trouble with Replica. Could you explain how to use Replica to generate the depth and normal maps? Would it be possible to release the generation code, or even the processed data?

I also noticed that the issue apple/ml-hypersim#24 has been resolved: the RGB, depth, and normal of the example scene in that issue are now aligned. Is it still necessary to filter Hypersim and to align the normal and depth in the dataloader?

[image: example scene from apple/ml-hypersim#24 showing aligned RGB, depth, and normal]

Thanks a lot!
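
One way to check whether the depth and normal in a sample are actually aligned: recompute normals from the depth map using the camera intrinsics, then measure the angular error against the provided GT normals; a large error suggests the dataloader still needs to realign them. A rough sketch, assuming a pinhole camera and camera-space GT normals (function names and conventions here are illustrative, not from the repo):

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    # Back-project each pixel to a 3D point, then take the cross product
    # of the point map's image-space gradients as the surface normal.
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w].astype(np.float64)
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1)
    du = np.gradient(pts, axis=1)
    dv = np.gradient(pts, axis=0)
    n = np.cross(dv, du)  # sign/orientation convention is dataset-specific
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8)

def mean_angular_error_deg(n_pred, n_gt):
    cos = np.clip((n_pred * n_gt).sum(axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())
```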

@Baijiong-Lin

I have the same issue. Could you share those .json files, or the code that generates them? Thanks.

data_dir = os.path.join(self.data_dir, 'Hypersim', 'annotations', 'annos_all.json')
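
The schema of annos_all.json is not documented in this thread, so the following is purely hypothetical: a sketch that walks a dataset directory, pairs up RGB/depth/normal files, and dumps the list as JSON. The directory layout, file naming, and JSON keys are all assumptions, not the repo's actual format:

```python
import json
from pathlib import Path

def build_annotations(root: str, out_path: str) -> None:
    # Hypothetical: one entry per RGB/depth/normal triple. The real
    # annos_all.json used by the repo may use a different schema.
    entries = []
    for rgb in sorted(Path(root).rglob("*_rgb.png")):  # assumed naming
        stem = rgb.name[: -len("_rgb.png")]
        entries.append({
            "rgb": str(rgb),
            "depth": str(rgb.with_name(stem + "_depth.png")),   # assumed
            "normal": str(rgb.with_name(stem + "_normal.png")), # assumed
        })
    with open(out_path, "w") as f:
        json.dump(entries, f, indent=2)
```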

@fuxiao0719 (Owner) commented Jul 18, 2024

It is necessary to align depth and normal during data preprocessing in Hypersim.

As for the code that generates the .json files and the Replica data, I will reach out to the Metric3D V2 team about their initial preparation (our data are sourced from their server) and see whether it is convenient to release.
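
For reference, on the depth side of that alignment: Hypersim's released HDF5 depth actually stores the Euclidean distance to the camera center rather than planar depth, and a commonly shared conversion from the Hypersim issue tracker looks like the sketch below. The 1024x768 resolution and ~886.81 px focal length are the dataset's defaults, but treat them as values to verify against your own data:

```python
import numpy as np

def hypersim_distance_to_depth(dist: np.ndarray) -> np.ndarray:
    # Convert Hypersim's distance-to-camera-center map into planar depth.
    # Assumes the default 1024x768 renders and a focal length of ~886.81 px.
    height, width, focal = 768, 1024, 886.81
    xs = np.linspace(-0.5 * width + 0.5, 0.5 * width - 0.5, width)
    ys = np.linspace(-0.5 * height + 0.5, 0.5 * height - 0.5, height)
    grid_x, grid_y = np.meshgrid(xs, ys)
    # Per-pixel ray length through the image plane at z = focal.
    ray_norm = np.sqrt(grid_x ** 2 + grid_y ** 2 + focal ** 2)
    return dist / ray_norm * focal
```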

@HarshWinterBytes (Author)

Thank you for your prompt reply!
Looking forward to the good news!
