
Visualization for SemanticKITTI demo #72

Closed
Alexander-Yao opened this issue Sep 18, 2023 · 0 comments
Hi @anhquancao, thanks for your excellent repo.

I ran into a problem while trying to generate the demo for the SemanticKITTI dataset. I have generated all the .pkl files for the validation set (sequence 08), but I am not sure where to find the camera pose I should use, or how to fix the same viewpoint across these frames. The script I wrote is as follows:

```python
import pickle
from glob import glob

import cv2
import numpy as np
import open3d as o3d


def get_grid_coords(dims, resolution):
    '''
    :param dims: the dimensions of the grid [x, y, z] (i.e. [256, 256, 32])
    :return coords_grid: the center coords of the voxels in the grid
    '''
    # The sensor is centered in X (we go to dims/2 + 1 for the histogramdd)
    g_xx = np.arange(0, dims[0] + 1)
    # The sensor is at Y=0 (we go to dims + 1 for the histogramdd)
    g_yy = np.arange(0, dims[1] + 1)
    # The sensor is at Z=1.73. I observed that the ground was two voxel levels
    # above the grid bottom, so the Z pose is at 10 if the bottom voxel is 0.
    # If we want the sensor to be at (0, 0, 0), then the bottom in z is -10
    # and the top is 22 (we go to 22 + 1 for the histogramdd).
    # ATTENTION: it is 11 for old grids, 10 for new grids (v1.1)
    # (PRBonn/semantic-kitti-api#49)
    sensor_pose = 10
    g_zz = np.arange(0, dims[2] + 1)

    # Obtain the grid with the center coords of each voxel
    xx, yy, zz = np.meshgrid(g_xx[:-1], g_yy[:-1], g_zz[:-1])
    coords_grid = np.array([xx.flatten(), yy.flatten(), zz.flatten()]).T
    coords_grid = coords_grid.astype(np.float32)
    coords_grid = (coords_grid * resolution) + resolution / 2

    # Swap the x and y columns
    temp = np.copy(coords_grid)
    temp[:, 0] = coords_grid[:, 1]
    temp[:, 1] = coords_grid[:, 0]
    coords_grid = np.copy(temp)

    return coords_grid, g_xx, g_yy, g_zz


def draw_semantic_open3d(voxels, cam_param_path="", voxel_size=0.2):
    colors = np.array([
        [0,   0,   0,   255],
        [100, 150, 245, 255],
        [100, 230, 245, 255],
        [30,  60,  150, 255],
        [80,  30,  180, 255],
        [100, 80,  250, 255],
        [255, 30,  30,  255],
        [255, 40,  200, 255],
        [150, 30,  90,  255],
        [255, 0,   255, 255],
        [255, 150, 255, 255],
        [75,  0,   75,  255],
        [175, 0,   75,  255],
        [255, 200, 0,   255],
        [255, 120, 50,  255],
        [0,   175, 0,   255],
        [135, 60,  0,   255],
        [150, 240, 80,  255],
        [255, 240, 150, 255],
        [255, 0,   0,   255]]) / 255.0

    grid_coords, _, _, _ = get_grid_coords(
        [voxels.shape[0], voxels.shape[1], voxels.shape[2]], voxel_size)

    points = np.vstack([grid_coords.T, voxels.reshape(-1)]).T

    # Keep only voxels with a semantic class (drop empty and unknown voxels)
    points = points[(points[:, 3] != 0) & (points[:, 3] != 255)]

    colors = np.take_along_axis(
        colors, points[:, 3].astype(np.int32).reshape(-1, 1), axis=0)

    vis = o3d.visualization.Visualizer()
    vis.create_window(width=1200, height=600)
    ctr = vis.get_view_control()
    param = o3d.io.read_pinhole_camera_parameters(cam_param_path)
    # param.extrinsic = pose_matrix  # <-- the line I marked: set the extrinsic
    #                                #     from a 4x4 camera pose matrix

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points[:, :3])
    pcd.colors = o3d.utility.Vector3dVector(colors[:, :3])
    pcd.estimate_normals()
    vis.add_geometry(pcd)

    ctr.convert_from_pinhole_camera_parameters(param)

    vis.run()  # the user changes the view and presses "q" to terminate
    param = vis.get_view_control().convert_to_pinhole_camera_parameters()
    o3d.io.write_pinhole_camera_parameters(cam_param_path, param)


def process(scan, dataset="Semantic_KITTI"):
    with open(scan, "rb") as handle:
        b = pickle.load(handle)

    fov_mask_1 = b["fov_mask_1"]
    T_velo_2_cam = b["T_velo_2_cam"]
    vox_origin = np.array([0, -25.6, -2])

    y_pred = b["y_pred"]

    print(T_velo_2_cam.shape)

    if dataset == "Semantic_KITTI":
        # Visualize Semantic_KITTI
        draw_semantic_open3d(
            y_pred,
            cam_param_path="./Semantic_KITTI/cam_params.json",
            voxel_size=0.2
        )
    else:
        # Visualize Semantic KITTI
        draw_semantic_open3d(
            y_pred,
            T_velo_2_cam,
            vox_origin,
            fov_mask_1,
            img_size=(1220, 370),
            f=707.0912,
            voxel_size=0.2,
            d=7,
        )


if __name__ == "__main__":
    pkl_file = glob("./Semantic_KITTI/*.pkl")
    process(pkl_file[0])
```
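For context, this is roughly how I parse a KITTI odometry `calib.txt` into 4x4 homogeneous matrices (a sketch; it assumes the standard `KEY: 12 floats` row layout, and the numeric values below are only illustrative placeholders, not real calibration):

```python
import numpy as np

def parse_calibration(calib_lines):
    """Parse KITTI-odometry-style 'KEY: 12 floats' lines into 3x4 matrices
    padded to 4x4 homogeneous form."""
    mats = {}
    for line in calib_lines:
        if ":" not in line:
            continue
        key, vals = line.split(":", 1)
        m = np.array([float(v) for v in vals.split()]).reshape(3, 4)
        mats[key.strip()] = np.vstack([m, [0.0, 0.0, 0.0, 1.0]])
    return mats

# Illustrative placeholder lines (the real file would be e.g.
# sequences/08/calib.txt)
calib_lines = [
    "P2: 707.0912 0 601.8873 0  0 707.0912 183.1104 0  0 0 1 0",
    "Tr: 0 -1 0 0  0 0 -1 -0.08  1 0 0 -0.27",
]
mats = parse_calibration(calib_lines)
T_velo_2_cam = mats["Tr"]  # velodyne -> left camera frame
```
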

Could you please tell me how to set the camera matrix at the line I marked above? I have tried the camera matrices from https://www.cvlibs.net/datasets/kitti/eval_odometry.php and #16, but neither of them seems to work.
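In the meantime, my current guess for that extrinsic is the following sketch (not verified): the points handed to Open3D are in voxel-grid metric coordinates, so the extrinsic would first shift them by `vox_origin` into the velodyne frame and then apply the `T_velo_2_cam` stored in the .pkl (`np.eye(4)` below is just a placeholder for it). I am not sure whether the x/y swap inside `get_grid_coords` also needs to be undone first:

```python
import numpy as np

# Placeholder: in the real script this comes from b["T_velo_2_cam"]
T_velo_2_cam = np.eye(4)
vox_origin = np.array([0, -25.6, -2])

# Translation taking voxel-grid metric coords into the velodyne frame
T_grid_2_velo = np.eye(4)
T_grid_2_velo[:3, 3] = vox_origin

# Candidate world-to-camera extrinsic for Open3D
extrinsic = T_velo_2_cam @ T_grid_2_velo
# Then, at the marked line:
#   param.extrinsic = extrinsic
# before calling ctr.convert_from_pinhole_camera_parameters(param)
```
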

Thank you so much for your time and help.
