Change another dataset #76

Open
Wang-Wenqing opened this issue Sep 10, 2024 · 1 comment
Comments

@Wang-Wenqing

Thanks for your great work! I want to use another dataset, and I tried to build the camera info like this:

def readNeumanCameras(path):
    
    cameras = read_cameras(f'{path}/sparse/cameras.txt')
    images_meta = read_images_meta(f'{path}/sparse/images.txt', f'{path}/images')
    
    keys = []
    frames = []
    for k, v in images_meta.items():
        keys.append(k)
        frames.append(os.path.basename(v.image_path))
    keys = [x for _, x in sorted(zip(frames, keys))]
    keys = sorted(keys, key=int)

    all_time = keys
    max_time = max(all_time)
    all_time = [i / max_time for i in all_time]
    
    train_num = len(frames)
    
    cam_infos = []
    
    for i, key in enumerate(keys):
        cur_cam_id = images_meta[key].camera_id
        cur_cam = cameras[cur_cam_id]
        cur_camera_pose = images_meta[key].camera_pose
        image_path = images_meta[key].image_path 
        cap = RGBPinholeCapture(image_path, cur_cam, cur_camera_pose)
        
        cap.frame_id = {'frame_id': i, 'total_frames': len(images_meta)}
        idx = i
        image = cap.image
        image = Image.fromarray((image).astype(np.uint8))

        image_name = cap.image_path.split('/')[-1]
        width = cap.shape[1]
        height = cap.shape[0]
        FovY = float(2 * np.arctan(cap.shape[0] / (2 * cap.intrinsic_matrix[1, 1]))) # 0.6565035439079898
        FovX = float(2 * np.arctan(cap.shape[1] / (2 * cap.intrinsic_matrix[0, 0]))) # 1.0895537198941696
        R = cap.cam_pose.rotation_matrix[:3, :3] # rotation_matrix is 4x4; keep the 3x3 rotation block
        T = cap.cam_pose.translation_vector
        fid = all_time[i]
        
        
        cam_info = CameraInfo(uid=idx, R=R, T=T, FovY=FovY, FovX=FovX, image=image,
                              image_path=image_path, image_name=image_name, width=width, height=height,
                              fid=fid)
        cam_infos.append(cam_info)

    sys.stdout.write('\n')
    return cam_infos, train_num

After training, the rendered training images look like this:

00000
00019

and the test images look like this:

00007

00004

It would be very helpful if you could share your advice on what might cause this problem. Could the camera information be wrong?

@ingra14m
Owner

Hi, thanks for your interest.

It looks like the dataset in your setup is not in strict COLMAP format. Ideally, a pinhole camera model would not lead to this degree of blur.
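One quick way to test the "strict COLMAP format" point is to list the camera models in `sparse/cameras.txt`. The sketch below assumes the standard COLMAP text layout (`CAMERA_ID MODEL WIDTH HEIGHT PARAMS...`, with `#` comment lines). The helper names `camera_models` and `non_pinhole` are hypothetical and not part of this repository.

```python
def camera_models(cameras_txt: str) -> dict:
    """Map CAMERA_ID -> MODEL for the contents of a COLMAP cameras.txt file."""
    models = {}
    for line in cameras_txt.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header comments COLMAP writes.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        models[int(fields[0])] = fields[1]
    return models


def non_pinhole(models: dict) -> dict:
    """Return the cameras whose model carries distortion parameters."""
    allowed = ("PINHOLE", "SIMPLE_PINHOLE")
    return {cid: m for cid, m in models.items() if m not in allowed}


sample = """# Camera list with one line of data per camera:
#   CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
1 PINHOLE 1920 1080 1500.0 1500.0 960.0 540.0
2 OPENCV 1920 1080 1500.0 1500.0 960.0 540.0 0.1 0.0 0.0 0.0
"""
print(non_pinhole(camera_models(sample)))
```

If any camera is reported here, the images would likely need to be undistorted first, since the renderer assumes an ideal pinhole projection.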

So, I suggest that:

  • Use COLMAP to obtain the intrinsics and extrinsics again.
  • Get more accurate camera poses. In my experiments, Gaussian-splatting-based methods are highly sensitive to camera pose, and the reconstruction fails if the poses deviate too much.
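A common source of wrong poses with a custom loader is the pose convention rather than the pose accuracy. The following is a minimal sketch, assuming the loader follows the original 3D Gaussian Splatting COLMAP reader, where `R` is stored as the transpose of the world-to-camera rotation and `T` is the world-to-camera translation. Whether `cap.cam_pose` in the posted code is world-to-camera or camera-to-world is not stated in this thread, so both cases are covered. The function names are hypothetical.

```python
import numpy as np


def gs_pose_from_w2c(R_w2c, t_w2c):
    """Convert a world-to-camera pose into the (R, T) pair assumed above."""
    R_w2c = np.asarray(R_w2c, dtype=np.float64)
    t_w2c = np.asarray(t_w2c, dtype=np.float64).reshape(3)
    # A valid rotation is orthonormal with determinant +1.
    if not np.allclose(R_w2c @ R_w2c.T, np.eye(3), atol=1e-5) or np.linalg.det(R_w2c) <= 0:
        raise ValueError("R_w2c is not a valid rotation matrix")
    # R is stored transposed; T stays the world-to-camera translation.
    return R_w2c.T.copy(), t_w2c.copy()


def gs_pose_from_c2w(c2w):
    """Same conversion, starting from a 4x4 camera-to-world matrix."""
    w2c = np.linalg.inv(np.asarray(c2w, dtype=np.float64))
    return gs_pose_from_w2c(w2c[:3, :3], w2c[:3, 3])


def camera_center(R, T):
    """Camera position in world coordinates, recovered from (R, T)."""
    return -R @ T


# Example: a camera at (1, 2, 3), rotated 90 degrees about the z axis.
c2w = np.eye(4)
c2w[:3, :3] = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
c2w[:3, 3] = [1.0, 2.0, 3.0]
R, T = gs_pose_from_c2w(c2w)
print(camera_center(R, T))
```

Printing the recovered camera centers for a few frames and comparing them with where the cameras actually were is a cheap sanity check before retraining.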
