
[LEGACY] Tutorial 2: pose estimation


In this tutorial we will learn a few more advanced features of redner through a pose estimation example. In particular, we demonstrate:

- loading a mesh and its materials from a Wavefront .obj file,
- generating smooth vertex normals,
- lighting the scene with an environment map,
- optimizing pose (translation and rotation) parameters.

As in the first tutorial, we first render a target image:

and an initial guess:

We want to find the translation and rotation that transforms the initial guess to the target.

To begin, we need to represent the teapot as PyTorch tensors. We provide a utility function pyredner.load_obj that loads a Wavefront object file and its corresponding material description:

material_map, mesh_list, light_map = pyredner.load_obj('teapot.obj')

The load_obj function returns three lists/dicts. material_map is a dict containing all the materials used in the obj file, where the key is the material name and the value is a pyredner.Material. mesh_list is a list containing all the meshes in the obj file, grouped by usemtl statements. Each element in the list is a tuple of length 2: the first element is the material name, and the second is a pyredner.TriangleMesh holding the mesh data. light_map is a Python dict whose keys are the names of materials with non-zero Ke (emission), and whose values are the corresponding Ke values.
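
For a quick look at what was loaded, one could inspect the returned structures like this (an illustrative sketch, not part of the tutorial script):

# Inspect the structures returned by load_obj (illustrative only).
for mtl_name, mat in material_map.items():
    print(mtl_name, mat)  # material name -> pyredner.Material
for mtl_name, mesh in mesh_list:
    print(mtl_name, mesh.vertices.shape, mesh.indices.shape)
print(light_map)  # material name -> Ke; empty if the file has no emitters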

The teapot we have in the repository is relatively low-polygon and does not have vertex normals. If we render the teapot as-is, we get this:

Notice the stripe patterns. To render a smooth teapot, we generate a normal direction for each vertex by averaging the normals of its neighboring faces. We provide a utility function pyredner.compute_vertex_normal for this. The following two lines generate and assign vertex normals to the teapot:

for _, mesh in mesh_list:
    mesh.normals = pyredner.compute_vertex_normal(mesh.vertices, mesh.indices)

Next we set up the camera, just like in the previous tutorial:

cam = pyredner.Camera(position = torch.tensor([0.0, 30.0, 200.0]),
                      look_at = torch.tensor([0.0, 30.0, 0.0]),
                      up = torch.tensor([0.0, 1.0, 0.0]),
                      fov = torch.tensor([45.0]), # in degree
                      clip_near = 1e-2, # needs to be > 0
                      resolution = (256, 256),
                      fisheye = False)

Next we convert the materials loaded from the Wavefront object file into a Python list of materials. At the same time we keep track of each material's id, so that we can assign them to the shapes later.

material_id_map = {}
materials = []
count = 0
for key, value in material_map.items():
    material_id_map[key] = count
    count += 1
    materials.append(value)

Now we build a list of shapes using the list loaded from the Wavefront object file. Meshes loaded from .obj files may have different indices for uvs and normals; we use mesh.uv_indices and mesh.normal_indices to access them. This mesh does not have normal_indices, so the value is None.

# Get a list of shapes
shapes = []
for mtl_name, mesh in mesh_list:
    assert(mesh.normal_indices is None)
    shapes.append(pyredner.Shape(\
        vertices = mesh.vertices,
        indices = mesh.indices,
        material_id = material_id_map[mtl_name],
        uvs = mesh.uvs,
        normals = mesh.normals,
        uv_indices = mesh.uv_indices))

The previous tutorial used a mesh area light for the scene lighting; here we use an environment light, which is a texture representing infinitely far away light sources in spherical coordinates.

envmap = pyredner.imread('sunsky.exr')
if pyredner.get_use_gpu():
    envmap = envmap.cuda()
envmap = pyredner.EnvironmentMap(envmap)

Finally, we construct our scene using all the variables we set up previously.

scene = pyredner.Scene(cam, shapes, materials, area_lights = [], envmap = envmap)

We render the scene and save it to disk, just like in the previous tutorial.
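
Concretely, this step can look like the following (a sketch mirroring the previous tutorial; the output path and sample count are only examples):

# Serialize the scene and render the target image. render is the
# differentiable render function, as in the previous tutorial.
render = pyredner.RenderFunction.apply
scene_args = pyredner.RenderFunction.serialize_scene(\
    scene = scene,
    num_samples = 512, # high sample count for a clean target
    max_bounces = 1)
img = render(0, *scene_args) # the first argument is the random seed
# Save the target image, then read it back as the optimization target.
pyredner.imwrite(img.cpu(), 'results/pose_estimation/target.exr') # example path
target = pyredner.imread('results/pose_estimation/target.exr')
if pyredner.get_use_gpu():
    target = target.cuda()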

Here is our target:

Now we want to generate the initial guess. We rotate and translate the teapot by declaring PyTorch tensors for the translation and rotation parameters, then applying them to all teapot vertices. The translation and rotation parameters have very different ranges, so we normalize them: the translation parameters are multiplied by 100 to map them to the actual translation amounts.

translation_params = torch.tensor([0.1, -0.1, 0.1],
    device = pyredner.get_device(), requires_grad=True)
translation = translation_params * 100.0
euler_angles = torch.tensor([0.1, -0.1, 0.1], requires_grad=True)

We obtain the teapot vertices we want to apply the transformation on.

shape0_vertices = shapes[0].vertices.clone()
shape1_vertices = shapes[1].vertices.clone()

We provide a utility function pyredner.gen_rotate_matrix to generate 3x3 rotation matrices from Euler angles:

rotation_matrix = pyredner.gen_rotate_matrix(euler_angles)
if pyredner.get_use_gpu():
    rotation_matrix = rotation_matrix.cuda()
center = torch.mean(torch.cat([shape0_vertices, shape1_vertices]), 0)

We shift the vertices to the center, apply the rotation matrix, then shift back to the original space.

shapes[0].vertices = \
    (shape0_vertices - center) @ torch.t(rotation_matrix) + \
    center + translation
shapes[1].vertices = \
    (shape1_vertices - center) @ torch.t(rotation_matrix) + \
    center + translation

Since we changed the vertices, we need to regenerate the shading normals:

shapes[0].normals = pyredner.compute_vertex_normal(shapes[0].vertices, shapes[0].indices)
shapes[1].normals = pyredner.compute_vertex_normal(shapes[1].vertices, shapes[1].indices)

Then we serialize the scene and render the initial guess:

scene_args = pyredner.RenderFunction.serialize_scene(\
    scene = scene,
    num_samples = 512,
    max_bounces = 1)
img = render(1, *scene_args)

Here is our initial guess:

Again, we use PyTorch's Adam optimizer to refine the initial guess, running it for 200 iterations. One difference from the previous tutorial is that in the forward pass we need to re-apply the translation, the rotation, and the vertex normal generation, so that PyTorch can build the computational graph and backpropagate through them.

# Optimize for pose parameters.
optimizer = torch.optim.Adam([translation_params, euler_angles], lr=1e-2)
# Run 200 Adam iterations.
for t in range(200):
    print('iteration:', t)
    optimizer.zero_grad()
    # Forward pass: apply the mesh operation and render the image.
    translation = translation_params * 100.0
    rotation_matrix = pyredner.gen_rotate_matrix(euler_angles)
    if pyredner.get_use_gpu():
        rotation_matrix = rotation_matrix.cuda()
    center = torch.mean(torch.cat([shape0_vertices, shape1_vertices]), 0)
    shapes[0].vertices = \
        (shape0_vertices - center) @ torch.t(rotation_matrix) + \
        center + translation
    shapes[1].vertices = \
        (shape1_vertices - center) @ torch.t(rotation_matrix) + \
        center + translation
    shapes[0].normals = pyredner.compute_vertex_normal(shapes[0].vertices, shapes[0].indices)
    shapes[1].normals = pyredner.compute_vertex_normal(shapes[1].vertices, shapes[1].indices)
    scene_args = pyredner.RenderFunction.serialize_scene(\
        scene = scene,
        num_samples = 4,
        max_bounces = 1)
    img = render(t+1, *scene_args)
    # Compute the loss function. Here it is L2.
    loss = (img - target).pow(2).sum()

    # Backpropagate the gradients.
    loss.backward()

    # Take a gradient descent step.
    optimizer.step()
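
After the loop finishes, the converged pose can be re-rendered at a higher sample count for a cleaner final image (again a sketch; the seed, sample count, and output path are only examples):

# Re-serialize the scene with more samples for the final render.
scene_args = pyredner.RenderFunction.serialize_scene(\
    scene = scene,
    num_samples = 512,
    max_bounces = 1)
img = render(202, *scene_args) # a fresh random seed
pyredner.imwrite(img.cpu(), 'results/pose_estimation/final.exr') # example path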

Here is our final optimized result:

And the optimization video: