Skip to content

Commit

Permalink
Merge pull request #16 from midlik/mvs
Browse files Browse the repository at this point in the history
MVS
  • Loading branch information
dsehnal authored Mar 27, 2024
2 parents 1dc8df3 + 427fb6d commit bff7d83
Show file tree
Hide file tree
Showing 21 changed files with 3,806 additions and 7 deletions.
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
site/
.idea
.idea
.DS_Store
5 changes: 0 additions & 5 deletions docs/extensions/mvs.md

This file was deleted.

182 changes: 182 additions & 0 deletions docs/extensions/mvs/annotations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
# MolViewSpec annotations

Annotations are used to define substructures (components) and apply colors, labels, or tooltips to them. In contrast to [selectors](./selectors.md), annotations are defined in a separate file, which can then be referenced in the main MVS file.

## MVS annotation files

MVS annotations can be encoded in multiple different formats, but their logic is always the same and in fact very similar to that of selectors.

### JSON format

The simplest example of an annotation in JSON format is just a JSON-encoded [union component expression](./selectors.md) selector. Here is a simple annotation containing 4 **annotation rows**:

```json
[
{ "label_asym_id": "A" },
{ "label_asym_id": "B" },
{ "label_asym_id": "B", "beg_label_seq_id": 100, "end_label_seq_id": 200 },
{ "label_asym_id": "B", "beg_label_seq_id": 150, "end_label_seq_id": 160 }
]
```

However, in a typical annotation, there is at least one extra field that provides the value of the dependent variable (such as color or label) mapped to each annotation row:

```json
[
{ "label_asym_id": "A", "color": "#00ff00" },
{ "label_asym_id": "B", "color": "blue" },
{ "label_asym_id": "B", "beg_label_seq_id": 100, "end_label_seq_id": 200, "color": "skyblue" }
{ "label_asym_id": "B", "beg_label_seq_id": 150, "end_label_seq_id": 160, "color": "lightblue" }
]
```

This particular annotation (when applied via `color_from_uri` node) will apply green color (#00ff00) to the whole chain A and three shades of blue to the chain B. Later annotation rows override earlier rows, therefore residues 1–99 will be blue, 100–149 skyblue, 150–160 lightblue, 161–200 skyblue, and 201–end blue. (Tip: to color all the rest of the structure in one color, add an annotation row with no selector fields (e.g. `{ "color": "yellow" }`) to the beginning of the annotation.)

Real-life annotation files can include huge numbers of annotation rows. To avoid repeating the same field keys in every row, we can convert the array-of-objects into object-of-arrays. This will result in an equivalent annotation but smaller file size:

```json
{
"label_asym_id": ["A", "B", "B", "B"],
"beg_label_seq_id": [null, null, 100, 150],
"end_label_seq_id": [null, null, 200, 160],
"color": ["#00ff00", "blue", "skyblue", "lightblue"]
}
```

A more complex example of JSON annotation is provided in [1h9t_domains.json](./files/1h9t_domains.json).

### CIF format

Annotations can also be encoded using CIF format, a table-based format which is commonly used in structure biology to store structures or any kind of tabular data.

The example from above, encoded as CIF, would look like this:

```cif
data_annotation
loop_
_coloring.label_asym_id
_coloring.beg_label_seq_id
_coloring.end_label_seq_id
_coloring.color
A . . '#00ff00'
B . . 'blue'
B 100 200 'skyblue'
B 150 160 'lightblue'
```

An advantage of the CIF format is that it can include multiple annotation tables in the same file, organized into blocks and categories. Then the MVS file can reference individual tables using `block_header` (or `block_index`) and `category_name` parameters. The column containing the dependent variable can be specified using `field_name` parameter. In this case, we could use `"block_header": "annotation", "category_name": "coloring", "field_name": "color"`.

### BCIF format

This has exactly the same structure as the CIF format, but encoded using [BinaryCIF](https://github.com/molstar/BinaryCIF).

## Referencing MVS annotations in MVS tree

### From URI

MVS annotations can be referenced in `color_from_uri`, `label_from_uri`, `tooltip_from_uri`, and `component_from_uri` nodes in MVS tree.

For example this part of a MVS tree:

```txt
- representation {type: "cartoon"}
- color {selector: {label_asym_id: "A"}, color: "#00ff00"}
- color {selector: {label_asym_id: "B"}, color: "blue"}
- color {selector: {label_asym_id: "B", beg_label_seq_id: 100, end_label_seq_id: 200}, color: "skyblue"}
- color {selector: {label_asym_id: "B", beg_label_seq_id: 150, end_label_seq_id: 160}, color: "lightblue"}
```

can be replaced by:

```txt
- representation {type: "cartoon"}
- color_from_uri {uri: "https://example.org/annotations.json", format: "json", schema: "residue_range"}
```

assuming that the JSON annotation file shown in the previous section is available at `https://example.org/annotations.json`.

#### Relative URIs

The `uri` parameter can also hold a URI reference (relative URI). In such cases, this URI reference is relative to the URI of the MVS file itself (e.g. if the MVS file is available from `https://example.org/spanish/inquisition/expectations.mvsj`, then the relative URI `./annotations.json` is equivalent to `https://example.org/spanish/inquisition/annotations.json`). This is however not applicable in all cases (e.g. the MVS tree can be constructed ad-hoc within a web application, therefore it has no URI; or the MVS file is loaded from a local disk using drag&drop, therefore the relative location is not accessible by the browser).

A special case is when the MVS tree is saved in MVSX format. An MVSX file is a ZIP archive containing the MVS tree in `index.mvsj` and possibly other files. In this case, the relative URIs will resolve to the files within the archive (e.g. `./annotations.json` points to the file `annotations.json` stored in the MSVX archive).

### From source

The MVS annotations can in fact be stored within the same mmCIF file from which the structure coordinates are loaded. To reference these annotations, we can use `color_from_source`, `label_from_source`, `tooltip_from_source`, and `component_from_source` nodes. Example:

```txt
- representation {type: "cartoon"}
- color_from_source {schema: "residue_range", block_header: "annotation", category_name: "coloring"}
```

## Annotation schemas

The `schema` parameter of all `*_from_uri` and `*_from_source` nodes specifies the MVS annotation schema, i.e. a set of fields used to select a substructure. In the example above we are using `residue_range` schema, which uses columns `label_entity_id`, `label_asym_id`, `beg_label_seq_id`, and `end_label_seq_id`. (We didn't provide values for `label_entity_id`, so it is not taken into account even though the schema supports it).

Table of selector field names supported by individual MVS annotation schemas:

| Field \ Schema | whole_structure | entity | chain | residue | residue_range | atom | auth_chain | auth_residue | auth_residue_range | auth_atom | all_atomic |
| :---------------- | :-------------: | :----: | :---: | :-----: | :-----------: | :--: | :--------: | :----------: | :----------------: | :-------: | :--------: |
| label_entity_id | | X | X | X | X | X | | | | | X |
| label_asym_id | | | X | X | X | X | | | | | X |
| label_seq_id | | | | X | | X | | | | | X |
| beg_label_seq_id | | | | | X | | | | | | X |
| end_label_seq_id | | | | | X | | | | | | X |
| label_atom_id | | | | | | X | | | | | X |
| auth_asym_id | | | | | | | X | X | X | X | X |
| auth_seq_id | | | | | | | | X | | X | X |
| pdbx_PDB_ins_code | | | | | | | | X | | X | X |
| beg_auth_seq_id | | | | | | | | | X | | X |
| end_auth_seq_id | | | | | | | | | X | | X |
| auth_atom_id | | | | | | | | | | X | X |
| type_symbol | | | | | | X | | | | X | X |
| atom_id | | | | | | X | | | | X | X |
| atom_index | | | | | | X | | | | X | X |

To include all selector field names that are present in the annotation, one can use `"schema": "all_atomic"` (we could use it in the example above and the result would be the same). In future versions of MVS, non-atomic schemas might be added, to select parts of structures that are not composed of atoms, e.g. coarse models or geometric primitives.

## `group_id` field

The `group_id` field is a special field supported by all MVS annotation schemas. It does not change the sets of atoms selected by individual rows but instead groups annotation rows together to create more complex selections. This is useful when adding labels to our visualization.

The following example (when applied via `label_from_uri` node) will create 7 separate labels, each bound to a single residue:

```cif
data_annotation
loop_
_labels.label_asym_id
_labels.label_seq_id
_labels.color
_labels.label
A 100 pink 'Substrate binding site'
A 150 pink 'Substrate binding site'
A 170 pink 'Substrate binding site'
A 200 blue 'Inhibitor binding site'
A 220 blue 'Inhibitor binding site'
A 300 lime 'Glycosylation site'
A 330 lime 'Glycosylation site'
```

On the other hand, the next example will only create 4 labels ("Substrate binding site" label bound to residues 100, 150, and 170; "Inhibitor binding site" label bound to residues 200 and 220; "Glycosylation site" label bound to residue 300; and "Glycosylation site" label bound to residue 330):

```cif
data_annotation
loop_
_labels.group_id
_labels.label_asym_id
_labels.label_seq_id
_labels.color
_labels.label
1 A 100 pink 'Substrate binding site'
1 A 150 pink 'Substrate binding site'
1 A 170 pink 'Substrate binding site'
2 A 200 blue 'Inhibitor binding site'
2 A 220 blue 'Inhibitor binding site'
. A 300 lime 'Glycosylation site'
. A 330 lime 'Glycosylation site'
```

Note: Annotation rows with empty `group_id` field (`.` in CIF, ommitted field or `null` in JSON) are always treated as separate groups.

Note 2: `group_id` field has no effect on colors, tooltips, components. It only makes any difference for labels.
75 changes: 75 additions & 0 deletions docs/extensions/mvs/camera-settings.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# MVS camera settings

Camera position and orientation in MVS views can be adjusted in two ways: using a `camera` node or a `focus` node. Global attributes of the MVS view unrelated to camera positioning can be adjusted via a `canvas` node.

## **`camera`** node

This node instructs to directly set the camera position and orientation. This is done by passing `target`, `position`, and optional `up` vector. The `camera` node is placed as a child of the `root` node (see [MVS tree schema](./mvs-tree-schema.md#camera)).

![Camera parameters](./files/camera.png 'Camera parameters')

However, if the `target` and `position` vectors were interpreted directly, the resulting view would wildly depend on the camera _field of view_ (FOV). For example, assume we have a sphere with center in the point [0,0,0] and radius 10 Angstroms, and we set `target=[0,0,0]` and `position=[0,0,20]`. With a camera with vertical FOV=90°, the sphere will fit into the camera's view nicely, with some margin above and under the sphere. But with a camera with vertical FOV=30°, the top and bottom of sphere will be cropped. To avoid these differences, MVS always uses position of a "reference camera" instead of the real camera position.

We define the "reference camera" as a camera with such FOV that a sphere with radius _R_ viewed from distance 2*R* (from the center of the sphere) will just fit into view (i.e. there will be no margin but the sphere will not be cropped). This happens to be FOV = 2 arcsin(1/2) = 60° for perspective projection, and FOV = 2 arctan(1/2) ≈ 53° for orthographic projection.

When using **perspective** projection, the real camera distance from target and the real camera position can be calculated using these formulas:

$d _\mathrm{adj} = d _\mathrm{ref} \cdot \frac{1}{2 \sin(\alpha/2)}$

$\mathbf{p} _\mathrm{adj} = \mathbf{t} + (\mathbf{p} _\mathrm{ref} - \mathbf{t}) \cdot \frac{1}{2 \sin(\alpha/2)}$

Where $\alpha$ is the vertical FOV of the real camera, $d _\mathrm{ref}$ is the reference camera distance from target, $d _\mathrm{adj}$ is the real (adjusted) camera distance from target, $\mathbf{t}$ is the target position, $\mathbf{p} _\mathrm{ref}$ is the reference camera position (the value in the MVS file), and $\mathbf{p} _\mathrm{adj}$ is the real (adjusted) camera position.

When using **orthographic** projection, the formulas are slightly different:

$d _\mathrm{adj} = d _\mathrm{ref} \cdot \frac{1}{2 \tan(\alpha/2)}$

$\mathbf{p} _\mathrm{adj} = \mathbf{t} + (\mathbf{p} _\mathrm{ref} - \mathbf{t}) \cdot \frac{1}{2 \tan(\alpha/2)}$

The following image illustrates the camera position adjustment (left: reference camera with FOV=60°, right: real camera with FOV=30° must be positioned further from the target to obtain a similar view):

![Camera position adjustment (left: reference camera, right: real camera)](./files/camera-adjustment.png 'Camera position adjustment (left: reference camera, right: real camera)')

Using the example above (`target=[0,0,0]` and `position=[0,0,20]`), we can calculate that the real camera position will have to be set to:

- [0, 0, 14.14] for FOV=90° (perspective projection)
- [0, 0, 20] for FOV=60° (perspective projection)
- [0, 0, 38.68] for FOV=30° (perspective projection)

Note that for orthographic projection this adjustment achieves that the resulting view does not depend on the FOV value. For perspective projection, this is not possible and there will always be some "fisheye effect", but still it greatly reduces the dependence on FOV and avoids the too-much-zoomed-in and too-much-zoomed-out views when FOV changes.

The `up` vector describes how the camera should be rotated around the position-target axis, i.e. it is the vector in 3D space that will be point up when projected on the screen. For this, the `up` vector must be perpendicular to the position-target axis. However, the MVS specification does not require that the provided `up` vector be perpendicular. This can be solved by a simple adjustment:

$\mathbf{u} _\mathrm{adj} = \mathrm{normalize} ( ((\mathbf{t}-\mathbf{p}) \times \mathbf{u}) \times (\mathbf{t}-\mathbf{p}) )$

Where $\mathbf{u}$ is the unadjusted up vector (the value in the MVS file), $\mathbf{u} _\mathrm{adj}$ is the adjusted up vector, $\mathbf{t}$ is the target position, and $\mathbf{p}$ is the camera position (can be either reference or adjusted camera position, the result will be the same).

If the up vector parameter is not provided, the default value ([0, 1, 0]) will be used (after adjustment).

## **`focus`** node

The other way to adjust camera is to use a `focus` node. This node is placed as a child of a `component` node and instructs to set focus to the parent component (zoom in). This means that the camera target should be set to the center of the bounding sphere of the component, and the camera position should be set so that the bounding sphere just fits into view (vertically and horizontally).

By default, the camera will be oriented so that the X axis points right, the Y axis points up, and the Z axis points towards the observer. This orientation can be changed using the optional vector parameters `direction` and `up` (see [MVS tree schema](./mvs-tree-schema.md#focus)). The `direction` vector describes the direction from the camera position towards the target position (default [0, 0, -1]). The meaning of the `up` vector is the same as for the `camera` node and the same adjustment applies to it (default [0, 1, 0]).

The reference camera position for a `focus` node can be calculated as follows:

$\mathbf{p} _\mathrm{ref} = \mathbf{t} - \mathrm{normalize}(\mathbf{d}) \cdot 2 r \cdot \max(1, \frac{h}{w})$

Where $\mathbf{t}$ is the target position (center of the bounding sphere of the component), $r$ is the radius of the bounding sphere of the component, $\mathbf{d}$ is the direction vector, $h$ is the height of the viewport, $w$ is the width of the viewport, and $\mathbf{p} _\mathrm{ref}$ is the reference camera position (see explanation above).

The following image illustrates the camera position calculation to fit the bounding sphere of a structure:

![Focus calculation](./files/focus.png 'Focus calculation')

Applying the FOV-adjustment formulas from the previous section, we can easily calculate the real position that we have to set to the camera ($\mathbf{p} _\mathrm{adj}$):

For perspective projection: $\mathbf{p} _\mathrm{adj} = \mathbf{t} - \mathrm{normalize}(\mathbf{d}) \cdot \frac{r}{\sin(\alpha/2)} \cdot \max(1, \frac{h}{w})$

For orthographic projection: $\mathbf{p} _\mathrm{adj} = \mathbf{t} - \mathrm{normalize}(\mathbf{d}) \cdot \frac{r}{\tan(\alpha/2)} \cdot \max(1, \frac{h}{w})$

## **`canvas`** node

Attributes that apply to the MVS view as a whole, but are not related to camera positioning, can be set using a `canvas` node. This node is placed as a child of the `root` node (see [MVS tree schema](./mvs-tree-schema.md#canvas)).

Currently, this only includes one parameter: `background_color`. Its value can be set to either a [X11 color](http://www.w3.org/TR/css3-color/#svg-color) (e.g. `"red"`), or a hexadecimal color code (e.g. `"#FF0011"`). If there is no `canvas` node, the background will be white.
Loading

0 comments on commit bff7d83

Please sign in to comment.