row/col stochastic matrix documentation #807
base: master
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -672,6 +672,72 @@ z_k | |
. | ||
$$ | ||
|
||
## Stochastic Matrix {#stochastic-matrix-transform.section} | ||
|
||
The `column_stochastic_matrix[N, M]` and `row_stochastic_matrix[N, M]` type in | ||
Stan represents an \(N \times M\) matrix where each column(row) is a unit simplex | ||
of dimension \(N\). In other words, each column(row) of the matrix is a vector | ||
constrained to have non-negative entries that sum to one. | ||
|
||
### Definition of a Stochastic Matrix {-} | ||
|
||
A column stochastic matrix \(X \in \mathbb{R}^{N \times M}\) is defined such | ||
that for each column \(j\) (where \(1 \leq j \leq M\)): | ||
|
||
$$ | ||
X_{ij} \geq 0 \quad \text{for } 1 \leq i \leq N, | ||
$$ | ||
|
||
and | ||
|
||
$$ | ||
\sum_{i=1}^N X_{ij} = 1. | ||
$$ | ||
|
||
A row stochastic matrix is defined similarly but with the axis flipped such | ||
Review comment: Just as easy to say a row stochastic matrix is any matrix whose transpose is a column stochastic matrix. I would also say in words that a column stochastic matrix has columns that are simplexes, whereas the row version has rows that are simplexes. I've tried to write matrix subscripts as "i, j" rather than "ij" to allow for multiple-character subscripts.
Reply: Agree, can't believe I forgot to just say what this is in simple terms
||
that | ||
|
||
|
||
$$ | ||
X_{ij} \geq 0 \quad \text{for } 1 \leq j \leq M, | |
$$ | |
|
||
and | |
|
||
$$ | |
\sum_{j=1}^M X_{ij} = 1. | |
$$ | ||
|
||
This definition ensures that each column(row) of the matrix \(X\) lies on the | ||
Review comment: space between column and ( --- this also appears later Just stick to
Reply: Yes, I had a reason for why I did that (eigen docs did column(row)), but it looks like they fixed it in their docs so I'll fix it here as well
||
\(N-1\) dimensional unit simplex, similar to the `simplex[N]` type, but | ||
extended across multiple columns(rows). | ||
|
||
### Inverse Transform for Stochastic Matrix {-} | ||
|
||
For the column and row stochastic matrices the inverse transform is the same | ||
as simplex, but applied to each column(row). | ||
|
||
### Absolute Jacobian Determinant for the Inverse Transform {-} | ||
|
||
The Jacobian determinant of the inverse transform for each column \(j\) in | ||
the matrix is given by the product of the diagonal entries \(J_{i,i,j}\) of | ||
the lower-triangular Jacobian matrix. This determinant is calculated as: | ||
|
||
$$ | ||
\left| \det J_j \right| = \prod_{i=1}^{N-1} \left( z_{ij} (1 - z_{ij}) \left( 1 - \sum_{i'=1}^{i-1} X_{i'j} \right) \right). | ||
$$ | ||
|
||
Thus, the overall Jacobian determinant for the entire `column_stochastic_matrix` and `row_stochastic_matrix` | ||
is the product of the determinants for each column(row): | ||
|
||
$$ | ||
\left| \det J \right| = \prod_{j=1}^{M} \left| \det J_j \right|. | ||
$$ | ||
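
As a rough numerical illustration of the two formulas above, the sketch below applies a stick-breaking simplex inverse transform to each column and sums the per-column log Jacobian terms. It assumes the stick-breaking parameterization in which \(z_{ij}\) is the inverse logit of the unconstrained value shifted by \(-\log(N - i)\); that offset, the helper names (`inv_logit`, `simplex_constrain`, `column_stochastic_constrain`), and the column-list layout are illustrative choices, not Stan's implementation.

```python
import math


def inv_logit(u):
    """Logistic function mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


def simplex_constrain(y):
    """Map N-1 unconstrained values to an N-simplex by stick breaking.

    Returns the simplex and the log absolute Jacobian determinant,
    accumulated as sum_i log(z_i * (1 - z_i) * (1 - sum_{i' < i} x_i')).
    """
    n = len(y) + 1
    x = []
    stick = 1.0      # remaining mass: 1 - sum of entries assigned so far
    log_det = 0.0
    for i, y_i in enumerate(y, start=1):
        # Assumed offset: makes y = 0 map to the uniform simplex.
        z = inv_logit(y_i - math.log(n - i))
        log_det += math.log(z) + math.log1p(-z) + math.log(stick)
        x_i = stick * z
        x.append(x_i)
        stick -= x_i
    x.append(stick)  # the last entry takes whatever mass is left
    return x, log_det


def column_stochastic_constrain(y_columns):
    """Apply simplex_constrain to each column (given as a list of columns).

    The total log determinant is the sum of the per-column terms, which is
    the log of the product over columns.
    """
    columns = []
    total_log_det = 0.0
    for y in y_columns:
        x, log_det = simplex_constrain(y)
        columns.append(x)
        total_log_det += log_det
    return columns, total_log_det


# Two columns of a 3-row column-stochastic matrix from unconstrained inputs.
columns, log_det = column_stochastic_constrain([[0.0, 0.0], [1.5, -0.3]])
```

Under the reviewer's transpose characterization, a row-stochastic matrix could be handled by the same routine by treating each row as a column.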
|
||
### Transform for Stochastic Matrix {-} | ||
|
||
For the column and row stochastic matrices the transform is the same | ||
as simplex, but applied to each column(row). | ||
|
||
## Unit vector {#unit-vector.section} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -673,6 +673,82 @@ iterations, and in either case, with less dispersed parameter | |
initialization or custom initialization if there are informative | ||
priors for some parameters. | ||
|
||
### Stochastic Matrices {-} | ||
|
||
A stochastic matrix is a matrix where each column, row, or both is a | ||
unit simplex, meaning that each column(row) vector has non-negative | ||
values that sum to 1. For example, a \(3 \times 4\) | ||
column stochastic matrix will look like: | ||
Review comment: I'd say: "The following example is a Note that it's a period at the end this way and doesn't assert that a column stochastic matrix will look one way or another. Also note that when you use a noun compound like "column stochastic" as an adjective (here modifying "matrix") then it should be hyphenated.
Reply: Also fixed Column-matrix below
||
|
||
$$ | ||
\begin{bmatrix} | ||
0.2 & 0.5 & 0.1 & 0.3 \\ | ||
0.3 & 0.3 & 0.6 & 0.4 \\ | ||
0.5 & 0.2 & 0.3 & 0.3 | ||
\end{bmatrix} | ||
$$ | ||
|
||
While a row stochastic matrix will look like: | ||
Review comment: This isn't a complete sentence---it's just a free-floating clause. How about: "An example of a row-stochastic matrix is the following."
||
|
||
$$ | ||
\begin{bmatrix} | ||
0.2 & 0.5 & 0.1 & 0.2 \\ | ||
0.2 & 0.1 & 0.6 & 0.1 \\ | ||
0.5 & 0.2 & 0.2 & 0.1 | ||
\end{bmatrix} | ||
$$ | ||
|
||
|
||
In this example, each column(row) sums to 1, making the matrix a | ||
Review comment: "each row" --- this is a single example.
Reply: I fixed this a bit to make it more clear we are talking about both of the examples
||
valid `column_stochastic_matrix` and `row_stochastic_matrix`. | ||
|
||
bob-carpenter marked this conversation as resolved.
|
||
Column stochastic matrices are often used in models where | ||
each column represents a probability distribution across a | ||
set of categories, such as in multiple multinomial distributions, | ||
transition matrices in Markov models, or compositional data analysis. | ||
They can also be used in situations where multiple Dirichlet-distributed v | ||
Review comment: I'd drop the Dirichlet comment here. You can just say they can be used whenever you need multiple simplexes of the same dimensionality. The other big application here is factor models, so you should definitely mention those. The rows in the row stochastic matrix in these models are, for example, something like the proportion of pollutants being emitted from a factory.
Reply: Added factor model to the examples
||
ariables are required across different dimensions. | ||
Review comment: premature line break
Reply: Cut sentence
||
|
||
The `column_stochastic_matrix` and `row_stochastic_matrix` types are declared | ||
with full dimensionality. For instance, a matrix `theta` with | ||
Review comment: full dimensionality ---> row and column sizes
Reply: Fixed
||
3 rows and 4 columns, where each | ||
column is a 3-simplex, is declared as: | ||
|
||
```stan | ||
column_stochastic_matrix[3, 4] theta; | ||
``` | ||
|
||
A matrix `theta` with | ||
Review comment: Too many line breaks.
Reply: Fixed
||
3 rows and 4 columns, where each | ||
row is a 4-simplex, is declared as: | ||
|
||
```stan | ||
row_stochastic_matrix[3, 4] theta; | ||
``` | ||
|
||
As with simplexes, `column(row)_stochastic_matrix` variables are subject to | ||
Review comment: This is too hard to parse---just repeat, we have tons of space here. That is, separate out column_stochastic_matrix and row_stochastic_matrix.
||
validation, ensuring that each column(row) satisfies the simplex constraints. | ||
This validation accounts for floating-point imprecision, with checks | ||
performed up to a statically specified accuracy threshold \(\epsilon\). | ||
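
The validation described above can be sketched in a few lines. The check below tests non-negativity and unit column sums up to a tolerance, and handles the row case through the transpose, as suggested in the review. The function names and the default tolerance of `1e-8` are assumptions for illustration, not Stan's actual validation code; the two matrices are the examples from this section.

```python
def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def is_column_stochastic(X, tol=1e-8):
    """True if every column is non-negative and sums to 1 within tol.

    tol is an assumed, illustrative accuracy threshold.
    """
    for col in transpose(X):
        if any(v < 0 for v in col):
            return False
        if abs(sum(col) - 1.0) > tol:
            return False
    return True


def is_row_stochastic(X, tol=1e-8):
    """A matrix is row stochastic when its transpose is column stochastic."""
    return is_column_stochastic(transpose(X), tol)


# The 3 x 4 example matrices from this section, stored as lists of rows.
col_example = [
    [0.2, 0.5, 0.1, 0.3],
    [0.3, 0.3, 0.6, 0.4],
    [0.5, 0.2, 0.3, 0.3],
]
row_example = [
    [0.2, 0.5, 0.1, 0.2],
    [0.2, 0.1, 0.6, 0.1],
    [0.5, 0.2, 0.2, 0.1],
]

col_ok = is_column_stochastic(col_example)
row_ok = is_row_stochastic(row_example)
```

A tolerance is needed because sums such as `0.1 + 0.6 + 0.3` need not equal `1.0` exactly in floating point.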
|
||
#### Stability Considerations {-} | ||
|
||
In high-dimensional settings or when the matrix has many columns, | ||
Review comment: It's not clear how "high-dimensional" and "has many columns" differ. Is high-dimensional the rows? I'd just say "high-dimensional" here.
||
`column_stochastic_matrix` types may require careful tuning of the inference | ||
algorithms. To ensure stability: | ||
|
||
- **Smaller Step Sizes:** In samplers like Hamiltonian Monte Carlo (HMC), | ||
smaller step sizes can help maintain stability, especially in high dimensions. | ||
- **Higher Target Acceptance Rates:** Setting higher target acceptance | ||
rates can improve the robustness of the sampling process. | ||
- **Longer Warmup Periods:** Increasing the warmup period allows the sampler | ||
to better explore the parameter space before the actual sampling begins. | ||
- **Tighter Optimization Tolerances:** For optimization-based inference, | ||
tighter tolerances with more iterations can yield more accurate results. | ||
- **Custom Initialization:** If prior information about the parameters is | ||
available, custom initialization or less dispersed initialization can lead | ||
to more efficient inference. | ||
|
||
### Unit vectors {-} | ||
|
||
A unit vector is a vector with a norm of one. For instance, $[0.5, | ||
|
Review comment: We've used "col" everywhere else for column. Should this match?
Reply: discussed irl, we are wishy washy with col vs column abbreviation, but I like column here so going to keep it