Commit
* add a tutorial file: up to probabilities so far
* re-write the surrounding docs to account for tutorial file
* rename `Bayes` to `BayesianRegularization`. Because Bayes and Bayesian are such huge terms in the literature, it doesn't feel appropriate to use it here like that.
* correctly rename file
* add discrete info section to tutorial
* correct merging
* fix complexity docstring
* finish first draft of tutorial
* use literate to build the tutorial.
* port emphasis of what is a new entropy
* Correct file name
* Typo
* Minor typo and text fixes to the tutorial.
* Punctuation.
* Amplitude and first difference encoding
* Finish first difference and amplitude encodings
* Add CombinationEncoding
* `encode`/`decode` for state vectors for `GaussianCDF`
* Systematic tests for encoding.
* Add `CombinationEncoding` to docs
* Test `CombinationEncoding`
* Remove utils file
* Inner/outer constructors and tests
* Better descriptions of the new encodings
* Clarify inputs to `CombinationEncoding`
* Use outer constructors, not inner
* Add comment on why we're not enforcing multi-element vectors.
* Fix tests
* Disallow `CombinationEncoding` as input to `CombinationEncoding`s
* Correct `total_outcomes` - use `prod`, not `sum`
* Change names
* Add references in `Encoding` docstring
* Remove redundant type info
* Base.show for the encodings.
* Use `RectangularBinEncoding` internally for GaussianCDFEncoding. Fixes #300 too.
* `RectangularBinEncoding` internally for the new encodings
* Add test
* More tests
* More tests
* Update src/encoding_implementations/gaussian_cdf.jl Co-authored-by: George Datseris <datseris.george@gmail.com>
* Update src/encoding_implementations/combination_encoding.jl Co-authored-by: George Datseris <datseris.george@gmail.com>
* Analytical encoding/decoding tests
* Analytical tests for `CombinationEncoding`
* Symbol naming, and drop extra doctest
* Better description
* Remove type restriction. Code will error at lower level if relevant
* Remove show methods. Do at abstract level later
* New constructor
* Update src/encoding_implementations/combination_encoding.jl Co-authored-by: George Datseris <datseris.george@gmail.com>
* Ensure encodings for `CombinationEncoding` is always a tuple
* Return a tuple of decoded symbols
* Enforce encoding tuple input. Use number of encodings as a type param
* Test convenience constructor
* Fix and rearrange tests
* Fix tests
---------
Co-authored-by: Datseris <datseris.george@gmail.com>
Showing 20 changed files with 710 additions and 187 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
export CombinationEncoding | ||
|
||
""" | ||
CombinationEncoding <: Encoding | ||
CombinationEncoding(encodings) | ||
A `CombinationEncoding` takes multiple [`Encoding`](@ref)s and creates a combined | ||
encoding that can be used to encode inputs that are compatible with the | ||
given `encodings`. | ||
## Encoding/decoding | ||
When used with [`encode`](@ref), each [`Encoding`](@ref) in `encodings` returns | ||
integers in the set `1, 2, …, n_e`, where `n_e` is the total number of outcomes | ||
for a particular encoding. For `k` different encodings, we can thus construct the | ||
cartesian coordinate `(c₁, c₂, …, cₖ)` (`cᵢ ∈ 1, 2, …, n_i`), which can uniquely | ||
be identified by an integer. We can thus identify each unique *combined* encoding | ||
with a single integer. | ||
When used with [`decode`](@ref), the integer symbol is converted to its corresponding | ||
cartesian coordinate, which is used to retrieve the decoded symbols for each of | ||
the encodings, and a tuple of the decoded symbols is returned. | ||
The total number of outcomes is `prod(total_outcomes(e) for e in encodings)`. | ||
## Examples | ||
```julia | ||
using ComplexityMeasures | ||
# We want to encode the vector `x`. | ||
x = [0.9, 0.2, 0.3] | ||
# To do so, we will use a combination of first-difference encoding, amplitude encoding, | ||
# and ordinal pattern encoding. | ||
encodings = ( | ||
RelativeFirstDifferenceEncoding(0, 1; n = 2), | ||
RelativeMeanEncoding(0, 1; n = 5), | ||
OrdinalPatternEncoding(3) # x is a three-element vector | ||
) | ||
c = CombinationEncoding(encodings) | ||
# Encode `x` as integer | ||
ω = encode(c, x) | ||
# Decode the symbol (into a tuple of decodings, one for each encoding `e ∈ encodings`). | ||
# In this particular case, the first two elements will be left-bin edges, and | ||
# the last element will be the decoded ordinal pattern (indices that would sort `x`). | ||
d = decode(c, ω) | ||
``` | ||
""" | ||
struct CombinationEncoding{N, L, C} <: Encoding | ||
# A tuple of encodings. | ||
encodings::NTuple{N, Encoding} | ||
|
||
# internal fields: LinearIndices/CartesianIndices for encodings/decodings. | ||
linear_indices::L | ||
cartesian_indices::C | ||
|
||
function CombinationEncoding(encodings::NTuple{N, Encoding}, l::L, c::C) where {N, L, C} | ||
if any(e isa CombinationEncoding for e in encodings) | ||
s = "CombinationEncoding doesn't accept a CombinationEncoding as one of its " * | ||
"sub-encodings." | ||
throw(ArgumentError(s)) | ||
end | ||
new{N, L, C}(encodings, l, c) | ||
end | ||
end | ||
CombinationEncoding(encodings) = CombinationEncoding(encodings...) | ||
function CombinationEncoding(encodings::Vararg{Encoding, N}) where N | ||
ranges = map(e -> 1:total_outcomes(e), encodings) # `map` over a tuple returns a tuple | ||
linear_indices = LinearIndices(ranges) | ||
cartesian_indices = CartesianIndices(ranges) | ||
return CombinationEncoding(tuple(encodings...), linear_indices, cartesian_indices) | ||
end | ||
|
||
function encode(encoding::CombinationEncoding, χ) | ||
symbols = CartesianIndex(map(e -> encode(e, χ), encoding.encodings)) | ||
ω::Int = encoding.linear_indices[symbols] | ||
return ω | ||
end | ||
|
||
function decode(encoding::CombinationEncoding, ω::Int) | ||
es = encoding.encodings | ||
cidx = encoding.cartesian_indices[ω] | ||
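# Pair each encoding with its own coordinate; looking indices up with `findfirst` | ||
# would return the wrong coordinate when two sub-encodings compare equal. | ||
return map(decode, es, Tuple(cidx)) | ||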
end | ||
|
||
function total_outcomes(encoding::CombinationEncoding) | ||
return prod(total_outcomes.(encoding.encodings)) | ||
end |
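The index bookkeeping behind `encode`/`decode` can be tried in isolation with Base Julia alone. The following is a minimal sketch, not the package's code: the outcome counts `(2, 5, 6)` and the symbol triple `(2, 3, 4)` are made-up values standing in for what three sub-encodings might return.

```julia
# Hypothetical outcome counts for three sub-encodings (illustrative values only).
n_outcomes = (2, 5, 6)
ranges = map(n -> 1:n, n_outcomes)

# The same two lookup tables that the struct stores as internal fields.
linear_indices = LinearIndices(ranges)
cartesian_indices = CartesianIndices(ranges)

# "Encode": the per-encoding symbols (c₁, c₂, c₃) collapse into a single integer.
symbols = (2, 3, 4)
ω = linear_indices[CartesianIndex(symbols)]

# "Decode": the integer maps back to the same cartesian coordinate.
recovered = Tuple(cartesian_indices[ω])

# The number of distinct combined symbols is the product of the outcome counts.
n_combined = length(linear_indices)
```

Because the mapping is a bijection between `1:prod(n_outcomes)` and the coordinate grid, the round trip recovers `symbols` exactly.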
99 changes: 99 additions & 0 deletions
src/encoding_implementations/relative_first_difference_encoding.jl
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
export RelativeFirstDifferenceEncoding | ||
|
||
""" | ||
RelativeFirstDifferenceEncoding <: Encoding | ||
RelativeFirstDifferenceEncoding(minval::Real, maxval::Real; n = 2) | ||
`RelativeFirstDifferenceEncoding` encodes a vector based on where the mean of the | ||
absolute *first differences* of the vector falls relative to a predefined minimum and | ||
maximum value (`minval` and `maxval`, respectively). | ||
## Description | ||
This encoding is inspired by Azami & Escudero[^Azami2016]'s algorithm for amplitude-aware | ||
permutation entropy. They use a linear combination of amplitude information and | ||
first differences information of state vectors to correct probabilities. Here, however, | ||
we explicitly encode the first differences part of the correction as an integer symbol | ||
`Λ ∈ [1, 2, …, n]`. The amplitude part of the encoding is available | ||
as the [`RelativeMeanEncoding`](@ref) encoding. | ||
## Encoding/decoding | ||
When used with [`encode`](@ref), an ``m``-element state vector | ||
``\\bf{x} = (x_1, x_2, \\ldots, x_m)`` is encoded | ||
as ``Λ = \\dfrac{1}{m - 1}\\sum_{k=2}^m |x_{k} - x_{k-1}|``. The value of ``Λ`` is then | ||
normalized to lie on the interval `[0, 1]`, assuming that the minimum/maximum value any | ||
single ``|x_k - x_{k-1}|`` can take is `minval`/`maxval`, respectively. Finally, the | ||
interval `[0, 1]` is discretized into `n` discrete bins, enumerated by positive integers | ||
`1, 2, …, n`, and the number of the bin that the normalized ``Λ`` falls into is returned. | ||
The smaller the mean first difference of the state vector, the lower the bin number; | ||
the larger the mean first difference, the higher the bin number. | ||
When used with [`decode`](@ref), the left-edge of the bin that the normalized ``Λ`` | ||
fell into is returned. | ||
## Performance tips | ||
If you are encoding multiple input vectors, it is more efficient to construct a | ||
[`RelativeFirstDifferenceEncoding`](@ref) instance and re-use it: | ||
```julia | ||
minval, maxval = 0, 1 | ||
encoding = RelativeFirstDifferenceEncoding(minval, maxval; n = 4) | ||
pts = [rand(3) for i = 1:1000] | ||
[encode(encoding, x) for x in pts] | ||
``` | ||
[^Azami2016]: | ||
Azami, H., & Escudero, J. (2016). Amplitude-aware permutation entropy: | ||
Illustration in spike detection and signal segmentation. Computer Methods and | ||
Programs in Biomedicine, 128, 40-51. | ||
""" | ||
Base.@kwdef struct RelativeFirstDifferenceEncoding{R} <: Encoding | ||
n::Int = 2 | ||
minval::Real | ||
maxval::Real | ||
binencoder::R # RectangularBinEncoding | ||
|
||
function RelativeFirstDifferenceEncoding(n::Int, minval::Real, maxval::Real, binencoder::R) where R | ||
# `minval == maxval` would make the normalization in `encode` divide by zero. | ||
if minval >= maxval | ||
s = "Need minval < maxval. Got minval=$minval and maxval=$maxval." | ||
throw(ArgumentError(s)) | ||
end | ||
if n < 1 | ||
throw(ArgumentError("n must be ≥ 1")) | ||
end | ||
new{typeof(binencoder)}(n, minval, maxval, binencoder) | ||
end | ||
end | ||
|
||
function RelativeFirstDifferenceEncoding(minval::Real, maxval::Real; n = 2) | ||
binencoder = RectangularBinEncoding(FixedRectangularBinning(0, 1, n + 1)) | ||
return RelativeFirstDifferenceEncoding(n, minval, maxval, binencoder) | ||
end | ||
|
||
function encode(encoding::RelativeFirstDifferenceEncoding, x::AbstractVector{<:Real}) | ||
(; n, minval, maxval, binencoder) = encoding | ||
|
||
L = length(x) | ||
Λ = 0.0 # a loop is much faster than using `diff` (which allocates a new vector) | ||
for i = 2:L | ||
Λ += abs(x[i] - x[i - 1]) | ||
end | ||
Λ /= (L - 1) | ||
|
||
# Normalize to [0, 1] | ||
Λ_normalized = (Λ - minval) / (maxval - minval) | ||
|
||
# Return an integer from the set {1, 2, …, encoding.n} | ||
return encode(binencoder, Λ_normalized) | ||
end | ||
|
||
function decode(encoding::RelativeFirstDifferenceEncoding, ω::Int) | ||
# Return the left-edge of the bin. | ||
return decode(encoding.binencoder, ω) | ||
end | ||
|
||
function total_outcomes(encoding::RelativeFirstDifferenceEncoding) | ||
return total_outcomes(encoding.binencoder) | ||
end |
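The three steps of `encode` above (mean absolute first difference, normalization, binning) can be sketched in plain Julia without the package. `first_difference_bin` is a hypothetical helper written for illustration only. Its binning rule, equal-width bins on `[0, 1]` with out-of-range values clamped into the first or last bin, is an assumption and may differ from how `RectangularBinEncoding` treats edge cases.

```julia
# Hypothetical helper, not part of ComplexityMeasures.
function first_difference_bin(x::AbstractVector{<:Real}, minval::Real, maxval::Real; n::Int = 2)
    length(x) ≥ 2 || throw(ArgumentError("need at least two elements"))
    minval < maxval || throw(ArgumentError("need minval < maxval"))

    # Mean absolute first difference, accumulated in a loop to avoid allocating.
    Λ = 0.0
    for i in 2:length(x)
        Λ += abs(x[i] - x[i - 1])
    end
    Λ /= (length(x) - 1)

    # Normalize to [0, 1] relative to the assumed minimum/maximum difference.
    Λ_normalized = (Λ - minval) / (maxval - minval)

    # Assumption: n equal-width bins, with out-of-range values clamped to bin 1 or n.
    return clamp(floor(Int, Λ_normalized * n) + 1, 1, n)
end

x = [0.9, 0.2, 0.3]
bin = first_difference_bin(x, 0, 1; n = 2)
```

For this `x` the mean absolute difference is roughly 0.4, which lands in the lower of the two bins. A vector that oscillates between the extremes, such as `[0.0, 1.0, 0.0]`, lands in the upper one.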