Fixing broken links in documentation, deprecation warnings (#77)
* switching from `setup.cfg` to poetry; replacing flake8, isort, and black with ruff
- fixing ruff linting errors

* fixing ruff errors on `tests`

* more authors

* fixing github actions

* 1. upgrading versions of pandas, pyparsing, fastapi, uvicorn, and pydantic.
2. upgrading pydantic model classes
3. fixing tests

* Fixing deprecation warning

* Solving deprecation warnings on pydantic ConfigDict

* fixing github actions

* Fixing mkdocs link errors and warnings

* fixing deprecations

* Add nice format to evaluated metrics in long format

* Fixing warnings and tests

* Fixing some deprecation warnings
ondraz authored Jun 18, 2024
1 parent a8d8f49 commit 83913e3
Showing 14 changed files with 941 additions and 286 deletions.
2 changes: 1 addition & 1 deletion docs/architecture.md
@@ -12,7 +12,7 @@ While we built Experimentation Portal in Avast, we are far from being able to op

Event data, goals, metrics, database, storage, access: these are all implementation-specific and heavily proprietary. Ep-Stats does not require any particular database or data format. Ep-Stats provides an abstract experiment definition, including definitions of the goals required for experiment evaluation. It is left to implementors to transform this abstract experiment definition into SQL queries, Spark queries, or whatever is already in place. It is also left to implementors whether they support all Ep-Stats features in these queries, e.g. filtering results by domains, having goals with parameters, etc.

-Ep-Stats only requires goals aggregated per variant as input to the [`Experiment.evaluate_agg`](./api/experiment.md#epstats.toolkit.experiment.Experiment.evaluate_agg) method. See [this example](./user_guide/aggregation.md#example) for details.
+Ep-Stats only requires goals aggregated per variant as input to the [`Experiment.evaluate_agg`][epstats.toolkit.experiment.Experiment.evaluate_agg] method. See [this example](./user_guide/aggregation.md#example) for details.

Ep-Stats abstracts access to data using proprietary implementations of the [`Dao`](./api/dao.md) and [`DaoFactory`](./api/dao_factory.md) classes that are passed via dependency injection into the REST app.

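To ground the notion of an "abstract experiment definition" from the hunk above, this is roughly what one looks like in ep-stats (adapted from the project README; treat exact signatures as assumptions for your installed version):

```python
from epstats.toolkit import Experiment, Metric, SrmCheck

# Goals are referenced symbolically (e.g. `test_unit_type.unit.click`);
# it is up to the implementor to translate them into SQL or Spark queries
# against their own event data.
experiment = Experiment(
    "test-conversion",
    "a",  # control variant
    [
        Metric(
            1,
            "Click-through Rate",
            "count(test_unit_type.unit.click)",       # nominator goal
            "count(test_unit_type.global.exposure)",  # denominator goal
        ),
    ],
    [SrmCheck(1, "SRM", "count(test_unit_type.global.exposure)")],
    unit_type="test_unit_type",
)
```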
2 changes: 1 addition & 1 deletion docs/principles.md
@@ -56,7 +56,7 @@ The exposure is a (special) goal in the ep-stats as well.

We attribute goals to experiments and variants based on unit exposures. This means that every event that should count as a goal in ep-stats must reference some unit. Without that, ep-stats is not able to recognize which exposure to look up to determine which experiment(s) and variant(s) the goal should be attributed to.

-See [Supported Unit Types](user_guide/aggregation.md#supported_unit_types).
+See [Supported Unit Types](./user_guide/aggregation.md#supported-unit-types).

## Metrics

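The attribution described in this hunk is essentially a join of goal events to exposures on the unit id. A minimal sketch of the idea in pandas, with purely illustrative column names (this is not ep-stats code):

```python
import pandas as pd

# Hypothetical event data; column names are illustrative only.
exposures = pd.DataFrame({
    "unit_id": [1, 2],
    "exp_id": ["test-conversion", "test-conversion"],
    "exp_variant_id": ["a", "b"],
})
goals = pd.DataFrame({"unit_id": [2, 2], "goal": ["click", "conversion"]})

# Every goal event must reference a unit; joining it to an exposure is what
# determines the experiment and variant the goal is attributed to. A goal
# whose unit has no exposure cannot be attributed and is dropped here.
attributed = goals.merge(exposures, on="unit_id", how="inner")
print(attributed)
```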
2 changes: 1 addition & 1 deletion docs/stats/basics.md
@@ -22,7 +22,7 @@ Two-sample t-test is only correct when we deal with absolute difference, i.e. $B
### Independent and Identically Distributed Observations
Welch's t-test assumes that observations are independent and identically distributed (i.i.d.). Unfortunately, this assumption does not always hold.

-Let's assume a Click-through rate metric (i.e. clicks / views). Since multiple views (and clicks) from the same user (the user being the randomization unit here) are allowed, the assumption of independence is violated. Multiple observations from the same user are not independent. The [delta method for iid](ctr.md##asymptotic-distribution-of-ctr) is necessary (it has not been implemented yet).
+Let's assume a Click-through rate metric (i.e. clicks / views). Since multiple views (and clicks) from the same user (the user being the randomization unit here) are allowed, the assumption of independence is violated. Multiple observations from the same user are not independent. The [delta method for iid](ctr.md#asymptotic-distribution-of-ctr) is necessary (it has not been implemented yet).

### Multiple Comparisons Problem
If we have only one control $A$ and one treatment $B$ variant, we need to run just one Welch's t-test, i.e. the relative difference between $B$ and $A$. If we have multiple treatment variants, e.g. $B$, $C$ and $D$, we need to run three Welch's t-tests, i.e. the relative differences between $B$ and $A$, $C$ and $A$, and $D$ and $A$. If we run every single test at a 5% level of significance, the overall level of significance is lower. The probability of a false-positive error (i.e. wrongly rejecting at least one null hypothesis) is higher than the required 5% level.
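The inflation described in the multiple-comparisons paragraph is easy to quantify with plain Python (a back-of-the-envelope sketch, not ep-stats code):

```python
# Family-wise error rate (FWER): probability of at least one false positive
# across m independent tests, each run at significance level alpha.
alpha, m = 0.05, 3
fwer = 1 - (1 - alpha) ** m
print(f"{fwer:.3f}")  # 0.143 -- nearly three times the required 0.05
```

A standard remedy is a correction such as Bonferroni, i.e. running each of the $m$ tests at level $\alpha / m$; see the statistics documentation for what ep-stats actually implements.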
8 changes: 4 additions & 4 deletions docs/user_guide/aggregation.md
@@ -19,11 +19,11 @@ Ep-stats support any type of randomization unit. It is a responsibility of an in

### Note on Randomization

-It is necessary for the statistics to work correctly that unit exposures are randomly (independently and identically a.k.a. IID) distributed within one experiment into its variants. This is usually the case when we randomize at pageview or event session unit types.
+It is necessary for the statistics to work correctly that unit exposures are randomly (independently and identically a.k.a. IID) distributed within one experiment into its variants. This is usually the case when we randomize at page view or event session unit types.

In general, one unit can experience the experiment variant multiple times.

-Violation of IID leads to uncontrolled false-positive errors in metric evaluation. Ep-stats remedies this IID violation by using [delta method for IID](../stats/ctr.md##asymptotic-distribution-of-ctr).
+Violation of IID leads to uncontrolled false-positive errors in metric evaluation. Ep-stats remedies this IID violation by using [delta method for IID](../stats/ctr.md#asymptotic-distribution-of-ctr).

## Aggregation Types

@@ -69,7 +69,7 @@ value(test_unit_type.unit.conversion(product=p_1, country=A)) / count(test_unit_

## Example

-The following SQL snippet shows how the top-level aggregation should be made to obtain `goals` for [`Experiment.evaluate_agg`](./api/experiment.md#epstats.toolkit.experiment.Experiment.evaluate_agg).
+The following SQL snippet shows how the top-level aggregation should be made to obtain `goals` for [`Experiment.evaluate_agg`][epstats.toolkit.experiment.Experiment.evaluate_agg].

```SQL
SELECT
@@ -111,4 +111,4 @@ SELECT
goal,
```

-See [Test Data](test_data.md) for examples of the pre-aggregated goals that form the input to statistical evaluation using [`Experiment.evaluate_agg`](../api/experiment.md#epstats.toolkit.experiment.Experiment.evaluate_agg) or the per-unit goals that form the input to statistical evaluation using [`Experiment.evaluate_by_unit`](../api/experiment.md#epstats.toolkit.experiment.Experiment.evaluate_by_unit).
+See [Test Data](test_data.md) for examples of the pre-aggregated goals that form the input to statistical evaluation using [`Experiment.evaluate_agg`][epstats.toolkit.experiment.Experiment.evaluate_agg] or the per-unit goals that form the input to statistical evaluation using [`Experiment.evaluate_by_unit`][epstats.toolkit.experiment.Experiment.evaluate_by_unit].
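To make the expected input concrete, evaluation on pre-aggregated goals looks roughly like this (adapted from the project README, using the bundled test data; treat exact names as assumptions for your version):

```python
from epstats.toolkit import Experiment, Metric, SrmCheck
from epstats.toolkit.testing import TestData

experiment = Experiment(
    "test-conversion",
    "a",
    [
        Metric(
            1,
            "Click-through Rate",
            "count(test_unit_type.unit.click)",
            "count(test_unit_type.global.exposure)",
        ),
    ],
    [SrmCheck(1, "SRM", "count(test_unit_type.global.exposure)")],
    unit_type="test_unit_type",
)

# Goals pre-aggregated per variant, e.g. by a SQL query like the one above.
goals = TestData.load_goals_agg(experiment.id)

ev = experiment.evaluate_agg(goals)
print(ev.metrics)  # per-variant means, relative diffs, confidence intervals, p-values
```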
2 changes: 1 addition & 1 deletion docs/user_guide/configuring_api.md
@@ -54,7 +54,7 @@ SELECT

## Configuring REST API

-After having access to our data in a custom implementation of the [`Dao`](../api/dao.md) class, e.g. `CustomDao`, we can follow the example in [`main.py`](/doodlebug/ep-stats-lib/tree/master/src/epstats/main.py) to configure the REST API with our `CustomDao`. We need to implement a `CustomDaoFactory` that creates instances of our `CustomDao` for every request served. We can then customize the `get_dao_factory()` method in `main.py` and launch the server.
+After having access to our data in a custom implementation of the [`Dao`](../api/dao.md) class, e.g. `CustomDao`, we can follow the example in [`main.py`](https://github.com/avast/ep-stats/blob/master/src/epstats/main.py) to configure the REST API with our `CustomDao`. We need to implement a `CustomDaoFactory` that creates instances of our `CustomDao` for every request served. We can then customize the `get_dao_factory()` method in `main.py` and launch the server.

```python
def get_dao_factory():
    ...
```
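A minimal sketch of the dependency-injection wiring described above. `CustomDao`, `CustomDaoFactory`, and the method names below are illustrative assumptions, not the actual ep-stats API:

```python
class CustomDao:
    """Illustrative Dao that queries your own warehouse for experiment data."""

    def get_agg_goals(self, experiment):
        # Run the SQL/Spark query that pre-aggregates goals per variant
        # in the format expected by Experiment.evaluate_agg.
        ...


class CustomDaoFactory:
    """Creates a fresh CustomDao for every request served,
    e.g. with its own database connection."""

    def get_dao(self):
        return CustomDao()


def get_dao_factory():
    return CustomDaoFactory()
```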
2 changes: 1 addition & 1 deletion docs/user_guide/protocol.md
@@ -83,7 +83,7 @@ Checking for SRM-Sample Ratio Mismatch, is an important test everyone doing expe

A failing SRM check tells us there is some underlying problem in the experiment randomization. Experiments with a failed SRM check should not be evaluated at all.

-See [SRM](../stats/basics.md#sample_ratio_mismatch) for details about EP implementation.
+See [SRM](../stats/basics.md#sample-ratio-mismatch-check) for details about EP implementation.

## Pros and Cons

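For intuition about the SRM check, here is a generic sketch of the underlying idea, a chi-square goodness-of-fit test on exposure counts (scipy is used for illustration; ep-stats ships its own implementation, see the link above):

```python
from scipy.stats import chisquare

# Observed exposures per variant vs. the expected 50/50 split.
observed = [10_500, 9_500]
expected = [sum(observed) / 2] * 2

stat, p_value = chisquare(observed, expected)
print(p_value)  # ~1e-12: such a split is wildly unlikely under fair
                # randomization => SRM detected, do not evaluate
```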
2 changes: 1 addition & 1 deletion docs/user_guide/test_data.md
@@ -5,7 +5,7 @@ example of input data and formats required by `epstats`.

There are test goal data in both pre-aggregated and by-unit forms. See [`TestData`](../api/test_data.md) for various access methods.

-The test data themselves are saved as CSV files in [src/epstats/toolkit/testing/resources](/doodlebug/ep-stats-lib/tree/master/src/epstats/toolkit/testing/resources). They include pre-aggregated and by-unit goals together with pre-computed evaluations of metrics, checks, and exposures that our unit tests assert against (e.g. in [`test_experiment.py`](/doodlebug/ep-stats-lib/tree/master/tests/epstats/toolkit/test_experiment.py)).
+The test data themselves are saved as CSV files in [src/epstats/toolkit/testing/resources](https://github.com/avast/ep-stats/tree/master/src/epstats/toolkit/testing/resources). They include pre-aggregated and by-unit goals together with pre-computed evaluations of metrics, checks, and exposures that our unit tests assert against (e.g. in [`test_experiment.py`](https://github.com/avast/ep-stats/blob/master/tests/epstats/toolkit/test_experiment.py)).

## How to Update Test Data

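For orientation, loading the bundled test data looks roughly like this (method names follow the project README; treat them as assumptions if your installed version differs):

```python
from epstats.toolkit.testing import TestData

# Pre-aggregated goals -- the input format for Experiment.evaluate_agg.
goals_agg = TestData.load_goals_agg("test-conversion")

# Per-unit goals -- the input format for Experiment.evaluate_by_unit.
goals_by_unit = TestData.load_goals_by_unit("test-conversion")
```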
5 changes: 3 additions & 2 deletions mkdocs.yml
@@ -47,6 +47,7 @@ plugins:
      python:
        paths: [src]
  - search
+  - autorefs
  - mkdocs-jupyter:
      include_source: True
      execute: False
@@ -82,5 +83,5 @@ markdown_extensions:
  - pymdownx.arithmatex:
      generic: true
  - pymdownx.emoji:
-      emoji_generator: !!python/name:materialx.emoji.to_svg
-      emoji_index: !!python/name:materialx.emoji.twemoji
+      emoji_generator: !!python/name:material.extensions.emoji.to_svg
+      emoji_index: !!python/name:material.extensions.emoji.twemoji
