Merge pull request #13 from NKeleher/rename-to-statsframe
Rename to statsframe
NKeleher authored Jan 25, 2024
2 parents 80a4455 + 4c4d0ed commit 081dc62
Showing 12 changed files with 44 additions and 43 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing

-Contributions to `pydatasummary` are welcome and encouraged! The goal of this
+Contributions to `statsframe` are welcome and encouraged! The goal of this
project is to make summarizing data and statistical models in python easier and
more intuitive. If you have an idea for a new feature, or a bug fix, please make
a suggestion. Every contribution is appreciated and will be considered.
8 changes: 4 additions & 4 deletions Makefile
@@ -7,11 +7,11 @@ install-dev:
style:
	poetry run pre-commit run --hook-stage manual --all-files

-pytest-cov:
-	poetry run pytest --cov-report term --cov=pydatasummary tests/
+pytest-cov: install-dev
+	poetry run pytest --cov-report term --cov=statsframe tests/

-build: pytest-cov
+build:
	poetry build

-publish: build
+publish: style build
	poetry publish
32 changes: 16 additions & 16 deletions README.md
@@ -1,42 +1,42 @@
-# pydatasummary
+# statsframe

Customizable data and model summaries in Python.

-`pydatasummary` creates tables that provide descriptive statistics of
+`statsframe` creates tables that provide descriptive statistics of
numeric and categorical data.

The goal is to provide a simple -- yet customizable -- way to summarize
data and models in Python.

-`pydatasummary` is heavily inspired by [`modelsummary`](https://modelsummary.com/)
+`statsframe` is heavily inspired by [`modelsummary`](https://modelsummary.com/)
in R. The goal is not to replicate all that `modelsummary` does, but to provide
a way of achieving similar results in Python.

-In order to achieve this, `pydatasummary` builds on the [`polars`](https://docs.pola.rs/)
+In order to achieve this, `statsframe` builds on the [`polars`](https://docs.pola.rs/)
library to produce tables that can be easily customized and exported to other formats.

## Basic Usage

-As an example of `pydatasummary` usage, the `skim` function provides a
+As an example of `statsframe` usage, the `skim` function provides a
summary of a DataFrame (either `polars.DataFrame` or `pandas.DataFrame`).
-The default summary statistics returned by `pydatasummary.skim()` are unique values,
+The default summary statistics returned by `statsframe.skim()` are unique values,
percentage missing, mean, standard deviation, minimum, median, and maximum.

-Where possible, `pydatasummary` will print a table to the console and return a
+Where possible, `statsframe` will print a table to the console and return a
polars DataFrame with the summary statistics. This allows for easy customization.
-For example, the `polars.DataFrame` with statistics from `pydatasummary` can be
+For example, the `polars.DataFrame` with statistics from `statsframe` can be
modified using the [`Great Tables`](https://posit-dev.github.io/great-tables/reference/) package.
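
To make the default statistics listed above concrete, here is a minimal standard-library sketch that computes the same seven quantities for a single column. It is not `statsframe`'s implementation: the helper name `summarize_column`, the result keys, the choice to exclude missing values from the unique count, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from __future__ import annotations

import statistics


def summarize_column(values: list[float | None]) -> dict[str, float]:
    """Hypothetical helper: summary statistics for one numeric column.

    Assumes `values` is non-empty and holds at least two non-missing
    entries (statistics.stdev needs two data points).
    """
    # Missing entries are represented as None and dropped before computing.
    present = [v for v in values if v is not None]
    return {
        "unique": len(set(present)),  # assumption: None is not counted
        "missing_pct": 100.0 * (len(values) - len(present)) / len(values),
        "mean": statistics.fmean(present),
        "sd": statistics.stdev(present),  # assumption: sample (n - 1) SD
        "min": min(present),
        "median": statistics.median(present),
        "max": max(present),
    }


if __name__ == "__main__":
    print(summarize_column([1.0, 2.0, 3.0, None]))
```

A real summary applies this per column and collects the rows into a table, which is the shape that `skim` is described as returning.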

```python
import polars as pl
-import pydatasummary as ds
+import statsframe as sf

df = (
    pl.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv")
    .drop("rownames")
)

-stats = ds.skim(df)
+stats = sf.skim(df)

Summary Statistics
Rows: 32, Columns: 11
@@ -61,13 +61,13 @@ We can achieve the same result above with a pandas DataFrame.

```python
import pandas as pd
-import pydatasummary as ds
+import statsframe as sf

trees_df = pd.read_csv(
    "https://vincentarelbundock.github.io/Rdatasets/csv/datasets/trees.csv"
).drop(columns=["rownames"])

-trees_stats = ds.skim(trees_df)
+trees_stats = sf.skim(trees_df)

Summary Statistics
Rows: 31, Columns: 3
@@ -84,19 +84,19 @@ Rows: 31, Columns: 3
## Contributing

If you encounter a bug, have usage questions, or want to share ideas to make
-the `pydatasummary` package more useful, please feel free to file an
-[issue](https://github.com/NKeleher/pydatasummary/issues).
+the `statsframe` package more useful, please feel free to file an
+[issue](https://github.com/NKeleher/statsframe/issues).

## Code of Conduct

-Please note that the **pydatasummary** project is released with a
+Please note that the **statsframe** project is released with a
[contributor code of conduct](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).

By participating in this project you agree to abide by its terms.

## License

-**pydatasummary** is licensed under the MIT license.
+**statsframe** is licensed under the MIT license.

## Governance

6 changes: 3 additions & 3 deletions docs/_quarto.yml
@@ -3,7 +3,7 @@ project:
type: website
output-dir: .
# website:
-# title: "pydatasummary"
+# title: "statsframe"
# favicon: favicon.ico
# twitter-card: true
# navbar:
@@ -27,12 +27,12 @@ project:
# - _sidebar.yml

# quartodoc:
-# package: pydatasummary
+# package: statsframe
# parser: google
# sidebar: _sidebar.yml

# sections:
# - title: "Function reference"
-# desc: "What pydatasummary's functions do"
+# desc: "What statsframe's functions do"
# contents:
# - skim
14 changes: 7 additions & 7 deletions examples/example_datasummary_skim.py
@@ -2,14 +2,14 @@
import pandas as pd
import polars as pl

-import pydatasummary as ds
+import statsframe as sf

# %%
df = pl.read_csv(
"https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv"
).drop("rownames")

-stats = ds.skim(df)
+stats = sf.skim(df)

# %% [markdown]
# Import a csv file to a polars DataFrame:
@@ -26,7 +26,7 @@
# Create a skim table

# %%
-penguins_stats = ds.skim(penguins_df)
+penguins_stats = sf.skim(penguins_df)

# %% [markdown]
# Return the polars DataFrame with the summary statistics
@@ -35,15 +35,15 @@
penguins_stats

# %%
-ds.skim(
+sf.skim(
    penguins_df,
    output="markdown",
    title="Palmer's Penguins Summary Statistics",
    align="l",
)

# %%
-ds.skim(
+sf.skim(
    penguins_df,
    stats="moments",
    output="markdown",
@@ -52,7 +52,7 @@
)

# %%
-ds.skim(
+sf.skim(
    penguins_df,
    stats="full",
    output="markdown",
@@ -72,6 +72,6 @@
trees_df.info()

# %%
-trees_stats = ds.skim(trees_df)
+trees_stats = sf.skim(trees_df)

# %%
3 changes: 2 additions & 1 deletion poetry.lock

Some generated files are not rendered by default.

8 changes: 4 additions & 4 deletions pyproject.toml
@@ -1,13 +1,13 @@
[tool.poetry]
-name = "pydatasummary"
+name = "statsframe"
version = "0.0.2"
description = "Customizable data and model summaries in Python."
authors = ["Niall Keleher <niall.keleher@gmail.com>"]
packages = [{ include = "*", from = "src" }]
license = "MIT"
readme = "README.md"
-homepage = "https://github.com/NKeleher/pydatasummary#readme"
-repository = "https://github.com/NKeleher/pydatasummary"
+homepage = "https://github.com/NKeleher/statsframe#readme"
+repository = "https://github.com/NKeleher/statsframe"
keywords = ["tables", "statistics", "econometrics"]
classifiers = [
# https://pypi.org/classifiers/
@@ -29,6 +29,7 @@ classifiers = [
python = "^3.9"
polars = "^0.20.5"
pandas = "^2.1.4"
+importlib-metadata = "^7.0.1"

[tool.poetry.group.dev.dependencies]
pytest = "^7.4.4"
@@ -40,7 +41,6 @@ bandit = "^1.7.6"
docformatter = "^1.7.5"
mypy = "^1.8.0"
jupyterlab = "^4.0.11"
-importlib-metadata = "^7.0.1"

[build-system]
requires = ["poetry-core"]
8 changes: 4 additions & 4 deletions src/pydatasummary/__init__.py → src/statsframe/__init__.py
@@ -1,12 +1,12 @@
-# Define pydatasummary version
+# Define statsframe version
from importlib_metadata import version as _v

# __version__ = "0.0.1"
-__version__ = _v("pydatasummary")
+__version__ = _v("statsframe")

del _v

-# Import pydatasummary objects
+# Import statsframe objects
# from ._tbl_data import * # noqa: F401, F403, E402
# from ._databackend import * # noqa: F401, F403, E402
-from .ds import *  # noqa: F401, F403, E402
+from .datasummary import *  # noqa: F401, F403, E402
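
The version lookup in this hunk is keyed on the installed distribution name, which is why the string passed to `_v` has to change together with `name` in `pyproject.toml`. The sketch below illustrates that behaviour with the standard-library `importlib.metadata` (swapped in here for the `importlib_metadata` backport the package imports). The helper `get_version` and its fallback string are hypothetical and not part of `statsframe`.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str, default: str = "0.0.0+unknown") -> str:
    """Hypothetical helper: installed version of `dist_name`, or `default`."""
    try:
        # version() looks up the *distribution* name from package metadata,
        # not the import name of the module.
        return version(dist_name)
    except PackageNotFoundError:
        # Raised when no installed distribution has that name, for example
        # when the lookup still uses a name from before a rename.
        return default


if __name__ == "__main__":
    print(get_version("statsframe"))
```

Without such a fallback, a lookup under a name that is not installed raises `PackageNotFoundError` at import time.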
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion src/pydatasummary/ds.py → src/statsframe/datasummary.py
@@ -1,6 +1,6 @@
from __future__ import annotations

-# Main ds imports ----
+# Main sf.imports ----
import pandas as pd
import polars as pl
import polars.selectors as cs
4 changes: 2 additions & 2 deletions tests/test_skim.py
@@ -1,7 +1,7 @@
import polars as pl
from polars.testing import assert_frame_equal

-from pydatasummary.ds import skim
+import statsframe as sf

df = pl.DataFrame(
{
@@ -28,7 +28,7 @@

def test_skim_numeric_df(data=df):
    # Act
-    result = skim(data)
+    result = sf.skim(data)

    # Assert
    assert_frame_equal(result, expected_df)
