Rename to statsframe #13

Merged
merged 2 commits into from
Jan 25, 2024
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing

Contributions to `pydatasummary` are welcome and encouraged! The goal of this
Contributions to `statsframe` are welcome and encouraged! The goal of this
project is to make summarizing data and statistical models in python easier and
more intuitive. If you have an idea for a new feature, or a bug fix, please make
a suggestion. Every contribution is appreciated and will be considered.
8 changes: 4 additions & 4 deletions Makefile
@@ -7,11 +7,11 @@ install-dev:
style:
poetry run pre-commit run --hook-stage manual --all-files

pytest-cov:
poetry run pytest --cov-report term --cov=pydatasummary tests/
pytest-cov: install-dev
poetry run pytest --cov-report term --cov=statsframe tests/

build: pytest-cov
build:

issue (llm): The removal of the 'pytest-cov' dependency from the 'build' target in the Makefile could lead to a situation where tests are not run before building, potentially allowing bugs to slip into the build. If this change is intentional, consider documenting the rationale behind it to maintain clarity for future maintainers.

poetry build

publish: build
publish: style build
poetry publish
32 changes: 16 additions & 16 deletions README.md
@@ -1,42 +1,42 @@
# pydatasummary
# statsframe

Customizable data and model summaries in Python.

`pydatasummary` creates tables that provide descriptive statistics of
`statsframe` creates tables that provide descriptive statistics of
numeric and categorical data.

The goal is to provide a simple -- yet customizable -- way to summarize
data and models in Python.

`pydatasummary` is heavily inspired by [`modelsummary`](https://modelsummary.com/)
`statsframe` is heavily inspired by [`modelsummary`](https://modelsummary.com/)
in R. The goal is not to replicate all that `modelsummary` does, but to provide
a way of achieving similar results in Python.

In order to achieve this, `pydatasummary` builds on the [`polars`](https://docs.pola.rs/)
In order to achieve this, `statsframe` builds on the [`polars`](https://docs.pola.rs/)
library to produce tables that can be easily customized and exported to other formats.

## Basic Usage

As an example of `pydatasummary` usage, the `skim` function provides a
As an example of `statsframe` usage, the `skim` function provides a
summary of a DataFrame (either `polars.DataFrame` or `pandas.DataFrame`).
The default summary statistics returned by `pydatasummary.skim()` are unique values,
The default summary statistics returned by `statsframe.skim()` are unique values,
percentage missing, mean, standard deviation, minimum, median, and maximum.

Where possible, `pydatasummary` will print a table to the console and return a
Where possible, `statsframe` will print a table to the console and return a
polars DataFrame with the summary statistics. This allows for easy customization.
For example, the `polars.DataFrame` with statistics from `pydatasummary` can be
For example, the `polars.DataFrame` with statistics from `statsframe` can be
modified using the [`Great Tables`](https://posit-dev.github.io/great-tables/reference/) package.
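
The default statistics named above (unique values, percentage missing, mean, standard deviation, minimum, median, maximum) can be illustrated for a single numeric column with a standard-library sketch. This is not how `statsframe` computes them: `summarize_column` is a hypothetical helper, and the choices to use the sample standard deviation and to exclude missing values from the unique count are assumptions made for illustration.

```python
import math
import statistics


def summarize_column(values):
    """Compute skim-style default statistics for one numeric column.

    `None` marks a missing value. Hypothetical helper for illustration only;
    it is not part of statsframe.
    """
    present = [v for v in values if v is not None]
    n_total = len(values)
    return {
        # Assumption: missing values are not counted as a unique value.
        "unique": len(set(present)),
        "missing_pct": 100.0 * (n_total - len(present)) / n_total if n_total else math.nan,
        "mean": statistics.mean(present),
        # Assumption: sample standard deviation (n - 1 denominator).
        "sd": statistics.stdev(present),
        "min": min(present),
        "median": statistics.median(present),
        "max": max(present),
    }


# One missing value out of five observations.
column = [21.0, 22.8, None, 18.7, 21.0]
summary = summarize_column(column)
```

The helper needs at least two non-missing values, because `statistics.stdev` raises `StatisticsError` otherwise.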

```python
import polars as pl
import pydatasummary as ds
import statsframe as sf

df = (
pl.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv")
.drop("rownames")
)

stats = ds.skim(df)
stats = sf.skim(df)

Summary Statistics
Rows: 32, Columns: 11
@@ -61,13 +61,13 @@ We can achieve the same result above with a pandas DataFrame.

```python
import pandas as pd
import pydatasummary as ds
import statsframe as sf

trees_df = pd.read_csv(
"https://vincentarelbundock.github.io/Rdatasets/csv/datasets/trees.csv"
).drop(columns=["rownames"])

trees_stats = ds.skim(trees_df)
trees_stats = sf.skim(trees_df)

Summary Statistics
Rows: 31, Columns: 3
@@ -84,19 +84,19 @@ Rows: 31, Columns: 3
## Contributing

If you encounter a bug, have usage questions, or want to share ideas to make
the `pydatasummary` package more useful, please feel free to file an
[issue](https://github.com/NKeleher/pydatasummary/issues).
the `statsframe` package more useful, please feel free to file an
[issue](https://github.com/NKeleher/statsframe/issues).

## Code of Conduct

Please note that the **pydatasummary** project is released with a
Please note that the **statsframe** project is released with a
[contributor code of conduct](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).

By participating in this project you agree to abide by its terms.

## License

**pydatasummary** is licensed under the MIT license.
**statsframe** is licensed under the MIT license.

## Governance

6 changes: 3 additions & 3 deletions docs/_quarto.yml
@@ -3,7 +3,7 @@ project:
type: website
output-dir: .
# website:
# title: "pydatasummary"
# title: "statsframe"

nitpick (llm): The commented-out configuration in 'docs/_quarto.yml' still references the old project name. If these lines are to be used in the future, they should be updated to reflect the new project name 'statsframe'.

# favicon: favicon.ico
# twitter-card: true
# navbar:
@@ -27,12 +27,12 @@ project:
# - _sidebar.yml

# quartodoc:
# package: pydatasummary
# package: statsframe
# parser: google
# sidebar: _sidebar.yml

# sections:
# - title: "Function reference"
# desc: "What pydatasummary's functions do"
# desc: "What statsframe's functions do"
# contents:
# - skim
14 changes: 7 additions & 7 deletions examples/example_datasummary_skim.py
@@ -2,14 +2,14 @@
import pandas as pd
import polars as pl

import pydatasummary as ds
import statsframe as sf

# %%
df = pl.read_csv(
"https://vincentarelbundock.github.io/Rdatasets/csv/datasets/mtcars.csv"
).drop("rownames")

stats = ds.skim(df)
stats = sf.skim(df)

# %% [markdown]
# Import a csv file to a polars DataFrame:
@@ -26,7 +26,7 @@
# Create a skim table

# %%
penguins_stats = ds.skim(penguins_df)
penguins_stats = sf.skim(penguins_df)

# %% [markdown]
# Return the polars DataFrame with the summary statistics
@@ -35,15 +35,15 @@
penguins_stats

# %%
ds.skim(
sf.skim(
penguins_df,
output="markdown",
title="Palmer's Penguins Summary Statistics",
align="l",
)

# %%
ds.skim(
sf.skim(
penguins_df,
stats="moments",
output="markdown",
@@ -52,7 +52,7 @@
)

# %%
ds.skim(
sf.skim(
penguins_df,
stats="full",
output="markdown",
@@ -72,6 +72,6 @@
trees_df.info()

# %%
trees_stats = ds.skim(trees_df)
trees_stats = sf.skim(trees_df)

# %%
3 changes: 2 additions & 1 deletion poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions pyproject.toml
@@ -1,13 +1,13 @@
[tool.poetry]
name = "pydatasummary"
name = "statsframe"
version = "0.0.2"
description = "Customizable data and model summaries in Python."
authors = ["Niall Keleher <niall.keleher@gmail.com>"]
packages = [{ include = "*", from = "src" }]
license = "MIT"
readme = "README.md"
homepage = "https://github.com/NKeleher/pydatasummary#readme"
repository = "https://github.com/NKeleher/pydatasummary"
homepage = "https://github.com/NKeleher/statsframe#readme"
repository = "https://github.com/NKeleher/statsframe"
keywords = ["tables", "statistics", "econometrics"]
classifiers = [
# https://pypi.org/classifiers/
@@ -29,6 +29,7 @@ classifiers = [
python = "^3.9"
polars = "^0.20.5"
pandas = "^2.1.4"
importlib-metadata = "^7.0.1"

[tool.poetry.group.dev.dependencies]
pytest = "^7.4.4"
@@ -40,7 +41,6 @@ bandit = "^1.7.6"
docformatter = "^1.7.5"
mypy = "^1.8.0"
jupyterlab = "^4.0.11"
importlib-metadata = "^7.0.1"

[build-system]
requires = ["poetry-core"]
8 changes: 4 additions & 4 deletions src/pydatasummary/__init__.py → src/statsframe/__init__.py
@@ -1,12 +1,12 @@
# Define pydatasummary version
# Define statsframe version
from importlib_metadata import version as _v

# __version__ = "0.0.1"
__version__ = _v("pydatasummary")
__version__ = _v("statsframe")

del _v

# Import pydatasummary objects
# Import statsframe objects
# from ._tbl_data import * # noqa: F401, F403, E402
# from ._databackend import * # noqa: F401, F403, E402
from .ds import * # noqa: F401, F403, E402
from .datasummary import * # noqa: F401, F403, E402
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion src/pydatasummary/ds.py → src/statsframe/datasummary.py
@@ -1,6 +1,6 @@
from __future__ import annotations

# Main ds imports ----
# Main sf.imports ----
import pandas as pd
import polars as pl
import polars.selectors as cs
4 changes: 2 additions & 2 deletions tests/test_skim.py
@@ -1,7 +1,7 @@
import polars as pl
from polars.testing import assert_frame_equal

from pydatasummary.ds import skim
import statsframe as sf

df = pl.DataFrame(
{
@@ -28,7 +28,7 @@

def test_skim_numeric_df(data=df):
# Act
result = skim(data)
result = sf.skim(data)

# Assert
assert_frame_equal(result, expected_df)