SummarizedExperiment

This package provides containers to represent genomic experimental data as 2-dimensional matrices, following Bioconductor's SummarizedExperiment. In these matrices, the rows typically denote features or genomic regions of interest, while the columns represent samples or cells.

The package currently includes representations for both SummarizedExperiment and RangedSummarizedExperiment. The distinction is that a RangedSummarizedExperiment object provides an additional slot, row_ranges, to store the genomic region of each feature; this slot is expected to be a GenomicRanges object.

Install

To get started, install the package from PyPI:

pip install summarizedexperiment

Usage

A SummarizedExperiment contains three key attributes:

assays: A dictionary of matrices with assay names as keys, e.g. counts, logcounts etc.
row_data: Feature information e.g. genes, transcripts, exons, etc.
column_data: Sample information about the columns of the matrices.
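
The relationship between these three attributes can be illustrated with a small stand-in class. This is only a sketch of the shape constraint implied by the description above (every assay has one row per feature and one column per sample); `ToyExperiment` is a hypothetical name and is not the package's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ToyExperiment:
    """Hypothetical stand-in; NOT the package's SummarizedExperiment."""

    assays: Dict[str, List[List[float]]]  # assay name -> rows x columns matrix
    row_data: List[Dict[str, Any]]        # one record per feature (row)
    column_data: List[Dict[str, Any]]     # one record per sample (column)
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every assay must agree with the number of features and samples.
        n_rows, n_cols = self.shape
        for name, matrix in self.assays.items():
            if len(matrix) != n_rows or any(len(row) != n_cols for row in matrix):
                raise ValueError(
                    f"assay '{name}' must have shape ({n_rows}, {n_cols})"
                )

    @property
    def shape(self) -> Tuple[int, int]:
        return (len(self.row_data), len(self.column_data))


toy = ToyExperiment(
    assays={"counts": [[1, 2], [3, 4], [5, 6]]},
    row_data=[{"gene": "g1"}, {"gene": "g2"}, {"gene": "g3"}],
    column_data=[{"treatment": "ChIP"}, {"treatment": "Input"}],
)
print(toy.shape)  # (3, 2)
```

Constructing the toy object with an assay whose dimensions disagree with row_data or column_data raises a ValueError, which is the kind of consistency the real containers are meant to preserve.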

First, let's mock feature and sample data:

from random import random
import pandas as pd
import numpy as np
from biocframe import BiocFrame

nrows = 200
ncols = 6
counts = np.random.rand(nrows, ncols)
row_data = BiocFrame(
    {
        "seqnames": [
            "chr1",
            "chr2",
            "chr2",
            "chr2",
            "chr1",
            "chr1",
            "chr3",
            "chr3",
            "chr3",
            "chr3",
        ]
        * 20,
        "starts": range(100, 300),
        "ends": range(110, 310),
        "strand": ["-", "+", "+", "*", "*", "+", "+", "+", "-", "-"] * 20,
        "score": range(0, 200),
        "GC": [random() for _ in range(10)] * 20,
    }
)

col_data = pd.DataFrame(
    {
        "treatment": ["ChIP", "Input"] * 3,
    }
)

To create a SummarizedExperiment:

from summarizedexperiment import SummarizedExperiment

tse = SummarizedExperiment(
    assays={"counts": counts}, row_data=row_data, column_data=col_data,
    metadata={"seq_platform": "Illumina NovaSeq 6000"},
)

## output
class: SummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(1): seq_platform

To create a RangedSummarizedExperiment:

from summarizedexperiment import RangedSummarizedExperiment
from genomicranges import GenomicRanges

trse = RangedSummarizedExperiment(
    assays={"counts": counts}, row_data=row_data,
    row_ranges=GenomicRanges.from_pandas(row_data.to_pandas()), column_data=col_data
)

## output
class: RangedSummarizedExperiment
dimensions: (200, 6)
assays(1): ['counts']
row_data columns(6): ['seqnames', 'starts', 'ends', 'strand', 'score', 'GC']
row_names(0):
column_data columns(1): ['treatment']
column_names(0):
metadata(0):

For more examples, check out the documentation.

Note

This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.
