log-count-utils

Introduction & Usage

Suppose we have an action log data df like

user_id	timestamp	expense
0	2021-02-18 10:00:00	100
0	2021-02-18 10:00:10	10
0	2021-02-18 10:00:21	1
0	2021-02-18 11:00:21	0.1
1	2020-02-18 10:00:10	100
1	2020-02-18 10:00:20	10
1	2020-02-18 10:00:20	1
1	2020-02-18 10:00:29	0

Suppose that you have to compute the following quantity for each row in this dataframe:

the number of actions each user has taken within 10 seconds
total amount of expenses of a user within 10 seconds

The following naive way is fine for this tiny example but becomes costly (O(N^2)) for large data frame.

from datetime import timedelta
import numpy as np

td = timedelta(seconds=10)

answers = []
for uid, time_point in zip(df.user_id, df.timestamp):
    cnt = np.sum(
        (df.user_id == uid) & (df.timestamp < time_point) & (df.timestamp >= (time_point - td))
    )
    answers.append(cnt)

If df is sorted (by user_id as the primary and timestamp as the secondary key), we can do this blazing fast (O(N)) using log_count_util.

from log_count_util import find_n_records_within_interval

answers = find_n_records_within_interval(
    df.user_id, df.timestamp, df_user_id, df.timestamp, td
)

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github/workflows		.github/workflows
log_count_util		log_count_util
src		src
tests		tests
.codecov.yml		.codecov.yml
.coveragerc		.coveragerc
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CMakeLists.txt		CMakeLists.txt
Readme.md		Readme.md
create_pb_stubs.sh		create_pb_stubs.sh
mypy.ini		mypy.ini
pyproject.toml		pyproject.toml
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

log-count-utils

Introduction & Usage

About

Releases 1

Packages

Languages

tohtsky/log_count_util

Folders and files

Latest commit

History

Repository files navigation

log-count-utils

Introduction & Usage

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages