Misleading "Downloads by Python version over time" is only showing a subset of used versions #105

Closed
@lesteve

Describe the bug

The "Downloads by Python version over time" chart shows only a subset of the Python versions in use, which is confusing.

To Reproduce
Steps to reproduce the behavior:
Look at the scikit-learn dashboard for the last year https://clickpy.clickhouse.com/dashboard/scikit-learn?min_date=2024-01-01&max_date=2025-01-01

It makes you think that nobody is using scikit-learn with Python >= 3.11, which is rather surprising.

Looking at the query, there is a LIMIT 4 BY x on the last line, which I believe is the source of the problem (there may be a reason it is there, perhaps efficiency?):

SELECT
    python_minor AS name,
    if(date_diff('month', {min_date:Date32}, {max_date:Date32}) <= 6, toStartOfDay(date)::Date32, toStartOfWeek(date)::Date32) AS x,
    sum(count) AS y
FROM pypi.pypi_downloads_per_day_by_version_by_python
WHERE (date >= {min_date:Date32})
    AND (date < if(date_diff('month', {min_date:Date32}, {max_date:Date32}) <= 6, toStartOfDay({max_date:Date32})::Date32, toStartOfWeek({max_date:Date32})::Date32))
    AND (project = {package_name:String})
    AND 1=1 AND python_minor != ''
    AND 1=1 AND 1=1
GROUP BY name, x
ORDER BY x ASC, y DESC
LIMIT 4 BY x

If I remove the LIMIT 4 BY x, I do see that scikit-learn is used with Python >= 3.11, as expected.
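
To illustrate what I think is happening, here is a minimal Python sketch of how I understand ORDER BY x ASC, y DESC LIMIT 4 BY x to behave: for each time bucket x, only the four Python versions with the highest counts survive. The helper name limit_by and the download counts are made up for illustration; they are not taken from the ClickPy code or data.

```python
from collections import defaultdict


def limit_by(rows, n, key):
    """Keep the first n rows for each distinct key, preserving row order.

    Hypothetical helper mimicking my understanding of ClickHouse's LIMIT n BY.
    """
    seen = defaultdict(int)
    kept = []
    for row in rows:
        k = key(row)
        if seen[k] < n:
            seen[k] += 1
            kept.append(row)
    return kept


# Made-up rows (name, x, y) for a single time bucket; not real download data.
rows = [
    ("3.8", "2024-01-01", 50),
    ("3.9", "2024-01-01", 90),
    ("3.10", "2024-01-01", 120),
    ("3.11", "2024-01-01", 40),
    ("3.12", "2024-01-01", 10),
    ("3.7", "2024-01-01", 60),
]

# ORDER BY x ASC, y DESC
ordered = sorted(rows, key=lambda r: (r[1], -r[2]))

# LIMIT 4 BY x
top4 = limit_by(ordered, 4, key=lambda r: r[1])

shown = [name for name, _, _ in top4]
dropped = sorted({r[0] for r in rows} - set(shown))

print("shown:", shown)
print("dropped:", dropped)
```

With these made-up numbers, 3.11 and 3.12 are dropped even though they have downloads, which matches what the chart seems to do.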

Expected behavior

All Python versions are shown.

Screenshots
image

Desktop (please complete the following information):

  • I don't think this is relevant

Labels

bug (Something isn't working)
