Commit 32db625

KN4CK3R and lunny authored
Add package registry cleanup rules (#21658)
Fixes #20514
Fixes #20766
Fixes #20631

This PR adds Cleanup Rules for the package registry, which allow unneeded packages to be deleted automatically. Cleanup rules can be set up from the user or org settings.

Please have a look at the documentation because I'm not a native English speaker.

Rule Form
![grafik](https://user-images.githubusercontent.com/1666336/199330792-c13918a6-e196-4e71-9f53-18554515edca.png)

Rule List
![grafik](https://user-images.githubusercontent.com/1666336/199331261-5f6878e8-a80c-4985-800d-ebb3524b1a8d.png)

Rule Preview
![grafik](https://user-images.githubusercontent.com/1666336/199330917-c95e4017-cf64-4142-a3e4-af18c4f127c3.png)

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
1 parent d3f850c commit 32db625

27 files changed, +1243 -36 lines changed

Diff for: docs/content/doc/packages/storage.en-us.md

+84
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
1+
---
2+
date: "2022-11-01T00:00:00+00:00"
3+
title: "Storage"
4+
slug: "packages/storage"
5+
draft: false
6+
toc: false
7+
menu:
8+
sidebar:
9+
parent: "packages"
10+
name: "storage"
11+
weight: 5
12+
identifier: "storage"
13+
---
14+
15+
# Storage
16+
17+
This document describes the storage of the package registry and how it can be managed.
18+
19+
**Table of Contents**
20+
21+
{{< toc >}}
22+
23+
## Deduplication
24+
25+
The package registry has built-in deduplication of uploaded blobs.
26+
If two identical files are uploaded, only one blob is saved on the filesystem.
27+
This ensures no space is wasted for duplicated files.
28+
29+
If two packages are uploaded with identical files, both packages display the same size, but on the filesystem they require only half of the combined size.
30+
Whenever a package is deleted, only the references to the underlying blobs are removed.
31+
The blobs themselves are not removed at that point, so they still occupy space on the filesystem.
32+
When a new package is uploaded, the existing blobs may be referenced again.
33+
34+
These unreferenced blobs get deleted by a [clean up job]({{< relref "doc/advanced/config-cheat-sheet.en-us.md#cron---cleanup-expired-packages-croncleanup_packages" >}}).
35+
The config setting `OLDER_THAN` configures how long unreferenced blobs are kept before they get deleted.
36+
37+
## Cleanup Rules
38+
39+
Package registries can become large over time without cleanup.
40+
It's recommended to delete unnecessary packages and set up cleanup rules to automatically manage the package registry usage.
41+
Every package owner (user or organization) manages the cleanup rules which are applied to their packages.
42+
43+
|Setting|Description|
44+
|-|-|
45+
|Enabled|Turn the cleanup rule on or off.|
46+
|Type|Every rule manages a specific package type.|
47+
|Apply pattern to full package name|If enabled, the patterns below are applied to the full package name (`package/version`). Otherwise only the version (`version`) is used.|
48+
|Keep the most recent|How many versions to *always* keep for each package.|
49+
|Keep versions matching|The regex pattern that determines which versions to keep. An empty pattern keeps no version while `.+` keeps all versions. The container registry will always keep the `latest` version even if not configured.|
50+
|Remove versions older than|Remove only versions older than the selected number of days.|
51+
|Remove versions matching|The regex pattern that determines which versions to remove. An empty pattern or `.+` leads to the removal of every package unless another setting says otherwise.|
52+
53+
Every cleanup rule can show a preview of the affected packages.
54+
This can be used to check whether the cleanup rule is configured properly.
55+
56+
### Regex examples
57+
58+
Regex patterns are automatically surrounded with `\A` and `\z` anchors.
59+
Do not include `\A`, `\z`, `^` or `$` tokens in the regex patterns; they are not necessary.
60+
The patterns are case-insensitive, which matches the behaviour of the package registry in Gitea.
61+
62+
|Pattern|Description|
63+
|-|-|
64+
|`.*`|Match every possible version.|
65+
|`v.+`|Match versions that start with `v`.|
66+
|`release`|Match only the version `release`.|
67+
|`release.*`|Match versions that are named `release` or start with `release`.|
68+
|`.+-temp-.+`|Match versions that contain `-temp-`.|
69+
|`v.+\|release`|Match versions that either start with `v` or are named `release`.|
70+
|`package/v.+\|other/release`|Match versions of the package `package` that start with `v` or the version `release` of the package `other`. This needs the setting *Apply pattern to full package name* enabled.|
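
The patterns above can be tried out with a short Go program. It is a sketch that wraps each pattern in `(?i)\A` and `\z`, the same way the rule model in this commit compiles them; `wrap` and `matches` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// wrap makes a user pattern case-insensitive and surrounds it with anchors.
func wrap(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile(fmt.Sprintf(`(?i)\A%s\z`, pattern))
}

// matches reports whether name matches the wrapped pattern.
// An invalid pattern is treated as "no match" in this sketch.
func matches(pattern, name string) bool {
	re, err := wrap(pattern)
	if err != nil {
		return false
	}
	return re.MatchString(name)
}

func main() {
	fmt.Println(matches(`v.+`, "V2.0"))              // true: case-insensitive
	fmt.Println(matches(`release`, "prerelease"))    // false: anchored at both ends
	fmt.Println(matches(`release.*`, "release-1"))   // true
	fmt.Println(matches(`.+-temp-.+`, "1.0-temp-3")) // true

	// Alternation has the lowest precedence in Go's regexp syntax, so with
	// this wrapping each anchor binds only to the neighbouring alternative.
	fmt.Println(matches(`v.+|release`, "prerelease"))   // true
	fmt.Println(matches(`(v.+|release)`, "prerelease")) // false: grouped
}
```

As the last two lines suggest, putting parentheses around a pattern that contains a top-level `|` keeps both anchors applied to every alternative.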
71+
72+
### How the cleanup rules work
73+
74+
The cleanup rules are part of the [clean up job]({{< relref "doc/advanced/config-cheat-sheet.en-us.md#cron---cleanup-expired-packages-croncleanup_packages" >}}) and run periodically.
75+
76+
The cleanup rule:
77+
78+
1. Collects all packages of the package type for the owner's registry.
79+
1. Collects all versions of every package.
80+
1. Excludes from the list the most recent versions, as many as the *Keep the most recent* value.
81+
1. Excludes from the list any versions matching the *Keep versions matching* value.
82+
1. Excludes from the list the versions more recent than the *Remove versions older than* value.
83+
1. Excludes from the list any versions not matching the *Remove versions matching* value.
84+
1. Deletes the remaining versions.

Diff for: models/migrations/migrations.go

+2
@@ -439,6 +439,8 @@ var migrations = []Migration{
439439
NewMigration("Alter package_version.metadata_json to LONGTEXT", v1_19.AlterPackageVersionMetadataToLongText),
440440
// v233 -> v234
441441
NewMigration("Add header_authorization_encrypted column to webhook table", v1_19.AddHeaderAuthorizationEncryptedColWebhook),
442+
// v234 -> v235
443+
NewMigration("Add package cleanup rule table", v1_19.CreatePackageCleanupRuleTable),
442444
}
443445

444446
// GetCurrentDBVersion returns the current db version

Diff for: models/migrations/v1_19/v234.go

+29
@@ -0,0 +1,29 @@
1+
// Copyright 2022 The Gitea Authors. All rights reserved.
2+
// Use of this source code is governed by a MIT-style
3+
// license that can be found in the LICENSE file.
4+
5+
package v1_19 //nolint
6+
7+
import (
8+
"code.gitea.io/gitea/modules/timeutil"
9+
10+
"xorm.io/xorm"
11+
)
12+
13+
func CreatePackageCleanupRuleTable(x *xorm.Engine) error {
14+
type PackageCleanupRule struct {
15+
ID int64 `xorm:"pk autoincr"`
16+
Enabled bool `xorm:"INDEX NOT NULL DEFAULT false"`
17+
OwnerID int64 `xorm:"UNIQUE(s) INDEX NOT NULL DEFAULT 0"`
18+
Type string `xorm:"UNIQUE(s) INDEX NOT NULL"`
19+
KeepCount int `xorm:"NOT NULL DEFAULT 0"`
20+
KeepPattern string `xorm:"NOT NULL DEFAULT ''"`
21+
RemoveDays int `xorm:"NOT NULL DEFAULT 0"`
22+
RemovePattern string `xorm:"NOT NULL DEFAULT ''"`
23+
MatchFullName bool `xorm:"NOT NULL DEFAULT false"`
24+
CreatedUnix timeutil.TimeStamp `xorm:"created NOT NULL DEFAULT 0"`
25+
UpdatedUnix timeutil.TimeStamp `xorm:"updated NOT NULL DEFAULT 0"`
26+
}
27+
28+
return x.Sync2(new(PackageCleanupRule))
29+
}

Diff for: models/packages/package.go

+15
@@ -45,6 +45,21 @@ const (
4545
TypeVagrant Type = "vagrant"
4646
)
4747

48+
var TypeList = []Type{
49+
TypeComposer,
50+
TypeConan,
51+
TypeContainer,
52+
TypeGeneric,
53+
TypeHelm,
54+
TypeMaven,
55+
TypeNpm,
56+
TypeNuGet,
57+
TypePub,
58+
TypePyPI,
59+
TypeRubyGems,
60+
TypeVagrant,
61+
}
62+
4863
// Name gets the name of the package type
4964
func (pt Type) Name() string {
5065
switch pt {

Diff for: models/packages/package_cleanup_rule.go

+110
@@ -0,0 +1,110 @@
1+
// Copyright 2022 The Gitea Authors. All rights reserved.
2+
// Use of this source code is governed by a MIT-style
3+
// license that can be found in the LICENSE file.
4+
5+
package packages
6+
7+
import (
8+
"context"
9+
"errors"
10+
"fmt"
11+
"regexp"
12+
13+
"code.gitea.io/gitea/models/db"
14+
"code.gitea.io/gitea/modules/timeutil"
15+
16+
"xorm.io/builder"
17+
)
18+
19+
var ErrPackageCleanupRuleNotExist = errors.New("Package cleanup rule does not exist")
20+
21+
func init() {
22+
db.RegisterModel(new(PackageCleanupRule))
23+
}
24+
25+
// PackageCleanupRule represents a rule which describes when to clean up package versions
26+
type PackageCleanupRule struct {
27+
ID int64 `xorm:"pk autoincr"`
28+
Enabled bool `xorm:"INDEX NOT NULL DEFAULT false"`
29+
OwnerID int64 `xorm:"UNIQUE(s) INDEX NOT NULL DEFAULT 0"`
30+
Type Type `xorm:"UNIQUE(s) INDEX NOT NULL"`
31+
KeepCount int `xorm:"NOT NULL DEFAULT 0"`
32+
KeepPattern string `xorm:"NOT NULL DEFAULT ''"`
33+
KeepPatternMatcher *regexp.Regexp `xorm:"-"`
34+
RemoveDays int `xorm:"NOT NULL DEFAULT 0"`
35+
RemovePattern string `xorm:"NOT NULL DEFAULT ''"`
36+
RemovePatternMatcher *regexp.Regexp `xorm:"-"`
37+
MatchFullName bool `xorm:"NOT NULL DEFAULT false"`
38+
CreatedUnix timeutil.TimeStamp `xorm:"created NOT NULL DEFAULT 0"`
39+
UpdatedUnix timeutil.TimeStamp `xorm:"updated NOT NULL DEFAULT 0"`
40+
}
41+
42+
func (pcr *PackageCleanupRule) CompiledPattern() error {
43+
if pcr.KeepPatternMatcher != nil || pcr.RemovePatternMatcher != nil {
44+
return nil
45+
}
46+
47+
if pcr.KeepPattern != "" {
48+
var err error
49+
pcr.KeepPatternMatcher, err = regexp.Compile(fmt.Sprintf(`(?i)\A%s\z`, pcr.KeepPattern))
50+
if err != nil {
51+
return err
52+
}
53+
}
54+
55+
if pcr.RemovePattern != "" {
56+
var err error
57+
pcr.RemovePatternMatcher, err = regexp.Compile(fmt.Sprintf(`(?i)\A%s\z`, pcr.RemovePattern))
58+
if err != nil {
59+
return err
60+
}
61+
}
62+
63+
return nil
64+
}
65+
66+
func InsertCleanupRule(ctx context.Context, pcr *PackageCleanupRule) (*PackageCleanupRule, error) {
67+
return pcr, db.Insert(ctx, pcr)
68+
}
69+
70+
func GetCleanupRuleByID(ctx context.Context, id int64) (*PackageCleanupRule, error) {
71+
pcr := &PackageCleanupRule{}
72+
73+
has, err := db.GetEngine(ctx).ID(id).Get(pcr)
74+
if err != nil {
75+
return nil, err
76+
}
77+
if !has {
78+
return nil, ErrPackageCleanupRuleNotExist
79+
}
80+
return pcr, nil
81+
}
82+
83+
func UpdateCleanupRule(ctx context.Context, pcr *PackageCleanupRule) error {
84+
_, err := db.GetEngine(ctx).ID(pcr.ID).AllCols().Update(pcr)
85+
return err
86+
}
87+
88+
func GetCleanupRulesByOwner(ctx context.Context, ownerID int64) ([]*PackageCleanupRule, error) {
89+
pcrs := make([]*PackageCleanupRule, 0, 10)
90+
return pcrs, db.GetEngine(ctx).Where("owner_id = ?", ownerID).Find(&pcrs)
91+
}
92+
93+
func DeleteCleanupRuleByID(ctx context.Context, ruleID int64) error {
94+
_, err := db.GetEngine(ctx).ID(ruleID).Delete(&PackageCleanupRule{})
95+
return err
96+
}
97+
98+
func HasOwnerCleanupRuleForPackageType(ctx context.Context, ownerID int64, packageType Type) (bool, error) {
99+
return db.GetEngine(ctx).
100+
Where("owner_id = ? AND type = ?", ownerID, packageType).
101+
Exist(&PackageCleanupRule{})
102+
}
103+
104+
func IterateEnabledCleanupRules(ctx context.Context, callback func(context.Context, *PackageCleanupRule) error) error {
105+
return db.Iterate(
106+
ctx,
107+
builder.Eq{"enabled": true},
108+
callback,
109+
)
110+
}

Diff for: models/packages/package_version.go

+9
@@ -320,6 +320,15 @@ func SearchLatestVersions(ctx context.Context, opts *PackageSearchOptions) ([]*P
320320
return pvs, count, err
321321
}
322322

323+
// ExistVersion checks if a version matching the search options exists
324+
func ExistVersion(ctx context.Context, opts *PackageSearchOptions) (bool, error) {
325+
return db.GetEngine(ctx).
326+
Where(opts.toConds()).
327+
Table("package_version").
328+
Join("INNER", "package", "package.id = package_version.package_id").
329+
Exist(new(PackageVersion))
330+
}
331+
323332
// CountVersions counts all versions of packages matching the search options
324333
func CountVersions(ctx context.Context, opts *PackageSearchOptions) (int64, error) {
325334
return db.GetEngine(ctx).

Diff for: options/locale/locale_en-US.ini

+23
@@ -86,6 +86,9 @@ remove = Remove
8686
remove_all = Remove All
8787
edit = Edit
8888

89+
enabled = Enabled
90+
disabled = Disabled
91+
8992
copy = Copy
9093
copy_url = Copy URL
9194
copy_content = Copy content
@@ -3186,3 +3189,23 @@ settings.delete.description = Deleting a package is permanent and cannot be undo
31863189
settings.delete.notice = You are about to delete %s (%s). This operation is irreversible, are you sure?
31873190
settings.delete.success = The package has been deleted.
31883191
settings.delete.error = Failed to delete the package.
3192+
owner.settings.cleanuprules.title = Manage Cleanup Rules
3193+
owner.settings.cleanuprules.add = Add Cleanup Rule
3194+
owner.settings.cleanuprules.edit = Edit Cleanup Rule
3195+
owner.settings.cleanuprules.none = No cleanup rules available. Read the docs to learn more.
3196+
owner.settings.cleanuprules.preview = Cleanup Rule Preview
3197+
owner.settings.cleanuprules.preview.overview = %d packages are scheduled to be removed.
3198+
owner.settings.cleanuprules.preview.none = Cleanup rule does not match any packages.
3199+
owner.settings.cleanuprules.enabled = Enabled
3200+
owner.settings.cleanuprules.pattern_full_match = Apply pattern to full package name
3201+
owner.settings.cleanuprules.keep.title = Versions that match these rules are kept, even if they match a removal rule below.
3202+
owner.settings.cleanuprules.keep.count = Keep the most recent
3203+
owner.settings.cleanuprules.keep.count.1 = 1 version per package
3204+
owner.settings.cleanuprules.keep.count.n = %d versions per package
3205+
owner.settings.cleanuprules.keep.pattern = Keep versions matching
3206+
owner.settings.cleanuprules.keep.pattern.container = The <code>latest</code> version is always kept for Container packages.
3207+
owner.settings.cleanuprules.remove.title = Versions that match these rules are removed, unless a rule above says to keep them.
3208+
owner.settings.cleanuprules.remove.days = Remove versions older than
3209+
owner.settings.cleanuprules.remove.pattern = Remove versions matching
3210+
owner.settings.cleanuprules.success.update = Cleanup rule has been updated.
3211+
owner.settings.cleanuprules.success.delete = Cleanup rule has been deleted.

Diff for: routers/web/org/setting_packages.go

+87
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,87 @@
1+
// Copyright 2022 The Gitea Authors. All rights reserved.
2+
// Use of this source code is governed by a MIT-style
3+
// license that can be found in the LICENSE file.
4+
5+
package org
6+
7+
import (
8+
"fmt"
9+
"net/http"
10+
11+
"code.gitea.io/gitea/modules/base"
12+
"code.gitea.io/gitea/modules/context"
13+
"code.gitea.io/gitea/modules/setting"
14+
shared "code.gitea.io/gitea/routers/web/shared/packages"
15+
)
16+
17+
const (
18+
tplSettingsPackages base.TplName = "org/settings/packages"
19+
tplSettingsPackagesRuleEdit base.TplName = "org/settings/packages_cleanup_rules_edit"
20+
tplSettingsPackagesRulePreview base.TplName = "org/settings/packages_cleanup_rules_preview"
21+
)
22+
23+
func Packages(ctx *context.Context) {
24+
ctx.Data["Title"] = ctx.Tr("packages.title")
25+
ctx.Data["PageIsOrgSettings"] = true
26+
ctx.Data["PageIsSettingsPackages"] = true
27+
28+
shared.SetPackagesContext(ctx, ctx.ContextUser)
29+
30+
ctx.HTML(http.StatusOK, tplSettingsPackages)
31+
}
32+
33+
func PackagesRuleAdd(ctx *context.Context) {
34+
ctx.Data["Title"] = ctx.Tr("packages.title")
35+
ctx.Data["PageIsOrgSettings"] = true
36+
ctx.Data["PageIsSettingsPackages"] = true
37+
38+
shared.SetRuleAddContext(ctx)
39+
40+
ctx.HTML(http.StatusOK, tplSettingsPackagesRuleEdit)
41+
}
42+
43+
func PackagesRuleEdit(ctx *context.Context) {
44+
ctx.Data["Title"] = ctx.Tr("packages.title")
45+
ctx.Data["PageIsOrgSettings"] = true
46+
ctx.Data["PageIsSettingsPackages"] = true
47+
48+
shared.SetRuleEditContext(ctx, ctx.ContextUser)
49+
50+
ctx.HTML(http.StatusOK, tplSettingsPackagesRuleEdit)
51+
}
52+
53+
func PackagesRuleAddPost(ctx *context.Context) {
54+
ctx.Data["Title"] = ctx.Tr("packages.title")
55+
ctx.Data["PageIsOrgSettings"] = true
56+
ctx.Data["PageIsSettingsPackages"] = true
57+
58+
shared.PerformRuleAddPost(
59+
ctx,
60+
ctx.ContextUser,
61+
fmt.Sprintf("%s/org/%s/settings/packages", setting.AppSubURL, ctx.ContextUser.Name),
62+
tplSettingsPackagesRuleEdit,
63+
)
64+
}
65+
66+
func PackagesRuleEditPost(ctx *context.Context) {
67+
ctx.Data["Title"] = ctx.Tr("packages.title")
68+
ctx.Data["PageIsOrgSettings"] = true
69+
ctx.Data["PageIsSettingsPackages"] = true
70+
71+
shared.PerformRuleEditPost(
72+
ctx,
73+
ctx.ContextUser,
74+
fmt.Sprintf("%s/org/%s/settings/packages", setting.AppSubURL, ctx.ContextUser.Name),
75+
tplSettingsPackagesRuleEdit,
76+
)
77+
}
78+
79+
func PackagesRulePreview(ctx *context.Context) {
80+
ctx.Data["Title"] = ctx.Tr("packages.title")
81+
ctx.Data["PageIsOrgSettings"] = true
82+
ctx.Data["PageIsSettingsPackages"] = true
83+
84+
shared.SetRulePreviewContext(ctx, ctx.ContextUser)
85+
86+
ctx.HTML(http.StatusOK, tplSettingsPackagesRulePreview)
87+
}
