Commit b223d36

lunny, 6543, and techknowlogick authored
Rework repository archive (#14723)
* Use storage to store archive files
* Fix backend lint
* Add archiver table on database
* Finish archive download
* Fix test
* Add database migrations
* Add status for archiver
* Fix lint
* Add queue
* Add doctor to check and delete old archives
* Improve archive queue
* Fix tests
* improve archive storage
* Delete repo archives
* Add missing fixture
* fix fixture
* Fix fixture
* Fix test
* Fix archiver cleaning
* Fix bug
* Add docs for repository archive storage
* remove repo-archive configuration
* Fix test
* Fix test
* Fix lint

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
1 parent c9c7afd commit b223d36

25 files changed: +628 -460 lines changed

custom/conf/app.example.ini

+10
Original file line number | Diff line number | Diff line change
@@ -2048,6 +2048,16 @@ PATH =
20482048
;; storage type
20492049
;STORAGE_TYPE = local
20502050

2051+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2052+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2053+
;; settings for repository archives, will override storage setting
2054+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2055+
;[storage.repo-archive]
2056+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2057+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
2058+
;; storage type
2059+
;STORAGE_TYPE = local
2060+
20512061
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
20522062
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
20532063
;; lfs storage will override storage

docs/content/doc/advanced/config-cheat-sheet.en-us.md

+17
@@ -995,6 +995,23 @@ MINIO_USE_SSL = false
995995

996996
It can then be used by `[attachment]`, `[lfs]`, etc. as the `STORAGE_TYPE`.
997997

998+
## Repository Archive Storage (`storage.repo-archive`)
999+
1000+
Configuration for repository archive storage. It inherits from the default `[storage]` section, or
1001+
from `[storage.xxx]` when `STORAGE_TYPE` is set to `xxx`. The default `PATH`
1002+
is `data/repo-archive` and the default `MINIO_BASE_PATH` is `repo-archive/`.
1003+
1004+
- `STORAGE_TYPE`: **local**: Storage type for repository archives: `local` for local disk, `minio` for an S3-compatible object storage service, or another name defined with `[storage.xxx]`.
1005+
- `SERVE_DIRECT`: **false**: Allows the storage driver to redirect to authenticated URLs to serve files directly. Currently, only Minio/S3 is supported via signed URLs; `local` does nothing.
1006+
- `PATH`: **./data/repo-archive**: Where to store archive files. Only available when `STORAGE_TYPE` is `local`.
1007+
- `MINIO_ENDPOINT`: **localhost:9000**: Minio endpoint to connect to. Only available when `STORAGE_TYPE` is `minio`.
1008+
- `MINIO_ACCESS_KEY_ID`: Minio accessKeyID used to connect. Only available when `STORAGE_TYPE` is `minio`.
1009+
- `MINIO_SECRET_ACCESS_KEY`: Minio secretAccessKey used to connect. Only available when `STORAGE_TYPE` is `minio`.
1010+
- `MINIO_BUCKET`: **gitea**: Minio bucket in which to store the repository archives. Only available when `STORAGE_TYPE` is `minio`.
1011+
- `MINIO_LOCATION`: **us-east-1**: Minio location in which to create the bucket. Only available when `STORAGE_TYPE` is `minio`.
1012+
- `MINIO_BASE_PATH`: **repo-archive/**: Minio base path on the bucket. Only available when `STORAGE_TYPE` is `minio`.
1013+
- `MINIO_USE_SSL`: **false**: Whether Minio uses SSL. Only available when `STORAGE_TYPE` is `minio`.
1014+
9981015
## Other (`other`)
9991016

10001017
- `SHOW_FOOTER_BRANDING`: **false**: Show Gitea branding in the footer.
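
As an illustration of the `storage.repo-archive` options documented in this hunk, the fragment below sketches a Minio-backed configuration. The values are the documented defaults; the two credential placeholders are hypothetical and must be replaced with real values.

```ini
; Sketch only: store repository archives in Minio instead of on local disk.
[storage.repo-archive]
STORAGE_TYPE = minio
SERVE_DIRECT = false
MINIO_ENDPOINT = localhost:9000
; Placeholders, not real credentials.
MINIO_ACCESS_KEY_ID = <access-key-id>
MINIO_SECRET_ACCESS_KEY = <secret-access-key>
MINIO_BUCKET = gitea
MINIO_LOCATION = us-east-1
MINIO_BASE_PATH = repo-archive/
MINIO_USE_SSL = false
```

With `STORAGE_TYPE = local`, only `PATH` would apply and the `MINIO_*` keys would be ignored, per the option descriptions above.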

docs/content/doc/advanced/config-cheat-sheet.zh-cn.md

+15
@@ -382,6 +382,21 @@ MINIO_USE_SSL = false
382382

383383
You can then use this name as the value of `STORAGE_TYPE` in `[attachment]`, `[lfs]`, and so on.
384384

385+
## Repository Archive Storage (`storage.repo-archive`)
386+
387+
Storage configuration for repository archives. If `STORAGE_TYPE` is empty, this configuration inherits from `[storage]`. If it is neither `local` nor `minio` but `xxx`, it inherits from `[storage.xxx]`. When inheriting, `PATH` defaults to `data/repo-archive` and `MINIO_BASE_PATH` defaults to `repo-archive/`.
388+
389+
- `STORAGE_TYPE`: **local**: Storage type for repository archives; `local` stores them on disk, `minio` stores them in an S3-compatible object storage service.
390+
- `SERVE_DIRECT`: **false**: Allows redirecting directly to the storage system. Currently, only Minio/S3 is supported.
391+
- `PATH`: Where uploaded repository archive files are stored; the default is `data/repo-archive`.
392+
- `MINIO_ENDPOINT`: **localhost:9000**: Minio endpoint; only effective when `STORAGE_TYPE` is `minio`.
393+
- `MINIO_ACCESS_KEY_ID`: Minio accessKeyID; only effective when `STORAGE_TYPE` is `minio`.
394+
- `MINIO_SECRET_ACCESS_KEY`: Minio secretAccessKey; only effective when `STORAGE_TYPE` is `minio`.
395+
- `MINIO_BUCKET`: **gitea**: Minio bucket; only effective when `STORAGE_TYPE` is `minio`.
396+
- `MINIO_LOCATION`: **us-east-1**: Minio location; only effective when `STORAGE_TYPE` is `minio`.
397+
- `MINIO_BASE_PATH`: **repo-archive/**: Minio base path; only effective when `STORAGE_TYPE` is `minio`.
398+
- `MINIO_USE_SSL`: **false**: Whether Minio uses SSL; only effective when `STORAGE_TYPE` is `minio`.
399+
385400
## Other (`other`)
386401

387402
- `SHOW_FOOTER_BRANDING`: If true, shows the Gitea branding at the bottom of the page.
@@ -0,0 +1 @@
1+
aacbdfe9e1c4b47f60abe81849045fa4e96f1d75

models/fixtures/repo_archiver.yml

+1
@@ -0,0 +1 @@
1+
[] # empty

models/migrations/migrations.go

+2
@@ -319,6 +319,8 @@ var migrations = []Migration{
319319
NewMigration("Create PushMirror table", createPushMirrorTable),
320320
// v184 -> v185
321321
NewMigration("Rename Task errors to message", renameTaskErrorsToMessage),
322+
// v185 -> v186
323+
NewMigration("Add new table repo_archiver", addRepoArchiver),
322324
}
323325

324326
// GetCurrentDBVersion returns the current db version

models/migrations/v181.go

+1
@@ -1,3 +1,4 @@
1+
// Copyright 2021 The Gitea Authors. All rights reserved.
12
// Use of this source code is governed by a MIT-style
23
// license that can be found in the LICENSE file.
34

models/migrations/v185.go

+22
@@ -0,0 +1,22 @@
1+
// Copyright 2021 The Gitea Authors. All rights reserved.
2+
// Use of this source code is governed by a MIT-style
3+
// license that can be found in the LICENSE file.
4+
5+
package migrations
6+
7+
import (
8+
"xorm.io/xorm"
9+
)
10+
11+
func addRepoArchiver(x *xorm.Engine) error {
12+
// RepoArchiver represents all archivers
13+
type RepoArchiver struct {
14+
ID int64 `xorm:"pk autoincr"`
15+
RepoID int64 `xorm:"index unique(s)"`
16+
Type int `xorm:"unique(s)"`
17+
Status int
18+
CommitID string `xorm:"VARCHAR(40) unique(s)"`
19+
CreatedUnix int64 `xorm:"INDEX NOT NULL created"`
20+
}
21+
return x.Sync2(new(RepoArchiver))
22+
}

models/models.go

+1
@@ -136,6 +136,7 @@ func init() {
136136
new(RepoTransfer),
137137
new(IssueIndex),
138138
new(PushMirror),
139+
new(RepoArchiver),
139140
)
140141

141142
gonicNames := []string{"SSL", "UID"}

models/repo.go

+47-50
@@ -1587,6 +1587,22 @@ func DeleteRepository(doer *User, uid, repoID int64) error {
15871587
return err
15881588
}
15891589

1590+
// Remove archives
1591+
var archives []*RepoArchiver
1592+
if err = sess.Where("repo_id=?", repoID).Find(&archives); err != nil {
1593+
return err
1594+
}
1595+
1596+
for _, v := range archives {
1597+
v.Repo = repo
1598+
p, _ := v.RelativePath()
1599+
removeStorageWithNotice(sess, storage.RepoArchives, "Delete repo archive file", p)
1600+
}
1601+
1602+
if _, err := sess.Delete(&RepoArchiver{RepoID: repoID}); err != nil {
1603+
return err
1604+
}
1605+
15901606
if repo.NumForks > 0 {
15911607
if _, err = sess.Exec("UPDATE `repository` SET fork_id=0,is_fork=? WHERE fork_id=?", false, repo.ID); err != nil {
15921608
log.Error("reset 'fork_id' and 'is_fork': %v", err)
@@ -1768,64 +1784,45 @@ func DeleteRepositoryArchives(ctx context.Context) error {
17681784
func DeleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration) error {
17691785
log.Trace("Doing: ArchiveCleanup")
17701786

1771-
if err := x.Where("id > 0").Iterate(new(Repository), func(idx int, bean interface{}) error {
1772-
return deleteOldRepositoryArchives(ctx, olderThan, idx, bean)
1773-
}); err != nil {
1774-
log.Trace("Error: ArchiveClean: %v", err)
1775-
return err
1776-
}
1777-
1778-
log.Trace("Finished: ArchiveCleanup")
1779-
return nil
1780-
}
1781-
1782-
func deleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration, idx int, bean interface{}) error {
1783-
repo := bean.(*Repository)
1784-
basePath := filepath.Join(repo.RepoPath(), "archives")
1785-
1786-
for _, ty := range []string{"zip", "targz"} {
1787-
select {
1788-
case <-ctx.Done():
1789-
return ErrCancelledf("before deleting old repository archives with filetype %s for %s", ty, repo.FullName())
1790-
default:
1791-
}
1792-
1793-
path := filepath.Join(basePath, ty)
1794-
file, err := os.Open(path)
1795-
if err != nil {
1796-
if !os.IsNotExist(err) {
1797-
log.Warn("Unable to open directory %s: %v", path, err)
1798-
return err
1799-
}
1800-
1801-
// If the directory doesn't exist, that's okay.
1802-
continue
1803-
}
1804-
1805-
files, err := file.Readdir(0)
1806-
file.Close()
1787+
for {
1788+
var archivers []RepoArchiver
1789+
err := x.Where("created_unix < ?", time.Now().Add(-olderThan).Unix()).
1790+
Asc("created_unix").
1791+
Limit(100).
1792+
Find(&archivers)
18071793
if err != nil {
1808-
log.Warn("Unable to read directory %s: %v", path, err)
1794+
log.Trace("Error: ArchiveClean: %v", err)
18091795
return err
18101796
}
18111797

1812-
minimumOldestTime := time.Now().Add(-olderThan)
1813-
for _, info := range files {
1814-
if info.ModTime().Before(minimumOldestTime) && !info.IsDir() {
1815-
select {
1816-
case <-ctx.Done():
1817-
return ErrCancelledf("before deleting old repository archive file %s with filetype %s for %s", info.Name(), ty, repo.FullName())
1818-
default:
1819-
}
1820-
toDelete := filepath.Join(path, info.Name())
1821-
// This is a best-effort purge, so we do not check error codes to confirm removal.
1822-
if err = util.Remove(toDelete); err != nil {
1823-
log.Trace("Unable to delete %s, but proceeding: %v", toDelete, err)
1824-
}
1798+
for _, archiver := range archivers {
1799+
if err := deleteOldRepoArchiver(ctx, &archiver); err != nil {
1800+
return err
18251801
}
18261802
}
1803+
if len(archivers) < 100 {
1804+
break
1805+
}
18271806
}
18281807

1808+
log.Trace("Finished: ArchiveCleanup")
1809+
return nil
1810+
}
1811+
1812+
var delRepoArchiver = new(RepoArchiver)
1813+
1814+
func deleteOldRepoArchiver(ctx context.Context, archiver *RepoArchiver) error {
1815+
p, err := archiver.RelativePath()
1816+
if err != nil {
1817+
return err
1818+
}
1819+
_, err = x.ID(archiver.ID).Delete(delRepoArchiver)
1820+
if err != nil {
1821+
return err
1822+
}
1823+
if err := storage.RepoArchives.Delete(p); err != nil {
1824+
log.Error("delete repo archive file failed: %v", err)
1825+
}
18291826
return nil
18301827
}
18311828
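
The rewritten `DeleteOldRepositoryArchives` replaces the per-repository directory walk with a batched query: fetch up to 100 expired rows ordered by creation time, delete them, and stop once a batch comes back short. A minimal sketch of that loop follows, using an in-memory map instead of the database; `record`, `store`, `expired`, and `cleanup` are hypothetical names, and storage deletion is omitted.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// record is a hypothetical stand-in for a repo_archiver row.
type record struct {
	ID          int64
	CreatedUnix int64
}

// store is a hypothetical in-memory replacement for the database table.
type store struct{ rows map[int64]record }

// expired returns up to limit rows created before cutoff, oldest first.
func (s *store) expired(cutoff int64, limit int) []record {
	var out []record
	for _, r := range s.rows {
		if r.CreatedUnix < cutoff {
			out = append(out, r)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].CreatedUnix != out[j].CreatedUnix {
			return out[i].CreatedUnix < out[j].CreatedUnix
		}
		return out[i].ID < out[j].ID
	})
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

// cleanup deletes expired rows batch by batch and stops after a short batch.
// It returns the number of rows deleted.
func cleanup(s *store, cutoff int64, batch int) int {
	deleted := 0
	for {
		rows := s.expired(cutoff, batch)
		for _, r := range rows {
			delete(s.rows, r.ID)
			deleted++
		}
		if len(rows) < batch {
			return deleted
		}
	}
}

func main() {
	now := time.Unix(1000000, 0)
	s := &store{rows: map[int64]record{}}
	for i := int64(1); i <= 250; i++ {
		s.rows[i] = record{ID: i, CreatedUnix: now.Unix() - 7200} // two hours old
	}
	s.rows[251] = record{ID: 251, CreatedUnix: now.Unix() - 60} // one minute old
	cutoff := now.Add(-time.Hour).Unix()
	fmt.Println(cleanup(s, cutoff, 100), len(s.rows))
}
```

The loop terminates only because every fetched row is removed before the next query runs. In the diff, a failed row delete returns an error immediately, which likewise keeps the same batch from being fetched again indefinitely.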

models/repo_archiver.go

+86
@@ -0,0 +1,86 @@
1+
// Copyright 2021 The Gitea Authors. All rights reserved.
2+
// Use of this source code is governed by a MIT-style
3+
// license that can be found in the LICENSE file.
4+
5+
package models
6+
7+
import (
8+
"fmt"
9+
10+
"code.gitea.io/gitea/modules/git"
11+
"code.gitea.io/gitea/modules/timeutil"
12+
)
13+
14+
// RepoArchiverStatus represents repo archive status
15+
type RepoArchiverStatus int
16+
17+
// enumerate all repo archive statuses
18+
const (
19+
RepoArchiverGenerating = iota // the archiver is generating
20+
RepoArchiverReady // it's ready
21+
)
22+
23+
// RepoArchiver represents all archivers
24+
type RepoArchiver struct {
25+
ID int64 `xorm:"pk autoincr"`
26+
RepoID int64 `xorm:"index unique(s)"`
27+
Repo *Repository `xorm:"-"`
28+
Type git.ArchiveType `xorm:"unique(s)"`
29+
Status RepoArchiverStatus
30+
CommitID string `xorm:"VARCHAR(40) unique(s)"`
31+
CreatedUnix timeutil.TimeStamp `xorm:"INDEX NOT NULL created"`
32+
}
33+
34+
// LoadRepo loads repository
35+
func (archiver *RepoArchiver) LoadRepo() (*Repository, error) {
36+
if archiver.Repo != nil {
37+
return archiver.Repo, nil
38+
}
39+
40+
var repo Repository
41+
has, err := x.ID(archiver.RepoID).Get(&repo)
42+
if err != nil {
43+
return nil, err
44+
}
45+
if !has {
46+
return nil, ErrRepoNotExist{
47+
ID: archiver.RepoID,
48+
}
49+
}
50+
return &repo, nil
51+
}
52+
53+
// RelativePath returns relative path
54+
func (archiver *RepoArchiver) RelativePath() (string, error) {
55+
repo, err := archiver.LoadRepo()
56+
if err != nil {
57+
return "", err
58+
}
59+
60+
return fmt.Sprintf("%s/%s/%s.%s", repo.FullName(), archiver.CommitID[:2], archiver.CommitID, archiver.Type.String()), nil
61+
}
62+
63+
// GetRepoArchiver get an archiver
64+
func GetRepoArchiver(ctx DBContext, repoID int64, tp git.ArchiveType, commitID string) (*RepoArchiver, error) {
65+
var archiver RepoArchiver
66+
has, err := ctx.e.Where("repo_id=?", repoID).And("`type`=?", tp).And("commit_id=?", commitID).Get(&archiver)
67+
if err != nil {
68+
return nil, err
69+
}
70+
if has {
71+
return &archiver, nil
72+
}
73+
return nil, nil
74+
}
75+
76+
// AddRepoArchiver adds an archiver
77+
func AddRepoArchiver(ctx DBContext, archiver *RepoArchiver) error {
78+
_, err := ctx.e.Insert(archiver)
79+
return err
80+
}
81+
82+
// UpdateRepoArchiverStatus updates archiver's status
83+
func UpdateRepoArchiverStatus(ctx DBContext, archiver *RepoArchiver) error {
84+
_, err := ctx.e.ID(archiver.ID).Cols("status").Update(archiver)
85+
return err
86+
}
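
`RelativePath` builds the storage key as `<repo full name>/<first two characters of the commit ID>/<commit ID>.<type>`. The sketch below reproduces that layout in isolation. `archivePath` is a hypothetical helper, the repository name and extension are made-up inputs, and the length check is an addition of this sketch (the method in the diff slices `CommitID[:2]` without checking).

```go
package main

import (
	"errors"
	"fmt"
)

// archivePath mirrors the layout produced by RelativePath in the diff:
// <owner>/<repo>/<first two chars of commit>/<commit>.<ext>.
// The explicit length check is an assumption added for safety here.
func archivePath(fullName, commitID, ext string) (string, error) {
	if len(commitID) < 2 {
		return "", errors.New("commit ID too short")
	}
	return fmt.Sprintf("%s/%s/%s.%s", fullName, commitID[:2], commitID, ext), nil
}

func main() {
	// "user2/repo1" and "zip" are illustrative values only.
	p, err := archivePath("user2/repo1", "aacbdfe9e1c4b47f60abe81849045fa4e96f1d75", "zip")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```

Sharding by the first two characters of the commit ID keeps any single directory (or object prefix) from accumulating every archive of a repository.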

models/unit_tests.go

+2
@@ -74,6 +74,8 @@ func MainTest(m *testing.M, pathToGiteaRoot string) {
7474

7575
setting.RepoAvatar.Storage.Path = filepath.Join(setting.AppDataPath, "repo-avatars")
7676

77+
setting.RepoArchive.Storage.Path = filepath.Join(setting.AppDataPath, "repo-archive")
78+
7779
if err = storage.Init(); err != nil {
7880
fatalTestError("storage.Init: %v\n", err)
7981
}

modules/context/context.go

+15
@@ -380,6 +380,21 @@ func (ctx *Context) ServeFile(file string, names ...string) {
380380
http.ServeFile(ctx.Resp, ctx.Req, file)
381381
}
382382

383+
// ServeStream serves file via io stream
384+
func (ctx *Context) ServeStream(rd io.Reader, name string) {
385+
ctx.Resp.Header().Set("Content-Description", "File Transfer")
386+
ctx.Resp.Header().Set("Content-Type", "application/octet-stream")
387+
ctx.Resp.Header().Set("Content-Disposition", "attachment; filename="+name)
388+
ctx.Resp.Header().Set("Content-Transfer-Encoding", "binary")
389+
ctx.Resp.Header().Set("Expires", "0")
390+
ctx.Resp.Header().Set("Cache-Control", "must-revalidate")
391+
ctx.Resp.Header().Set("Pragma", "public")
392+
_, err := io.Copy(ctx.Resp, rd)
393+
if err != nil {
394+
ctx.ServerError("Download file failed", err)
395+
}
396+
}
397+
383398
// Error returned an error to web browser
384399
func (ctx *Context) Error(status int, contents ...string) {
385400
var v = http.StatusText(status)
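
`ServeStream`, added in this hunk, sets download headers and then copies an `io.Reader` into the response, which lets archives be served from any storage backend rather than from a local file path. The sketch below exercises the same pattern against the standard library's `httptest.ResponseRecorder`; `serveStream` and `demo` are simplified, hypothetical stand-ins, not Gitea's `Context` method.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// serveStream is a simplified, hypothetical stand-in for Context.ServeStream:
// it sets download headers and copies the reader to the response.
func serveStream(w http.ResponseWriter, rd io.Reader, name string) error {
	h := w.Header()
	h.Set("Content-Type", "application/octet-stream")
	h.Set("Content-Disposition", "attachment; filename="+name)
	_, err := io.Copy(w, rd)
	return err
}

// demo streams a small in-memory payload into a recorder and reports
// the status code, the Content-Disposition header, and the body.
func demo() (int, string, string) {
	rec := httptest.NewRecorder()
	if err := serveStream(rec, strings.NewReader("archive bytes"), "repo1.zip"); err != nil {
		panic(err)
	}
	return rec.Code, rec.Header().Get("Content-Disposition"), rec.Body.String()
}

func main() {
	code, disposition, body := demo()
	fmt.Println(code, disposition, len(body))
}
```

One design point worth keeping in mind: with `net/http`, the status line and headers go out with the first body write, so an error raised partway through `io.Copy` can no longer change the response status.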
