Refactor sync #3312

Merged
merged 68 commits into from
Dec 19, 2024
Changes from 19 commits
Commits
3d4c761
Add new sync implementation (WIP, untested)
eemeli May 20, 2024
9e65c9a
Address code review comments, start adding tests
eemeli Sep 9, 2024
3d6a1ec
Merge branch 'main' into sync-refactor
eemeli Sep 14, 2024
397e5f5
Add integration tests for new code; fix discovered issues
eemeli Sep 14, 2024
5778090
Update requirements
eemeli Sep 17, 2024
7f3bfcb
Satisfy lint
eemeli Sep 17, 2024
83b8fd7
Call repo commit() with pre-formatted author string rather than User …
eemeli Sep 21, 2024
3005870
Add end-to-end test
eemeli Sep 21, 2024
5b227e9
Add task wrapper & force option
eemeli Sep 21, 2024
e657223
Refactor handle_upload_content() into sync_uploaded_file()
eemeli Sep 22, 2024
5da483f
Replace get_download_content() with download_translations_zip()
eemeli Sep 30, 2024
2f0ea10
Remove old sync implementation
eemeli Oct 1, 2024
bb97199
Pretranslate added & changed resources, fix task invocations
eemeli Oct 1, 2024
3da84c0
Move & rename pontoon.sync contents around, adding pontoon.sync.core
eemeli Oct 1, 2024
92fb993
Combine upload & download functions into pontoon.sync.utils
eemeli Oct 1, 2024
410a278
Merge branch 'main' into sync-refactor
eemeli Oct 1, 2024
83049d9
Include --no-strip-extras in `uv pip compile` calls
eemeli Oct 1, 2024
f92e331
Satisfy ruff
eemeli Oct 1, 2024
fcfc5fa
File format detector cleanup, drop unused template
eemeli Oct 1, 2024
15ef5b6
Fix last_synced_revision data, drop remaining multi_locale references
eemeli Oct 1, 2024
f5e16db
More dead code removal
eemeli Oct 1, 2024
455c64a
Revert Repository.permalink_prefix help_text change
eemeli Oct 1, 2024
a5177cf
Support file renames for git repos
eemeli Oct 10, 2024
4efe373
Fix issues discovered by manual testing
eemeli Oct 10, 2024
2714f81
Improve sync logging
eemeli Oct 12, 2024
bb26ab5
Merge branch 'main' into sync-refactor
eemeli Oct 12, 2024
76dfa40
More sync fixes & logging
eemeli Oct 12, 2024
c49e95b
Update moz.l10n dependency
eemeli Oct 13, 2024
63aac31
Oops, fix file upload handling
eemeli Oct 13, 2024
fb538a1
Fix zip download, add test for it
eemeli Oct 15, 2024
8aebdd6
Drop unnecessary extras from test_download
eemeli Oct 15, 2024
646ab4b
Simplify aggregated stats for .po plurals, use SQL UPDATE queries
eemeli Oct 16, 2024
b4854c4
Reduce stats updates further, include total_strings calculation
eemeli Oct 16, 2024
d322f3d
Use simpler query for looking up entity identifiers
eemeli Oct 16, 2024
68f94f7
Use new update_stats() for `manage.py calculate_stats` command
eemeli Oct 17, 2024
240cd73
Add & remove TranslatedResource objects when locales change
eemeli Oct 17, 2024
b70451b
Sum project total_strings from translated resources, not resources
eemeli Oct 18, 2024
33c307d
Update moz.l10n to 0.5.2, log changed resources
eemeli Oct 18, 2024
263a2f1
Always sync all translated resources
eemeli Oct 18, 2024
e6d0425
Merge branch 'main' into sync-refactor
eemeli Oct 18, 2024
5eb6a9f
Merge branch 'main' into sync-refactor
eemeli Nov 26, 2024
f7171bf
Fix manual pretranslation task
eemeli Nov 26, 2024
cf4e4b8
Fix file upload, ensure that it reports at least some error on failure
eemeli Nov 27, 2024
bd649e7
Update to moz.l10n 0.5.5
eemeli Nov 27, 2024
cebc5b2
Satisfy ruff
eemeli Nov 27, 2024
a470d4a
Update to moz.l10n 0.5.6
eemeli Dec 2, 2024
37bdb89
Dedupe updates for multiple changes made to the same resource
eemeli Dec 4, 2024
ff10ffa
Apply suggested changes from code review
eemeli Dec 4, 2024
5b4f8b6
Merge branch 'main' into sync-refactor
eemeli Dec 4, 2024
d575b64
Drop dead code: Entity.reset_active_translation()
eemeli Dec 4, 2024
5d49838
Oops, it's EntityQuerySet.reset_active_translations() that is no long…
eemeli Dec 4, 2024
f166d76
Add test case for translation arriving before its source is added
eemeli Dec 5, 2024
a2ac4fb
Use shallow clones for downloads from projects using git repos
eemeli Dec 12, 2024
c244927
When downloading translations, skip missing files & use full target r…
eemeli Dec 12, 2024
ea4edf0
Merge branch 'main' into sync-refactor
eemeli Dec 12, 2024
619a22a
Fix download tests
eemeli Dec 12, 2024
1f298f4
Update to translate-toolkit 3.14.1
eemeli Dec 12, 2024
c309617
Fix total_strings counts to depend on gettext locale plurals in aggre…
eemeli Dec 13, 2024
c239d7d
Drop unused ResourceQuerySet
eemeli Dec 13, 2024
784dec2
Dismiss local git repo edits when branch is specified
eemeli Dec 16, 2024
8f26503
During update from repo, keep previously fuzzy suggestions unchanged …
eemeli Dec 16, 2024
3762da5
Rather than creating zip, "download" by redirecting to target repository
eemeli Dec 16, 2024
74e4a14
Add shortcut (read: hack) for downloading from projects with separate…
eemeli Dec 16, 2024
357f7fe
Include active fuzzy translations when writing to repo
eemeli Dec 17, 2024
008fa56
When approving matching prior translations, do not also reject them
eemeli Dec 18, 2024
ff95768
Use locale's total_strings for GET <locale>/<slug>/parts/
eemeli Dec 18, 2024
c931f71
Count translation updates before dropping approvals from the dict
eemeli Dec 18, 2024
ff2b00d
Rename add_errors() as add_failed_checks()
eemeli Dec 19, 2024
8 changes: 4 additions & 4 deletions docker/compile_requirements.sh
@@ -5,7 +5,7 @@

export CUSTOM_COMPILE_COMMAND="./docker/compile_requirements.sh"

-uv pip compile --generate-hashes $@ requirements/default.in -o requirements/default.txt
-uv pip compile --generate-hashes $@ requirements/dev.in -o requirements/dev.txt
-uv pip compile --generate-hashes $@ requirements/lint.in -o requirements/lint.txt
-uv pip compile --generate-hashes $@ requirements/test.in -o requirements/test.txt
+uv pip compile --generate-hashes --no-strip-extras $@ requirements/default.in -o requirements/default.txt
+uv pip compile --generate-hashes --no-strip-extras $@ requirements/dev.in -o requirements/dev.txt
+uv pip compile --generate-hashes --no-strip-extras $@ requirements/lint.in -o requirements/lint.txt
+uv pip compile --generate-hashes --no-strip-extras $@ requirements/test.in -o requirements/test.txt
6 changes: 3 additions & 3 deletions pontoon/administration/views.py
@@ -32,7 +32,7 @@
from pontoon.base.utils import require_AJAX
from pontoon.pretranslation.tasks import pretranslate
from pontoon.sync.models import SyncLog
-from pontoon.sync.tasks import sync_project
+from pontoon.sync.tasks import sync_project_task


log = logging.getLogger(__name__)
@@ -542,9 +542,9 @@ def manually_sync_project(request, slug):
"Forbidden: You don't have permission for syncing projects"
)

-sync_log = SyncLog.objects.create(start_time=timezone.now())
project = Project.objects.get(slug=slug)
-sync_project.delay(project.pk, sync_log.pk)
+sync_log = SyncLog.objects.create(start_time=timezone.now())
+sync_project_task.delay(project.pk, sync_log.pk)

return HttpResponse("ok")

13 changes: 0 additions & 13 deletions pontoon/base/__init__.py
@@ -1,13 +0,0 @@
-MOZILLA_REPOS = (
-"ssh://hg.mozilla.org/users/m_owca.info/firefox-beta/",
-"ssh://hg.mozilla.org/users/m_owca.info/firefox-for-android-beta/",
-"ssh://hg.mozilla.org/users/m_owca.info/thunderbird-beta/",
-"ssh://hg.mozilla.org/users/m_owca.info/lightning-beta/",
-"ssh://hg.mozilla.org/users/m_owca.info/seamonkey-beta/",
-"ssh://hg.mozilla.org/users/m_owca.info/firefox-central/",
-"ssh://hg.mozilla.org/users/m_owca.info/firefox-for-android-central/",
-"ssh://hg.mozilla.org/users/m_owca.info/thunderbird-central/",
-"ssh://hg.mozilla.org/users/m_owca.info/lightning-central/",
-"ssh://hg.mozilla.org/users/m_owca.info/seamonkey-central/",
-"git@gitlab.com:seamonkey-project/seamonkey-central-l10n.git",
-)
6 changes: 2 additions & 4 deletions pontoon/base/migrations/0018_populate_entity_context.py
@@ -3,22 +3,20 @@
from django.db import migrations
from django.db.models import F, Func, TextField, Value

-from pontoon.sync import KEY_SEPARATOR
-

def add_entity_context(apps, schema_editor):
Entity = apps.get_model("base", "Entity")

split_key_po = Func(
F("key"),
-Value(KEY_SEPARATOR),
+Value("\x04"),
Value(1),
function="split_part",
output_field=TextField(),
)
split_key_xliff = Func(
F("key"),
-Value(KEY_SEPARATOR),
+Value("\x04"),
Value(2),
function="split_part",
output_field=TextField(),
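
For readers unfamiliar with PostgreSQL's `split_part`, here is a rough Python equivalent of what the two `Func` expressions compute now that the separator is written inline as `"\x04"`. This is only a sketch: the helper name `split_part` and the variable names are illustrative and not part of the migration, the sample key is borrowed from `test_entity.py`, and only positive field positions are handled.

```python
def split_part(value: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) field of value, or "" if there is no such field."""
    if n < 1:
        raise ValueError("this sketch only supports positive field positions")
    parts = value.split(delimiter)
    return parts[n - 1] if n <= len(parts) else ""


key = "Key\x04entity_b"  # sample key, as used in test_entity.py

po_part = split_part(key, "\x04", 1)     # mirrors split_key_po (Value(1))
xliff_part = split_part(key, "\x04", 2)  # mirrors split_key_xliff (Value(2))

print(repr(po_part), repr(xliff_part))
```

A key without the separator yields the whole key for field 1 and an empty string for field 2.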
10 changes: 8 additions & 2 deletions pontoon/base/models/changed_entity_locale.py
@@ -1,15 +1,21 @@
+from typing import TYPE_CHECKING
+
from django.db import models
from django.utils import timezone


+if TYPE_CHECKING:
+from pontoon.base.models import Entity, Locale


class ChangedEntityLocale(models.Model):
"""
ManyToMany model for storing what locales have changed translations for a
specific entity since the last sync.
"""

-entity = models.ForeignKey("Entity", models.CASCADE)
-locale = models.ForeignKey("Locale", models.CASCADE)
+entity: models.ForeignKey["Entity"] = models.ForeignKey("Entity", models.CASCADE)
+locale: models.ForeignKey["Locale"] = models.ForeignKey("Locale", models.CASCADE)
when = models.DateTimeField(default=timezone.now)

class Meta:
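
The `TYPE_CHECKING` guard added in this file can be illustrated in isolation. In this standard-library sketch, `Decimal` stands in for a model class that is imported for annotations only, and `describe` is a hypothetical function. That the guard is used here to avoid a circular import is an assumption; the diff does not state the motivation.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch is skipped at runtime.
    from decimal import Decimal


def describe(value: "Decimal") -> str:
    # The quoted annotation is stored as a plain string at runtime,
    # so the name does not need to be importable when the module loads.
    return f"value={value}"


print(describe(3))
```

The same idea applies to the quoted `"Entity"` and `"Locale"` names in the annotations above: they are resolved by the type checker, not by the running application.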
25 changes: 10 additions & 15 deletions pontoon/base/models/entity.py
@@ -1,3 +1,4 @@
+from collections.abc import Iterable
from functools import reduce
from operator import ior
from re import escape, findall, match
@@ -14,7 +15,6 @@
from pontoon.base.models.project import Project
from pontoon.base.models.project_locale import ProjectLocale
from pontoon.base.models.resource import Resource
-from pontoon.sync import KEY_SEPARATOR


def get_word_count(string):
@@ -532,18 +532,6 @@ class Meta:
models.Index(fields=["resource", "obsolete", "string_plural"]),
]

-@property
-def cleaned_key(self):
-"""
-Get cleaned key, without the source string and Translate Toolkit
-separator.
-"""
-key = self.key.split(KEY_SEPARATOR)[0]
-if key == self.string:
-key = ""
-
-return key
-
def __str__(self):
return self.string
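
The `cleaned_key` property removed in this hunk is inlined into `map_entities` further down in this file's diff. Its logic can be sketched as a standalone function; `clean_key` is a hypothetical name used only for this example, and the sample values follow the factories in `test_entity.py`.

```python
KEY_SEPARATOR = "\x04"  # separator character used in entity keys


def clean_key(key: str, string: str) -> str:
    """Return the key part before the separator, or "" if it repeats the source string."""
    cleaned = key.split(KEY_SEPARATOR)[0]
    return "" if cleaned == string else cleaned


print(repr(clean_key("Key\x04entity_b", "entity_b")))
print(repr(clean_key("entity_a", "entity_a")))
```

The first call keeps the leading `Key` part; the second returns an empty string because the key carries no information beyond the source string.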

@@ -942,7 +930,9 @@ def map_entities(
):
entities_array = []

-entities = entities.prefetch_entities_data(locale, preferred_source_locale)
+entities: Iterable[Entity] = entities.prefetch_entities_data(
+locale, preferred_source_locale
+)

# If requested entity not in the current page
if requested_entity and requested_entity not in [e.pk for e in entities]:
@@ -971,13 +961,18 @@ def map_entities(
if original_plural != "":
original_plural = entity.alternative_originals[-1].string

+key_separator = "\x04"
+cleaned_key = entity.key.split(key_separator)[0]
+if cleaned_key == entity.string:
+cleaned_key = ""
+
entities_array.append(
{
"pk": entity.pk,
"original": original,
"original_plural": original_plural,
"machinery_original": entity.string,
-"key": entity.cleaned_key,
+"key": cleaned_key,
"context": entity.context,
"path": entity.resource.path,
"project": entity.resource.project.serialize(),
28 changes: 16 additions & 12 deletions pontoon/base/models/project.py
@@ -1,20 +1,25 @@
from collections import defaultdict
from os.path import basename, join, normpath
+from typing import TYPE_CHECKING
from urllib.parse import urlparse

from django.conf import settings
from django.contrib.auth.models import User
from django.db import models
from django.db.models import Prefetch
+from django.db.models.manager import BaseManager
from django.utils import timezone
from django.utils.functional import cached_property

-from pontoon.base import utils
from pontoon.base.models.aggregated_stats import AggregatedStats
from pontoon.base.models.changed_entity_locale import ChangedEntityLocale
from pontoon.base.models.locale import Locale


+if TYPE_CHECKING:
+from pontoon.base.models import Resource


class Priority(models.IntegerChoices):
LOWEST = 1, "Lowest"
LOW = 2, "Low"
@@ -103,6 +108,8 @@ class Project(AggregatedStats):
slug = models.SlugField(unique=True)
locales = models.ManyToManyField(Locale, through="ProjectLocale")

+resources: BaseManager["Resource"]

class DataSource(models.TextChoices):
REPOSITORY = "repository", "Repository"
DATABASE = "database", "Database"
@@ -325,14 +332,14 @@ def repository_for_path(self, path):
Return the repository instance whose checkout contains the given
path. If no matching repo is found, raise a ValueError.
"""
-repo = utils.first(
-self.repositories.all(), lambda r: path.startswith(r.checkout_path)
-)
-
-if repo is None:
+try:
+return next(
+repo
+for repo in self.repositories.all()
+if path.startswith(repo.checkout_path)
+)
+except StopIteration:
raise ValueError(f"Could not find repo matching path {path}.")
-else:
-return repo

@property
def has_multi_locale_repositories(self):
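
The lookup that replaces `utils.first()` in `repository_for_path` can be tried outside Django. This is a sketch under assumptions: `Repo` is a hypothetical stand-in for the `Repository` model that carries only a `checkout_path`, the sample paths are made up, and the `from None` on the re-raise is an addition of this example rather than something the diff does.

```python
from dataclasses import dataclass


@dataclass
class Repo:
    """Hypothetical stand-in for a Repository with a checkout path."""

    checkout_path: str


def repository_for_path(repos, path):
    """Return the first repo whose checkout contains path, else raise ValueError."""
    try:
        # next() on a generator stops at the first match
        # and raises StopIteration when nothing matches.
        return next(repo for repo in repos if path.startswith(repo.checkout_path))
    except StopIteration:
        raise ValueError(f"Could not find repo matching path {path}.") from None


repos = [Repo("/media/projects/demo/fr"), Repo("/media/projects/demo/de")]
print(repository_for_path(repos, "/media/projects/demo/de/app.po"))
```

Calling `next(generator, None)` and checking for `None` would behave the same way; the `try`/`except` form simply keeps the success path as a single `return`.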
@@ -352,10 +359,7 @@ def source_repository(self):
Returns an instance of repository which contains the path to source files.
"""
if not self.has_single_repo:
-from pontoon.sync.vcs.project import VCSProject
-
-source_directories = VCSProject.SOURCE_DIR_SCORES.keys()
-
+source_directories = {"templates", "en-US", "en-us", "en_US", "en_us", "en"}
for repo in self.repositories.all():
last_directory = basename(normpath(urlparse(repo.url).path))
if repo.source_repo or last_directory in source_directories:
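
How the last URL directory is matched against the now-inlined set of source directory names can be sketched as follows. The function names and the first sample URL are hypothetical, and `posixpath` is used instead of `os.path` so that the example behaves the same on every platform. The real property also accepts any repository with `source_repo` set, which this sketch leaves out.

```python
from posixpath import basename, normpath
from urllib.parse import urlparse

SOURCE_DIRECTORIES = {"templates", "en-US", "en-us", "en_US", "en_us", "en"}


def last_directory(url: str) -> str:
    """Return the final path segment of a repository URL, ignoring a trailing slash."""
    return basename(normpath(urlparse(url).path))


def looks_like_source_repo(url: str) -> bool:
    return last_directory(url) in SOURCE_DIRECTORIES


print(last_directory("https://example.com/l10n/en-US/"))
print(looks_like_source_repo("https://hg.mozilla.org/gaia-l10n/fr/"))
```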
18 changes: 8 additions & 10 deletions pontoon/base/models/repository.py
@@ -1,8 +1,7 @@
import logging
import re

-from os import sep
-from os.path import join
+from os.path import join, normpath
from urllib.parse import urlparse

from jsonfield import JSONField
@@ -115,21 +114,20 @@ def checkout_path(self):
Path where the checkout for this repo is located. Does not
include a trailing path separator.
"""
-path_components = [self.project.checkout_path]

# Include path components from the URL in case it has locale
# information, like https://hg.mozilla.org/gaia-l10n/fr/.
# No worry about overlap between repos, any overlap of locale
# directories is an error already.
-path_components += urlparse(self.url).path.split("/")
+path_components = [
+self.project.checkout_path,
+*urlparse(self.url).path.split("/"),
+]
if self.multi_locale:
path_components = [c for c in path_components if c != "{locale_code}"]

-if self.source_repo:
-path_components.append("templates")
-
-# Remove trailing separator for consistency.
-return join(*path_components).rstrip(sep)
+# Normalize path for consistency.
+return normpath(join(*path_components))

@cached_property
def api_config(self):
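
The reworked `checkout_path` computation can be sketched as a plain function. Assumptions: the project checkout path is passed in as a string (`/media/projects/demo` is a made-up value), `posixpath` replaces `os.path` for platform-independent output, and the dropped `templates` suffix for source repos is inferred from the hunk's line counts and the updated test in `test_repository.py`.

```python
from posixpath import join, normpath
from urllib.parse import urlparse


def checkout_path(project_checkout_path: str, url: str, multi_locale: bool = False) -> str:
    """Build a checkout path from the project path and the URL's path components."""
    components = [project_checkout_path, *urlparse(url).path.split("/")]
    if multi_locale:
        # Multi-locale URLs contain a placeholder that is not a real directory.
        components = [c for c in components if c != "{locale_code}"]
    # normpath drops the trailing separator and collapses "." and ".." segments.
    return normpath(join(*components))


print(checkout_path("/media/projects/demo", "https://example.com/path/to/locale/"))
print(checkout_path("/media/projects/demo", "https://example.com/l10n/{locale_code}/", multi_locale=True))
```

Compared with the earlier `rstrip(sep)`, `normpath` also cleans up segments inside the path, not only the trailing separator.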
@@ -234,7 +232,7 @@ def pull(self, locales=None):

return current_revisions

-def commit(self, message, author, path):
+def commit(self, message: str, author: str, path: str):
"""Commit changes to VCS."""
# For multi-locale repos, figure out which sub-repo corresponds
# to the given path.
36 changes: 1 addition & 35 deletions pontoon/base/models/resource.py
@@ -1,12 +1,9 @@
-from os.path import splitext
-
from django.db import models
from django.utils import timezone


class ResourceQuerySet(models.QuerySet):
-def asymmetric(self):
-return self.filter(format__in=Resource.ASYMMETRIC_FORMATS)
+pass


class Resource(models.Model):
@@ -46,19 +43,6 @@ class Format(models.TextChoices):

deadline = models.DateField(blank=True, null=True)

-SOURCE_EXTENSIONS = ["pot"] # Extensions of source-only formats.
-ALLOWED_EXTENSIONS = Format.values + SOURCE_EXTENSIONS
-
-ASYMMETRIC_FORMATS = {
-Format.DTD,
-Format.FTL,
-Format.INC,
-Format.INI,
-Format.JSON,
-Format.PROPERTIES,
-Format.XML,
-}
-
# Formats that allow empty translations
EMPTY_TRANSLATION_FORMATS = {
Format.DTD,
@@ -72,11 +56,6 @@ class Format(models.TextChoices):
class Meta:
unique_together = (("project", "path"),)

-@property
-def is_asymmetric(self):
-"""Return True if this resource is in an asymmetric format."""
-return self.format in self.ASYMMETRIC_FORMATS
-
@property
def allows_empty_translations(self):
"""Return True if this resource allows empty translations."""
@@ -91,16 +70,3 @@ def __str__(self):
project=self.project.name,
resource=self.path,
)
-
-@classmethod
-def get_path_format(self, path):
-filename, extension = splitext(path)
-path_format = extension[1:].lower()
-
-# Special case: pot files are considered the po format
-if path_format == "pot":
-return "po"
-elif path_format == "xlf":
-return "xliff"
-else:
-return path_format
4 changes: 3 additions & 1 deletion pontoon/base/tests/__init__.py
@@ -90,7 +90,9 @@ def locales(self, create, extracted, **kwargs):

if extracted:
for locale in extracted:
-ProjectLocaleFactory.create(project=self, locale=locale)
+ProjectLocaleFactory.create(
+project=self, locale=locale, total_strings=self.total_strings
+)

@factory.post_generation
def repositories(self, create, extracted, **kwargs):
3 changes: 1 addition & 2 deletions pontoon/base/tests/models/test_entity.py
@@ -1,7 +1,6 @@
import pytest

from pontoon.base.models import ChangedEntityLocale, Entity, Project
-from pontoon.sync import KEY_SEPARATOR
from pontoon.test.factories import (
EntityFactory,
ResourceFactory,
@@ -40,7 +39,7 @@ def entity_test_models(translation_a, locale_b):
entity_b = EntityFactory(
resource=resourceX,
string="entity_b",
-key="Key%sentity_b" % KEY_SEPARATOR,
+key="Key\x04entity_b",
order=0,
)
translation_a_pl = TranslationFactory(
5 changes: 2 additions & 3 deletions pontoon/base/tests/models/test_repository.py
@@ -43,15 +43,14 @@ def test_repo_checkout_path_multi_locale(settings, repo_git):
@pytest.mark.django_db
def test_repo_checkout_path_source_repo(settings, repo_git):
"""
-The checkout_path for a source repo should end with a templates
+The checkout_path for a source repo should not end with a templates
directory.
"""
repo_git.source_repo = True
repo_git.url = "https://example.com/path/to/locale/"
repo_git.save()
assert repo_git.checkout_path == (
-"%s/projects/%s/path/to/locale/templates"
-% (settings.MEDIA_ROOT, repo_git.project.slug)
+"%s/projects/%s/path/to/locale" % (settings.MEDIA_ROOT, repo_git.project.slug)
)

