-
Notifications
You must be signed in to change notification settings - Fork 992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Locks + cache for multiple revisions #8510
Closed
Closed
Changes from 36 commits
Commits
Show all changes
68 commits
Select commit
Hold shift + click to select a range
03a2833
classes for a Conan cache
jgsogo 0b1d298
implement locks on top of a database
jgsogo 78d9152
create a manager and some lockable resource
jgsogo 15fc970
class per file
jgsogo 602d45a
Add some testing for basic behavior
jgsogo c274ad4
Add some testing for basic behavior
jgsogo 099776f
wip
jgsogo 9e85c3d
work on folders
jgsogo 893ee3f
moving folders when the rrev is known
jgsogo 056b452
first running impl
jgsogo 0dcbd71
add comment
jgsogo 7cc54d1
add comment
jgsogo deb051b
small testing
jgsogo 9df88a6
add more tests
jgsogo 062d33f
add more tests
jgsogo 4454c18
check folders are retrieved from database
jgsogo 1850271
Merge branch 'feat/cache' of github.com:jgsogo/conan into feat/cache
jgsogo 4b2224d
add default path for packages
jgsogo ee889b7
working with multiprocessing
jgsogo d068fcc
organize fixtures
jgsogo 9117a64
use URI to work with memory sqlite3 databases
jgsogo a10a92f
use URI to work with memory sqlite3 databases
jgsogo 25b15f5
rename functions
jgsogo d857f53
reuse sqlite3 initializations
jgsogo ff85569
replicate scenario to install package
jgsogo fd2cfc6
typo
jgsogo eecd0a1
testing using external tools
jgsogo 907b3aa
remove files
jgsogo 13484d7
fix import
jgsogo d8a7b04
Remove LockableResource (no use case so far)
jgsogo 3d6c9dc
dump to output buffer
jgsogo 43c0784
Add meaning exceptions for locks
jgsogo 11cfbb3
exceptions for cache
jgsogo ce12419
update comments
jgsogo 628a606
Prevent SQL injection cc/ @SSE4
jgsogo 622c24b
simplify function interfaces
jgsogo 2fefe1c
check reference (at least) is et
jgsogo 63b4689
lock the database while we are operating on it
jgsogo b1d35f1
test using external script (expecting to reproduce two processes)
jgsogo 9fcb580
no need to rollback if we are closing the database
jgsogo 1dde7ba
remove useless lines
jgsogo dc94417
add backend for locks using fasteners library (failure in tests)
jgsogo d82ff38
testing working with fasterners
jgsogo cdf1587
kill a process, leftovers are not removed :(
jgsogo 15e5fa7
reorganize cache_folders
jgsogo 4f44e3d
rename classes (now names too long)
jgsogo aa40da5
move functions inside
jgsogo 4e0c2e4
add comments
jgsogo b3598cf
package_layout is not a member of references
jgsogo 6946db8
basic operations for table of prefs
jgsogo 1b825dc
basic testing for packages
jgsogo c605699
tests for packages table
jgsogo 2bbdc52
latest pref, tests
jgsogo 63be035
table to store folders
jgsogo cf79d39
manage folders, update timestamp for LRU
jgsogo a38a6be
migrating test to new implementation
jgsogo d6fba26
add uniqueness checks
jgsogo 69d067c
use new database with tables approach
jgsogo e641c85
remove old directories table/database
jgsogo 81df9e4
there exists a two-level cache
jgsogo 1666cfc
use interface to ensure we provide same methods for all cache impleme…
jgsogo 522b839
use db with 'get' and 'get_or_create' API
jgsogo ea7059c
get_or_create need to work at database level (for atomicity)
jgsogo 26fb724
provide safe[r] implementation for read-only cache
jgsogo 3879d87
remove dead code
jgsogo bfc58aa
functions to list, search,... references from one cache
jgsogo e67b028
Add docu
jgsogo 8079715
list packages given a reference (search, list, find package-id)
jgsogo File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
# Conan Cache | ||
|
||
## Considerations | ||
* In the codebase I want to use objects like `recipe_layout` | ||
or `package_layout`, I don't want to carry always the `cache` object | ||
together with the `ConanFileReference` or `PackageReference`. | ||
|
||
+ **Consequence**: the lock is not adquired at the momento of getting | ||
the `RecipeLayout` or `PackageLayout` but when it is going to be used. | ||
|
||
## Alternatives | ||
|
||
1. Before using anything from a layout, you need to indicate the ownership | ||
you want | ||
|
||
1. All operations run inside the layout (read files, write,...) | ||
|
||
1. Return a lock-object together with the information and let the user decide | ||
what to do with it. | ||
|
||
|
||
## SQlite3 | ||
|
||
According to docs, it is safe to use SQlite3 from different processes in a | ||
concurrent way. It manages several readers at the same time and only one | ||
writter at the same time ([info](https://sqlite.org/faq.html#q5), | ||
[more info](https://www.sqlite.org/lockingv3.html)). | ||
|
||
According to [Python docs](https://docs.python.org/3/library/sqlite3.html), | ||
it is also safe: | ||
|
||
> When a database is accessed by multiple connections, and one of the | ||
> processes modifies the database, the SQLite database is locked until | ||
> that transaction is committed. The timeout parameter specifies how | ||
> long the connection should wait for the lock to go away until raising | ||
> an exception. The default for the timeout parameter is 5.0 (five seconds). | ||
|
||
For the sake of the operations we will be running the time spent by the | ||
read-write operations is not worth considered (TBD) taking into account other | ||
Conan operations. | ||
|
||
|
||
## Cache folders | ||
|
||
For each reference we need the following folders. Some folders are needed | ||
before we know the final destination, if we want a deterministic cache layout | ||
we need to move them **after** being used (is this a time issue?). | ||
|
||
Some folders can't be deterministic as they depend on things that aren't, | ||
like folders that encode the `prev`. Only when using a lockfile we will | ||
know the `prev` in advance (or downloading from remotes). | ||
|
||
### [tmp]/export | ||
|
||
It is needed in order to compute the _recipe revision_. | ||
|
||
### [rrev]/export | ||
|
||
The place where `conanfile.py` and other exported files are located. | ||
|
||
### [rrev]/export_source | ||
|
||
Source files exported after the recipe. | ||
|
||
### [rrev]/source | ||
|
||
Resulting files after running `source()` function. Conan v2 should forbid | ||
usage of `settings` or `options`. This folder is shared for all the | ||
packages. | ||
|
||
### [tmp]/build | ||
|
||
Needed to build the package. It should be associated to the generated | ||
`prev` (non deterministic builds), but the `prev` is not known yet. | ||
|
||
### [tmp]/package | ||
|
||
A place to put the generated files in order to compute the _package revision_ | ||
|
||
### [prev]/build | ||
|
||
Final place for the _build_ folder. BIG ISSUE: if we move the files here | ||
after the build, maybe you are no longer capable of debugging packages | ||
you've built locally! | ||
|
||
### [prev]/package | ||
|
||
Final place for the package. | ||
|
||
### [rrev]/dl | ||
|
||
Place for downloaded `conan_export.tgz` and `conan_sources.tgz` files | ||
|
||
### [prev]/dl | ||
|
||
Place for downloaded `conan_package.tgz` files. | ||
|
||
|
||
|
||
* We need some temporal folder to compute the recipe-revision, then | ||
we can move everything to the final destination (`export` folder). | ||
|
||
* |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
import os | ||
import shutil | ||
from io import StringIO | ||
from typing import Optional, Union | ||
|
||
from conan.cache.cache_database import CacheDatabase, CacheDatabaseSqlite3Filesystem, \ | ||
CacheDatabaseSqlite3Memory | ||
from conan.cache.recipe_layout import RecipeLayout | ||
from conan.locks.locks_manager import LocksManager | ||
from conans.model.ref import ConanFileReference, PackageReference | ||
|
||
|
||
# TODO: Random folders are no longer accessible, how to get rid of them asap? | ||
# TODO: Add timestamp for LRU | ||
# TODO: We need the workflow to remove existing references. | ||
|
||
|
||
class Cache: | ||
def __init__(self, base_folder: str, backend: CacheDatabase, locks_manager: LocksManager): | ||
self._base_folder = base_folder | ||
self._locks_manager = locks_manager | ||
self._backend = backend | ||
|
||
@staticmethod | ||
def create(backend_id: str, base_folder: str, locks_manager: LocksManager, **backend_kwargs): | ||
if backend_id == 'sqlite3': | ||
backend = CacheDatabaseSqlite3Filesystem(**backend_kwargs) | ||
backend.create_table(if_not_exists=True) | ||
return Cache(base_folder, backend, locks_manager) | ||
elif backend_id == 'memory': | ||
backend = CacheDatabaseSqlite3Memory(**backend_kwargs) | ||
backend.create_table(if_not_exists=True) | ||
return Cache(base_folder, backend, locks_manager) | ||
else: | ||
raise NotImplementedError(f'Backend {backend_id} for cache is not implemented') | ||
|
||
def dump(self, output: StringIO): | ||
""" Maybe just for debugging purposes """ | ||
self._backend.dump(output) | ||
|
||
@property | ||
def base_folder(self) -> str: | ||
return self._base_folder | ||
|
||
def get_reference_layout(self, ref: ConanFileReference) -> RecipeLayout: | ||
return RecipeLayout(ref, cache=self, manager=self._locks_manager) | ||
|
||
@staticmethod | ||
def get_default_path(item: Union[ConanFileReference, PackageReference]): | ||
if item.revision: | ||
return item.full_str().replace('@', '/').replace('#', '/').replace(':', '/') # TODO: TBD | ||
else: | ||
return None | ||
|
||
def _move_rrev(self, old_ref: ConanFileReference, new_ref: ConanFileReference, | ||
move_reference_contents: bool = False) -> Optional[str]: | ||
# Once we know the revision for a given reference, we need to update information in the | ||
# backend and we might want to move folders. | ||
# TODO: Add a little bit of all-or-nothing aka rollback | ||
|
||
self._backend.update_rrev(old_ref, new_ref) | ||
|
||
if move_reference_contents: | ||
old_path, created = self._backend.get_or_create_directory(new_ref) | ||
assert not created, "We've just updated it two lines above!" | ||
new_path = self.get_default_path(new_ref) | ||
self._backend.update_path(new_ref, new_path) | ||
if os.path.exists(old_path): | ||
shutil.move(old_path, new_path) | ||
return new_path | ||
else: | ||
return None | ||
|
||
def _move_prev(self, old_pref: PackageReference, new_pref: PackageReference, | ||
move_package_contents: bool = False) -> Optional[str]: | ||
# TODO: Add a little bit of all-or-nothing aka rollback | ||
self._backend.update_prev(old_pref, new_pref) | ||
if move_package_contents: | ||
old_path, created = self._backend.get_or_create_directory(new_pref.ref, new_pref) | ||
assert not created, "We've just updated it two lines above!" | ||
new_path = self.get_default_path(new_pref) | ||
self._backend.update_path(new_pref, new_path) | ||
if os.path.exists(old_path): | ||
shutil.move(old_path, new_path) | ||
return new_path | ||
else: | ||
return None |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
import time | ||
import uuid | ||
from enum import Enum, unique | ||
from io import StringIO | ||
from typing import Tuple, Union | ||
|
||
from conan.cache.exceptions import DuplicateReferenceException, DuplicatePackageReferenceException | ||
from conan.utils.sqlite3 import Sqlite3MemoryMixin, Sqlite3FilesystemMixin | ||
from conans.model.ref import ConanFileReference, PackageReference | ||
|
||
|
||
@unique | ||
class ConanFolders(Enum): | ||
REFERENCE = 0 | ||
PKG_BUILD = 1 | ||
PKG_PACKAGE = 2 | ||
|
||
|
||
class CacheDatabase: | ||
_table_name = "conan_cache_directories" | ||
_column_ref = 'reference' | ||
_column_ref_name = 'reference_name' | ||
_column_rrev = 'rrev' | ||
_column_pkgid = 'pkgid' | ||
_column_prev = 'prev' | ||
_column_path = 'relpath' | ||
_column_folder = 'folder' | ||
_column_last_modified = 'last_modified' | ||
|
||
def create_table(self, if_not_exists: bool = True): | ||
guard = 'IF NOT EXISTS' if if_not_exists else '' | ||
query = f""" | ||
CREATE TABLE {guard} {self._table_name} ( | ||
{self._column_ref} text NOT NULL, | ||
{self._column_ref_name} text NOT NULL, | ||
{self._column_rrev} text, | ||
{self._column_pkgid} text, | ||
{self._column_prev} text, | ||
{self._column_path} text NOT NULL, | ||
{self._column_folder} integer NOT NULL CHECK ({self._column_folder} IN (0,1, 2)), | ||
{self._column_last_modified} integer NOT NULL | ||
); | ||
""" | ||
# TODO: Need to add some timestamp for LRU removal | ||
with self.connect() as conn: | ||
conn.execute(query) | ||
|
||
def dump(self, output: StringIO): | ||
with self.connect() as conn: | ||
r = conn.execute(f'SELECT * FROM {self._table_name}') | ||
for it in r.fetchall(): | ||
output.write(str(it) + '\n') | ||
|
||
def _get_random_directory(self, item: Union[ConanFileReference, PackageReference]) -> str: | ||
# TODO: We could implement deterministic output for some inputs, not now. | ||
# TODO: If we are creating the 'path' here, we need the base_folder (and lock depending on implementation) | ||
return str(uuid.uuid4()) | ||
|
||
def _where_reference_clause(self, ref: ConanFileReference, filter_packages: bool) -> dict: | ||
where_clauses = { | ||
self._column_ref: str(ref), | ||
self._column_rrev: ref.revision if ref.revision else None, | ||
} | ||
if filter_packages: | ||
where_clauses.update({ | ||
self._column_pkgid: None, | ||
self._column_prev: None | ||
}) | ||
return where_clauses | ||
|
||
def _where_package_reference_clause(self, pref: PackageReference) -> dict: | ||
where_clauses = self._where_reference_clause(pref.ref, False) | ||
where_clauses.update({ | ||
self._column_pkgid: pref.id if pref else None, | ||
self._column_prev: pref.revision if pref and pref.revision else None | ||
}) | ||
return where_clauses | ||
|
||
def _where_clause(self, item: Union[ConanFileReference, PackageReference], | ||
filter_packages: bool) -> Tuple[str, Tuple]: | ||
if isinstance(item, ConanFileReference): | ||
where_clauses = self._where_reference_clause(item, filter_packages) | ||
else: | ||
assert filter_packages, 'If using PackageReference then it WILL filter by packages' | ||
where_clauses = self._where_package_reference_clause(item) | ||
|
||
def cmp_expr(k, v): | ||
return f'{k} = ?' if v is not None else f'{k} IS ?' | ||
|
||
where_expr = ' AND '.join([cmp_expr(k, v) for k, v in where_clauses.items()]) | ||
where_values = tuple(where_clauses.values()) | ||
return where_expr, where_values | ||
|
||
def get_or_create_directory(self, item: Union[ConanFileReference, PackageReference], | ||
default_path: str = None) -> Tuple[str, bool]: | ||
# reference = str(ref) | ||
# assert reference, "Empty reference cannot get into the cache" | ||
# assert not pref or ref == pref.ref, "Both parameters should belong to the same reference" | ||
|
||
# Search the database | ||
where_clause, where_values = self._where_clause(item, filter_packages=True) | ||
query = f'SELECT {self._column_path} ' \ | ||
f'FROM {self._table_name} ' \ | ||
f'WHERE {where_clause};' | ||
|
||
with self.connect() as conn: | ||
r = conn.execute(query, where_values) | ||
rows = r.fetchall() | ||
assert len(rows) <= 1, f"Unique entry expected... found {rows}," \ | ||
f" for where clause {where_clause}" # TODO: Ensure this uniqueness | ||
if not rows: | ||
path = default_path or self._get_random_directory(item) | ||
ref = item if isinstance(item, ConanFileReference) else item.ref | ||
pref = item if isinstance(item, PackageReference) else None | ||
values = (str(ref), | ||
ref.name, | ||
ref.revision if ref.revision else None, | ||
pref.id if pref else None, | ||
pref.revision if pref and pref.revision else None, | ||
path, | ||
ConanFolders.REFERENCE.value, | ||
int(time.time())) | ||
conn.execute(f'INSERT INTO {self._table_name} ' | ||
f'VALUES (?, ?, ?, ?, ?, ?, ?, ?)', values) | ||
return path, True | ||
else: | ||
return rows[0][0], False | ||
|
||
def update_rrev(self, old_ref: ConanFileReference, new_ref: ConanFileReference): | ||
with self.connect() as conn: | ||
# Check if the new_ref already exists, if not, we can move the old_one | ||
where_clause, where_values = self._where_clause(new_ref, filter_packages=False) | ||
query_exists = f'SELECT EXISTS(SELECT 1 ' \ | ||
f'FROM {self._table_name} ' \ | ||
f'WHERE {where_clause})' | ||
r = conn.execute(query_exists, where_values) | ||
if r.fetchone()[0] == 1: | ||
raise DuplicateReferenceException(new_ref) | ||
|
||
where_clause, where_values = self._where_clause(old_ref, filter_packages=False) | ||
query = f"UPDATE {self._table_name} " \ | ||
f"SET {self._column_rrev} = '{new_ref.revision}' " \ | ||
f"WHERE {where_clause}" | ||
r = conn.execute(query, where_values) | ||
assert r.rowcount > 0 | ||
|
||
def update_prev(self, old_pref: PackageReference, new_pref: PackageReference): | ||
with self.connect() as conn: | ||
# Check if the new_pref already exists, if not, we can move the old_one | ||
where_clause, where_values = self._where_clause(new_pref, filter_packages=True) | ||
query_exists = f'SELECT EXISTS(SELECT 1 ' \ | ||
f'FROM {self._table_name} ' \ | ||
f'WHERE {where_clause})' | ||
r = conn.execute(query_exists, where_values) | ||
if r.fetchone()[0] == 1: | ||
raise DuplicatePackageReferenceException(new_pref) | ||
|
||
where_clause, where_values = self._where_clause(old_pref, filter_packages=True) | ||
query = f"UPDATE {self._table_name} " \ | ||
f"SET {self._column_prev} = '{new_pref.revision}' " \ | ||
f"WHERE {where_clause}" | ||
r = conn.execute(query, where_values) | ||
assert r.rowcount > 0 | ||
|
||
def update_path(self, item: Union[ConanFileReference, PackageReference], new_path: str): | ||
where_clause, where_values = self._where_clause(item, filter_packages=True) | ||
query = f"UPDATE {self._table_name} " \ | ||
f"SET {self._column_path} = '{new_path}' " \ | ||
f"WHERE {where_clause}" | ||
with self.connect() as conn: | ||
r = conn.execute(query, where_values) | ||
assert r.rowcount > 0 | ||
|
||
|
||
class CacheDatabaseSqlite3Memory(CacheDatabase, Sqlite3MemoryMixin): | ||
pass | ||
|
||
|
||
class CacheDatabaseSqlite3Filesystem(CacheDatabase, Sqlite3FilesystemMixin): | ||
pass |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer composition over inheritance. Try to avoid multiple inheritance too.
Specially, the "Memory" domain should exist in test scope only, not in production code.
Please, use composition for these, and have the Sqlite3Memory functionality defined in testing only, injected as a dependency, or patched.