Allow arbitrary contents managers #24
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -2,7 +2,7 @@ | |||||
import sqlite3 | ||||||
import stat | ||||||
import time | ||||||
from typing import Optional | ||||||
from typing import Any, Callable, Dict, Optional, Union | ||||||
|
||||||
from jupyter_core.paths import jupyter_data_dir | ||||||
from traitlets import Int, TraitError, Unicode, validate | ||||||
|
@@ -37,35 +37,174 @@ def wrapped(self, *args, **kwargs): | |||||
return decorator | ||||||
|
||||||
|
||||||
class FileIdManager(LoggingConfigurable): | ||||||
class BaseFileIdManager(LoggingConfigurable): | ||||||
""" | ||||||
Manager that supports tracking files across their lifetime by associating | ||||||
each with a unique file ID, which is maintained across filesystem operations. | ||||||
|
||||||
Notes | ||||||
----- | ||||||
|
||||||
All private helper methods prefixed with an underscore (except `__init__()`) | ||||||
do NOT commit their SQL statements in a transaction via `self.con.commit()`. | ||||||
This responsibility is delegated to the public method calling them to | ||||||
increase performance. Committing multiple SQL transactions in serial is much | ||||||
slower than committing a single SQL transaction wrapping all SQL statements | ||||||
performed during a method's procedure body. | ||||||
Base class for File ID manager implementations. All File ID | ||||||
managers should inherit from this class. | ||||||
""" | ||||||
|
||||||
root_dir = Unicode( | ||||||
help=("The root being served by Jupyter server. Must be an absolute path."), config=True | ||||||
help=("The root directory being served by Jupyter server. Must be an absolute path."), | ||||||
config=False, | ||||||
) | ||||||
|
||||||
db_path = Unicode( | ||||||
default_value=default_db_path, | ||||||
help=( | ||||||
"The path of the DB file used by `FileIdManager`. " | ||||||
"The path of the DB file used by File ID manager implementations. " | ||||||
"Defaults to `jupyter_data_dir()/file_id_manager.db`." | ||||||
), | ||||||
config=True, | ||||||
) | ||||||
|
||||||
@validate("root_dir", "db_path") | ||||||
def _validate_abspath_traits(self, proposal): | ||||||
if proposal["value"] is None: | ||||||
raise TraitError(f"FileIdManager : {proposal['trait'].name} must not be None") | ||||||
if not os.path.isabs(proposal["value"]): | ||||||
raise TraitError(f"FileIdManager : {proposal['trait'].name} must be an absolute path") | ||||||
return proposal["value"] | ||||||
|
||||||
def __init__(self, *args, **kwargs): | ||||||
super().__init__(*args, **kwargs) | ||||||
|
||||||
def index(self, path: str) -> Union[int, str, None]: | ||||||
I don't think we can have dual forms of IDs as suggested here. Users should not be expected to change from
I see this was merged as I must have been editing this comment. This topic needs to be revisited, but we can let #3 be that forum.
To the best of my knowledge, this is not a public method and should not exist on the ABC. |
||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
|
||||||
def get_id(self, path: str) -> Union[int, str, None]: | ||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
Comment on lines +74 to +75
With a proper ABC definition Same goes for |
||||||
|
||||||
def get_path(self, id: Union[int, str]) -> Union[int, str, None]: | ||||||
Suggested change
|
||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
|
||||||
def move(self, old_path: str, new_path: str) -> Union[int, str, None]: | ||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
|
||||||
def copy(self, from_path: str, to_path: str) -> Union[int, str, None]: | ||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
|
||||||
def delete(self, path: str) -> None: | ||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
|
||||||
def save(self, path: str) -> None: | ||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
Comment on lines +80 to +90
To the best of my knowledge, these are not public methods and should not exist on the ABC. |
||||||
|
||||||
def get_handlers_by_action(self) -> Dict[str, Optional[Callable[[Dict[str, Any]], Any]]]: | ||||||
"""Returns a dictionary whose keys are contents manager event actions | ||||||
and whose values are callables invoked upon receipt of an event of the | ||||||
same action. The callable accepts the body of the event as its only | ||||||
argument. To ignore an event action, set the value to `None`.""" | ||||||
raise NotImplementedError("must be implemented by subclass") | ||||||
|
||||||
|
||||||
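The handler mapping described in the `get_handlers_by_action()` docstring above could be consumed roughly as follows. This is only a sketch: `RecordingManager`, `dispatch()`, and the `"action"` key on the event body are hypothetical names chosen for illustration, not part of this PR. Only the `"source_path"` and `"path"` keys are taken from the diff.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Handler = Optional[Callable[[Dict[str, Any]], Any]]


class RecordingManager:
    """Hypothetical stand-in for a File ID manager; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, ...]] = []

    def move(self, old_path: str, new_path: str) -> None:
        self.calls.append(("move", old_path, new_path))

    def delete(self, path: str) -> None:
        self.calls.append(("delete", path))

    def get_handlers_by_action(self) -> Dict[str, Handler]:
        return {
            "get": None,  # explicitly ignored
            "rename": lambda data: self.move(data["source_path"], data["path"]),
            "delete": lambda data: self.delete(data["path"]),
        }


def dispatch(manager: RecordingManager, event: Dict[str, Any]) -> bool:
    """Run the handler registered for event["action"] (an assumed key).

    Returns True if a handler ran, False if the action is mapped to None
    or is not present in the mapping at all.
    """
    handler = manager.get_handlers_by_action().get(event["action"])
    if handler is None:
        return False
    handler(event)
    return True


manager = RecordingManager()
handled_rename = dispatch(
    manager, {"action": "rename", "source_path": "a.ipynb", "path": "b.ipynb"}
)
handled_get = dispatch(manager, {"action": "get", "path": "b.ipynb"})
```

Mapping an action to `None` and omitting it entirely are treated the same way here; a real listener might prefer to log unknown actions instead.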
class ArbitraryFileIdManager(BaseFileIdManager): | ||||||
""" | ||||||
File ID manager that works on arbitrary filesystems. Each file is assigned a | ||||||
unique ID. The path is only updated upon calling `move()`, `copy()`, or | ||||||
`delete()`, e.g. upon receipt of contents manager events emitted by Jupyter | ||||||
Server 2. | ||||||
""" | ||||||
|
||||||
def __init__(self, *args, **kwargs): | ||||||
# pass args and kwargs to parent Configurable | ||||||
super().__init__(*args, **kwargs) | ||||||
# initialize instance attrs | ||||||
self._update_cursor = False | ||||||
# initialize connection with db | ||||||
self.log.info(f"ArbitraryFileIdManager : Configured root dir: {self.root_dir}") | ||||||
self.log.info(f"ArbitraryFileIdManager : Configured database path: {self.db_path}") | ||||||
self.con = sqlite3.connect(self.db_path) | ||||||
self.log.info("ArbitraryFileIdManager : Successfully connected to database file.") | ||||||
self.log.info("ArbitraryFileIdManager : Creating File ID tables and indices.") | ||||||
# do not allow reads to block writes. required when using multiple processes | ||||||
self.con.execute("PRAGMA journal_mode = WAL") | ||||||
self.con.execute( | ||||||
"CREATE TABLE IF NOT EXISTS Files(" | ||||||
"id INTEGER PRIMARY KEY AUTOINCREMENT, " | ||||||
"path TEXT NOT NULL UNIQUE" | ||||||
I think the
|
||||||
")" | ||||||
) | ||||||
self.con.execute("CREATE INDEX IF NOT EXISTS ix_Files_path ON Files (path)") | ||||||
self.con.commit() | ||||||
|
||||||
def index(self, path: str) -> int: | ||||||
Not public, should be prefixed with |
||||||
row = self.con.execute("SELECT id FROM Files WHERE path = ?", (path,)).fetchone() | ||||||
existing_id = row and row[0] | ||||||
|
||||||
if existing_id: | ||||||
return existing_id | ||||||
|
||||||
# create new record | ||||||
cursor = self.con.execute("INSERT INTO Files (path) VALUES (?)", (path,)) | ||||||
self.con.commit() | ||||||
return cursor.lastrowid # type:ignore | ||||||
|
||||||
def get_id(self, path: str) -> Optional[int]: | ||||||
row = self.con.execute("SELECT id FROM Files WHERE path = ?", (path,)).fetchone() | ||||||
return row and row[0] | ||||||
|
||||||
def get_path(self, id: Union[int, str]) -> Optional[str]: | ||||||
row = self.con.execute("SELECT path FROM Files WHERE id = ?", (id,)).fetchone() | ||||||
return row and row[0] | ||||||
|
||||||
def move(self, old_path: str, new_path: str) -> Optional[int]: | ||||||
row = self.con.execute("SELECT id FROM Files WHERE path = ?", (old_path,)).fetchone() | ||||||
id = row and row[0] | ||||||
|
||||||
if id: | ||||||
self.con.execute("UPDATE Files SET path = ? WHERE path = ?", (new_path, old_path)) | ||||||
else: | ||||||
cursor = self.con.execute("INSERT INTO Files (path) VALUES (?)", (new_path,)) | ||||||
id = cursor.lastrowid | ||||||
|
||||||
self.con.commit() | ||||||
return id | ||||||
|
||||||
def copy(self, from_path: str, to_path: str) -> Optional[int]: | ||||||
cursor = self.con.execute("INSERT INTO Files (path) VALUES (?)", (to_path,)) | ||||||
self.con.commit() | ||||||
return cursor.lastrowid | ||||||
|
||||||
def delete(self, path: str) -> None: | ||||||
self.con.execute("DELETE FROM Files WHERE path = ?", (path,)) | ||||||
self.con.commit() | ||||||
Comment on lines +150 to +170
Not public, should be prefixed with |
||||||
|
||||||
def save(self, path: str) -> None: | ||||||
return | ||||||
Comment on lines +172 to +173
This should be implemented to call Also, |
||||||
|
||||||
def get_handlers_by_action(self) -> Dict[str, Optional[Callable[[Dict[str, Any]], Any]]]: | ||||||
return { | ||||||
"get": None, | ||||||
"save": None, | ||||||
Comment on lines +177 to +178
Per previous comment, these should call |
||||||
"rename": lambda data: self.move(data["source_path"], data["path"]), | ||||||
"copy": lambda data: self.copy(data["source_path"], data["path"]), | ||||||
"delete": lambda data: self.delete(data["path"]), | ||||||
} | ||||||
|
||||||
def __del__(self): | ||||||
"""Cleans up `ArbitraryFileIdManager` by committing any pending | ||||||
transactions and closing the connection.""" | ||||||
if hasattr(self, "con"): | ||||||
self.con.commit() | ||||||
self.con.close() | ||||||
|
||||||
|
||||||
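To make the `ArbitraryFileIdManager` behavior above easier to follow, here is a minimal sketch of the same path-keyed table run against an in-memory SQLite database. The module-level functions are simplified stand-ins for the methods in the diff (no traitlets, no logging), and the file names are made up for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE IF NOT EXISTS Files("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "path TEXT NOT NULL UNIQUE"
    ")"
)
con.commit()


def index(path):
    """Return the ID for `path`, creating a record if none exists."""
    row = con.execute("SELECT id FROM Files WHERE path = ?", (path,)).fetchone()
    if row:
        return row[0]
    cursor = con.execute("INSERT INTO Files (path) VALUES (?)", (path,))
    con.commit()
    return cursor.lastrowid


def get_path(file_id):
    """Return the path recorded for `file_id`, or None if there is none."""
    row = con.execute("SELECT path FROM Files WHERE id = ?", (file_id,)).fetchone()
    return row[0] if row else None


def move(old_path, new_path):
    """Point the existing record at `new_path`; the ID is unchanged."""
    con.execute("UPDATE Files SET path = ? WHERE path = ?", (new_path, old_path))
    con.commit()


def delete(path):
    """Drop the record for `path`."""
    con.execute("DELETE FROM Files WHERE path = ?", (path,))
    con.commit()


first_id = index("a.ipynb")
move("a.ipynb", "b.ipynb")
path_after_move = get_path(first_id)      # the ID follows the file
delete("b.ipynb")
path_after_delete = get_path(first_id)    # the record is gone
second_id = index("c.ipynb")              # AUTOINCREMENT does not reuse IDs
```

Because the table is keyed only on `path`, the ID survives a rename only when `move()` is actually called; a rename performed outside the server would not be noticed by this scheme.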
class LocalFileIdManager(BaseFileIdManager): | ||||||
""" | ||||||
File ID manager that supports tracking files in local filesystems by | ||||||
associating each with a unique file ID, which is maintained across | ||||||
filesystem operations. | ||||||
|
||||||
Notes | ||||||
----- | ||||||
All private helper methods prefixed with an underscore (except `__init__()`) | ||||||
do NOT commit their SQL statements in a transaction via `self.con.commit()`. | ||||||
This responsibility is delegated to the public method calling them to | ||||||
increase performance. Committing multiple SQL transactions in serial is much | ||||||
slower than committing a single SQL transaction wrapping all SQL statements | ||||||
performed during a method's procedure body. | ||||||
""" | ||||||
|
||||||
autosync_interval = Int( | ||||||
default_value=5, | ||||||
help=( | ||||||
|
@@ -84,11 +223,11 @@ def __init__(self, *args, **kwargs): | |||||
self._update_cursor = False | ||||||
self._last_sync = 0.0 | ||||||
# initialize connection with db | ||||||
self.log.info(f"FileIdManager : Configured root dir: {self.root_dir}") | ||||||
self.log.info(f"FileIdManager : Configured database path: {self.db_path}") | ||||||
self.log.info(f"LocalFileIdManager : Configured root dir: {self.root_dir}") | ||||||
self.log.info(f"LocalFileIdManager : Configured database path: {self.db_path}") | ||||||
self.con = sqlite3.connect(self.db_path) | ||||||
self.log.info("FileIdManager : Successfully connected to database file.") | ||||||
self.log.info("FileIdManager : Creating File ID tables and indices.") | ||||||
self.log.info("LocalFileIdManager : Successfully connected to database file.") | ||||||
self.log.info("LocalFileIdManager : Creating File ID tables and indices.") | ||||||
# do not allow reads to block writes. required when using multiple processes | ||||||
self.con.execute("PRAGMA journal_mode = WAL") | ||||||
self.con.execute( | ||||||
|
@@ -109,14 +248,6 @@ def __init__(self, *args, **kwargs): | |||||
self.con.execute("CREATE INDEX IF NOT EXISTS ix_Files_is_dir ON Files (is_dir)") | ||||||
self.con.commit() | ||||||
|
||||||
@validate("root_dir", "db_path") | ||||||
def _validate_abspath_traits(self, proposal): | ||||||
if proposal["value"] is None: | ||||||
raise TraitError(f"FileIdManager : {proposal['trait'].name} must not be None") | ||||||
if not os.path.isabs(proposal["value"]): | ||||||
raise TraitError(f"FileIdManager : {proposal['trait'].name} must be an absolute path") | ||||||
return self._normalize_path(proposal["value"]) | ||||||
|
||||||
def _index_all(self): | ||||||
"""Recursively indexes all directories under the server root.""" | ||||||
self._index_dir_recursively(self.root_dir, self._stat(self.root_dir)) | ||||||
|
@@ -598,8 +729,17 @@ def save(self, path): | |||||
self._update(id, stat_info) | ||||||
self.con.commit() | ||||||
|
||||||
def get_handlers_by_action(self) -> Dict[str, Optional[Callable[[Dict[str, Any]], Any]]]: | ||||||
return { | ||||||
"get": None, | ||||||
"save": lambda data: self.save(data["path"]), | ||||||
"rename": lambda data: self.move(data["source_path"], data["path"]), | ||||||
"copy": lambda data: self.copy(data["source_path"], data["path"]), | ||||||
"delete": lambda data: self.delete(data["path"]), | ||||||
} | ||||||
|
||||||
def __del__(self): | ||||||
"""Cleans up `FileIdManager` by committing any pending transactions and | ||||||
"""Cleans up `LocalFileIdManager` by committing any pending transactions and | ||||||
closing the connection.""" | ||||||
if hasattr(self, "con"): | ||||||
self.con.commit() | ||||||
|
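The `Notes` section of the `LocalFileIdManager` docstring says that private helpers do not commit and that the calling public method commits once. A minimal sketch of that convention, assuming Python's default `sqlite3` transaction handling, might look like this. `_create()` and `index_many()` are hypothetical names, not methods from this PR.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Files("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "path TEXT NOT NULL UNIQUE"
    ")"
)
con.commit()


def _create(path):
    """Private helper: issues SQL but deliberately does NOT commit."""
    return con.execute("INSERT INTO Files (path) VALUES (?)", (path,)).lastrowid


def index_many(paths):
    """Public method: runs several helpers, then commits exactly once."""
    ids = [_create(path) for path in paths]
    # The INSERTs above share one open transaction until commit() is called.
    pending_before_commit = con.in_transaction
    con.commit()
    return ids, pending_before_commit


ids, pending = index_many(["a.txt", "b.txt", "c.txt"])
row_count = con.execute("SELECT COUNT(*) FROM Files").fetchone()[0]
```

All three inserts are written in a single transaction, which is the pattern the docstring recommends over committing after each statement.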
This should derive from `ABC` along with a metaclass definition in order for `@abstractmethod` decorators to be effective. A good example of this can be found here. By not deriving `BaseFileIdManager` from `ABC`, `@abstractmethod` decorators do not work (and I see those too have been removed). As a result, a subclass that fails to implement a required method will not be discovered until that method is called, rather than when the class is instantiated. Proper decoration also prevents `BaseFileIdManager` from being instantiated (i.e., it becomes a true abstract base class).
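A minimal sketch of the pattern this comment describes, using stand-in classes rather than the real traitlets types: `ConfigurableMeta` and `Configurable` are hypothetical placeholders for `LoggingConfigurable` and its metaclass, and `FileIdManagerMeta` is an assumed name for the combined metaclass. Only two of the abstract methods are shown.

```python
from abc import ABC, ABCMeta, abstractmethod
from typing import Optional


class ConfigurableMeta(type):
    """Hypothetical stand-in for the metaclass of a configurable base."""


class Configurable(metaclass=ConfigurableMeta):
    """Hypothetical stand-in for `LoggingConfigurable`."""


class FileIdManagerMeta(ABCMeta, ConfigurableMeta):
    """Combined metaclass so both base classes' metaclasses are satisfied."""


class BaseFileIdManager(ABC, Configurable, metaclass=FileIdManagerMeta):
    @abstractmethod
    def get_id(self, path: str) -> Optional[int]:
        ...

    @abstractmethod
    def get_path(self, id: int) -> Optional[str]:
        ...


class IncompleteManager(BaseFileIdManager):
    """Forgets to implement get_path()."""

    def get_id(self, path: str) -> Optional[int]:
        return None


class CompleteManager(BaseFileIdManager):
    def get_id(self, path: str) -> Optional[int]:
        return None

    def get_path(self, id: int) -> Optional[str]:
        return None


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


base_ok = can_instantiate(BaseFileIdManager)        # abstract base
incomplete_ok = can_instantiate(IncompleteManager)  # missing get_path()
complete_ok = can_instantiate(CompleteManager)      # implements everything
```

With this arrangement the missing method is reported when the subclass is instantiated, not when the method is first called, which is the difference the comment points out relative to raising `NotImplementedError` inside each method.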