Closes #11558: Add support for remote data sources (#11646)
* WIP

* WIP

* Add git sync

* Fix file hashing

* Add last_synced to DataSource

* Build out UI & API resources

* Add status field to DataSource

* Add UI control to sync data source

* Add API endpoint to sync data sources

* Fix display of DataSource job results

* DataSource password should be write-only

* General cleanup

* Add data file UI view

* Punt on HTTP, FTP support for now

* Add DataSource URL validation

* Add HTTP proxy support to git fetcher

* Add management command to sync data sources

* DataFile REST API endpoints should be read-only

* Refactor fetch methods into backend classes

* Replace auth & git branch fields with general-purpose parameters

* Fix last_synced time

* Render discrete form fields for backend parameters

* Enable dynamic edit form for DataSource

* Register DataBackend classes in application registry

* Add search indexers for DataSource, DataFile

* Add single & bulk delete views for DataFile

* Add model documentation

* Convert DataSource to a primary model

* Introduce pre_sync & post_sync signals

* Clean up migrations

* Rename url to source_url

* Clean up filtersets

* Add API & filterset tests

* Add view tests

* Add initSelect() to HTMX refresh handler

* Render DataSourceForm fieldsets dynamically

* Update compiled static resources
jeremystretch committed Feb 2, 2023
1 parent c8779a8 commit 4d87ce5
Showing 53 changed files with 1,865 additions and 14 deletions.
25 changes: 25 additions & 0 deletions docs/models/core/datafile.md
@@ -0,0 +1,25 @@
# Data Files

A data file object is the representation in NetBox's database of some file belonging to a remote [data source](./datasource.md). Data files are synchronized automatically, and cannot be modified locally (although they can be deleted).

## Fields

### Source

The [data source](./datasource.md) to which this file belongs.

### Path

The path to the file, relative to its source's URL. For example, a file at `/opt/config-data/routing/bgp/peer.yaml` with a source URL of `file:///opt/config-data/` would have its path set to `routing/bgp/peer.yaml`.
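
The relationship between a source URL and a file's stored path can be sketched as follows. This is only an illustration built on the standard library; the helper name `relative_data_path` is hypothetical and is not part of NetBox.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def relative_data_path(source_url: str, absolute_path: str) -> str:
    """Return absolute_path relative to the directory named by a file:// URL."""
    # urlparse() strips the scheme, leaving the filesystem path of the source root
    root = PurePosixPath(urlparse(source_url).path)
    return str(PurePosixPath(absolute_path).relative_to(root))


print(relative_data_path(
    "file:///opt/config-data/",
    "/opt/config-data/routing/bgp/peer.yaml",
))  # prints routing/bgp/peer.yaml
```

`relative_to()` raises `ValueError` when the file does not live under the source root, which is a reasonable failure mode for a path that cannot belong to the source.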

### Last Updated

The date and time at which the file was most recently updated from its source. Note that this attribute is updated only when the file's contents have been modified. Re-synchronizing the data source will not update this timestamp if the upstream file's data has not changed.

### Size

The file's size, in bytes.

### Hash

A [SHA256 hash](https://en.wikipedia.org/wiki/SHA-2) of the file's data. This can be compared to a hash taken from the original file to determine whether any changes have been made.
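
As a rough sketch of that comparison, the snippet below hashes a file's bytes with Python's `hashlib` and checks the result against a previously stored digest. The function names are illustrative assumptions, not NetBox's own helpers.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hexadecimal SHA256 digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def has_changed(stored_hash: str, current_data: bytes) -> bool:
    """Compare a previously stored hash against freshly read contents."""
    return sha256_hex(current_data) != stored_hash


original = b"router bgp 65000\n"
stored = sha256_hex(original)
print(has_changed(stored, original))           # False: identical bytes
print(has_changed(stored, original + b"!\n"))  # True: contents differ
```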
47 changes: 47 additions & 0 deletions docs/models/core/datasource.md
@@ -0,0 +1,47 @@
# Data Sources

A data source represents some external repository of data which NetBox can consume, such as a git repository. Files within the data source are synchronized to NetBox by saving them in the database as [data file](./datafile.md) objects.

## Fields

### Name

The data source's human-friendly name.

### Type

The type of data source. Supported options include:

* Local directory
* git repository

### URL

The URL identifying the remote source. Some examples are included below.

| Type | Example URL |
|------|-------------|
| Local | file:///var/my/data/source/ |
| git | https://github.com/my-organization/my-repo |
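
One way to sanity-check a URL against its source type is to inspect the scheme. The sketch below assumes a hypothetical mapping of types to schemes; the validation rules NetBox actually applies are not shown in this commit excerpt.

```python
from urllib.parse import urlparse

# Hypothetical mapping of data source type to acceptable URL schemes
ALLOWED_SCHEMES = {
    "local": {"file"},
    "git": {"http", "https", "git"},
}


def scheme_is_valid(source_type: str, source_url: str) -> bool:
    """Return True if the URL's scheme is acceptable for the given type."""
    scheme = urlparse(source_url).scheme.lower()
    return scheme in ALLOWED_SCHEMES.get(source_type, set())


print(scheme_is_valid("local", "file:///var/my/data/source/"))
print(scheme_is_valid("git", "https://github.com/my-organization/my-repo"))
```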

### Status

The source's current synchronization status. This cannot be set manually; it is updated automatically when the source is synchronized.

### Enabled

If false, synchronization will be disabled.

### Ignore Rules

A set of rules (one per line) identifying filenames to ignore during synchronization. Some examples are provided below. See Python's [`fnmatch()` documentation](https://docs.python.org/3/library/fnmatch.html) for a complete reference.

| Rule | Description |
|----------------|------------------------------------------|
| `README` | Ignore any files named `README` |
| `*.txt` | Ignore any files with a `.txt` extension |
| `data???.json` | Ignore e.g. `data123.json` |
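
The rules in the table can be tried out directly with the standard library. This is a minimal sketch that assumes each rule is matched case-sensitively against the bare filename; `is_ignored` is a hypothetical helper, not NetBox's implementation.

```python
from fnmatch import fnmatchcase

# One rule per line, as entered in the Ignore Rules field
IGNORE_RULES = """README
*.txt
data???.json"""


def is_ignored(filename: str, rules: str = IGNORE_RULES) -> bool:
    """Return True if filename matches any non-empty rule line."""
    return any(
        fnmatchcase(filename, rule.strip())
        for rule in rules.splitlines()
        if rule.strip()
    )


for name in ("README", "notes.txt", "data123.json", "data1234.json", "peer.yaml"):
    print(name, is_ignored(name))
```

Note that `?` matches exactly one character, so `data???.json` matches `data123.json` but not `data1234.json`.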

### Last Synced

The date and time at which the source was most recently synchronized successfully.
Empty file added netbox/core/__init__.py
Empty file.
Empty file added netbox/core/api/__init__.py
Empty file.
25 changes: 25 additions & 0 deletions netbox/core/api/nested_serializers.py
@@ -0,0 +1,25 @@
from rest_framework import serializers

from core.models import *
from netbox.api.serializers import WritableNestedSerializer

__all__ = [
    'NestedDataFileSerializer',
    'NestedDataSourceSerializer',
]


class NestedDataSourceSerializer(WritableNestedSerializer):
    url = serializers.HyperlinkedIdentityField(view_name='core-api:datasource-detail')

    class Meta:
        model = DataSource
        fields = ['id', 'url', 'display', 'name']


class NestedDataFileSerializer(WritableNestedSerializer):
    url = serializers.HyperlinkedIdentityField(view_name='core-api:datafile-detail')

    class Meta:
        model = DataFile
        fields = ['id', 'url', 'display', 'path']
51 changes: 51 additions & 0 deletions netbox/core/api/serializers.py
@@ -0,0 +1,51 @@
from rest_framework import serializers

from core.choices import *
from core.models import *
from netbox.api.fields import ChoiceField
from netbox.api.serializers import NetBoxModelSerializer
from .nested_serializers import *

__all__ = (
    'DataFileSerializer',
    'DataSourceSerializer',
)

class DataSourceSerializer(NetBoxModelSerializer):
    url = serializers.HyperlinkedIdentityField(
        view_name='core-api:datasource-detail'
    )
    type = ChoiceField(
        choices=DataSourceTypeChoices
    )
    status = ChoiceField(
        choices=DataSourceStatusChoices,
        read_only=True
    )

    # Related object counts
    file_count = serializers.IntegerField(
        read_only=True
    )

    class Meta:
        model = DataSource
        fields = [
            'id', 'url', 'display', 'name', 'type', 'source_url', 'enabled', 'status', 'description', 'comments',
            'parameters', 'ignore_rules', 'created', 'last_updated', 'file_count',
        ]


class DataFileSerializer(NetBoxModelSerializer):
    url = serializers.HyperlinkedIdentityField(
        view_name='core-api:datafile-detail'
    )
    source = NestedDataSourceSerializer(
        read_only=True
    )

    class Meta:
        model = DataFile
        fields = [
            'id', 'url', 'display', 'source', 'path', 'last_updated', 'size', 'hash',
        ]
13 changes: 13 additions & 0 deletions netbox/core/api/urls.py
@@ -0,0 +1,13 @@
from netbox.api.routers import NetBoxRouter
from . import views


router = NetBoxRouter()
router.APIRootView = views.CoreRootView

# Data sources
router.register('data-sources', views.DataSourceViewSet)
router.register('data-files', views.DataFileViewSet)

app_name = 'core-api'
urlpatterns = router.urls
52 changes: 52 additions & 0 deletions netbox/core/api/views.py
@@ -0,0 +1,52 @@
from django.shortcuts import get_object_or_404

from rest_framework.decorators import action
from rest_framework.exceptions import PermissionDenied
from rest_framework.response import Response
from rest_framework.routers import APIRootView

from core import filtersets
from core.models import *
from netbox.api.viewsets import NetBoxModelViewSet, NetBoxReadOnlyModelViewSet
from utilities.utils import count_related
from . import serializers


class CoreRootView(APIRootView):
    """
    Core API root view
    """
    def get_view_name(self):
        return 'Core'


#
# Data sources
#

class DataSourceViewSet(NetBoxModelViewSet):
    queryset = DataSource.objects.annotate(
        file_count=count_related(DataFile, 'source')
    )
    serializer_class = serializers.DataSourceSerializer
    filterset_class = filtersets.DataSourceFilterSet

    @action(detail=True, methods=['post'])
    def sync(self, request, pk):
        """
        Enqueue a job to synchronize the DataSource.
        """
        # DataSource belongs to the core app, so check the permission named in the error message
        if not request.user.has_perm('core.sync_datasource'):
            raise PermissionDenied("Syncing data sources requires the core.sync_datasource permission.")

        datasource = get_object_or_404(DataSource, pk=pk)
        datasource.enqueue_sync_job(request)
        serializer = serializers.DataSourceSerializer(datasource, context={'request': request})

        return Response(serializer.data)


class DataFileViewSet(NetBoxReadOnlyModelViewSet):
    queryset = DataFile.objects.defer('data').prefetch_related('source')
    serializer_class = serializers.DataFileSerializer
    filterset_class = filtersets.DataFileFilterSet
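
A client could trigger the `sync` action above with a plain POST request. The sketch below only builds the request with the standard library and does not send it. The host name, the token value, the `/api/core/` URL prefix, and the `Token` authorization scheme are all assumptions for illustration; the prefix in particular depends on how the `core-api` URLs are mounted, which this excerpt does not show.

```python
import urllib.request


def build_sync_request(base_url: str, datasource_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the data source sync action."""
    # Assumed mount point for the core API router: <base>/api/core/
    url = f"{base_url.rstrip('/')}/api/core/data-sources/{datasource_id}/sync/"
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={
            "Authorization": f"Token {token}",  # hypothetical API token
            "Accept": "application/json",
        },
    )


request = build_sync_request("https://netbox.example.com", 1, "0123456789abcdef")
print(request.get_method(), request.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

Per the view code, the requesting user needs the sync permission on data sources; otherwise the endpoint raises `PermissionDenied`.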
8 changes: 8 additions & 0 deletions netbox/core/apps.py
@@ -0,0 +1,8 @@
from django.apps import AppConfig


class CoreConfig(AppConfig):
    name = "core"

    def ready(self):
        from . import data_backends, search
34 changes: 34 additions & 0 deletions netbox/core/choices.py
@@ -0,0 +1,34 @@
from django.utils.translation import gettext as _

from utilities.choices import ChoiceSet


#
# Data sources
#

class DataSourceTypeChoices(ChoiceSet):
    LOCAL = 'local'
    GIT = 'git'

    CHOICES = (
        (LOCAL, _('Local'), 'gray'),
        (GIT, _('Git'), 'blue'),
    )


class DataSourceStatusChoices(ChoiceSet):

    NEW = 'new'
    QUEUED = 'queued'
    SYNCING = 'syncing'
    COMPLETED = 'completed'
    FAILED = 'failed'

    CHOICES = (
        (NEW, _('New'), 'blue'),
        (QUEUED, _('Queued'), 'orange'),
        (SYNCING, _('Syncing'), 'cyan'),
        (COMPLETED, _('Completed'), 'green'),
        (FAILED, _('Failed'), 'red'),
    )
117 changes: 117 additions & 0 deletions netbox/core/data_backends.py
@@ -0,0 +1,117 @@
import logging
import subprocess
import tempfile
from contextlib import contextmanager
from urllib.parse import quote, urlunparse, urlparse

from django import forms
from django.conf import settings
from django.utils.translation import gettext as _

from netbox.registry import registry
from .choices import DataSourceTypeChoices
from .exceptions import SyncError

__all__ = (
    'LocalBackend',
    'GitBackend',
)

logger = logging.getLogger('netbox.data_backends')


def register_backend(name):
    """
    Decorator for registering a DataBackend class.
    """
    def _wrapper(cls):
        registry['data_backends'][name] = cls
        return cls

    return _wrapper


class DataBackend:
    parameters = {}

    def __init__(self, url, **kwargs):
        self.url = url
        self.params = kwargs

    @property
    def url_scheme(self):
        return urlparse(self.url).scheme.lower()

    @contextmanager
    def fetch(self):
        # NotImplemented is a comparison sentinel, not an exception; subclasses must override fetch()
        raise NotImplementedError()


@register_backend(DataSourceTypeChoices.LOCAL)
class LocalBackend(DataBackend):

    @contextmanager
    def fetch(self):
        logger.debug("Data source type is local; skipping fetch")
        local_path = urlparse(self.url).path  # Strip file:// scheme

        yield local_path


@register_backend(DataSourceTypeChoices.GIT)
class GitBackend(DataBackend):
    parameters = {
        'username': forms.CharField(
            required=False,
            label=_('Username'),
            widget=forms.TextInput(attrs={'class': 'form-control'})
        ),
        'password': forms.CharField(
            required=False,
            label=_('Password'),
            widget=forms.TextInput(attrs={'class': 'form-control'})
        ),
        'branch': forms.CharField(
            required=False,
            label=_('Branch'),
            widget=forms.TextInput(attrs={'class': 'form-control'})
        )
    }

    @contextmanager
    def fetch(self):
        local_path = tempfile.TemporaryDirectory()

        # Add authentication credentials to URL (if specified)
        username = self.params.get('username')
        password = self.params.get('password')
        if username and password:
            url_components = list(urlparse(self.url))
            # Prepend "username:password@" to netloc, percent-encoding each part separately
            credentials = f"{quote(username, safe='')}:{quote(password, safe='')}@"
            url_components[1] = credentials + url_components[1]
            url = urlunparse(url_components)
        else:
            url = self.url

        # Compile git arguments
        args = ['git', 'clone', '--depth', '1']
        if branch := self.params.get('branch'):
            args.extend(['--branch', branch])
        args.extend([url, local_path.name])

        # Prep environment variables
        env_vars = {}
        if settings.HTTP_PROXIES and self.url_scheme in ('http', 'https'):
            # Set http_proxy or https_proxy to match the URL scheme; skip if no proxy is defined for it
            if proxy := settings.HTTP_PROXIES.get(self.url_scheme):
                env_vars[f'{self.url_scheme}_proxy'] = proxy

        logger.debug(f"Cloning git repo: {' '.join(args)}")
        try:
            subprocess.run(args, check=True, capture_output=True, env=env_vars)
        except subprocess.CalledProcessError as e:
            raise SyncError(
                f"Fetching remote data failed: {e.stderr}"
            )

        # Remove the temporary clone even if the caller's with-block raises
        try:
            yield local_path.name
        finally:
            local_path.cleanup()
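
The registration decorator and the `fetch()` context manager can be exercised outside NetBox with a small stand-alone sketch. Here `registry` is a plain dict standing in for NetBox's application registry, and `DemoBackend` is a hypothetical backend invented for illustration; neither is part of this commit.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path

# Stand-in for NetBox's application registry (assumption: a plain dict)
registry = {"data_backends": {}}


def register_backend(name):
    """Decorator that records a backend class under the given name."""
    def _wrapper(cls):
        registry["data_backends"][name] = cls
        return cls
    return _wrapper


@register_backend("demo")
class DemoBackend:
    """Hypothetical backend that 'fetches' by writing one file to a temp dir."""

    def __init__(self, url, **kwargs):
        self.url = url
        self.params = kwargs

    @contextmanager
    def fetch(self):
        tmp = tempfile.TemporaryDirectory()
        try:
            Path(tmp.name, "example.txt").write_text(f"fetched from {self.url}\n")
            yield tmp.name
        finally:
            # Runs whether or not the caller's with-block raised
            tmp.cleanup()


# Look the class up by name, then consume the fetched files inside the with-block
backend_cls = registry["data_backends"]["demo"]
with backend_cls("demo://source").fetch() as local_path:
    files = sorted(p.name for p in Path(local_path).iterdir())
print(files)
```

Yielding the path from inside `try`/`finally` means the temporary directory is removed once the `with` block exits, including when the consumer raises.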
2 changes: 2 additions & 0 deletions netbox/core/exceptions.py
@@ -0,0 +1,2 @@
class SyncError(Exception):
    pass