Changed "prepare data on the fly" functionality #2217

Merged: 13 commits, Oct 1, 2020
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -19,6 +19,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
It supports regular navigation, searching a frame according to annotations
filters and searching the nearest frame without any annotations (<https://github.com/openvinotoolkit/cvat/pull/2221>)
- Notes for MacOS users in CONTRIBUTING.md
- Ability to prepare meta information manually (<https://github.com/openvinotoolkit/cvat/pull/2217>)
- Ability to upload prepared meta information along with a video when creating a task (<https://github.com/openvinotoolkit/cvat/pull/2217>)
- Optional chaining plugin for cvat-canvas and cvat-ui (<https://github.com/openvinotoolkit/cvat/pull/2249>)

### Changed
@@ -45,6 +47,7 @@ filters and searching the nearest frame without any annotations (<https://github
- Fixed case when a task with 0 jobs is shown as "Completed" in UI (<https://github.com/openvinotoolkit/cvat/pull/2200>)
- Fixed use case when UI throws exception: Cannot read property 'objectType' of undefined #2053 (<https://github.com/openvinotoolkit/cvat/pull/2203>)
- Fixed use case when logs could be saved twice or more times #2202 (<https://github.com/openvinotoolkit/cvat/pull/2203>)
- Fixed issues from #2112 (<https://github.com/openvinotoolkit/cvat/pull/2217>)

### Security
-
35 changes: 35 additions & 0 deletions cvat/apps/documentation/data_on_fly.md
@@ -0,0 +1,35 @@
# Data preparation on the fly

## Description
Data on the fly processing is a way of working with data whose main idea is as follows:
the minimum necessary meta information is collected when the task is created.
This meta information later makes it possible to create the necessary chunks when a request is received from a client.

Generated chunks are stored in a cache of limited size with a policy of evicting less popular items.

When a request is received from a client, the required chunk is looked up in the cache.
If the chunk does not exist yet, it is created using the prepared meta information and then put into the cache.

This method of working with data allows you to:
- reduce the task creation time;
- store data in a cache of limited size with a policy of evicting less popular items.
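
As an illustration only, here is a minimal sketch of this flow (not CVAT's actual cache implementation; `build_chunk` is a hypothetical callback that assembles a chunk from the prepared meta information):

```python
from collections import OrderedDict

class ChunkCache:
    """Bounded LRU cache: less popular chunks are evicted first."""

    def __init__(self, max_size, build_chunk):
        self._chunks = OrderedDict()
        self._max_size = max_size
        self._build_chunk = build_chunk  # hypothetical: builds a chunk from meta information

    def get(self, chunk_number):
        if chunk_number in self._chunks:
            # Cache hit: mark the chunk as recently used.
            self._chunks.move_to_end(chunk_number)
        else:
            # Cache miss: create the chunk on the fly, evicting the least
            # recently used entry if the cache is full.
            if len(self._chunks) >= self._max_size:
                self._chunks.popitem(last=False)
            self._chunks[chunk_number] = self._build_chunk(chunk_number)
        return self._chunks[chunk_number]
```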

## Prepare meta information
Different meta information is collected for different types of uploaded data.
### Video
For video, this is a valid mapping between key frame numbers and their timestamps. This information is saved to `meta_info.txt`.

Unfortunately, this method will not work for all videos, even those with valid meta information.
If the video does not contain enough key frames for smooth decoding, the task will be created in the old way.

#### Uploading meta information along with data

When creating a task, you can upload a file with meta information along with the video,
which further reduces the task creation time.
You can see how to prepare the meta information [here](/utils/prepare_meta_information/README.md).

It is worth noting that the generated file also contains the total number of frames in the video on its last line.
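
For illustration, the file layout (as read back by `UploadedMeta` in `cvat/apps/engine/prepare.py`) is one `frame_number pts` pair per line, followed by the total frame count on the last line; the values below are hypothetical:

```
0 0
250 128000
500 256000
750 384000
900
```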

### Images
A mapping between each chunk number and the paths of the images that should go into that chunk
is saved at task creation time in files named `dummy_{chunk_number}.txt`.
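
For example, a hypothetical `dummy_3.txt` could list the image paths for chunk 3, one per line (a sketch only; the exact layout is defined by the task creation code, which is not shown in this diff):

```
images/frame_000096.jpg
images/frame_000097.jpg
images/frame_000098.jpg
```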
Binary file not shown.
9 changes: 8 additions & 1 deletion cvat/apps/documentation/user_guide.md
@@ -141,17 +141,24 @@ Go to the [Django administration panel](http://localhost:8080/admin). There you
**Select files**. Press the ``My computer`` tab to choose files for annotation from your PC.
If you select the ``Connected file share`` tab, you can choose files for annotation from your network.
If you select ``Remote source``, you'll see a field where you can enter a list of URLs (one URL per line).
If you upload video data and select the ``Use cache`` option, you can attach a file with meta information along with the video file.
You can find out how to prepare it [here](/utils/prepare_meta_information/README.md).

![](static/documentation/images/image127.jpg)

#### Advanced configuration

![](static/documentation/images/image128.jpg)
![](static/documentation/images/image128_use_cache.jpg)

**Z-Order**. Defines the order of drawn polygons. Check the box to enable layered displaying.

**Use zip chunks**. Forces the use of zip chunks as compressed data. Relevant for videos only.

**Use cache**. Defines how to work with data. Select the checkbox to switch to "on-the-fly data processing",
which reduces the task creation time (chunks are prepared when requests are received)
and stores data in a cache of limited size with a policy of evicting less popular items.
See more [here](/cvat/apps/documentation/data_on_fly.md).

**Image Quality**. Use this option to specify the quality of uploaded images.
The option helps to load high-resolution datasets faster.
Use a value from ``5`` (almost completely compressed images) to ``100`` (uncompressed images).
110 changes: 95 additions & 15 deletions cvat/apps/engine/prepare.py
@@ -3,7 +3,9 @@
# SPDX-License-Identifier: MIT

import av
from collections import OrderedDict
import hashlib
import os

class WorkWithVideo:
def __init__(self, **kwargs):
@@ -72,27 +74,30 @@ def __init__(self, **kwargs):
def get_task_size(self):
return self.frames

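# Frame size (width, height), taken from one of the stored key frames.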
@property
def frame_sizes(self):
frame = next(iter(self.key_frames.values()))
return (frame.width, frame.height)

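# Re-decodes the frame at a key frame's seek position and discards the key
# frame if the decoded frame does not match it (by hash or pts).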
def check_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
if md5_hash(frame) != md5_hash(key_frame[1]) or frame.pts != key_frame[1].pts:
self.key_frames.pop(key_frame[0])
return

def check_seek_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)

key_frames_copy = self.key_frames.copy()

for index, key_frame in key_frames_copy.items():
container.seek(offset=key_frame.pts, stream=video_stream)
flag = True
for packet in container.demux(video_stream):
for frame in packet.decode():
if md5_hash(frame) != md5_hash(key_frame) or frame.pts != key_frame.pts:
self.key_frames.pop(index)
flag = False
break
if not flag:
break
for key_frame in key_frames_copy.items():
container.seek(offset=key_frame[1].pts, stream=video_stream)
self.check_key_frame(container, video_stream, key_frame)

#TODO: correct ratio of number of frames to keyframes
if len(self.key_frames) == 0:
raise Exception('Too few keyframes')
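# Heuristic: decoding is considered smooth when key frames are frequent
# enough that, on average, at least one falls within every two chunks.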
def check_frames_ratio(self, chunk_size):
return (len(self.key_frames) and (self.frames // len(self.key_frames)) <= 2 * chunk_size)

def save_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
@@ -152,4 +157,79 @@ def decode_needed_frames(self, chunk_number, db_data):
self._close_video_container(container)
return

self._close_video_container(container)

class UploadedMeta(PrepareInfo):
def __init__(self, **kwargs):
super().__init__(**kwargs)

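# meta_info.txt layout: one 'frame_number pts' pair per line, with the
# total number of frames on the last line.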
with open(self.meta_path, 'r') as meta_file:
lines = meta_file.read().strip().split('\n')
self.frames = int(lines.pop())

key_frames = {int(line.split()[0]): int(line.split()[1]) for line in lines}
self.key_frames = OrderedDict(sorted(key_frames.items(), key=lambda x: x[0]))

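# Decodes a single frame at the first key frame's position to determine
# the frame size; the container is closed as soon as one frame is decoded.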
@property
def frame_sizes(self):
container = self._open_video_container(self.source_path, 'r')
video_stream = self._get_video_stream(container)
container.seek(offset=next(iter(self.key_frames.values())), stream=video_stream)
for packet in container.demux(video_stream):
for frame in packet.decode():
self._close_video_container(container)
return (frame.width, frame.height)

def save_meta_info(self):
with open(self.meta_path, 'w') as meta_file:
for index, pts in self.key_frames.items():
meta_file.write('{} {}\n'.format(index, pts))

def check_key_frame(self, container, video_stream, key_frame):
for packet in container.demux(video_stream):
for frame in packet.decode():
assert frame.pts == key_frame[1], "Uploaded meta information does not match the video"
return

def check_seek_key_frames(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)

for key_frame in self.key_frames.items():
container.seek(offset=key_frame[1], stream=video_stream)
self.check_key_frame(container, video_stream, key_frame)

self._close_video_container(container)

def check_frames_numbers(self):
container = self._open_video_container(self.source_path, mode='r')
video_stream = self._get_video_stream(container)
# not all videos contain information about the number of frames
if video_stream.frames:
self._close_video_container(container)
assert video_stream.frames == self.frames, "Uploaded meta information does not match the video"
return
self._close_video_container(container)

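# Analyzes the video, collects and validates key frames, and writes them to
# meta_info.txt; returns the meta object and, when chunk_size is given,
# whether smooth decoding can be expected.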
def prepare_meta(media_file, upload_dir=None, meta_dir=None, chunk_size=None):
paths = {
'source_path': os.path.join(upload_dir, media_file) if upload_dir else media_file,
'meta_path': os.path.join(meta_dir, 'meta_info.txt') if meta_dir else os.path.join(upload_dir, 'meta_info.txt'),
}
analyzer = AnalyzeVideo(source_path=paths.get('source_path'))
analyzer.check_type_first_frame()
analyzer.check_video_timestamps_sequences()

meta_info = PrepareInfo(source_path=paths.get('source_path'),
meta_path=paths.get('meta_path'))
meta_info.save_key_frames()
meta_info.check_seek_key_frames()
meta_info.save_meta_info()
smooth_decoding = meta_info.check_frames_ratio(chunk_size) if chunk_size else None
return (meta_info, smooth_decoding)

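# Appends the total frame count as the last line of the generated meta file,
# matching the format that UploadedMeta expects to read back.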
def prepare_meta_for_upload(func, *args):
meta_info, smooth_decoding = func(*args)
with open(meta_info.meta_path, 'a') as meta_file:
meta_file.write(str(meta_info.get_task_size()))
return smooth_decoding
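
# Illustrative usage only (not part of this diff; paths are hypothetical):
# preparing meta_info.txt for a video so it can be uploaded together with a task.
#
#   from cvat.apps.engine.prepare import prepare_meta, prepare_meta_for_upload
#
#   smooth_decoding = prepare_meta_for_upload(
#       prepare_meta,
#       '/path/to/video.mp4',    # media_file
#       None,                    # upload_dir (not needed for an absolute path)
#       '/path/to/output_dir',   # meta_dir where meta_info.txt is written
#   )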
74 changes: 55 additions & 19 deletions cvat/apps/engine/task.py
@@ -6,6 +6,7 @@
import itertools
import os
import sys
from re import findall
import rq
import shutil
from traceback import print_exception
@@ -16,6 +17,7 @@
from cvat.apps.engine.media_extractors import get_mime, MEDIA_TYPES, Mpeg4ChunkWriter, ZipChunkWriter, Mpeg4CompressedChunkWriter, ZipCompressedChunkWriter
from cvat.apps.engine.models import DataChoice, StorageMethodChoice
from cvat.apps.engine.utils import av_scan_paths
from cvat.apps.engine.prepare import prepare_meta

import django_rq
from django.conf import settings
@@ -24,7 +26,6 @@

from . import models
from .log import slogger
from .prepare import PrepareInfo, AnalyzeVideo

############################# Low Level server API

@@ -105,7 +106,7 @@ def _save_task_to_db(db_task):
db_task.data.save()
db_task.save()

def _count_files(data):
def _count_files(data, meta_info_file=None):
share_root = settings.SHARE_ROOT
server_files = []

@@ -132,11 +133,12 @@ def count_files(file_mapping, counter):
mime = get_mime(full_path)
if mime in counter:
counter[mime].append(rel_path)
elif findall('meta_info.txt$', rel_path):
meta_info_file.append(rel_path)
else:
slogger.glob.warn("Skip '{}' file (its mime type doesn't "
"correspond to a video or an image file)".format(full_path))


counter = { media_type: [] for media_type in MEDIA_TYPES.keys() }

count_files(
@@ -151,7 +153,7 @@

return counter

def _validate_data(counter):
def _validate_data(counter, meta_info_file=None):
unique_entries = 0
multiple_entries = 0
for media_type, media_config in MEDIA_TYPES.items():
@@ -161,6 +163,9 @@ def _validate_data(counter):
else:
multiple_entries += len(counter[media_type])

if meta_info_file and media_type != 'video':
raise Exception('File with meta information can only be uploaded with a video file')

if unique_entries == 1 and multiple_entries > 0 or unique_entries > 1:
unique_types = ', '.join([k for k, v in MEDIA_TYPES.items() if v['unique']])
multiply_types = ', '.join([k for k, v in MEDIA_TYPES.items() if not v['unique']])
@@ -219,8 +224,12 @@ def _create_thread(tid, data):
if data['remote_files']:
data['remote_files'] = _download_data(data['remote_files'], upload_dir)

media = _count_files(data)
media, task_mode = _validate_data(media)
meta_info_file = []
media = _count_files(data, meta_info_file)
media, task_mode = _validate_data(media, meta_info_file)
if meta_info_file:
assert settings.USE_CACHE and db_data.storage_method == StorageMethodChoice.CACHE, \
"File with meta information can only be uploaded if the 'Use cache' option is also selected"

if data['server_files']:
_copy_data_from_share(data['server_files'], upload_dir)
@@ -288,24 +297,51 @@ def update_progress(progress):
if media_files:
if task_mode == MEDIA_TYPES['video']['mode']:
try:
analyzer = AnalyzeVideo(source_path=os.path.join(upload_dir, media_files[0]))
analyzer.check_type_first_frame()
analyzer.check_video_timestamps_sequences()

meta_info = PrepareInfo(source_path=os.path.join(upload_dir, media_files[0]),
meta_path=os.path.join(upload_dir, 'meta_info.txt'))
meta_info.save_key_frames()
meta_info.check_seek_key_frames()
meta_info.save_meta_info()
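# If meta information was uploaded along with the video, validate it against
# the video first; if validation fails, fall back to preparing it from scratch.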
if meta_info_file:
try:
from cvat.apps.engine.prepare import UploadedMeta
if os.path.split(meta_info_file[0])[0]:
os.replace(
os.path.join(upload_dir, meta_info_file[0]),
db_data.get_meta_path()
)
meta_info = UploadedMeta(source_path=os.path.join(upload_dir, media_files[0]),
meta_path=db_data.get_meta_path())
meta_info.check_seek_key_frames()
meta_info.check_frames_numbers()
meta_info.save_meta_info()
assert len(meta_info.key_frames) > 0, 'No key frames.'
except Exception as ex:
base_msg = str(ex) if isinstance(ex, AssertionError) else \
'Invalid meta information was uploaded.'
job.meta['status'] = '{} Starting to prepare valid meta information.'.format(base_msg)
job.save_meta()
meta_info, smooth_decoding = prepare_meta(
media_file=media_files[0],
upload_dir=upload_dir,
chunk_size=db_data.chunk_size
)
assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'
else:
meta_info, smooth_decoding = prepare_meta(
media_file=media_files[0],
upload_dir=upload_dir,
chunk_size=db_data.chunk_size
)
assert smooth_decoding == True, 'Too few keyframes for smooth video decoding.'

all_frames = meta_info.get_task_size()
video_size = meta_info.frame_sizes

db_data.size = len(range(db_data.start_frame, min(data['stop_frame'] + 1 if data['stop_frame'] else all_frames, all_frames), db_data.get_frame_step()))
video_path = os.path.join(upload_dir, media_files[0])
frame = meta_info.key_frames.get(next(iter(meta_info.key_frames)))
video_size = (frame.width, frame.height)

except Exception:
except Exception as ex:
db_data.storage_method = StorageMethodChoice.FILE_SYSTEM
if os.path.exists(db_data.get_meta_path()):
os.remove(db_data.get_meta_path())
base_msg = str(ex) if isinstance(ex, AssertionError) else "Uploaded video does not support a quick way of task creation."
job.meta['status'] = "{} The task will be created using the old method.".format(base_msg)
job.save_meta()

else:#images,archive
counter_ = itertools.count()