I searched the existing DSIPs and found no similar one.
Motivation
Right now, the storage API is very complex, and there are many storage-related issues, e.g. CVEs and incorrect file paths.
Most of these stem from how storage paths are used: the storage API accepts both absolute and relative paths, but it does not declare which one each method expects.
In addition, the storage interface depends on business logic, e.g. tenant and default path, which makes it difficult to add a new storage implementation.
This DSIP aims to refactor the storage API to make it easier to maintain.
Design Detail
The new storage SPI will focus on filesystem operations.
import java.io.IOException;
import java.util.List;

// Project types (PropertyUtils, Constants, ResourceMetadata, ResourceType, StorageEntity)
// come from the DolphinScheduler modules.
public interface StorageOperator {

    String FILE_FOLDER_NAME = "resources";

    String UDF_FOLDER_NAME = "udfs";

    String RESOURCE_UPLOAD_PATH = PropertyUtils.getString(Constants.RESOURCE_UPLOAD_PATH, "/dolphinscheduler");

    ResourceMetadata getResourceMetaData(String resourceAbsolutePath);

    /**
     * Get the absolute path of the base directory.
     *
     * @return the base directory. e.g. file:///tmp/dolphinscheduler/, /tmp/dolphinscheduler/
     */
    String getStorageBaseDirectory();

    /**
     * Get the absolute path of the directory used by the given tenant. The tenant directory is under the base directory.
     *
     * @param tenantCode the tenant code, cannot be empty
     * @return the tenant directory. e.g. file:///tmp/dolphinscheduler/default/
     */
    String getStorageBaseDirectory(String tenantCode);

    /**
     * Get the absolute path of the directory used by the given tenant and resource type. The resource directory is under the tenant directory.
     * <p> If the resource type is FILE, it will be 'file:///tmp/dolphinscheduler/default/resources/'.
     * <p> If the resource type is UDF, it will be 'file:///tmp/dolphinscheduler/default/udfs/'.
     * <p> If the resource type is ALL, it will be 'file:///tmp/dolphinscheduler/default/'.
     *
     * @param tenantCode   the tenant code, cannot be empty
     * @param resourceType the resource type, cannot be null
     * @return the resource directory. e.g. file:///tmp/dolphinscheduler/default/resources/
     */
    String getStorageBaseDirectory(String tenantCode, ResourceType resourceType);

    /**
     * Get the absolute path of the file in the storage. The file is placed under the file resource directory.
     *
     * @param tenantCode the tenant code, cannot be empty
     * @param fileName   the file name, cannot be empty
     * @return the file absolute path. e.g. file:///tmp/dolphinscheduler/default/resources/test.sh
     */
    String getStorageFileAbsolutePath(String tenantCode, String fileName);

    /**
     * Create a directory (the exact behavior depends on the storage implementation).
     * <p> If the directory does not exist, it will be created.
     * <p> If the parent directory does not exist, it will be created as well.
     * <p> If the directory already exists, {@link FileAlreadyExistsException} will be thrown.
     *
     * @param directoryAbsolutePath the directory absolute path
     */
    void createStorageDir(String directoryAbsolutePath);

    /**
     * Check if the resource exists.
     *
     * @param resourceAbsolutePath the resource absolute path
     * @return true if the resource exists, otherwise false
     */
    boolean exists(String resourceAbsolutePath);

    /**
     * Delete the resource. If resourceAbsolutePath does not exist, do nothing.
     *
     * @param resourceAbsolutePath the resource absolute path
     * @param recursive            whether to delete all the sub files/directories under the given resource
     */
    void delete(String resourceAbsolutePath, boolean recursive);

    /**
     * Copy the resource from the source path to the destination path.
     *
     * @param srcAbsolutePath the source path
     * @param dstAbsolutePath the destination path
     * @param deleteSource    whether to delete the source path after copying
     * @param overwrite       whether to overwrite the destination path if it exists
     */
    void copy(String srcAbsolutePath, String dstAbsolutePath, boolean deleteSource, boolean overwrite);

    /**
     * Upload the local file from the source path to the destination path in the storage.
     *
     * @param srcLocalFileAbsolutePath the source local file
     * @param dstAbsolutePath          the destination path
     * @param deleteSource             whether to delete the source file after uploading
     * @param overwrite                whether to overwrite the destination path if it exists
     */
    void upload(String srcLocalFileAbsolutePath, String dstAbsolutePath, boolean deleteSource, boolean overwrite);

    /**
     * Download the resource from the source path to the destination path.
     *
     * @param srcFileAbsolutePath the source path
     * @param dstAbsoluteFile     the destination file
     * @param overwrite           whether to overwrite the destination file if it exists
     * @throws IOException if the download fails
     */
    void download(String srcFileAbsolutePath, String dstAbsoluteFile, boolean overwrite) throws IOException;

    /**
     * Fetch the content of the file.
     *
     * @param fileAbsolutePath the file path
     * @param skipLineNums     the number of lines to skip
     * @param limit            the number of lines to read
     * @return the content of the file
     */
    List<String> fetchFileContent(String fileAbsolutePath, int skipLineNums, int limit);

    /**
     * Return the {@link StorageEntity} list under the given path.
     * <p>If the path is a file, return the file status.
     * <p>If the path is a directory, return the files/directories under the directory.
     * <p>If the path does not exist, return an empty list.
     *
     * @param resourceAbsolutePath the resource absolute path, cannot be empty
     */
    List<StorageEntity> listStorageEntity(String resourceAbsolutePath);

    /**
     * Return every {@link StorageEntity} that is a file under the given path, recursively.
     *
     * @param resourceAbsolutePath the resource absolute path, cannot be empty
     */
    List<StorageEntity> listFileStorageEntityRecursively(String resourceAbsolutePath);

    /**
     * Return the {@link StorageEntity} of the given path.
     *
     * @param resourceAbsolutePath the resource absolute path, cannot be empty
     */
    StorageEntity getStorageEntity(String resourceAbsolutePath);
}
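To make the directory layout described in the Javadoc above concrete, here is a minimal sketch of how the path-building methods could compose absolute paths. `StoragePathSketch` and its nested `ResourceType` enum are hypothetical stand-ins written for illustration, not the actual DolphinScheduler classes, and the trailing-slash normalization is an assumption.

```java
import java.util.Objects;

// Hypothetical helper for illustration only; not part of DolphinScheduler.
final class StoragePathSketch {

    // Stand-in for the ResourceType referenced by the interface.
    enum ResourceType { FILE, UDF, ALL }

    private final String baseDirectory;

    StoragePathSketch(String baseDirectory) {
        // Assumption: normalize so the base directory ends with exactly one '/'.
        String trimmed = Objects.requireNonNull(baseDirectory, "baseDirectory").replaceAll("/+$", "");
        this.baseDirectory = trimmed + "/";
    }

    String getStorageBaseDirectory() {
        return baseDirectory;
    }

    String getStorageBaseDirectory(String tenantCode) {
        requireNotEmpty(tenantCode, "tenantCode");
        return baseDirectory + tenantCode + "/";
    }

    String getStorageBaseDirectory(String tenantCode, ResourceType resourceType) {
        Objects.requireNonNull(resourceType, "resourceType");
        String tenantDirectory = getStorageBaseDirectory(tenantCode);
        switch (resourceType) {
            case FILE:
                return tenantDirectory + "resources/";
            case UDF:
                return tenantDirectory + "udfs/";
            case ALL:
                return tenantDirectory;
            default:
                throw new IllegalArgumentException("Unknown resource type: " + resourceType);
        }
    }

    String getStorageFileAbsolutePath(String tenantCode, String fileName) {
        requireNotEmpty(fileName, "fileName");
        return getStorageBaseDirectory(tenantCode, ResourceType.FILE) + fileName;
    }

    private static void requireNotEmpty(String value, String name) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(name + " cannot be empty");
        }
    }
}
```

Because every method returns an absolute path built from the base directory, callers never have to guess whether a path is relative, which is the ambiguity the Motivation section describes.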
Compatibility, Deprecation, and Migration Plan
Compatible with the current version.
Test Plan
Add integration tests (IT) for HDFS (local mode) and S3.
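As a rough illustration of the kind of contract such an IT could check, the sketch below exercises the create / exists / delete semantics from the interface Javadoc. It runs against the local filesystem through `java.nio.file` instead of HDFS or S3, and `LocalStorageSketch` and `contractHolds` are hypothetical names introduced here, not existing DolphinScheduler code.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem stand-in for a storage implementation.
final class LocalStorageSketch {

    static void createStorageDir(String directoryAbsolutePath) throws IOException {
        Path directory = Paths.get(directoryAbsolutePath);
        if (Files.exists(directory)) {
            throw new FileAlreadyExistsException(directoryAbsolutePath);
        }
        // Creates missing parent directories as well.
        Files.createDirectories(directory);
    }

    static boolean exists(String resourceAbsolutePath) {
        return Files.exists(Paths.get(resourceAbsolutePath));
    }

    static void delete(String resourceAbsolutePath, boolean recursive) throws IOException {
        Path path = Paths.get(resourceAbsolutePath);
        if (!Files.exists(path)) {
            return; // Deleting a missing resource is a no-op.
        }
        if (!recursive) {
            Files.delete(path);
            return;
        }
        List<Path> deepestFirst;
        try (Stream<Path> walk = Files.walk(path)) {
            // Reverse order puts children before their parent directories.
            deepestFirst = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path entry : deepestFirst) {
            Files.delete(entry);
        }
    }

    // Runs the contract checks against a temporary directory.
    static boolean contractHolds() {
        try {
            Path root = Files.createTempDirectory("storage-it");
            String directory = root.resolve("default").resolve("resources").toString();

            createStorageDir(directory); // parent "default" is created too
            boolean created = exists(directory);

            boolean duplicateRejected = false;
            try {
                createStorageDir(directory);
            } catch (FileAlreadyExistsException e) {
                duplicateRejected = true;
            }

            delete(root.toString(), true);
            boolean deleted = !exists(root.toString());
            delete(root.toString(), true); // second delete must not throw

            return created && duplicateRejected && deleted;
        } catch (IOException e) {
            return false;
        }
    }
}
```

The same checks could in principle be pointed at each storage backend so that every implementation is held to the behavior documented on the interface.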
Code of Conduct