Homepage: https://vldb.org/2024/
- DL training workloads
- Big data analytics workloads
- Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service [Paper]
- Microsoft
- Predict usage patterns using a hybrid ML model; optimize the pool size dynamically.
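A minimal sketch of the proactive-pooling idea: predict upcoming demand and size the pool ahead of requests. The moving-average predictor below is a stand-in for the paper's hybrid ML model, and the buffer constant is illustrative.

```python
from collections import deque

class ProactivePool:
    """Sketch of proactive resource pooling: keep the pool sized to
    predicted demand plus a safety buffer. The real system uses a
    hybrid ML predictor; a moving average stands in here."""

    def __init__(self, history_len=5, safety_buffer=2):
        self.history = deque(maxlen=history_len)
        self.safety_buffer = safety_buffer

    def observe(self, demand):
        self.history.append(demand)

    def predict_demand(self):
        # stand-in predictor: moving average of recently observed demand
        if not self.history:
            return 0
        return sum(self.history) / len(self.history)

    def target_pool_size(self):
        # pre-provision the predicted demand plus a small buffer
        return round(self.predict_demand()) + self.safety_buffer

pool = ProactivePool()
for d in [3, 4, 5]:
    pool.observe(d)
target = pool.target_pool_size()
```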
- Job scheduling
- ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-distributed Job Scheduling
- ByteDance
- Autoscaling
- OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud [arXiv]
- Ant Group
- Serverless
- Resource Management in Aurora Serverless [Paper]
- AWS
- Industry Paper
- Approximate Inference
- Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines [Paper] [arXiv] [Code]
- CUHK
- Approximate input features to accelerate inference pipelines.
- Trade-off between latency and accuracy.
- Evaluation: all inference pipelines implemented in Python with scikit-learn and run on CPU servers.
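A sketch of the resilience idea behind Biathlon: compute an expensive feature only approximately, sample it repeatedly, and stop once the model's output is stable. The `approx_sampler` callable and the stability rule below are illustrative stand-ins, not the paper's actual mechanism.

```python
import random
import statistics

def resilient_predict(model, exact_features, approx_sampler, tol=0.05, max_samples=50):
    """Sketch of approximate inference via model resilience: draw cheap
    approximate samples of an expensive feature and stop once the model's
    output varies less than `tol` over a recent window."""
    outputs = []
    for _ in range(max_samples):
        features = exact_features + [approx_sampler()]
        outputs.append(model(features))
        if len(outputs) >= 5 and statistics.pstdev(outputs[-5:]) < tol:
            break  # prediction has stabilized; further samples add little
    return statistics.mean(outputs), len(outputs)

# toy model and a noisy approximation of an expensive feature
model = lambda f: 0.5 * f[0] + 0.1 * f[1]
random.seed(0)
sampler = lambda: 10 + random.gauss(0, 0.1)
pred, n_samples = resilient_predict(model, [2.0], sampler)
```

The trade-off the notes mention shows up directly: a looser `tol` means fewer samples (lower latency) but a noisier prediction.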
- InferDB: In-Database Machine Learning Inference Using Indexes [Paper]
- Hasso Plattner Institute & University of Potsdam & University of Illinois Chicago
- Approximate ML inference pipelines using index structures available in DBMS.
- Predictions are preserved in the embedding space; select binned features for indexing.
- IMO: Aggressive...
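A toy sketch of the index-based idea: precompute the model's prediction for every combination of binned feature values, so inference becomes a key lookup rather than a model forward pass. The binning scheme and midpoint representatives below are illustrative; InferDB learns which binned features to index.

```python
from bisect import bisect_right
from itertools import product

def build_inference_index(model, feature_bins):
    """Precompute a prediction per cell of the binned feature space,
    using the bin midpoint as a representative input."""
    index = {}
    for cell in product(*[range(len(edges) + 1) for edges in feature_bins]):
        rep = []
        for dim, b in enumerate(cell):
            edges = feature_bins[dim]
            if b == 0:
                rep.append(edges[0])            # below the first edge
            elif b == len(edges):
                rep.append(edges[-1])           # above the last edge
            else:
                rep.append((edges[b - 1] + edges[b]) / 2)
        index[cell] = model(rep)
    return index

def indexed_predict(index, feature_bins, features):
    # inference = bin each feature, then one dictionary/index lookup
    key = tuple(bisect_right(edges, v) for v, edges in zip(features, feature_bins))
    return index[key]

model = lambda f: f[0] + 2 * f[1]          # toy model
bins = [[0.0, 1.0, 2.0], [0.0, 5.0]]       # illustrative bin edges per feature
idx = build_inference_index(model, bins)
```

In a DBMS the dictionary would instead be a regular index over the binned columns, which is what makes the approach deployable in-database.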
- Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines [Paper] [arXiv] [Code]
- Edge Computing
- SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments [Paper] [Code]
- ZJU & Alibaba
- SmartLite, a lightweight DBMS
- Store the parameters and structural information of neural networks as database tables.
- Implement neural network operators inside the DBMS engine.
- Quantize model parameters as binarized values, apply neural pruning techniques to compress the models, and transform tensor manipulations into value lookup operations of the DBMS.
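A sketch of the binarization step the notes describe: with sign-binarized weights and activations, a dot product reduces to XNOR plus popcount, i.e. cheap lookup-style integer ops. Storing each neuron's bitmask as a table row mirrors the parameters-as-tables idea; the actual SmartLite runs these operators inside the DBMS engine.

```python
def binarize(vec):
    """Sign-binarize a real vector into an int bitmask (1 bit per element)."""
    bits = 0
    for i, v in enumerate(vec):
        if v >= 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """XNOR-popcount dot product of two sign-binarized length-n vectors:
    matching signs contribute +1, mismatching signs -1."""
    matches = bin(~(a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
    return 2 * matches - n

# one binarized layer, each neuron stored as a bitmask "row"
weights = [binarize([0.3, -0.2, 0.7, -0.9]),
           binarize([-0.1, 0.4, 0.2, 0.5])]
x = binarize([1.0, -1.0, 1.0, 1.0])
out = [binary_dot(w, x, 4) for w in weights]
```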
- ElasticNotebook: Enabling Live Migration for Computational Notebooks [Paper] [arXiv] [Code]
- UIUC & UMich
- Live migration via checkpointing/restoration.
- Reconstruct all variables from a subset of variables.
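A minimal sketch of the checkpoint/restore split: serialize only a subset of session variables and record how to rebuild the rest from that subset. Which variables to store versus recompute is the interesting optimization in the paper; the rules here are hand-written for illustration.

```python
import pickle

def checkpoint(session, recompute_rules):
    """Serialize only the variables that are not covered by a
    reconstruction rule."""
    stored = {k: v for k, v in session.items() if k not in recompute_rules}
    return pickle.dumps(stored)

def restore(blob, recompute_rules):
    """Load the stored subset, then rebuild the remaining variables
    from it."""
    session = pickle.loads(blob)
    for name, rebuild in recompute_rules.items():
        session[name] = rebuild(session)
    return session

# a cheap-to-store list and a derived value we would rather recompute
session = {"data": [1, 2, 3], "total": 6}
rules = {"total": lambda s: sum(s["data"])}
restored = restore(checkpoint(session, rules), rules)
```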
- RALF: Accuracy-Aware Scheduling for Feature Store Maintenance [Paper]
- UC Berkeley
- Limitations of existing works
- Naively apply a one-size-fits-all policy for when and how to update features.
- Do not consider query access patterns or impacts on prediction accuracy.
- Feature store regret: a metric for how much featurization degrades downstream accuracy.
- Leverage downstream error feedback to minimize feature store regret.
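A sketch of regret-aware scheduling: rank features by an estimate of how much their staleness hurts queries (error signal times access frequency, as a proxy for feature store regret) and spend the update budget on the worst offenders. The field names are illustrative, not RALF's API.

```python
def schedule_updates(features, budget):
    """Pick which features to refresh this round: those with the
    largest (staleness error x access rate) product, a proxy for
    downstream regret, up to the update budget."""
    ranked = sorted(features,
                    key=lambda f: f["error"] * f["accesses"],
                    reverse=True)
    return [f["name"] for f in ranked[:budget]]

feats = [
    {"name": "user_embedding", "error": 0.9, "accesses": 100},
    {"name": "item_stats",     "error": 0.2, "accesses": 500},
    {"name": "geo_features",   "error": 0.8, "accesses": 10},
]
to_update = schedule_updates(feats, budget=2)
```

Note how a rarely queried feature (`geo_features`) loses its update slot even with high error, which is exactly the access-pattern awareness the notes say naive policies lack.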
- FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation [Paper] [Code]
- UNIST
- Cooperatively utilizes both CPUs and GPUs to accelerate the data preprocessing (data augmentation) stage of DL training.
- Orchestrate data preprocessing tasks across CPUs and GPUs while minimizing interference with GPU-based model training.
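A sketch of the orchestration idea: offload augmentation of some batches to the GPU only while it would otherwise sit idle between training steps, and keep the rest on CPU workers, so preprocessing never steals GPU time from training. The cost and idle-window numbers are illustrative, not FusionFlow's cost model.

```python
def assign_preprocessing(batches, gpu_idle_ms, gpu_cost_ms):
    """Split preprocessing work between GPU and CPU: GPU takes batches
    only as long as they fit in its idle window (to avoid interfering
    with training); everything else falls back to CPU workers."""
    gpu_batches, cpu_batches = [], []
    remaining_idle = gpu_idle_ms
    for b in batches:
        if remaining_idle >= gpu_cost_ms:
            gpu_batches.append(b)          # fits in the GPU's idle window
            remaining_idle -= gpu_cost_ms
        else:
            cpu_batches.append(b)          # handled by CPU workers
    return gpu_batches, cpu_batches

gpu, cpu = assign_preprocessing(list(range(6)), gpu_idle_ms=25, gpu_cost_ms=10)
```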