VLDB 2024

Meta Info

Homepage: https://vldb.org/2024/

Paper list

Papers

Resource Management

  • DL training workloads
    • Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads [Paper] [Code]
      • UCSD
      • Saturn tackles the joint SPASE problem (Select a Parallelism, Allocate resources, and Schedule) and formulates it as a mixed-integer linear program (MILP); see the sketch after this list.
  • Big data analytic workloads
    • Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service [Paper]
      • Microsoft
      • Predict usage patterns using a hybrid ML model; optimize the pool size dynamically (see the sketch after this list).
  • Job scheduling
    • ResLake: Towards Minimum Job Latency and Balanced Resource Utilization in Geo-distributed Job Scheduling
      • ByteDance
  • Autoscaling
    • OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud [arXiv]
      • Ant Group
  • Serverless
    • Resource Management in Aurora Serverless [Paper]
      • AWS
      • Industry Paper
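
A minimal sketch of a joint "select a parallelism, allocate GPUs" MILP in the spirit of Saturn's SPASE formulation, using PuLP. The job names, candidate strategies, allocation sizes, and runtime estimates below are toy assumptions, and the paper's formulation additionally models scheduling over time.

```python
# Toy SPASE-style MILP: each job picks one (parallelism, GPU count) pair,
# minimizing total estimated runtime under a cluster capacity constraint.
import pulp

GPUS_TOTAL = 8
jobs = ["bert", "gpt2"]                      # hypothetical training jobs
strategies = ["data-parallel", "pipeline"]   # hypothetical parallelisms
allocs = [2, 4, 8]                           # candidate GPU allocations
# Hypothetical profiled runtimes (hours) per (job, strategy, gpus) config.
est_runtime = {(j, s, g): 10.0 / g * (1.2 if s == "pipeline" else 1.0)
               for j in jobs for s in strategies for g in allocs}

prob = pulp.LpProblem("spase_toy", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", est_runtime.keys(), cat="Binary")

# Objective: minimize the summed estimated runtimes of the chosen configs.
prob += pulp.lpSum(est_runtime[k] * x[k] for k in est_runtime)
# Each job picks exactly one (strategy, allocation) pair.
for j in jobs:
    prob += pulp.lpSum(x[(j, s, g)] for s in strategies for g in allocs) == 1
# Chosen allocations must fit on the cluster (single-round simplification).
prob += pulp.lpSum(g * x[(j, s, g)] for (j, s, g) in est_runtime) <= GPUS_TOTAL

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([k for k in est_runtime if x[k].value() == 1])
```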
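
For Intelligent Pooling, a minimal sketch of proactive pool sizing: forecast near-term demand and add a buffer derived from recent under-provisioning errors. The EWMA forecaster and quantile buffer are simple stand-ins for the paper's hybrid ML model, not its actual method.

```python
# Proactive warm-pool sizing: forecast demand, pad by a high quantile of
# recent under-provisioning errors so requests rarely wait on cold starts.
from collections import deque

class PoolSizer:
    def __init__(self, alpha=0.3, buffer_quantile=0.9, history=100):
        self.alpha = alpha                      # EWMA smoothing factor
        self.forecast = 0.0
        self.buffer_quantile = buffer_quantile
        self.errors = deque(maxlen=history)     # recent under-provision gaps

    def observe(self, demand: int) -> int:
        """Record observed demand; return the recommended warm-pool size."""
        self.errors.append(max(0.0, demand - self.forecast))
        self.forecast = self.alpha * demand + (1 - self.alpha) * self.forecast
        ranked = sorted(self.errors)
        buf = ranked[int(self.buffer_quantile * (len(ranked) - 1))]
        return int(self.forecast + buf + 0.5)

sizer = PoolSizer()
for d in [4, 6, 5, 9, 7, 8]:        # hypothetical per-interval demand
    print(sizer.observe(d))
```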

Model Serving

  • Approximate Inference
    • Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines [Paper] [arXiv] [Code]
      • CUHK
      • Approximate input features to accelerate inference pipelines, trading a small accuracy loss for lower latency; see the sketch after this list.
      • Evaluation: all inference pipelines were implemented in Python with scikit-learn and run on CPU servers.
    • InferDB: In-Database Machine Learning Inference Using Indexes [Paper]
      • Hasso Plattner Institute & University of Potsdam & University of Illinois Chicago
      • Approximate ML inference pipelines using index structures available in DBMS.
      • Predictions are preserved in the embedding space; binned features are selected for indexing (see the sketch after this list).
      • IMO: Aggressive...
  • Edge Computing
    • SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments [Paper] [Code]
      • ZJU & Alibaba
      • SmartLite, a lightweight DBMS-based serving system (see the sketch after this list):
        • Store the parameters and structural information of neural networks as database tables.
        • Implement neural network operators inside the DBMS engine.
        • Quantize model parameters to binary values, apply neural pruning techniques to compress the models, and transform tensor manipulations into value-lookup operations in the DBMS.
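
A minimal sketch of Biathlon-style approximate inference as referenced above: compute an expensive aggregation feature from progressively larger samples and stop once the estimate stabilizes, trading a little accuracy for latency. The stopping rule and the stand-in model are illustrative assumptions, not the paper's uncertainty machinery.

```python
# Early-stopping feature approximation: grow the sample until the running
# estimate stabilizes, then feed the approximate feature to the model.
import numpy as np

def approx_feature(values, model, rel_tol=0.01, batch=100, seed=0):
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(values)
    prev, taken = None, 0
    while taken < len(shuffled):
        taken = min(taken + batch, len(shuffled))
        est = shuffled[:taken].mean()          # approximates the exact mean
        if prev is not None and abs(est - prev) <= rel_tol * max(abs(est), 1e-9):
            break                              # estimate stabilized: stop early
        prev = est
    return model(est), taken

model = lambda f: "high" if f > 0.5 else "low"   # stand-in downstream model
label, used = approx_feature(np.random.default_rng(1).random(10_000), model)
print(label, f"used {used} of 10000 rows")
```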
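
For InferDB, a sketch of replacing model invocation with an index lookup over binned features. Equal-width binning and a plain Python dict stand in for the paper's learned binning and DBMS index structures.

```python
# Index-based approximate inference: bin the feature space, memoize the
# model's majority prediction per bin, and serve lookups instead of the model.
import numpy as np

def tuple_keys(X, lo, hi, bins):
    scaled = np.clip(((X - lo) / (hi - lo + 1e-12) * bins).astype(int),
                     0, bins - 1)
    return [tuple(row) for row in scaled]

def fit_index(X, model_predict, bins=8):
    lo, hi = X.min(axis=0), X.max(axis=0)
    index = {}
    for k, p in zip(tuple_keys(X, lo, hi, bins), model_predict(X)):
        index.setdefault(k, []).append(p)
    # Keep the majority prediction for each populated bin combination.
    majority = {k: max(set(v), key=v.count) for k, v in index.items()}
    return majority, (lo, hi, bins)

def lookup(index, params, x, fallback):
    lo, hi, bins = params
    key = tuple_keys(x[None, :], lo, hi, bins)[0]
    return index.get(key, fallback)            # miss -> fall back

X = np.random.default_rng(0).random((1000, 2))
model = lambda X: (X.sum(axis=1) > 1.0).astype(int)   # stand-in model
idx, params = fit_index(X, model)
print(lookup(idx, params, np.array([0.9, 0.8]), fallback=0))
```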
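
And for SmartLite, a sketch of evaluating a neural-network layer inside a DBMS: weights live in a relational table, and a linear layer plus ReLU becomes a join-aggregate query. The schema and query (shown with SQLite) are illustrative assumptions; SmartLite additionally binarizes weights and turns tensor ops into value lookups.

```python
# A tiny linear layer y = ReLU(W x) evaluated entirely inside the DBMS.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE weights(out_idx INT, in_idx INT, w REAL)")
con.execute("CREATE TABLE input(in_idx INT, v REAL)")
# A 2x3 weight matrix and a 3-dim input vector, stored relationally.
con.executemany("INSERT INTO weights VALUES (?,?,?)",
                [(0, 0, 1.0), (0, 1, -1.0), (0, 2, 0.5),
                 (1, 0, 0.2), (1, 1, 0.3), (1, 2, -0.7)])
con.executemany("INSERT INTO input VALUES (?,?)",
                [(0, 2.0), (1, 1.0), (2, 4.0)])

# Matrix-vector product as join + SUM; ReLU as scalar MAX(0, .).
rows = con.execute("""
    SELECT w.out_idx, MAX(0.0, SUM(w.w * i.v)) AS y
    FROM weights w JOIN input i ON w.in_idx = i.in_idx
    GROUP BY w.out_idx ORDER BY w.out_idx
""").fetchall()
print(rows)   # [(0, 3.0), (1, 0.0)] -- the -2.1 pre-activation is clipped
```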

Notebook

  • ElasticNotebook: Enabling Live Migration for Computational Notebooks [Paper] [arXiv] [Code]
    • UIUC & UMich
    • Live migration via checkpointing and restoration.
    • Reconstruct all session variables from a checkpointed subset (see the sketch after this list).
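
A minimal sketch of ElasticNotebook's idea: checkpoint only a subset of session variables and reconstruct the rest by re-running recorded cell code. The hand-written recompute plan below stands in for the paper's dependency analysis (an assumption for illustration).

```python
# Checkpoint the expensive-to-rebuild state; replay code for the rest.
import pickle

session = {"raw": list(range(10))}
session["doubled"] = [x * 2 for x in session["raw"]]   # cheap to recompute
session["total"] = sum(session["doubled"])             # cheap to recompute

# Persist only variables that are expensive or impossible to rebuild.
checkpoint = pickle.dumps({"raw": session["raw"]})

# Recompute plan: variable -> code that rebuilds it from checkpointed state.
plan = {"doubled": "doubled = [x * 2 for x in raw]",
        "total":   "total = sum(doubled)"}

restored = pickle.loads(checkpoint)
for var, code in plan.items():        # order respects dependencies
    exec(code, restored)
print(restored["total"])              # 90, same as the original session
```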

Feature Stores

  • RALF: Accuracy-Aware Scheduling for Feature Store Maintenance [Paper]
    • UC Berkeley
    • Limitations of existing works
      • Naively apply a one-size-fits-all policy for when and how to update features.
      • Do not consider query access patterns or impacts on prediction accuracy.
    • Feature store regret: a metric for how much featurization degrades downstream accuracy.
    • Leverage downstream error feedback to minimize feature store regret (see the sketch after this list).
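
A minimal sketch of error-feedback scheduling in the spirit of RALF: under a fixed update budget, refresh the feature keys whose staleness currently contributes the most downstream error ("regret"). The error-feedback numbers and the budget are illustrative assumptions.

```python
# Largest-regret-first feature refresh under a per-round update budget.
import heapq

def schedule_updates(error_feedback, budget):
    """error_feedback: {key: cumulative downstream error since last update}."""
    return heapq.nlargest(budget, error_feedback, key=error_feedback.get)

feedback = {"user:1": 0.02, "user:2": 0.31, "user:3": 0.07, "user:4": 0.25}
to_refresh = schedule_updates(feedback, budget=2)
print(to_refresh)   # ['user:2', 'user:4'] -- highest-regret keys win
```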

Data Pre-processing

  • FusionFlow: Accelerating Data Preprocessing for Machine Learning with CPU-GPU Cooperation [Paper] [Code]
    • UNIST
    • Cooperatively utilizes both CPUs and GPUs to accelerate the data augmentation stage of data preprocessing for DL training.
    • Orchestrates preprocessing tasks across CPUs and GPUs while minimizing interference with GPU-based model training (see the sketch after this list).
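
A minimal sketch of CPU-GPU cooperative preprocessing in the spirit of FusionFlow: augment on CPU workers by default, and offload to the GPU only when it would otherwise sit idle, to avoid interfering with training. The `gpu_is_idle` probe and the augment functions are illustrative stand-ins.

```python
# Dispatch augmentation to the GPU opportunistically, else to a CPU pool.
from concurrent.futures import ThreadPoolExecutor

def augment_cpu(batch):           # stand-in CPU augmentation
    return [x + 1 for x in batch]

def augment_gpu(batch):           # stand-in GPU augmentation kernel
    return [x + 1 for x in batch]

def gpu_is_idle() -> bool:        # assumption: a cheap utilization probe,
    return True                   # e.g., training is between steps

def preprocess(batches, cpu_workers=4):
    results, futures = [], []
    with ThreadPoolExecutor(max_workers=cpu_workers) as pool:
        for b in batches:
            if gpu_is_idle():
                results.append(augment_gpu(b))   # opportunistic GPU offload
            else:
                futures.append(pool.submit(augment_cpu, b))
        results.extend(f.result() for f in futures)
    return results

print(len(preprocess([[1, 2], [3, 4], [5, 6]])))   # 3 augmented batches
```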

Deep Learning Recommendation Model (DLRM)

  • DLRover: Resource Optimization for Deep Recommendation Models Training at AntGroup [arXiv] [Code]
    • Ant Group & Sichuan University

Graph Neural Network (GNN)

  • Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses [Paper] [Code]
    • UIUC & NVIDIA
    • GIDS (GPU-Initiated Direct Storage accesses): a data loader that utilizes all hardware resources, i.e., CPU memory, storage, and GPU memory (see the sketch below).
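
A minimal sketch of the tiering idea behind GIDS: a feature loader that checks GPU memory first, then CPU memory, then falls back to storage. Plain dicts and a memory-mapped file read stand in for GPU buffers and GPU-initiated direct storage accesses (illustrative assumptions only).

```python
# Fetch node features through GPU-memory, CPU-memory, and storage tiers.
import numpy as np

class TieredLoader:
    def __init__(self, path):
        self.path = path
        self.gpu_cache, self.cpu_cache = {}, {}   # hot / warm tiers (stand-ins)

    def fetch(self, node_id: int) -> np.ndarray:
        if node_id in self.gpu_cache:              # hit in "GPU memory"
            return self.gpu_cache[node_id]
        if node_id not in self.cpu_cache:          # cold: read from storage
            feats = np.load(self.path, mmap_mode="r")
            self.cpu_cache[node_id] = np.array(feats[node_id])
        feat = self.cpu_cache[node_id]
        self.gpu_cache[node_id] = feat             # promote to the hot tier
        return feat

np.save("feats.npy", np.random.default_rng(0).random((100, 4)))
loader = TieredLoader("feats.npy")
print(loader.fetch(7).shape)   # (4,); a second fetch(7) hits the GPU tier
```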