Release v2.0.0

@dylan-fan dylan-fan released this 31 Dec 15:27
· 215 commits to master since this release
bb6de87

By downloading, installing or using the software, you accept and agree to be bound by all of the terms and conditions of the LICENSE and DISCLAIMER.

Major Features and Improvements

FATE 2.0

Arch 2.0: Building Unified and Standardized API for Heterogeneous Computing Engines Interconnection

  • Introduce Context to manage useful APIs for developers, such as Distributed Computing, Federation, Cipher, Tensor, Metrics, and IO.
  • Introduce Tensor data structure to handle local and distributed matrix operation, with built-in heterogeneous acceleration support.
    • abstracted PHETensor, allowing smooth switching between underlying PHE implementations through a standard interface
  • Introduce DataFrame, a 2D tabular data structure for data io and simple feature engineering
    • add data block manager to support mixed-type columns & feature anonymization
    • added 30+ operator interfaces for statistics, including comparison, indexing, data binning, and transformation, etc
  • Refactor Federation, a unified interface for federated communication. We provide a unified Serdes control and more user-friendly api.
  • Introduce Config, a unified configuration for FATE, including safety restrictions, system configuration, and algorithm configuration
  • Refactor logger, customizable logging for different use cases and flavors.
  • Introduce Launcher, a simple tool for federated program execution, especially useful for standalone and local debugging
  • Framework: PSI-ECDH protocol support, single entry for histogram statistical computation
  • Deepspeed integration: support distributed training using deepspeed with Eggroll.
  • Protocol: Support for SSHE (a mixed MPC and homomorphic encryption protocol), ECDH, and Secure Aggregation protocols
  • Experimental: integrate CrypTen for SMPC support; more protocols and features will be added in the future
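
To make the PSI-ECDH item above more concrete, here is a toy sketch of the commutative-blinding idea behind Diffie-Hellman-style private set intersection. It is not FATE's implementation: the function names (`hash_to_group`, `blind`, `psi`) are hypothetical, and modular exponentiation over a prime field stands in for the elliptic-curve group a real ECDH protocol would use. Both parties are simulated in one process.

```python
import hashlib
import secrets

# Toy group: integers modulo a Mersenne prime. A real ECDH-based PSI
# would use elliptic-curve points instead (assumption for illustration).
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an identifier to a non-zero group element via SHA-256."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P or 1


def blind(values, key):
    """Raise every element to a secret exponent; exponentiation commutes."""
    return [pow(v, key, P) for v in values]


def psi(guest_ids, host_ids):
    """Return the guest ids that also appear in host_ids."""
    guest_key = secrets.randbelow(P - 2) + 1
    host_key = secrets.randbelow(P - 2) + 1

    # Each side blinds its own hashed ids with its own secret key.
    guest_once = blind([hash_to_group(x) for x in guest_ids], guest_key)
    host_once = blind([hash_to_group(x) for x in host_ids], host_key)

    # The sides exchange blinded values and apply the other key on top.
    guest_twice = blind(guest_once, host_key)
    host_twice = set(blind(host_once, guest_key))

    # H(x)^(a*b) matches on both sides exactly when the ids match
    # (up to negligible collision probability).
    return [x for x, t in zip(guest_ids, guest_twice) if t in host_twice]


print(psi(["alice", "bob", "carol"], ["carol", "dave", "alice"]))
```

Neither side sees the other's raw identifiers in this scheme, only blinded values; the comparison happens on the doubly blinded elements.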

Components 2.0: Building Standardized Algorithm Components for different Scheduling Engines

  • Introduce components toolbox to wrap ML modules as standard executable programs
  • spec and loader expose clear API for smooth internal extension and external system integration
  • Provide several cli tools to interact and execute components
  • Input-Output: Further decoupling of FATE-Flow, providing standardized black-box calling processes
  • Component Definition: Support for typing-based definition, automatic checking for component parameters, support for multiple types of data and model input and output, in addition to multiple inputs
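
As a rough illustration of what typing-based definition with automatic parameter checking can look like, the sketch below uses only the Python standard library. The `component` decorator, the `spec` attribute, and the `feature_scale` example are hypothetical names invented for this sketch; they are not the FATE components API.

```python
import inspect
from typing import get_type_hints


def component(func):
    """Hypothetical decorator: check keyword arguments against type hints."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    def wrapper(**params):
        # bind() raises TypeError for missing or unknown parameters.
        bound = signature.bind(**params)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError(
                    f"{name}: expected {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(**bound.arguments)

    # A machine-readable description a scheduler could load (illustrative).
    wrapper.spec = {n: t.__name__ for n, t in hints.items() if n != "return"}
    return wrapper


@component
def feature_scale(method: str, feature_range_max: float = 1.0) -> dict:
    return {"method": method, "max": feature_range_max}


print(feature_scale.spec)
print(feature_scale(method="min_max"))
```

The check here is strict `isinstance` matching on plain classes, so an `int` passed where a `float` is annotated is rejected; a real component framework would likely handle richer types and coercion.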

ML 2.0: Major functionality migration from FATE-v1.x, decoupling call hierarchy

  • Data preprocessing: Added DataFrame Transformer; Reader, Union and DataSplit migration completed
  • Feature Engineering: Migrated HeteroFederatedBinning, HeteroFeatureSelection, DataStatistics, Sampling, FeatureScale and Pearson Correlation
  • Federated Training Migrated: HeteroSecureBoost, HomoNN, HeteroCoordinatedLogisticRegression, HeteroCoordinatedLinearRegression, SSHE-LogisticRegression and SSHE-LinearRegression
  • Federated Training Added:
    • SSHE-HeteroNN: based on a mixed MPC and homomorphic encryption protocol
    • FedPASS-HeteroNN: based on fedpass protocol

Algorithm Performance Improvements (Comparison with FATE-v1.11.*)

  • PSI (Private Set Intersection): tested on a dataset of 100 million records with an intersection of 100 million records; 1.8+ times the performance of FATE-v1.11.4
  • Hetero-SSHE-LR: tested with guest data of 100,000 samples × 30 dimensions and host data of 100,000 samples × 300 dimensions; 4.3+ times the performance of FATE-v1.11.4
  • Hetero-NN (based on the FedPass protocol): tested with guest data of 100,000 samples × 30 dimensions and host data of 100,000 samples × 300 dimensions; roughly on par with plaintext performance, and 143+ times the performance of FATE-v1.11.4
  • Hetero-Coordinated-LR: tested with guest data of 100,000 samples × 30 dimensions and host data of 100,000 samples × 300 dimensions; 1.2+ times the performance of FATE-v1.11.4
  • Hetero-Feature-Binning: tested with guest data of 100,000 samples × 30 dimensions and host data of 100,000 samples × 300 dimensions; 1.5+ times the performance of FATE-v1.11.4

OSX(Open Site Exchange) 1.0: Building Open Platform for Cross-Site Communication Interconnection

  • Implement the transmission interface in accordance with the “Technical Specification for Financial Industry Privacy Computing Interconnection Platform”. The transmission interface is compatible with both FATE 1.X and FATE 2.X
  • Supports gRPC synchronous and streaming transmission, supports the TLS secure transmission protocol, and is compatible with FATE 1.X rollsite components
  • Supports HTTP 1.X protocol transmission and the TLS secure transmission protocol
  • Support message queue mode transmission, used to replace rabbitmq and pulsar components in FATE 1.X
  • Supports Eggroll and Spark computing engines
  • Supports networking as an Exchange component, with support for FATE 1.X and FATE 2.X access
  • Compared to the rollsite component, it improves the exception handling logic during transmission and provides more accurate log output for quickly locating exceptions.
  • The routing configuration is basically consistent with the original rollsite, reducing the difficulty of porting
  • Supports HTTP interface modification of routing tables and provides simple permission verification
  • Improved network connection management logic, reduced connection leakage risk, and improved transmission efficiency
  • Using different ports to handle access requests both inside and outside the cluster, facilitating the adoption of different security policies for different ports

FATE Flow 2.0: Building Open and Standardized Scheduling Platform for Scheduling Interconnection

  • Adapted to new scalable and standardized federated DSL IR
  • Built an interconnected scheduling layer framework, supported the BFIA protocol
  • Optimized process scheduling, with scheduling separated and customizable, and added priority scheduling
  • Optimized algorithm component scheduling, supporting container-level algorithm loading and enhancing support for cross-platform heterogeneous scenarios
  • Optimized multi-version algorithm component registration, supporting registration for mode of components
  • Federated DSL IR extension enhancement: supports multi-party asymmetric scheduling
  • Optimized client authentication logic, supporting permission management for multiple clients
  • Optimized RESTful interface, making parameter fields and types, return fields, and status codes clearer
  • Added OFX(Open Flow Exchange) module: encapsulated scheduling client to allow cross-platform scheduling
  • Supported the new communication engine OSX, while remaining compatible with all engines from FATE Flow 1.x
  • Decoupled the System Layer and the Algorithm Layer, with system configuration moved from the FATE repository to the Flow repository
  • Published FATE Flow package to PyPI and added service-level CLI for service management
  • Migrated major functionality from FATE Flow 1.x

FATE-Client 2.0: Building Scalable Federated DSL for Application Layer Interconnection And Providing Tools For Fast Federated Modeling

  • Introduce new scalable and standardized federated DSL IR(Intermediate Representation) for federated modeling job
  • Compile python client to DSL IR
  • Federated DSL IR extension enhancement: supports multi-party asymmetric scheduling.
  • Support mutual translation between the standardized FATE-2.0.0 DSL IR and UnionPay's BFIA protocol.
  • Support components with UnionPay's BFIA protocol through adapter mode
  • Flow CLI and PipeLine share configuration
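
The idea of compiling a Python client description into a DSL IR can be sketched as follows. This is a minimal illustration under assumptions: the `Task` class, the `compile_to_ir` function, and the IR field names (`dag`, `tasks`, `component`, `depends_on`) are all hypothetical and do not reproduce the actual FATE-Client classes or the real IR schema.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical client-side description of one pipeline step."""
    name: str
    component: str
    parameters: dict = field(default_factory=dict)
    depends_on: list = field(default_factory=list)


def compile_to_ir(tasks):
    """Turn Python task objects into a plain, serializable dictionary."""
    names = {t.name for t in tasks}
    for t in tasks:
        missing = [d for d in t.depends_on if d not in names]
        if missing:
            raise ValueError(f"{t.name} depends on unknown tasks: {missing}")
    return {
        "dag": {
            "tasks": {
                t.name: {
                    "component": t.component,
                    "parameters": dict(t.parameters),
                    "depends_on": list(t.depends_on),
                }
                for t in tasks
            }
        }
    }


pipeline = [
    Task("psi_0", "psi"),
    Task("lr_0", "logistic_regression", {"epochs": 5}, ["psi_0"]),
]
ir = compile_to_ir(pipeline)
print(ir["dag"]["tasks"]["lr_0"])
```

Because the result is plain data rather than Python objects, it can be serialized and handed to a scheduler or translated to another protocol's job format.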

FATE-Test: FATE Automated Testing Tool

  • Migrated automated testing for functionality, performance, and correctness

FATE-Board 2.0

  • Refactoring DAG components, adding support for stage status, and displaying dynamic ports.
  • Update the cache structure to optimize issues such as user timeout handling and duplicate storage of configuration information.
  • Optimize some interactive functions.
  • Update the style theme.

Eggroll 3.0

Enhancements in the JVM Part:

  • Core Component Reconstruction: The cluster-manager and node-manager components have been entirely rebuilt using Java, ensuring uniformity and enhanced performance.
  • Transport Component Modification: The rollsite transport component has been removed and replaced with the more efficient osx component.
  • Improved Process Management: Advanced logic has been implemented to manage processes more effectively, significantly reducing the risk of process leakage.
  • Enhanced Data Storage Logic: Data storage mechanisms have been refined for better performance and reliability.
  • Concurrency Control Improvements: We've upgraded the logic for concurrency control in the original components, leading to performance boosts.
  • Visualization Component: A new visualization component has been added for convenient monitoring of computational information.
  • Refined Logging: The logging system has been enhanced for more precise outputs, aiding in rapid anomaly detection.

Upgrades in the Python Part:

  • Reconstruction of roll_pair and egg_pair: These components now support serialization and partition methods controlled by the caller. Serialization safety is uniformly managed by the caller.
  • Automated Cleanup of Intermediate Tables: The issue of automatic cleaning for intermediate tables between federation and computing has been resolved, eliminating the need for extra operations by the caller.
  • Unified Configuration Control: A flexible configuration system is introduced, supporting direct pass-through, configuration files, and environment variables to cater to diverse requirements.
  • Client-Side PyPI Installation: Eggroll 3.0 supports easy installation via PyPI for clients.
  • Optimized Log Configuration: Callers can now customize log formats according to their needs.
  • Code Structure Refinement: The codebase has been streamlined for clarity, removing a substantial amount of redundant code.
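
The unified configuration item above mentions three sources: direct pass-through, configuration files, and environment variables. A minimal sketch of how such layering can work is shown below. The function name `resolve_config`, the `DEMO_` prefix, the JSON file format, and the precedence order (defaults, then file, then environment, then direct overrides) are assumptions for illustration, not Eggroll's documented behavior.

```python
import json
import os


def resolve_config(defaults, file_text=None, env=None, overrides=None,
                   env_prefix="DEMO_"):
    """Merge configuration layers; later layers win (assumed precedence)."""
    config = dict(defaults)

    # Layer 2: values parsed from a configuration file (JSON here).
    if file_text:
        config.update(json.loads(file_text))

    # Layer 3: environment variables such as DEMO_PARTITIONS=16.
    env = os.environ if env is None else env
    for key, raw in env.items():
        if not key.startswith(env_prefix):
            continue
        name = key[len(env_prefix):].lower()
        default = defaults.get(name)
        # Environment values are strings; coerce to numeric defaults' type.
        if isinstance(default, (int, float)) and not isinstance(default, bool):
            config[name] = type(default)(raw)
        else:
            config[name] = raw

    # Layer 4: values passed directly by the caller.
    if overrides:
        config.update(overrides)
    return config


settings = resolve_config(
    defaults={"log_level": "INFO", "partitions": 4},
    file_text='{"partitions": 8}',
    env={"DEMO_PARTITIONS": "16", "HOME": "/home/user"},
    overrides={"log_level": "DEBUG"},
)
print(settings)
```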

Eggroll 3.0 brings comprehensive enhancements in system performance, usability, and reliability with these significant updates in both the JVM and Python parts.

Easy Deploy

  • Supports installation of FATE via PyPI
