[FLINK-36805][cdc-common] Add ConfigShade interface to support encryption of sensitive configuration items and provide a base64 encoding implementation #3829

Open: Jzjsnow (Contributor) wants to merge 2 commits into base: master

@Jzjsnow Jzjsnow commented Jan 2, 2025

Introduction

When a Flink CDC job is submitted as a pipeline, its configuration items are written in plaintext in the definition file. These include sensitive items such as the passwords used to connect to source and sink components (e.g., MySQL, Doris), which is a security risk. To avoid plaintext passwords, we provide an interface, ConfigShade, that developers can implement to supply their own decryption method.

We also provide a base64 implementation, both as an example of how to implement the interface and as a first way to keep plaintext passwords out of the definition file.

How to use

Using the base64 implementation as an example, the following shows how to use a configuration file with sensitive items encrypted:

  1. Add two new options, shade.identifier and shade.sensitive.keywords, to the pipeline section of the definition YAML file. They specify the encryption algorithm and the keywords of the sensitive items that are encrypted.
  2. Replace the plaintext of the sensitive items specified in shade.sensitive.keywords with the encrypted ciphertext.
  3. Submit a pipeline job with the new definition file.

Example definition file:

source:
  type: mysql
  name: source-database
  hostname: localhost
  port: 3306
  username: YWRtaW4=
  password: cGFzc3dvcmQx
  tables: replication.cluster
  server-id: 5400-5404
  server-time-zone: Asia/Shanghai

route:
  - source-table: replication.cluster
    sink-table: test.cluster
    description: sync table to one destination table

sink:
  type: doris
  name: sink-queue
  fenodes: localhost:8035
  username: cm9vdA==
  password: cGFzc3dvcmQy
  table.create.properties.light_schema_change: true
  table.create.properties.replication_num: 1

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2
  shade.identifier: base64
  shade.sensitive.keywords: password;username
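
The encoded values in the example definition file can be reproduced with the JDK's java.util.Base64. The following is a minimal sketch, assuming the base64 implementation uses standard base64 over UTF-8 bytes (the values above are consistent with that). The class name ShadeValueEncoder is hypothetical and is not part of this PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the PR: prepares base64 values for the definition file.
class ShadeValueEncoder {

    /** Encodes a plaintext config value as standard base64 over its UTF-8 bytes. */
    static String encode(String plaintext) {
        return Base64.getEncoder().encodeToString(plaintext.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // The plaintexts behind the username and password values in the example above.
        for (String value : new String[] {"admin", "password1", "root", "password2"}) {
            System.out.println(value + " -> " + encode(value));
        }
    }
}
```

For instance, encode("admin") yields YWRtaW4=, the value used for the source username in the example.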

How to customize the encryption algorithm

To use a user-defined encryption algorithm, the developer provides a dependency package containing a class that implements the ConfigShade interface:

/**
 * The interface that provides the ability to decrypt {@link
 * org.apache.flink.cdc.composer.definition}.
 */
public interface ConfigShade {
    /**
     * Initializes the custom instance using the pipeline configuration.
     *
     * <p>This method can be useful when decryption requires an external file (e.g. a key file)
     * defined in the pipeline configs.
     */
    default void initialize(Configuration pipelineConfig) throws Exception {}

    /**
     * The unique identifier of the current implementation, used to select the correct {@link
     * ConfigShade}.
     */
    String getIdentifier();

    /**
     * Decrypt the content.
     *
     * @param content The content to decrypt
     */
    String decrypt(String content);
}

getIdentifier() returns the unique identifier of the algorithm, which is the value to set in shade.identifier. decrypt(String content) decrypts the given ciphertext and returns the plaintext.
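
A custom implementation might look like the sketch below. It is illustrative only and is not code from this PR. To keep it compilable on its own, it declares a simplified stand-in for ConfigShade that omits initialize(Configuration). The class ReversedBase64ConfigShade, its identifier reversed-base64, and its toy scheme (reverse the string, then base64-decode) are all hypothetical. A real implementation would use a proper cipher and could load its key in initialize.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Simplified stand-in for the PR's ConfigShade interface so this sketch compiles alone.
// initialize(Configuration) is left out because it needs Flink CDC on the classpath.
interface ConfigShade {
    String getIdentifier();

    String decrypt(String content);
}

// Hypothetical toy algorithm: the stored value is a base64 string written backwards.
class ReversedBase64ConfigShade implements ConfigShade {

    @Override
    public String getIdentifier() {
        // The value a user would put in shade.identifier to select this implementation.
        return "reversed-base64";
    }

    @Override
    public String decrypt(String content) {
        // Undo the reversal first, then decode the base64 text back to the UTF-8 plaintext.
        String base64 = new StringBuilder(content).reverse().toString();
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ConfigShade shade = new ReversedBase64ConfigShade();
        // Build a sample ciphertext by reversing the base64 form of "password1".
        String cipher = new StringBuilder("cGFzc3dvcmQx").reverse().toString();
        System.out.println(shade.getIdentifier() + ": " + shade.decrypt(cipher));
    }
}
```

How such a class is packaged and discovered at runtime is defined by the PR itself and is not shown here.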

@Jzjsnow Jzjsnow changed the title [FLINK-36805][cdc-common] Add interface to support encryption of sensitive configuration items and provide base64 encoding implementation [FLINK-36805][cdc-common] Add interface to support encryption of sensitive configuration items and provide a base64 encoding implementation Jan 2, 2025
@Jzjsnow Jzjsnow changed the title [FLINK-36805][cdc-common] Add interface to support encryption of sensitive configuration items and provide a base64 encoding implementation [FLINK-36805][cdc-common] Add ConfigShade interface to support encryption of sensitive configuration items and provide a base64 encoding implementation Jan 2, 2025
@Jzjsnow Jzjsnow force-pushed the master-Add_support_for_configshade branch 4 times, most recently from fcb28d0 to 125d039 Compare January 10, 2025 06:45
jzjsnow added 2 commits January 17, 2025 11:33
…itive configuration items and provide base64 encoding implementation
…of sensitive configuration items and provide base64 encoding implementation
@Jzjsnow Jzjsnow force-pushed the master-Add_support_for_configshade branch from 125d039 to 592a81f Compare January 17, 2025 03:33