Initial Cloudberry code dump2. #12

Merged
merged 1 commit into from
Jul 3, 2023
2 changes: 2 additions & 0 deletions concourse/scripts/dumpdb.bash
@@ -7,6 +7,8 @@ INSTALL_DIR=${INSTALL_DIR:-/usr/local/cloudberry-db-devel}
source $INSTALL_DIR/greenplum_path.sh
source ./gpdb_src/gpAux/gpdemo/gpdemo-env.sh

# ignore ERR trap
gpstop -qa || :
gpstart -a
sleep 60
./gpdb_src/concourse/scripts/ic_start_fts_once.bash
2 changes: 2 additions & 0 deletions contrib/bloom/blinsert.c
@@ -176,6 +176,8 @@ blbuildempty(Relation index)
* XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record. Therefore, we need
* this even when wal_level=minimal.
*/
PageEncryptInplace(metapage, INIT_FORKNUM,
BLOOM_METAPAGE_BLKNO);
PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
smgrwrite(index->rd_smgr, INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
(char *) metapage, true);
2 changes: 1 addition & 1 deletion contrib/bloom/blscan.c
@@ -97,7 +97,7 @@ blgetbitmap(IndexScanDesc scan, Node **bmNodeP)
if (*bmNodeP == NULL)
{
/* XXX should we use less than work_mem for this? */
tbm = tbm_create(work_mem * 1024L, NULL);
tbm = tbm_create(work_mem * 1024L, scan->dsa);
*bmNodeP = (Node *) tbm;
}
else if (!IsA(*bmNodeP, TIDBitmap))
2 changes: 1 addition & 1 deletion deploy/cbdb_deploy.sh
@@ -55,7 +55,7 @@ function cbdb_build() {

#do compile configuration
echo "[CBDB build] start to init configuration for code compile..."
CFLAGS=-O0 CXXFLAGS='-O0 -std=c++14' ./configure --prefix=${install_dir}/cbdb --enable-debug --enable-cassert --enable-tap-tests --with-gssapi --with-libxml --with-quicklz --with-pythonsrc-ext
CFLAGS=-O0 CXXFLAGS='-O0 -std=c++14' ./configure --prefix=${install_dir}/cbdb --enable-debug --enable-cassert --enable-tap-tests --with-gssapi --with-libxml --with-quicklz --with-pythonsrc-ext --with-openssl

#do compile
echo "[CBDB build] start to compile binary file..."
125 changes: 107 additions & 18 deletions doc/src/sgml/config.sgml
@@ -1542,19 +1542,31 @@ include_dir 'conf.d'
mechanism is used.
</para>
<para>
The command must print the passphrase to the standard output and exit
with code 0. In the parameter value, <literal>%p</literal> is
replaced by a prompt string. (Write <literal>%%</literal> for a
literal <literal>%</literal>.) Note that the prompt string will
probably contain whitespace, so be sure to quote adequately. A single
newline is stripped from the end of the output if present.
The command must print the passphrase to the standard output
and exit with code 0. It can prompt from the terminal if
<option>--authprompt</option> is used. In the parameter
value, <literal>%R</literal> is replaced by a file descriptor
number opened to the terminal that started the server. A file
descriptor is only available if enabled at server start via
<option>-R</option>. If <literal>%R</literal> is specified and
no file descriptor is available, the server will not start. Value
<literal>%p</literal> is replaced by a pre-defined prompt string.
(Write <literal>%%</literal> for a literal <literal>%</literal>.)
Note that the prompt string will probably contain whitespace,
so be sure to quote its use adequately. Newlines are stripped
from the end of the output if present.
</para>

<para>
The command does not actually have to prompt the user for a
passphrase. It can read it from a file, obtain it from a keychain
facility, or similar. It is up to the user to make sure the chosen
mechanism is adequately secure.
Sample scripts can be found in
<filename>$SHAREDIR/auth_commands</filename>,
where <literal>$SHAREDIR</literal> means the
<productname>PostgreSQL</productname> installation's shared-data
directory, often <filename>/usr/local/share/postgresql</filename>
(use <command>pg_config --sharedir</command> to determine it if
you're not sure).
</para>

<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line.
@@ -1576,10 +1588,12 @@ include_dir 'conf.d'
parameter is off (the default), then
<varname>ssl_passphrase_command</varname> will be ignored during a
reload and the SSL configuration will not be reloaded if a passphrase
is needed. That setting is appropriate for a command that requires a
TTY for prompting, which might not be available when the server is
running. Setting this parameter to on might be appropriate if the
passphrase is obtained from a file, for example.
is needed. This setting is appropriate for a command that requires a
terminal for prompting, which will likely not be available when the server is
running. (<option>--authprompt</option> closes the terminal file
descriptor soon after server start.) Setting this parameter to on
might be appropriate, for example, if the passphrase is obtained
from a file.
</para>
<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
@@ -2775,7 +2789,7 @@ include_dir 'conf.d'
Note that changing <varname>wal_level</varname> to
<literal>minimal</literal> makes any base backups taken before
unavailable for archive recovery and standby server, which may
lead to data loss.
lead to data loss. Cluster file encryption also does not support a
<varname>wal_level</varname> of <literal>minimal</literal>.
</para>
<para>
In <literal>logical</literal> level, the same information is logged as
@@ -3129,9 +3143,10 @@ include_dir 'conf.d'
</para>

<para>
If data checksums are enabled, hint bit updates are always WAL-logged
and this setting is ignored. You can use this setting to test how much
extra WAL-logging would occur if your database had data checksums
If data checksums or cluster file encryption is enabled,
hint bit updates are always WAL-logged and this setting is
ignored. You can use this setting to test how much extra
WAL-logging would occur if your database had data checksums
enabled.
</para>

@@ -8048,6 +8063,64 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</variablelist>
</sect1>

<sect1 id="runtime-config-encryption">
<title>Cluster File Encryption</title>

<variablelist>
<varlistentry id="guc-cluster-key-command" xreflabel="cluster_key_command">
<term><varname>cluster_key_command</varname> (<type>string</type>)
<indexterm>
<primary><varname>cluster_key_command</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
This option specifies an external command to obtain the cluster-level
key for cluster file encryption during server initialization and
server start.
</para>
<para>
The command must print the cluster key to the standard
output as 64 hexadecimal characters, and exit with code 0.
The command can prompt for the passphrase or PIN from the
terminal if <option>--authprompt</option> is used. In the
parameter value, <literal>%R</literal> is replaced by a file
descriptor number opened to the terminal that started the server.
A file descriptor is only available if enabled at server start
via <option>-R</option>. If <literal>%R</literal> is specified
and no file descriptor is available, the server will not start.
Value <literal>%p</literal> is replaced by a pre-defined
prompt string. Value <literal>%d</literal> is replaced by the
directory containing the keys; this is useful if the command
must create files with the keys, e.g., to store a cluster-level
key encrypted by a key stored in a hardware security module.
(Write <literal>%%</literal> for a literal <literal>%</literal>.)
Note that the prompt string will probably contain whitespace,
so be sure to quote its use adequately. Newlines are stripped
from the end of the output if present.
</para>

<para>
A sample script can be found in
<filename>$SHAREDIR/auth_commands</filename>,
where <literal>$SHAREDIR</literal> means the
<productname>PostgreSQL</productname> installation's shared-data
directory, often <filename>/usr/local/share/postgresql</filename>
(use <command>pg_config --sharedir</command> to determine it if
you're not sure).
</para>

<para>
This parameter can only be set by
<application>initdb</application>, in the
<filename>postgresql.conf</filename> file, or on the server
command line.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>
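
Reviewer note: the placeholder rules documented for <varname>cluster_key_command</varname> (`%p`, `%d`, and `%%`) can be illustrated with a small shell sketch. This is only an illustration under stated assumptions: the function name `expand_key_command` and the sample arguments are hypothetical and not part of this patch, the server performs the real substitution internally, and `%R` handling is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): expand the documented
# placeholders in a command template.
#   %p -> prompt string, %d -> key directory, %% -> literal %
# Any other character, including an unhandled %R, is copied unchanged.
expand_key_command() {
    template=$1
    prompt=$2
    keydir=$3
    out=
    while [ -n "$template" ]; do
        case $template in
            %%*) out="$out%";       template=${template#'%%'} ;;
            %p*) out="$out$prompt"; template=${template#'%p'} ;;
            %d*) out="$out$keydir"; template=${template#'%d'} ;;
            *)
                # Copy the first character and advance by one.
                rest=${template#?}
                out="$out${template%"$rest"}"
                template=$rest
                ;;
        esac
    done
    printf '%s\n' "$out"
}
```

For example, `expand_key_command 'get_key.sh "%p" %d' 'Enter passphrase: ' /data/pg_cryptokeys` shows why the documentation asks for quoting around `%p`: the prompt string contains whitespace.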

<sect1 id="runtime-config-client">
<title>Client Connection Defaults</title>

@@ -10068,6 +10141,22 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>

<varlistentry id="guc-file-encryption-method" xreflabel="file_encryption_method">
<term><varname>file_encryption_method</varname> (<type>boolean</type>)
<indexterm>
<primary>Cluster file encryption method</primary>
</indexterm>
</term>
<listitem>
<para>
Reports the cluster file
encryption method. See <xref
linkend="app-initdb-cluster-key-command"/> for more
information.
</para>
</listitem>
</varlistentry>

<varlistentry id="guc-data-directory-mode" xreflabel="data_directory_mode">
<term><varname>data_directory_mode</varname> (<type>integer</type>)
<indexterm>
149 changes: 149 additions & 0 deletions doc/src/sgml/database-encryption.sgml
@@ -0,0 +1,149 @@
<!-- doc/src/sgml/database-encryption.sgml -->

<chapter id="cluster-file-encryption">
<title>Cluster File Encryption</title>

<indexterm zone="cluster-file-encryption">
<primary>Cluster File Encryption</primary>
</indexterm>

<para>
The purpose of cluster file encryption is to prevent users with read
access on the directories used to store database files and write-ahead
log files from being able to access the data stored in those files.
For example, when using cluster file encryption, users who have read
access to the cluster directories for backup purposes will not be able
to decrypt the data stored in these files. Read-only access for a group
of users can be enabled using the <application>initdb</application>
<option>--allow-group-access</option> option. Cluster file encryption
also provides data-at-rest security, protecting users from data loss
should the physical storage media be stolen or improperly erased before
disposal.
</para>

<para>
Cluster file encryption does not protect against unauthorized file
system writes. Such writes can allow data decryption if used to weaken
the system's security and the weakened system is later supplied with
the externally-stored cluster encryption key. This also does not always
detect if users with write access remove or modify database files.
</para>

<para>
This also does not protect against users who have read access to database
process memory because all in-memory data pages and data encryption keys
are stored unencrypted in memory. Therefore, an attacker who is able
to read memory can read the data encryption keys and decrypt the entire
cluster. The Postgres operating system user and the operating system
administrator, e.g., the <literal>root</literal> user, have such access.
</para>

<sect1 id="cluster-encryption-keys">
<title>Keys</title>

<para>
Cluster file encryption uses two levels of encryption &mdash; an upper
key which encrypts lower-level keys. The upper-level key is often
referred to as a Key Encryption Key (<acronym>KEK</acronym>). This key
is <emphasis>not</emphasis> stored in the file system, but provided at
<command>initdb</command> time and each time the server is started. This
key can be easily changed via <command>pg_alterckey</command> without
requiring any changes to the data files or <acronym>WAL</acronym>
files.
</para>

<para>
The lower-level keys are data encryption keys, specifically for relations
and <acronym>WAL</acronym>. The relation key is used to encrypt database
heap and index files. The WAL key is used to encrypt write-ahead log
(WAL) files. Two different keys are used so that primary and standby
servers can use different relation keys, but the same WAL key, so that
these keys can (in a future release) be rotated by switching the
primary to the standby and then changing the WAL key. Eventually,
it will be possible to add encryption to non-encrypted clusters by creating
encrypted replicas and switching over to them.
</para>

<para>
Postgres stores the data encryption (lower-level) keys in the data
directory encrypted (wrapped) by the key encryption (upper-level) key.
Though the data encryption keys technically exist in the file system,
the key encryption key does not, so the data encryption keys are
securely stored. Data encryption keys are used to securely encrypt
other database files.
</para>
</sect1>

<sect1 id="cluster-encryption-initialization">
<title>Initialization</title>

<para>
Cluster file encryption is enabled when
<productname>PostgreSQL</productname> is built
with <literal>--with-openssl</literal> and <xref
linkend="app-initdb-cluster-key-command"/> is specified
during <command>initdb</command>. The cluster key
provided by the <option>--cluster-key-command</option>
option during <command>initdb</command> and the one generated
by <xref linkend="guc-cluster-key-command"/> in the
<filename>postgresql.conf</filename> must match for the database
cluster to start. Note that the cluster key command
passed to <command>initdb</command> must return a key of
64 hexadecimal characters. For example:
<programlisting>
initdb -D dbname --cluster-key-command='ckey_passphrase.sh'
</programlisting>
Cluster file encryption does not support a <varname>wal_level</varname>
of <literal>minimal</literal>.
</para>
</sect1>
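
Reviewer note: the Initialization section requires the cluster key command to return 64 hexadecimal characters. A wrapper script could check that format before handing the key to <command>initdb</command> or the server. The sketch below is an assumption about how such a check might look; the function name `is_valid_cluster_key` is hypothetical and does not exist in this patch.

```shell
#!/bin/sh
# Hypothetical check (not part of the patch): succeed only if the argument
# is exactly 64 hexadecimal characters, the format the documentation
# requires for the cluster key.
is_valid_cluster_key() {
    case $1 in
        # Reject anything containing a non-hexadecimal character.
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    # Accept only the exact documented length.
    [ "${#1}" -eq 64 ]
}
```

A key command wrapper could call this on its own output and exit with a non-zero code when the check fails, so that a malformed key is reported by the script rather than at server start.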

<sect1 id="cluster-encryption-operation">
<title>Operation</title>

<para>
During the <command>initdb</command> process, if
<option>--cluster-key-command</option> is specified, two data-level
encryption keys are created. These two keys are then encrypted with
the key encryption key (KEK) supplied by the cluster key command before
being stored in the database directory. The key or passphrase that
derives the key must be supplied from the terminal or stored in a
trusted key store, such as key vault software or a hardware security
module.
</para>

<para>
If the <productname>PostgreSQL</productname> server has
been initialized to require a cluster key, each time the
server starts the <filename>postgresql.conf</filename>
<varname>cluster_key_command</varname> command will be executed
and the cluster key retrieved. The data encryption keys in the
<filename>pg_cryptokeys</filename> directory will then be decrypted
using the supplied key and integrity-checked to ensure that the key matches the
initdb-supplied key. (If this check fails, the server will refuse
to start.) The cluster encryption key will then be removed from
system memory. The decrypted data encryption keys will remain in
shared memory until the server is stopped.
</para>

<para>
The data encryption keys are randomly generated and can be 128, 192,
or 256-bits in length, depending on whether <literal>AES128</literal>,
<literal>AES192</literal>, or <literal>AES256</literal> is specified.
They are encrypted by the key encryption key (KEK) using Advanced
Encryption Standard (<acronym>AES256</acronym>) encryption in Key
Wrap Padded Mode, which also provides KEK authentication; see <ulink
url="https://tools.ietf.org/html/rfc5649">RFC 5649</ulink>. While
128-bit encryption is sufficient for most sites, 256-bit encryption
is thought to be more resistant to future quantum cryptographic attacks.
</para>

<para>
If you prefer to create the random keys on your own, you can create
an empty directory with a <filename>pg_cryptokeys/live</filename>
subdirectory, generate the keys there using your tools, and use the
<command>initdb</command> <option>--copy-encryption-keys</option>
option to copy those keys into the newly-created cluster.
</para>
</para>
</sect1>
</chapter>
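
Reviewer note: the chapter says the cluster key may be derived from a passphrase and must be printed as 64 hexadecimal characters. One way to sketch that derivation is to hash the passphrase with SHA-256, whose digest is exactly 64 hex characters. This is an assumption for illustration, not the patch's shipped sample script: the function name `derive_cluster_key` is hypothetical, it relies on the coreutils `sha256sum` utility being installed, and a single unsalted hash is a weak way to stretch a passphrase.

```shell
#!/bin/sh
# Hypothetical derivation (not part of the patch): turn a passphrase into
# a 64-hex-character key by hashing it with SHA-256.
derive_cluster_key() {
    # printf avoids the trailing newline that echo would add to the input;
    # cut keeps only the digest from sha256sum's "digest  -" output.
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}
```

A real key command would read the passphrase from the terminal or a key store rather than from an argument, since arguments are visible in the process list.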
1 change: 1 addition & 0 deletions doc/src/sgml/filelist.sgml
@@ -49,6 +49,7 @@
<!ENTITY wal SYSTEM "wal.sgml">
<!ENTITY logical-replication SYSTEM "logical-replication.sgml">
<!ENTITY jit SYSTEM "jit.sgml">
<!ENTITY database-encryption SYSTEM "database-encryption.sgml">

<!-- programmer's guide -->
<!ENTITY bgworker SYSTEM "bgworker.sgml">
2 changes: 1 addition & 1 deletion doc/src/sgml/installation.sgml
@@ -1004,7 +1004,7 @@ build-postgresql:
<listitem>
<para>
Build with support for <acronym>SSL</acronym> (encrypted)
connections. The only <replaceable>LIBRARY</replaceable>
connections and cluster file encryption. The only <replaceable>LIBRARY</replaceable>
supported is <option>openssl</option>. This requires the
<productname>OpenSSL</productname> package to be installed.
<filename>configure</filename> will check for the required
13 changes: 13 additions & 0 deletions doc/src/sgml/monitoring.sgml
@@ -1318,6 +1318,19 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
<entry><literal>DataFileWrite</literal></entry>
<entry>Waiting for a write to a relation data file.</entry>
</row>
<row>
<entry><literal>KeyFileRead</literal></entry>
<entry>Waiting for a read of the wrapped data encryption keys.</entry>
</row>
<row>
<entry><literal>KeyFileWrite</literal></entry>
<entry>Waiting for a write of the wrapped data encryption keys.</entry>
</row>
<row>
<entry><literal>KeyFileSync</literal></entry>
<entry>Waiting for changes to the wrapped data encryption keys to reach
durable storage.</entry>
</row>
<row>
<entry><literal>LockFileAddToDataDirRead</literal></entry>
<entry>Waiting for a read while adding a line to the data directory lock