diff --git a/src/cargo/util/context/de.rs b/src/cargo/util/context/de.rs
index 11a2d463407..71af17c0516 100644
--- a/src/cargo/util/context/de.rs
+++ b/src/cargo/util/context/de.rs
@@ -1,4 +1,25 @@
-//! Support for deserializing configuration via `serde`
+//! Deserialization for converting [`ConfigValue`] instances to target types.
+//!
+//! The [`Deserializer`] type is the main driver of deserialization.
+//! The workflow is roughly:
+//!
+//! 1. [`GlobalContext::get()`] creates [`Deserializer`] and calls `T::deserialize()`
+//! 2. Then call type-specific deserialize methods as in normal serde deserialization.
+//!     - For primitives, `deserialize_*` methods look up [`ConfigValue`] instances
+//!       in [`GlobalContext`] and convert.
+//!     - Structs and maps are handled by [`ConfigMapAccess`].
+//!     - Sequences are handled by [`ConfigSeqAccess`],
+//!       which later uses [`ArrayItemDeserializer`] for each array item.
+//!     - [`Value`] is delegated to [`ValueDeserializer`] in `deserialize_struct`.
+//!
+//! The purpose of this workflow is to:
+//!
+//! - Retrieve the correct config value based on source location precedence
+//! - Provide richer error context showing where a config is defined
+//! - Provide a richer internal API to map to concrete config types
+//!   without touching underlying [`ConfigValue`] directly
+//!
+//! [`ConfigValue`]: CV
 
 use crate::util::context::value;
 use crate::util::context::{ConfigError, ConfigKey, GlobalContext};
diff --git a/src/cargo/util/context/mod.rs b/src/cargo/util/context/mod.rs
index 7a4831535ed..51578064fe4 100644
--- a/src/cargo/util/context/mod.rs
+++ b/src/cargo/util/context/mod.rs
@@ -11,15 +11,31 @@
 //!
 //! There are a variety of helper types for deserializing some common formats:
 //!
-//! - `value::Value`: This type provides access to the location where the
+//! - [`value::Value`]: This type provides access to the location where the
 //!   config value was defined.
-//! - `ConfigRelativePath`: For a path that is relative to where it is
+//! - [`ConfigRelativePath`]: For a path that is relative to where it is
 //!   defined.
-//! - `PathAndArgs`: Similar to `ConfigRelativePath`, but also supports a list
-//!   of arguments, useful for programs to execute.
-//! - `StringList`: Get a value that is either a list or a whitespace split
+//! - [`PathAndArgs`]: Similar to [`ConfigRelativePath`],
+//!   but also supports a list of arguments, useful for programs to execute.
+//! - [`StringList`]: Get a value that is either a list or a whitespace split
 //!   string.
 //!
+//! ## Config deserialization
+//!
+//! Cargo uses a two-layer deserialization approach:
+//!
+//! 1. **External sources → `ConfigValue`** ---
+//!    Configuration files, environment variables, and CLI `--config` arguments
+//!    are parsed into [`ConfigValue`] instances via [`ConfigValue::from_toml`].
+//!    These parsed results are stored in [`GlobalContext`].
+//!
+//! 2. **`ConfigValue` → Target types** ---
+//!    The [`GlobalContext::get`] method uses a [custom serde deserializer](Deserializer)
+//!    to convert [`ConfigValue`] instances to the caller's desired type.
+//!    Precedence between [`ConfigValue`] sources is resolved during retrieval
+//!    based on [`Definition`] priority.
+//!    See the top-level documentation of the [`de`] module for more.
+//!
 //! ## Map key recommendations
 //!
 //! Handling tables that have arbitrary keys can be tricky, particularly if it
@@ -40,14 +56,6 @@
 //! structs/maps, but if it is a struct or map, then it will not be able to
 //! read the environment variable due to ambiguity. (See `ConfigMapAccess` for
 //! more details.)
-//!
-//! ## Internal API
-//!
-//! Internally config values are stored with the `ConfigValue` type after they
-//! have been loaded from disk. This is similar to the `toml::Value` type, but
-//! includes the definition location. The `get()` method uses serde to
-//! translate from `ConfigValue` and environment variables to the caller's
-//! desired type.
 
 use crate::util::cache_lock::{CacheLock, CacheLockMode, CacheLocker};
 use std::borrow::Cow;
@@ -2157,6 +2165,7 @@ enum KeyOrIdx {
     Idx(usize),
 }
 
+/// Similar to [`toml::Value`] but includes the source location where it is defined.
 #[derive(Eq, PartialEq, Clone)]
 pub enum ConfigValue {
     Integer(i64, Definition),
diff --git a/src/cargo/util/context/value.rs b/src/cargo/util/context/value.rs
index 0f9f8fe7f36..c7bf232d46a 100644
--- a/src/cargo/util/context/value.rs
+++ b/src/cargo/util/context/value.rs
@@ -1,5 +1,6 @@
-//! Deserialization of a `Value` type which tracks where it was deserialized
-//! from.
+//! Deserialization of a [`Value`] type which tracks where it was deserialized from.
+//!
+//! ## Rationale for `Value`
 //!
 //! Often Cargo wants to report semantic error information or other sorts of
 //! error information about configuration keys but it also may wish to indicate
@@ -7,6 +8,31 @@
 //! debugging). The `Value` type here can be used to deserialize a `T` value
 //! from configuration, but also record where it was deserialized from when it
 //! was read.
+//!
+//! ## How `Value` deserialization works
+//!
+//! Deserializing `Value` is pretty special, and serde doesn't have built-in
+//! support for this operation. To implement this we extend serde's "data model"
+//! a bit. We configure deserialization of `Value` to basically only work with
+//! our one deserializer using configuration.
+//!
+//! We define that `Value` deserialization asks the deserializer for a very
+//! special [struct name](NAME) and [struct field names](FIELDS). In doing so,
+//! the deserializer will recognize this and synthesize a magical value for the
+//! `definition` field when we deserialize it. This protocol is how we're able
+//! to have a channel of information flowing from the configuration deserializer
+//! into the deserialization implementation here.
+//!
+//! You'll want to also check out the implementation of `ValueDeserializer` in
+//! the [`de`] module. Also note that the names below are intended to be invalid
+//! Rust identifiers to avoid conflicts with other valid structures.
+//!
+//! Finally the `definition` field is transmitted as a tuple of i32/string,
+//! which is effectively a tagged union of [`Definition`] itself. You should
+//! update both places here and in the impl of [`serde::de::MapAccess`] for
+//! `ValueDeserializer` when adding or modifying enum variants of [`Definition`].
+//!
+//! [`de`]: crate::util::context::de
 
 use crate::util::context::GlobalContext;
 use serde::de;
@@ -29,24 +55,6 @@ pub struct Value<T> {
 
 pub type OptValue<T> = Option<Value<T>>;
 
-// Deserializing `Value` is pretty special, and serde doesn't have built-in
-// support for this operation. To implement this we extend serde's "data model"
-// a bit. We configure deserialization of `Value` to basically only work with
-// our one deserializer using configuration.
-//
-// We define that `Value` deserialization asks the deserializer for a very
-// special struct name and struct field names. In doing so the deserializer will
-// recognize this and synthesize a magical value for the `definition` field when
-// we deserialize it. This protocol is how we're able to have a channel of
-// information flowing from the configuration deserializer into the
-// deserialization implementation here.
-//
-// You'll want to also check out the implementation of `ValueDeserializer` in
-// `de.rs`. Also note that the names below are intended to be invalid Rust
-// identifiers to avoid how they might conflict with other valid structures.
-// Finally the `definition` field is transmitted as a tuple of i32/string, which
-// is effectively a tagged union of `Definition` itself.
-
 pub(crate) const VALUE_FIELD: &str = "$__cargo_private_value";
 pub(crate) const DEFINITION_FIELD: &str = "$__cargo_private_definition";
 pub(crate) const NAME: &str = "$__cargo_private_Value";
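
The new `value.rs` docs say the `definition` field is transmitted as a tuple of i32/string that acts as a tagged union of `Definition`, and that both sides must be updated together when variants change. The following is a minimal, standard-library-only sketch of that idea, not Cargo's actual code. The simplified `Definition` enum, the tag numbers, and the `encode`/`decode` helpers are all hypothetical names chosen for illustration. The real implementation goes through serde and `ValueDeserializer`.

```rust
use std::path::PathBuf;

/// Simplified stand-in for Cargo's `Definition` (variant set is an assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Definition {
    Path(PathBuf),
    Environment(String),
    Cli(Option<PathBuf>),
}

// Hypothetical tag numbers; the encoder and decoder must agree on them.
const TAG_PATH: i32 = 0;
const TAG_ENV: i32 = 1;
const TAG_CLI: i32 = 2;

/// Producer side: flatten a `Definition` into an (i32, String) pair.
fn encode(def: &Definition) -> (i32, String) {
    match def {
        Definition::Path(p) => (TAG_PATH, p.to_string_lossy().into_owned()),
        Definition::Environment(name) => (TAG_ENV, name.clone()),
        Definition::Cli(Some(p)) => (TAG_CLI, p.to_string_lossy().into_owned()),
        // In this sketch an empty payload means "no path attached".
        Definition::Cli(None) => (TAG_CLI, String::new()),
    }
}

/// Consumer side: rebuild the `Definition` from the pair.
/// A new variant needs a new arm here and in `encode`.
fn decode(tag: i32, payload: String) -> Result<Definition, String> {
    match tag {
        TAG_PATH => Ok(Definition::Path(PathBuf::from(payload))),
        TAG_ENV => Ok(Definition::Environment(payload)),
        TAG_CLI if payload.is_empty() => Ok(Definition::Cli(None)),
        TAG_CLI => Ok(Definition::Cli(Some(PathBuf::from(payload)))),
        other => Err(format!("unexpected definition tag {other}")),
    }
}
```

Calling `decode` on the output of `encode` should give back the original value. An unknown tag yields an error, which is the failure you would expect if only one side were updated after adding a variant.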