JSTEP 7

Tatu Saloranta edited this page Jun 3, 2024 · 17 revisions

New DataTypeFeature configuration options (Jackson 2.x)

Author

Tatu Saloranta (@cowtowncoder)

Version history

  • 2024-06-02: Update based on state for 2.17
  • 2022-03-20: First parts implemented for 2.14
  • 2022-01-16: The first draft version

Status

Partial implementation (2 out of 3 added) as of Jackson 2.18.

Related

Related JSTEPs:

  • JSTEP-3 Refers to JsonNodeFeature (was originally included there)
  • JSTEP-5 discusses Unification of Date/Time handling

Background

Over time, various XxxFeature on/off options have proven useful and popular with developers: they are easy to set, change and (for the most part) understand.

The original sets of "Features" were configurable at the Streaming API and Databind levels, including:

  • JsonParser.Feature / JsonGenerator.Feature / JsonFactory.Feature for Streaming API
  • MapperFeature / SerializationFeature / DeserializationFeature for databind

For the Streaming API, a further split between format-specific (often JSON-specific) and generic (across all or most formats) features resulted in:

  • JsonParser.Feature split into generic StreamReadFeature and XxxReadFeature (like JsonReadFeature)
  • JsonGenerator.Feature split into generic StreamWriteFeature and XxxWriteFeature (like JsonWriteFeature).

But while this split allowed better support for format-specific features (via the Streaming API), there is no similar mechanism for more granular configuration of datatype-specific features. This has led to the inclusion of some "too [datatype] specific" features at the databind level; for example:

  • DeserializationFeature.READ_ENUMS_USING_TO_STRING
  • DeserializationFeature.READ_DATE_TIMESTAMPS_AS_NANOSECONDS
  • SerializationFeature.WRITE_DATES_AS_TIMESTAMPS
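
To make the concern concrete, here is a minimal sketch of how one of these enum-only settings is enabled through the general-purpose DeserializationFeature today. It assumes jackson-databind 2.x on the classpath; the Level enum and the class name are hypothetical and exist only for illustration.

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;

class LegacyEnumFeatureExample {
    // Hypothetical enum whose toString() differs from name()
    enum Level {
        LOW, HIGH;

        @Override
        public String toString() {
            return name().toLowerCase(Locale.ROOT);
        }
    }

    // An enum-specific switch, configured via the general DeserializationFeature
    static final ObjectMapper MAPPER = JsonMapper.builder()
            .enable(DeserializationFeature.READ_ENUMS_USING_TO_STRING)
            .build();

    static Level readLevel(String json) {
        try {
            return MAPPER.readValue(json, Level.class);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Matches on toString() ("high") rather than name() ("HIGH")
        System.out.println(readLevel("\"high\"") == Level.HIGH);
    }
}
```

The setting affects only Enum values, yet it sits next to options that apply to every type.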

Such configuration goes against the idea that these Features should be cross-cutting across dataformats AND datatypes. But it is worth noting that there is a need for such configuration, at some other granularity. This led to the idea of "datatype-specific" features, to cover up to 3 initial datatypes:

  1. JsonNode (Tree Model) configuration -- since it differs a lot from POJO configuration, and is difficult to configure
  2. Enum configuration -- a few entries already leaked into generic features, and a few configuration aspects missing
  3. Date/Time configuration -- similar to Enums, some configurability exists in general features, as well as via @JsonFormat, but there's need for more.

Approach

There are a few things to consider with respect to the general idea. For example:

  • Would a single DataTypeFeature enum suffice? Given that there are multiple general "datatypes", each with its own settings, not really.
    • So, need multiple concrete Feature Enums
    • But would like to avoid need to add all relevant plumbing -- would prefer general-purpose extension mechanism
  • How similar should configuration interface (API) be compared to, say, DeserializationFeature?
    • Seems like usage should be very similar, perhaps almost identical
    • Possible to implement if we make DataTypeFeature an interface that actual Feature enum implements
  • Do these features need to be per-call, or per-mapper?
    • Since some of the settings should be per-call, it seems necessary to make them ALL per-call
    • Needs to be considered when adding new feature entries: we should avoid adding Features that do not work on per-call scope
  • Should we support pluggable, per-datatype-module way of adding DataTypeFeatures?
    • Ideally that would be great, but coordination of new types seems difficult (depending on mechanism)
    • Would likely lead to worse issues wrt. cross-version (different minor version between Datatype and Core modules)
    • At least initially the plan is NOT to support anything other than DataTypeFeatures defined by Databind module itself
  • Should there be 3-state definition -- true, false, DEFAULT -- or 2-state true, false? Former has benefit of allowing distinction between Default value and explicit "true" or "false".
    • Keeping track of 3 states adds some complexity
    • Since we need to consider both existing global defaults and possible per-call overrides, it seems necessary that we can differentiate between explicit and default values.

So, to summarize:

  1. We would have DataTypeFeature general interface for each actual feature (say, JsonNodeFeature) to implement: this interface mostly exists to keep Databind API simple and does not affect users directly (no functionality to access through it)
  2. There will be multiple Enum subtypes of DataTypeFeature, and more can (and likely will) be added over future Jackson versions
  3. Datatype Modules cannot define additional subtypes: these features will relate to either somewhat abstract/general types (Date/Time), or JDK- (Enum) and Jackson-specific (JsonNode) types
  4. We should keep track of difference between explicitly set true/false vs. default setting
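
The four points above can be sketched with plain JDK types. This is an illustrative model only, not Jackson's actual implementation: the names DatatypeFeatureSketch, Feature, NodeFeature and FeatureSet are hypothetical, and a HashMap stands in for whatever compact representation a real implementation would use.

```java
import java.util.HashMap;
import java.util.Map;

class DatatypeFeatureSketch {
    /** Hypothetical general interface: exists so one API can accept any feature enum. */
    interface Feature {
        boolean enabledByDefault();
    }

    /** One concrete enum per datatype; entry names mirror ones used on this page. */
    enum NodeFeature implements Feature {
        READ_NULL_PROPERTIES(true),
        WRITE_PROPERTIES_SORTED(false);

        private final boolean enabledByDefault;

        NodeFeature(boolean enabledByDefault) {
            this.enabledByDefault = enabledByDefault;
        }

        @Override
        public boolean enabledByDefault() {
            return enabledByDefault;
        }
    }

    /** Immutable set of explicit settings; a missing entry means "use the default". */
    static final class FeatureSet {
        private final Map<Feature, Boolean> explicit;

        private FeatureSet(Map<Feature, Boolean> explicit) {
            this.explicit = explicit;
        }

        static FeatureSet defaults() {
            return new FeatureSet(new HashMap<>());
        }

        /** Returns a copy with one override, so a per-call change never touches the base. */
        FeatureSet with(Feature feature, boolean state) {
            Map<Feature, Boolean> copy = new HashMap<>(explicit);
            copy.put(feature, state);
            return new FeatureSet(copy);
        }

        /** Third state: null means neither true nor false was set explicitly. */
        Boolean explicitState(Feature feature) {
            return explicit.get(feature);
        }

        boolean isEnabled(Feature feature) {
            Boolean state = explicit.get(feature);
            return (state != null) ? state : feature.enabledByDefault();
        }
    }

    public static void main(String[] args) {
        FeatureSet perMapper = FeatureSet.defaults();
        FeatureSet perCall = perMapper.with(NodeFeature.READ_NULL_PROPERTIES, false);
        System.out.println(perMapper.isEnabled(NodeFeature.READ_NULL_PROPERTIES));
        System.out.println(perCall.isEnabled(NodeFeature.READ_NULL_PROPERTIES));
        System.out.println(perMapper.explicitState(NodeFeature.READ_NULL_PROPERTIES));
    }
}
```

Returning a new FeatureSet from with() is one way to keep per-call overrides from leaking into the per-mapper defaults, and the nullable explicitState() is the simplest way to expose the explicit-versus-default distinction from point 4.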

Current state (implementation)

General support/scaffolding

DatatypeFeature was added in Jackson 2.14 as the basic scaffolding to help implement generic handling for new features:

  • databind#3405: Create DataTypeFeature abstraction (for JSTEP-7) with placeholder features
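
As a rough illustration of how this scaffolding surfaces to users, the sketch below sets a feature once on the mapper builder and then overrides it for a single call through ObjectWriter. It assumes jackson-databind 2.14 or later; the class and method names are hypothetical.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.cfg.JsonNodeFeature;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.IOException;
import java.io.UncheckedIOException;

class DatatypeFeatureScopes {
    static String[] writeBothWays() {
        // Per-mapper: explicitly enabled (which is also the default state)
        ObjectMapper mapper = JsonMapper.builder()
                .enable(JsonNodeFeature.WRITE_NULL_PROPERTIES)
                .build();

        ObjectNode node = mapper.createObjectNode();
        node.put("a", 1);
        node.putNull("b");

        try {
            String perMapper = mapper.writeValueAsString(node);
            // Per-call: the derived writer overrides the mapper-level setting
            String perCall = mapper.writer()
                    .without(JsonNodeFeature.WRITE_NULL_PROPERTIES)
                    .writeValueAsString(node);
            return new String[] { perMapper, perCall };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String json : writeBothWays()) {
            System.out.println(json);
        }
    }
}
```

With the mapper-level setting the null-valued property "b" is written; with the per-call override it is omitted.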

EnumFeature

EnumFeature was added in Jackson 2.15. Features implemented so far are:

  • DB#2536: Add EnumFeature.READ_ENUM_KEYS_USING_INDEX to work with existing "WRITE_ENUM_KEYS_USING_INDEX" (2.15)
  • DB#3053: Allow serializing enums to lowercase (EnumFeature.WRITE_ENUMS_TO_LOWERCASE) (2.15)
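
A minimal usage sketch for the lowercase option, assuming jackson-databind 2.15 or later. The Color enum and class name are hypothetical. Pairing the option with MapperFeature.ACCEPT_CASE_INSENSITIVE_ENUMS is a choice made here so that the lowercase output can be read back; it is not something this page prescribes.

```java
import com.fasterxml.jackson.databind.MapperFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.cfg.EnumFeature;
import com.fasterxml.jackson.databind.json.JsonMapper;

import java.io.IOException;
import java.io.UncheckedIOException;

class EnumFeatureExample {
    // Hypothetical enum used only for this example
    enum Color { RED, DARK_BLUE }

    static final ObjectMapper MAPPER = JsonMapper.builder()
            .enable(EnumFeature.WRITE_ENUMS_TO_LOWERCASE)
            // Assumption: case-insensitive matching is wanted for the round trip
            .enable(MapperFeature.ACCEPT_CASE_INSENSITIVE_ENUMS)
            .build();

    static String write(Color color) {
        try {
            return MAPPER.writeValueAsString(color);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Color read(String json) {
        try {
            return MAPPER.readValue(json, Color.class);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = write(Color.DARK_BLUE);
        System.out.println(json);
        System.out.println(read(json));
    }
}
```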

JsonNodeFeature

JsonNodeFeature was added in Jackson 2.14. This feature is cleaved off of JSTEP-3 and is the one closest to actual implementation. Features implemented so far are:

  • Null skipping:
    • For ObjectNode properties:
      • READ_NULL_PROPERTIES (default: true) -- are null valued properties represented as NullNodes in resulting ObjectNode or skipped? databind#3421 (2.14)
      • WRITE_NULL_PROPERTIES (default: true) -- are null valued fields in ObjectNode written out as JSON or skipped? databind#3476 (2.14)
  • Numeric, floating-point/decimal
    • STRIP_TRAILING_BIGDECIMAL_ZEROES (default: true) -- do we force dropping of trailing zeroes for BigDecimal valued nodes? databind#3651 (2.15)
    • FAIL_ON_NAN_TO_BIG_DECIMAL_COERCION (default: false) -- do we fail on attempts to coerce NaN values to BigDecimal, or silently fall back to DoubleNode with Double.NaN? databind#4194 (2.17)
  • Property sorting (Similar to MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, but for ObjectNodes)
    • WRITE_PROPERTIES_SORTED (default: false) - databind#3965 (2.16)
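
The following sketch combines two of the implemented options at mapper level: skipping null properties on read, and sorting properties on write. It assumes jackson-databind 2.16 or later (the version listed for WRITE_PROPERTIES_SORTED); the class and method names are hypothetical.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.cfg.JsonNodeFeature;
import com.fasterxml.jackson.databind.json.JsonMapper;

import java.io.IOException;
import java.io.UncheckedIOException;

class JsonNodeFeatureExample {
    static final ObjectMapper MAPPER = JsonMapper.builder()
            // Drop null-valued properties instead of creating NullNodes
            .disable(JsonNodeFeature.READ_NULL_PROPERTIES)
            // Write ObjectNode properties in sorted order
            .enable(JsonNodeFeature.WRITE_PROPERTIES_SORTED)
            .build();

    static JsonNode read(String json) {
        try {
            return MAPPER.readTree(json);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String write(JsonNode node) {
        try {
            return MAPPER.writeValueAsString(node);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        JsonNode tree = read("{\"b\":2,\"skipped\":null,\"a\":1}");
        System.out.println(tree.has("skipped"));
        System.out.println(write(tree));
    }
}
```

Here the "skipped" property never reaches the tree, and the output lists "a" before "b" even though the input had them in the opposite order.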

Features proposed for implementation:

  • Null skipping:
    • For ArrayNode elements:
      • READ_NULL_ELEMENTS (default: true) -- are null elements in JSON arrays represented as NullNodes in resulting ArrayNode or skipped?
      • WRITE_NULL_ELEMENTS (default: true) -- are null elements in ArrayNode written out as JSON or skipped?
  • Property sorting (Similar to MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, but for ObjectNodes)
    • SORT_KEYS_ON_READ (default: false) - use TreeMap for ObjectNode when reading
  • CONVERT_POJOS_FULLY (default: ?) -- when converting values (mapper.convertValue(), treeToValue()), are opaque values contained in POJONode:
    1. Serialized explicitly by matching serializer (true) -- which will essentially transform it into non-opaque value (like Map, List etc)
    2. Written out as "writePOJO", which in case of TokenBuffer will retain opaque value exactly as is
  • Coercion/Leniency: while 2.12 added "Coercion Config", it does not work well with JsonNode. So how about:
    • LENIENT_SCALAR_CONVERSION (default: false?): do we allow more speculative coercion, like boolean from int (zero -> false, otherwise true)
    • or, maybe better yet: ALLOW_LENIENT_NUMBERS, ALLOW_LENIENT_BOOLEANS etc? We can afford granularity
  • Numeric, floating-point/decimal
    • Something about forcing use of BigDecimal? There is DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS already, but should we have something separate?
  • Merging: instead of relying on configOverrides, allow preventing merging
    • ALLOW_ARRAY_MERGE (default: true)
    • ALLOW_OBJECT_MERGE (default: true)
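
Since the ArrayNode options above are only proposed, the effect of skipping null elements currently has to be achieved by hand. The sketch below is one possible manual approach, not the proposed implementation: the helper name withoutNullElements is hypothetical, and it filters only the top-level elements of the given array (nested arrays are left as they are).

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;

import java.io.IOException;
import java.io.UncheckedIOException;

class NullElementWorkaround {
    /** Returns a new ArrayNode holding the non-null elements of the source, in order. */
    static ArrayNode withoutNullElements(ArrayNode source) {
        ArrayNode copy = JsonNodeFactory.instance.arrayNode();
        for (JsonNode element : source) {
            if (!element.isNull()) {
                copy.add(element);
            }
        }
        return copy;
    }

    static ArrayNode parseArray(String json) {
        try {
            return (ArrayNode) new ObjectMapper().readTree(json);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ArrayNode filtered = withoutNullElements(parseArray("[1,null,2]"));
        System.out.println(filtered);
    }
}
```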

DateTimeFeature (not yet implemented)

Last but not least, this feature would control default settings for the following Date/Time types:

  1. "Classic"/Legacy JDK types -- java.util.Date, java.util.Calendar
  2. Java 8 Date/Time API
  3. Joda Date/Time

Note that JSTEP-5 discusses aspects of Date/Time handling unification and may overlap here.

There are no concrete plans for implementation as of Jackson 2.18 development.
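
In the meantime, defaults for the "classic" types are still controlled through general features such as SerializationFeature.WRITE_DATES_AS_TIMESTAMPS. A minimal sketch with java.util.Date, assuming jackson-databind 2.x and no extra modules; the class and method names are hypothetical.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.json.JsonMapper;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Date;

class LegacyDateConfigExample {
    // Default settings: java.util.Date is written as a numeric timestamp
    static final ObjectMapper NUMERIC = JsonMapper.builder().build();

    // Date handling switched to a textual form via a general SerializationFeature
    static final ObjectMapper TEXTUAL = JsonMapper.builder()
            .disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
            .build();

    static String write(ObjectMapper mapper, Date date) {
        try {
            return mapper.writeValueAsString(date);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(write(NUMERIC, epoch));
        System.out.println(write(TEXTUAL, epoch));
    }
}
```

The first mapper writes the epoch as the number 0, while the second writes an ISO-8601 style string; the exact timezone suffix of that string depends on the Jackson version and configured time zone.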