Skip to content

[C++] Unqualified format() calls are ambiguous in C++20 #30993

@asfimport

Description

@asfimport

I work on MSVC's C++ Standard Library implementation, and we regularly build open-source projects, including Apache Arrow, with development versions of the MSVC toolset in order to find and fix compiler/library bugs before they can cause problems for our programmer-users like you. This also allows us to provide advance notice of breaking changes in the C++ Standard that will affect you, which is the case here.

We recently implemented C+20's std::format(), followed by the C 20 Defect Report P2418R2 "Add Support For std::generator-like Types To std::format" with microsoft/STL#2323. This prevents Apache Arrow from compiling with the latest C+ Standard mode enabled, because MakeTimeFormatter() in diff.cc contains unqualified calls:

template <typename T, bool AddEpoch>
Formatter MakeTimeFormatter(const std::string& fmt_str) {
return [fmt_str](const Array& array, int64_t index, std::ostream* os) {
auto fmt = fmt_str.c_str();
auto unit = checked_cast<const T&>(*array.type()).unit();
auto value = checked_cast<const NumericArray<T>&>(array).Value(index);
using arrow_vendored::date::format;
using std::chrono::nanoseconds;
using std::chrono::microseconds;
using std::chrono::milliseconds;
using std::chrono::seconds;
if (AddEpoch) {
static arrow_vendored::date::sys_days epoch{arrow_vendored::date::jan / 1 / 1970};
switch (unit) {
case TimeUnit::NANO:
*os << format(fmt, static_cast<nanoseconds>(value) + epoch);
break;
case TimeUnit::MICRO:
*os << format(fmt, static_cast<microseconds>(value) + epoch);
break;
case TimeUnit::MILLI:
*os << format(fmt, static_cast<milliseconds>(value) + epoch);
break;
case TimeUnit::SECOND:
*os << format(fmt, static_cast<seconds>(value) + epoch);
break;
}
return;
}
switch (unit) {
case TimeUnit::NANO:
*os << format(fmt, static_cast<nanoseconds>(value));
break;
case TimeUnit::MICRO:
*os << format(fmt, static_cast<microseconds>(value));
break;
case TimeUnit::MILLI:
*os << format(fmt, static_cast<milliseconds>(value));
break;
case TimeUnit::SECOND:
*os << format(fmt, static_cast<seconds>(value));
break;
}
};
}

(The issue involves Argument-Dependent Lookup. The using-declaration using arrow_vendored::date::format; means that the following unqualified calls to format() will consider the overload in the arrow_vendored::date namespace, which is the desired overload. However, because the arguments are std::chrono::duration types, std is considered an "associated namespace", so it will also be searched for overloads. Our implementation of the chrono header includes the new format header (as permitted by the Standard - we do this because chrono types are formattable in C++20), so the std::format overload is visible. Finally, the signature change required by the P2418R2 paper makes std::format a highly greedy "perfect forwarding" signature, so it's ambiguous with the desired arrow_vendored::date::format overload.)

The full steps to reproduce are:

  1. Build the microsoft/STL repo, and update the INCLUDE/LIB/etc. environment variables so that it can be consumed. (Or wait for Visual Studio 2022 17.2 Preview 2 to ship, as it will contain these changes.)

  2. Configure Apache Arrow with the latest C++ Standard version: cmake -G Ninja -S . -B build -DCMAKE_CXX_STANDARD=23 (note that C+23 must be selected at this time even though std::format() is C+20 - it's a long story)

  3. diff.cc fails to compile with:

     

    
    ninja: Entering directory `build'
    [13/191] Building CXX object src\arrow\CMakeFiles\arrow_static.dir\array\diff.cc.obj
    FAILED: src/arrow/CMakeFiles/arrow_static.dir/array/diff.cc.obj
    C:\PROGRA~1\MIB055~1\2022\Preview\VC\Tools\MSVC\1431~1.311\bin\Hostx64\x64\cl.exe  /nologo /TP -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_AVX512 -DARROW_HAVE_RUNTIME_BMI2 -DARROW_HAVE_RUNTIME_SSE4_2 -DARROW_HAVE_SSE4_2 -DARROW_STATIC -DARROW_WITH_TIMING_TESTS -DURI_STATIC_BUILD -D_CRT_SECURE_NO_WARNINGS -D_ENABLE_EXTENDED_ALIGNED_STORAGE -IC:\Temp\arrow\cpp\build\src -IC:\Temp\arrow\cpp\src -IC:\Temp\arrow\cpp\src\generated -IC:\Temp\arrow\cpp\thirdparty\flatbuffers\include -IC:\Temp\arrow\cpp\build\boost_ep-prefix\src\boost_ep -IC:\Temp\arrow\cpp\build\xsimd_ep\src\xsimd_ep-install\include -IC:\Temp\arrow\cpp\thirdparty\hadoop\include /DWIN32 /D_WINDOWS  /GR /EHsc /D_SILENCE_TR1_NAMESPACE_DEPRECATION_WARNING   /EHsc /wd5105 /bigobj /utf-8 /W3 /wd4800 /wd4996 /wd4065   /MD /O2 /Ob2 /DNDEBUG -std:c++latest /showIncludes /Fosrc\arrow\CMakeFiles\arrow_static.dir\array\diff.cc.obj /Fdsrc\arrow\CMakeFiles\arrow_static.dir\arrow_static.pdb /FS -c C:\Temp\arrow\cpp\src\arrow\array\diff.cc
    C:\Temp\arrow\cpp\src\arrow\array\diff.cc(652): error C2666: 'arrow_vendored::date::format': 2 overloads have similar conversions
    C:\Temp\arrow\cpp\src\arrow/vendored/datetime/date.h(6264): note: could be 'std::basic_string<char,std::char_traits<char>,std::allocator<char>> arrow_vendored::date::format<_Elem,std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>>(const CharT *,const Streamable &)'
            with
            [
                _Elem=char,
                CharT=char,
                Streamable=std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>
            ]
    C:\GitHub\STL\out\build\x64\out\inc\format(3100): note: or       'std::wstring std::format<std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>>(const std::_Basic_format_string<wchar_t,std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>>,std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>> &&)' [found using argument-dependent lookup]
    C:\GitHub\STL\out\build\x64\out\inc\format(3095): note: or       'std::string std::format<std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>>(const std::_Basic_format_string<char,std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>>,std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>> &&)' [found using argument-dependent lookup]
    C:\Temp\arrow\cpp\src\arrow\array\diff.cc(680): note: while trying to match the argument list '(const _Elem *, std::chrono::time_point<std::chrono::system_clock,std::chrono::duration<__int64,std::nano>>)'
            with
            [
                _Elem=char
            ]
    C:\Temp\arrow\cpp\src\arrow\array\diff.cc(680): note: note: qualification adjustment (const/volatile) may be causing the ambiguity
    C:\Temp\arrow\cpp\src\arrow\array\diff.cc(451): note: see reference to function template instantiation 'arrow::Formatter arrow::MakeFormatterImpl::MakeTimeFormatter<arrow::TimestampType,true>(const std::string &)' being compiled
    [...]
    

    The fix is very simple: remove the using-declaration and explicitly qualify each call. I will submit a pull request on GitHub for this.
     
    (Even if Apache Arrow isn't planning on using C++20 any time soon, making this change will make it easier for MSVC to continue validating Apache Arrow with the latest toolset and Standard changes, and it will remove a potential future headache if and when Apache Arrow does migrate to later Standard versions.)

Environment: Visual Studio 2022 17.2 Preview 2
Reporter: Stephan T. Lavavej / @StephanTLavavej
Assignee: Stephan T. Lavavej / @StephanTLavavej

PRs and other links:

Note: This issue was originally created as ARROW-15520. Please see the migration documentation for further details.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions