Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce MS Windows CI #425

Merged
merged 10 commits into from
Sep 2, 2024
45 changes: 45 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,51 @@ jobs:
cd build
ninja

Windows:
runs-on: windows-2022

steps:
- name: Checkout code
uses: actions/checkout@v4

- name: Setup python 3.10
uses: actions/setup-python@v5
with:
python-version: '3.10'

- name: Install packages
run:
choco install pkgconfiglite ninja

- name: Install python modules
run: pip3 install meson

- name: Setup MSVC compiler
uses: bus1/cabuild/action/msdevshell@v1
with:
architecture: x64

- name: Install dependencies
uses: kiwix/kiwix-build/actions/dl_deps_archive@main
with:
target_platform: win-x86_64-static

- name: Compile
shell: cmd
run: |
set PKG_CONFIG_PATH=%cd%\BUILD_win-amd64\INSTALL\lib\pkgconfig
set CPPFLAGS=-I%cd%\BUILD_win-amd64\INSTALL\include
meson.exe setup . build -Dstatic-linkage=true --buildtype=debug
cd build
ninja.exe

- name: Test
shell: cmd
run: |
cd build
meson.exe test --verbose
env:
WAIT_TIME_FACTOR_TEST: 10

Linux:
strategy:
Expand Down
4 changes: 1 addition & 3 deletions meson.build
Original file line number Diff line number Diff line change
@@ -1,9 +1,7 @@
project('zim-tools', ['c', 'cpp'],
version : '3.4.2',
license : 'GPLv3+',
default_options : ['c_std=c11', 'cpp_std=c++17', 'werror=true'])

add_global_arguments(['-Werror', '-Wall'], language:'cpp')
default_options : ['c_std=c11', 'cpp_std=c++17', 'werror=true', 'warning_level=1'])

add_global_arguments('-DVERSION="@0@"'.format(meson.project_version()), language : 'cpp')

Expand Down
18 changes: 16 additions & 2 deletions src/metadata.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,21 @@ bool searchRegex(const std::string& regexStr, const std::string& text)

size_t getTextLength(const std::string& utf8EncodedString)
{
return icu::UnicodeString::fromUTF8(utf8EncodedString).length();
// For some unknown reason implicite convertion from std::string to icu::StringPiece
// is broken on Windows.
// Constructors are definde in stringpiece.h as
// ```
// StringPiece(const std::string& str)
// : ptr_(str.data()), length_(static_cast<int32_t>(str.size())) { }
// StringPiece(const char* offset, int32_t len) : ptr_(offset), length_(len) { }
// ```
// However using the first constructor ends with a corrupted StringPiece (wrong ptr)
// and using second one works. Don't ask me why
// This is broken
// icu::StringPiece stringPiece(utf8EncodedString);
// This is not
icu::StringPiece stringPiece(utf8EncodedString.data(), static_cast<int32_t>(utf8EncodedString.size()));
return icu::UnicodeString::fromUTF8(stringPiece).length();
Comment on lines -51 to +65
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only explanation that I could find is because of the following template constructor of StringPiece:

  /**
   * Constructs from some other implementation of a string piece class, from any
   * C++ record type that has these two methods:
   *
   * \code{.cpp}
   *
   *   struct OtherStringPieceClass {
   *     const char* data();  // or const char8_t*
   *     size_t size();
   *   };
   *
   * \endcode
   *
   * The other string piece class will typically be std::string_view from C++17
   * or absl::string_view from Abseil.
   *
   * Starting with C++20, data() may also return a const char8_t* pointer,
   * as from std::u8string_view.
   *
   * @param str the other string piece
   * @stable ICU 65
   */
  template <typename T,
            typename = typename std::enable_if<
                (std::is_same<decltype(T().data()), const char*>::value
#if defined(__cpp_char8_t)
                    || std::is_same<decltype(T().data()), const char8_t*>::value
#endif
                ) &&
                std::is_same<decltype(T().size()), size_t>::value>::type>
  StringPiece(T str)
      : ptr_(reinterpret_cast<const char*>(str.data())),
        length_(static_cast<int32_t>(str.size())) {}

If overload resolution for some reason chooses that constructor over StringPiece(const std::string& str) then the string is passed into it by value, and the pointer is bound to the data of a temporary object that is destroyed after the completion of the StringPiece constructor. That explanation is valid for an explicit stringPiece variable from your commented out example, since the temporary object definitely doesn't survive beyond that point. I am less sure if it can also be valid for the original single liner implementation, as I thought that the temporary objects are required to survive until the end of the full expression in the context of which they were created. But it can be a bug in the compiler (on top of the other bug that chooses the wrong overload).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea but not conclusive.
I have removed this constructor in the stringpiece header and same error occurs.

I am less sure if it can also be valid for the original single liner implementation, as I thought that the temporary objects are required to survive until the end of the full expression in the context of which they were created. But it can be a bug in the compiler (on top of the other bug that chooses the wrong overload).

I confirm to you that single liner version is not helping.
A single line "fix" is return icu::UnicodeString::fromUTF8(utf8EncodedString.data()).length(); but we may truncate binary data containing \0 so I prefer be explicit about the data size.

}

class MetadataComplexCheckBase
Expand Down Expand Up @@ -125,7 +139,7 @@ std::string escapeNonPrintableChars(const std::string& s)
std::ostringstream os;
os << std::hex;
for (const char c : s) {
if (std::isprint(c)) {
if (std::isprint(static_cast<unsigned char>(c))) {
os << c;
} else {
const unsigned int charVal = static_cast<unsigned char>(c);
Expand Down
3 changes: 1 addition & 2 deletions src/tools.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,6 @@

#include "tools.h"

#include <dirent.h>
#include <string.h>
#include <sys/stat.h>
#include <cerrno>
Expand All @@ -31,14 +30,14 @@
#include <stdexcept>
#include <vector>
#include <memory>
#include <unistd.h>
#include <algorithm>
#include <regex>
#include <array>

#ifdef _WIN32
#define SEPARATOR "\\"
#else
#include <unistd.h>
#define SEPARATOR "/"
#endif

Expand Down
2 changes: 1 addition & 1 deletion src/zimcheck/meson.build
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ executable('zimcheck',
'../tools.cpp',
'../metadata.cpp',
include_directories : inc,
dependencies: [libzim_dep, icu_dep, thread_dep],
dependencies: [libzim_dep, icu_dep, thread_dep, docopt_dep],
install: true)


Loading
Loading