Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use templates for the base types to factorize code #186

Merged
merged 2 commits into from
Jan 1, 2024

Conversation

robUx4
Copy link
Contributor

@robUx4 robUx4 commented Dec 26, 2023

A lot of the logic is the same between classes. We can factorize the code once we have factorized the storage (and initialization) types in a template.

The default value is moved to a EbmlCallbacksDefault class for the types that can have a default value. So the default value is not copied in each instance of every object.

This was done on top of #173 but maybe it works on its own.

ebml/EbmlElement.h Outdated Show resolved Hide resolved
@robUx4 robUx4 force-pushed the default_init/11 branch 2 times, most recently from fa30c2f to 4f3de95 Compare December 28, 2023 15:23
@robUx4 robUx4 changed the title [RFC] Use templates for the base types to factorize code Use templates for the base types to factorize code Dec 28, 2023
@robUx4 robUx4 requested a review from mbunkus December 28, 2023 15:30
@robUx4
Copy link
Contributor Author

robUx4 commented Dec 28, 2023

I need to update libmatroska to build with this branch

Copy link
Contributor

@mbunkus mbunkus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't merge with the current master at the moment. Can you please rebase? I'll review then.

@robUx4
Copy link
Contributor Author

robUx4 commented Dec 29, 2023

Doesn't merge with the current master at the moment. Can you please rebase? I'll review then.

As soon as it works in libmatroska :)

@@ -220,6 +277,7 @@ class EBML_DLL_API EbmlCallbacks {
inline EbmlElement & NewElement() const { return Create(); }
/// is infinite/unknown size allowed
inline bool CanHaveInfiniteSize() const { return CanInfinite; }
virtual const bool HasDefault() const { return false; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler warning:

lib/libebml/ebml/EbmlElement.h:280:17: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers]
virtual const bool HasDefault() const { return false; }

Just change it to virtual bool HasDefault() const both here & in EbmlCallbacksDefault.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional compiler warning:

lib/libebml/ebml/EbmlElement.h:262:20: warning: 'libebml::EbmlCallbacks' has virtual functions but non-virtual destructor [-Wnon-virtual-dtor]
class EBML_DLL_API EbmlCallbacks {                                                                                                                                                                                                                                                                                              

Adding a simple

    virtual ~EbmlCallbacks() {}

seems to get rid of them, but at the cost of a compiler error in various later source files, e.g. EbmlDummy.cpp:

lib/libebml/src/EbmlDummy.cpp:13:1: fatal error: constexpr variable cannot have non-literal type 'const EbmlCallbacks'
DEFINE_EBML_CLASS_ORPHAN(EbmlDummy, 0xFF, 1, "DummyElement")
^
lib/libebml/ebml/EbmlElement.h:150:60: note: expanded from macro 'DEFINE_EBML_CLASS_ORPHAN'
#define DEFINE_EBML_CLASS_ORPHAN(x,id,idl,name)            DEFINE_xxx_CLASS_ORPHAN(x,id,idl,name,GetEbmlGlobal_Context)
                                                           ^
lib/libebml/ebml/EbmlElement.h:145:38: note: expanded from macro 'DEFINE_xxx_CLASS_ORPHAN'
    constexpr const EbmlCallbacks x::ClassInfos(x::Create, Id_##x, false, name, Context_##x); \
                                     ^
lib/libebml/ebml/EbmlElement.h:274:13: note: 'EbmlCallbacks' is not literal because it has a user-provided destructor
    virtual ~EbmlCallbacks() {}
            ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed both. It's odd I don't get these warnings in clang.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm the warning: 'libebml::EbmlCallbacks' has virtual functions but non-virtual destructor [-Wnon-virtual-dtor] warning is still there in revision 5b434b7.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I also compile with clang++, using the following warning/error options:

-Wall -Wno-comment -Wfatal-errors -fstack-protector-strong -O3 -g3 -std=c++17 -Wnon-virtual-dtor -Wextra -Wno-missing-field-initializers -Wunused -Wpedantic -Woverloaded-virtual -Wshadow -Qunused-arguments -Wno-self-assign -Wno-mismatched-tags -Wno-inconsistent-missing-override -Wno-potentially-evaluated-expression -Wno-extra-semi

Pretty sure that -Wnon-virtual-dtor turns on this particular warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. I also added #195 to make these issues more visible in the future.

@mbunkus
Copy link
Contributor

mbunkus commented Dec 30, 2023

I still get an error:

lib/libmatroska/src/KaxSemantic.cpp:113:1: fatal error: constexpr variable cannot have non-literal type 'const EbmlCallbacks'
DEFINE_MKX_BINARY_CONS(KaxSimpleBlock, 0xA3, 1, KaxCluster, "SimpleBlock")
^
lib/libmatroska/matroska/KaxDefines.h:30:32: note: expanded from macro 'DEFINE_MKX_BINARY_CONS'
    constexpr EbmlCallbacks a::ClassInfos(a::Create, Id_##a, false, e, Context_##a);
                               ^
lib/libebml/ebml/EbmlElement.h:274:13: note: 'EbmlCallbacks' is not literal because it has a non-trivial destructor
    virtual ~EbmlCallbacks() = default;
            ^
lib/libebml/ebml/EbmlElement.h:274:13: note: destructor for 'EbmlCallbacks' is not trivial because it is virtual
1 error generated.

@robUx4
Copy link
Contributor Author

robUx4 commented Dec 30, 2023

I forgot to push the lastest version of the libmatroska counterpart: Matroska-Org/libmatroska#134
It should work now.

@mbunkus
Copy link
Contributor

mbunkus commented Dec 30, 2023

Thanks. Compiles now. But there are a ton of issues from crashing due to null pointer access to use of uninitialized values. I've fixed some of them locally, will try to look into the others over the next couple of days. Definitely not ready for merging yet.

Copy link
Contributor

@mbunkus mbunkus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are at least three issues that I found. I couldn't really fix them as I didn't understand your new structure enough. Please look into the following:

nullptr arguments not checked

Operators & functions taking pointers are dereferencing the pointers, leading to invalid memory access or exceptions being thrown when nullptr is given as an argument. This immediately happens in operator == in both EbmlString & EbmlUnicodeString. I don't have the stack trace at hand, but both functions were called with a nullptr arg, causing std::string/std::wstring to throw an std::logic_error.

Not sure what the right way to handle nullptr is. Two possibilities:

  1. Always return false, as in "must not be used that way, cannot be equal"
  2. Check if stored Value is empty & return true if so, as in "no string and empty string are the same"

Another function that comes to mind here is EbmlElementDefaultSameStorage::IsSmallerThan.

reads from uninitialized values

Using valgrind I can see tons of errors of the form of making decisions based on uninitialized Value member variables, mostly in EbmlUInteger. This is unsurprising as the EbmlElementDefaultSameStorage and EbmlElementDefaultStorage don't initialize their Value member variable, neither statically in the class definition, nor in the constructor. The constructor in the original EbmlUInteger class from before the changes initialized their value member to the class's default value according to the specs, or to 0 (or the variable's equivalent) if there was no default.

This definitely affects EbmlUInteger, EbmlSInteger, EbmlFloat, EbmlDate as they have trivial member variables, which are only initialized if the code requests it.

It likely does not affect EbmlString, EbmlUnicodeString, EbmlBinary, EbmlCrc32 as those use non-trivial types which do get initialized via their default constructor.

Please take a look at those issues.

@robUx4
Copy link
Contributor Author

robUx4 commented Dec 31, 2023

I simplified the PR to not include the code factorization to make it easier to spot the regressions. The libmatroska counterpart is still needed with this PR.

@robUx4
Copy link
Contributor Author

robUx4 commented Dec 31, 2023

I added some tests for strings with and without a default value and did hit some null pointer dereferencing. It was using the default value when there isn't one. It may also be the valgrind issues you saw. The default value should never be used if the default class says it has no default value.

So I think this branch should be safe now.

@robUx4 robUx4 force-pushed the default_init/11 branch 2 times, most recently from 280eddf to 19ad33a Compare December 31, 2023 11:18
@mbunkus
Copy link
Contributor

mbunkus commented Dec 31, 2023

I cannot test the branch in its current form. I need to be able to query an existing element if it has a default value. In 1.x I can do that via elt.DefaultISset(), which has been removed, which is fine. In earlier iterations of this PR I could go via the EbmlCallbacks with elt.GetClassInfo().HasDefault(), which was fine, too.

However, this current iteration doesn't have a way to retrieve the EbmlCallbacks anymore.

@robUx4 robUx4 mentioned this pull request Jan 1, 2024
@robUx4
Copy link
Contributor Author

robUx4 commented Jan 1, 2024

Rebased on top of #204. I added the DefaultIsSet() call to the macro tests in #204 and turned it into the new call to use in this PR: GetElementSpec().HasDefault().

This PR consists of 2 PRs on top of #204.

@robUx4 robUx4 force-pushed the default_init/11 branch 2 times, most recently from 0eac624 to 288a6b6 Compare January 1, 2024 10:57
@robUx4
Copy link
Contributor Author

robUx4 commented Jan 1, 2024

I reworked the code further. I am now testing default values on an EbmlElement in the tests (ie without knowing anything about the underlying class).

There is no longer a type that holds the default value if there is one. There's one EbmlCallbacksWithDefault template that holds the acutal default value. Types without a default value use EbmlCallbacksDefault.

@mbunkus
Copy link
Contributor

mbunkus commented Jan 1, 2024

Thanks. With elt.ElementSpec().HasDefault() I can now compile this branch (together with the corresponding one for libmatroska). Unfortunately most of my test cases still fail due to different checksums (at least there are no more crashes as far as I can see; so the nullptr comparison seems to have been solved). For example, here's how the EBML Head looks like with a file produced by a mkvmerge compiled with v1.x:

+ EBML head
|+ EBML version: 1
|+ EBML read version: 1
|+ Maximum EBML ID length: 4
|+ Maximum EBML size length: 8
|+ Document type: matroska
|+ Document type version: 4
|+ Document type read version: 1

Here's what the same code compiled with branch default_init/11 looks like:

+ EBML head
|+ EBML version: 0
|+ EBML read version: 0
|+ Maximum EBML ID length: 0
|+ Maximum EBML size length: 0
|+ Document type: matroska
|+ Document type version: 4
|+ Document type read version: 1

Another interesting (but wrong) effect is that the cues suddenly contain "cue codec state" elements, even if the file doesn't contain any codec state elements! They're just… there suddenly. Again I'm comparing files created with the same mkvmerge code, just compiled with different libebml/libmatroska releases.

It seems that the initial values are still not initialized. Running the mkvmerge from default_init/11 with valgrind, I get lots of warnings such as this:

==986973== Conditional jump or move depends on uninitialised value(s)
==986973==    at 0x7290EC: libebml::EbmlUInteger::UpdateSize(std::function<bool (libebml::EbmlElement const&)>, bool) (lib/libebml/src/EbmlUInteger.cpp:59)
==986973==    by 0x725D44: libebml::EbmlMaster::UpdateSize(std::function<bool (libebml::EbmlElement const&)>, bool) (lib/libebml/src/EbmlMaster.cpp:112)
==986973==    by 0x7235FD: libebml::EbmlElement::Render(libebml::IOCallback&, std::function<bool (libebml::EbmlElement const&)>, bool, bool) (lib/libebml/src/EbmlElement.cpp:486)
==986973==    by 0x3FE733: render_ebml_head (src/merge/output_control.cpp:492)
==986973==    by 0x3FE733: render_headers (???:581)
==986973==    by 0x3FE733: create_next_output_file() (???:1425)
==986973==    by 0x39DDDD: main (src/merge/mkvmerge.cpp:3206)

I definitely do not have those warnings when running valgrind on the "good" mkvmerge compiled against v1.x.

These types of decisions based on uninitialized values could easily explain the behavior mentioned above with the "cue codec state" elements suddenly appearing.

@robUx4
Copy link
Contributor Author

robUx4 commented Jan 1, 2024

Thanks for all the tests. If the header is broken it's definitely something that can be tested in libebml. I'll give this a try.

Can you check that #204 works for you ? It will help reduce the scope of this PR.

@mbunkus
Copy link
Contributor

mbunkus commented Jan 1, 2024

Done! #204 is fine & can be merged.

It's mandatory since 8494914 and always stored in the EbmlElement.
So the default value is hardcoded in the class semantic info and not
set in the constructor.

A new EbmlElement subclass is created for EbmlElement types that may
have a default value.

An IsSameValue() method is added to compare an element to a value. This
is used by the EbmlElementDefault class to tell if the element value is
the same as the default when there is one.
@robUx4
Copy link
Contributor Author

robUx4 commented Jan 1, 2024

I found the issue. The old constructor with a default value would initialize the element to its default value. IMO the element should not be initialized until it's manually done. If not it has its default value to use or should not be used at all.

For I went with the forced initialization. Only for elements with a default value, the other ones should not be used without setting a value (or their size is 0 = default value for the type).

Now it passes the test from #206.

Copy link
Contributor

@mbunkus mbunkus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Now it passes all of my tests as well. Nice work!

Can be merged (along with the corresponding branch for libmatroska).

@robUx4
Copy link
Contributor Author

robUx4 commented Jan 1, 2024

Awesome. Thanks for all the testing.

@robUx4 robUx4 merged commit 51c2767 into Matroska-Org:master Jan 1, 2024
15 checks passed
@robUx4 robUx4 deleted the default_init/11 branch January 1, 2024 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants