Symtab refactor #1
Open
metti wants to merge 27 commits into master from symtab_refactor
metti force-pushed the symtab_refactor branch 10 times, most recently from 13f7226 to a3c098b (June 15, 2020 09:32)
metti force-pushed the symtab_refactor branch 4 times, most recently from d7f5216 to 71308cf (June 17, 2020 13:37)
When we get the qualified name of a pointer type, the result is cached so that subsequent invocations of the getter yield a faster result. When the pointed-to type is not yet fully constructed at the time of the first invocation of the getter, what is cached is the name of a pointer to a not-yet fully qualified type. After the pointed-to type is fully constructed (and canonicalized), the pointer type also becomes canonicalized (in that order). The cache therefore needs to be invalidated at that point, so that a subsequent invocation of the getter caches the qualified name of the pointer to the fully qualified type.

The problem in this report is that the cache doesn't get invalidated when the pointer type is canonicalized. This patch fixes that. A similar issue exists with reference and qualified types, so the patch addresses it for those types as well.

* include/abg-ir.h (decl_base::clear_qualified_name): Declare new protected member function. ({pointer_type_def, reference_type_def, qualified_type_def, function_type}::on_canonical_type_set): Declare virtual member functions.
* src/abg-ir.cc (decl_base::clear_qualified_name): Define new protected member function. ({pointer_type_def, reference_type_def, qualified_type_def, function_type}::on_canonical_type_set): Define virtual member functions.
* tests/data/test-annotate/test17-pr19027.so.abi: Adjust.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
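The stale-cache scenario described above can be illustrated with a minimal sketch. The class `pointer_type_sketch` and its members are hypothetical stand-ins, not the libabigail classes; only the names `clear_qualified_name` and `on_canonical_type_set` are taken from the commit message, and their bodies here are assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a pointer type; only the caching logic is modelled.
class pointer_type_sketch
{
  std::string pointed_to_name_;
  mutable std::string cached_name_;
  mutable bool cache_valid_;

public:
  explicit pointer_type_sketch(const std::string& pointed_to_name)
    : pointed_to_name_(pointed_to_name), cache_valid_(false)
  {}

  // Simulates the pointed-to type becoming fully constructed.
  void set_pointed_to_name(const std::string& name)
  { pointed_to_name_ = name; }

  // Computes the name on first use and serves it from the cache afterwards.
  const std::string& get_qualified_name() const
  {
    if (!cache_valid_)
      {
        cached_name_ = pointed_to_name_ + "*";
        cache_valid_ = true;
      }
    return cached_name_;
  }

  // Assumed counterpart of decl_base::clear_qualified_name.
  void clear_qualified_name()
  { cache_valid_ = false; }

  // Hook invoked on canonicalization: drop the possibly stale name.
  void on_canonical_type_set()
  { clear_qualified_name(); }
};
```

Without the call in `on_canonical_type_set`, the getter would keep returning the name computed while the pointed-to type was still incomplete.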
metti force-pushed the symtab_refactor branch 3 times, most recently from 3b9bf71 to 6e600f3 (June 22, 2020 13:35)
The method type_base::get_canonical_type_for contains some logic which temporarily changes a couple of control flags in the type's environment. It then restores these, but not consistently. This patch ensures the flags are restored unconditionally. * src/abg-ir.cc (get_canonical_type_for): Ensure the do_on_the_fly_canonicalization and decl_only_class_equals_definition flags are restored unconditionally. Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com>
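One common way to guarantee unconditional restoration is a small scope guard. The following is only a sketch of that technique and not necessarily what the patch does; `environment_sketch`, `flag_restorer`, `lookup_with_temporary_flags` and the temporary flag values are all hypothetical, while the two flag names come from the commit message.

```cpp
#include <cassert>

// Hypothetical environment holding the two control flags named in the commit.
struct environment_sketch
{
  bool do_on_the_fly_canonicalization = false;
  bool decl_only_class_equals_definition = false;
};

// Scope guard: sets a flag temporarily and restores it on every exit path.
class flag_restorer
{
  bool& flag_;
  bool saved_;

public:
  flag_restorer(bool& flag, bool temporary_value)
    : flag_(flag), saved_(flag)
  { flag_ = temporary_value; }

  ~flag_restorer()
  { flag_ = saved_; }

  flag_restorer(const flag_restorer&) = delete;
  flag_restorer& operator=(const flag_restorer&) = delete;
};

// Hypothetical function with an early return; the flags are restored anyway.
inline bool lookup_with_temporary_flags(environment_sketch& env, bool early_exit)
{
  flag_restorer r1(env.do_on_the_fly_canonicalization, true);
  flag_restorer r2(env.decl_only_class_equals_definition, true);
  if (early_exit)
    return false;
  return env.do_on_the_fly_canonicalization
         && env.decl_only_class_equals_definition;
}
```

Because the destructors run on every return path, an early exit cannot leave the environment in the temporary state.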
These are zero impact changes. * include/abg-fwd.h: Correct doc-comment reference to enum_type_decl. * src/abg-comp-filter.cc: Fix doc-comment syntax. * src/abg-comparison.cc (operator<<): In the diff_category overload, fix code indentation. * src/abg-default-reporter.cc (report): In the class_or_union_diff overload, adjust comment to reflect that the code is reporting changes between declaration-only and defined types, in either direction. Signed-off-by: Giuliano Procida <gprocida@google.com>
* src/abg-default-reporter.cc (report): In the enum_diff overload, introduce the name ctxt to replace four occurrences of d.context(). Signed-off-by: Giuliano Procida <gprocida@google.com>
This patch brings the enum code closer to the class/union code, in the hope that this will ease future code maintenance. There are no behavioural changes.

* src/abg-dwarf-reader.cc (build_enum_type): Rename local variable enum_is_anonymous to is_anonymous. Move initialisation of local variable is_artificial to the location corresponding to that in the add_or_update_class_type and add_or_update_union_type functions.

Signed-off-by: Giuliano Procida <gprocida@google.com>
This patch renames CLASS_DECL_ONLY_DEF_CHANGE_CATEGORY to TYPE_DECL_ONLY_DEF_CHANGE_CATEGORY. * include/abg-comparison.h (TYPE_DECL_ONLY_DEF_CHANGE_CATEGORY): Rename CLASS_DECL_ONLY_DEF_CHANGE_CATEGORY into this. (EVERYTHING_CATEGORY): In the value of this enumerator, rename CLASS_DECL_ONLY_DEF_CHANGE_CATEGORY into TYPE_DECL_ONLY_DEF_CHANGE_CATEGORY. * src/abg-comp-filter.cc (categorize_harmless_diff_node): Likewise. * src/abg-comparison.cc (get_default_harmless_categories_bitmap): Likewise. (operator<<(ostream& o, diff_category c)): Likewise. * src/abg-default-reporter.cc (default_reporter::report): Likewise in the overload for class_or_union_diff. * src/abg-leaf-reporter.cc (leaf_reporter::report): Likewise in the overload for class_or_union_diff. Signed-off-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com>
metti force-pushed the symtab_refactor branch 3 times, most recently from af6ec44 to 63c14dc (July 2, 2020 09:58)
std::optional<T> is desirable but unavailable before C++17, so add a simplified version of it to abg_compat:: for use when compiling with a pre-C++17 standard. Otherwise, use std::optional from <optional> directly. A later patch uses this, so this change is a prerequisite. It only serves as a compatibility implementation and does not claim to be complete; it provides just enough for the project's needs.

* include/abg-cxx-compat.h (abg_compat::optional): Add new class.
* tests/tests-cxx-compat.cc: Add new test cases.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
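To give an idea of what "just enough" might look like, here is a minimal C++11 sketch of an optional-like class. It is not the `abg_compat::optional` from the patch; the namespace `compat_sketch` and the chosen member set are assumptions, and unlike `std::optional` this version requires `T` to be default-constructible.

```cpp
#include <cassert>
#include <stdexcept>

namespace compat_sketch
{

// Simplified optional: a value plus a flag telling whether it is set.
template <typename T>
class optional
{
  bool has_value_;
  T value_; // requires a default-constructible T, unlike std::optional

public:
  optional() : has_value_(false), value_() {}
  optional(const T& value) : has_value_(true), value_(value) {}

  bool has_value() const { return has_value_; }
  explicit operator bool() const { return has_value_; }

  // Checked access: throws when no value is stored.
  const T& value() const
  {
    if (!has_value_)
      throw std::runtime_error("bad optional access");
    return value_;
  }

  // Returns the stored value, or the fallback when empty.
  T value_or(const T& fallback) const
  { return has_value_ ? value_ : fallback; }

  // Unchecked access, mirroring std::optional::operator*.
  const T& operator*() const { return value_; }
};

} // namespace compat_sketch
```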
Add abg_compat::{bind,function,placeholders} to the compatibility layer. A later patch makes use of them. As usual, for C++ standards that natively support this functionality (C++11 and later), the native implementation is aliased into the abg_compat namespace.

* include/abg-cxx-compat.h: Add support for abg_compat::{bind,function,placeholders}.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
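For the C++11-and-later case, aliasing the native facilities into a compatibility namespace could look roughly like the sketch below. The namespace `compat_alias_sketch` and the helpers `add` and `make_add_ten` are hypothetical; the actual header may be organised differently.

```cpp
#include <cassert>
#include <functional>

namespace compat_alias_sketch
{
// Pull the standard facilities into the compatibility namespace.
using std::bind;
using std::function;
namespace placeholders = std::placeholders;
} // namespace compat_alias_sketch

inline int add(int a, int b)
{ return a + b; }

// Callers only ever spell the compatibility namespace.
inline compat_alias_sketch::function<int(int)> make_add_ten()
{
  return compat_alias_sketch::bind(add, 10,
                                   compat_alias_sketch::placeholders::_1);
}
```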
Being exported through a ksymtab (in the case of Linux Kernel binaries) is actually a property of the ELF symbol itself, and we can therefore track it along with the symbol that we collect from symtab. Tracking is currently done by keeping separate symbol lists and maps for symtab and ksymtab symbols; these can be consolidated by having a property that indicates whether the symbol also appeared as a ksymtab entry. Hence, and for future changes in this area, add this property and update all references. The flag is false initially unless otherwise specified.

* include/abg-ir.h (elf_symbol::elf_symbol): Add is_in_ksymtab parameter. (elf_symbol::create): Likewise. (elf_symbol::is_in_ksymtab): New getter declaration. (elf_symbol::set_is_in_ksymtab): New setter declaration.
* src/abg-ir.cc (elf_symbol::priv::priv): Add is_in_ksymtab parameter. (elf_symbol::priv::is_in_ksymtab_): New field. (elf_symbol::elf_symbol): Add is_in_ksymtab parameter. (elf_symbol::create): Likewise. (elf_symbol::is_in_ksymtab): New getter implementation. (elf_symbol::set_is_in_ksymtab): New setter implementation.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
In the context of libabigail and a single library run (when reading from dwarf or from xml), a symbol is either suppressed or it is not. While one could argue that this is a property of the read_context, the read_context might not be around anymore when the symbol still is. Hence, persist the 'is_suppressed' state along with the symbol itself. * include/abg-ir.h (elf_symbol::elf_symbol): Add is_suppressed parameter. (elf_symbol::create): Likewise. (elf_symbol::is_suppressed): New getter declaration. (elf_symbol::set_is_suppressed): New setter declaration. * src/abg-ir.cc (elf_symbol::priv::priv): Add is_suppressed parameter. (elf_symbol::priv::is_suppressed_): New field. (elf_symbol::elf_symbol): Add is_suppressed parameter. (elf_symbol::create): Likewise. (elf_symbol::is_suppressed): New getter implementation. (elf_symbol::set_is_suppressed): New setter implementation. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
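The two commits above attach `is_in_ksymtab` and `is_suppressed` to the symbol itself. A reduced sketch of that shape, with a private implementation struct and a factory function, might look as follows. `elf_symbol_sketch` is hypothetical and carries only a name plus the two flags; the `false` default for `is_suppressed` is an assumption (the source only states the default for `is_in_ksymtab`).

```cpp
#include <cassert>
#include <memory>
#include <string>

class elf_symbol_sketch
{
  // Private data, in the style of the priv structs named in the changelog.
  struct priv
  {
    std::string name_;
    bool is_in_ksymtab_;
    bool is_suppressed_;

    priv(const std::string& name, bool is_in_ksymtab, bool is_suppressed)
      : name_(name),
        is_in_ksymtab_(is_in_ksymtab),
        is_suppressed_(is_suppressed)
    {}
  };

  std::unique_ptr<priv> priv_;

  elf_symbol_sketch(const std::string& name,
                    bool is_in_ksymtab,
                    bool is_suppressed)
    : priv_(new priv(name, is_in_ksymtab, is_suppressed))
  {}

public:
  // Factory; both flags default to false in this sketch.
  static std::shared_ptr<elf_symbol_sketch>
  create(const std::string& name,
         bool is_in_ksymtab = false,
         bool is_suppressed = false)
  {
    return std::shared_ptr<elf_symbol_sketch>(
      new elf_symbol_sketch(name, is_in_ksymtab, is_suppressed));
  }

  const std::string& get_name() const { return priv_->name_; }

  bool is_in_ksymtab() const { return priv_->is_in_ksymtab_; }
  void set_is_in_ksymtab(bool f) { priv_->is_in_ksymtab_ = f; }

  bool is_suppressed() const { return priv_->is_suppressed_; }
  void set_is_suppressed(bool f) { priv_->is_suppressed_ = f; }
};
```

Keeping the state on the symbol means it stays available after the reader context that computed it has gone away.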
abg-symtab-reader.{h,cc} shall contain the refactored symtab reader. Create the stub files, an empty unit test and hook everything up in the make system. * include/abg-symtab-reader.h: New header file. * include/Makefile.am: Add new header file abg-symtab-reader.h. * src/Makefile.am: Add new source file abg-symtab-reader.cc. * src/abg-symtab-reader.cc: New source file. * tests/Makefile.am: Add new test case runtestsymtabreader. * tests/test-symtab-reader.cc: New test source file. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Based on existing functionality, implement the reading of ELF symbol tables as a separate component. This reduces the complexity of abg-dwarf-reader's read_context by separating and delegating the functionality. This also allows dedicated testing.

The new namespace symtab_reader contains several new, loosely coupled components. Together they allow for a consistent view of a symbol table. With filter criteria, those views can be restricted and iterated, and consistent lookup maps can be built on top of them. While this implementation tries to address some shortcomings of the previous model, it still provides the high-level interfaces to the symbol table contents through sorted iteration and name/address-mapped access.

symtab_reader::symtab

While the other classes in the same namespace are merely helpers, this is the main implementation of symtab reading and storage. Symtab objects are factory-created to ensure consistent construction and valid invariants. Thus a symtab is loaded either by passing an ELF handle (when reading from a binary) or by passing a set of function/variable symbol maps (when reading from XML). Once constructed, they are considered const and are not writable anymore. As such, all public methods are const.

The load reuses the existing implementation for loading symtab sections, but since the new implementation does not distinguish between functions and variables, the code could be simplified. The support for ppc64 function entry addresses has been deferred to a later commit.

Linux Kernel symbol tables are now loaded directly by name when encountering symbols prefixed with __ksymtab_, as per convention. This has been tricky in the past due to various different binary layouts (relocations, position relative relocations, symbol namespaces, CFI indirections, differences between vmlinux and kernel modules). Thus the new implementation is much simpler and less vulnerable to future ksymtab changes.
As we are also not looking up the kernel symbols by address, we could resolve shortcomings with symbol aliasing: previously, a symbol and its alias were indistinguishable because they have the same symbol address, so we could not identify the one that is actually exported via ksymtab.

One major architectural difference of this implementation is that we do not discard suppressed symbols early. While we keep them out of the vector of exported symbols, we still make them available for lookup. That helps address issues when looking up a suppressed symbol by address (e.g. from the ksymtab read implementation), which would fail in the existing implementation. Still, we intend to instantiate each symbol only once and pass around shared_ptr instances to refer to it from the vector as well as from the lookup maps.

For reading, there are two access paths that serve the existing patterns:

1) lookup_symbol: either via a name or an address
2) filtered iteration with begin(), end()

The former is used for direct access with a clue in hand (like a name or an address); the latter is used for iteration (e.g. when emitting the XML).

symtab_reader::symtab_iterator

The symtab_iterator is an STL-compatible iterator that is returned from begin() and end() of the symtab. It allows the usual forward iterator operations and can optionally take a filter predicate to skip non-matching elements.

symtab_reader::symtab_filter

The symtab_filter serves as a predicate for the symtab_iterator by providing a matches(const elf_symbol_sptr&) function. The predicate is built by ANDing together several conditions on attributes a symbol can have. The filter conditions are implemented in terms of std::optional<bool> members to allow a tristate: "needs to have the condition set", "must not have it set" and "don't care".

symtab_reader::symtab_filter_builder

This is a convenient way of building filters with a builder pattern and a fluent interface.
Hence, filters can be expressed neatly, expressively and precisely. When instantiated via symtab::make_filter(), the filter_builder is preset with suitable defaults. The filter_builder is convertible to a symtab_filter by passing on the local filter copy, thereby serving the fluent interface.

symtab_reader::filtered_symtab

The filtered_symtab is a convenience zero-cost abstraction that allows prepopulating the symtab_filter (call it a capture) such that begin() and end() are accessible without the need to pass the filter again. Argumentless begin() and end() are a requirement for range-for loops and other STL-based algorithms.

* include/abg-symtab-reader.h (symtab_filter): New class. (symtab_filter_builder): Likewise. (symtab_iterator): Likewise. (symtab): Likewise. (filtered_symtab): Likewise.
* src/abg-symtab-reader.cc (symtab_filter::matches): New. (symtab::make_filter): Likewise. (symtab::lookup_symbol): Likewise. (symbol_sort): Likewise. (symtab::load): Likewise. (symtab::load_): Likewise.
* tests/test-symtab-reader.cc (default filter matches anything): New test case. (default filter built with filter_builder matches anything): Likewise.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
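The tristate filter and the fluent builder described in this commit message can be sketched in a few lines. Everything below is hypothetical (`symbol_sketch`, `filter_sketch`, `filter_builder_sketch`, `matching_names`), a plain enum stands in for `std::optional<bool>` to keep the sketch C++11-only, and a simple loop replaces the filtering iterator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical symbol record; the real code works on elf_symbol_sptr.
struct symbol_sketch
{
  std::string name;
  bool is_function;
  bool is_in_ksymtab;
};

// Plain enum standing in for std::optional<bool>.
enum class tristate { dont_care, must_be_set, must_not_be_set };

inline bool satisfies(tristate condition, bool actual)
{
  if (condition == tristate::must_be_set)
    return actual;
  if (condition == tristate::must_not_be_set)
    return !actual;
  return true; // dont_care accepts both values
}

// Predicate: all conditions are ANDed together.
struct filter_sketch
{
  tristate functions = tristate::dont_care;
  tristate in_ksymtab = tristate::dont_care;

  bool matches(const symbol_sketch& s) const
  {
    return satisfies(functions, s.is_function)
           && satisfies(in_ksymtab, s.is_in_ksymtab);
  }
};

// Fluent builder that converts implicitly to the filter it has built.
class filter_builder_sketch
{
  filter_sketch filter_;

  static tristate to_tristate(bool v)
  { return v ? tristate::must_be_set : tristate::must_not_be_set; }

public:
  filter_builder_sketch& functions(bool v = true)
  { filter_.functions = to_tristate(v); return *this; }

  filter_builder_sketch& in_ksymtab(bool v = true)
  { filter_.in_ksymtab = to_tristate(v); return *this; }

  operator filter_sketch() const { return filter_; }
};

// Filtered iteration, reduced to a loop that collects matching names.
inline std::vector<std::string>
matching_names(const std::vector<symbol_sketch>& symbols,
               const filter_sketch& filter)
{
  std::vector<std::string> names;
  for (const symbol_sketch& s : symbols)
    if (filter.matches(s))
      names.push_back(s.name);
  return names;
}
```

A call such as `matching_names(symbols, filter_builder_sketch().functions().in_ksymtab())` reads close to the intent "exported kernel functions only", which is the appeal of the fluent style.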
While reading the corpus in the read_context, also load the new type symtab object side-by-side and set it accordingly in the resulting corpus. This is still side by side and passive code that gets active in the following changes. This is applicable for the dwarf reader as well as for the reader that consumes XML. * include/abg-corpus.h (corpus::set_symtab): New method declaration. (corpus::get_symtab): New method declaration. * include/abg-fwd.h (symtab_reader::symtab_sptr): New forward declaration. * src/abg-corpus-priv.h (corpus::priv::symtab_): New data member. * src/abg-corpus.cc (corpus::set_symtab): Likewise. (corpus::get_symtab): Likewise. * src/abg-dwarf-reader.cc (read_context::symtab_): New data member. (read_context::initialize): reset symtab_ as well (read_context::symtab): new method that loads a symtab on first access and returns it. (read_debug_info_into_corpus): also set the new symtab object on the current corpus. (read_corpus_from_elf): Also determine (i.e. load) the new symtab object and contribute to the load status. * src/abg-reader.cc (read_corpus_from_input): also set the new type symtab when reading from xml. * tests/test-symtab.cc: Add test assertions. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Make the corresponding members an implementation detail of corpus::priv. They get computed based on the new symtab whenever they are accessed first with an atomic instantiation. That simplifies the implementation and homogenizes the access to functions and variables. Sorting does not need to be done as the symtab already gives a guarantee for that. Due to improved alias detection in the new symtab reader, ensure we only write symbol aliases to ksymtab symbols if having a ksymtab main symbol. Test data needed to be adjusted as the new symtab reader is stricter in regards to symbols listed in ksymtab. I.e. init_module is not an exported symbol in the ksymtab of a kernel module. * src/abg-corpus-priv.h (corpus::priv::sorted_var_symbols): make private, mutable and optional. (corpus::sorted_undefined_var_symbols): Likewise. (corpus::sorted_fun_symbols): Likewise. (corpus::sorted_undefined_fun_symbols): Likewise. (corpus::priv::get_sorted_fun_symbols): New method declaration. (corpus::priv::get_sorted_undefined_fun_symbols): Likewise. (corpus::priv::get_sorted_var_symbols): Likewise. (corpus::priv::get_sorted_undefined_var_symbols): Likewise. * src/abg-corpus.cc (corpus::elf_symbol_comp_functor): Delete struct. (corpus::priv::get_sorted_fun_symbols): New method implementation. (corpus::priv::get_sorted_undefined_fun_symbols): Likewise. (corpus::priv::get_sorted_var_symbols): Likewise. (corpus::priv::get_sorted_undefined_var_symbols): Likewise. (corpus::get_sorted_fun_symbols): Proxy call to corpus::priv. (corpus::get_sorted_undefined_fun_symbols): Likewise. (corpus::get_sorted_var_symbols): Likewise. (corpus::get_sorted_undefined_var_symbols): Likewise. * src/abg-writer.cc (write_elf_symbol_aliases): When emitting aliases for a kernel symbol, ensure to only emit exported aliases. * tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: update test data. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
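The "computed on first access" pattern mentioned above can be sketched as a const getter filling a mutable cache. This is only an illustration: `corpus_priv_sketch` and `symbol_sketch` are hypothetical, a bool flag stands in for an optional member, the input is assumed to be sorted already, and nothing here is made thread-safe.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct symbol_sketch
{
  std::string name;
  bool is_function;
};

class corpus_priv_sketch
{
  std::vector<symbol_sketch> symtab_; // assumed to be sorted by name already
  mutable bool fun_symbols_ready_;    // stands in for an optional member
  mutable std::vector<std::string> sorted_fun_symbols_;
  mutable unsigned computations_;     // only here to make the caching visible

public:
  explicit corpus_priv_sketch(std::vector<symbol_sketch> symtab)
    : symtab_(std::move(symtab)),
      fun_symbols_ready_(false),
      computations_(0)
  {}

  // Built from the symtab on first access, then served from the cache.
  const std::vector<std::string>& get_sorted_fun_symbols() const
  {
    if (!fun_symbols_ready_)
      {
        for (const symbol_sketch& s : symtab_)
          if (s.is_function)
            sorted_fun_symbols_.push_back(s.name);
        fun_symbols_ready_ = true;
        ++computations_;
      }
    return sorted_fun_symbols_;
  }

  unsigned computations() const { return computations_; }
};
```

Since the source is already ordered, filtering preserves the order and no separate sort step is needed.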
… symtab Make the corresponding members an implementation detail of corpus::priv. They get computed based on the new symtab whenever they are accessed first with an atomic instantiation. That simplifies the implementation and homogenizes the access to functions and variables. Sorting does not need to be done as the symtab already gives a guarantee for that. * src/abg-corpus-priv.h (corpus::priv::unrefed_var_symbols): make private, mutable and optional. (corpus::unrefed_fun_symbols): Likewise. (corpus::priv::get_unreferenced_function_symbols): New method declaration. (corpus::priv::get_unreferenced_variable_symbols): Likewise. * src/abg-corpus.cc (corpus::priv::build_unreferenced_symbols_tables): Delete method. (corpus::priv::get_unreferenced_function_symbols): New method implementation. (corpus::priv::get_unreferenced_variable_symbols): Likewise. (corpus::get_unreferenced_function_symbols): Proxy call to corpus::priv. (corpus::get_unreferenced_variable_symbols): Likewise. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Instead of using the corpus var|function_symbol_maps for symbol lookups, let build_elf_symbol_from_reference use the symtab::lookup_symbol method. That leads to a shorter implementation and we can drop the indicative parameter. * src/abg-reader.cc (build_elf_symbol_from_reference): drop last parameter indicating the lookup type and use corpus symtab for the lookup (build_function_decl): Adjust accordingly. (build_var_decl): Likewise. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Testing whether a symbol is exported can be simplified using the new symtab implementation. The same holds true for whether a symbol is exported via ksymtab in case of linux kernel binaries. So, do that. * src/abg-dwarf-reader.cc (function_symbol_is_exported): Use new symtab implementation. (variable_symbol_is_exported): Likewise. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Now that the new symtab implementation is capable of reading the ksymtab, we can switch over the implementation to gather information from there and delete all now-obsolete code. * src/abg-dwarf-reader.cc: (kernel_symbol_table_kind): Delete. (ksymtab_format): Likewise. (read_context::ksymtab_format_): Likewise. (read_context::ksymtab_entry_size_): Likewise. (read_context::nb_ksymtab_entries_): Likewise. (read_context::nb_ksymtab_gpl_entries_): Likewise. (read_context::ksymtab_section_): Likewise. (read_context::ksymtab_reloc_section_): Likewise. (read_context::ksymtab_gpl_section_): Likewise. (read_context::ksymtab_gpl_reloc_section_): Likewise. (read_context::ksymtab_strings_section_): Likewise. (read_context::linux_exported_fn_syms_): Likewise. (read_context::linux_exported_var_syms_): Likewise. (read_context::linux_exported_gpl_fn_syms_): Likewise. (read_context::linux_exported_gpl_var_syms_): Likewise. (read_context::initialize): Remove initializations accordingly. (read_context::find_ksymtab_section): Delete. (read_context::find_ksymtab_gpl_section): Likewise. (read_context::find_ksymtab_reloc_section): Likewise. (read_context::find_ksymtab_gpl_reloc_section): Likewise. (read_context::find_ksymtab_strings_section): Likewise. (read_context::find_any_ksymtab_section): Likewise. (read_context::find_any_ksymtab_reloc_section): Likewise. (read_context::lookup_elf_symbol_from_index): Adjust overload with main logic.. (read_context::linux_exported_fn_syms): Delete. (read_context::create_or_get_linux_exported_fn_syms): Likewise. (read_context::linux_exported_var_syms): Likewise. (read_context::create_or_get_linux_exported_var_syms): Likewise. (read_context::linux_exported_gpl_fn_syms): Delete. (read_context::create_or_get_linux_exported_gpl_fn_syms): Likewise. (read_context::linux_exported_gpl_var_syms): Likewise. (read_context::create_or_get_linux_exported_gpl_var_syms): Likewise. (read_context::try_reading_first_ksymtab_entry): Likewise. 
(read_context::try_reading_first_ksymtab_entry_using_pre_v4_19_format): Likewise. (read_context::try_reading_first_ksymtab_entry_using_v4_19_format): Likewise. (read_context::get_ksymtab_format_module): Likewise. (read_context::get_ksymtab_format): Likewise. (read_context::get_ksymtab_symbol_value_size): Likewise. (read_context::get_ksymtab_entry_size): Likewise. (read_context::get_nb_ksymtab_entries): Likewise. (read_context::get_nb_ksymtab_gpl_entries): Likewise. (read_context::populate_symbol_map_from_ksymtab): Likewise. (read_context::populate_symbol_map_from_ksymtab_reloc): Likewise. (read_context::load_kernel_symbol_table): Likewise. (read_context::load_ksymtab_symbols): Likewise. (read_context::load_ksymtab_gpl_symbols): Likewise. (read_context::load_linux_specific_exported_symbol_maps): Likewise. (read_context::load_symbol_maps): Do not load kernel symbol maps. (read_context::maybe_adjust_sym_address_from_v4_19_ksymtab): Delete. (read_context::add_fn_symbols_to_map): Likewise. (read_context::add_var_symbols_to_map): Likewise. (read_context::read_debug_info_into_corpus): Fill export maps from new symtab. (read_context::lookup_elf_fn_symbol_from_address): Delete. (read_context::lookup_elf_var_symbol_from_address): Likewise. (read_context::lookup_elf_symbol_from_address): Likewise. (read_context::lookup_public_function_symbol_from_elf): Likewise. (read_context::fun_entry_addr_sym_map_sptr): Likewise. (read_context::fun_entry_addr_sym_map): Likewise. (read_context::var_addr_sym_map): Likewise. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
This migrates more helpers to abg-elf-helpers: lookup_ppc64_elf_fn_entry_point_address with dependencies read_uint64_from_array_of_bytes read_int_from_array_of_bytes address_is_in_opd_section with dependency address_is_in_section read_context::find_opd_section and read_context::opd_section_ are obsolete. * src/abg-dwarf-reader.cc (read_context::opd_section_): Delete. (read_context::find_opd_section): Delete. (read_context::read_uint64_from_array_of_bytes): Delete. (read_context::read_int_from_array_of_bytes): Delete. (read_context::lookup_ppc64_elf_fn_entry_point_address): Delete. (read_context::address_is_in_opd_section): Delete. (read_context::address_is_in_section): Delete. (read_context::load_symbol_maps_from_symtab_section): Adjust. * src/abg-elf-helpers.cc (read_int_from_array_of_bytes): New. (read_uint64_from_array_of_bytes): New. (lookup_ppc64_elf_fn_entry_point_address): New. (address_is_in_section): New. (address_is_in_opd_section): New. * src/abg-elf-helpers.h (lookup_ppc64_elf_fn_entry_point_address): New declaration. (address_is_in_opd_section): New declaration. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
When loading the symtab from a ppc64 binary, also keep track of the function entry addresses as a key for the symbol lookup. That accommodates the difference between DWARF, which points to the function entry address, and the symbol table, which points to the function pointer. The implementation is mostly copied and adapted from abg-dwarf-reader's read_context to add this functionality to the new symtab reader as well.

* src/abg-symtab-reader.cc (symtab::lookup_symbol): Fall back to looking up the address in entry_addr_symbol_map_. (symtab::load): Update the function entry address map for ppc64 targets. (symtab::update_function_entry_address_symbol_map): New function implementation.
* src/abg-symtab-reader.h (symtab::entry_addr_symbol_map_): New data member. (symtab::update_function_entry_address_symbol_map): New function declaration.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
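The fallback lookup can be pictured as two address-keyed maps consulted in order. The sketch below is an assumption about the shape of the logic, not the patch itself: `symtab_sketch` and its methods are hypothetical, symbols are reduced to their names, and the addresses are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

class symtab_sketch
{
  // Keyed by the address stored in the symbol table.
  std::map<std::uint64_t, std::string> addr_symbol_map_;
  // Keyed by the function entry address (what DWARF refers to on ppc64).
  std::map<std::uint64_t, std::string> entry_addr_symbol_map_;

public:
  void add_symbol(std::uint64_t symbol_addr, const std::string& name)
  { addr_symbol_map_[symbol_addr] = name; }

  void add_function_entry(std::uint64_t entry_addr, const std::string& name)
  { entry_addr_symbol_map_[entry_addr] = name; }

  // Tries the regular map first and falls back to the entry address map.
  // Returns a null pointer when the address is unknown to both.
  const std::string* lookup_symbol(std::uint64_t addr) const
  {
    std::map<std::uint64_t, std::string>::const_iterator it =
      addr_symbol_map_.find(addr);
    if (it != addr_symbol_map_.end())
      return &it->second;

    it = entry_addr_symbol_map_.find(addr);
    if (it != entry_addr_symbol_map_.end())
      return &it->second;

    return nullptr;
  }
};
```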
With the prework in previous commits, we are now able to drop the public symbols maps in corpus::priv and replace them by private members with access through getters. The getters use the new symtab implementation to generate the maps on the fly. Setters are not required anymore and are removed. Also remove redundant getters. We could also remove the getters for the symbol maps and the local caching variable and leave it all to lookup_symbol, but this is left for a later change. * include/abg-corpus.h (corpus::set_fun_symbol_map): Remove method declaration. (corpus::set_undefined_fun_symbol_map): Likewise. (corpus::set_var_symbol_map): Likewise. (corpus::set_undefined_var_symbol_map): Likewise. (corpus::get_fun_symbol_map_sptr): Likewise. (corpus::get_undefined_fun_symbol_map_sptr): Likewise. (corpus::get_var_symbol_map_sptr): Likewise. (corpus::get_undefined_var_symbol_map_sptr): Likewise. * src/abg-corpus-priv.h (corpus::priv::var_symbol_map): make private and mutable (corpus::priv::undefined_var_symbol_map): Likewise. (corpus::priv::fun_symbol_map): Likewise. (corpus::priv::undefined_fun_symbol_map): Likewise. (corpus::priv::get_fun_symbol_map): New method declaration. (corpus::priv::get_undefined_fun_symbol_map): Likewise. (corpus::priv::get_var_symbol_map): Likewise. (corpus::priv::get_undefined_var_symbol_map): Likewise. * src/abg-corpus.cc (corpus::priv::get_fun_symbol_map): New method implementation. (corpus::priv::get_undefined_fun_symbol_map): Likewise. (corpus::priv::get_var_symbol_map): Likewise. (corpus::priv::get_undefined_var_symbol_map): Likewise. (corpus::is_empty): depend on symtab only. (corpus::set_fun_symbol_map): Remove method. (corpus::set_undefined_fun_symbol_map): Likewise. (corpus::set_var_symbol_map): Likewise. (corpus::set_undefined_var_symbol_map): Likewise. (corpus::get_fun_symbol_map_sptr): Likewise. (corpus::get_undefined_fun_symbol_map_sptr): Likewise. (corpus::get_var_symbol_map_sptr): Likewise. 
(corpus::get_undefined_var_symbol_map_sptr): Likewise. (corpus::get_fun_symbol_map): Use corpus::priv proxy method. (corpus::get_undefined_fun_symbol_map): Likewise. (corpus::get_var_symbol_map): Likewise. (corpus::get_undefined_var_symbol_map): Likewise. * src/abg-dwarf-reader.cc (read_debug_info_into_corpus): Do not set corpus symbol maps anymore. * src/abg-reader.cc (read_corpus_from_input): Likewise. * tests/test-symtab.cc (assert_symbol_count): Do not access the corpus symbol maps through sptr anymore. * tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Adjust expected test output. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
The introduction of the new symtab reader incorporated much of the existing functionality. Now that the most code parts are migrated to the new symtab reader, we can safely remove the old code paths. Ignoring the symbol table is not a thing anymore. The new symtab reader does read the symtab unconditionally for consistency reasons. Hence also remove all functionality around conditional symtab reading. * include/abg-dwarf-reader.h (set_ignore_symbol_table): Remove. (get_ignore_symbol_table): Likewise. * src/abg-dwarf-reader.cc (add_symbol_to_map): Likewise. (read_context::options_type::ignore_symbol_table): Likewise. (read_context::options_type): Adjust. (read_context::fun_addr_sym_map_): Remove. (read_context::fun_entry_addr_sym_map_): Likewise. (read_context::fun_syms_): Likewise. (read_context::var_addr_sym_map_): Likewise. (read_context::var_syms_): Likewise. (read_context::undefined_fun_syms_): Likewise. (read_context::undefined_var_syms_): Likewise. (read_context::initialize): Adjust. (read_context::lookup_elf_symbol_from_index): Remove. (read_context::fun_entry_addr_sym_map_sptr): Likewise. (read_context::fun_entry_addr_sym_map): Likewise. (read_context::fun_syms_sptr): Likewise. (read_context::fun_syms): Likewise. (read_context::var_syms_sptr): Likewise. (read_context::var_syms): Likewise. (read_context::undefined_fun_syms_sptr): Likewise. (read_context::undefined_var_syms_sptr): Likewise. (read_context::load_symbol_maps_from_symtab_section): Likewise. (read_context::load_symbol_maps): Likewise. (read_context::maybe_load_symbol_maps): Likewise. (set_ignore_symbol_table): Likewise. (get_ignore_symbol_table): Likewise. (create_default_var_sym): Likewise. (build_var_decl): Adjust. (function_is_suppressed): Likewise. (variable_is_suppressed): Likewise. (build_function_decl): Likewise. (add_symbol_to_map): Remove. (read_corpus_from_elf): Adjust. (build_corpus_group_from_kernel_dist_under): Likewise. * tools/abidw.cc (main): Likewise. 
Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
Extend the test functionality in test-symtab to allow processing of KMI whitelists and add additional test cases for whitelist handling. * tests/data/Makefile.am: add new test files * tests/data/test-symtab/basic/ofov_all.whitelist: New test file, * tests/data/test-symtab/basic/ofov_function.whitelist: Likewise. * tests/data/test-symtab/basic/ofov_irrelevant.whitelist: Likewise. * tests/data/test-symtab/basic/ofov_variable.whitelist: Likewise. * tests/test-symtab.cc (read_corpus): Add support for whitelists. (assert_symbol_count): Likewise. (Symtab::SymtabWithWhitelist): New testcase. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com>
metti force-pushed the symtab_refactor branch 4 times, most recently from ce02c6e to 64e3bd0 (July 3, 2020 14:37)
In case of aliased symbols, the "main symbol" cannot be deduced from the symtab, as it solely contains a name->addr mapping and aliases are represented by multiple names resolving to the same address. Therefore the main symbol can only be picked rather randomly and unpredictably. DWARF, in contrast, contains a single symbol entry for only the main symbol. Hence we can detect the main symbol (late). Exploiting that property allows correcting the addr->symbol lookup in the symtab to return the correct main symbol, and it also allows correcting the aliased symbols to maintain the correct main symbol.

This patch adds the `update_main_symbol` functionality to `elf_symbol` to update the main symbol by name (ELF symbols need unique names) and adds `update_main_symbol` to `symtab`, which makes use of said new method. When we discover a main symbol during DWARF reading, we instruct the symtab to update the mapping.

This creates consistent representations across different builds of the same binary with the same ABI by pinning down the main symbol to the defined one. Knowing the main symbol also helps to keep the correct DWARF information in the representation in the presence of symbol suppressions. A later patch will address that.

Some test cases in tests/data need adjustment and they have all been verified to be valid changes.
- main symbol changed for various elf symbols
  - test-annotate/test15-pr18892.so.abi
  - test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi
  - test-annotate/test3.so.abi
  - test-read-dwarf/test15-pr18892.so.abi
  - test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi
  - test-read-dwarf/test3.so.abi
  - test-read-dwarf/test3.so.hash.abi
- due to main symbol changes, the symbol diff needs to be corrected
  - test-diff-dwarf/test12-report.txt
  - test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt
  - test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt
- the test scenario needed adjustments as the main symbol changed
  - test-diff-suppr/test23-alias-filter-4.suppr
  - test-diff-suppr/test23-alias-filter-report-0.txt
  - test-diff-suppr/test23-alias-filter-report-2.txt

As usual, the complete changelog follows.

* include/abg-ir.h (elf_symbol::update_main_symbol): New method.
* include/abg-symtab-reader.h (symtab::update_main_symbol): New method.
* src/abg-dwarf-reader.cc (build_var_decl): Hint symtab about main symbol discovered in DWARF.
  (build_function_decl): Likewise.
* src/abg-ir.cc (elf_symbol::get_main_symbol): Lock the weak_ptr on access in both overloads.
  (update_main_symbol): New method to allow updating the main symbol.
* src/abg-symtab-reader.cc (symtab::update_main_symbol): New method.
* data/Makefile.am: Add new test data files.
* tests/data/test-annotate/test15-pr18892.so.abi: Updated test file.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-annotate/test2.so.abi: Likewise.
* tests/data/test-annotate/test3.so.abi: Likewise.
* tests/data/test-diff-dwarf/test12-report.txt: Likewise.
* tests/data/test-diff-dwarf/test42-PR21296-clanggcc-report0.txt: Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-0.txt: Likewise.
* tests/data/test-diff-pkg/tbb-4.1-9.20130314.fc22.x86_64--tbb-4.3-3.20141204.fc23.x86_64-report-1.txt: Likewise.
* tests/data/test-diff-suppr/test23-alias-filter-4.suppr: Likewise.
* tests/data/test-diff-suppr/test23-alias-filter-report-0.txt: Likewise.
* tests/data/test-diff-suppr/test23-alias-filter-report-2.txt: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test3.so.abi: Likewise.
* tests/data/test-read-dwarf/test3.so.hash.abi: Likewise.
* tests/data/test-symtab/basic/aliases.c: New test source file.
* tests/data/test-symtab/basic/aliases.so: Likewise.
* tests/test-symtab.cc (Symtab::AliasedFunctionSymbols): New test case.
  (Symtab::AliasedVariableSymbols): Likewise.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
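To illustrate the main-symbol update described in the commit message above, here is a minimal sketch. It is not libabigail's `symtab` or `elf_symbol` code: the class `toy_symtab` and its members are hypothetical names, and the convention that the first entry of an alias group is the main symbol is an assumption made for this example only.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol table; NOT libabigail's symtab class.
class toy_symtab {
public:
  // Register a name at an address. Names sharing an address form an
  // alias group; the first name seen becomes the provisional main symbol,
  // which is as arbitrary as the choice a plain name->addr mapping allows.
  void add(const std::string& name, std::uint64_t addr) {
    name_to_addr_[name] = addr;
    addr_to_names_[addr].push_back(name);
  }

  // Main symbol of the alias group at addr (assumed: first entry).
  // Returns an empty string for an unknown address.
  std::string main_symbol_at(std::uint64_t addr) const {
    auto it = addr_to_names_.find(addr);
    if (it == addr_to_names_.end() || it->second.empty())
      return std::string();
    return it->second.front();
  }

  // Promote `name` to main symbol of its alias group, for example once a
  // DWARF reader has seen a declaration for it. Returns false if the
  // name is unknown.
  bool update_main_symbol(const std::string& name) {
    auto addr_it = name_to_addr_.find(name);
    if (addr_it == name_to_addr_.end())
      return false;
    std::vector<std::string>& group = addr_to_names_[addr_it->second];
    auto pos = std::find(group.begin(), group.end(), name);
    if (pos == group.end())
      return false;
    std::iter_swap(group.begin(), pos);
    return true;
  }

private:
  std::map<std::string, std::uint64_t> name_to_addr_;
  std::map<std::uint64_t, std::vector<std::string>> addr_to_names_;
};
```

After the update, an addr->symbol lookup returns the same main symbol regardless of the order in which the names were first encountered, which is the consistency property the commit message is after.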
When a symbol is suppressed and it happens to be the main symbol of a group of aliased symbols where another symbol is not suppressed, the DWARF reader discards the DWARF information upon reading and the writer is not able to connect DWARF information to the aliased ELF symbol.

In order to address this, ensure we are not suppressing symbols (actually functions and variables) for which an alias is not suppressed. We therefore keep the DWARF information even if only a non-main symbol is asked for.

Likewise, when the abg-writer has to attach an elf-symbol-id to the collected DWARF information (for functions and variables), instead of omitting the symbol altogether, make use of the property of aliases and connect the DWARF information to an alias instead. This way the function's DWARF information stays connected to the ELF symbol that we want to track.

* src/abg-dwarf-reader.cc (function_is_suppressed): Do not suppress a function for which there is an alias that is not suppressed.
  (variable_is_suppressed): Likewise for variables.
* src/abg-writer.cc (write_elf_symbol_reference): Fall back to any aliased symbol if the main symbol is suppressed.
* tests/data/Makefile.am: Add new test files.
* tests/data/test-read-dwarf/test3-alias-1.so.hash.abi: New test file.
* tests/data/test-read-dwarf/test3-alias-1.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-2.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-2.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-3.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-3.suppr: Likewise.
* tests/data/test-read-dwarf/test3-alias-4.so.hash.abi: Likewise.
* tests/data/test-read-dwarf/test3-alias-4.suppr: Likewise.
* tests/test-read-dwarf.cc: Add new test cases.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
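The two rules in the commit message above (keep a declaration while any alias is still wanted, and let the writer fall back to an unsuppressed alias) can be sketched as follows. This is an illustration only, not the code in `function_is_suppressed` or `write_elf_symbol_reference`: both function names below are hypothetical, and the alias group is assumed to list the main symbol first.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: a declaration is dropped only if every symbol in
// its alias group is suppressed. An empty group is treated as "not
// suppressed" here, which is a choice made for this sketch.
bool all_aliases_suppressed(const std::vector<std::string>& alias_group,
                            const std::set<std::string>& suppressed) {
  if (alias_group.empty())
    return false;
  for (const std::string& name : alias_group)
    if (suppressed.count(name) == 0)
      return false;  // at least one alias is still wanted
  return true;
}

// Hypothetical helper: choose the symbol a writer would reference.
// Because the main symbol is assumed to come first, it wins whenever it
// is not suppressed; otherwise the first unsuppressed alias is used.
// Returns an empty string if nothing in the group may be referenced.
std::string pick_symbol_reference(const std::vector<std::string>& alias_group,
                                  const std::set<std::string>& suppressed) {
  for (const std::string& name : alias_group)
    if (suppressed.count(name) == 0)
      return name;
  return std::string();
}
```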
The Linux Kernel has a mechanism (MODVERSIONS) to checksum symbols based on their type. This is similar to what libabigail does, but not the same. The CRC values for symbols can be extracted from the symtab either by following the __kcrctab_<symbol> entry or by using the __crc_<symbol> value directly.

This patch adds support for extracting those CRC values and storing them as a property of elf_symbol. Subsequently, 'crc' gets emitted as an attribute of 'elf-symbol' in the XML representation.

CRC comparisons are also added to the abidiff machinery: if both representations in a comparison contain a CRC value, the values are compared; if either value is unset (i.e. == 0), equality is assumed. Differences are reported in the format in which the Kernel presents them, e.g. via Module.symvers. It is likely, but not necessary, that a CRC difference comes along with an ABI difference reported by libabigail. Not everything that leads to a change of the CRC value is an ABI breakage in the libabigail sense.

Also add some test cases to ensure reading CRC values from kernel binaries works as expected. The empty-report files have been consolidated into one file: empty-report.txt. That also clarifies the expected outcome for the affected tests.

* include/abg-ir.h (elf_symbol::elf_symbol): Add crc parameter.
  (elf_symbol::create): Likewise.
  (elf_symbol::get_crc): New member method.
  (elf_symbol::set_crc): New member method.
* src/abg-ir.cc (elf_symbol::priv::crc_): New data member.
* src/abg-ir.cc (elf_symbol::priv::priv): Add crc parameter.
  (elf_symbol::elf_symbol): Likewise.
  (elf_symbol::create): Likewise.
  (elf_symbol::textually_equals): Add crc support.
  (elf_symbol::get_crc): New member method.
  (elf_symbol::set_crc): New member method.
* src/abg-reader.cc (build_elf_symbol): Add crc support.
* src/abg-reporter-priv.cc (maybe_report_diff_for_symbol): Likewise.
* src/abg-symtab-reader.cc (symtab::load): Likewise.
* src/abg-writer.cc (write_elf_symbol): Likewise.
* tests/data/Makefile.am: Add new test data files.
* tests/data/test-abidiff/empty-report.txt: New file.
* tests/data/test-abidiff/test-PR18166-libtirpc.so.report.txt: Deleted.
* tests/data/test-abidiff/test-PR24552-report0.txt: Deleted.
* tests/data/test-abidiff/test-crc-0.xml: New test file.
* tests/data/test-abidiff/test-crc-1.xml: Likewise.
* tests/data/test-abidiff/test-crc-2.xml: Likewise.
* tests/data/test-abidiff/test-crc-report.txt: Likewise.
* tests/data/test-abidiff/test-empty-corpus-report.txt: Deleted.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Add crc values.
* tests/data/test-read-write/test-crc.xml: New test data file.
* tests/data/test-symtab/kernel-modversions/Makefile: New test source.
* tests/data/test-symtab/kernel-modversions/one_of_each.c: Likewise.
* tests/data/test-symtab/kernel-modversions/one_of_each.ko: Likewise.
* tests/test-abidiff.cc: Add new test case.
* tests/test-read-write.cc: Likewise.
* tests/test-symtab.cc (Symtab::KernelSymtabsWithCRC): New test case.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
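The comparison rule stated in the commit message above (compare only when both sides carry a CRC, treat 0 as unset) is small enough to write down. The helper name `crcs_differ` is hypothetical and the 32-bit width is an assumption of this sketch; the commit message does not say which integer type `elf_symbol` uses to store the value.

```cpp
#include <cstdint>

// Hypothetical helper, not the actual abidiff code.
// A CRC of 0 means "unset". Two symbols are reported as differing only
// when both CRCs are set and the values are not equal.
bool crcs_differ(std::uint32_t first_crc, std::uint32_t second_crc) {
  const bool both_set = first_crc != 0 && second_crc != 0;
  return both_set && first_crc != second_crc;
}
```

One consequence of this rule is that comparing a kernel binary against an ABI representation that carries no CRC values never yields a CRC difference.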
When reading from XML with a symbol whitelist that leads to suppression of aliased symbols, abidiff would hit an assertion and crash when looking up the aliased symbol due to it being suppressed. In the new symtab reader we can still suppress a symbol without removing it from the lookup. Make use of that property to fix this bug. A test has been added for this as well.

* src/abg-reader.cc (build_elf_symbol): Improve handling of suppressed aliased symbols when reading from XML.
* src/abg-symtab-reader.cc (load): Likewise.
* tests/data/Makefile.am: Add new test data files.
* tests/data/test-abidiff-exit/test-missing-alias-report.txt: New test file.
* tests/data/test-abidiff-exit/test-missing-alias.abi: Likewise.
* tests/data/test-abidiff-exit/test-missing-alias.suppr: Likewise.
* tests/test-abidiff-exit.cc: Add support for whitelists and add new testcase.

Reviewed-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
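The property used by the fix above, suppressing a symbol without removing it from the lookup, can be pictured with a small sketch. The types `toy_symbol` and `toy_table` are hypothetical and do not reflect the layout of the real symtab reader; they only show why a lookup by name keeps working while the exported view shrinks.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical symbol record: suppression is a flag, not a deletion.
struct toy_symbol {
  std::string name;
  bool suppressed;
};

// Hypothetical table; NOT libabigail's symtab reader.
class toy_table {
public:
  void add(const std::string& name, bool suppressed) {
    symbols_[name] = toy_symbol{name, suppressed};
  }

  // Lookup succeeds for suppressed symbols too, so resolving an alias
  // never fails just because the alias was filtered out.
  const toy_symbol* lookup(const std::string& name) const {
    auto it = symbols_.find(name);
    return it == symbols_.end() ? nullptr : &it->second;
  }

  // Only unsuppressed symbols show up in the exported view.
  std::vector<std::string> exported() const {
    std::vector<std::string> names;
    for (const auto& entry : symbols_)
      if (!entry.second.suppressed)
        names.push_back(entry.first);
    return names;
  }

private:
  std::map<std::string, toy_symbol> symbols_;
};
```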
metti pushed a commit that referenced this pull request on Nov 24, 2020
The symptom of the issue at hand is that sometimes there can be types missing from the abixml output. This happens when analysing some C++ code bases.

The core of the issue is the following. Suppose we have a type "struct S" defined somewhere as:

    struct S // #0
    {
      int dm1;
      char dm2;
    };
    S s;

Suppose that in another translation unit, we have the class 'S' being extended to add a member type to it:

    struct S // #1
    {
      typedef int dm1_type;
    };
    typedef S::dm1_type Integer;
    Integer something;

When emitting the abixml for the codebase, the definition of the typedef S::dm1_type can be missing.

Note that in location #1, struct S is considered declaration-only. Its definition is in another translation unit, in location #0. So the abixml writer emits the 'struct S' defined in location #0, but forgets to emit the 'struct S' in #1, which is indirectly used for the sole purpose of using its member type S::dm1_type.

This patch emits the S::dm1_type type that is mistakenly forgotten today.

Now that the "struct S" of #1 is also emitted, a tangential problem is uncovered: S in #0 can be wrongly thought to be equivalent to S in #1, for ABI purposes. This is because of an ODR-based optimization that is used for C++. That is, the two struct S can be wrongly considered equivalent just because they have the same name. Note that ODR means "One Definition Rule[1]".

This patch removes the ODR-based optimization and thus fixes many of the issues uncovered by the previous changes.

The patch also uncovered that some non-static variables were sometimes wrongly being added to the set of exported variables while libabigail reads corpora from abixml. The patch fixes this as well.

[1]: One Definition Rule: https://en.wikipedia.org/wiki/One_Definition_Rule

* include/abg-corpus.h (corpus::{record_canonical_type, lookup_canonical_type}): Remove function declarations.
* src/abg-corpus-priv.h (corpus::priv::canonical_types_): Remove data member.
* src/abg-corpus.cc (corpus::{record_canonical_type, lookup_canonical_type}): Remove functions.
* src/abg-ir.cc (type_eligible_for_odr_based_comparison): Remove static function.
  (type_base::get_canonical_type_for): Don't perform the ODR-based optimization for C++ anymore.
* src/abg-reader.cc (read_context&::maybe_add_var_to_exported_decls): Don't add a variable that hasn't been added to its scope. Otherwise, it means we added a variable that wasn't yet properly constructed. Also add a new overload for var_decl_sptr&.
  (build_var_decl): Do not add the var to the set of exported declarations before we are sure it has been fully constructed and added to the scope it belongs to.
  (build_class_decl): Only add *static* data members to the list of exported declarations.
  (handle_var_decl): A var decl seen here is a global variable declaration. Add it to the list of exported declarations.
* src/abg-writer.cc (write_context::decl_only_type_is_emitted): Constify parameter.
  (write_translation_unit): Do not forget to emit referenced types that were maybe not canonicalized. Also, avoid using noop_deleter when it's not necessary.
  (write_namespace_decl): Do not forget to emit canonicalized types that are present in namespaces other than the global namespace.
* tests/runtestslowselfcompare.sh.in: New test that compares libabigail.so against its own ABIXML representation.
* tests/Makefile.am: Add the new test runtestslowselfcompare.sh to source distribution. This test is too slow to be run during the course of 'make check'. It takes more than 5 minutes on my slow box here. Rather, it can be run using 'make check-self-compare'. I plan to run this before releases now.
* tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Adjust.
* tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test0.abi: Likewise.
* tests/data/test-annotate/test13-pr18894.so.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi: Likewise.
* tests/data/test-read-dwarf/PR26261/PR26261-exe.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise.
* tests/data/test-read-dwarf/test0.abi: Likewise.
* tests/data/test-read-dwarf/test0.hash.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
* tests/data/test-read-write/test28-without-std-fns-ref.xml: Likewise.
* tests/data/test-read-write/test28-without-std-vars-ref.xml: Likewise.
* tests/data/test-read-write/test6.xml: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>