Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-objcopy] Add --change-section-address #98664

Merged
merged 4 commits into from
Jul 30, 2024

Conversation

eleanor-arm
Copy link
Contributor

--change-section address and its alias --adjust-section-vma allows modification
of section addresses in a relocatable file. This used to be used, for example,
in Fiasco microkernel.

On a relocatable file this option behaves the same as GNU objcopy, apart from
the fact that it does not issue any warnings, for example, when an argument is
not used.
GNU objcopy does not produce an error when passed an executable file but the
usecase for this is not clear, and the behaviour is inconsistent. The idea of
GNU objcopy --change-section-address is that the option should change both LMA
and VMA in an executable file. Since this patch does not implement executable
file support, only VMA is changed.

@llvmbot
Copy link
Member

llvmbot commented Jul 12, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: Eleanor Bonnici (eleanor-arm)

Changes

--change-section address and its alias --adjust-section-vma allows modification
of section addresses in a relocatable file. This used to be used, for example,
in Fiasco microkernel.

On a relocatable file this option behaves the same as GNU objcopy, apart from
the fact that it does not issue any warnings, for example, when an argument is
not used.
GNU objcopy does not produce an error when passed an executable file but the
usecase for this is not clear, and the behaviour is inconsistent. The idea of
GNU objcopy --change-section-address is that the option should change both LMA
and VMA in an executable file. Since this patch does not implement executable
file support, only VMA is changed.


Full diff: https://github.com/llvm/llvm-project/pull/98664.diff

6 Files Affected:

  • (modified) llvm/docs/CommandGuide/llvm-objcopy.rst (+9)
  • (modified) llvm/include/llvm/ObjCopy/CommonConfig.h (+57-39)
  • (modified) llvm/lib/ObjCopy/ConfigManager.cpp (+8-4)
  • (modified) llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp (+46)
  • (modified) llvm/tools/llvm-objcopy/ObjcopyOptions.cpp (+69)
  • (modified) llvm/tools/llvm-objcopy/ObjcopyOpts.td (+8)
diff --git a/llvm/docs/CommandGuide/llvm-objcopy.rst b/llvm/docs/CommandGuide/llvm-objcopy.rst
index 8ccb025a3f0f3..78663fb1cef23 100644
--- a/llvm/docs/CommandGuide/llvm-objcopy.rst
+++ b/llvm/docs/CommandGuide/llvm-objcopy.rst
@@ -303,6 +303,15 @@ them.
 
  Shift LMA of non-zero-sized segments by ``<val>``.
 
+.. option:: --change-section-address <sectionpattern>{=+-}<val>, --adjust-section-vma
+
+ Change the address of ``<sectionpattern>`` to the specified value, or apply
+ offset to the current value. Can be specified multiple times to specify multiple
+ patterns. Each section is only modified by one --change-section-address
+ argument. Changes apply from the right of the command line. If a section name
+ matches multiple patterns, the rightmost change applies. Object file needs to be
+ relocatable.
+
 .. option:: --change-start <incr>, --adjust-start
 
  Add ``<incr>`` to the program's start address. Can be specified multiple
diff --git a/llvm/include/llvm/ObjCopy/CommonConfig.h b/llvm/include/llvm/ObjCopy/CommonConfig.h
index 7f9d90d528b3e..5e447670c4f99 100644
--- a/llvm/include/llvm/ObjCopy/CommonConfig.h
+++ b/llvm/include/llvm/ObjCopy/CommonConfig.h
@@ -44,45 +44,6 @@ struct MachineInfo {
   bool IsLittleEndian;
 };
 
-// Flags set by --set-section-flags or --rename-section. Interpretation of these
-// is format-specific and not all flags are meaningful for all object file
-// formats. This is a bitmask; many section flags may be set.
-enum SectionFlag {
-  SecNone = 0,
-  SecAlloc = 1 << 0,
-  SecLoad = 1 << 1,
-  SecNoload = 1 << 2,
-  SecReadonly = 1 << 3,
-  SecDebug = 1 << 4,
-  SecCode = 1 << 5,
-  SecData = 1 << 6,
-  SecRom = 1 << 7,
-  SecMerge = 1 << 8,
-  SecStrings = 1 << 9,
-  SecContents = 1 << 10,
-  SecShare = 1 << 11,
-  SecExclude = 1 << 12,
-  SecLarge = 1 << 13,
-  LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/SecLarge)
-};
-
-struct SectionRename {
-  StringRef OriginalName;
-  StringRef NewName;
-  std::optional<SectionFlag> NewFlags;
-};
-
-struct SectionFlagsUpdate {
-  StringRef Name;
-  SectionFlag NewFlags;
-};
-
-enum class DiscardType {
-  None,   // Default
-  All,    // --discard-all (-x)
-  Locals, // --discard-locals (-X)
-};
-
 enum class MatchStyle {
   Literal,  // Default for symbols.
   Wildcard, // Default for sections, or enabled with --wildcard (-w).
@@ -191,6 +152,61 @@ struct NewSectionInfo {
   std::shared_ptr<MemoryBuffer> SectionData;
 };
 
+// Flags set by --set-section-flags or --rename-section. Interpretation of these
+// is format-specific and not all flags are meaningful for all object file
+// formats. This is a bitmask; many section flags may be set.
+enum SectionFlag {
+  SecNone = 0,
+  SecAlloc = 1 << 0,
+  SecLoad = 1 << 1,
+  SecNoload = 1 << 2,
+  SecReadonly = 1 << 3,
+  SecDebug = 1 << 4,
+  SecCode = 1 << 5,
+  SecData = 1 << 6,
+  SecRom = 1 << 7,
+  SecMerge = 1 << 8,
+  SecStrings = 1 << 9,
+  SecContents = 1 << 10,
+  SecShare = 1 << 11,
+  SecExclude = 1 << 12,
+  SecLarge = 1 << 13,
+  LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/SecLarge)
+};
+
+struct SectionRename {
+  StringRef OriginalName;
+  StringRef NewName;
+  std::optional<SectionFlag> NewFlags;
+};
+
+struct SectionFlagsUpdate {
+  StringRef Name;
+  SectionFlag NewFlags;
+};
+
+struct AddressUpdate {
+  uint64_t Value = 0;
+  bool Absolute = false;
+  bool Negative = false;
+};
+
+struct SectionPatternAddressUpdate {
+  NameMatcher SectionPattern;
+  AddressUpdate Update;
+};
+
+struct SectionNameAddressUpdate {
+  StringRef Name;
+  AddressUpdate Update;
+};
+
+enum class DiscardType {
+  None,   // Default
+  All,    // --discard-all (-x)
+  Locals, // --discard-locals (-X)
+};
+
 // Configuration for copying/stripping a single file.
 struct CommonConfig {
   // Main input/output options
@@ -219,6 +235,7 @@ struct CommonConfig {
   SmallVector<NewSectionInfo, 0> AddSection;
   SmallVector<StringRef, 0> DumpSection;
   SmallVector<NewSectionInfo, 0> UpdateSection;
+  SmallVector<SectionPatternAddressUpdate, 0> ChangeSectionAddress;
 
   // Section matchers
   NameMatcher KeepSection;
@@ -241,6 +258,7 @@ struct CommonConfig {
   StringMap<SectionFlagsUpdate> SetSectionFlags;
   StringMap<uint64_t> SetSectionType;
   StringMap<StringRef> SymbolsToRename;
+  StringMap<AddressUpdate> SectionsToUpdateAddress;
 
   // Symbol info specified by --add-symbol option.
   SmallVector<NewSymbolInfo, 0> SymbolsToAdd;
diff --git a/llvm/lib/ObjCopy/ConfigManager.cpp b/llvm/lib/ObjCopy/ConfigManager.cpp
index c542c4e5f0743..78fc0c451e1a3 100644
--- a/llvm/lib/ObjCopy/ConfigManager.cpp
+++ b/llvm/lib/ObjCopy/ConfigManager.cpp
@@ -26,7 +26,8 @@ Expected<const COFFConfig &> ConfigManager::getCOFFConfig() const {
       Common.DecompressDebugSections ||
       Common.DiscardMode == DiscardType::Locals ||
       !Common.SymbolsToAdd.empty() || Common.GapFill != 0 ||
-      Common.PadTo != 0 || Common.ChangeSectionLMAValAll != 0)
+      Common.PadTo != 0 || Common.ChangeSectionLMAValAll != 0 ||
+      !Common.ChangeSectionAddress.empty())
     return createStringError(llvm::errc::invalid_argument,
                              "option is not supported for COFF");
 
@@ -48,7 +49,8 @@ Expected<const MachOConfig &> ConfigManager::getMachOConfig() const {
       Common.DecompressDebugSections || Common.StripUnneeded ||
       Common.DiscardMode == DiscardType::Locals ||
       !Common.SymbolsToAdd.empty() || Common.GapFill != 0 ||
-      Common.PadTo != 0 || Common.ChangeSectionLMAValAll != 0)
+      Common.PadTo != 0 || Common.ChangeSectionLMAValAll != 0 ||
+      !Common.ChangeSectionAddress.empty())
     return createStringError(llvm::errc::invalid_argument,
                              "option is not supported for MachO");
 
@@ -68,7 +70,8 @@ Expected<const WasmConfig &> ConfigManager::getWasmConfig() const {
       !Common.SectionsToRename.empty() || !Common.SetSectionAlignment.empty() ||
       !Common.SetSectionFlags.empty() || !Common.SetSectionType.empty() ||
       !Common.SymbolsToRename.empty() || Common.GapFill != 0 ||
-      Common.PadTo != 0 || Common.ChangeSectionLMAValAll != 0)
+      Common.PadTo != 0 || Common.ChangeSectionLMAValAll != 0 ||
+      !Common.ChangeSectionAddress.empty())
     return createStringError(llvm::errc::invalid_argument,
                              "only flags for section dumping, removal, and "
                              "addition are supported");
@@ -97,7 +100,8 @@ Expected<const XCOFFConfig &> ConfigManager::getXCOFFConfig() const {
       Common.StripDebug || Common.StripNonAlloc || Common.StripSections ||
       Common.Weaken || Common.StripUnneeded || Common.DecompressDebugSections ||
       Common.GapFill != 0 || Common.PadTo != 0 ||
-      Common.ChangeSectionLMAValAll != 0) {
+      Common.ChangeSectionLMAValAll != 0 ||
+      !Common.ChangeSectionAddress.empty()) {
     return createStringError(
         llvm::errc::invalid_argument,
         "no flags are supported yet, only basic copying is allowed");
diff --git a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
index 075455c034154..718d6f2739382 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
@@ -745,6 +745,52 @@ static Error handleArgs(const CommonConfig &Config, const ELFConfig &ELFConfig,
     }
   }
 
+  if (!Config.ChangeSectionAddress.empty()) {
+    if (Obj.isRelocatable()) {
+      StringMap<AddressUpdate> SectionsToUpdateAddress;
+      for (const SectionPatternAddressUpdate &PatternUpdate :
+           make_range(Config.ChangeSectionAddress.rbegin(),
+                      Config.ChangeSectionAddress.rend())) {
+        for (SectionBase &Sec : Obj.sections()) {
+          if (PatternUpdate.SectionPattern.matches(Sec.Name)) {
+            if (SectionsToUpdateAddress
+                    .try_emplace(Sec.Name, PatternUpdate.Update)
+                    .second) {
+              if (PatternUpdate.Update.Absolute) {
+                Sec.Addr = PatternUpdate.Update.Value;
+              } else if (PatternUpdate.Update.Negative &&
+                         Sec.Addr < PatternUpdate.Update.Value) {
+                return createStringError(
+                    errc::invalid_argument,
+                    "address 0x" + Twine::utohexstr(Sec.Addr) +
+                        " cannot be decreased by 0x" +
+                        Twine::utohexstr(PatternUpdate.Update.Value) +
+                        ". The result would underflow");
+              } else if (!PatternUpdate.Update.Negative &&
+                         Sec.Addr > std::numeric_limits<uint64_t>::max() -
+                                        PatternUpdate.Update.Value) {
+                return createStringError(
+                    errc::invalid_argument,
+                    "address 0x" + Twine::utohexstr(Sec.Addr) +
+                        " cannot be increased by 0x" +
+                        Twine::utohexstr(PatternUpdate.Update.Value) +
+                        ". The result would overflow");
+              } else if (PatternUpdate.Update.Negative) {
+                Sec.Addr -= PatternUpdate.Update.Value;
+              } else {
+                Sec.Addr += PatternUpdate.Update.Value;
+              }
+            }
+          }
+        }
+      }
+    } else {
+      return createStringError(
+          object_error::invalid_file_type,
+          "cannot change section address in a non-relocatable file");
+    }
+  }
+
   if (Config.OnlyKeepDebug)
     for (auto &Sec : Obj.sections())
       if (Sec.Flags & SHF_ALLOC && Sec.Type != SHT_NOTE)
diff --git a/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp b/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
index a54feb7644bda..be4511c1d7c00 100644
--- a/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
+++ b/llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
@@ -584,6 +584,65 @@ static Expected<int64_t> parseChangeSectionLMA(StringRef ArgValue,
   return *LMAValue;
 }
 
+static Expected<SectionPatternAddressUpdate>
+parseChangeSectionAddr(StringRef ArgValue, StringRef OptionName,
+                       MatchStyle SectionMatchStyle,
+                       function_ref<Error(Error)> ErrorCallback) {
+
+  SectionPatternAddressUpdate PatternUpdate;
+  std::pair<StringRef, StringRef> UpdateParts;
+  std::string UpdateSymbol;
+
+  if (ArgValue.contains("+")) {
+    UpdateParts = ArgValue.split("+");
+    UpdateSymbol = '+';
+  } else if (ArgValue.contains("-")) {
+    UpdateParts = ArgValue.split("-");
+    UpdateSymbol = '-';
+    PatternUpdate.Update.Negative = true;
+  } else if (ArgValue.contains("=")) {
+    UpdateParts = ArgValue.split("=");
+    UpdateSymbol = '=';
+    PatternUpdate.Update.Absolute = true;
+  } else {
+    return createStringError(errc::invalid_argument,
+                             "bad format for " + OptionName +
+                                 ": argument value " + ArgValue +
+                                 " is invalid. See help");
+  }
+
+  if (UpdateParts.second.empty()) {
+    if (UpdateSymbol == "+" || UpdateSymbol == "-")
+      return createStringError(errc::invalid_argument,
+                               "bad format for " + OptionName +
+                                   ": missing value of offset after '" +
+                                   UpdateSymbol + "'");
+    else if (UpdateSymbol == "=")
+      return createStringError(errc::invalid_argument,
+                               "bad format for " + OptionName +
+                                   ": missing address value after '='");
+  }
+  if (UpdateParts.first.empty())
+    return createStringError(
+        errc::invalid_argument,
+        "bad format for " + OptionName +
+            ": missing section pattern to apply address change to");
+
+  if (Error E = PatternUpdate.SectionPattern.addMatcher(NameOrPattern::create(
+          UpdateParts.first, SectionMatchStyle, ErrorCallback)))
+    return std::move(E);
+
+  auto AddrValue = getAsInteger<uint64_t>(UpdateParts.second);
+  if (!AddrValue)
+    return createStringError(AddrValue.getError(),
+                             "bad format for " + OptionName + ": value after " +
+                                 UpdateSymbol + " is " + UpdateParts.second +
+                                 " when it should be an integer");
+
+  PatternUpdate.Update.Value = *AddrValue;
+  return PatternUpdate;
+}
+
 // parseObjcopyOptions returns the config and sets the input arguments. If a
 // help flag is set then parseObjcopyOptions will print the help messege and
 // exit.
@@ -874,6 +933,16 @@ objcopy::parseObjcopyOptions(ArrayRef<const char *> RawArgsArr,
     Config.ChangeSectionLMAValAll = *LMAValue;
   }
 
+  for (auto *Arg : InputArgs.filtered(OBJCOPY_change_section_address)) {
+    Expected<SectionPatternAddressUpdate> AddressUpdate =
+        parseChangeSectionAddr(Arg->getValue(), Arg->getSpelling(),
+                               SectionMatchStyle, ErrorCallback);
+    if (!AddressUpdate)
+      return AddressUpdate.takeError();
+    else
+      Config.ChangeSectionAddress.push_back(*AddressUpdate);
+  }
+
   for (auto *Arg : InputArgs.filtered(OBJCOPY_redefine_symbol)) {
     if (!StringRef(Arg->getValue()).contains('='))
       return createStringError(errc::invalid_argument,
diff --git a/llvm/tools/llvm-objcopy/ObjcopyOpts.td b/llvm/tools/llvm-objcopy/ObjcopyOpts.td
index b26c497ca3997..15bfbd7ab7b23 100644
--- a/llvm/tools/llvm-objcopy/ObjcopyOpts.td
+++ b/llvm/tools/llvm-objcopy/ObjcopyOpts.td
@@ -267,6 +267,14 @@ defm change_section_lma
     : Eq<"change-section-lma", "Shift LMA of non-zero-sized sections in the program header by <val>">,
       MetaVarName<"*{+|-}val">;
 
+defm change_section_address
+    : Eq<"change-section-address", "Set, or adjust the address of sections in <sectionpattern> to, "
+         "or by <val> in a relocatable file">,
+      MetaVarName<"sectionpattern{=|+|-}val">;
+def adjust_section_vma : JoinedOrSeparate<["--"], "adjust-section-vma">,
+                   Alias<change_section_address>,
+                   HelpText<"Alias for --change-section-address">;
+
 defm add_symbol
     : Eq<"add-symbol", "Add new symbol <name> to .symtab. Accepted flags: "
          "global, local, weak, default, hidden, protected, file, section, object, "

--change-section address and its alias --adjust-section-vma allows modification
of section addresses in a relocatable file. This used to be used, for example,
in Fiasco microkernel.

On a relocatable file this option behaves the same as GNU objcopy, apart from
the fact that it does not issue any warnings, for example, when an argument is
not used.
GNU objcopy does not produce an error when passed an executable file but the
usecase for this is not clear, and the behaviour is inconsistent. The idea of
GNU objcopy --change-section-address is that the option should change both LMA
and VMA in an executable file. Since this patch does not implement executable
file support, only VMA is changed.
@eleanor-arm eleanor-arm force-pushed the objcopy-change-address branch from bee89de to 1c5a984 Compare July 12, 2024 17:32
@jh7370 jh7370 requested review from MaskRay and jh7370 July 15, 2024 06:53
@jh7370
Copy link
Collaborator

jh7370 commented Jul 16, 2024

What is the purpose of this option in practice? You mentioned it "used to be used" in a micro kernel, implying it isn't needed anymore.

@eleanor-arm eleanor-arm requested a review from smithp35 July 16, 2024 11:05
@eleanor-arm
Copy link
Contributor Author

Yes, I changed it in Fiasco recently, the older versions would still need this. There seems to be some popularity of using --change-section-address to "shift" data sections during conversion from an ELF object file to binary in embedded systems. Another example: https://github.com/charlottestick/SimpleOS/blob/2b63c6481353d6cc1ae2a198e847a3716e52e6e3/compile.sh#L27

Thanks @smithp35 for the example.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I confirm that the behaviour you're following is compatible with GNU? In particular, if you end up with multiple patterns that match the same section, does only the last one apply for certain, but other sections which don't match later patterns are still adjusted by earlier ones?

@eleanor-arm
Copy link
Contributor Author

Can I confirm that the behaviour you're following is compatible with GNU? In particular, if you end up with multiple patterns that match the same section, does only the last one apply for certain, but other sections which don't match later patterns are still adjusted by earlier ones?

Yes. That is what GNU does. I copied the behaviour from there, otherwise I would have probably done something else.

For example, this test produces the same output with aarch64-linux-gnu-objcopy:

# RUN: llvm-objcopy --change-section-address *+0x20 --change-section-address .anotherone=0x455 %ti %to
# RUN: llvm-readelf --section-headers %to | FileCheck %s --check-prefix=CHECK-SUPERSET-SET

@eleanor-arm
Copy link
Contributor Author

As I was updating the tests, I noticed that GNU objcopy is not able to change .shstrtab address. If the pattern matches .shstrtab it will be silently ignored and if it's explicitly set a warning is printed.

On the other hand, .strtab is modified when it's part of the pattern but can't be changed explicitly either.

This seems like a bug in GNU objcopy to me, or am area that has not been properly implemented.

This implementation does not treat special sections in any special way.

@eleanor-arm eleanor-arm requested a review from jh7370 July 24, 2024 16:19
@eleanor-arm eleanor-arm requested a review from jh7370 July 27, 2024 08:09
# RUN: llvm-objcopy --change-section-address *+0x20 %ti %to
# RUN: llvm-readelf --section-headers %to | FileCheck %s --check-prefix=CHECK-ADD-ALL

# Check --change-section-address alias --adjust-section-vma
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's similar, but the difference is that if there is anything impacted by the different name, that isn't covered by the FileCheck line, the cmp will catch it. For example, imagine an option that was supposed to be an alias for --strip-debug, you might check that there are no section names matching "debug" in the output, but if you accidentally aliased --strip-all instead, you'd have this behaviour and more.

@eleanor-arm eleanor-arm requested a review from jh7370 July 29, 2024 19:08
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@eleanor-arm eleanor-arm merged commit 2b2f4ae into llvm:main Jul 30, 2024
8 checks passed
@eleanor-arm eleanor-arm deleted the objcopy-change-address branch July 30, 2024 09:58
@MaskRay
Copy link
Member

MaskRay commented Jul 31, 2024

I think github "Squash and merge" wraps your commit message at 72 bytes (80 minus 1 tab).
If you description is wrapped at 80 bytes, the landed commit might appear hard wrapped.

@eleanor-arm
Copy link
Contributor Author

Thanks. That's good to know for next time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants