Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: don't let uproot.update mess up the TFile fVersion #1212

Merged
merged 1 commit into from
May 9, 2024

Conversation

jpivarski
Copy link
Member

When uproot.update was reading in a file, it used to unconditionally subtract 1000000 from the TFile::fVersion that it read:

out = FileHeader(
end,
free_location,
free_num_bytes,
free_num_slices_plus_1 - 1,
begin_num_bytes,
uproot.compression.Compression.from_code(compression_code),
info_location,
info_num_bytes,
uuid.UUID(bytes=uuid_bytes),
)
out._version = version - 1000000
out._begin = begin
return out

The offset of 1000000 is how ROOT distinguishes between files with (some) 64-bit seek positions and files that are all 32-bit. Uproot's FileHeader.version holds the real version number, without the offset, and puts the offset back in if needed:

if self.big:
format = uproot.reading._file_header_fields_big
version = self.version + 1000000
units = 8
else:
format = uproot.reading._file_header_fields_small
version = self.version
units = 4

But it was unconditionally subtracting 1000000 from the files it read, even if the file was fully 32-bit, as most are (because it only starts to matter for files bigger than 4 GB). So uproot.update was messing up a metadata field of most of the files it touched. (I can see how this mistake was made: I must have thought the line that subtracts 1000000 was still inside the if version >= 1000000: block, just above.)

Uproot doesn't do anything with the TFile::fVersion, other than use it as a 64-bit indicator, and I suppose we haven't been updating large enough files for this to be noticed. However, the version number that it wrote was nonsensical, a number near -937600.

ROOT doesn't do much with these numbers, either, but it does more than Uproot. This bug was discovered because ROOT was shifting into a mode in which it was handling files as though they were older than ROOT 3. (The last version of ROOT 2.x was in September of the year 2000.) That fallback code did not work for TLeafL and TLeafO (among the TLeaf subclasses that I tested; TLeafD, TLeafI, and TLeafS were fine), and that makes sense because TLeafL and TLeafO were added in 2003 and in 2005, respectively (the others are older, and that's why they're fine).


For @zbilodea: files made with this correction can be read from ROOT with no warnings.

>>> import uproot
>>> f = uproot.recreate("one-tree.root")
>>> import numpy as np
>>> f["tree1"] = {"branch": np.array([1, 2, 3])}
>>> f["tree2"] = {"branch": np.array([1, 2, 3])}
% cp one-tree.root two-trees.root
>>> import uproot
>>> f = uproot.update("one-tree.root")
>>> del f["tree2"]
root [0] auto f = new TFile("one-tree.root")
(TFile *) 0x5e75fab86820
root [1] TTree *t
(TTree *) nullptr
root [2] f->GetObject("tree1", t);
root [0] auto f = new TFile("two-trees.root")
(TFile *) 0x55e5977bc500
root [1] TTree *t
(TTree *) nullptr
root [2] f->GetObject("tree1", t);

You were correct in your observation that the TStreamerInfo for the TLeafL class and the bytes of the TLeafL instance were identical between the file that worked without errors and the file that didn't. The "hinge" came from uproot.update setting a bad version number and sending ROOT into this pre-version 3 mode: it didn't read the (correct) TStreamerInfo, and because TLeafL was not designed before ROOT version 3, it generated an incorrect TStreamerInfo instead of reading the one in the file. That's also why it didn't matter whether you used the high-level del or the low-level FreeSegments.release: either way, uproot.update set the TFile::fVersion to a nonsensical value.

I'll merge this PR into main right away; when you merge main into your PR, it should fix the TLeafL error.


This bug would have affected any files modified by uproot.update, any process that adds or removes an object, because any modification of the TFree record (map of unused space within the file) would overwrite the FileHeader to indicate that the TFree record might have moved. As long as these files were smaller than 4 GB, Uproot would not have noticed, but it could have influenced ROOT in subtle ways, as shown above.

@jpivarski
Copy link
Member Author

To follow up with more information, how this happened on the ROOT side: in TBufferFile::ReadClassBuffer, there's a check for v2file (as in, a 20th century ROOT file).

https://github.com/root-project/root/blob/f2dd53a49b29365ef2d48a2f32d959a9bfd9a94f/io/io/src/TBufferFile.cxx#L3486-L3491

If v2file, we go into a mode that builds an emulated TStreamerInfo, rather than building it directly:

https://github.com/root-project/root/blob/f2dd53a49b29365ef2d48a2f32d959a9bfd9a94f/io/io/src/TBufferFile.cxx#L3573-L3579

This is the only call to TStreamerInfo::BuildEmulated, which is defined here:

https://github.com/root-project/root/blob/f2dd53a49b29365ef2d48a2f32d959a9bfd9a94f/io/io/src/TStreamerInfo.cxx#L1248-L1282

While that works fine for a version 2-era class like TLeafD,

root [0] auto f = new TFile("broken-version-number.root");
root [1] auto cls_TLeafD = TClass::GetClass("TLeafD");
root [2] auto tsi_TLeafD = new TStreamerInfo(const_cast<TClass*>(cls_TLeafD));
root [3] tsi_TLeafD->Build();
root [4] tsi_TLeafD->Clear("build");
root [5] tsi_TLeafD->BuildEmulated(f);   // only using the TFile::fVersion to pass assertion
root [6] tsi_TLeafD->ls()

StreamerInfo for class: TLeafD, version=-1, checksum=0x7d1
  TLeaf          BASE            offset=  0 type= 0 Leaf: description of a Branch data type
  double         fMinimum        offset=112 type= 8 Minimum value if leaf range is specified
  double         fMaximum        offset=120 type= 8 Maximum value if leaf range is specified
   i= 0, TLeaf           type=  0, offset=  0, len=1, method=0
   i= 1, fMinimum        type= 28, offset=112, len=2, method=0 [optimized]

it does strange things to TLeafL and TLeafO (and maybe others; I didn't check):

root [7] auto cls_TLeafL = TClass::GetClass("TLeafL");
root [8] auto tsi_TLeafL = new TStreamerInfo(const_cast<TClass*>(cls_TLeafL));
root [9] tsi_TLeafL->Build();
root [10] tsi_TLeafL->Clear("build");
root [11] tsi_TLeafL->BuildEmulated(f);   // only using the TFile::fVersion to pass assertion
root [12] tsi_TLeafL->ls()

StreamerInfo for class: TLeafL, version=-1, checksum=0x7d1
  TLeaf          BASE            offset=  0 type= 0 Leaf: description of a Branch data type
  int            fMinimumQWERTY  offset=99999 type= 3                     
  Long64_t       fMinimum        offset=112 type=16 Minimum value if leaf range is specified
  int            fMaximumQWERTY  offset=99999 type= 3                     
  Long64_t       fMaximum        offset=120 type=16 Maximum value if leaf range is specified
   i= 0, TLeaf           type=  0, offset=  0, len=1, method=0
   i= 1, fMinimumQWERTY  type=103, offset=99999, len=1, method=0
   i= 2, fMinimum        type= 16, offset=112, len=1, method=0
   i= 3, fMaximumQWERTY  type=103, offset=99999, len=1, method=0
   i= 4, fMaximum        type= 16, offset=120, len=1, method=0

This is the TStreamerInfo that was used, and so it really did read too many bytes. Whereas a TLeaf's fMinimum and fMaximum would ordinarily both be 0, it read an fMinimum of 0 and an fMaximum of 196609 for this TLeafL. It was reading beyond the object's bounds; the error message was alerting us to a real problem.

@jpivarski jpivarski merged commit 2c7099d into main May 9, 2024
24 checks passed
@jpivarski jpivarski deleted the jpivarski/fix_TLeafL_too_many_bytes branch May 9, 2024 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant