Rotate FK-tables instead of EKOs #183

Open
cschwan opened this issue Jun 30, 2024 · 15 comments
Labels
good first issue Good for newcomers refactor Refactor code

Comments

@cschwan
Contributor

cschwan commented Jun 30, 2024

There's another big optimization potential to speed up the generation of FK-tables. Currently, I believe, pineko works the following way (correct me if I'm wrong):

  1. we store flavour-basis to flavour-basis EKOs for all Q2 slices, but always for the standard 50 x-node points. The biggest EKO that we have, ATLAS_1JET_8TEV_R06, is 2.4 GB.
  2. Next, we reinterpolate the EKO to match the x-grid points actually used in the grid. For ATLAS_1JET_8TEV_R06 this blows up the size of the EKO to 22 GB.
  3. Finally, we rotate the EKO, such that the resulting FK-tables are in the evolution basis. This further increases the size, resulting in a whopping 45 GB.

But thanks to some developments on PineAPPL's side, we can now perform a rotation at the level of grids (and therefore also FK-tables). So I suggest that

  • we perform the evolution using the unrotated EKOs, which has the advantage that the EKOs are much smaller, and finally
  • rotate the evolved FK-tables from the flavour-basis to the evolution basis.

Evolution in the flavour basis is much faster: the speed-up is roughly the ratio of the EKO sizes. For ATLAS_1JET_8TEV_R06 it's roughly twice as fast (45 GB / 22 GB), with the added benefit that we need less disk space, of course.
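The sizes quoted above translate into the following growth factors. This is only arithmetic on the numbers in this comment, and it relies on the assumption stated here that the evolution time scales with the EKO size.

```python
# EKO sizes for ATLAS_1JET_8TEV_R06 as quoted in this comment, in GB.
stored_gb = 2.4         # flavour basis, standard 50 x-node points
reinterpolated_gb = 22  # flavour basis, x-grid points of the grid
rotated_gb = 45         # evolution basis, x-grid points of the grid

reinterpolation_growth = reinterpolated_gb / stored_gb  # roughly a factor 9
rotation_growth = rotated_gb / reinterpolated_gb        # roughly a factor 2
total_growth = rotated_gb / stored_gb                   # roughly a factor 19

# Assumption from the comment: evolution time is proportional to EKO size,
# so skipping the EKO rotation saves about a factor `rotation_growth`.
print(f"reinterpolation: x{reinterpolation_growth:.1f}")
print(f"rotation:        x{rotation_growth:.1f}")
print(f"total:           x{total_growth:.1f}")
```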

Practically speaking, the changes on pineko's side should be minimal. We probably have to delete the following lines:

pineko/src/pineko/evolve.py

Lines 282 to 286 in 9f8b03c

if np.allclose(operators.bases.inputpids, br.flavor_basis_pids):
    eko.io.manipulate.to_evol(operators)
# Here we are checking if the EKO contains the rotation matrix (flavor to evol)
elif not np.allclose(operators.bases.inputpids, br.rotate_flavor_to_evolution):
    raise ValueError("The EKO is neither in flavor nor in evolution basis.")

and instead add the rotation after the evolution:

pineko/src/pineko/evolve.py

Lines 313 to 320 in 9f8b03c

fktable = grid.evolve(
    ekompatibility.pineappl_layout(operators),
    xir * xir * mur2_grid,
    alphas_values,
    "evol",
    order_mask=order_mask,
    xi=(xir, xif),
)

@alecandido
Member

Yes, I definitely agree about this, and we already discussed many times in the past with @felixhekhorn.

The limitation in this respect was never PineAPPL itself, just person power. In principle, we didn't even need to be able to rotate grids in order to implement this optimization: it would have been sufficient for EKO to expose the rotation, and for PineAPPL to consume it during evolution, i.e. at the end of the evolution.
It's just a matter of linear algebra, and many people realized that not all contraction paths are equivalent. Though, with a complex graph, finding the optimal one could be more expensive than performing the contraction itself (cf. tensor networks, opt_einsum, and cotengra).

In this case, our graph has three nodes, and it was pretty simple to optimize the contraction by hand. The only limitation was the availability of all ingredients in the same place.
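As a toy illustration of the three-node contraction, the sketch below compares the two orderings with plain dense matrices. The sizes and the random matrices standing in for the grid, the EKO and the rotation are assumptions made up for this example; the real objects carry more indices, but the counting argument is the same.

```python
import math
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cost(a, b):
    """Number of scalar multiplications in the dense product a @ b."""
    return len(a) * len(b) * len(b[0])


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_bins, n = 3, 8  # toy sizes: few bins, larger combined flavour-and-x dimension
grid = random_matrix(n_bins, n, rng)  # stands in for the interpolation grid
eko = random_matrix(n, n, rng)        # stands in for the flavour-basis EKO
rotation = random_matrix(n, n, rng)   # stands in for the basis rotation

# Path 1: rotate the EKO first, then evolve the grid with the rotated EKO.
rotated_eko = matmul(eko, rotation)
fk_rotate_eko = matmul(grid, rotated_eko)
cost_rotate_eko = cost(eko, rotation) + cost(grid, rotated_eko)

# Path 2: evolve with the unrotated EKO, then rotate the much smaller result.
evolved = matmul(grid, eko)
fk_rotate_fk = matmul(evolved, rotation)
cost_rotate_fk = cost(grid, eko) + cost(evolved, rotation)

# Both paths give the same result up to floating-point rounding.
same = all(
    math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)
    for row_a, row_b in zip(fk_rotate_eko, fk_rotate_fk)
    for x, y in zip(row_a, row_b)
)
print(same, cost_rotate_eko, cost_rotate_fk)
```

In this toy the gap grows with the ratio of the operator dimension to the number of bins, which is the sense in which rotating the evolved object is the cheaper path.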

@cschwan
Contributor Author

cschwan commented Jun 30, 2024

The rotation is already implemented in PineAPPL (thanks to @scarlehoff insisting on having this feature). In the CLI you can ask for:

pineappl write --rotate-pid-basis=EVOL <INPUT> <OUTPUT>

We just need to add a Python wrapper for Grid::rotate_pid_basis. My point is: now this really should be very simple.

@alecandido
Member

> The rotation is already implemented in PineAPPL (thanks to @scarlehoff insisting on having this feature). In the CLI you can ask for:

Having it in both PineAPPL and EKO is a duplicated feature, since the two often work together, especially for the purpose the rotation usually serves.

In principle, we could always deduplicate it by exposing the rotation matrix from PineAPPL as an ndarray, and use that in EKO, dropping the existing one.
But EKO has many more bases implemented (all the "intrinsic" ones), so I'm not sure it is convenient to fully migrate to either side.

Most likely, the clean approach would be to split these rotations into a separate crate, even within the EKO repo, as soon as EKO is rustified enough. Since there is no need for frequent changes on that side, it would rapidly become very stable, and a good dependency for PineAPPL.
But this is just a long-term maintenance plan. For the time being, it's fine to keep it as it is.

@cschwan
Contributor Author

cschwan commented Jun 30, 2024

> > The rotation is already implemented in PineAPPL (thanks to @scarlehoff insisting on having this feature). In the CLI you can ask for:

> Having it in both PineAPPL and EKO is a duplicated feature, since the two often work together, especially for the purpose the rotation usually serves.

Strictly speaking yes, but the rotation in PineAPPL does something very simple: it only rewrites the channel definition of the interpolation grid and doesn't even perform an evolution. This operation is extremely fast, and takes just a fraction of a second.

@alecandido
Member

alecandido commented Jun 30, 2024

> Strictly speaking yes, but the rotation in PineAPPL does something very simple: it only rewrites the channel definition of the interpolation grid and doesn't even perform an evolution. This operation is extremely fast, and takes just a fraction of a second.

Yes, because you're essentially postponing the contraction once more: it will be applied when you're going to convolve the PDF.
And postponing it is worthwhile because the PDF is definitely the cheapest object (just as the EKO is the most expensive).

However, the duplication is not in the application, which is PineAPPL-specific, but just in the matrix elements. The same could be used in both packages. And by dropping the zero elements you would obtain the same improvement for many different bases (e.g. not only the QCD evolution basis, but even the unified basis used in QEDxQCD evolution).

@felixhekhorn
Contributor

felixhekhorn commented Jul 1, 2024

Just to say that I agree with the idea of having the 14x14 matrix only once, but both programs need the feature since they have very different scopes. EKO is a (very) active player in flavor space, so it knows more and may hold all the necessary information. In fact, our beloved br module is imported in several places.

> it will be applied when you're going to convolve the PDF.

moreover: rotating the PDF only has to be done once, because then PineAPPL closes all open indices (something that is beyond EKO)

@alecandido
Member

alecandido commented Jul 1, 2024

> moreover: rotating the PDF only has to be done once, because then PineAPPL closes all open indices (something that is beyond EKO)

Just one caveat: the lines blur (as always) when you consider very large replica sets. In those cases, it is no longer clear that rotating the PDF is more convenient than rotating the grid, since it depends on the relative sizes.

But, for sure, it is always convenient to rotate the evolved object, be it the PDF set or the grid, and not the EKO itself.

@cschwan
Contributor Author

cschwan commented Jul 1, 2024

> Just one caveat: the lines blur (as always) when you consider very large replica sets. In those cases, it is no longer clear that rotating the PDF is more convenient than rotating the grid, since it depends on the relative sizes.

I don't think I understand this point; whether I evolve with a rotated EKO or evolve and then rotate the FK-tables, the FK-table is in both cases exactly the same (up to numerical inaccuracies and the ordering of the channels). They must be the same, because in the past we decided that FK-tables only allow trivial particle combinations, for instance (100, 100) is allowed, but (100, 100) + (103, 103) isn't.

@alecandido
Member

I'm just referring to performance. No doubt the observable will be the same (up to double precision).

Since the operation is associative, you can contract in your favorite order. But which is the best one depends on the size of the various dimensions.

> Strictly speaking yes, but the rotation in PineAPPL does something very simple: it only rewrites the channel definition of the interpolation grid

The FK table is not necessarily exactly the same if you store the final rotation in the channel definitions instead of burning it into the subgrids.
Storing it in the channels is equivalent to storing the rotation matrix (it is just another representation of it), and it means you first apply the rotation to the PDF, and then contract the rotated PDF with the subgrids.
This is also part of your contraction path choice.
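To make the two contraction paths concrete, here is a minimal sketch with a made-up two-flavour toy: pids 2 and -2, and the combinations "S" and "V" defined as their sum and difference. The weights and PDF values are invented for illustration, and nothing here reflects how PineAPPL stores channels internally.

```python
import math

# Toy channel definitions after the rotation: each flavour-basis entry is
# written as a combination of the toy evolution-like entries S and V, where
# S = u + ubar and V = u - ubar (illustrative only).
channels = {
    2: {"S": 0.5, "V": 0.5},    # u    = (S + V) / 2
    -2: {"S": 0.5, "V": -0.5},  # ubar = (S - V) / 2
}
subgrids = {2: 3.0, -2: 1.0}       # one toy weight per flavour-basis channel
pdf_evol = {"S": 0.75, "V": 0.25}  # toy PDF values given in the S/V basis

# Path 1: keep the rotation in the channel definitions. The PDF combination
# is formed first and then contracted with the untouched subgrids.
via_channels = sum(
    weight * sum(factor * pdf_evol[name] for name, factor in channels[pid].items())
    for pid, weight in subgrids.items()
)

# Path 2: burn the rotation into the subgrids, giving one subgrid per S/V entry.
burned = {
    name: sum(subgrids[pid] * channels[pid][name] for pid in subgrids)
    for name in pdf_evol
}
via_subgrids = sum(burned[name] * pdf_evol[name] for name in pdf_evol)

# Same prediction, different stored object.
assert math.isclose(via_channels, via_subgrids)
print(burned, via_channels, via_subgrids)
```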

@cschwan
Contributor Author

cschwan commented Jul 1, 2024

> The FK table is not necessarily exactly the same if you store the final rotation in the channel definitions instead of burning it into the subgrids. Storing it in the channels is equivalent to storing the rotation matrix (it is just another representation of it), and it means you first apply the rotation to the PDF, and then contract the rotated PDF with the subgrids. This is also part of your contraction path choice.

I left out an important detail: after rotating the channel definition the grid is no longer an FK-table as explained in my previous comment. But we can restore that property by calling 1) Grid::split_channels, 2) Grid::optimize and finally 3) an operation that absorbs the channel factors into subgrids. When you do that you really get the exact same FK-table, and I checked that explicitly with the CLI.
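The idea behind these steps can be sketched on a toy data structure. Everything below is hypothetical: the list-of-tuples layout, the single-PDF channels and the two-element subgrids are stand-ins, and the toy absorbs the factors before merging for simplicity, so it is not a description of how Grid::split_channels or Grid::optimize are implemented.

```python
from collections import defaultdict

# Toy grid after rotating the channel definitions: each channel is a list of
# (pid, factor) entries plus one subgrid (here just a short list of weights).
# The pids 100 and 103 are only labels in this toy.
rotated = [
    ([(100, 0.5), (103, 0.5)], [3.0, 6.0]),
    ([(100, 0.5), (103, -0.5)], [1.0, 2.0]),
]

# Split every multi-entry channel into single-entry channels, each with its
# own copy of the subgrid.
split = [([entry], list(subgrid)) for entries, subgrid in rotated for entry in entries]

# Absorb each channel factor into its subgrid, leaving a factor of 1.
absorbed = [
    ([(pid, 1.0)], [factor * w for w in subgrid])
    for [(pid, factor)], subgrid in split
]

# Merge channels that now share the same definition by adding their subgrids.
merged = defaultdict(lambda: [0.0, 0.0])
for [(pid, _)], subgrid in absorbed:
    merged[pid] = [a + b for a, b in zip(merged[pid], subgrid)]

# Every remaining channel is a single pid with unit factor.
fk_like = dict(merged)
print(fk_like)
```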

@felixhekhorn
Contributor

> because in the past we decided that FK-tables only allow trivial particle combinations

wait, this is a purely technical decision, because n3fit wants it that way - but this has nothing to do with the physical property of being an FK table. I'd say the defining property of an FK table is that it has a single factorization scale - and that is independent of what flavor basis representation you are using. A rotation preserves the predictions (given a proper normalization everywhere), so of course you can recover the "n3fit FK table".

> I'm just referring to performance.

that is the only thing we can discuss here - where and when it is numerically advantageous to rotate; the physics has to stay the same

@cschwan
Contributor Author

cschwan commented Jul 15, 2024

> > because in the past we decided that FK-tables only allow trivial particle combinations

> wait, this is a purely technical decision, because n3fit wants it that way - but this has nothing to do with the physical property of being an FK table. I'd say the defining property of an FK table is that it has a single factorization scale - and that is independent of what flavor basis representation you are using. A rotation preserves the predictions (given a proper normalization everywhere), so of course you can recover the "n3fit FK table".

Yes, but then you'd also have to change n3fit accordingly. The point I'm trying to make with this issue is that there's a big optimization potential that we can reap by changing only a few lines of code in pineko.

@felixhekhorn felixhekhorn added the refactor Refactor code label Jul 18, 2024
@cschwan cschwan added the good first issue Good for newcomers label Sep 25, 2024
@Radonirinaunimi
Member

After looking into this for a bit, I think I agree with @cschwan's proposal, and indeed the changes would be quite minimal from pineko's POV (the basis rotation has just been exposed in the PineAPPL Python API, see commit).

The only thing I am not sure about in the upcoming EKOv0.15 is the x-node points of the operator. @felixhekhorn, @giacomomagni, is the re-interpolation still required?

@cschwan
Contributor Author

cschwan commented Dec 2, 2024

The reinterpolation is still required in any case, I believe!

@felixhekhorn
Contributor

> The only thing I am not sure about in the upcoming EKOv0.15 is the x-node points of the operator. @felixhekhorn, @giacomomagni, is the re-interpolation still required?

someone needs to match the EKO (process scale) x-grid to the PineAPPL (process scale) x-grid and pineko is the correct player to do so
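What matching the x-grids means can be sketched with a toy interpolation matrix. The piecewise-linear interpolation in log(x), the function name and the node values below are assumptions chosen for brevity; they are not the interpolation scheme that EKO, PineAPPL or pineko actually use.

```python
import bisect
import math


def interpolation_matrix(source_x, target_x):
    """Rows give weights so that f(target_x[k]) ~ sum_j M[k][j] * f(source_x[j]).

    Piecewise-linear in log(x) as a simple stand-in; both grids must be sorted
    and every target must lie inside the source range.
    """
    logs = [math.log(x) for x in source_x]
    matrix = []
    for x in target_x:
        t = math.log(x)
        # Index of the right-hand neighbour, clamped to a valid interval.
        hi = min(max(bisect.bisect_right(logs, t), 1), len(logs) - 1)
        lo = hi - 1
        w_hi = (t - logs[lo]) / (logs[hi] - logs[lo])
        row = [0.0] * len(source_x)
        row[lo], row[hi] = 1.0 - w_hi, w_hi
        matrix.append(row)
    return matrix


def f(x):
    """Test function that is linear in log(x)."""
    return 2.0 - 0.5 * math.log(x)


eko_x = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]              # toy EKO x-nodes
grid_x = [2e-4, 5e-4, 3e-3, 2e-2, 5e-2, 0.3, 0.9]  # toy grid x-nodes

matrix = interpolation_matrix(eko_x, grid_x)

# A function linear in log(x) is reproduced by this toy interpolation.
interpolated = [sum(w * f(x) for w, x in zip(row, eko_x)) for row in matrix]
exact = [f(x) for x in grid_x]

# One row per grid node, one column per EKO node.
print(len(matrix), len(matrix[0]))
```

The matched object has one row per grid node, so in this toy its size follows the grid's x-grid rather than the one the EKO was stored with.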
