Add cascade collider #1326
Conversation
src/celeritas/phys/FourVector.hh
Outdated
/*!
 * Add two four-vectors.
 */
inline CELER_FUNCTION FourVector add(FourVector const& a, FourVector const& b)
Instead of add we might as well make it operator+, but perhaps first we should do operator+=?
Okay, I added operator+ and the assignment operator+=. I also realize that it may be better to template FourVector on the precision type, which I may do in another MR. I did not add other assignment/arithmetic operators as they are unlikely to be used.
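As a rough sketch of what those two operators might look like (hypothetical, not necessarily the merged code; it assumes FourVector keeps its mom and energy members and the CELER_FUNCTION decoration used elsewhere in this file):

//! Accumulate another four-vector in place (sketch)
inline CELER_FUNCTION FourVector& operator+=(FourVector& lhs, FourVector const& rhs)
{
    for (int i = 0; i < 3; ++i)
    {
        lhs.mom[i] += rhs.mom[i];
    }
    lhs.energy += rhs.energy;
    return lhs;
}

//! Add two four-vectors (sketch)
inline CELER_FUNCTION FourVector operator+(FourVector lhs, FourVector const& rhs)
{
    lhs += rhs;
    return lhs;
}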
src/celeritas/phys/FourVector.hh
Outdated
@@ -28,6 +28,8 @@ struct FourVector
    real_type energy{0};  //!< Particle energy
};

namespace lorentz
{
I don't think we need a whole namespace for this... and even if we did, the entire file (especially the FourVector) should be in it.
Removed - I thought that it might help with clarity.
CELER_ASSERT(channel_data.par.slope > 0);
xs_params.push_back(channel_data.par);

GenericGridBuilder build_grid{&data.reals};
make_builder(&data.nucleon_xs)
-    .push_back(build_grid(bins, make_span(channel_data.xs)));
+    .push_back(build_grid(xs_energy_bins, make_span(channel_data.xs)));
To allow deduplication of cross section grids etc., the build_grid (and the other builders) should be outside the for loop. I'd also replace make_builder(&data.nucleon_xs) inside the loop with CollectionBuilder nucleon_xs{&data.nucleon_xs}; outside of it. (The make_builder command was before we had C++17 CTAD.)
Good to know! Updated as suggested. Thanks.
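A rough sketch of that restructuring (the loop and variable names here are illustrative, not the actual Celeritas code):

// Builders are constructed once, outside the per-channel loop
GenericGridBuilder build_grid{&data.reals};
CollectionBuilder nucleon_xs{&data.nucleon_xs};  // CTAD deduces the collection type

for (auto const& channel_data : channels)  // hypothetical loop over imported channels
{
    CELER_ASSERT(channel_data.par.slope > 0);
    xs_params.push_back(channel_data.par);
    nucleon_xs.push_back(build_grid(xs_energy_bins, make_span(channel_data.xs)));
}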
{
    StepanovParameters par;
-   ChannelArray xs;
+   Array<double, 13> xs;
    Array<double, 19 * 6> cdf;  //! c.d.f[cos_theta_bins][cdf_energy_bins]
Since these are fixed dimensions it would be more sensible to have:
using ArrayEnergy = Array<double, 6>;
Array<ArrayEnergy, 19> cdf; // [angle][energy]
Also since the constants each have only 4 digits of precision we could probably use float... or store even a 16-bit integer if we first multiply them by 10000 😆 it's fine to keep them as double though.
Okay, I'll keep them as they are for now.
@whokion Forgetting about the precision type: the documentation is wrong since the array is accessed and built with {energy, angle}
-    Array<double, 19 * 6> cdf;  //! c.d.f[cos_theta_bins][cdf_energy_bins]
+    Array<double, 6 * 19> cdf;  //! [energy][angle]
FourVector vec4;  //!< Four momentum in natural MevMomentum and
                  //!< MevEnergy units
I'm not a big fan of the combined text+number; could it just be "momentum"?
Changed vec4 to four_vec.
@sethrj Thanks a lot for all the review comments. Will follow up later today! Regarding vec4 --> momentum, I will think of an alternative, as a component of FourVector is ...
Nice addition @whokion! Thanks! Just a couple of tiny documentation fixes.
@amandalund Thanks for the review and suggestions!
Some of the arithmetic operations in the collider need to be checked/clarified/simplified...
size_type idx = cos_grid.size() - 2;
real_type cdf_upper = 0;
real_type cdf_lower = 1;

do
{
    cdf_upper = cdf_lower;
    cdf_lower = calc_cdf({kin_energy_, cos_grid[idx]});
} while (cdf_lower > cdf && idx-- > 0);

real_type frac = (cdf - cdf_lower) / (cdf_upper - cdf_lower);
cos_theta = fma(frac, cos_grid[idx + 1] - cos_grid[idx], cos_grid[idx]);
Even though this algorithm is only ten lines, it's a little complicated; I think we'll see it a lot in tabulated sampling, and it could use some extra assertions/testing. After this merge request I'll create a helper function that encapsulates it.
Great! I originally tried an array of GenericGridData [energy bin], so that we could swap x and y after finding the corresponding cos_grid_data for each energy bin, then use the GenericGridCalculator to sample cos(theta). I believe the two approaches are basically equivalent, but using TwodGridData is much more compact and cleaner. If we can encapsulate the inverse sampling inside Twod(Sub)gridCalculator, that would be even better! Thanks for your help.
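As a sketch of what such a helper might look like (the name, signature, and use of Span are hypothetical; the energy dependence would be captured in the calc_cdf functor, and the tabulated CDF is assumed to be zero at the first grid point so the walk always terminates on a valid bin):

// Hypothetical free function: given a sampled CDF value, a grid of cos(theta)
// points, and a functor that evaluates the tabulated CDF at a grid point,
// locate the bracketing bin and interpolate linearly within it.
template<class F>
inline CELER_FUNCTION real_type
sample_from_cdf(real_type cdf, Span<real_type const> grid, F&& calc_cdf)
{
    CELER_EXPECT(grid.size() >= 2);

    // Walk down from the top of the grid until the tabulated CDF brackets the
    // sampled value
    size_type idx = grid.size() - 2;
    real_type cdf_upper = 0;
    real_type cdf_lower = 1;
    do
    {
        cdf_upper = cdf_lower;
        cdf_lower = calc_cdf(grid[idx]);
    } while (cdf_lower > cdf && idx-- > 0);

    // Interpolate linearly within the bracketing bin
    real_type frac = (cdf - cdf_lower) / (cdf_upper - cdf_lower);
    return fma(frac, grid[idx + 1] - grid[idx], grid[idx]);
}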
result[0].four_vec
    = {{fv.mom[0] * vscm + fv.mom[1] * vxcm + fv.mom[2] * cm_dir},
       fv.energy};
This is exactly why I didn't want to make "array operators": even though this reads like a simple algebra expression, C++ is creating an insane number of temporaries here. I created some test kernels replicating this operation: as written, this takes 35 registers and 280 (!) bytes of stack space. Rewriting inside a for loop reduces the usage to 32 registers and 88 bytes of stack space, or even less (depending on how it's written).
-    result[0].four_vec
-        = {{fv.mom[0] * vscm + fv.mom[1] * vxcm + fv.mom[2] * cm_dir},
-           fv.energy};
+    for (int i = 0; i < 3; ++i)
+    {
+        result[0].four_vec.mom[i] = fv.mom[0] * vscm[i]
+                                    + fv.mom[1] * vxcm[i]
+                                    + fv.mom[2] * cm_dir[i];
+    }
+    result[0].four_vec.energy = fv.energy;
Remember that 200 bytes of overflow per thread is 200MB of memory for the expected use case.
Good to know! I thought they were pretty much equivalent. Anyway, changed as suggested.
Yeah, unfortunately C++ forces the creation of temporaries that all store local copies of the intermediate computation. The only ways to get around it are to manually unroll like this or to use a complicated template metaprogramming math library like Eigen or Blitz++ (neither of which are CUDA compatible).

if (!degenerated)
{
    Real3 vscm = make_unit_vector(cm_velocity_ - vel_parallel * cm_dir);
This should use an axpy instead of array operators.
Do you mean axpy(-vel_parallel, cm_dir, &cm_velocity_), which will change the value of cm_velocity_?
Never mind. I got it.
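For the record, one way to use axpy here without clobbering the member (a minimal sketch, assuming axpy(a, x, &y) computes y <- a * x + y as in corecel's ArrayUtils):

// Copy the boost vector, then subtract the parallel component in place
Real3 vscm = cm_velocity_;
axpy(-vel_parallel, cm_dir, &vscm);
vscm = make_unit_vector(vscm);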
// Booster vector in the center of mass frame
Real3 cm_velocity_;
// Momentum magnitude in the center of mass frame
real_type cm_p_;
// Kinetic energy in the target rest frame
real_type kin_energy_;
-    // Booster vector in the center of mass frame
-    Real3 cm_velocity_;
-    // Momentum magnitude in the center of mass frame
-    real_type cm_p_;
-    // Kinetic energy in the target rest frame
-    real_type kin_energy_;
+    // Boost vector in the center of mass frame [1/c]
+    Real3 cm_velocity_;
+    // Momentum magnitude in the center of mass frame [MeV / c]
+    real_type cm_p_;
+    // Kinetic energy in the target rest frame [MeV]
+    real_type kin_energy_;
// Degenerated if velocity perpendicular to the c.m. momentum is small
bool degenerated
    = (dot_product(cm_velocity_, cm_velocity_) - ipow<2>(vel_parallel)
       < this->epsilon());

if (!degenerated)
Are you sure this expression is right? It's comparing the difference in the square of speeds? Maybe it would be better to calculate the vscm and then check the magnitude of that... Also it appears that epsilon here needs to have units of [1/c^2]? Is that right?
I agree that they should be equivalent, i.e., norm(vscm) == sqrt(dot_product(cm_velocity_, cm_velocity_) - ipow<2>(vel_parallel)). epsilon is a small, arbitrary magic number from Geant4, used as the condition for the degeneracy with units [1/c^2], and vscm is [1/c] since c = 1 in Geant4. Okay, let's use norm(vscm) as the condition for the degeneracy, which should be equivalent in the final logic.
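A minimal sketch of that restructuring (hypothetical variable names; it assumes axpy and norm from corecel's ArrayUtils, and the tolerance returned by this->epsilon() would then carry units of [1/c] rather than [1/c^2]):

// Velocity component perpendicular to the c.m. direction
Real3 v_perp = cm_velocity_;
axpy(-vel_parallel, cm_dir, &v_perp);

// Degenerate if the perpendicular velocity magnitude is negligible
bool degenerated = (norm(v_perp) < this->epsilon());
if (!degenerated)
{
    Real3 vscm = make_unit_vector(v_perp);
    // ...
}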
@sethrj Thanks again for all the comments and suggestions! I think I've addressed them all for now.
Thanks Soon! Looks great.
Add a helper for intra-nucleus nucleon-nucleon collisions