Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Hessian Kernels for PP and SS #688

Open
wjlei1990 opened this issue Jun 10, 2020 · 14 comments
Open

New Hessian Kernels for PP and SS #688

wjlei1990 opened this issue Jun 10, 2020 · 14 comments

Comments

@wjlei1990
Copy link
Contributor

wjlei1990 commented Jun 10, 2020

Hi all,

SPECFEM3D GLOBE currently calculate Hessian kernels based on density. We would like to add two more kernels, to proper precondition Vp and Vs kernels separately. This requires SPECFEM to calculate two extra Hessian kernels (in both fortran and CUDA) and write them out (in both binary and adios format).

Second, the current Hessian kernel is based on source-receiver wavefield. Congyue would like to use the source-source wavfield, to get Hessian.

I would like to use this thread to track our working progress.

Congyue's previous code change to calculate pp hessian kernels: icui@5329763


update 2020.06.17

I put some code update here so people can help to check the formula.
https://github.com/wjlei1990/specfem3d_globe/pull/2/files

@qcliu0
Copy link

qcliu0 commented Jun 10, 2020

Formulations for Hessian Kernels: H_pp, H_ss. We can do different parametrizations similarly.
3D Hess_v2.pdf

@wjlei1990
Copy link
Contributor Author

wjlei1990 commented Jun 11, 2020

Hi @danielpeter,

I think I will start with CPU first since I think it is much easier.

I am thinking about what is the most best way to store the extra hessian kernels. The current rho hessian kernels are stored in variable hess_kl_crust_mantle.

The current dimension of hess_kl_crust_mantle is dimension of 4:

allocate(hess_kl_crust_mantle(NGLLX,NGLLY,NGLLZ,NSPEC_CRUST_MANTLE_ADJOINT),stat=ier)

Should we expand the current hessian array to (:, :, :, :, 3) or creating new variables to store? I am thinking about in the future, we may have more hessian kernels for different parametrizations.

Also,considering the CPU and GPU code similarity.

@danielpeter
Copy link
Member

i would stay with single arrays using (NGLLX,NGLLY,NGLLZ,NSPEC) only but using some naming convention to differentiate, like hess_vp_kl.., hess_vs_kl.., and so on.

that way, we have minimal impact on existing workflow and visualization tools like combine_vol_data**, etc.

@wjlei1990
Copy link
Contributor Author

Hi @danielpeter,

To calculate the hessian kernel for kappa and mu, I need some extra terms such as space derivatives of adjoint velocity field:

  b_xix_crust_mantle, b_xiy_crust_mantle, b_xiz_crust_mantle,
  b_etax_crust_mantle, b_etay_crust_mantle, b_etaz_crust_mantle
  b_gammax_crust_mantle, b_gammay_crust_mantle, b_gammaz_crust_mantle

Those terms are needed for computing source and reciver hessian kernels. Current specfem3d globe seems only computed the spatial gradients of forward velocity, but not for adjoint velocity.

Could you confirm my point?

@danielpeter
Copy link
Member

these derivatives depend on the mapping, i.e. are mesh dependent only, but do not depend on the wavefields. so you can use xix_crust_mantle, etc. for calculating the gradient of both forward and adjoint wavefields.

@wjlei1990
Copy link
Contributor Author

these derivatives depend on the mapping, i.e. are mesh dependent only, but do not depend on the wavefields. so you can use xix_crust_mantle, etc. for calculating the gradient of both forward and adjoint wavefields.

Got you. So you mean they are just arrays got allocated and we are free to use them as we want during the calculation.

@danielpeter
Copy link
Member

yes, these arrays are read in the beginning by the solver and available during all calculations.

if you prefer, for speed performance we use Deville products by default (USE_DEVILLE_PRODUCTS_VAL) and in those cases, have these shape function derivatives merged into the array deriv_mapping_crust_mantle(x,NGLLX,NGLLY,NGLLZ,NSPEC) (in prepare_optimized_arrays.F90 line 482). this makes reading these values a bit more efficient. that's fine tuning however.

@wjlei1990
Copy link
Contributor Author

wjlei1990 commented Jun 23, 2020

Hi @danielpeter,

Could you help to take a look at this code change?
https://github.com/wjlei1990/specfem3d_globe/pull/2/files

Please let me know if there is any alarming issues you spot. I think I should get most of the calculation code done and the code compiles with no errors.

The only thing bugs me a little bit is the conversion between ijk and i, j, k. For example, here and here. The thing is when computing the spatical derivatives of the velocity field, I need to get i, j, k separately so I can access hprime_xx. So I write some explicit code to convert between ijk and i, j, k. It makes the code less clean.

Also, since I write a function to be called so I define a interface here. Not sure if it follows the coding style of specfem.

@wjlei1990
Copy link
Contributor Author

wjlei1990 commented Jun 29, 2020

Hi @danielpeter, may I ask for your help when porting the code to GPU? I have a few questions to you if you have time to answer them.

Q1: When can I find the definiation to the Mesh_pointer? I need to have pass in a few arguments, such as the hess_rho_kl_crust_mantle, hess_kappa_kl_crust_mantle, hess_mu_kl_crust_mantle, ans also the veloc_crust_mantle and b_veloc_crust_mantle. I suppose those variables are accessible in the Mesh_pointer.

Q2: I need to caluclate the gradient of the veloc_crust_mantle and b_veloc_crust_mantle. Do you have some existing code than I may use as an example. I guess the ideal case would be have a kernel function to calculate the gradients, just like the CPU version, to avoid code duplication. Not sure if it is possible to do in that way.

@wjlei1990
Copy link
Contributor Author

Hi @danielpeter, I think I get most of what I want. There is one thing confuse me a little bit.

In the compute_hess_kernel.rb:

    if (get_lang == CUDA and ref) then
      get_output.print File::read("references/#{function_name}.cu")
    elsif(get_lang == CL or get_lang == CUDA) then
      ...

I see it actually support reading CUDA code from reference directory and I suppose the reference directory is hand-written CUDA code. The default value of ref=True but I checked the kernels.gen/compute_hess_kernel.cu and found ref is set to False.

For computing kernels of hessian kernels of rho, kappa and mu, can I write the reference code?

@danielpeter
Copy link
Member

hi Wenjie, these reference CUDA routines were used when we ported the code from CUDA to BOAST routines. these original CUDA routines became the "reference" when we compared then the different versions for testing and validation. so there is still this option to either use the reference hand-written or the generated CUDA routines. however, most of the reference routines haven't been adapted to the newer code changes, so i would guess that this won't work anymore.

thus, just focus on writing BOAST kernels to then generate CUDA and OpenCL kernels. you can ignore these references/**.cu kernels and won't need to write such an additional one.

@danielpeter
Copy link
Member

for the questions above, i don't have much to add other than try and look up how it was done for the d_hess_kl_crust_mantle. and for the gradient, there is a kernel compute_gradient_kernel which shows how to compute a gradient from a scalar field (acoustic). this together with a corresponding CPU version should give an idea how to compute it for a vector field.

@wjlei1990
Copy link
Contributor Author

wjlei1990 commented Jul 14, 2020

for the questions above, i don't have much to add other than try and look up how it was done for the d_hess_kl_crust_mantle. and for the gradient, there is a kernel compute_gradient_kernel which shows how to compute a gradient from a scalar field (acoustic). this together with a corresponding CPU version should give an idea how to compute it for a vector field.

Hi @danielpeter , thanks for the help. I am I am pretty much done with the BOAST part. You may take a look at the code change here, though there are lots of code changes...

The only thing I am not sure is the opencl part when computing those extra hessian kernels here. I just mimick what you did at other places.

@danielpeter
Copy link
Member

hi Wenjie, the changes look good. regarding the openCL calls, we will just test it and correct if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants