
Fixed off-by-one and other issues (affecting models, log-likelihood computation, and predictions) #48

Merged: 11 commits merged into ant-research:main on Oct 14, 2024

Conversation

yuxinc17 (Contributor)

  1. Fixed the off-by-one issue in computing the log-likelihood, caused mainly by inconsistent shifting of inter-event times (dts) across models. Details are explained in each commit message.
  2. Fixed next-event prediction of marks when multiple sampled times are accepted, and the off-by-one issue caused by inconsistent shifting of inter-event times (dts).
  3. When computing the log-likelihood, we switched the grid-based integral estimate to the trapezoid rule, because the last grid point of each interval always coincides with the left intensity at the event time. We also added the option to use unbiased Monte Carlo samples for the integral estimate. In addition, we now use the mark sequences directly to compute event_ll, which is more memory efficient and extends to higher-dimensional mark spaces (a sketch of the two integral estimators follows this description).
  4. Other changes to models:
  • We fixed data leakage and incorrect scaling of the log-likelihood in IFTPP, and allowed it to sample from its parameterized distribution.
  • We added mark-specific Softplus parameters to several models (NHP, THP, SAHP, AttNHP) for consistency with the original papers.
  • We added marked intensities for FullyNN.
  • RMTPP and NHP were refactored into simpler implementations (and for consistency with the above changes).

All of these changes were made collaboratively with @ajboyd2.
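For reference, here is a minimal sketch of the two integral estimators mentioned in point 3, assuming the total intensity has already been evaluated at a fixed number of points per inter-event interval; the function and argument names are illustrative, not the EasyTPP API.

```python
import torch

def estimate_non_event_integral(total_intensity, dts, method="trapezoid"):
    """Estimate the integral of the total intensity over each interval.

    total_intensity: (batch, seq_len, num_samples), the total intensity
    evaluated at `num_samples` points per inter-event interval.
    dts: (batch, seq_len), the interval lengths.
    """
    if method == "trapezoid":
        # Grid points span the whole interval, so boundary values get weight
        # 1/2 (trapezoid rule); the last grid point coincides with the left
        # intensity at the event time.
        num_samples = total_intensity.shape[-1]
        step = dts.unsqueeze(-1) / (num_samples - 1)
        avg = 0.5 * (total_intensity[..., :-1] + total_intensity[..., 1:])
        return (avg * step).sum(dim=-1)
    if method == "monte_carlo":
        # Sample points are assumed to be drawn uniformly inside each interval,
        # so dt * mean(intensity) is an unbiased estimate of the integral.
        return dts * total_intensity.mean(dim=-1)
    raise ValueError(f"unknown method: {method}")
```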

yuxinc17 and others added 11 commits October 13, 2024 00:27
Three changes to the log-likelihood computation:
1. Changed the computation of logL to use the mark sequences directly rather than only the one-hot type_mask, which is more memory-efficient and can be extended to a higher-dimensional mark space. This affects all models (see the sketch below this message).
2. Use the trapezoid rule when the integral is estimated with grid points on each inter-event interval, because the last grid point of each interval always coincides with the left intensity at the event time.
3. Add the option to use Monte Carlo samples for estimating the integral, so the estimate is unbiased.

The contents of `batch` and the corresponding masks are updated accordingly.

Co-Authored-By: ajboyd2 <ajboyd2@calpoly.edu>
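A minimal sketch of computing the event term from the mark sequences directly, instead of materializing a dense one-hot type mask; the tensor names and shapes are assumptions, not the EasyTPP API.

```python
import torch

def event_log_likelihood(log_mark_intensities, marks, pad_mask):
    """log_mark_intensities: (batch, seq_len, num_marks) log intensity per
    mark at each event time; marks: (batch, seq_len) observed mark indices;
    pad_mask: (batch, seq_len), 1 for real events and 0 for padding.
    """
    # Index the log intensity of the observed mark at every position; this
    # avoids building a (batch, seq_len, num_marks) one-hot mask and so
    # scales to large mark spaces.
    picked = log_mark_intensities.gather(-1, marks.unsqueeze(-1)).squeeze(-1)
    return (picked * pad_mask).sum()
```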
1. Fixed the off-by-one shifting issue (see the alignment sketch below this message).
2. Removed `max_seq_length` in `forward`.
3. Integrated `compute_states_at_sample_times`, which was intended to compute intensities, into `evolve_and_get_intentsity`.

Co-Authored-By: ajboyd2 <ajboyd2@calpoly.edu>
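As a reference for the shifting fixes, here is a minimal sketch of one consistent alignment of dts with the target marks, assuming padded (batch, seq_len) tensors of event times and marks; the helper name is illustrative, not the EasyTPP API.

```python
import torch

def align_targets(event_times: torch.Tensor, event_marks: torch.Tensor):
    """Build inputs and targets so that position i conditions only on
    events 0..i:

        dts[:, i]          = event_times[:, i + 1] - event_times[:, i]
        target_marks[:, i] = event_marks[:, i + 1]

    The state after event i is evolved over dts[:, i] and scored against
    event i + 1; shifting dts or marks inconsistently leaks the target
    into its own conditioning history.
    """
    dts = event_times[:, 1:] - event_times[:, :-1]
    history_marks = event_marks[:, :-1]
    target_marks = event_marks[:, 1:]
    return dts, history_marks, target_marks
```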
1. Fixed the off-by-one issue.
2. Removed `max_decay_time` and `num_steps` in `forward`.
3. Added mark-specific learnable beta parameters for the Softplus to be consistent with the paper (see the sketch below this message).

Co-Authored-By: ajboyd2 <ajboyd2@calpoly.edu>
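A minimal sketch of a Softplus with one learnable beta per mark; parameterizing beta through its log to keep it positive is an assumption here, and the class name is illustrative rather than the EasyTPP API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MarkSpecificSoftplus(nn.Module):
    """Softplus with a separate learnable softness parameter per mark."""

    def __init__(self, num_marks: int):
        super().__init__()
        # beta_k = exp(log_beta_k) > 0, one per mark.
        self.log_beta = nn.Parameter(torch.zeros(num_marks))

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        # scores: (..., num_marks) raw per-mark scores.
        beta = self.log_beta.exp()
        # softplus_beta(x) = (1 / beta) * log(1 + exp(beta * x)),
        # applied with a distinct beta for each mark dimension.
        return F.softplus(beta * scores) / beta
```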
1. Fixed the off-by-one issue (the shifting of dts caused leakage).
2. Added mark-specific beta for the Softplus.
3. Matched the attention mask to `compute_intensities_at_sample_times` and the updated `event_tokenizer`.

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
1. Added mark-specific beta for the Softplus.
2. Used mark sequences to compute the log-likelihood and matched the attention mask to `compute_intensities_at_sample_times` and the updated `event_tokenizer`.

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
1. Added mark-specific beta for the Softplus (to match the paper).
2. Fixed GPU compatibility.
3. Used mark sequences to compute the log-likelihood and matched the attention mask to `compute_intensities_at_sample_times` and the updated `event_tokenizer`.

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
1. Fixed the shifting issue in dts.
2. Used mark sequences to compute the log-likelihood.

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
1. Fixed the off-by-one shifting issue.
2. Used mark sequences to compute the log-likelihood.
3. The model now supports marked intensities.

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
1. Fixed three sources of data leakage:
(1) For the parameterized distribution, the running statistics of dts were computed from the current batch's dts, which effectively cheats by knowing the scale of the target dts ahead of time.
(2) The shifting of dts was misaligned with the marks.
(3) The log-likelihood of marks was computed against the wrong marks due to incorrect shifting (specifically, the marks being conditioned on).

2. Corrected the log-likelihood, which was reported on an incorrect scale: it was originally `ll.sum(dim=-1).mean()` when it should simply be `ll.sum()` (see the sketch below this message).

3. Now supports next-event prediction based on the parameterized distribution.

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
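A minimal sketch of the reporting-scale fix in item 2 and a leak-free way to fit the dt statistics from item 1(1); `ll`, the tensor shapes, and the helper names are assumptions, not the EasyTPP API.

```python
import torch

def total_log_likelihood(ll: torch.Tensor) -> torch.Tensor:
    """ll: (batch, seq_len) per-event log-likelihood terms, padding zeroed.

    Report the total log-likelihood over all events:
      incorrect scale: ll.sum(dim=-1).mean()  (per-sequence total, averaged over the batch)
      correct scale:   ll.sum()
    """
    return ll.sum()

def fit_dt_statistics(train_dts: torch.Tensor, pad_mask: torch.Tensor):
    """Compute the normalization statistics of dts once from the training
    split and freeze them, instead of re-estimating them from each
    (evaluation) batch, which would leak the scale of the target dts."""
    valid = train_dts[pad_mask.bool()]
    return valid.mean(), valid.std()
```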
Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
1. Fixed the shift-by-one issue (especially with dts) to match the other model implementations.
2. With multiple accepted sampled times, marks were originally sampled conditioned only on the expected time, i.e. from \lambda(k | t_avg), whereas they should be conditioned on *each* accepted sampled time, i.e. \lambda(k | t_j) for every t_j. We now normalize the intensity values at each t_j into a Categorical distribution, aggregate the distributions, and take the argmax as the final mark prediction. This is fixed, with an explanation in the code comments (see the sketch below this message).

Co-Authored-By: ajboyd2 <11725441+ajboyd2@users.noreply.github.com>
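A minimal sketch of the aggregation described in item 2, assuming the per-mark intensities have already been evaluated at every accepted sampled time; names and shapes are illustrative, not the EasyTPP API.

```python
import torch

def predict_mark(mark_intensities: torch.Tensor) -> torch.Tensor:
    """mark_intensities: (num_accepted_samples, num_marks), the per-mark
    intensity lambda(k | t_j) evaluated at each accepted sampled time t_j.
    """
    # Normalize the intensities at each t_j into a categorical distribution
    # p(k | t_j) = lambda(k | t_j) / sum_k' lambda(k' | t_j) ...
    per_sample_probs = mark_intensities / mark_intensities.sum(dim=-1, keepdim=True)
    # ... average the distributions over the accepted sampled times ...
    aggregated = per_sample_probs.mean(dim=0)
    # ... and take the argmax as the final mark prediction.
    return aggregated.argmax(dim=-1)
```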
@iLampard (Collaborator)

Thanks. As discussed, we will issue a tech report soon to explain the new version of easytpp.

iLampard merged commit f1583af into ant-research:main on Oct 14, 2024.