Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding back in rvalue checks on 1d indexes #849

Merged
merged 1 commit into from
Mar 10, 2021
Merged

Conversation

bbbales2
Copy link
Member

@bbbales2 bbbales2 commented Mar 10, 2021

Release notes

This is a revert of #656 to put the fix in #521 back in to fixes #489, fixes #776, fixes #847, fixes #731.

The revert in #656 was cause there was a substantial performance hit (like 30%) of including this fix and we were in code freeze and the simplest thing was revert.

I think it would be good to put this fix back in even without an immediate fix to the performance thing. We can re-evaluate the performance thing (lots of stuff has changed between error checks and indexing) and fix that later, but at least then the segfaults are out of the way.

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)

@bbbales2
Copy link
Member Author

Looks like @rok-cesnovar has a reproducible example here: #656 (comment)

Let me start a couple cmdstan tarball builds and I'll compare:

  1. Stanc3 develop
  2. Stanc3 here
  3. 2.24rc-1
  4. 2.24

@bbbales2
Copy link
Member Author

bbbales2 commented Mar 10, 2021

Alright the timings are (for N = 10000 in the model above and using the same seed and verifying the last samples are the same):

2.24.0:     4.4s
2.24.0-rc1: 5.0s

And then:

develop:   4.8s
this pull: 5.1s

So this change looks like less of a performance hit now, though it looks like everything overall is slower but that's a different issue.

The binaries I'm using are:
2.24.0
2.24.0rc-1 with the stanc binaries here
develop, and
this pull

@bbbales2 bbbales2 requested a review from rok-cesnovar March 10, 2021 14:04
@wds15
Copy link

wds15 commented Mar 10, 2021

Less of performance hit? This PR is the slowest of all.

Can we make the checks optional with some compiler or stanc3 flag?

@bbbales2
Copy link
Member Author

Less of performance hit?

It's 0.3s slower instead of 0.6s slower, so in terms of performance hit this is less.

As far as future performance stuff goes, there are probably a few options, but those things can come in the future. This segfault behavior is not good and we can fix it now easily. Any performance optimization stuff can stack onto this in the future (as more flags, patches to Math, etc. etc).

@SteveBronder
Copy link
Contributor

I'm 50/50 on this. Why don't we wait for @rok-cesnovar 's cleanup of rvalue and assign before doing this?

@wds15 I think we should just use the -DNDEBUG macro to turn off these checks at the stan C++ level. That macro is used to turn off indexing checks in the std so I think it makes sense.

@bbbales2
Copy link
Member Author

bbbales2 commented Mar 10, 2021

Does the rvalue cleanup fix this? I was looking at the generated code and it doesn't seem like it to me (for instance here).

If it does we can wait, but otherwise better to just get this issue out of the way and then the focus can be on speeding things up and there's only one thing to worry about (rather than having performance and a segfault hanging over out heads -- just performance, which is sort of business as usual).

If there's a change needed in the rvalue refactor pull, it will presumably be like one or two lines (there's only 5 lines of new code here and the rest is changed generated code).

@rok-cesnovar
Copy link
Member

Does the rvalue cleanup fix this?

No, just cleans up thing a bit. I think the order in which we merge does not make a difference.

I agree that the baseline should be no-segfault, then worry about performance. This same check was present in stanc2. I propose merging this and working on the no-debug stuff.

@SteveBronder this would have to be a stancflag or do you propose avoiding the range check based on C++ compiler flags? I am fine with the C++ flag since its already "a thing" in other cases.

I was debating this with @t4c1 at the time of the revert and I think the conclusion was there is no performant way of doing this for std::vector.

@bbbales2
Copy link
Member Author

I propose merging this and working on the no-debug stuff.

Yeah something that comes in at the interface level could address this in the future.

Like mod = cmdstan_model("mymodel.stan", cpp_options = list(NDEBUG = TRUE)). I guess that would be handled very similarly to threads.

@SteveBronder
Copy link
Contributor

Cool cool we can add the NDEBUG thing

@rok-cesnovar
Copy link
Member

Also fixes #731

@rok-cesnovar rok-cesnovar merged commit 74f7832 into master Mar 10, 2021
@rok-cesnovar rok-cesnovar deleted the revert-revert-521 branch March 10, 2021 17:10
@ryan-richt
Copy link

Just a user comment, the no-debug flag to recover the performance is critical to our use cases as Stan users. We have profiled a ~15% slowdown due to these bounds checks in the older version (in models that otherwise are fully performance optimized). Having access to the "production" flag to turns these checks off is very important to us.

@wds15
Copy link

wds15 commented Mar 10, 2021

Maybe open an issue for that?

@ryan-richt
Copy link

@wds15 I have opened said ticket here: #853 please do edit for correctness

rok-cesnovar added a commit that referenced this pull request Mar 15, 2021
This reverts commit 74f7832, reversing
changes made to 59c92d6.

# Conflicts:
#	test/integration/good/code-gen/cl.expected
#	test/integration/good/code-gen/cpp.expected
#	test/integration/good/code-gen/standalone_functions/cpp.expected
#	test/integration/good/compiler-optimizations/cpp.expected
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants