Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make take and drop more symmetric #1183

Merged
merged 5 commits into from
Jun 8, 2023
Merged

Conversation

primo-ppcg
Copy link
Contributor

Related issue: #1178

Allow take from the end of bytes or indexed (as drop does).
Allow drop from fibers (as take does).
Return table for take from dictionary types.
Allow drop from dictionary types (as take does).
Increase efficiency for take and drop with slices.
Check indexed types before bytes.

It isn't clear to me that there is a common use case for taking and dropping from dictionary types. However, the implementation is relatively straight-forward, particularly for take, so I don't know that it should be explicitly forbidden either.

take for indexed types isn't slowed any by this change, and drop has become significantly faster. take has become slightly slower for bytes objects due to the type check reordering. I believe that indexed types are a more common case, and in particular when efficiency is a concern.

(def ind (range 200))
(repeat 10_000_000 (take 100 ind))
master     8.399327
take-drop  7.313185

(repeat 10_000_000 (drop 100 ind))
master    11.91337
take-drop  6.92769

(repeat 1_000_000 (take-until |(> $ 100) ind))
master     7.40999
take-drop  7.256637

(repeat 1_000_000 (drop-until |(> $ 100) ind))
master     7.251279
take-drop  7.15903

Allow `take` from the end of bytes or indexed (as `drop` does).
Allow `drop` from fibers (as `take` does).
Return table for `take` of dictionary types.
Allow `drop` of dictionary types.
Increase efficiency for `take` and `drop` with slices.
Check indexed types before bytes types.
@sogaiu
Copy link
Contributor

sogaiu commented Jun 8, 2023

I tried the code out. Below are some preliminary remarks.

FWIW, I did something like:

(def ind (range 200))

## take

(def take-results @[])

(for i 0 10
  (def start (os/clock)) 
  (repeat 10_000_000 (take 100 ind))
  (array/push take-results
              (- (os/clock) start)))

So I could see individual batch run results.

For take and drop I got results that looked similar to the original posting.

However, for take-until and drop-until, the results were not so clearly different. One thing I noticed in the results was that there could sometimes be a "spike" result or a result beyond which things took more time consistently, perhaps this was some throttling kicking in.

If I throw out outlier-ish things take-until looks slightly faster on the take-drop branch, but drop-until I couldn't really tell.

In any case, I presume the main point of this PR is symmetry (though of course it's nice to get a boost for some cases and/or maintain a similar level of efficiency for others).


A rather minor thing -- there was a change that got lost in one of zevv's recent PRs and it looks like the same lines are touched here so I wonder if we might make a change: 472ec73#diff-e120880268b4f0f04177470180f50ee0d2c7ac13cb83bb778b6d81efda1cbbccR4171-R4172

If you don't mind, may be we can go for one of the suggested options. No big deal though, just noticed a chance to tweak :)

@sogaiu
Copy link
Contributor

sogaiu commented Jun 8, 2023

If we're going the route of handling dictionaries [1] for drop-until, I wonder if it's worth the predicate getting access to both the key and value per encountered pair. At the moment it looks like to me like the predicate only gets access to the value.

Not sure though, just bringing it up.

If so, I guess it would make sense to do similarly for take-until: b5407ac#diff-e120880268b4f0f04177470180f50ee0d2c7ac13cb83bb778b6d81efda1cbbccR1122

If those changes were made then take-while and drop-while are also affected, but I guess that doesn't involve any code changes.


[1] I'm in favor of leaving the functionality undocumented initially :)

@primo-ppcg
Copy link
Contributor Author

primo-ppcg commented Jun 8, 2023

I appreciate the feedback!

drop became significantly slower after #1114, which is understandable because it added extra functionality. My primary goal was to win back some of the lost performance, and also not to slow take any.

drop-until and take-until are essentially the same implementation as before, just restructured slightly. My point in including them is to demonstrate that they haven't been slowed by the refactoring. On my system, both are approximately 0.1µs per call faster for indexed types, and 0.05µs slower per call for bytes, pretty consistently. This is due primarily to the re-ordering of the checks.

For profiling, I use something similar to this: timeit.janet

(defmacro timeit
  ``Similar to `loop`, but outputs performance statistics after completion.``
  [head & body]
  (with-syms [clk cnt elp run]
    ~(do
       (var ,cnt 0)
       (def ,clk (os/clock))
       (loop ,head (++ ,cnt) ,;body)
       (def ,elp (- (os/clock) ,clk))
       (def ,run (/ ,elp ,cnt))
       (cond
         (< ,run 1e-3) (printf "elapsed %fs, %.4gµs/body" ,elp (* ,run 1_000_000))
         (< ,run 1) (printf "elapsed %fs, %.4gms/body" ,elp (* ,run 1_000))
         (printf "elapsed %fs, %.4gs/body" ,elp ,run)))))

(def ind (range 200))
(timeit [:repeat 10_000_000]
  (drop 100 ind))

A rather minor thing -- there was a change that got lost in one of zevv's recent PRs and it looks like the same lines are touched here

I remembered to run my changes through format-file (unlike last time). Should I revert those lines? I see, he had meant to change the (if-not ... (do for (unless ... (and left it in a form the formatter doesn't like 😉).

Done.

As suggested by @sogaiu

@zevv forget to push this change in a recent PR (janet-lang#1175 (comment)).

Incidentally, the affected lines were already reformatted in the current PR, via fmt/format-file.
@primo-ppcg
Copy link
Contributor Author

primo-ppcg commented Jun 8, 2023

If we're going the route of handling dictionaries [1] for drop-until, I wonder if it's worth the predicate getting access to both the key and value per encountered pair. At the moment it looks like to me like the predicate only gets access to the value.

Not sure though, just bringing it up.

All other functions that accept a predicate (find, filter, etc.) treat dictionary types the same as any other, passing only the value to the predicate. For consistency, these should probably do the same.

@bakpakin bakpakin merged commit e35c6b8 into janet-lang:master Jun 8, 2023
@primo-ppcg primo-ppcg deleted the take-drop branch June 9, 2023 00:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants