Some operations are slow due to fusion oriented definitions (replicateM) #113

mgsloan · 2016-04-22T00:02:48Z

replicateM round trips through a list, even though we immediately know the size of the result and could allocate it. It's much faster (5x for my application) to do something like this

replicateM n f = do
    mut <- MV.new n
    forM_ [0..n-1] $ \i -> f >>= MV.write mut i
    V.unsafeFreeze mut

Relevant chunks of code:

From Data.Vector:

replicateM :: Monad m => Int -> m a -> m (Vector a)
{-# INLINE replicateM #-}
replicateM = G.replicateM

From Data.Vector.Generic:

replicateM :: (Monad m, Vector v a) => Int -> m a -> m (v a)
{-# INLINE replicateM #-}
replicateM n m = unstreamM (MBundle.replicateM n m)

unstreamM s = do
                xs <- MBundle.toList s
                return $ unstream $ Bundle.unsafeFromList (MBundle.size s) xs

The text was updated successfully, but these errors were encountered:

cartazio · 2016-04-22T00:19:40Z

@mgsloan 1) what GHC version 2) what optimization levels 3) do we currently have any rewrite rules that should have accomplished this?

cartazio · 2016-04-22T00:20:01Z

either way, thanks for including the unrolling of the definitions on the ticket, that will help contextulize discussing this

mgsloan · 2016-04-22T00:32:59Z

-O2 with ghc-7.10.3. I do not see any rewrite rules that attempt to accomplish an optimization here.

I'm not sure what to do about this, we can either prioritize fusion working or prioritize normal vector performance. It would be interesting to benchmark how well some multi-step fusion thing performs with the non-fusion definition of replicateM. It's quite possible that it's only a little bit slower. In that case, I think it makes sense to adopt the non-fusion definition for this and related functions.

cartazio · 2016-04-22T01:47:23Z

since i'm on 8.0Tip atm, I'll have a go at digging into this this weekend (if you or @dolio or @hvr nudge me), since build times etc for vector are much more pleasant in 8.0 land.

I do agree with what you're saying, and that does remind me of some of the other improvements for making imperative ish code much nicer to write in vector, esp since fusion isn't a cureall (though it can be darn nice)

cartazio · 2016-07-24T18:55:52Z

@mgsloan did you have any good comparative conclusions here?

cartazio · 2016-07-24T20:41:04Z

if you have some some benchmarking code you can share comparing implementations/performance, please share, closing for now

mgsloan mentioned this issue Apr 22, 2016

performance anomaly and possible related bug mgsloan/store#6

Closed

cartazio added question performance-audit-todo labels Jul 24, 2016

cartazio closed this as completed Jul 24, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Some operations are slow due to fusion oriented definitions (replicateM) #113

Some operations are slow due to fusion oriented definitions (replicateM) #113

mgsloan commented Apr 22, 2016 •

edited

Loading

cartazio commented Apr 22, 2016

cartazio commented Apr 22, 2016

mgsloan commented Apr 22, 2016

cartazio commented Apr 22, 2016

cartazio commented Jul 24, 2016

cartazio commented Jul 24, 2016

Some operations are slow due to fusion oriented definitions (replicateM) #113

Some operations are slow due to fusion oriented definitions (replicateM) #113

Comments

mgsloan commented Apr 22, 2016 • edited Loading

cartazio commented Apr 22, 2016

cartazio commented Apr 22, 2016

mgsloan commented Apr 22, 2016

cartazio commented Apr 22, 2016

cartazio commented Jul 24, 2016

cartazio commented Jul 24, 2016

mgsloan commented Apr 22, 2016 •

edited

Loading