Releases: diku-dk/futhark

nightly

22 Dec 10:00
Pre-release

Commits

  • 6ef4de0: Shim get_wall_time_ns on Windows. (Troels Henriksen)
  • f4cd916: Unconditionally include pthread.h. (Troels Henriksen)

0.25.25

18 Dec 14:08

Added

  • Improvements to futhark fmt.

Fixed

  • Sizes that go out of scope due to use of higher order functions will
    now work in more cases by adding existentials. (#2193)

  • Tracing inside AD operators with the interpreter now prints values
    properly.

  • Compiled and interpreted code now treat inclusive ranges with
    start==end and a negative step size the same way, e.g. 1..0...1
    produces [1] rather than an invalid range error.

  • Inconsistent handling of types in lambda lifting (#2197).

  • Invalid primal results from vjp2 in interpreter (#2199).
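
To illustrate the range fix above, here is a minimal sketch. The function name is made up for this example, and the result is the one stated in the release note rather than independently measured.

```futhark
-- Inclusive range: first element 1, second element 0 (so the step is -1),
-- and end 1. According to the note above, both compiled code and the
-- interpreter now produce [1] here instead of an invalid range error.
def single_element_range () : []i32 = 1..0...1
```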

0.25.24

11 Nov 14:54

Added

  • futhark doc now produces better (and stable) anchor IDs.

  • futhark profile now supports multiple JSON files.

  • futhark fmt, by William Due and Therese Lyngby.

  • Lambdas can now be passed as the last argument to a function application.
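
A sketch of what the trailing-lambda syntax might look like, assuming it works as the entry above describes. Both function names are hypothetical; `tabulate` is the existing prelude function.

```futhark
-- Previously, a lambda argument had to be wrapped in parentheses.
def doubled_parens (n: i64) : [n]i64 = tabulate n (\i -> 2 * i)

-- With this release, a lambda in last-argument position may be written
-- without the surrounding parentheses.
def doubled_trailing (n: i64) : [n]i64 = tabulate n \i -> 2 * i
```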

Fixed

  • Negation of floating-point positive zero now produces a negative
    zero.

  • Necessary inlining of functions used inside AD constructs.

  • A compile time regression for programs that used higher order
    functions very aggressively.

  • Uniqueness bug related to slice simplification.

0.25.23

15 Oct 08:33

Added

  • Trailing commas are now allowed for arrays, records, and tuples in
    the textual value format and in FutharkScript.

  • Faster floating-point atomics with OpenCL backend on AMD and NVIDIA
    GPUs. This affects histogram workloads.

  • AD is now supported by the interpreter (thanks to Marcus Jensen).

Fixed

  • Some instances of invalid copy removal. (Again.)

  • An issue related to entry points with nontrivial sizes in their
    arguments, where the entry points were also used as normal functions
    elsewhere. (#2184)

0.25.22

10 Sep 08:15

Added

  • futhark script now supports an -f option.

  • futhark script now supports the builtin procedure $store.

Fixed

  • An error in tuning file validation.

  • Constant folding for loops that produce floating point results could
    result in different numerical behaviour.

  • Compiler crash in memory short circuiting (#2176).

0.25.21

01 Sep 13:24

Added

  • Logging now prints more GPU information on context initialisation.

  • GPU cache size can now be configured (tuning param: default_cache).

  • GPU shared memory can now be configured (tuning param: default_shared_memory).

  • GPU register capacity can now be configured.

  • futhark script now accepts a -b option for producing binary
    output.

Fixed

  • Type names for element types of array indexing functions in the C
    interface are now often better, although there are still cases
    where you end up with hashed names. (#2172)

  • In some cases, GPU failures would not be reported properly if a
    previous failure was pending.

  • auto output didn't work if the .fut file did not have any path
    components.

  • Improved detection of malformed tuning files.

0.25.20

15 Aug 18:51

Added

  • Better error message when in-place updates fail at runtime due to a
    shape mismatch.

Fixed

  • #[unroll] on an outer loop no longer causes unrolling of all
    loops nested inside the loop body.

  • Obscure issue related to replications of constants in complex
    intrablock kernels.

  • Interpreter no longer crashes on attributes in patterns.

  • Fixes to array indexing through C API when using GPU backends.
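
The #[unroll] fix above concerns nested loops like the following sketch. The function name and array shape are hypothetical; per the release note, only the annotated outer loop should now be unrolled.

```futhark
-- The attribute applies to the outer loop only. Before this release, the
-- inner loop would also have been unrolled as a side effect.
def sum_all (xss: [4][8]i32) : i32 =
  #[unroll]
  loop acc = 0 for i < 4 do
    loop inner = acc for j < 8 do
      inner + xss[i, j]
```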

0.25.19

26 Jul 17:11

Added

  • The compiler now does slightly less aggressive inlining. Use the
    #[inline] attribute if you want to force inlining of some
    function.

  • Arrays of opaque types now support indexing through the C API.
    Arrays of records can also be constructed. (#2082)
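
A minimal sketch of forcing inlining with the #[inline] attribute mentioned above. The function names are made up for this example.

```futhark
-- Ask the compiler to inline this function, now that the default
-- inlining heuristic is slightly less aggressive.
#[inline]
def square (x: f32) : f32 = x * x

def sum_of_squares [n] (xs: [n]f32) : f32 =
  f32.sum (map square xs)
```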

Fixed

  • The opencl backend now always passes
    -cl-fp32-correctly-rounded-divide-sqrt to the kernel compiler, in
    order to match CUDA and HIP behaviour.

0.25.18

19 Jul 09:54

Added

  • New prelude function: rep, an implicit form of replicate.

  • Improved handling of large monomorphic single-dimensional array
    literals (#2160).
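
As a sketch of how rep relates to replicate, assuming the size is taken from the surrounding type as "implicit" suggests. The function names are hypothetical.

```futhark
-- Explicit form: the size n is passed as an argument.
def zeros_explicit (n: i64) : [n]i32 = replicate n 0

-- Implicit form: rep takes only the element; the size is inferred from
-- the expected result type [n]i32.
def zeros_implicit (n: i64) : [n]i32 = rep 0
```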

Fixed

  • futhark repl no longer asks for confirmation on EOF.

  • Obscure oversight related to abstract size-lifted types (#2120).

  • Accidental exponential-time algorithm in layout optimisation for
    multicore backends (#2151).

0.25.17

12 Jun 08:49

Added

  • Faster device-to-device copies on CUDA.

  • "More correctly" detect L2 cache size for OpenCL backend on AMD GPUs.

Fixed

  • Handling of .. in import paths (again).

  • Detection of impossible loop parameter sizes (#2144).

  • Rare case where GPU histograms would use slightly too much shared
    memory and fail at run-time.

  • Rare crash in layout optimisation.