Shuffled checked properties in bli_l3_check.c #676

Merged Oct 18, 2022 (1 commit)

fgvanzee (Member)

Details:

  • Added certain checks for matrix structure to the level-3 operations' _check() functions, and slightly reorganized existing checks.

@fgvanzee fgvanzee merged commit 23f5b8d into master Oct 18, 2022
@fgvanzee fgvanzee deleted the l3_check_shuf branch October 18, 2022 01:21
fgvanzee added a commit that referenced this pull request Nov 20, 2023
- (cherry picked from commit c803b03)

Fix auto-detection of firestorm (Apple M1).

- (cherry picked from commit 2dd692b)

Added Discord documentation (#677)

Details:
- Added a docs/Discord.md markdown document that walks the reader
  through creating a Discord account, obtaining the invite link, and
  using the link to join the BLIS Discord server.
- Updated README.md to reference the new Discord.md document in multiple
  places, including via the official Discord logo (used with explicit
  permission from representatives at Discord Inc.).
- (cherry picked from commit 88105db)

Shuffled checked properties in bli_l3_check.c. (#676)

Details:
- Added certain checks for matrix structure to the level-3 operations'
  _check() functions, and slightly reorganized existing checks.
- (cherry picked from commit 23f5b8d)

CREDITS file update.

Details:
- This attribution was intended to go in PR #647.
- (cherry picked from commit 9453e0f)

Reinstate sanity check in bli_pool_finalize. (#671)

Details:
- Added a reinit argument to bli_pool_finalize(). This bool will signal
  whether or not the function is being called from bli_pool_reinit(). If
  it is not being called from _reinit(), we can safely check to confirm
  that .top_index == 0 (i.e., all blocks have been checked in). But if
  it *is* being called from _reinit(), then that check will be skipped
  since one of the predicted use cases for bli_pool_reinit() anticipates
  that some blocks are (probably) checked out when the pool_t is
  reinitialized.
- Updated existing invocations of bli_pool_finalize() to pass in either
  FALSE (from bli_apool_free_block() or bli_pba_finalize_pools()) or
  TRUE (from bli_pool_reinit()) for the new reinit argument.
- (cherry picked from commit 76a23bd)
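
The reinit-aware check described above can be sketched roughly as follows. This is a minimal illustration, not the BLIS implementation: `toy_pool_t` and the `toy_pool_*` functions are hypothetical stand-ins for `pool_t` and its API, and the sketch returns `false` where the real code would presumably report an error or abort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for pool_t. Slots below top_index
   correspond to checked-out blocks and hold NULL. */
typedef struct
{
    void** blocks;
    size_t num_blocks;
    size_t top_index;
} toy_pool_t;

bool toy_pool_init( toy_pool_t* pool, size_t num_blocks, size_t block_size )
{
    assert( pool != NULL && num_blocks > 0 && block_size > 0 );

    pool->blocks = calloc( num_blocks, sizeof( void* ) );
    if ( pool->blocks == NULL ) return false;

    for ( size_t i = 0; i < num_blocks; ++i )
    {
        pool->blocks[ i ] = malloc( block_size );
        if ( pool->blocks[ i ] == NULL )
        {
            /* Undo the partial allocation before reporting failure. */
            while ( i > 0 ) free( pool->blocks[ --i ] );
            free( pool->blocks );
            return false;
        }
    }
    pool->num_blocks = num_blocks;
    pool->top_index  = 0;
    return true;
}

void* toy_pool_checkout( toy_pool_t* pool )
{
    if ( pool->top_index == pool->num_blocks ) return NULL;
    void* block = pool->blocks[ pool->top_index ];
    pool->blocks[ pool->top_index ] = NULL;
    pool->top_index += 1;
    return block;
}

/* Returns false (and leaves the pool untouched) if the sanity check fails. */
bool toy_pool_finalize( toy_pool_t* pool, bool reinit )
{
    assert( pool != NULL );

    /* Only insist that all blocks are checked in when this is NOT part of
       a reinitialization. */
    if ( !reinit && pool->top_index != 0 ) return false;

    /* Free the blocks the pool still holds; checked-out slots are NULL. */
    for ( size_t i = pool->top_index; i < pool->num_blocks; ++i )
        free( pool->blocks[ i ] );

    free( pool->blocks );
    pool->blocks     = NULL;
    pool->num_blocks = 0;
    pool->top_index  = 0;
    return true;
}
```

In this sketch, a caller that finalizes with `reinit == false` while a block is still checked out gets a failure, whereas `reinit == true` tears the pool down and leaves the outstanding block to its current owner.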

Fix some bugs in bli_pool.c (#670)

Details:
- Add a check for premature pool exhaustion when checking in blocks via
  bli_pool_checkin_block(). This detects "double-free" and other bad
  conditions that don't necessarily result in a segfault.
- Make sure to copy all block pointers when growing the pool size.
  Previously, checked-out block pointers (which are guaranteed to be set
  to NULL) were not being copied, leading to the presence of
  uninitialized data.
- (cherry picked from commit 63470b4)
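
Both fixes can be illustrated with a small sketch. The names below (`mini_pool_t`, `mini_pool_*`) are hypothetical and the layout is simplified; the assumption is that check-in with nothing checked out is the "bad condition" being detected, and that growing must copy the pointer array from index 0 rather than from the top index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified pool: slots below top_index are checked out
   and hold NULL. */
typedef struct
{
    void** blocks;
    size_t num_blocks;
    size_t top_index;
} mini_pool_t;

void* mini_pool_checkout( mini_pool_t* pool )
{
    assert( pool != NULL );
    if ( pool->top_index == pool->num_blocks ) return NULL;
    void* block = pool->blocks[ pool->top_index ];
    pool->blocks[ pool->top_index ] = NULL;
    pool->top_index += 1;
    return block;
}

/* Fails if no block is currently checked out, which is what a double
   check-in ("double free") looks like from the pool's point of view. */
bool mini_pool_checkin( mini_pool_t* pool, void* block )
{
    assert( pool != NULL );
    if ( pool->top_index == 0 ) return false;
    pool->top_index -= 1;
    pool->blocks[ pool->top_index ] = block;
    return true;
}

/* Grow the pointer array, copying ALL existing slots, including the NULL
   entries of checked-out blocks, so no slot is left uninitialized. */
bool mini_pool_grow( mini_pool_t* pool, size_t new_num_blocks, size_t block_size )
{
    assert( pool != NULL );
    if ( new_num_blocks <= pool->num_blocks ) return true;

    void** new_blocks = malloc( new_num_blocks * sizeof( void* ) );
    if ( new_blocks == NULL ) return false;

    /* Copy from index 0, not from top_index. */
    if ( pool->num_blocks > 0 )
        memcpy( new_blocks, pool->blocks, pool->num_blocks * sizeof( void* ) );

    for ( size_t i = pool->num_blocks; i < new_num_blocks; ++i )
    {
        new_blocks[ i ] = malloc( block_size );
        if ( new_blocks[ i ] == NULL )
        {
            /* Release only the blocks allocated in this call. */
            while ( i > pool->num_blocks ) free( new_blocks[ --i ] );
            free( new_blocks );
            return false;
        }
    }

    free( pool->blocks );
    pool->blocks     = new_blocks;
    pool->num_blocks = new_num_blocks;
    return true;
}
```

Copying only the slots at or above `top_index` would leave the lower slots of the new array holding whatever `malloc()` returned, which is the kind of uninitialized data the commit message describes.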

Add AddressSanitizer (-fsanitize=address) option. (#669)

Details:
- Added support for AddressSanitizer (ASan), a compiler-integrated
  memory error detector. The option (disabled by default) enables
  compiling and linking with the -fsanitize=address flag supported by
  clang, gcc, and probably others. This flag is employed during
  compilation of all BLIS source files *except* for optimized kernels,
  which are exempted because ASan usually requires an extra register,
  which violates the constraints for many gemm microkernels.
- Minor whitespace, comment, ordering, and configure help text updates.
- (cherry picked from commit 42d0e66)

Add consistent NaN/Inf handling in sumsqv. (#668)

Details:
- Changed sumsqv implementation as follows:
  - If there is a NaN (either real or imaginary), then return a sum of
    NaN and unit scale.
  - Else, if there is an Inf (either real or imaginary), then return a
    sum of +Inf and unit scale.
  - Otherwise behave as normal.
- (cherry picked from commit b861c71)
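
As a rough illustration of the three cases, here is a real-domain sketch. The function name `toy_sumsqv` and its signature are hypothetical, and the "normal" path is assumed to be the classic scaled accumulation in which the sum of squares equals `scale * scale * sumsq`; the actual BLIS kernel may differ in its details.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical real-domain sketch. On entry, *scale and *sumsq hold the
   running values (commonly 0.0 and 1.0); on exit they are updated so that
   scale^2 * sumsq equals the accumulated sum of squares. */
void toy_sumsqv( size_t n, const double* x, double* scale, double* sumsq )
{
    bool has_nan = false;
    bool has_inf = false;

    /* First pass: look for special values. */
    for ( size_t i = 0; i < n; ++i )
    {
        if      ( isnan( x[ i ] ) ) has_nan = true;
        else if ( isinf( x[ i ] ) ) has_inf = true;
    }

    /* NaN takes precedence over Inf. */
    if ( has_nan ) { *scale = 1.0; *sumsq = NAN;      return; }
    if ( has_inf ) { *scale = 1.0; *sumsq = INFINITY; return; }

    /* Otherwise: scaled accumulation that avoids overflow/underflow. */
    double s  = *scale;
    double ss = *sumsq;

    for ( size_t i = 0; i < n; ++i )
    {
        double a = ( x[ i ] < 0.0 ? -x[ i ] : x[ i ] );
        if ( a == 0.0 ) continue;

        if ( s < a )
        {
            /* New largest magnitude: rescale the running sum. */
            ss = 1.0 + ss * ( s / a ) * ( s / a );
            s  = a;
        }
        else
        {
            ss += ( a / s ) * ( a / s );
        }
    }

    *scale = s;
    *sumsq = ss;
}
```

For a complex vector, the same checks would apply to both the real and imaginary parts of each element, as the commit message states.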

Parameterized test/3 drivers via command line args. (#667)

Details:
- Rewrote the drivers in test/3, the Makefile, and the runme.sh script
  so that most of the important parameters, including parameter combo,
  datatype, storage combo, induced method, problem size range, dimension
  bindings, number of repeats, and alpha/beta values can be passed in
  via command line arguments. (Previously, most of these parameters were
  hard-coded into the driver source, except a few that were hard-coded
  into the Makefile.) If no argument is given for any particular option,
  it will be assigned a sane default. Either way, the values employed at
  runtime will be printed to stdout before the performance data in a
  section that is commented out with '%' characters (which MATLAB and
  Octave use for comments), unless the -q option is given, in
  which case the driver will proceed quietly and output only performance
  data. Each driver also provides extensive help via the -h option, with
  the help text tailored for the operation in question (e.g. gemm, hemm,
  herk, etc.). In this help text, the driver reminds the user which
  implementation it was linked to (e.g. blis, openblas, vendor, eigen).
  Thanks to Jeff Diamond for suggesting this CLI-based reimagining of
  the test/3 drivers.
- In the test/3 drivers: converted cpp macro string constants, as well
  as two string literals (for the opname and pc_str) used in each test
  driver, to global (or static) const char* strings, and replaced the
  use of strncpy() for storing the results of the command line argument
  parsing with pointer copies from the corresponding strings in argv.
  This works because the argv array is guaranteed by the C99 standard
  to persist throughout the life of the program. This new approach uses
  less storage and executes faster. Thanks to Minh Quan Ho for
  recommending this change.
- Renamed the IMP_STR cpp macro that gets defined on the command line,
  via the test/3/Makefile, to IMPL_STR.
- Updated runme.sh to set the problem size ranges for single-threaded
  and multithreaded execution independently from one another, as well as
  on a per-system basis.
- Added a 'quiet' variable to runme.sh that can easily toggle quiet mode
  for the test drivers' output.
- Very minor typecast fix in call to bli_getopt() in bli_utils.c.
- In bli_getopt(), changed the nextchar variable from being a local
  static variable to a field of the getopt_t state struct. (Not sure why
  it was ever declared static to begin with.)
- Other minor changes to bli_getopt() to accommodate the rewritten test
  drivers' command line parsing needs.
- (cherry picked from commit ee81efc)
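
The switch from strncpy() to pointer copies can be sketched as below. This is only an illustration of the idea: the `toy_params_t` struct, the `-d` option, and the default strings are hypothetical and are not taken from the actual test/3 drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter struct: it stores pointers only, no char buffers. */
typedef struct
{
    const char* dt_str;   /* datatype string (hypothetical option -d) */
    const char* ind_str;  /* induced-method string (option -i) */
} toy_params_t;

/* Defaults live in static storage, so pointing at them is always safe. */
static const char* const toy_default_dt  = "d";
static const char* const toy_default_ind = "native";

/* Minimal "-d <dt> -i <ind>" parser. Instead of copying each argument into
   a fixed-size buffer with strncpy(), it stores a pointer to the argv
   string, which remains valid until the program terminates. */
void toy_parse_args( int argc, char** argv, toy_params_t* params )
{
    assert( argv != NULL && params != NULL );

    params->dt_str  = toy_default_dt;
    params->ind_str = toy_default_ind;

    for ( int i = 1; i + 1 < argc; i += 2 )
    {
        if      ( strcmp( argv[ i ], "-d" ) == 0 ) params->dt_str  = argv[ i + 1 ];
        else if ( strcmp( argv[ i ], "-i" ) == 0 ) params->ind_str = argv[ i + 1 ];
    }
}
```

Besides saving the buffers and the copy, this avoids strncpy()'s truncation and missing-terminator pitfalls, at the cost of requiring that the stored pointers are never used to outlive or modify argv.
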
fgvanzee added a commit that referenced this pull request Apr 30, 2024
Allow test/3 drivers to use default ind_t method. (#804)

Details:
- Previously, the standalone performance drivers in test/3 were written
  under the assumption that the user would want to explicitly test
  either native execution *or* 1m. But because the accompanying runme.sh
  script defaults to passing "native" in for the -i command line option
  (which explicitly sets the induced method type), running the script
  without modification causes the test drivers to use slow reference
  microkernels on systems where native complex-domain microkernels are
  not registered -- which will yield poor performance for complex-domain
  level-3 operations. Furthermore, even if a user was aware of this, the
  test drivers did not support any single value for the -i option that
  would test BLIS using the library's default behavior -- that is, using
  1m on systems where it is needed and native execution on systems that
  have native microkernels implemented and registered.
- This commit addresses the aforementioned issue by supporting a new
  value for the -i option: "auto". The "auto" value causes the driver
  to avoid explicitly setting the induced method altogether, leaving
  BLIS's default behavior in place. This "auto" option is also now the
  default setting within the runme.sh script. Thanks to Leick Robinson
  for finding and reporting this issue.
- Also added support for "nat" as a shorthand for "native", which
  the help text already (erroneously) claimed was supported.
- (cherry picked from commit fd1a7e3)
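
The handling of the -i values might look roughly like the sketch below. The enum and function names are hypothetical and no BLIS calls are shown; the point is only that "auto" maps to "do not override anything", while "native" (or "nat") and "1m" request an explicit setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the -i argument. */
typedef enum
{
    TOY_IND_AUTO,    /* leave the library's default behavior in place */
    TOY_IND_NATIVE,  /* explicitly request native execution */
    TOY_IND_1M,      /* explicitly request the 1m induced method */
    TOY_IND_INVALID  /* unrecognized value */
} toy_ind_t;

toy_ind_t toy_parse_ind( const char* s )
{
    assert( s != NULL );

    if ( strcmp( s, "auto" ) == 0 ) return TOY_IND_AUTO;

    /* "nat" is accepted as a shorthand for "native". */
    if ( strcmp( s, "native" ) == 0 || strcmp( s, "nat" ) == 0 )
        return TOY_IND_NATIVE;

    if ( strcmp( s, "1m" ) == 0 ) return TOY_IND_1M;

    return TOY_IND_INVALID;
}

/* True only when the driver should explicitly set the induced method. */
bool toy_should_override( toy_ind_t ind )
{
    return ind == TOY_IND_NATIVE || ind == TOY_IND_1M;
}
```

A driver built this way would skip its induced-method setup entirely when `toy_should_override()` returns false, so the library chooses between native execution and 1m on its own.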

Use "-i auto" by default in test/3 drivers.

Details:
- Request default induced method behavior of BLIS via "-i auto" when
  running the standalone performance drivers in test/3 via the runme.sh
  script present in that directory. (Previously, the runme.sh script
  would use "-i native" by default.) This change was originally intended
  for fd1a7e3.
- (cherry picked from commit cad5149)