Execution halt in FAST.Farm tests with MinGW on Windows #784

Open
erikfurevik opened this issue Jun 30, 2021 · 15 comments

Comments

@erikfurevik

Description
Hi, I am trying to complete the r-tests on Windows. First I will describe some problems I encountered along the way and what I did to solve them (in case this is useful to you, or in case it is the cause of my current issues). Then I will describe my current issue.

Procedure

  1. Windows 10, MinGW compiler
  2. git clone --single-branch --branch dev --recursive https://github.com/OpenFAST/openfast.git
  3. From build: cmake .. -G "MinGW Makefiles" -DBUILD_TESTING -DBUILD_FASTFARM
  4. make install. So far no errors.
  5. ctest --output-on-failure. Nearly every test fails due to: "ModuleNotFoundError: No module named 'rtestlib'". I see that this library is in C:\openfast\reg_tests\lib, so I move all 5 files there to C:\openfast\reg_tests.
  6. ctest --output-on-failure. Now, nearly all fail due to: Error: file does not exist at C:/openfast/build/glue-codes/openfast/openfast. I see from C:\openfast\build\Testing\Temporary\LastTest.log that the commands being run are of the form

"C:/Python/Python395/python3.exe" "C:/openfast/reg_tests/executeOpenfastRegressionCase.py" "AWT_YFix_WSt" "C:/openfast/build/glue-codes/openfast/openfast" "C:/openfast/reg_tests/.." "C:/openfast/build/reg_tests/glue-codes/openfast" "0.00001" "Windows" "GNU"

Apparently the .exe extension is missing from the executable path (C:/openfast/build/glue-codes/openfast/openfast), since the command completes with code 0 once I add it. I believe the commands are defined in C:\openfast\build\reg_tests\CTestTestfile.cmake, so I edited this file and added .exe to all the executables (openfast, fast.farm, drivers).
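The two workarounds in steps 5 and 6 can be sketched in Python, the language of the regression scripts. This is only an illustration under assumptions: the helper names `add_rtestlib_to_path` and `with_exe_suffix` are hypothetical and are not part of the OpenFAST test scripts, and the paths are the ones reported above.

```python
import os
import sys


def add_rtestlib_to_path(reg_tests_dir):
    """Hypothetical helper: make reg_tests/lib importable instead of moving files."""
    lib_dir = os.path.join(reg_tests_dir, "lib")
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir


def with_exe_suffix(executable, system_name):
    """Hypothetical helper: append .exe on Windows if it is not already there."""
    if system_name == "Windows" and not executable.lower().endswith(".exe"):
        return executable + ".exe"
    return executable


if __name__ == "__main__":
    print(add_rtestlib_to_path("C:/openfast/reg_tests"))
    print(with_exe_suffix("C:/openfast/build/glue-codes/openfast/openfast", "Windows"))
```

Setting the PYTHONPATH environment variable to the lib directory before calling ctest should have the same effect as the first helper without touching any script, and it avoids moving files inside the repository.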

  7. This seems to fix most issues; the cmd output is now:
C:\openfast\build>ctest -j4 --output-on-failure
Test project C:/openfast/build
      Start  1: AWT_YFix_WSt
      Start  2: AWT_WSt_StartUp_HighSpShutDown
      Start  3: AWT_YFree_WSt
      Start  4: AWT_YFree_WTurb
 1/59 Test  #1: AWT_YFix_WSt .................................   Passed    8.21 sec
      Start  5: AWT_WSt_StartUpShutDown
 2/59 Test  #2: AWT_WSt_StartUp_HighSpShutDown ...............   Passed   17.43 sec
      Start  6: AOC_WSt
 3/59 Test  #3: AWT_YFree_WSt ................................   Passed   22.40 sec
      Start  7: AOC_YFree_WTurb
 4/59 Test  #6: AOC_WSt ......................................   Passed   13.52 sec
      Start  8: AOC_YFix_WSt
 5/59 Test  #5: AWT_WSt_StartUpShutDown ......................   Passed   27.80 sec
      Start  9: UAE_Dnwind_YRamp_WSt
 6/59 Test  #8: AOC_YFix_WSt .................................   Passed   26.54 sec
      Start 10: UAE_Upwind_Rigid_WRamp_PwrCurve
 7/59 Test  #4: AWT_YFree_WTurb ..............................   Passed   65.59 sec
      Start 11: WP_VSP_WTurb_PitchFail
 8/59 Test #11: WP_VSP_WTurb_PitchFail .......................   Passed   16.22 sec
      Start 12: WP_VSP_ECD
 9/59 Test  #9: UAE_Dnwind_YRamp_WSt .........................***Failed   59.72 sec
-- Using gold standard files with machine-compiler type macos-gnu
C:/openfast/build/glue-codes/openfast/openfast.exe C:/openfast/build/reg_tests/glue-codes/openfast\UAE_Dnwind_YRamp_WSt\UAE_Dnwind_YRamp_WSt.fst > C:/openfast/build/reg_tests/glue-codes/openfast\UAE_Dnwind_YRamp_WSt\UAE_Dnwind_YRamp_WSt.log
COMPLETE with code 0

      Start 13: WP_VSP_WTurb
10/59 Test #10: UAE_Upwind_Rigid_WRamp_PwrCurve ..............   Passed   43.87 sec
      Start 14: SWRT_YFree_VS_EDG01
11/59 Test #12: WP_VSP_ECD ...................................   Passed   20.34 sec
      Start 15: SWRT_YFree_VS_EDC01
12/59 Test  #7: AOC_YFree_WTurb ..............................   Passed   91.29 sec
      Start 16: SWRT_YFree_VS_WTurb
13/59 Test #13: WP_VSP_WTurb .................................   Passed   56.80 sec
      Start 17: 5MW_Land_DLL_WTurb
14/59 Test #15: SWRT_YFree_VS_EDC01 ..........................   Passed   88.48 sec
      Start 18: 5MW_OC3Mnpl_DLL_WTurb_WavesIrr
15/59 Test #17: 5MW_Land_DLL_WTurb ...........................   Passed  111.26 sec
      Start 19: 5MW_OC3Trpd_DLL_WSt_WavesReg
16/59 Test #14: SWRT_YFree_VS_EDG01 ..........................   Passed  175.89 sec
      Start 20: 5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth
17/59 Test #16: SWRT_YFree_VS_WTurb ..........................***Failed  348.02 sec
-- Using gold standard files with machine-compiler type macos-gnu
C:/openfast/build/glue-codes/openfast/openfast.exe C:/openfast/build/reg_tests/glue-codes/openfast\SWRT_YFree_VS_WTurb\SWRT_YFree_VS_WTurb.fst > C:/openfast/build/reg_tests/glue-codes/openfast\SWRT_YFree_VS_WTurb\SWRT_YFree_VS_WTurb.log
COMPLETE with code 0

      Start 21: 5MW_ITIBarge_DLL_WTurb_WavesIrr
18/59 Test #21: 5MW_ITIBarge_DLL_WTurb_WavesIrr ..............   Passed   46.32 sec
      Start 22: 5MW_TLP_DLL_WTurb_WavesIrr_WavesMulti
19/59 Test #22: 5MW_TLP_DLL_WTurb_WavesIrr_WavesMulti ........   Passed  115.95 sec
      Start 23: 5MW_OC3Spar_DLL_WTurb_WavesIrr
20/59 Test #23: 5MW_OC3Spar_DLL_WTurb_WavesIrr ...............   Passed  124.20 sec
      Start 24: 5MW_OC4Semi_WSt_WavesWN
21/59 Test #18: 5MW_OC3Mnpl_DLL_WTurb_WavesIrr ...............   Passed  752.28 sec
      Start 25: 5MW_Land_BD_DLL_WTurb
22/59 Test #24: 5MW_OC4Semi_WSt_WavesWN ......................   Passed  453.28 sec
      Start 26: 5MW_OC4Jckt_ExtPtfm
23/59 Test #26: 5MW_OC4Jckt_ExtPtfm ..........................   Passed   42.11 sec
      Start 27: HelicalWake_OLAF
24/59 Test #19: 5MW_OC3Trpd_DLL_WSt_WavesReg .................***Failed  1041.49 sec
-- Using gold standard files with machine-compiler type macos-gnu
C:/openfast/build/glue-codes/openfast/openfast.exe C:/openfast/build/reg_tests/glue-codes/openfast\5MW_OC3Trpd_DLL_WSt_WavesReg\5MW_OC3Trpd_DLL_WSt_WavesReg.fst > C:/openfast/build/reg_tests/glue-codes/openfast\5MW_OC3Trpd_DLL_WSt_WavesReg\5MW_OC3Trpd_DLL_WSt_WavesReg.log
COMPLETE with code 0

      Start 28: EllipticalWing_OLAF
25/59 Test #28: EllipticalWing_OLAF ..........................   Passed    1.13 sec
      Start 29: StC_test_OC4Semi
26/59 Test #25: 5MW_Land_BD_DLL_WTurb ........................   Passed  380.49 sec
      Start 30: IEA_LB_RWT-AeroAcoustics
27/59 Test #27: HelicalWake_OLAF .............................   Passed   85.58 sec
      Start 31: WP_Stationary_Linear
28/59 Test #31: WP_Stationary_Linear .........................   Passed    2.18 sec
      Start 32: Ideal_Beam_Fixed_Free_Linear
29/59 Test #32: Ideal_Beam_Fixed_Free_Linear .................   Passed    2.78 sec
      Start 33: Ideal_Beam_Free_Free_Linear
30/59 Test #33: Ideal_Beam_Free_Free_Linear ..................   Passed    2.47 sec
      Start 34: 5MW_Land_BD_Linear
31/59 Test #30: IEA_LB_RWT-AeroAcoustics .....................   Passed   62.70 sec
      Start 35: 5MW_OC4Semi_Linear
32/59 Test #34: 5MW_Land_BD_Linear ...........................   Passed   64.60 sec
      Start 36: TSinflow
33/59 Test #36: TSinflow .....................................***Failed    4.80 sec
-- Using gold standard files with machine-compiler type macos-gnu
C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\TSinflow\TSinflow.fstf > C:/openfast/build/reg_tests/glue-codes/fast-farm\TSinflow\TSinflow.log
COMPLETE with code 3221225725


      Start 37: LESinflow
34/59 Test #37: LESinflow ....................................***Failed    2.02 sec
-- Using gold standard files with machine-compiler type macos-gnu
C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.fstf > C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.log
COMPLETE with code 3221225725


      Start 38: ad_timeseries_shutdown
35/59 Test #38: ad_timeseries_shutdown .......................   Passed   17.38 sec
      Start 39: bd_5MW_dynamic
36/59 Test #39: bd_5MW_dynamic ...............................   Passed   47.20 sec
      Start 40: bd_5MW_dynamic_gravity_Az00
37/59 Test #40: bd_5MW_dynamic_gravity_Az00 ..................   Passed    9.72 sec
      Start 41: bd_5MW_dynamic_gravity_Az90
38/59 Test #41: bd_5MW_dynamic_gravity_Az90 ..................   Passed    9.79 sec
      Start 42: bd_curved_beam
39/59 Test #42: bd_curved_beam ...............................   Passed    0.78 sec
      Start 43: bd_isotropic_rollup
40/59 Test #43: bd_isotropic_rollup ..........................   Passed    1.14 sec
      Start 44: bd_static_cantilever_beam
41/59 Test #44: bd_static_cantilever_beam ....................   Passed    0.58 sec
      Start 45: bd_static_twisted_with_k1
42/59 Test #45: bd_static_twisted_with_k1 ....................   Passed    0.53 sec
      Start 46: hd_OC3tripod_offshore_fixedbottom_wavesirr
43/59 Test #46: hd_OC3tripod_offshore_fixedbottom_wavesirr ...   Passed    6.17 sec
      Start 47: hd_5MW_ITIBarge_DLL_WTurb_WavesIrr
44/59 Test #47: hd_5MW_ITIBarge_DLL_WTurb_WavesIrr ...........   Passed    1.94 sec
      Start 48: hd_5MW_OC3Spar_DLL_WTurb_WavesIrr
45/59 Test #48: hd_5MW_OC3Spar_DLL_WTurb_WavesIrr ............   Passed   10.05 sec
      Start 49: hd_5MW_OC4Semi_WSt_WavesWN
46/59 Test #49: hd_5MW_OC4Semi_WSt_WavesWN ...................   Passed   54.81 sec
      Start 50: hd_5MW_TLP_DLL_WTurb_WavesIrr_WavesMulti
47/59 Test #50: hd_5MW_TLP_DLL_WTurb_WavesIrr_WavesMulti .....   Passed    5.55 sec
      Start 51: hd_TaperCylinderPitchMoment
48/59 Test #51: hd_TaperCylinderPitchMoment ..................   Passed    0.44 sec
      Start 52: SD_Cable_5Joints
49/59 Test #52: SD_Cable_5Joints .............................   Passed    1.61 sec
      Start 53: SD_PendulumDamp
50/59 Test #53: SD_PendulumDamp ..............................   Passed    1.86 sec
      Start 54: SD_Rigid
51/59 Test #54: SD_Rigid .....................................   Passed    0.56 sec
      Start 55: SD_SparHanging
52/59 Test #55: SD_SparHanging ...............................   Passed    1.79 sec
      Start 57: nwtc_library_utest
53/59 Test #57: nwtc_library_utest ...........................***Failed    0.28 sec
.At line 20 of file C:\openfast\build\unit_tests\tests\nwtc-library\NWTC_Library_test_tools.F90 (unit = 6)
Fortran runtime error: Cannot open file '/dev/null': No such file or directory

Error termination. Backtrace:

Could not print backtrace: libbacktrace could not find executable to open
#0  0xffffffff
#1  0xffffffff
#2  0xffffffff
#3  0xffffffff
#4  0xffffffff
#5  0xffffffff
#6  0xffffffff
#7  0xffffffff
#8  0xffffffff
#9  0xffffffff
#10  0xffffffff
#11  0xffffffff
#12  0xffffffff
#13  0xffffffff
#14  0xffffffff
#15  0xffffffff
#16  0xffffffff
#17  0xffffffff
#18  0xffffffff
#19  0xffffffff
#20  0xffffffff
#21  0xffffffff
#22  0xffffffff

      Start 56: beamdyn_utest
54/59 Test #56: beamdyn_utest ................................   Passed    0.35 sec
      Start 59: inflowwind_utest
55/59 Test #59: inflowwind_utest .............................   Passed    0.24 sec
      Start 58: fvw_utest
56/59 Test #58: fvw_utest ....................................   Passed    0.02 sec
57/59 Test #29: StC_test_OC4Semi .............................   Passed  460.03 sec
58/59 Test #20: 5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth .......***Failed  1543.17 sec
-- Using gold standard files with machine-compiler type macos-gnu
C:/openfast/build/glue-codes/openfast/openfast.exe C:/openfast/build/reg_tests/glue-codes/openfast\5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth\5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth.fst > C:/openfast/build/reg_tests/glue-codes/openfast\5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth\5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth.log
COMPLETE with code 0

59/59 Test #35: 5MW_OC4Semi_Linear ...........................   Passed  1430.03 sec

88% tests passed, 7 tests failed out of 59

Label Time Summary:
aeroacoustics    =  62.70 sec*proc (1 test)
aerodyn          =  17.38 sec*proc (1 test)
aerodyn14        = 646.08 sec*proc (8 tests)
aerodyn15        = 5153.89 sec*proc (20 tests)
beamdyn          = 520.07 sec*proc (11 tests)
bem              =  17.38 sec*proc (1 test)
dynamic          =  66.70 sec*proc (3 tests)
elastodyn        = 5314.35 sec*proc (26 tests)
extptfm          =  42.11 sec*proc (1 test)
fastfarm         =   6.82 sec*proc (2 tests)
hydrodyn         = 6045.68 sec*proc (15 tests)
linear           = 1502.05 sec*proc (5 tests)
map              = 286.47 sec*proc (3 tests)
moordyn          = 913.31 sec*proc (2 tests)
offshore         = 4621.48 sec*proc (18 tests)
olaf             =  86.72 sec*proc (2 tests)
openfast         = 7804.15 sec*proc (35 tests)
servodyn         = 7605.20 sec*proc (28 tests)
static           =   3.03 sec*proc (4 tests)
subdyn           = 3342.75 sec*proc (7 tests)

Total Test time (real) = 2816.19 sec

The following tests FAILED:
          9 - UAE_Dnwind_YRamp_WSt (Failed)
         16 - SWRT_YFree_VS_WTurb (Failed)
         19 - 5MW_OC3Trpd_DLL_WSt_WavesReg (Failed)
         20 - 5MW_OC4Jckt_DLL_WTurb_WavesIrr_MGrowth (Failed)
         36 - TSinflow (Failed)
         37 - LESinflow (Failed)
         57 - nwtc_library_utest (Failed)
Errors while running CTest

The issue
I'm most interested in the two FAST.Farm tests. Both
"C:/openfast/build/reg_tests/glue-codes/fast-farm\TSinflow\TSinflow.log" and
"C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.log" are empty files.
Also, if I try to run a FAST.Farm test manually, I get no output beyond the initialization messages. See below.
I would be very grateful if someone has a good idea about what's wrong.

C:\openfast\install\bin>FAST.Farm.exe C:\openfast\reg_tests\r-test\glue-codes\fast-farm\LESinflow\LESinflow.fstf

 **************************************************************************************************
 FAST.Farm

 Copyright (C) 2021 National Renewable Energy Laboratory
 Copyright (C) 2021 Envision Energy USA LTD

 This program is licensed under Apache License Version 2.0 and comes with ABSOLUTELY NO WARRANTY.
 See the "LICENSE" file distributed with this software for details.
 **************************************************************************************************

 FAST.Farm-v3.0.0-49-g216d4987
 Compile Info:
  - Compiler: GCC version 8.1.0
  - Architecture: 64 bit
  - Precision: double
  - OpenMP: No
  - Date: Jun 30 2021
  - Time: 11:53:43
 Execution Info:
  - Date: 06/30/2021
  - Time: 13:54:22+0200

  Heading of the FAST.Farm input file:
    Sample FAST.Farm input file
 Running AWAE.
 Running WakeDynamics.
@rafmudaf
Collaborator

rafmudaf commented Jul 2, 2021

Hi @erikfurevik Nice job debugging the tests and explaining your process. MinGW on Windows is not a well tested or well supported configuration for OpenFAST. It should work fine, but may require some minor tweaks to the CMake configuration, as you've already seen. Also, just a heads up that the Intel compilers are now free for all systems.

Any test with an exit code of 0 is more than likely not a problem. The results comparison failed, but the test cases executed to completion. The nwtc_library_utest failed because the MinGW compiler is apparently not setting _WIN32 on a Windows system. This unit test relies on that variable to determine where to dump output. See NWTC_Library_test_tools.F90.
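For context on the '/dev/null' error: Windows has no /dev/null, and the equivalent null device is NUL. A minimal Python sketch of the platform choice the unit test has to make (the helper name `null_device` is hypothetical; the real selection happens in Fortran in NWTC_Library_test_tools.F90 and is not reproduced here):

```python
import os


def null_device(system_name):
    """Hypothetical helper: path of the null device for a given OS name."""
    return "NUL" if system_name == "Windows" else "/dev/null"


if __name__ == "__main__":
    # os.devnull is Python's own answer for the platform it is running on.
    print(null_device("Windows"), null_device("Linux"), os.devnull)
```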

The non-zero exit codes for the FAST.Farm tests are concerning. It looks like there's a segfault, given that the log file stops printing. Are you able to compile FAST.Farm in debug mode? You can do this with CMake: cmake .. -DCMAKE_BUILD_TYPE=DEBUG. Then recompile FAST.Farm and run those tests with verbose output: ctest -L fastfarm -VV.
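The exit code itself carries a hint. Windows reports crashes as unsigned 32-bit NTSTATUS values, and 3221225725 is 0xC00000FD, which is STATUS_STACK_OVERFLOW. A small sketch for decoding the "COMPLETE with code N" lines from the CTest output (the function names and the short lookup table are illustrative assumptions, not part of the test scripts):

```python
import re

# A few well-known NTSTATUS values; this table is deliberately incomplete.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_code(code):
    """Return a readable description of a process exit code."""
    if code == 0:
        return "0 (ran to completion)"
    code &= 0xFFFFFFFF  # normalise negative signed values to unsigned 32-bit
    name = NTSTATUS_NAMES.get(code, "unknown")
    return "0x{:08X} ({})".format(code, name)


def exit_codes(ctest_output):
    """Extract every exit code reported as 'COMPLETE with code N'."""
    return [int(m) for m in re.findall(r"COMPLETE with code (-?\d+)", ctest_output)]


if __name__ == "__main__":
    sample = "COMPLETE with code 0\nCOMPLETE with code 3221225725\n"
    for code in exit_codes(sample):
        print(describe_exit_code(code))
```

Tests that report code 0 ran to completion and only failed the results comparison; any other value means the executable itself stopped abnormally.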

rafmudaf self-assigned this Jul 2, 2021
rafmudaf changed the title from "Issues with testing and FAST.Farm" to "Execution halt in FAST.Farm tests with MinGW on Windows" Jul 2, 2021
@erikfurevik
Author

Thank you for the suggestions. Do you mean that using Intel compilers would be better? Regarding FAST.Farm, I deleted the build folder and recompiled with the debug option as suggested. Here are both the console output and LastTest.log. The test-specific log files are still empty.

C:\openfast\build>ctest -L fastfarm -VV
UpdateCTestConfiguration  from :C:/openfast/build/DartConfiguration.tcl
Parse Config file:C:/openfast/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :C:/openfast/build/DartConfiguration.tcl
Parse Config file:C:/openfast/build/DartConfiguration.tcl
Test project C:/openfast/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 36
    Start 36: TSinflow

36: Test command: C:\Python\Python395\python3.exe "C:/openfast/reg_tests/executeFASTFarmRegressionCase.py" "TSinflow" "C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe" "C:/openfast/reg_tests/.." "C:/openfast/build/reg_tests/glue-codes/fast-farm" "0.00001" "Windows" "GNU"
36: Test timeout computed to be: 5400
36: -- Using gold standard files with machine-compiler type macos-gnu
36: C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\TSinflow\TSinflow.fstf > C:/openfast/build/reg_tests/glue-codes/fast-farm\TSinflow\TSinflow.log
36: COMPLETE with code 3221225725
36:
1/2 Test #36: TSinflow .........................***Failed    2.00 sec
test 37
    Start 37: LESinflow

37: Test command: C:\Python\Python395\python3.exe "C:/openfast/reg_tests/executeFASTFarmRegressionCase.py" "LESinflow" "C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe" "C:/openfast/reg_tests/.." "C:/openfast/build/reg_tests/glue-codes/fast-farm" "0.00001" "Windows" "GNU"
37: Test timeout computed to be: 5400
37: -- Using gold standard files with machine-compiler type macos-gnu
37: C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.fstf > C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.log
37: COMPLETE with code 3221225725
37:
2/2 Test #37: LESinflow ........................***Failed    0.86 sec

0% tests passed, 2 tests failed out of 2

Label Time Summary:
fastfarm    =   2.86 sec*proc (2 tests)

Total Test time (real) =   2.88 sec

The following tests FAILED:
         36 - TSinflow (Failed)
         37 - LESinflow (Failed)
Errors while running CTest
Output from these tests are in: C:/openfast/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

@rafmudaf
Collaborator

rafmudaf commented Jul 6, 2021

> Do you mean that using Intel compilers would be better?

Yes, with OpenFAST the Intel compilers will be better. The build system is configured to support Intel specifically and you will get better run time performance.

As for your crash in FAST.Farm, there's nothing obvious in the CTest output. Could you run one of these cases with your FAST.Farm debug binary directly? You can use the command given in the CTest output, but remove the redirect to the log file:

C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.fstf

I'm hoping we get an idea of where it stops running.
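The same direct run can be scripted so that the exit code and the last line printed before the stop are captured together. A sketch using only the Python standard library; `run_and_report` is a hypothetical helper name, and the paths in the usage comment are the ones from the CTest output above:

```python
import subprocess


def run_and_report(command):
    """Run a command and return (exit code, last non-empty line of output)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so runtime errors are kept in order
        universal_newlines=True,
    )
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    return result.returncode, (lines[-1].strip() if lines else "")


# Example usage (paths taken from the CTest output above):
# code, last_line = run_and_report([
#     "C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe",
#     "C:/openfast/build/reg_tests/glue-codes/fast-farm/LESinflow/LESinflow.fstf",
# ])
# print("exit code:", code)
# print("last output line:", last_line)
```

Output captured through a pipe may be buffered by the program, so the last line seen this way can lag behind the real crash location; running in a console as described above remains the more reliable check.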

@erikfurevik
Author

I understand. I will probably try the Intel compilers if this can't be solved, though I would prefer this open-source configuration.

Here is what I get when running the command:

C:\openfast\build>C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\LESinflow\LESinflow.fstf

 **************************************************************************************************
 FAST.Farm

 Copyright (C) 2021 National Renewable Energy Laboratory
 Copyright (C) 2021 Envision Energy USA LTD

 This program is licensed under Apache License Version 2.0 and comes with ABSOLUTELY NO WARRANTY.
 See the "LICENSE" file distributed with this software for details.
 **************************************************************************************************

 FAST.Farm-v3.0.0-49-g216d4987
 Compile Info:
  - Compiler: GCC version 8.1.0
  - Architecture: 64 bit
  - Precision: double
  - OpenMP: No
  - Date: Jul  6 2021
  - Time: 16:55:36
 Execution Info:
  - Date: 07/07/2021
  - Time: 20:25:23+0200

  Heading of the FAST.Farm input file:
    Sample FAST.Farm input file
 Running AWAE.
 Running WakeDynamics.

C:\openfast\build>

And the other test for completeness:

C:\openfast\build>C:/openfast/build/glue-codes/fast-farm/FAST.Farm.exe C:/openfast/build/reg_tests/glue-codes/fast-farm\TSinflow\TSinflow.fstf

 **************************************************************************************************
 FAST.Farm

 Copyright (C) 2021 National Renewable Energy Laboratory
 Copyright (C) 2021 Envision Energy USA LTD

 This program is licensed under Apache License Version 2.0 and comes with ABSOLUTELY NO WARRANTY.
 See the "LICENSE" file distributed with this software for details.
 **************************************************************************************************

 FAST.Farm-v3.0.0-49-g216d4987
 Compile Info:
  - Compiler: GCC version 8.1.0
  - Architecture: 64 bit
  - Precision: double
  - OpenMP: No
  - Date: Jul  6 2021
  - Time: 16:55:36
 Execution Info:
  - Date: 07/07/2021
  - Time: 20:42:41+0200

  Heading of the FAST.Farm input file:
    Sample FAST.Farm input file
 Running AWAE.
 Running InflowWind.

    Reading a 101x35 grid (1000 m wide, 5 m to 345 m above ground) with a characteristic wind
    speed of 9.243 m/s. This full-field file was generated by TurbSim (v2.00.07a-bjj, 14-Jun-2016)
    on 24-Sep-2019 at 16:18:09.

    Processed 2000 time steps of 10-Hz full-field data (period of 200 seconds).
 Running WakeDynamics.

C:\openfast\build>

@rafmudaf
Collaborator

@bjonkman do you have any insight on this?

@bjonkman
Contributor

@rafmudaf, the first versions of FAST8 didn't run properly with MinGW, either. I built the executable with debugging symbols (-g), loaded it into Intel Inspector, and fairly quickly found the problem. Turns out there was an issue with setting parts of a type equal to some other type... MinGW stores memory differently, and so it crashed. Based on the above tests, I would not be surprised if there was something similar in FAST.Farm. You might be able to just run the code through a debugger, too.

@ValentinChb

ValentinChb commented Oct 6, 2021

@rafmudaf, @bjonkman Following up on this issue, I tried debugging with gdb to find out where the exception was occurring and then played with the type definitions to identify the culprit. It looks like a stack corruption exception thrown by __chkstk_ms() at the launch of FWrap_Init. Digging deeper, this appears to stem from the input of type FWrap_MiscVarType, more precisely from its element of type FAST_TurbineType (which takes us away from FAST.Farm to the turbine-level OpenFAST), and further down from MAP_Data. How is the MAP DLL that is used together with the distributed binaries handled with MinGW? Is it, or should it be, compiled again? One thing makes me doubt this conclusion, though: compiling with single precision shifts the exception further down, as "running FASTWrapper" is written to stdout. I did not try debugging there.

EDIT: it does not seem to be due to MAP_Data specifically; the latter only significantly increases the stack size and triggers __chkstk_ms(). With the compiler flag -fno-automatic to avoid using the stack (only for the relevant files/Fortran modules), the code runs a bit further but gets stuck sooner or later. I am running out of ideas, so I think IVF is the only option until further notice.

@ebranlard
Contributor

This has been an issue for me too. I typically compile OpenFAST tools on Windows using GNU Fortran (Rev5, Built by MSYS2 project) 10.2.0 and cmake with "MinGW Makefiles", but FAST.Farm has always failed at initialization (crashing somewhere between WakeDynamics and FASTWrapper). I have never found the source of the error, even when compiling in debug. It's one of those errors where, depending on where you put a print statement, the code will crash at different locations. I have not tried Intel Inspector. @ValentinChb, if you find the error, that would be of great help to me!

@bjonkman
Contributor

bjonkman commented Oct 6, 2021

@ValentinChb , can you run your executable in Intel's Inspector program? It is extremely difficult to find memory problems like this without a tool like Inspector. As @ebranlard said, the debugger will often fail in different places based on small differences in code, and can lead you down many different dead-end paths.

To use Inspector: Create a new project, and set the application to be the executable you built with the debugging symbols. Set the application parameters to point to your input file (include the path if the input file isn't in the working directory listed just below it in Inspector).
image
You'll also want to specify locations for it to find your source files in the "Source Search" tab. After you have set all of these parameters, press "OK", and then select "Memory Error Analysis: Locate Memory Problems".
image

I would expect it to fail and point you to the offending line in the source code. If you can post the results here, that would be very helpful.

@ValentinChb

ValentinChb commented Oct 6, 2021

@ebranlard same setup, same issue. And @bjonkman, I concur on the difficulty of finding the error using the debugger alone: since the exception is basically triggered after an accumulation of memory handling errors, you will only ever catch "the last drop". Thanks (also @rafmudaf) for updating me on Intel's and Microsoft's move toward free licensing (although free licensing and open source are two different things, and I'm still reluctant to use Visual Studio if I can avoid it. I wonder what their business model is now?).

So I gave Intel Inspector a go. I had to check "analyze stack access", otherwise no problem was reported. But I guess the results contain a lot of false positives and are hard to interpret. Here is a snapshot:
image
Among the problems found:

  • some are hardly interpretable (those in DLLs) because they are not associated with any symbol.
  • some seem "benign" to me (those in NWTC_IO), referring to READ and WRITE statements or the CALL_TIME procedure.
  • some are ALLOCATE statements, but with no obvious mistake, located among other similar statements that have not been identified as problematic.
  • two uninitialized memory access problems on calls to FWrap_Init and WD_Init, and one invalid memory access problem on the call to Farm_ReadPrimaryFile, which I find suspicious but have no further info on.

Any clue?

I tried running Intel Inspector on compiled binaries (being aware I would not get symbols), just to try. The results are hard to interpret, but there are indeed a number of (I presume) false positives. It would be good to have an IVF debug version to compare with.

EDIT: I also tried running Inspector on OpenFAST, which appears to run fine. Many of the problems found are, if not identical, very similar to what I got with FAST.Farm, so they may be either false positives or not critical... Maybe they are things that came after FAST v8, which I understood was made compatible with MinGW, Bonnie?

@bjonkman
Contributor

bjonkman commented Oct 6, 2021

Thanks, @ValentinChb . Inspector can give false positives, particularly on file I/O and on internal libraries, but it does seem like we may have several things to look at in FAST.Farm.

Did the code actually complete? Or did it end due to one of these memory issues?

I will try to run Inspector on my Intel version. What input files (model) are you using? One of the r-tests?

@ValentinChb

@bjonkman The FAST.Farm analysis did not complete; it ended at the same place as before. I'm using the TSinflow r-test.

@bjonkman
Contributor

bjonkman commented Oct 7, 2021

I fixed a few issues where FAST.Farm memory wasn't deallocated here: https://github.com/bjonkman/openfast/tree/b/FAST.Farm-memory. I am not sure that will fix any of the issues you are seeing, unless there was an issue with the OutList array allocation in gfortran (it may not like the way INTENT(OUT) arguments are combined with the ALLOCATABLE attribute). You could try it and see.

Here are the results from Inspector from my Intel Fortran build (though it aborted because I didn't have the controller DLLs set properly. I'll update the results when that finishes.) All of these errors are in the following categories:

  • INQUIRE statements to see if a file exists
  • OPEN file statements
  • READ statements... I think these two show up because parts of the code process input files by reading until there is an error
  • GET_COMMAND_ARGUMENT statement
  • EXIT statement
  • part of system DLLs or Intel code, not part of the source code we are building; several of these are related to the loading of the controller DLLs.

I don't see anything in this list that concerns me at the moment. But, it is different from the gfortran case, which we'll have to check out more closely.
image

@andrew-platt
Collaborator

Possibly related: #895

@ebranlard
Contributor

I've tried with the latest dev branch and gfortran 10.3.0; the issue is still present.
