Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI with github actions #216

Merged
merged 36 commits into from
Sep 20, 2021
Merged

CI with github actions #216

merged 36 commits into from
Sep 20, 2021

Conversation

nhirschey
Copy link
Contributor

@nhirschey nhirschey commented Sep 15, 2021

I'm really looking forward to using this library.

  • This pull request seems to get CI to work as discussed on issue Get CI etc working again #213. The issue was R_HOME, which I set on mac only for simplicity. If you want to build on all platforms, you'll need to update R_HOME for other platforms. See https://github.com/nhirschey/RProvider/actions/runs/1239038715 for the passing github workflow example while I was testing.
  • You'll also need to set a repository nuget key as a secret in order to auto-deploy to nuget when you change release notes (see below).
  • You may need to update the release notes format to be like what you see in FSharp.Data in order to get auto-deploy via fake to work.

image

@nhirschey
Copy link
Contributor Author

I figured out how to add back ubuntu and windows and I did fail-fast:false so that you could see macOS is completing while ubuntu and windows do not.

The test failures on ubuntu are the same that I see on my windows machine when I try to run the tests.

Example fun: https://github.com/nhirschey/RProvider/runs/3616975071?check_suite_focus=true

@dsyme
Copy link
Member

dsyme commented Sep 16, 2021

Great to see this going through on Mac!

@dsyme
Copy link
Member

dsyme commented Sep 16, 2021

@nhirschey Would you like to be co-maintainer with @AndrewIOM ?

@nhirschey
Copy link
Contributor Author

@dsyme, if @AndrewIOM would like the help I am happy to co-maintain with him. I am very appreciative of his hard work getting this library up to date. I use both F# and R most days for my research, so I benefit a lot from getting this library working again and stable.

@AndrewIOM
Copy link
Collaborator

Thanks @nhirschey , would be great to co-maintain with you. It will be very beneficial to have additional support from someone experienced on both the R and F# sides!

@AndrewIOM
Copy link
Collaborator

Thanks for your fixes! We have left one outstanding issue: building the test solution on linux and windows seems to lock up.

I've just tested out building, testing, and the full 'fake run -t All' in an Ubuntu dev container with .net 5.0 and R 3.6.3 (installed at /usr/lib/R) on Ubuntu 20.04.3 LTS. All runs fine, all tests pass, makes nugets and docs fine. Need to investigate further if this is an R version issue or Github Actions issue.

@nhirschey
Copy link
Contributor Author

nhirschey commented Sep 16, 2021

I am on windows, and the CI behavior is the same as when I run locally--I cannot use the provider or run the tests.

When I try to load RProvider.dll it locks up fsharp interactive, so it seems to be the issue. Here's a screen-shot of the intellisense error ... I tried a bit of digging but haven't been able to track down the explanation for these error messages.

image

It actually looks like this issue: dotnet/fsharp#3303

@dsyme
Copy link
Member

dsyme commented Sep 17, 2021

Can DesignTime and Runtme be netstandard2.0? That would be simplest if it works

https://github.com/fslaborg/RProvider/blob/master/src/RProvider.DesignTime/RProvider.DesignTime.fsproj#L5

@dsyme
Copy link
Member

dsyme commented Sep 17, 2021

On Windows I installed R locally, built and got a hang when building the Tests. Attached a debugger and it's stuck at this point. Note that so far I've not explicitly installed any R packages on my machine

image

@dsyme
Copy link
Member

dsyme commented Sep 17, 2021

@nhirschey #222 might help with the issues in the IDE

@dsyme
Copy link
Member

dsyme commented Sep 17, 2021

I merged #222

@nhirschey I enabled logging and got this log https://gist.github.com/dsyme/5b31f597a035fe66b371e0ca9a30ecd9

Looks like R is closing the pipe and hence the hang

Looks like you might first need to do better pipe-closure detection, recovery and error reporting so this is visible and not a hang. Making the log much more visible e.g. in error reports sounds good.

[17/09/2021 15:38:50] [Pid:13664, Tid:1, Apid:1] Communicating with R to get packages
[17/09/2021 15:38:50] [Pid:13664, Tid:1, Apid:1] eval(1+4)
[17/09/2021 15:38:50] [Pid:13664, Tid:1, Apid:1] engine: Creating and initializing instance (sizeof<IntPtr>=8)
[17/09/2021 15:38:52] [Pid:23356, Tid:4, Apid:1] [Client Pipe log]: Pipe has closed.

@nhirschey
Copy link
Contributor Author

@AndrewIOM,

  1. I tried with R 3.6.2 and the result for me building locally is the same as what I describe below with R-4.0.2.

  2. I used CI to generate gists of the logs for mac and windows. Do these differences hint at something that could cause the windows tests to fail?

One thing from your big pull request caught my eye:

Remove Remoting between server and client processes, and replace with an implementation based on System.IO.Pipes.

I think RProvider.Server.exe is still getting launched on windows (see "[Server Pipe log]" messages). Is this expected in the new pipe architecture?

Note: windows was local with R-4.0.2, mac was CI R-4.1.1.

  • Processes (PIDs)
    • Mac uses 1
    • Windows uses 2, a client and server
  • Server logs
    • Mac only has logs from the client
    • Windows shows logs from the client and server and shows the EventLoop taking place.
  • What gets processed
    • Mac goes through base packages and datasets, but it does not have client messages about the Test.RProvider\data/sample.rdata file.
    • Windows does packages but then starts loading the 'Test.RProvider\data/sample.rdata' (3 times) with client and server messages about the volcano data inside.

As an example of differences, compare windows starting line 15:

[9/18/2021 9:22:31 PM] [Pid:12656, Tid:4, Apid:1] [Client Pipe log]: Connecting to named pipe
[9/18/2021 9:22:31 PM] [Pid:30372, Tid:1, Apid:1] Server started, running event loop
[9/18/2021 9:22:31 PM] [Pid:30372, Tid:1, Apid:1] server event loop: starting
[9/18/2021 9:22:31 PM] [Pid:30372, Tid:6, Apid:1] [Server Pipe log]: Connected to client.

To mac line 12

[9/18/2021 11:26:04 PM] [Pid:7701, Tid:6, Apid:1] [Client Pipe log]: Connecting to named pipe
[9/18/2021 11:26:04 PM] [Pid:7701, Tid:4, Apid:1] [Client Pipe log]: Connected.
[9/18/2021 11:26:04 PM] [Pid:7701, Tid:6, Apid:1] Got some server
[9/18/2021 11:26:04 PM] [Pid:7701, Tid:6, Apid:1] Sending command: get init error message...

@AndrewIOM
Copy link
Collaborator

AndrewIOM commented Sep 19, 2021

Thanks @nhirschey. I'm back from travelling now so have access to a Windows 10 VM and can finally start looking into the windows issues in detail today.

Just got started and can reproduce the lock-up locally on windows:

[19/09/2021 11:11:13] [Pid:7032, Tid:1, Apid:1] Adding member pi [19/09/2021 11:11:13] [Pid:7032, Tid:1, Apid:1] Adding member volcano [19/09/2021 11:11:13] [Pid:7032, Tid:1, Apid:1] Adding member volcanoList [19/09/2021 11:11:13] [Pid:7032, Tid:1, Apid:1] Adding member volcanoMean [19/09/2021 11:11:13] [Pid:7032, Tid:1, Apid:1] Finished generating types for C:\Users\acm\Documents\GitHub\RProvider\tests\Test.RProvider\data/sample.rdata [19/09/2021 11:11:15] [Pid:6036, Tid:9, Apid:1] [Server Pipe log]: Pipe has closed.

Will investigate this (look at error handling / pipe closure etc.), and also the logging issue.

I think the client and server are both running, but that a recent regression has stopped logs from spawned server processes being added to the text file on macOS for some reason (could be related to passing of environment variables?). These logs were there a few commits ago, and now on my local mac copy of this PR I only see logs from the client. Will investigate!

Edit: so it looks like the issue only occurs on windows during fake test. I have been able to run all tests using dotnet test, and also use R from dotnet fsi. In image, left is Ubuntu with fake passing; right is windows with dotnet test and dotnet fsi working (R v4.0.2, dotnet latest v5 SDK, Windows 10 1704). It may be that it builds and runs fine when the build is triggered for a second time.
Screenshot 2021-09-19 at 11 55 39
.

@AndrewIOM
Copy link
Collaborator

AndrewIOM commented Sep 19, 2021

I think I have identified the problem - the Stop command is not being sent on windows, as it never detects that the process with the given PID has ended (this is when the process is no longer present in task manager). The problematic block of code is:

/// Process.WaitForExit does not seem to be working reliably
/// on Mono, so instead we loop asynchronously until the process is gone
let rec asyncWaitForExit pid = async {
  let parentProcess = try Process.GetProcessById(pid) |> Some with _ -> None
  match parentProcess with
  | Some _ ->
    do! Async.Sleep(1000)
    return! asyncWaitForExit pid
  | None -> () }

Will work on a fix.
Edit: Fix applied in below commit stops windows from locking up locally, and allows dotnet test to run correctly. However, this has not fixed the problems with CI

Use a version of PipeMethodCalls that doesn't pack on build
Stops infinite hangs when the process has exited
Try to see if options passed by FAKE cause test instability
Change build order and enable more verbose logging for tests
@AndrewIOM
Copy link
Collaborator

Windows and macOS are now building and testing OK!

  • xUnit 2 turned on parallelism by default, which needed to be disabled.
  • Downgraded to R 4.0.2 so that Windows builds

Only linux to go - this is proving tricky, as it works locally for me. Going to try R 4.0.2 locally to see if that makes a difference.

@dsyme
Copy link
Member

dsyme commented Sep 20, 2021

Great progress!

Remove the test. Process.HasExisted doesn't appear to be the issue in Github Actions...
Testing Github Actions stalling on Ubuntu 20.04
Not that either...
@nhirschey
Copy link
Contributor Author

Fantastic! I was able to build and pass all tests on windows 10 using dotnet fake build -t All. Additionally, today is the first time that I have seen intellisense in Visual Studio 2019 working without errors on the R types/namespaces (I can browse packages, installed datasets, etc. using R. or open RProvider.).

@AndrewIOM maybe save this for a separate issue, but when I plot something the plot window is locked up by the REngine. I cannot resize the window and if I try to close the plot window it crashes the F# process (see gif).

rprovider-plot

@AndrewIOM
Copy link
Collaborator

I have been debugging the reasons for the Ubuntu fail on CI but not on a local machine. I have identified that it is actually REngine.GetInstance() in RInit.fs that is hanging in the Ubuntu CI, so this could be an RDotNet issue. I tried moving the engine initialisation out to a subsequent call to engine.Initialize() - this then is the step that locks up. No exceptions seem to be thrown. Not 100% sure how to proceed with this, as it works for me on the same Ubuntu version and same R version vscode dev-env. Downgrading the CI to R 3.6.3 doesn't help.

Really exciting to see it all working on Windows. Thanks for testing! For the years I have used RProvider (on mac), the graphics have beachballed and been locked up when using quartz, but not when using X11. I will open an issue - definately worth a fix for both windows and mac.

I propose that we remove Ubuntu for now and merge in this PR to get Windows working on CI. We could then release an RProvider v2.0.0-beta to allow wider testing of the .net 5 version. We could start a new PR to resolve the outstanding Ubuntu CI issues. What do you think?

Screenshot 2021-09-20 at 22 42 53

!

@dsyme
Copy link
Member

dsyme commented Sep 20, 2021

I propose that we remove Ubuntu for now and merge in this PR to get Windows working on CI.

Sounds good!

@AndrewIOM AndrewIOM merged commit c7804dd into fslaborg:master Sep 20, 2021
@nhirschey nhirschey mentioned this pull request Sep 29, 2021
9 tasks
@nhirschey nhirschey deleted the nh-si branch October 3, 2021 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants