Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subtle differences in DI between WebJobs.Script.WebHost and Microsoft.Azure.WebJobs/Microsoft.Extensions.DependencyInjection #3399

Closed
ielcoro opened this issue Sep 5, 2018 · 43 comments
Assignees

Comments

@ielcoro
Copy link

ielcoro commented Sep 5, 2018

Tryin to create a extension that worked both on Microsoft.Azure.WebJobs and Azure Functions (Script.WebHost), I found that there were differences between how the DI container behaves on Azure Functions on how it does in WebJobs (the standard behavior from Microsoft.Extensions.DependencyInjection), that makes a pain to create a extensions that works on both, or to migrate an application that worked in ASP.NET to functions:

Scenario Script.WebHost Webjobs/Microsoft.Extensions.DependencyInjection
A constructor parameter dependency is not registerd with DI Silently injects null Throws an exception explaining that the dependency could not be resolved
A scoped delegate factory is used Any dependency resolved in the factory becomes a singleton Any dependency resolved from the factory respect their registered lifetime

I could work around 1, but I couldn't do anything than use another separate DI container instance for my extension services for 2.

To illustrate the second scenario example, let's think that we want to use the well-known MediatR library, this library uses a delegate for resolving handlers, and we want those handlers to be scoped for whatever reason (compatibility, isolating the object graph between function instances…) :

delegate object ServiceFactory(Type type);

builder.Services.AddScoped<IMediator>();
builder.Services.AddScoped<ServiceFactory>(provider => (type) => provider.GetRequiredService(type));

By default Mediator register it dependencies as scoped. The factory, in the default WebJobs/Microsoft DI implementation, ends using the scope from wich it was resolver and passing it as the provider parameter to the factory lambda. However from my tests, in Azure Functions, it always uses the root service provider instance, thus making any scoped dependency from "type" effectively a singleton.

Looks like this line in DryIOC could be causing this severe difference in behavior.

I´m sure the team has a reason for deviating from the standard DI solutions that the parent WebJobs project happens to be using, however wouldn't be better and much simpler, to have a consistent behavior between Azure Functions - WebJobs and ASP.NET default implementations? That would ease the creation of extensions and reduce the changes needed to migrate existings ASP.NET projects.

@DanielSmon
Copy link

+1 on this issue. I've spent a while investigating DbContext exceptions because of this. My case is similar in that each MediatR handler instance receives the same DbContext even though the Handler instance is unique and is called from a unique service scope.

Thanks @ielcoro for identifying the root cause.

@jtemperv
Copy link

jtemperv commented Apr 8, 2019

Using the same pattern and libraries. Is this still an active issue?

@ielcoro
Copy link
Author

ielcoro commented Apr 9, 2019

@jtemperv Yes, that line hasn't changed, it's still and issue, and I think it will not be resolved until this is done at the webjobs-sdk level first: Azure/azure-webjobs-sdk#2078 and then they start adapting those changes to azure-functions

@brettsam
Copy link
Member

brettsam commented Apr 9, 2019

@ielcoro -- can you try the scoped scenario again? Scoped services weren't supported until this PR went in a few weeks ago: Azure/azure-webjobs-sdk#2133

@ielcoro
Copy link
Author

ielcoro commented Apr 9, 2019

@brettsam Thanks for looking into this.

Tried, now scoped services work BUT the issue with scoped delegate dependencies is still happening. I created a sample repo here: https://github.com/ielcoro/ScopedDIDemo

When the scoped service is a dependency of a registered factory, it is injected as a Singleton. That is happening because in this line

r => descriptor.ImplementationFactory(r.Resolve<IServiceProvider>()),
DryIOC is handing the root IServiceProvider to the factory function, instead of using the current scoped one.

The good news is that, the first behavior difference ("A constructor parameter dependency is not registerd with DI" has been fixed, and now it throws like the builtin one.

However, what worries me most, is that the longer the functions runtime keeps building on top of a different IOC library, instead of building on top of using .NET Core one (Microsoft.Extensions.DependencyInjection), more of this subtle issues could appear, specially in complex .net ecosystem libraries. Those differences can make both hard to migrate existing apps to functions and the functions framework itself costly to mantain, as customers report weird issues when using libraries like EF, Mediator, Health and many others.

@PureKrome
Copy link

Just want to +1 this comment from @ielcoro

However, what worries me most, is that the longer the functions runtime keeps building on top of a different IOC library, instead of building on top of using .NET Core one (Microsoft.Extensions.DependencyInjection), more of this subtle issues could appear, specially in complex .net ecosystem libraries.

@meixger
Copy link

meixger commented Apr 15, 2019

I'm also using MediatR and run into DbContext errors -- exactly as @ielcoro described above.

🙏Hopefully this will be addressed in the next release as mentioned in Dependency Injection support for Functions #3736 ?

For transparency, and to better set expectations, we're currently targeting Sprint 48 (due by May 1st) to complete this.

[Edit] Seeing activity in #3736 maybe @fabiocav can drop a comment here?

@espray
Copy link

espray commented Jun 8, 2019

+1 blocked on the same issue

@APIWT
Copy link

APIWT commented Jun 12, 2019

I have the same issue with no usage of a Delegate Factory. I think this issue is just with scoped services in general. I'm not exactly sure what causes them to get treated as a singleton but this is blocking us.

@espray
Copy link

espray commented Jul 10, 2019

@fabiocav @jeffhollan any update?

@fabiocav fabiocav modified the milestones: Functions Sprint 50, Triaged Jul 10, 2019
@fabiocav
Copy link
Member

@espray apologies for the delay on providing an update. We'll triage this again today (as this wasn't addressed on the sprint it was assigned to) so we can provide an update ASAP.

@APIWT
Copy link

APIWT commented Jul 10, 2019

@fabiocav Yes please and thank you very much! :) We have so much code in our project currently that just does the DI and we have been looking forward to removing it.

@jwisener
Copy link

ugh, we have just hit this issue as well. We are really trying to use the power of azure functions in our current sprint.

@espray
Copy link

espray commented Jul 10, 2019

@jwisener because of this issue, I would recommend using attribute/binding DI and NOT the AzFunc/WebJob DI. Been using attribute/binding DI for better part of a year for a multi-tenant SaaS product with out any problems.

https://github.com/espray/azure-webjobs-sdk-extensions
-OR-
https://github.com/BorisWilhelms/azure-function-dependency-injection

@jwisener
Copy link

@espray thanks for the links , I will try to do that. Currently we are attempting to use MediatR for events & handlers via domain events for transnational saving/publishing. The problem I'm encountering is when mediatR calls the eventhanlders the EF Context that is injected is not the right one, it seems like it's a new one (or maybe the singleton, I am still not sure yet). Anyway, we lose any data changes that were made in the event handler run. (in our case adds that occured to the dbcontext).

@espray
Copy link

espray commented Jul 15, 2019

@fabiocav @jeffhollan any update?

@fabiocav
Copy link
Member

@espray this is something that is high priority on our list. As soon as there's an update, I'll keep providing updates here, so you won't miss if you're following the issue. I'm hoping to have more details by the end of the week.

@PureKrome
Copy link

I'm hoping to have more details by the end of the week.

🙏 🙏 🙏 🙏 🙏 Please be that Microsoft.Extensions.DependencyInjection is replacing DryIOC and balanced is returned to the Order of Things ... 🙏 🙏 🙏 🙏 🙏

Side Thoughts: So awesome seeing people using MediatR + AzFunctions

@fabiocav
Copy link
Member

Just to provide an udpate, as promised (since I'll also be out for a couple of days).

We have a fix for this in validation right now. Because of the nature of the issue, full validation may not be completed before we cut the next release (which happens early next week), so it's possible that it wouldn't be on the next drop, but on the following one. If that happens, we'll investigate the possibility of making this available for validation on Azure with an opt-in runtime version.

@jtemperv
Copy link

jtemperv commented Aug 7, 2019

@fabiocav thanks for the update. I'll check this out asap. Enjoy your holidays!

@t-l-k
Copy link

t-l-k commented Aug 16, 2019

@fabiocav

This will get you running with the new version that contains the fixes.

Having followed your advice, starting such a project yields the following startup:

Azure Functions Core Tools (2.7.1558 Commit hash: eff964a0c658b5752ea77764111e17eefcdc3f8c)
Function Runtime Version: 2.0.12625.0

But I'm guessing I want this for the pre-release version, because as you say this branch https://github.com/Azure/azure-functions-host/tree/asynscopeccontext is not yet merged:

Function Runtime Version: 2.0.12620.0

Confused. What host process should be running this? I've func.exe installed via the Functions CLI tools. This is no good, I'm guessing.

Do I need to dotnet Microsoft.Azure.WebJobs.Script.WebHost.dll ? Is the 1.0.30-beta1 SDK meant to resolve a dependency to a patched version of this dll?

@meixger , @jtemperv are you able to clarify at all?

Thanks

update: In my ignorance, I thought maybe specifying "FUNCTIONS_EXTENSION_VERSION": "2.0.12620" in the configuration might help, but it did not.

Eventually I settled on copying the base Dockerfile at https://github.com/Azure/azure-functions-docker/blob/master/host/2.0/stretch/amd64/base.Dockerfile and customised it to build commit 7fd035c - then published my function library into it and went from there.

Attempting to run a prerelease WebHost.dll build from Visual Studio presented me with a number of challenges, primarily because it did not respect the local.settings.json file, as that is a func.exe convenience feature. Consequently it meant layering a number of environment variables which are not easily defined in VS.

@meixger
Copy link

meixger commented Aug 19, 2019

@t-l-k You are right - as of now it doesn't work anymore on my machine - evidently because the Function Runtime auto updated.

I'm was able to configure it manually and now i get this version information on startup:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
  1. Download the Artifacts (drop.zip 820MB) from the build server and extract the Azure.Functions.Cli.win-x64.2.7.1513.zip.

  2. Then follow the instructions (solution#2) in setting-azure-functions-runtime-version-for-local-dev-with-visual-studio-2017 (Stackoverflow) to point your Visual Studio debug profile to this runtime.

@t-l-k
Copy link

t-l-k commented Aug 19, 2019

@meixger thanks, this works for me now also. I hadn't realised there was a prerelease build for azure-functions-core-tools. Definitely wanted to keep using func.exe for the convenience factor.

@VDKB
Copy link

VDKB commented Aug 20, 2019

@meixger We just did the test and it works, but is has to be version Azure.Functions.Cli.win-x64.2.7.1513. Later versions don't work (tested with Azure.Functions.Cli.win-x64.2.7.1575).

@jtemperv
Copy link

@fabiocav tested and confirmed, the fix works. Looking forward to see this becoming available in consumption plans. :) thanks for the effort!

@t-l-k
Copy link

t-l-k commented Aug 21, 2019

Has anyone managed to get this working hosted? It works fine in localhost containers and the patched func.exe, but I've tried uploading a test container and Azure just isn't having.

func init MyFunctionProj --docker
func new --name MyHttpTrigger --template "HttpTrigger"

Then updating the SDK to 1.0.30-beta1 and Azure Functions v2-preview ...

  1. Built from the template Dockerfile - works fine, functions work, /admin/host/status reports fine
  2. Built from the template Dockerfile - but with rebuilt official base image (no changes) - looked fine for a while, then suddenly wasn't, functions won't enumerate, /admin/host/status still reports fine though
  3. Built from the template Dockerfile - but rebuilt official base image using source 7fd035c and same as (2), functions won't enumerate, /admin/host/status reports fine though (and is running what I want, 1.2.12620).

Is there some sort of integrity check on the images when hosted in Azure? I'm puzzled why it appeared to momentarily work in (2), but alas I may just have been fooled by a possible cached result.

I thought Docker might be the easier approach here, and allows me to keep consistent across every environment and dev/test workstations. I'll try an AppService tomorrow.

Anybody else have luck? I've a lot of code needing QA cycles in a deployed environment.

UPDATE: So it definitely does work in an AppService. When running as a container, the detailed diagnostic logging in Kudu's Log Stream (had to enable debug level logging as per .NET Core logging configuration) just reported that "0 functions were found". They were definitely there though, and the image worked locally. Probably some kind of integrity check? Couldn't see anything in azure-functions-host code (didn't look long though).

The advantage of that containerised Linux plan is the sweet 66% discount ... but alas, expensive AppServices it is.

@handeeadiguzel
Copy link

Is the deployment still planned for early September ? Looking forward to using this in production.

@t-l-k
Copy link

t-l-k commented Aug 30, 2019

Just a heads up, have been working with this patched func.exe and have observed some more DI weirdness. Weirdness by way of the wrong dependencies being injected within the scope. Like many others who have been bitten by this bug, our implementation leverages Mediatr to keep a clean separation of concerns at a command level.

As it's the end of the week, I haven't yet debugged the issue fully, but my early observation appears to indicate that a number of async style continuations are involved. This weirdness manifests in a behaviour that is dependent on a number of IO waits.

What's odd is that normally the first time the behaviour runs the wrong dependencies are resolved. Yet, on subsequent executions the right dependencies are resolved.

If I have the scope, I'll see if I can repro in a repo next week, but because of the problems we're experiencing I may have to work on something else. It's making testing very difficult.

This might be another issue TBH and of course I have to rule out our own code as the issue ;)

t-l-k added a commit to t-l-k/azure-functions-di-bug that referenced this issue Sep 10, 2019
@t-l-k
Copy link

t-l-k commented Sep 10, 2019

Okay, after spending some time on some other stuff, I've come back around to this. I've spent all day narrowing down what my problem is. I've affectionately called it "scope creep", because the DI framework doesn't appear to be honouring the scope (affection is really not what I'm feeling in regards to these DI issues).

It all comes down to this specific registration using Microsoft.Extensions.Http (2.2.0):

builder.Services.AddHttpClient<IScopeCreepService, ScopeCreepServiceClient>();

I've repro'd it at: https://github.com/t-l-k/azure-functions-di-bug.git

I've added the prerelease CLI tools Azure.Functions.Cli.min.win-x64.2.7.1513.zip into the repo and they're expanded in a Directory.Build.props Unzip build step (requires > MSBuild 15.8), the project references the contained func.exe as the debug executable. I've done this because the Artifact drop is no longer availables. perhaps @fabiocav we could have another build (retained)? Azure/azure-functions-core-tools@7c2ae30 -

You have to run it in this version, because without the 2.0.12620.0 extension, it doesn't run at all (everything is a singleton chaos).

When running, the bug manifests in the class MediatrWarmupExtension (which was a behaviour I was exploring to workaround the issue), line https://github.com/t-l-k/azure-functions-di-bug/blob/23919854b3791311ae48bcc67cc4f3f5a183386d/Extensions/MediatrWarmupExtension.cs#L40

var handler1 = scopedProvider.GetRequiredService<IRequestHandler<ScopeCreepCommand, ScopeCreepCommandResult>>();
var handler2 = scopedProvider.GetRequiredService<IRequestHandler<ScopeCreepCommand, ScopeCreepCommandResult>>();
// ^^ The bug manifests at this point ^^ handler1 and handler2 dependencies should be the same. 

The trace level output of the application displays this as part of an extension initialisation:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]
[10/09/2019 15:58:47] Building host: startup suppressed:False, configuration suppressed: False
[10/09/2019 15:58:48] [0] graphObjectA 15948253 (from 66477871 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:48] [0] graphObjectB 53858013 (from 66477871 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:48] [0] graphObjectA 8857874 (from 51797632 (DependencyGraphBravo)) 4
[10/09/2019 15:58:48] [0] graphObjectB 14008467 (from 51797632 (DependencyGraphBravo)) 4
[10/09/2019 15:58:51] [1] graphObjectA 8857874 (from 29236951 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:51] [1] graphObjectB 14008467 (from 29236951 (DependencyGraphAlpha)) 4
[10/09/2019 15:58:51] Warmed up IRequestHandler for Func.Canary.Application.ScopeCreepCommand
Hosting environment: Production
Content root path: D:\source\repos\t-l-k\Func.Canary\bin\Debug\netcoreapp2.2
Now listening on: http://0.0.0.0:7071
Application started. Press Ctrl+C to shut down.

Commenting out the AddHttpClient registration behaves thus:

Azure Functions Core Tools (2.7.1513 Commit hash: 7c2ae304b41dcfb7bcee14d9cbbf12c73f78ca0a)
Function Runtime Version: 2.0.12620.0
Can't determine project language from files. Please use one of [--csharp, --javascript, --typescript, --java, --python, --powershell]
[10/09/2019 15:59:49] Building host: startup suppressed:False, configuration suppressed: False
[10/09/2019 15:59:50] [0] graphObjectA 50788593 (from 59927501 (DependencyGraphAlpha)) 4
[10/09/2019 15:59:50] [0] graphObjectB 50517987 (from 59927501 (DependencyGraphAlpha)) 4
[10/09/2019 15:59:50] [0] graphObjectA 50788593 (from 54244775 (DependencyGraphBravo)) 4
[10/09/2019 15:59:50] [0] graphObjectB 50517987 (from 54244775 (DependencyGraphBravo)) 4
[10/09/2019 15:59:52] Warmed up IRequestHandler for Func.Canary.Application.ScopeCreepCommand
Hosting environment: Production
Content root path: D:\source\repos\t-l-k\Func.Canary\bin\Debug\netcoreapp2.2
Now listening on: http://0.0.0.0:7071
Application started. Press Ctrl+C to shut down.

Note the absence of graphObjectA and graphObjectB preceded by [1]. With the bug active, a second service DependencyGraphAlpha is constructed, in the same scope, for the 2nd transient handler resolution. The other parameter, DependencyGraphBravo, is reused correctly across both instances.

Here are all the registrations in the example:

services.AddMediatR(typeof(ScopeCreepCommandHandler));
services.AddScoped<DependencyGraphAlpha>();
services.AddScoped<DependencyGraphBravo>();
services.AddScoped<GraphObjectA>();
services.AddScoped<GraphObjectB>();
services.AddScoped<IGraphInterfaceA>(sp => sp.GetRequiredService<GraphObjectA>());

I did try experimenting with service provider sub-scopes, but doing so appeared to offend the Function's host runtime. I wish to point out, that whilst I have materialised this in an extension, the exact same behaviour was being exhibited when resolving a HttpClient reference within a durable function activity call.

My temporary workaround is to just resolve all of my HttpClient consumers in an initialisation step, for the time being, but obviously it's not ideal I don't know where else this bug may materialise.

Please fix! 😢

@fabiocav
Copy link
Member

@t-l-k I've moved your comment to a new issue to make it easier for us to iterate and discuss. It's also a bit different than what we're tracking as part of this issue, so I wanted to make sure we had a dedicated thread.

For the updates for this issue, we'll be releasing those changes with the current sprint deployment.

@PureKrome
Copy link

@t-l-k and others : issue #4914 is the new thread, for those who wish to watch that issue.

@t-l-k
Copy link

t-l-k commented Sep 12, 2019

@fabiocav sorry to bother but can you clarify "current sprint deployment"?

Can we expect to target it App Services once it appears in "Releases" using the FUNCTIONS_EXTENSION_VERSION parameter?

Thanks!

@fabiocav
Copy link
Member

@t-l-k ; the release I'm referring to would be the release for Sprint 58. That sprint comes to an end on 09/18, and at that point, we begin the validation/release activities, which usually takes about a week or so (for global deployment).

Can we expect to target it App Services once it appears in "Releases" using the FUNCTIONS_EXTENSION_VERSION parameter?

Once that release is fully deployed, existing function apps will be automatically updated, there's no need to take any of the steps for a private deployment or to make any modifications to the FUNCTIONS_EXTENSION_VERSION setting.

@VDKB
Copy link

VDKB commented Sep 25, 2019

Is it possible to give the current status of this? Just to know when and how we can go to production with our project.

Thanks

@fabiocav
Copy link
Member

The changes are going out with this release: Azure/app-service-announcements#201

This started rolling to production environments yesterday and we expect the deployment to be completed by the end of the week.

@fabiocav
Copy link
Member

Closing this as the changes to address the issues originally tracked here have been deployed. For other problems/requests in this area, please open new issues.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

15 participants