Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IContentLinkUrlResolver not async friendly #213

Closed
xantari opened this issue Apr 22, 2020 · 15 comments
Closed

IContentLinkUrlResolver not async friendly #213

xantari opened this issue Apr 22, 2020 · 15 comments
Assignees
Milestone

Comments

@xantari
Copy link

xantari commented Apr 22, 2020

Brief bug description

I wasn't sure if I should file this as a bug report or a feature request.

I have switched to your latest version of the Kentico.Kontent.Delivery.Caching nuget (13.0.1) and removed all the Caching client code we got from the boiler plate example now that you built in this functionality.

The problem is that we store additional items into the IMemoryCache to store the generated SEO URL's that are based off of the web pages taxonomy selection in Kentico Kontent.

So our IContentLinkUrlResolver needs the ability to lookup that information from the cache.

However, the IContentLinkUrlResolver is not async friendly and you have switched your IDeliveryCacheManager to only have async available commands where as the previous boiler plate had them as non-async.

Here is what our IContentLinkUrlResolver looks like today. Notice I already changed the constructor to get the injected IDeliveryCacheManager from your new Kentico.Kontent.Delivery.Caching nuget.

    public class CustomContentLinkUrlResolver : IContentLinkUrlResolver
    {
        private IDeliveryCacheManager _cachingClient;

        public CustomContentLinkUrlResolver(IDeliveryCacheManager cacheClient)
        {
            _cachingClient = cacheClient;
        }

        /// <summary>
        /// Resolves the link URL.
        /// </summary>
        /// <param name="link">The link.</param>
        /// <returns>A relative URL to the page where the content is displayed</returns>
        public string ResolveLinkUrl(ContentLink link)
        {
            List<ControllerMapping> mappings;
            bool result = _cachingClient.TryGet<List<ControllerMapping>>(Constants.ControllerMappingCacheKey, out mappings);

            if (mappings != null && mappings.Count() > 0)
            {
                var contentItem = mappings.FirstOrDefault(p => p.PageCodeName == link.Codename);
                return contentItem.FullUrl;
            }
            return string.Empty;
        }

        /// <summary>
        /// Resolves the broken link URL.
        /// </summary>
        /// <returns>A relative URL to the site's 404 page</returns>
        public string ResolveBrokenLinkUrl()
        {
            // Resolves URLs to unavailable content items
            return "/404";
        }
    }

However the issue is on this line:

            bool result = _cachingClient.TryGet<List<ControllerMapping>>(Constants.ControllerMappingCacheKey, out mappings);

There is no non-async version of TryGet available anymore.

And I can't switch it to TryGetAsync because IContentLinkUrlResolver interface is not defined as async friendly.

The problem is we are blocked on fixing this properly until a async ocmpatible version of IContentLinkUrlResolver is created that might have the following signature:

    public interface IContentLinkUrlResolverAsync
    {
        Task<string> ResolveBrokenLinkUrl();
        Task<string> ResolveLinkUrl(ContentLink link);
    }

Alternatively we need non-async versions of the methods in the IDeliveryCacheManager.

Repro steps

See above code.

Expected behavior

Allow for async methods in IContentLinkUrlResolver.

Alternative solution would be to provide non-async versions of the TryGet and other methods in IDeliveryCacheManager

Test environment

Kentico Nugets 13.0.1

@Enngage
Copy link
Member

Enngage commented Apr 23, 2020

Hi Xantari,

I'm gonna be the devil's advocate and advise you to use the following code to run async code synchronously - https://stackoverflow.com/a/32429753/2223843

Task.Run( () => asyncMethod()).Wait();

No deadlocks should be caused by using this. I've been using this for quite some time in my projects successfully.

Of course, the issue you raised is still valid as this is more of a workaround than a proper solution. However, if this issue is blocking you, feel free to use it in the meantime and switch to full async code once possible.

@petrsvihlik
Copy link
Contributor

Well spotted @xantari! Please follow @Enngage's instructions until this issue is resolved.

Regarding the solution, there are a few things I'd like to avoid:

  • sync wrappers for async methods and vice versa
  • using ConfigureAwait(false) on each async call as it's considered a kinda dirty practice. E.g. Contentful does that but I don't think it's right.
  • having two versions of the code - sync and async. it'd bring a lot of maintenance and little added value. In fact, I'd like to reduce the surface area of the SDK even more. For instance, simplify the IDeliveryClient interface to a bare minimum.

Using non-async/async wrappers is something that should be done on the consumer's side. So @Enngage's solution not bad for the time being.

So my proposal is to switch the IContentLinkUrlResolver to async and possibly do the same for some other parts of the SDK. What do you think about that?

@xantari
Copy link
Author

xantari commented Apr 23, 2020

@petrsvihlik sounds good to me! Thanks for taking a look at it.

@Enngage thanks for the work around!

@petrsvihlik
Copy link
Contributor

Hi @xantari ,
I ran some tests and would really appreciate your feedback here :)

Making the methods of IContentLinkUrlResolverAsync async would inevitably lead to making the whole IModelProvider async. Which I think is a good thing as it would allow us to use async overloads of AngleSharp's parsing methods, etc. Ideally, I'd like to convert all related interfaces (IPropertyValueConverter, IInlineContentItemsResolver, IInlineContentItemsProcessor, ITypelessInlineContentItemsResolver...) to async and make the library async by default.

This, however, leads to a problem in the (I)Delivery*Response models where we use Lazy<T> for initialization of the Items, LinkedItems, and Pagination properties.
This would be very difficult to replicate in async code since properties don't support async/await. The closest we can get is
https://blog.stephencleary.com/2013/01/async-oop-3-properties.html#cached-values based on AsyncLazy (+an improved version here).
So in case of this SDK, it'd look like this:

var response = await _client.GetItemAsync<Article>("<article_codename>");
var item = await response.Item; // Note the await keyword here
var title = item.Title; // -> "On Roasts"

another option would be dropping the properties entirely:

var response = await _client.GetItemAsync<Article>("<article_codename>");
var item = await response.GetItem(); // Note that it's a method now... GetLinkedItems(), GetPagination()...
var title = item.Title; // -> "On Roasts"

Argumentation

Sync hacks

I'd like to avoid using Task.Run as it may lead to thread duplication and thread pool depletion, especially in web apps. Also, because this is an SDK and we don't know how it will be used by the consumers, I'd like to avoid any ConfigureAwait()-related hanky-panky. The thread-pool hack and blocking hacks are well described here.

Lazy loading

I think it's useful to preserve the lazy loading as sometimes one doesn't need the Pagination or the LinkedItems collection and the JSON deserialization would be an unnecessary overhead that would have to be evaluated eagerly during the time of the initialization of the Delivery*Response objects.

Pure async/await vs async/await+AsyncLazy

We could also drop the AsyncLazy completely and it would look the same from the consumer's perspective but I think it's useful to keep it as it caches the value. So the properties/methods would be evaluated just once.

Properties vs methods

Some people say that properties should be evaluated quickly and any time-demanding operation within property setters and getters is an antipattern and methods should be used instead. I personally don't mind it in this case. It doesn't perform any I/O - the HttpResponse is already loaded and cached so it's just the Json deserialization/rich-text resolution, etc. that needs to happen while accessing the property.

Other resources:

@tomasjurasek @robertgregorywest @seangwright feel free to jump into the discussion.

@seangwright
Copy link

seangwright commented Jun 23, 2020

Given the optimizations in Async in C# these days, and the common use case of this library in web/mobile apps that will be able to take advantage of Async (compared to a console app), it makes sense to me to go Async all the way.

If needed, internal code paths could use something like ValueTask for perf benefits, ofc perf measurements would be ideal before perf 'improvements' were made.

With the above options in mind, here are my thoughts:

  1. Sync hacks - Nope! Don't ever do this 😊
  2. Lazy loading - This makes sense to me, so a variation of this option could be to have 2 types of objects, ones that eager load and ones that lazy load/deserialize?
  3. Pure async/await vs async/await+AsyncLazy - I would recommend either Lazy or Async but not both.
  4. Properties vs methods - Something that takes time to access, even if it is later cached, shouldn't be behind a property. Methods imply some amount of work, which I consider deserialization to be. A property implies the value is readily available, or can be calculated deterministically from other existing values, and that calculation is predictable. I like Lazy<T> for internal use, but not exposed to a consumer via properties, especially if the perf hit via 'read' can be significant.

When working with other libraries that retrieve JSON (and possibly large amounts of it), like the CosmosDb SDK, they provide standard deserialization into strongly typed POCOs as the standard API and then also provide low-level access to the JSON string so that devs can work with single JSON properties if they need to.

Maybe you can provide the standard use-case using the best approach above and leave the API exposed enough to allow devs to take alternate routes to the data if they need to (eg they can Async deserialize if they have access to the Stream/string).

@xantari
Copy link
Author

xantari commented Jun 23, 2020

Personally I like awaiting the methods approach as awaiting a property seems really strange to me.

I like the option suggestion of eager loading or lazy loading if it's possible.

@petrsvihlik
Copy link
Contributor

@seangwright @xantari thanks guys for your inputs!

here're my comments:

  1. Thought so :) I just wanted to make that absolutely clear.
  2. I'd like to avoid having 2 types of objects. Historically, there were too many ways to do things with this SDK when clearly some were better than others. This caused a lot of confusion and people were demanding best practices. I think it's better for everyone if we provide a single practice that will cover 90% of cases and offer a way to customize things (e.g. via low-level access to JSON). Alternatively, we could achieve the eagerness vs. laziness via a parameter of the GetItem(s) method but I'd like to find out if it's necessary first. So I'm interested in what would be the usecase for eager loading @xantari ?
  3. Since we want to switch to async/await, the lazy-only approach is not a viable option. We either drop the laziness completely or we make it work with async/await.
  4. Yes, I agree and understand this reasoning. I was just curious if the switch from a property to a method is acceptable for the consumers. From what @xantari is saying, it's not a big deal. From a developer's perspective - is it bearable to have two awaits in a row (to first await the response and then the item(s))?

ad low-level access to JSON) that's planned. it's already in the codebase, in fact.
ad ValueTask) definitely something to look at but it's out of the scope of this task
ad cosmosdb sdk) thanks @seangwright for pointing it out to me. I'll definitely explore that one. I've already explored many so one more won't hurt :)

@xantari
Copy link
Author

xantari commented Jun 25, 2020

I may be misunderstanding what you are lazy loading. Are you making another HTTP request to an endpoint to load more data? If that is the case the eager load option should take less HTTP calls right?

@seangwright
Copy link

Here's an example of using the CosmosDb SDK to manually parse the JSON response (to figure out what the type is of an object):

https://github.com/KyleMcMaster/payroll-processor/blob/master/api/PayrollProcessor.Data.Persistence/Features/Employees/EmployeeDetailQueryHandler.cs#L45

And here is an example of the friendlier, strongly typed data access:

https://github.com/KyleMcMaster/payroll-processor/blob/master/api/PayrollProcessor.Data.Persistence/Features/Employees/EmployeePayrollQueryHandler.cs#L33

The SDK hasn't switched to the new .NET Core JSON APIs (it still uses Newtonsoft.Json).

@robertgregorywest
Copy link

@petrsvihlik, having to await the item after you have already awaited the GetItemAsync feels weird to me. Better to lose the laziness in my view if there is no way to accommodate it nicely in async land.

I agree that the SDK should be opinionated and offer one best practice way to do things - that keeps life simple.

@seangwright
Copy link

This kind of double await is going to be more common in .NET with the new System.Text.Json APIS that support async deserialization.

You would then have 1 await to get the data stream from the remote API and then 1 await to deserialize.

This SDK could put both calls behind a single method, thus giving developers 1 method to await.

@robertgregorywest
Copy link

I'd not heard about the System.Text.Json APIs so thanks for that link. About time JSON was a first class .NET Core citizen.

Unless you are dealing with huge payloads I'm not sure there will be any benefit to going async for deserialization. The amount of work required under the hood to support async/await has a performance impact. I tested on one library and found that impact to be 25-50% increase in execution time over synchronous code. You may gain on throughput, but I think it is something that needs careful consideration and performance testing.

@petrsvihlik
Copy link
Contributor

@xantari No, we're utilizing Lazy<T> for deserialization of the JSON response for a single HTTP request:

https://github.com/Kentico/kontent-delivery-sdk-net/blob/master/Kentico.Kontent.Delivery/ContentItems/DeliveryItemListingResponse.cs#L39-L40

In other words, the data is downloaded but we wait with its deserialization until the user asks for it.

@robertgregorywest @seangwright we could put both await calls behind a single method but then we'd lose the laziness... however, if a double await is something that'll be more common in other .NET/.NET foundation libraries, then we might consider keeping it. I need to do a bit more investigation and perf testing so I won't say yes or now to double await just yet.

What's clear now is that:

  • we don't want to do the "awaited" properties.
  • we are ok with the SDK being opinionated

ad System.Text.Json) I expect that we'll switch to it at some point in the future once we're certain that it supports everything we need. From some early spikes I did recently, it seems that 99% of the functionality is already there and it could even make the current code a bit simpler.

@petrsvihlik
Copy link
Contributor

@robertgregorywest yes, we'll do the perf testing. I think a minor perf decrease due to the async overhead would be worth the increased throughput. the scalability is more important here, I'd say. however, we don't want it to be a huge perf drop for sure.

you're right about the huge payloads but what I'm also worried about is the custom resolvers (for links, richtext) that one might plug into the SDK... we don't have control over consumer's code so it may suddenly make sense to have the whole deserialization async.

It seems that System.Text.Json can offer some performance gain in heavy-load scenarios.

@seangwright
Copy link

seangwright commented Jun 26, 2020

Any perf drop from the Task allocation will be small in comparison to the perf impact of the network requests being made in the first place.

Library authors can use ValueTask, for allocation perf benefits, if the operation completes synchronously most of the time (System.Text.Json uses ValueTask for all of its awaitable / async APIs).

Async also allows for cancellation, which can be beneficial if a browser/user-agent is the initiator of a request. Navigating away from a page will send a cancel to the server, which can use cancellation tokens to stop whatever async execution is happening.

(perf / API comparisons of JSON de/serialization)

petrsvihlik added a commit that referenced this issue Jul 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants