Adding Search document operations #10568

tg-msft · 2020-03-12T15:21:48Z

This adds Search, Suggest, Autocomplete, GetDocument, and IndexDocuments. It also adds custom user schema serialization/deserialization.

If you're in a hurry, please prioritize these:

ApiView / ApiView diff
README

Apologies for the massive PR. I've sliced it up to be easier to review commit-by-commit:

Implementation
~~Generated code~~
Readme/samples
Tests (mostly skim infrastructure)
~~Recordings/swagger~~

Please specifically call out anything you consider a release blocker. I'll be filing tracking bugs for all my TODO: XXXXX comments and updating the PR with bug numbers later today.

pakrym · 2020-03-12T17:19:08Z

sdk/core/Azure.Core/tests/TestFramework/RecordedTestBase.cs

+        /// and debug/re-run at the point of failure without re-running
+        /// potentially lengthy live tests.  This should never be checked in.
+        /// </summary>
+        public bool DEBUG_ONLY_SaveRecordingsOnFailure { get; set; } = false;


Should we instead flow them into the TestLogger?

That wouldn't save them though, right? I want them saved so I can do a dotnet msbuild /t:UpdateSessionRecords and keep re-running the test while I figure out why the damn thing won't parse.

A very common scenario with this PR was letting a test run for a minute plus and return successfully from the service only to fail on the final parse in some uniquely odd way. Being able to replay them even if the test didn't pass was super useful. If we really don't want to check this in, I can just add it back manually whenever I need it.

What about #if DEBUG around this property to make it truly debug only?

That's a good suggestion. I'll make the name slightly less visually gruesome then since the CI will catch this for us if we slip.
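
As a rough illustration of the `#if DEBUG` suggestion, here is a minimal sketch. The class and member names are hypothetical, not the ones that landed in `RecordedTestBase`, and it assumes CI compiles the Release configuration so that any leftover use of the flag fails to build.

```csharp
public class RecordedTestSketch
{
#if DEBUG
    /// <summary>
    /// Hypothetical flag that only exists in Debug builds.  Code that sets
    /// it will not compile in a Release build.
    /// </summary>
    public bool SaveDebugRecordingsOnFailure { get; set; } = false;
#endif

    /// <summary>
    /// Decide whether a recording should be kept after a test finishes.
    /// </summary>
    public bool ShouldSaveRecording(bool testPassed)
    {
#if DEBUG
        return testPassed || SaveDebugRecordingsOnFailure;
#else
        return testPassed;
#endif
    }
}
```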

pakrym · 2020-03-12T17:22:07Z

sdk/core/Azure.Core/tests/TestFramework/UseSyncMethodsInterceptor.cs

@@ -75,6 +75,10 @@ public void Intercept(IInvocation invocation)

            try
            {
+                if (methodInfo.ContainsGenericParameters)


Can we make sure to add tests for new features to https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/core/Azure.Core/tests/ClientTestBaseTests.cs

Added and commented better

pakrym · 2020-03-12T17:43:17Z

sdk/search/Azure.Search/src/SearchClientOptions.cs

+            Diagnostics.LoggedHeaderNames.Add("return-client-request-id");
+            Diagnostics.LoggedHeaderNames.Add("throttle-reason");
+            Diagnostics.LoggedHeaderNames.Add("User-Agent");
+            Diagnostics.LoggedHeaderNames.Add("x-ms-client-request-id");


Aren't some of these in default options?

Probably - I'll go double check. Might be worth adding a unit test that these are always unique too.

Fixed and added a unit test

sdk/search/Azure.Search/src/SearchFilter.cs

pakrym · 2020-03-12T17:51:16Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+        /// <param name="reader">The JSON reader.</param>
+        /// <param name="expected">The expected token type.</param>
+        public static void Expects(
+            this ref Utf8JsonReader reader,


I think you can use in when reader is not mutated in a method.

...though it would pass by value. If the object is larger than a ref (I don't know if it is), is that really what we want for perf?

pakrym · 2020-03-12T17:53:33Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+                {
+                    // Holds document content, clear it before returning it.
+                    rented.AsSpan(0, written).Clear();
+                    ArrayPool<byte>.Shared.Return(rented);


I think Return has bool clear parameter.

Yes - I thought the same thing when I saw that at first. It uses Return to clear earlier in the code, but here it only clears however much it's used in case you only used a couple of bytes in a giant buffer.

I think Return has bool clear parameter.

That would clear all of rented which might be up to 2x larger than needed (in the worst case).
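
The trade-off discussed here can be sketched as follows. This is an illustrative example rather than the code in the PR; `UseSensitiveBuffer` is a hypothetical helper. It clears only the bytes that were written before returning the array, with the whole-array alternative shown in a comment.

```csharp
using System;
using System.Buffers;

public static class PooledBufferSketch
{
    // Copies the payload into a pooled buffer and returns how many bytes
    // were written.  Only the written prefix is cleared on the way out.
    public static int UseSensitiveBuffer(ReadOnlySpan<byte> payload)
    {
        // Rent may hand back an array larger than requested.
        byte[] rented = ArrayPool<byte>.Shared.Rent(payload.Length);
        int written = 0;
        try
        {
            payload.CopyTo(rented);
            written = payload.Length;
            // ... work with the first `written` bytes of `rented` ...
            return written;
        }
        finally
        {
            // Clear just the part that held document content.
            rented.AsSpan(0, written).Clear();
            ArrayPool<byte>.Shared.Return(rented);

            // Simpler alternative that clears the entire rented array:
            // ArrayPool<byte>.Shared.Return(rented, clearArray: true);
        }
    }
}
```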

pakrym · 2020-03-12T17:54:24Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+        /// </typeparam>
+        /// <param name="json">A JSON stream.</param>
+        /// <returns>A deserialized object.</returns>
+        public static T Deserialize<T>(this Stream json)


Why don't you CopyTo a memory stream here?

I borrowed this code at Ahson's suggestion when we were talking about doing a CopyTo MemoryStream.

I really wonder if this amount of complexity is warranted here and if it's worth the support cost later...

This is a fair question. I'm going to leave it today and have that conversation with you and Ahson in the near future when I pull this into Shared Source since anyone else supporting custom user schemas is going to need it.
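
For comparison, the simpler `CopyTo` a `MemoryStream` approach raised above might look roughly like this. It is a sketch of the alternative being discussed, not the code in the PR, and the class name is hypothetical. It buffers the whole payload in memory before deserializing.

```csharp
using System;
using System.IO;
using System.Text.Json;

public static class SimpleJsonStreamSketch
{
    // Buffer the entire stream, then hand the bytes to JsonSerializer.
    public static T Deserialize<T>(Stream json)
    {
        if (json is null) { throw new ArgumentNullException(nameof(json)); }

        using (var buffer = new MemoryStream())
        {
            json.CopyTo(buffer);

            // Wrap the underlying array directly to avoid the extra copy
            // that ToArray() would make.
            var utf8 = new ReadOnlySpan<byte>(buffer.GetBuffer(), 0, (int)buffer.Length);
            return JsonSerializer.Deserialize<T>(utf8);
        }
    }
}
```

Unlike the pooled-buffer version in the diff, the intermediate buffer here is never cleared, which is part of what the extra complexity buys.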

pakrym · 2020-03-12T17:59:54Z

sdk/search/Azure.Search/src/Utilities/GeneratorTweaks.cs

+    [CodeGenClient("Documents")]
+    internal partial class DocumentsClient { }
+
+    // Work-around the generator not enjoying mixing model types between the


Azure/autorest.csharp#486

Awesome - added a comment to help track

sdk/search/Azure.Search/src/Utilities/SearchExtensions.cs

pakrym · 2020-03-12T18:02:38Z

sdk/search/Azure.Search/src/autorest.md

+directive:
+- from: swagger-document
+  where: $.definitions.QueryType['x-ms-enum']
+  transform: $.name = "SearchQueryType";


Did you rename here because you didn't want to copy all the values? [CodeGenSchema] should be supported on enums now.

No - I just couldn't get it working with CodeGenSchema when I tried. Let me check again.

pakrym · 2020-03-12T19:31:10Z

sdk/core/Azure.Core/src/Shared/ClientDiagnostics.cs

+        /// <param name="message">The error message.</param>
+        /// <param name="errorCode">The error code.</param>
+        /// <param name="additionalInfo">Additional error details.</param>
+        partial void ExtractFailureContent(


This signature looks very unusual, should we make this method virtual?

I was looking for something with zero cost to everyone else who didn't need it and trying to be minimally invasive to code I don't own. I'm happy to change this to whatever you'd prefer it to be.

Let's do virtual, I think JIT can optimize this call away.

We chatted offline about this and decided to stay with what I have here for the short term. It's currently a sealed type and we'll rethink the approach for customizing error messages as the distributed tracing plans evolve.

pakrym · 2020-03-12T19:43:18Z

sdk/core/Azure.Core/src/Shared/ContentTypeUtilities.cs

@@ -45,7 +46,8 @@ public static bool TryGetTextEncoding(string contentType, out Encoding encoding)
            if (contentType.StartsWith(textContentTypePrefix, StringComparison.OrdinalIgnoreCase) ||
                contentType.EndsWith(jsonSuffix, StringComparison.OrdinalIgnoreCase) ||
                contentType.EndsWith(xmlSuffix, StringComparison.OrdinalIgnoreCase) ||
-                contentType.EndsWith(urlEncodedSuffix, StringComparison.OrdinalIgnoreCase))
+                contentType.EndsWith(urlEncodedSuffix, StringComparison.OrdinalIgnoreCase) ||


Test to https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/core/Azure.Core/tests/ContentTypeUtilitiesTests.cs#L22

Added a couple of OData variants

heaths

I'm not actually done, but GH's PR experience is slogging my browser. Going to pick this back up in Code with the SearchIndexClient.

heaths · 2020-03-12T18:11:24Z

sdk/core/Azure.Core/tests/TestFramework/RecordedTestBase.cs

+        /// <summary>
+        /// Flag you can (temporarily) enable to save failed test recordings
+        /// and debug/re-run at the point of failure without re-running
+        /// potentially lengthy live tests.  This should never be checked in.


Why not define an environment variable, then? They only make sense for and impact recordings anyway, so no impact to production. Or was it to avoid restarting VS?

Partly to avoid restarting VS but also partly to make it clear when you're doing something dangerous. I'd worry about leaving that env var on by accident.

heaths · 2020-03-12T18:33:59Z

sdk/core/Azure.Core/tests/TestFramework/UseSyncMethodsInterceptor.cs

+                // Make sure Response<T> is a concrete type
+                if (returnType.IsGenericType &&
+                    returnType.GetGenericTypeDefinition() == typeof(Response<>) &&
+                    returnType.ContainsGenericParameters)


Really just more of a question, but why is this last check necessary? I can't imagine the code would compile otherwise. A generic type constraint is only valid in a type expression like you use in the line above and can't be returned except as a Type.

If I recall correctly, this was the situation where we've got something like SearchAsync which returns a Response<...> and we're trying to make sure when we flip from SearchAsync we don't grab Search and try to use its return type without making it specific to Hotel.

Added more descriptive comments to all the bizarre changes I made in this file.
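
To illustrate the situation described in this thread, here is a small reflection sketch. `Box<T>` and `ClientSketch` are hypothetical stand-ins for `Response<T>` and a client with a generic method such as `Search`. The return type of the open generic method definition still contains generic parameters until the method is closed over a concrete type.

```csharp
using System;
using System.Reflection;

public class Box<T> { }

public class ClientSketch
{
    // Stand-in for a generic client method like Search<T>.
    public Box<T> Get<T>() => new Box<T>();
}

public static class GenericReturnTypeSketch
{
    // Reports ContainsGenericParameters for the open method definition
    // and for the same method closed over string.
    public static (bool Open, bool Closed) Inspect()
    {
        MethodInfo open = typeof(ClientSketch).GetMethod(nameof(ClientSketch.Get));
        MethodInfo closed = open.MakeGenericMethod(typeof(string));
        return (open.ReturnType.ContainsGenericParameters,
                closed.ReturnType.ContainsGenericParameters);
    }
}
```

Looking a method up by name through reflection yields the open definition, which is why a runtime check like this can be needed even though ordinary compiled code never sees an open `Response<T>`.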

heaths · 2020-03-12T18:36:59Z

sdk/search/Azure.Search/CHANGELOG.md

@@ -1,5 +1,6 @@
 # Release History

-## 1.0.0-preview.1 (2020)
+## 11.0.0-preview.1 (2020-03)


Supposed to be "(Unreleased)" until released, in which case it needs to be yyyy-MM-dd. See release guidelines for details.

Aren't we supposed to change that in the PR before we release? This is that PR.

https://azure.github.io/azure-sdk/policies_releases.html has an example of yyyy-MM-dd but it doesn't say that explicitly and a bunch of this wave's releases were just doing 2020-03.

Changed to use the full date on this auspicious Friday the 13th since that's the example in the policy doc.

heaths · 2020-03-12T18:39:27Z

sdk/search/Azure.Search/README.md

+adding a rich search experience over private, heterogeneous content in web,
+mobile, and enterprise applications.
+
+The **Azure Cognitive Search service** is well suited for the following


What's with the bold type here and below on names?

Trying to make it easier to skim. This part is telling you about the service, the next part is telling you about the client, you can skip one or both parts if you just want to see the code.

heaths · 2020-03-12T18:46:52Z

sdk/search/Azure.Search/README.md

+  - [Declare custom synonym maps to expand or rewrite queries](https://docs.microsoft.com/rest/api/searchservice/synonym-map-operations)
+  - Most of the `SearchServiceClient` functionality is not yet available in our current preview
+
+- `SearchIndexClient` helps with


Nit: I'd put this above SearchServiceClient since,

People will interact with it more.

It's currently available (as you called out).

It's lexicographically first.

Good call - will change.

sdk/search/Azure.Search/src/SearchFilter.cs

heaths · 2020-03-12T19:44:07Z

sdk/search/Azure.Search/src/SearchFilter.cs

+                    StringBuilder x => Quote(x.ToString()),
+
+                    // Everything else
+                    object x => throw new ArgumentException(


Use _ instead for the fallback case.

I'm using x in the exception though.

heaths · 2020-03-12T19:45:27Z

sdk/search/Azure.Search/src/SearchFilter.cs

+            foreach (char ch in text)
+            {
+                builder.Append(ch);
+                if (ch == '\'')


Do we need to worry about double quotes as well, or is that not possible in an OData filter?

Only single quotes are allowed as string delimiters in OData expressions.
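
A minimal sketch of the quoting rule being discussed: OData string literals are delimited by single quotes, and an embedded single quote is escaped by doubling it. The helper below is illustrative and is not the `SearchFilter` implementation itself.

```csharp
using System;
using System.Text;

public static class ODataQuoteSketch
{
    // Wrap text in single quotes, doubling any embedded single quote.
    public static string Quote(string text)
    {
        if (text is null) { throw new ArgumentNullException(nameof(text)); }

        var builder = new StringBuilder(text.Length + 2);
        builder.Append('\'');
        foreach (char ch in text)
        {
            builder.Append(ch);
            if (ch == '\'')
            {
                // Escape by appending the quote a second time.
                builder.Append(ch);
            }
        }
        builder.Append('\'');
        return builder.ToString();
    }
}
```

Double quotes pass through untouched because they have no special meaning inside the literal.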

heaths · 2020-03-12T19:48:55Z

sdk/search/Azure.Search/src/SearchIndexClient.cs

@@ -198,7 +205,8 @@ public class SearchIndexClient
            Debug.Assert(!string.IsNullOrEmpty(indexName));
            Debug.Assert(pipeline != null);
            Debug.Assert(diagnostics != null);
-            Debug.Assert(SearchClientOptions.ServiceVersion.V2019_05_06 <= version &&


Shouldn't this have been "asserted" (with exceptions) higher up the callstack already?

I (try to) use assertions at public facing boundaries and Debug at private boundaries. This is just a sanity check we should never hit. It's verified with an exception everywhere a user passes one.

heaths · 2020-03-12T19:51:15Z

sdk/search/Azure.Search/src/SearchIndexClient.cs


+        #region GetDocumentCount


The region or the name or ...? I definitely do the region thing because a lot of the Storage files were MASSIVE.

ahsonkhan · 2020-03-12T18:37:43Z

sdk/search/Azure.Search/src/Generated/Models/SearchError.Serialization.cs

+            writer.WritePropertyName("message");
+            writer.WriteStringValue(Message);


Writing the name and value together is faster. Across the board, minimize independent/standalone calls to WritePropertyName wherever possible.

Suggested change

-            writer.WritePropertyName("message");
-            writer.WriteStringValue(Message);
+            writer.WriteString("message", Message);

And also, just to re-iterate, use statically created JsonEncodedText for constant names.

I understand this is auto-generated, so probably the fix needs to go in the generator.
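
Putting the two recommendations together (combined name/value calls and pre-encoded constant names), a hand-written serializer could look something like the sketch below. The class and field names are hypothetical; this is not what the generator emits.

```csharp
using System.IO;
using System.Text;
using System.Text.Json;

public static class WriterSketch
{
    // Property names are encoded once and reused on every write.
    private static readonly JsonEncodedText s_message = JsonEncodedText.Encode("message");
    private static readonly JsonEncodedText s_details = JsonEncodedText.Encode("details");

    public static string Write(string message, string[] details)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new Utf8JsonWriter(stream))
            {
                writer.WriteStartObject();

                // Name and value written in a single call.
                writer.WriteString(s_message, message);

                // Name and array start written in a single call.
                writer.WriteStartArray(s_details);
                foreach (string detail in details)
                {
                    writer.WriteStringValue(detail);
                }
                writer.WriteEndArray();

                writer.WriteEndObject();
            }
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```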

ahsonkhan · 2020-03-12T18:38:06Z

sdk/search/Azure.Search/src/Generated/Models/SearchError.Serialization.cs

+                writer.WritePropertyName("details");
+                writer.WriteStartArray();


Same here/elsewhere. Will stop mentioning it.

Suggested change

-                writer.WritePropertyName("details");
-                writer.WriteStartArray();
+                writer.WriteStartArray("details");

ahsonkhan · 2020-03-12T18:39:35Z

sdk/search/Azure.Search/src/Generated/Models/SearchError.Serialization.cs

+        void IUtf8JsonSerializable.Write(Utf8JsonWriter writer)
+        {
+            writer.WriteStartObject();
+            if (Code != null)


Why the null check on Code but not on Message? Is writing {"code": null, ...} invalid?

This is generated from the swagger specification that the service provides as a contract. "code" will often be null (today for Search - they're working on changing that) but "message" never will.

ahsonkhan · 2020-03-12T18:49:41Z

sdk/search/Azure.Search/src/Generated/Models/SearchError.Serialization.cs

+            SearchError result = new SearchError();
+            foreach (var property in element.EnumerateObject())
+            {
+                if (property.NameEquals("code"))


Similarly, use UTF-8 encoded bytes for comparison here for these string literals. Presumably this is already being tracked somewhere.

Azure/autorest.csharp#460
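
A sketch of what comparing against UTF-8 encoded names could look like on the read side, using the `NameEquals` overload that takes `ReadOnlySpan<byte>`. The type and field names are hypothetical and this is not the generated code.

```csharp
using System.Text;
using System.Text.Json;

public static class NameEqualsSketch
{
    // Encode the property names once instead of on every comparison.
    private static readonly byte[] s_code = Encoding.UTF8.GetBytes("code");
    private static readonly byte[] s_message = Encoding.UTF8.GetBytes("message");

    public static (string Code, string Message) Read(string json)
    {
        string code = null;
        string message = null;
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            foreach (JsonProperty property in document.RootElement.EnumerateObject())
            {
                // byte[] converts implicitly to ReadOnlySpan<byte>.
                if (property.NameEquals(s_code))
                {
                    code = property.Value.GetString();
                }
                else if (property.NameEquals(s_message))
                {
                    message = property.Value.GetString();
                }
            }
        }
        return (code, message);
    }
}
```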

ahsonkhan · 2020-03-12T18:57:23Z

sdk/search/Azure.Search/src/Models/IndexDocumentsAction{T}.cs

+            writer.WritePropertyName("@search.action");
+            writer.WriteStringValue(ActionType.ToSerialString());


Suggested change

-            writer.WritePropertyName("@search.action");
-            writer.WriteStringValue(ActionType.ToSerialString());
+            writer.WriteString("@search.action", ActionType.ToSerialString());

Changed - all the other uses in the handwritten code are for writing a property before starting an array or nested object.

Those too can and should be fixed.

For example, in Models/IndexDocumentsBatch{T}.cs, this:

writer.WritePropertyName("value"); writer.WriteStartArray();

Should be re-written as:

writer.WriteStartArray("value");

Will change. Are these recommendations written down anywhere, btw? It'd be good to link to them from the C# guidelines if at all possible.

We have this, but it doesn't focus too heavily on performance/"best practices". We could add some explicit guidelines on how to use the APIs more optimally:
https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-migrate-from-newtonsoft-how-to#utf8jsonwriter-compared-to-jsontextwriter

ahsonkhan · 2020-03-12T19:55:41Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+            // The built in converters for JsonSerializer are a little more
+            // helpful than we want right now and will do things like turn "1"
+            // to the integer 1 instead of a string.  The number of special


Is this comment true? I am not following the intent here, but I don't think JsonSerializer does any type coercion.

I saw that happening... but maybe it was something screwy I did?

Maybe we can chat offline so I can understand what you were seeing and what you intended.

ahsonkhan · 2020-03-12T19:56:24Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+                            object value = ReadSearchDocObject(ref reader);
+                            list.Add(value);
+                        }
+                        return list.ToArray();


Can we just return the list or do we need to call ToArray()?

Track 1 was really aggressive about arrays here and has a lot of specific customer scenarios that matter. I know I'm already not meeting as many of them as I should by always returning object[] for this first preview.

ahsonkhan · 2020-03-12T19:57:54Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+                        Constants.NegativeInfValue => double.NegativeInfinity,
+                        string text =>
+                            // JsonReader's TryGetDateTimeOffset doesn't play
+                            // nicely with time zones so we'll do our own parse


Can you clarify on that please. What type of payload strings fail? TryGetDateTimeOffset follows the ISO spec.
https://docs.microsoft.com/en-us/dotnet/standard/datetime/system-text-json-support

One of the test cases was failing at this point - I'll dig into this as part of the follow-up work to clean up the JSON handling.

ahsonkhan · 2020-03-12T19:58:31Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+                    {
+                        if (reader.TokenType == JsonTokenType.EndObject) { break; }
+                        string property = reader.ExpectsPropertyName();
+                        object value = ReadObject(ref reader);


Are we OK with recursion here? Could it be unbounded?

The service currently limits the nesting of objects in documents to 10 levels deep.

Not a bad question, but the JSON document would have to be continuous/unending, right? Seems unnecessary to check depth unless we wanted to allow customers to not drag down perf with a lot of deeply nested docs. Seems like the kind of thing where we gauge user feedback over time.

My concern was less to do with perf and more to do with avoiding an unnecessary/unrecoverable stack overflow with a potential payload like the one below. Probably at 100-1k depth, the app would crash.
"{\"a\":{\"b\":{....}}"

If the service already has a limit and there is no way for someone to inject "bad" payload in (either maliciously or accidentally), then recursion is fine, since there is an implied bound :)

I've added it out of an abundance of caution, but I agree it's pretty unlikely.
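
One way to make the bound explicit is to thread a depth counter through the recursion, as in the sketch below. This is an assumption-laden illustration, not the check that was added in the PR: the `MaxDepth` value of 32 and all names are hypothetical. Note that `Utf8JsonReader` also enforces its own maximum depth (64 by default) and throws `JsonException` when it is exceeded.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public static class BoundedJsonReaderSketch
{
    // Hypothetical limit chosen for illustration only.
    private const int MaxDepth = 32;

    public static object Parse(string json)
    {
        var reader = new Utf8JsonReader(Encoding.UTF8.GetBytes(json));
        reader.Read();
        return ReadValue(ref reader, depth: 0);
    }

    private static object ReadValue(ref Utf8JsonReader reader, int depth)
    {
        // Fail with a catchable exception long before the stack is at risk.
        if (depth > MaxDepth)
        {
            throw new JsonException($"Exceeded maximum depth of {MaxDepth}.");
        }

        switch (reader.TokenType)
        {
            case JsonTokenType.StartObject:
                var obj = new Dictionary<string, object>();
                while (reader.Read() && reader.TokenType != JsonTokenType.EndObject)
                {
                    string name = reader.GetString();
                    reader.Read(); // move from the property name to its value
                    obj[name] = ReadValue(ref reader, depth + 1);
                }
                return obj;
            case JsonTokenType.StartArray:
                var list = new List<object>();
                while (reader.Read() && reader.TokenType != JsonTokenType.EndArray)
                {
                    list.Add(ReadValue(ref reader, depth + 1));
                }
                return list.ToArray();
            case JsonTokenType.String: return reader.GetString();
            case JsonTokenType.Number: return reader.GetDouble();
            case JsonTokenType.True: return true;
            case JsonTokenType.False: return false;
            case JsonTokenType.Null: return null;
            default: throw new JsonException($"Unexpected token {reader.TokenType}.");
        }
    }
}
```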

ahsonkhan · 2020-03-12T20:00:51Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+            {
+                return null;
+            }
+            reader.Expects(JsonTokenType.Number);


Is the purpose of this check to change the exception message and type of exception thrown? GetDouble already checks/throws.

Yes - to change the exception message, but I'm going to remove these when we clean up the JSON handling in the near future.

ahsonkhan · 2020-03-12T20:16:40Z

sdk/search/Azure.Search/src/Serialization/ModelConverterFactory.cs

+        public override JsonConverter CreateConverter(Type typeToConvert, JsonSerializerOptions options)
+        {
+            Debug.Assert(CanConvert(typeToConvert));
+            Type modelType = typeToConvert.GetGenericArguments()[0];


Is this guaranteed to not fail with IndexOutOfRangeException?

Yes, the way it's used today. I'll add a Debug.Assert to make sure it stays that way.

ahsonkhan · 2020-03-12T20:25:33Z

sdk/search/Azure.Search/src/Serialization/SearchDoubleConverter.cs

+                switch (reader.GetString())
+                {
+                    case Constants.InfValue:
+                        return double.PositiveInfinity;


I am curious to learn about why this special handling is required. Is this defined in the Search rest api somewhere? Can you share a link if you have it. Thanks.

The service implements an OData-compliant REST API and uses the Entity Data Model representations for data types. The representations for positive and negative infinity for type Edm.Double are "INF" and "-INF", respectively.

Here's a link to the relevant part of the OData spec: http://docs.oasis-open.org/odata/odata-json-format/v4.01/odata-json-format-v4.01.html#sec_PrimitiveValue
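
As a rough sketch of the special handling described above, a `JsonConverter<double>` can map the OData string forms to their `double` equivalents and back. The class name is hypothetical and this is not the `SearchDoubleConverter` from the PR. The `"NaN"` case is included because the linked OData JSON format also represents NaN as a string; the thread itself only mentions `"INF"` and `"-INF"`.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed class EdmDoubleConverterSketch : JsonConverter<double>
{
    public override double Read(
        ref Utf8JsonReader reader,
        Type typeToConvert,
        JsonSerializerOptions options)
    {
        // Special values arrive as JSON strings rather than numbers.
        if (reader.TokenType == JsonTokenType.String)
        {
            switch (reader.GetString())
            {
                case "INF": return double.PositiveInfinity;
                case "-INF": return double.NegativeInfinity;
                case "NaN": return double.NaN;
                default: throw new JsonException("Unexpected string for Edm.Double.");
            }
        }
        return reader.GetDouble();
    }

    public override void Write(
        Utf8JsonWriter writer,
        double value,
        JsonSerializerOptions options)
    {
        if (double.IsPositiveInfinity(value)) { writer.WriteStringValue("INF"); }
        else if (double.IsNegativeInfinity(value)) { writer.WriteStringValue("-INF"); }
        else if (double.IsNaN(value)) { writer.WriteStringValue("NaN"); }
        else { writer.WriteNumberValue(value); }
    }
}
```

The converter takes effect once it is added to `JsonSerializerOptions.Converters`.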

ahsonkhan · 2020-03-12T20:34:22Z

sdk/search/Azure.Search/tests/SessionRecords/SearchTests/PagingWithoutSize.json

@@ -0,0 +1,2481 @@
+{


These are some large JSON files. Do we generally check these into the repo (especially PagingStaticDocumentsAsync.json/PagingStaticDocuments.json/etc.)?

As an aside:
Are the contents of the async/non-async json files identical?

The json files are test recordings and yes, we typically check them into the repo.

And the async/non-async are often identical but not always so.

brjohnstmsft · 2020-03-12T22:57:33Z

sdk/search/Azure.Search/README.md

+```
+
+The request will succeed even if any of the individual actions fails and
+return an `IndexDocumentsResult` for inspection.  There's also a `ThrowOnAnyError`


Regarding ThrowOnAnyError -- I'm not sure the explanation makes sense, I guess because this didn't used to be optional. The idea behind throwing IndexBatchException on 207 was to force partial failures to be handled; otherwise there could effectively be data loss if any failed updates are ignored.

We're going to have a lot of fun digging into this one in more detail. We've been trying really hard not to add other types of exceptions so that using Track 2 means there's one thing you've got to worry about catching. Without a custom exception, we wouldn't be giving people the information to process/recover. This was debated a lot during Storage batching and we landed on this pattern.

The long-term solution is our "smart batching" idea; that makes this problem go away.

You're right that throwing on 207 without having extra info in the exception is an incomplete scenario, which raises the question -- Why have ThrowOnAnyError at all if we're never going to have IndexBatchException?

heaths

Now that I'm done (browser wasn't keeping up as I scrolled through more and more files), changing response. Overall, nothing I would hold this very early preview back for, but there are some API issues that need to be addressed: some in the custom type definitions, but many more in the autorest.csharp project (not that I called those out). For example, collection properties should be publicly read-only and never null (use LazyInitializer when it makes sense to).

heaths · 2020-03-12T20:46:00Z

sdk/search/Azure.Search/src/Models/SearchResults{T}.cs

+        /// not include any facet expressions via
+        /// <see cref="SearchOptions.Facets"/>.
+        /// </summary>
+        public IDictionary<string, IList<FacetResult>> Facets => _results.Facets;


Should this be mutable?

Ditto on comments above about tracking the generator on these.

heaths · 2020-03-12T21:17:31Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+                    {
+                        if (reader.TokenType == JsonTokenType.EndObject) { break; }
+                        string property = reader.ExpectsPropertyName();
+                        object value = ReadObject(ref reader);


Not a bad question, but the JSON document would have to be continuous/unending, right? Seems unnecessary to check depth unless we wanted to allow customers to not drag down perf with a lot of deeply nested docs. Seems like the kind of thing where we gauge user feedback over time.

heaths · 2020-03-12T21:19:41Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+        /// <param name="reader">The JSON reader.</param>
+        /// <param name="expected">The expected token type.</param>
+        public static void Expects(
+            this ref Utf8JsonReader reader,


...though it would pass by value. If the object is larger than a ref (I don't know if it is), is that really what we want for perf?

heaths · 2020-03-12T21:20:52Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+            this Stream json,
+            CancellationToken cancellationToken)
+        {
+            if (json == null)


Since Stream is abstract, I recommend json is null in case - though unlikely - someone overrides equality. In general, is null is preferred for reference types.
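
The difference between the two null checks can be seen in a small sketch. `AlwaysEqual` is a deliberately pathological, hypothetical type. Operator overloads are selected from the compile-time type of the operands, so the example uses a parameter typed as the overloading class itself.

```csharp
public sealed class AlwaysEqual
{
    // A badly behaved equality operator that claims everything is equal.
    public static bool operator ==(AlwaysEqual left, AlwaysEqual right) => true;
    public static bool operator !=(AlwaysEqual left, AlwaysEqual right) => false;
    public override bool Equals(object obj) => true;
    public override int GetHashCode() => 0;
}

public static class NullCheckSketch
{
    // `== null` goes through the overloaded operator;
    // `is null` is always a plain reference check.
    public static (bool ViaOperator, bool ViaPattern) Check(AlwaysEqual value) =>
        (value == null, value is null);
}
```

For a non-null instance, the operator-based check reports null while the pattern-based check does not.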

heaths · 2020-03-12T23:11:05Z

sdk/search/Azure.Search/src/Utilities/Constants.cs

+    /// </summary>
+    internal static class Constants
+    {
+        // TODO: XXXXX - Switch constants to use JsonEncodedText


Tip: You might keep the consts for equality, but then add JsonEncodedText static readonlys for writing.

I believe we can use the encoded text for equality as well, but I'll be following up separately on all of that.

heaths · 2020-03-12T23:17:23Z

sdk/search/Azure.Search/src/autorest.md

@@ -35,11 +35,11 @@ directive:

        // Document operations
        "/docs/$count": $["/docs/$count"],
-        "/docs/search": $["/docs/search.post.search"],
+        "/docs/search.post.search": $["/docs/search.post.search"],


Curious about these changes since I'll be updating this block to add in indexing. What didn't work the way you had it before (which we discussed, so I thought I understood what you were doing)?

These are actually live endpoints. Bruce would know more about why they have both, but I decided not to change them when I realized it was intentional.

This all comes down to OData-isms that actually help avoid common pitfalls of REST API design. We model the POST version of query APIs as OData "actions", which can have namespaces. In this case, the namespace is search.post and the action name is search. We support both the qualified and unqualified names for each action mainly for demo-ability and brevity.

People first evaluating an API tend to balk at things like indexes('myindex')/docs/search.post.search but are comfortable with indexes/myindex/docs/search. However, the issue with indexes/myindex/docs/search is that somebody could index a document whose key is search. In this case, we can disambiguate based on HTTP verb to tell whether it's a Search or a Lookup request, but in general the name-clashing problem is real. By always using the fully-qualified OData syntax in the client libraries, we avoid these problems completely.

Okay, so this was to basically eliminate the potential for name collisions in our libraries (i.e. this exact change; not the content itself)?

heaths · 2020-03-12T23:20:27Z

sdk/search/Azure.Search/src/Utilities/SearchExtensions.cs

+        /// Join a collection of strings into a single comma separated string.
+        /// If the collection is null or empty, a null string will be returned.
+        /// </summary>
+        /// <param name="items">The items to join.</param>


Could actually take an IEnumerable<T> and be more permissive. Just use Count() instead (it's optimized for ICollection/ICollection<T>).

The only users are internal IList<T> so I'm not going to bother yet. If anyone else needs this, then yes, let's use Enumerable's Count which does those type specific optimizations.

heaths · 2020-03-12T23:21:59Z

sdk/search/Azure.Search/src/Utilities/SearchExtensions.cs

+        /// <returns>A collection of individual values.</returns>
+        public static IList<string> CommaSplit(string value) =>
+            string.IsNullOrEmpty(value) ?
+                new List<string>() :


Does this need to possibly be mutable? In other projects, I keep a type-cached array (it's what Enumerable.Empty does internally, but exposes as an IEnumerable<T> - I think Array has something though) for this purpose if immutable. Less memory overhead.

Yes - we can give it back to users, they can add some values, then send it back again.

sdk/search/Azure.Search/src/Utilities/SearchExtensions.cs

heaths · 2020-03-12T23:33:26Z

sdk/search/Azure.Search/tests/Serialization/SearchFilterTests.cs

+        [Test]
+        public void ManyArguments()
+        {
+            Assert.AreEqual("Foo eq 2 and Bar eq 3",


Nit here and below: a purist would say "one assert per unit test". While I think that's overkill, what might be better to consider is using a parameterized test with these inputs and outputs as parameters. When a particular tuple asserts, NUnit (via VS, AzPipelines, etc.) will show a much more helpful message at a glance what failed. I have lots of examples in Key Vault if you want to see, but basically you could do something like this:

    [TestCaseSource(nameof(ManyArgumentsData))]
    public void ManyArguments(FormattableString actual, string expected)
    {
        Assert.AreEqual(expected, SearchFilter.Create(actual));
    }

    private static IEnumerable ManyArgumentsData() => new []
    {
        // The cast keeps the value a FormattableString; without it, an
        // interpolated string stored in an object[] is converted to string.
        new object[] { (FormattableString)$"Foo eq {2} and Bar eq {3}", "Foo eq 2 and Bar eq 3" },
        // ...
    };

Where you can use constants, just use [TestCase("foo", 1)] for example.

Part of that was not being an expert in how the compiler decides when to turn interpolated strings into constant strings, etc. But this whole area needs a lot more love and I'd like to clean the tests up when I do that.

ahsonkhan · 2020-03-13T00:19:22Z

sdk/search/Azure.Search/src/Generated/Models/SearchOptions.Serialization.cs

        {
-            SearchRequest result = new SearchRequest();
+            SearchOptions result = new SearchOptions();


nit: Use of var in foreach should be changed to be explicit (auto-gen issue).

ahsonkhan · 2020-03-13T00:21:21Z

sdk/search/Azure.Search/src/Generated/Models/SearchError.Serialization.cs

+            {
+                writer.WritePropertyName("details");
+                writer.WriteStartArray();
+                foreach (var item in Details)


nit: Change var usage.

brjohnstmsft · 2020-03-12T23:33:13Z

sdk/search/Azure.Search/src/Models/IndexDocumentsBatch{T}.cs

+        /// <param name="documents">
+        /// The collection of documents to index.
+        /// </param>
+        internal IndexDocumentsBatch(IndexActionType type, IEnumerable<T> documents)


There's a problem that some customers run into when building batches, and I'm not sure how far we want to go to try to solve it (not in preview 1 for sure, but something to think about for the future).

Some customers don't realize that T is supposed to be a document type, and instead pass a JSON string for the documents parameter. This compiles because T can be char, but the failure mode is really unhelpful. We might want to consider doing some client-side validation for cases like this.
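
The failure mode is easy to reproduce with generic type inference alone. In the sketch below (all names hypothetical, unrelated to the real `IndexDocumentsBatch<T>` API), passing a JSON string where `IEnumerable<T>` is expected infers `T` as `char`, and a simple guard shows one possible shape for client-side validation.

```csharp
using System;
using System.Collections.Generic;

public static class BatchInferenceSketch
{
    // Returns the T that the compiler inferred for the documents argument.
    public static Type DocumentTypeOf<T>(IEnumerable<T> documents) => typeof(T);

    // One possible guard: reject element types that are almost certainly
    // a mistake.  Which types to reject is an assumption for illustration.
    public static void EnsureDocumentType<T>()
    {
        if (typeof(T) == typeof(char) || typeof(T) == typeof(string))
        {
            throw new ArgumentException(
                $"{typeof(T).Name} is not a valid document type. " +
                "Did you pass a JSON string instead of a collection of documents?");
        }
    }
}
```

Calling `DocumentTypeOf("{\"id\":\"1\"}")` compiles because `string` implements `IEnumerable<char>`.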

brjohnstmsft · 2020-03-12T23:34:27Z

sdk/search/Azure.Search/src/Models/IndexDocumentsBatch{T}.cs

+        /// </returns>
+        public static IndexDocumentsBatch<SearchDocument> Delete(string keyName, IEnumerable<string> keyValues)
+        {
+            IndexDocumentsBatch<SearchDocument> batch = new IndexDocumentsBatch<SearchDocument>();


@tg-msft If you don't like var, you really wouldn't like let 😉

brjohnstmsft · 2020-03-12T23:36:20Z

sdk/search/Azure.Search/src/Models/IndexingResult.cs

+        /// found, 409 for a version conflict, 422 when the index is
+        /// temporarily unavailable, or 503 for when the service is too busy.
+        /// </summary>
+        [CodeGenSchemaMember("statusCode")]


If there's anything we can do in the Swagger to reduce the number of customizations like this that you need to do, feel free to file an issue in azure-rest-api-specs

This is to align with Response.Status - I'm not sure other languages will be making this choice. That said, yes, we should sit down and compare deltas across languages sometime early next preview.

brjohnstmsft · 2020-03-12T23:56:30Z

sdk/search/Azure.Search/src/Models/SearchResult{T}.cs

+                string name = clone.ExpectsPropertyName();
+                if (name == Constants.SearchScoreKey)
+                {
+                    propertiesNeeded--;


The service won't return duplicate JSON property names in any circumstance I can think of.

brjohnstmsft · 2020-03-12T23:57:43Z

sdk/search/Azure.Search/src/Models/SearchResult{T}.cs

+                    propertiesNeeded--;
+                    ReadHighlights(ref clone, result);
+                }
+                else


Heads up -- We will be adding more properties here over time. One should be arriving hopefully before the final preview release.

brjohnstmsft · 2020-03-12T23:58:55Z

sdk/search/Azure.Search/src/Models/SearchResult{T}.cs

+            // Clone the reader so we can get the search text property without
+            // advancing the reader over any properties needed to deserialize
+            // the user's model type.
+            Utf8JsonReader clone = reader;


The reader is a struct I assume?

Yes (ref struct).
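
A small standalone sketch of why the copy works (not the PR's code; the JSON payload and the `ReaderCloneDemo` name are made up for illustration). Assigning a `Utf8JsonReader` copies its state, so advancing the clone leaves the original where it was.

```csharp
using System.Text;
using System.Text.Json;

public static class ReaderCloneDemo
{
    // Returns the token types of the original and the clone after only the
    // clone has been advanced.
    public static (JsonTokenType Original, JsonTokenType Clone) PeekFirstProperty()
    {
        byte[] json = Encoding.UTF8.GetBytes("{\"@search.score\": 3.5, \"name\": \"Hotel\"}");

        var reader = new Utf8JsonReader(json);
        reader.Read(); // Now positioned on StartObject.

        // Utf8JsonReader is a ref struct, so assignment copies its state.
        Utf8JsonReader clone = reader;
        clone.Read(); // The clone moves to the first PropertyName.

        // The original is still positioned on StartObject.
        return (reader.TokenType, clone.TokenType);
    }
}
```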

brjohnstmsft · 2020-03-13T00:01:42Z

sdk/search/Azure.Search/src/Options/IndexDocumentsOptions.cs

+        /// Set this to true if you're not inspecting the results of the Index
+        /// Documents action.
+        /// </summary>
+        public bool ThrowOnAnyError { get; set; } = false;


Not sure I agree with false being the default here. If nothing else, it will probably catch a lot of users who migrate from Track 1 off guard.

This is the pattern we've had for Track 2 batching so far as discussed above. Since the exception doesn't have failure details, we don't want folks to do that by default.

brjohnstmsft · 2020-03-13T00:20:01Z

sdk/search/Azure.Search/src/Serialization/JsonExtensions.cs

+
+                // Ignore OData properties - we don't expose those on custom
+                // user schemas
+                if (!propertyName.StartsWith(Constants.ODataKeyPrefix, StringComparison.OrdinalIgnoreCase))


@tg-msft Although we don't have such annotations now, in the future we may add property-scope annotations to the response payload. For example:

{ "@search.score": 3.5, "myfield": "Hello", "myfield@search.someInterestingStatistic": 6 }

We wouldn't do this without a Swagger change, so you'd know about it in advance. Just a heads up since this code would have to change to handle that case.
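
For illustration only, one way the name check could distinguish the two annotation shapes if property-scope annotations ever arrive. The `ODataNames` class is hypothetical, and the prefix value `"@search."` is an assumption, since the actual value of `Constants.ODataKeyPrefix` isn't shown in this thread.

```csharp
using System;

// Hypothetical helper, not part of the PR.
public static class ODataNames
{
    // Assumed value; Constants.ODataKeyPrefix is not shown in this thread.
    private const string ODataKeyPrefix = "@search.";

    // True for document-scope annotations such as "@search.score".
    public static bool IsDocumentAnnotation(string propertyName) =>
        propertyName.StartsWith(ODataKeyPrefix, StringComparison.OrdinalIgnoreCase);

    // True for property-scope annotations such as
    // "myfield@search.someInterestingStatistic"; fieldName receives "myfield".
    public static bool TryGetAnnotatedField(string propertyName, out string fieldName)
    {
        int index = propertyName.IndexOf(ODataKeyPrefix, StringComparison.OrdinalIgnoreCase);
        if (index > 0)
        {
            fieldName = propertyName.Substring(0, index);
            return true;
        }

        fieldName = null;
        return false;
    }
}
```

A `StartsWith` check alone would treat `myfield@search.someInterestingStatistic` as a regular user property, which is the case the heads-up is about.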

brjohnstmsft · 2020-03-13T00:25:05Z

sdk/search/Azure.Search/src/Serialization/SearchDoubleConverter.cs

+                switch (reader.GetString())
+                {
+                    case Constants.InfValue:
+                        return double.PositiveInfinity;


Here's a link to the relevant part of the OData spec: http://docs.oasis-open.org/odata/odata-json-format/v4.01/odata-json-format-v4.01.html#sec_PrimitiveValue
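
For context, the OData JSON format represents the special floating-point values as the strings `"INF"`, `"-INF"`, and `"NaN"`. A minimal System.Text.Json converter along those lines might look like the sketch below. `ODataDoubleConverter` is an illustrative name, not the PR's `SearchDoubleConverter`, and the string literals are assumed to match the PR's constants.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Illustrative converter, not the PR's SearchDoubleConverter.
public sealed class ODataDoubleConverter : JsonConverter<double>
{
    public override double Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Special values arrive as JSON strings; everything else is a number.
        if (reader.TokenType == JsonTokenType.String)
        {
            switch (reader.GetString())
            {
                case "INF": return double.PositiveInfinity;
                case "-INF": return double.NegativeInfinity;
                case "NaN": return double.NaN;
                default: throw new JsonException("Unexpected string for a double value.");
            }
        }

        return reader.GetDouble();
    }

    public override void Write(Utf8JsonWriter writer, double value, JsonSerializerOptions options)
    {
        if (double.IsPositiveInfinity(value)) { writer.WriteStringValue("INF"); }
        else if (double.IsNegativeInfinity(value)) { writer.WriteStringValue("-INF"); }
        else if (double.IsNaN(value)) { writer.WriteStringValue("NaN"); }
        else { writer.WriteNumberValue(value); }
    }
}
```

To try it, add an instance to `JsonSerializerOptions.Converters` and pass those options to `JsonSerializer.Deserialize<double>` or `JsonSerializer.Serialize`.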

brjohnstmsft · 2020-03-13T01:38:27Z

sdk/search/Azure.Search/tests/Utilities/SearchResources.Data.cs

+                ["address"] = Address?.AsDocument(),
+                // With no elements to infer the type during deserialization, we must assume object[].
+                ["rooms"] = Rooms?.Select(r => r.AsDocument())?.ToArray() ?? new object[0]
+            };
    }

    [SerializePropertyNamesAsCamelCase]


This attribute still exists? Can we make it work in Track 2 with System.Text.Json?

No - that's using the Track 1 library in the tests to do our index creation. We will definitely be coming up with some answer here in the near future that allows using naming policies.

ahsonkhan · 2020-03-13T03:37:20Z

Just a heads up because I noticed it a few times in this PR. Try to avoid responding to comments/threads as part of a "bulk" review (and consider adding those standalone comment replies separately). Otherwise, your comments show up twice, both in the review feedback and the comment thread you are responding on, making it difficult to track what exactly the comment is responding to.

tg-msft · 2020-03-13T19:36:50Z

I've filed many tracking bugs at https://github.com/Azure/azure-sdk-for-net/issues?q=assignee%3Atg-msft+is%3Aopen+label%3ASearch

tg-msft marked this pull request as ready for review March 12, 2020 15:22

tg-msft requested review from arv100kri, bleroy, brjohnstmsft, KrzysztofCwalina and pakrym as code owners March 12, 2020 15:22

tg-msft requested review from heaths, ahsonkhan and AlexGhiondea March 12, 2020 15:23

pakrym reviewed Mar 12, 2020

sdk/search/Azure.Search/src/SearchFilter.cs

pakrym reviewed Mar 12, 2020

sdk/search/Azure.Search/src/Utilities/SearchExtensions.cs (outdated)

pakrym reviewed Mar 12, 2020

heaths requested changes Mar 12, 2020

ahsonkhan reviewed Mar 12, 2020

brjohnstmsft reviewed Mar 12, 2020

heaths approved these changes Mar 12, 2020

ahsonkhan reviewed Mar 13, 2020

brjohnstmsft approved these changes Mar 13, 2020

This was referenced Mar 13, 2020

Azure.Search: Clean up JSON parsing #10596

Closed

Azure.Search: Consistent ref docs best practices #10612

Closed

tg-msft added 6 commits March 13, 2020 12:42

Add Search document operations

3ada92e

Add Search document operations generated code

11570d9

Add Search document operations to README

4908cc5

Add Search document operation tests

ab6d4d2

Add Search document operation recordings and swagger

10ce16d

PR feedback

6d6a3c9

tg-msft force-pushed the search-doc branch from 824f789 to 6d6a3c9 Compare March 13, 2020 19:57

tg-msft merged commit d783d90 into Azure:master Mar 13, 2020

-        writer.WritePropertyName("message");
-        writer.WriteStringValue(Message);
+        writer.WriteString("message", Message);

-        writer.WritePropertyName("details");
-        writer.WriteStartArray();
+        writer.WriteStartArray("details");

-        writer.WritePropertyName("@search.action");
-        writer.WriteStringValue(ActionType.ToSerialString());
+        writer.WriteString("@search.action", ActionType.ToSerialString());
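
The snippets above swap a `WritePropertyName` call followed by a value or array start for the single-call `Utf8JsonWriter` overloads that take the property name directly. A small standalone sketch (the class name and sample values are made up) showing that the two forms should emit the same JSON:

```csharp
using System.IO;
using System.Text;
using System.Text.Json;

public static class WriterShorthandDemo
{
    // Property name and value written with separate calls.
    public static string TwoCalls()
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStartObject();
            writer.WritePropertyName("message");
            writer.WriteStringValue("hello");
            writer.WritePropertyName("details");
            writer.WriteStartArray();
            writer.WriteEndArray();
            writer.WriteEndObject();
        } // Disposing the writer flushes it to the stream.
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    // Same payload using the overloads that take the property name.
    public static string OneCall()
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStartObject();
            writer.WriteString("message", "hello");
            writer.WriteStartArray("details");
            writer.WriteEndArray();
            writer.WriteEndObject();
        }
        return Encoding.UTF8.GetString(stream.ToArray());
    }
}
```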
