5 changes: 5 additions & 0 deletions configuration.extractor.yaml
@@ -1,6 +1,11 @@
# Note: With the introduction of workspaces, we recommend using workspaces with APIOps whenever possible to limit what can be extracted.
# More information about workspaces can be found here https://learn.microsoft.com/en-us/azure/api-management/workspaces-overview

# Configuration Validation:
# This file is automatically validated when used with the extractor.
# You can manually validate it by running: ./extractor validate-config configuration.extractor.yaml
# The validator checks for empty entries, duplicates, invalid data types, and unknown sections.

apiNames:
- apiName1
- apiName2
51 changes: 49 additions & 2 deletions docs/apiops/3-apimTools/apiops-2-1-tools-extractor.md
@@ -22,7 +22,7 @@ The tool expects certain configuration parameters. These can be passed as enviro
| API_MANAGEMENT_SERVICE_OUTPUT_FOLDER_PATH | Folder where the APIM artifacts will be saved |
| API_SPECIFICATION_FORMAT | OpenAPI specification format. Valid options are **JSON** or **YAML**. If the variable is missing or invalid, **YAML** will be used by default |
| ARM_API_VERSION | Azure ARM API version that will be used. This is an optional parameter and defaults to **2022-04-01-preview** if not specified. Other versions can be found in the [APIM Rest API Reference - Overview Docs](https://learn.microsoft.com/en-us/rest/api/apimanagement/current-ga/api-diagnostic/create-or-update?tabs=HTTP). |
-| CONFIGURATION_YAML_PATH | Path to the Yaml configuration file used to specify select apis to extract. A sample yaml extractor configuration file to signal to the extractor to extract select apis. This is an optional parameter and will only come into play if you want different teams to manage different apis. You typically will have one configuration per team. Note: You can call the file whatever you want as long as you reference the right file within your extractor pipeline.
+| CONFIGURATION_YAML_PATH | Path to the YAML configuration file that lists the specific APIs to extract (see `configuration.extractor.yaml` for a sample). This is an optional parameter and only comes into play if you want different teams to manage different APIs. You will typically have one configuration file per team. Note: You can name the file whatever you want as long as you reference the right file within your extractor pipeline. **Configuration files are automatically validated during execution; an invalid file makes the extractor fail fast with detailed error messages.** |
| AZURE_CLOUD_ENVIRONMENT | Azure Authority Host Service URL that will be used. This is an optional parameter and defaults to **AzurePublicCloud** if not specified. |
| Logging__LogLevel__Default: | The allowed values are "Information", "Debug", or "Trace". The table below describes each logging level. |

Expand Down Expand Up @@ -99,5 +99,52 @@ The extractor will export the artifacts listed below.
### Extracting Select Artifacts
There are cases where you may want to extract select artifacts (e.g. specific apis, products, etc.) instead of extracting everything. This could be a result of having a requirement to promote specific artifacts across environments or as a result of supporting multiple teams where each team may be responsible for select artifacts. ApiOPS supports this feature and you can find the details in the "Supporting Independent API Teams" [section](../6-supportingIndependentAPITeams/index.md).

### Configuration Validation

The extractor includes comprehensive validation for YAML configuration files to help catch common configuration errors early:

#### Automatic Validation

- **Runtime Validation**: Configuration files are automatically validated when the extractor starts
- **Fail Fast**: Invalid configurations cause the extractor to exit immediately with detailed error messages
- **Detailed Errors**: Each error message names the offending property path (for example `apiNames[1]`) and describes the issue; YAML syntax errors are reported with the parser's own message

#### Manual Validation Command

You can validate configuration files before running extraction:

```bash
# Validate a configuration file
./extractor validate-config configuration.extractor.yaml

# Example output for valid configuration:
# ✅ Configuration validation PASSED!
# The configuration file is valid and ready to use.

# Example output for invalid configuration:
# ❌ Configuration validation FAILED:
# • apiNames[1]: Items cannot be empty or contain only whitespace.
# • apiNames: Duplicate item found: 'demo-api'. Each item should be unique.
# • productNames: Property 'productNames' must be an array of strings.
```

#### Validation Rules

The validator checks for:

- **Structural Issues**: Empty files, non-array properties, missing files
- **Content Issues**: Empty sections, empty or whitespace-only strings, duplicate entries (compared case-insensitively), non-string items
- **Naming Conventions**: Names longer than 256 characters, or names containing `//` or `\\` sequences
- **Unknown Sections**: Warns about unrecognized configuration sections
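
To illustrate the duplicate rule, here is a minimal shell sketch of a local pre-check. It is not the extractor's implementation (that lives in `ConfigurationValidator.cs`); the function name `find_duplicate_names` is hypothetical, and it assumes one name per line on standard input rather than parsing YAML.

```bash
# Hypothetical helper (not part of the extractor): print names that occur
# more than once when compared case-insensitively.
# Input: one name per line on stdin. Output: each duplicated name, lowercased.
find_duplicate_names() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# 'demo-api' and 'Demo-API' count as the same entry, so this prints: demo-api
printf '%s\n' 'demo-api' 'other-api' 'Demo-API' | find_duplicate_names
```

An empty result means no duplicates were found among the names passed in.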

#### Best Practices

- Use the validation command in CI/CD pipelines before extraction
- Fix validation errors immediately - they indicate configuration problems
- Pay attention to warnings about unknown sections - they may indicate typos
- Keep configuration files under version control
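
The first practice above could be wired into a pipeline step roughly as follows. This is a sketch under two assumptions that this document does not confirm: that `validate-config` exits with a non-zero status when validation fails, and that the extractor binary is reachable at the path you pass in. The function name `run_validation_gate` is hypothetical.

```bash
# Hypothetical pipeline helper: run the validator and stop before extraction
# if the configuration is invalid.
# Assumption: the validator command exits non-zero when validation fails.
# Arguments: $1 = validator executable, $2 = path to the YAML configuration.
run_validation_gate() {
  local validator="$1" config="$2"

  # Fail early with a distinct status if the file is not there at all.
  if [ ! -f "$config" ]; then
    echo "Configuration file not found: $config" >&2
    return 2
  fi

  if "$validator" validate-config "$config"; then
    echo "Configuration OK, continuing to extraction."
  else
    echo "Configuration invalid, stopping before extraction." >&2
    return 1
  fi
}
```

A pipeline step might call it as `run_validation_gate ./extractor configuration.extractor.yaml` and let the non-zero return status fail the job.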

> Note: Configuration validation is designed to catch common mistakes and improve the developer experience. It does not validate Azure APIM-specific constraints (like API name availability).

> **Note**
> We recommend looking into [workspaces](https://learn.microsoft.com/en-us/azure/api-management/workspaces-overview) in the future which allows decentralized API development teams to manage and productize their own APIs, while a central API platform team maintains the API Management infrastructure. Each workspace contains APIs, products, subscriptions, and related entities that are accessible only to the workspace collaborators. Access is controlled through Azure role-based access control (RBAC). `ApiOPS` will bring support for workspaces when it becomes generally available.
31 changes: 31 additions & 0 deletions tools/README.md
@@ -1,5 +1,36 @@
Note that although the extractor and publisher binaries are not tightly coupled to the CI/CD pipelines we provide, we highly recommend running them within those pipelines. Treat the techniques outlined below as an internal development loop, and the provided pipelines as the external development loop.

# Configuration Validation

## Validate Configuration Files

Before running the extractor, you can validate your YAML configuration files to catch common errors:

```bash
# Validate extractor configuration
./extractor validate-config configuration.extractor.yaml

# Example output for valid configuration:
# ✅ Configuration validation PASSED!
# The configuration file is valid and ready to use.

# Example output with errors:
# ❌ Configuration validation FAILED:
# • apiNames[1]: Items cannot be empty or contain only whitespace.
# • apiNames: Duplicate item found: 'demo-api'. Each item should be unique.
# • productNames: Property 'productNames' must be an array of strings.
```

Configuration validation helps catch:

- Empty or whitespace entries
- Duplicate items (case-insensitive)
- Invalid data types (non-arrays where arrays expected)
- Unknown configuration sections (warnings)
- Names that are too long (>256 characters)

The validation runs automatically when you use configuration files, but using the validation command helps catch issues early in development.
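
When each team keeps its own configuration file, it can help to check all of them in one pass and see every failure at once. The sketch below assumes, without confirmation from this document, that `validate-config` exits non-zero for an invalid file; the function name `validate_all_configs` and the file names in the usage note are hypothetical.

```bash
# Hypothetical helper: validate several per-team configuration files and
# report every failure instead of stopping at the first one.
# Assumption: the validator command exits non-zero for an invalid file.
# Arguments: $1 = validator executable, remaining args = configuration files.
validate_all_configs() {
  local validator="$1"
  shift
  local failed=0 config

  for config in "$@"; do
    if "$validator" validate-config "$config" >/dev/null; then
      echo "PASS $config"
    else
      echo "FAIL $config"
      failed=$((failed + 1))
    fi
  done

  echo "$failed file(s) failed validation"
  # Return success only when every file passed.
  [ "$failed" -eq 0 ]
}
```

For example, `validate_all_configs ./extractor team-a.yaml team-b.yaml` would print one PASS or FAIL line per file followed by a summary line.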

# Debug Instructions using Visual Studio Code Dev Container
This option allows you to run the extractor and publisher binaries on your local machine inside a container. Thus you won't need to install any SDKs on your local machine.

28 changes: 24 additions & 4 deletions tools/code/common/Configuration.cs
@@ -3,6 +3,7 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
Expand Down Expand Up @@ -195,21 +196,40 @@ public static void ConfigureConfigurationJson(IHostApplicationBuilder builder)
private static ConfigurationJson GetConfigurationJson(IServiceProvider provider)
{
var configuration = provider.GetRequiredService<IConfiguration>();
var logger = provider.GetRequiredService<ILogger<ConfigurationJson>>();

var configurationJson = ConfigurationJson.From(configuration);

return TryGetConfigurationJsonFromYaml(configuration)
return TryGetConfigurationJsonFromYaml(configuration, logger)
.Map(configurationJson.MergeWith)
.IfNone(configurationJson);
}

private static Option<ConfigurationJson> TryGetConfigurationJsonFromYaml(IConfiguration configuration) =>
private static Option<ConfigurationJson> TryGetConfigurationJsonFromYaml(IConfiguration configuration, ILogger logger) =>
configuration.TryGetValue("CONFIGURATION_YAML_PATH")
.Map(path => new FileInfo(path))
.Where(file => file.Exists)
.Map(file =>
{
using var reader = File.OpenText(file.FullName);
return ConfigurationJson.FromYaml(reader);
logger.LogInformation("Loading configuration from YAML file: {FilePath}", file.FullName);

// Validate the YAML configuration before loading
var validationResult = ConfigurationValidator.ValidateExtractorConfigurationFromFile(file, logger);

return validationResult.Match(
errors =>
{
var errorMessages = string.Join(Environment.NewLine, errors.Select(e => $" - {e}"));
var fullMessage = $"Configuration validation failed for file '{file.FullName}':{Environment.NewLine}{errorMessages}";

logger.LogError("Configuration validation errors: {Errors}", errorMessages);
throw new InvalidOperationException(fullMessage);
},
validConfig =>
{
logger.LogInformation("Configuration validation passed for file: {FilePath}", file.FullName);
return validConfig;
}
);
});
}
217 changes: 217 additions & 0 deletions tools/code/common/ConfigurationValidator.cs
@@ -0,0 +1,217 @@
using LanguageExt;
using Microsoft.Extensions.Logging;
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.IO;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Nodes;
using YamlDotNet.Core;

namespace common;

public record ConfigurationValidationError(string PropertyPath, string Message)
{
    public override string ToString() => $"{PropertyPath}: {Message}";
}

public static class ConfigurationValidator
{
    private static readonly ImmutableHashSet<string> ValidExtractorSections = ImmutableHashSet.Create(
        StringComparer.OrdinalIgnoreCase,
        "apiNames",
        "backendNames",
        "diagnosticNames",
        "gatewayNames",
        "groupNames",
        "loggerNames",
        "namedValueNames",
        "policyFragmentNames",
        "productNames",
        "subscriptionNames",
        "tagNames",
        "versionSetNames",
        "workspaceNames"
    );

    public static Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson> ValidateExtractorConfiguration(
        ConfigurationJson configurationJson,
        ILogger? logger = null)
    {
        var errors = new List<ConfigurationValidationError>();

        // Validate root structure
        ValidateRootStructure(configurationJson.Value, errors);

        // Validate each known section
        ValidateKnownSections(configurationJson.Value, errors, logger);

        // Check for unknown sections
        ValidateUnknownSections(configurationJson.Value, errors, logger);

        return errors.Count == 0
            ? Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Right(configurationJson)
            : Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Left(errors.ToImmutableList());
    }

    public static Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson> ValidateExtractorConfigurationFromFile(
        FileInfo configurationFile,
        ILogger? logger = null)
    {
        try
        {
            if (!configurationFile.Exists)
            {
                return Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Left(
                    ImmutableList.Create(new ConfigurationValidationError("file", $"Configuration file '{configurationFile.FullName}' does not exist.")));
            }

            using var reader = File.OpenText(configurationFile.FullName);
            var configurationJson = ConfigurationJson.FromYaml(reader);

            return ValidateExtractorConfiguration(configurationJson, logger);
        }
        catch (YamlException yamlEx)
        {
            return Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Left(
                ImmutableList.Create(new ConfigurationValidationError("yaml", $"YAML parsing error: {yamlEx.Message}")));
        }
        catch (JsonException jsonEx)
        {
            return Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Left(
                ImmutableList.Create(new ConfigurationValidationError("json", $"JSON conversion error: {jsonEx.Message}")));
        }
        catch (IOException ioEx)
        {
            return Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Left(
                ImmutableList.Create(new ConfigurationValidationError("file", $"File I/O error: {ioEx.Message}")));
        }
        catch (UnauthorizedAccessException authEx)
        {
            return Either<ImmutableList<ConfigurationValidationError>, ConfigurationJson>.Left(
                ImmutableList.Create(new ConfigurationValidationError("access", $"Access denied: {authEx.Message}")));
        }
    }

    private static void ValidateRootStructure(JsonObject rootObject, List<ConfigurationValidationError> errors)
    {
        if (rootObject.Count == 0)
        {
            errors.Add(new ConfigurationValidationError("root", "Configuration file is empty or contains no valid sections."));
            return;
        }

        // Check if all properties are arrays (as expected for extractor config)
        foreach (var property in rootObject)
        {
            if (property.Value is not JsonArray)
            {
                errors.Add(new ConfigurationValidationError(
                    property.Key,
                    $"Property '{property.Key}' must be an array of strings."));
            }
        }
    }

    private static void ValidateKnownSections(JsonObject rootObject, List<ConfigurationValidationError> errors, ILogger? logger)
    {
        foreach (var sectionName in ValidExtractorSections)
        {
            if (rootObject.TryGetPropertyValue(sectionName, out var sectionNode) && sectionNode is JsonArray sectionArray)
            {
                ValidateStringArray(sectionName, sectionArray, errors);
            }
        }
    }

    private static void ValidateStringArray(string sectionName, JsonArray array, List<ConfigurationValidationError> errors)
    {
        if (array.Count == 0)
        {
            errors.Add(new ConfigurationValidationError(sectionName, $"Section '{sectionName}' is empty. Consider removing it if no items need to be extracted."));
            return;
        }

        var duplicates = new System.Collections.Generic.HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var seen = new System.Collections.Generic.HashSet<string>(StringComparer.OrdinalIgnoreCase);

        for (int i = 0; i < array.Count; i++)
        {
            var item = array[i];

            // Check if item is a string
            if (item is not JsonValue jsonValue || !jsonValue.TryGetValue<string>(out var stringValue))
            {
                errors.Add(new ConfigurationValidationError(
                    $"{sectionName}[{i}]",
                    "All items in the array must be strings."));
                continue;
            }

            // Check for empty or whitespace strings
            if (string.IsNullOrWhiteSpace(stringValue))
            {
                errors.Add(new ConfigurationValidationError(
                    $"{sectionName}[{i}]",
                    "Items cannot be empty or contain only whitespace."));
                continue;
            }

            // Check for duplicates (case-insensitive, reported once per name)
            if (!seen.Add(stringValue))
            {
                duplicates.Add(stringValue);
            }

            // Validate naming conventions
            ValidateNamingConvention(sectionName, i, stringValue, errors);
        }

        // Report duplicates
        foreach (var duplicate in duplicates)
        {
            errors.Add(new ConfigurationValidationError(
                sectionName,
                $"Duplicate item found: '{duplicate}'. Each item should be unique."));
        }
    }

    private static void ValidateNamingConvention(string sectionName, int index, string name, List<ConfigurationValidationError> errors)
    {
        // Basic naming convention validation
        if (name.Length > 256)
        {
            errors.Add(new ConfigurationValidationError(
                $"{sectionName}[{index}]",
                $"Name '{name}' is too long. Maximum length is 256 characters."));
        }

        // Check for invalid character sequences: a double forward slash or a double backslash
        if (name.Contains("//", StringComparison.Ordinal) || name.Contains("\\\\", StringComparison.Ordinal))
        {
            errors.Add(new ConfigurationValidationError(
                $"{sectionName}[{index}]",
                $"Name '{name}' contains invalid character sequences."));
        }
    }

    private static void ValidateUnknownSections(JsonObject rootObject, List<ConfigurationValidationError> errors, ILogger? logger)
    {
        var unknownSections = rootObject
            .Where(kvp => !ValidExtractorSections.Contains(kvp.Key))
            .Select(kvp => kvp.Key)
            .ToList();

        foreach (var unknownSection in unknownSections)
        {
            var message = $"Unknown configuration section: '{unknownSection}'. Valid sections are: {string.Join(", ", ValidExtractorSections.OrderBy(s => s))}";

            logger?.LogWarning("Configuration validation warning: {Message}", message);

            // For now, treat unknown sections as warnings, not errors
            // Uncomment the next line if you want to treat them as errors
            // errors.Add(new ConfigurationValidationError(unknownSection, message));
        }
    }
}