[Schema Inaccuracy] Duplicate title properties and inline schemas #4622

@wolfy1339

Description

Schema Inaccuracy

The schema contains many inline schemas that should use reusable components instead, and many inline schemas and components share the same title property.

This causes problems when using tools like json-schema-to-typescript where you end up with many TypeScript interfaces named User_3, Repository_1, etc.
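
To illustrate why duplicate titles end up as suffixed names, here is a minimal sketch of a title-to-name allocator. It is not json-schema-to-typescript's actual implementation: `toTypeName` and `createNameAllocator` are hypothetical helpers, and the suffix scheme is an assumption chosen to mirror the `User_3` pattern described above.

```javascript
// Hypothetical helper: turn a schema title into a PascalCase identifier.
function toTypeName(title) {
  return title
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1))
    .join("");
}

// Hypothetical allocator: the first use of a name is kept as-is,
// every later use of the same name gets a numeric suffix (_1, _2, ...).
function createNameAllocator() {
  const counts = new Map();
  return function allocate(title) {
    const base = toTypeName(title);
    const used = counts.get(base) ?? 0;
    counts.set(base, used + 1);
    return used === 0 ? base : `${base}_${used}`;
  };
}

const allocate = createNameAllocator();
const names = ["User", "Repository", "User", "App Instance", "User"].map(allocate);
console.log(names);
// → [ 'User', 'Repository', 'User_1', 'AppInstance', 'User_2' ]
```

Any generator that derives identifiers from titles has to disambiguate collisions somehow, so the suffixes carry no information about how the colliding schemas differ.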

Expected

No inline schemas for reused definitions, and no schemas that share the same title.

Reproduction Steps

This script writes every duplicate title property it finds (each occurrence after the first) to duplicates.txt:

import schema from "./packages/openapi-webhooks/generated/api.github.com.json" with { type: "json" };
import { writeFileSync } from "node:fs";

function findDuplicateTitles(schema) {
  const seen = new Set();
  const duplicates = [];

  function traverse(obj, path = []) {
    if (typeof obj !== "object" || obj === null) return;

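    // Only string values are titles; a property that is itself named "title" holds a schema object.
    if (typeof obj.title === "string") {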
      const titlePath = path.join("/");
      if (seen.has(obj.title)) {
        duplicates.push({ title: obj.title, path: titlePath });
      } else {
        seen.add(obj.title);
      }
    }

    // Traverse properties like oneOf, anyOf, etc.
    for (const [key, value] of Object.entries(obj)) {
      if (Array.isArray(value)) {
        value.forEach((item, index) => traverse(item, [...path, key, index]));
      } else if (typeof value === "object") {
        traverse(value, [...path, key]);
      }
    }
  }

  traverse(schema);
  return duplicates;
}

async function main() {
  const duplicates = findDuplicateTitles(schema)
    .sort((a, b) => {
      if (a.title < b.title) return -1;
      if (a.title > b.title) return 1;
      return 0;
    })
    .map(({ title, path }) => `- [ ] Title: ${title}, Path: \`#/${path}\``);

  writeFileSync("duplicates.txt", duplicates.join("\n"));
}

main().catch(console.error);

Here is the list of duplicates I have found, which is too long to post directly into the issue body:
duplicates.txt
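
Because the list is long, grouping the occurrences per title may make it easier to triage. The following is a minimal sketch, assuming entries shaped like the `{ title, path }` objects that `findDuplicateTitles` returns; `groupByTitle` is a hypothetical helper and the sample paths are made up.

```javascript
// Hypothetical helper: collect all paths that share a title and count them.
function groupByTitle(entries) {
  const groups = new Map();
  for (const { title, path } of entries) {
    if (!groups.has(title)) groups.set(title, []);
    groups.get(title).push(path);
  }
  // Most frequently duplicated titles first; ties are sorted alphabetically.
  return [...groups.entries()]
    .map(([title, paths]) => ({ title, count: paths.length, paths }))
    .sort((a, b) => b.count - a.count || a.title.localeCompare(b.title));
}

// Made-up sample entries, only to show the shape of the result.
const sample = [
  { title: "User", path: "paths/a/get" },
  { title: "App Instance", path: "paths/b/get" },
  { title: "User", path: "paths/c/get" },
];

const grouped = groupByTitle(sample);
console.log(grouped.map(({ title, count }) => `${title}: ${count}`).join("\n"));
// → User: 2
//   App Instance: 1
```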

Activity

self-assigned this on Mar 27, 2025
bearcherian (Contributor) commented on Mar 27, 2025

@wolfy1339 Thanks for opening this issue. The JSON Schema spec provides this documentation for the title field (see footnote 1):

The title keyword in JSON Schema is used to provide a human-readable label for a schema or its parts. It does not affect data validation but serves as an informative annotation.

Since the title is just metadata for the schema and not intended to be a unique identifier, we don't consider the duplicate titles an issue in our schema. I would work with the maintainers of json-schema-to-typescript and see if they have a way to work around that. Alternatively, GitHub does provide openapi-types, a library of TypeScript definitions generated from our OpenAPI schema.

Footnotes

  1. https://www.learnjsonschema.com/2020-12/meta-data/title/

wolfy1339 (Author) commented on Mar 27, 2025

Yes, I am aware of all those points. I am also very aware of openapi-types, as I help maintain it.

The duplicate titles point to a bigger issue: those items should most likely either use the reusable components or have a different title that explains the difference between the reusable component and the inline definition.

For example, having the title be App Instance on many of them isn't very informative, considering there is already a reusable component with the same title. What is different between that one and the reusable component? Maybe it's App Instance With Organization, or App Instance Owned By Organization?

I understand that titles aren't necessarily unique, but I believe that this points to a bigger issue of general duplication within the OpenAPI spec and should be looked at for each occurrence to see if there is a way to reduce the duplication.

The schema is already a mighty 10 MB.
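
One way to find candidates for de-duplication is to look for inline schemas that are structurally identical to an entry under `components.schemas`. The sketch below assumes an OpenAPI document already parsed into a plain object; `canonical`, `escapePointer`, and `findInlineCopies` are hypothetical helpers, the sample document is made up, and only exact structural matches are detected, so near-duplicates such as the App Instance variants would still need manual review.

```javascript
// Serialize a value with object keys sorted, so that key order does not matter.
function canonical(value) {
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonical(value[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Escape a key for use in a JSON Pointer (RFC 6901): "~" -> "~0", "/" -> "~1".
function escapePointer(key) {
  return String(key).replace(/~/g, "~0").replace(/\//g, "~1");
}

// Report every inline object that is identical to a reusable component schema.
function findInlineCopies(doc) {
  const components = doc.components?.schemas ?? {};
  const byShape = new Map();
  for (const [name, schema] of Object.entries(components)) {
    byShape.set(canonical(schema), `#/components/schemas/${escapePointer(name)}`);
  }

  const matches = [];
  function traverse(node, path) {
    if (node === null || typeof node !== "object") return;
    const ref = byShape.get(canonical(node));
    // Do not report the component definitions themselves.
    const isComponentItself =
      path.length === 3 && path[0] === "components" && path[1] === "schemas";
    if (ref !== undefined && !isComponentItself) {
      matches.push({ path: `#/${path.map(escapePointer).join("/")}`, ref });
      return;
    }
    for (const [key, value] of Object.entries(node)) {
      traverse(value, [...path, key]);
    }
  }

  traverse(doc, []);
  return matches;
}

// Made-up sample document with one inline copy of the "user" component.
const doc = {
  components: {
    schemas: {
      user: { title: "User", type: "object", properties: { login: { type: "string" } } },
    },
  },
  paths: {
    "/user": {
      get: {
        responses: {
          "200": {
            content: {
              "application/json": {
                schema: { type: "object", title: "User", properties: { login: { type: "string" } } },
              },
            },
          },
        },
      },
    },
  },
};

const copies = findInlineCopies(doc);
console.log(copies);
```

Each reported path could be replaced with a `$ref` to the listed component. The sketch recomputes the canonical form for every node, which is fine for illustration but would likely need memoization for a file of this size.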

I hope you understand the point I'm trying to make with this issue.

bearcherian (Contributor) commented on Apr 1, 2025

Thanks for clarifying the issue. I'll create an issue to track this internally and de-duplicate the components, or use better titles.

Participants: @bearcherian, @wolfy1339, @easyt

Issue #4622 · github/rest-api-description