refactor: index mechanism to enhance overall performance #6039

Merged: 5 commits, Jun 21, 2024

Member

@guqing guqing commented Jun 5, 2024

What type of PR is this?

/kind improvement
/area core
/milestone 2.17.x

What this PR does / why we need it:

Refactor the query and sorting logic of the index mechanism to improve overall performance.
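
For readers unfamiliar with the idea, here is a minimal Python sketch of how an in-memory field index can answer equality selectors without scanning every object. It is an illustration only and not Halo's implementation (which is Java): the `FieldIndex` class, its method names, and the sample data are all hypothetical.

```python
from collections import defaultdict


class FieldIndex:
    """Hypothetical inverted index: (field, value) -> set of object names."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, name, fields):
        # `fields` maps a field path to one value or a list of values.
        for field, value in fields.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                self._entries[(field, v)].add(name)

    def equal(self, field, value):
        # An equality lookup is a single dictionary access, not a full scan.
        return self._entries.get((field, value), set())

    def query(self, selectors):
        # AND several equality selectors by intersecting their result sets,
        # starting from the smallest set to keep the intersection cheap.
        sets = sorted((self.equal(f, v) for f, v in selectors), key=len)
        if not sets:
            return set()
        result = set(sets[0])
        for s in sets[1:]:
            result &= s
        return result


index = FieldIndex()
index.add("post-1", {"spec.visible": "PUBLIC", "spec.categories": ["category-1"]})
index.add("post-2", {"spec.visible": "PRIVATE", "spec.categories": ["category-1"]})
index.add("post-3", {"spec.visible": "PUBLIC", "spec.categories": ["category-2"]})

matches = index.query([("spec.visible", "PUBLIC"), ("spec.categories", "category-1")])
print(sorted(matches))
```

Intersecting from the smallest candidate set first is a common design choice because the cost of each intersection is bounded by the smaller operand.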

How to test it?
Using a PostgreSQL database, initialize Halo, then run the following script to create 300,000 posts as test data:

DO $$
DECLARE
    i integer;
    postNameIndex integer;
    snapshotName varchar;
    totalRecords integer;
BEGIN
    postNameIndex := 1;
    totalRecords := 300000;

    FOR i IN 1..3 LOOP
      INSERT INTO "public"."extensions" ("name", "data", "version")
      VALUES (
          '/registry/content.halo.run/categories/category-'||i,
          convert_to(
              jsonb_build_object(
                  'spec', jsonb_build_object(
                      'displayName', '分类-'||i,
                      'slug', 'category-'||i,
                      'description', '测试分类',
                      'cover', '',
                      'template', '',
                      'priority', 0,
                      'children', '[]'::jsonb
                  ),
                  'status', jsonb_build_object(
                      'permalink', '/categories/category-'||i,
                      'postCount', totalRecords,
                      'visiblePostCount', totalRecords
                  ),
                  'apiVersion', 'content.halo.run/v1alpha1',
                  'kind', 'Category',
                  'metadata', jsonb_build_object(
                      'finalizers', jsonb_build_array('category-protection'),
                      'name', 'category-' || i,
                      'annotations', jsonb_build_object(
                          'content.halo.run/permalink-pattern', 'categories'
                      ),
                      'version', 0,
                      'creationTimestamp', '2024-06-12T03:56:40.315592Z'
                  )
          )::text, 'UTF8'),
          0
      );
    END LOOP;


    FOR i IN 1..3 LOOP
      INSERT INTO "public"."extensions" ("name", "data", "version")
        VALUES (
            '/registry/content.halo.run/tags/tag-' || i,
            convert_to(
               jsonb_build_object(
               'spec', jsonb_build_object(
                   'displayName', 'Halo tag ' || i,
                   'slug', 'tag-'||i,
                   'color', '#ffffff',
                   'cover', ''
               ),
               'status', jsonb_build_object(
                   'permalink', '/tags/tag-' || i,
                   'visiblePostCount', totalRecords,
                   'postCount', totalRecords,
                   'observedVersion', 0
               ),
               'apiVersion', 'content.halo.run/v1alpha1',
               'kind', 'Tag',
               'metadata', jsonb_build_object(
                   'finalizers', jsonb_build_array('tag-protection'),
                   'name', 'tag-'||i,
                   'annotations', jsonb_build_object(
                       'content.halo.run/permalink-pattern', 'tags'
                   ),
                   'version', 0,
                   'creationTimestamp', '2024-06-12T03:56:40.406407Z'
               )
       )::text, 'UTF8'),
       0);
    END LOOP;

    FOR i IN postNameIndex..totalRecords LOOP
        -- Generate snapshotName
        snapshotName := 'snapshot-' || i;

        -- Insert post data
        INSERT INTO "public"."extensions" ("name", "data", "version")
        VALUES (
            '/registry/content.halo.run/posts/post-' || postNameIndex,
            convert_to(
                jsonb_build_object(
                    'spec', jsonb_build_object(
                        'title', 'title-' || postNameIndex,
                        'slug', 'slug-' || postNameIndex,
                        'releaseSnapshot', snapshotName,
                        'headSnapshot', snapshotName,
                        'baseSnapshot', snapshotName,
                        'owner', 'admin',
                        'template', '',
                        'cover', '',
                        'deleted', false,
                        'publish', true,
                        'pinned', false,
                        'allowComment', true,
                        'visible', 'PUBLIC',
                        'priority', 0,
                        'excerpt', jsonb_build_object(
                            'autoGenerate', true,
                            'raw', ''
                        ),
                        'categories', ARRAY['category-1', 'category-2', 'category-3'],
                        'tags', ARRAY['tag-1', 'tag-2', 'tag-3'],
                        'htmlMetas', '[]'::jsonb
                    ),
                    'status', jsonb_build_object(
                        'phase', 'PUBLISHED',
                        'conditions', ARRAY[
                            jsonb_build_object(
                                'type', 'PUBLISHED',
                                'status', 'TRUE',
                                'lastTransitionTime', '2024-06-11T10:16:15.617748Z',
                                'message', 'Post published successfully.',
                                'reason', 'Published'
                            ),
                            jsonb_build_object(
                                'type', 'DRAFT',
                                'status', 'TRUE',
                                'lastTransitionTime', '2024-06-11T10:16:15.457668Z',
                                'message', 'Drafted post successfully.',
                                'reason', 'DraftedSuccessfully'
                            )
                        ],
                        'permalink', '/archives/slug-' || postNameIndex,
                        'excerpt', '如果你看到了这一篇文章,那么证明你已经安装成功了,感谢使用 Halo 进行创作,希望能够使用愉快。',
                        'inProgress', false,
                        'contributors', ARRAY['admin'],
                        'lastModifyTime', '2024-06-11T10:16:15.421467Z',
                        'observedVersion', 0
                    ),
                    'apiVersion', 'content.halo.run/v1alpha1',
                    'kind', 'Post',
                    'metadata', jsonb_build_object(
                        'finalizers', ARRAY['post-protection'],
                        'name', 'post-' || postNameIndex,
                        'labels', jsonb_build_object(
                            'content.halo.run/published', 'true',
                            'content.halo.run/deleted', 'false',
                            'content.halo.run/owner', 'admin',
                            'content.halo.run/visible', 'PUBLIC',
                            'content.halo.run/archive-year', '2024',
                            'content.halo.run/archive-month', '06',
                            'content.halo.run/archive-day', '11'
                        ),
                        'annotations', jsonb_build_object(
                            'content.halo.run/permalink-pattern', '/archives/{slug}',
                            'content.halo.run/last-released-snapshot', snapshotName,
                            'checksum/config', '73e40d4115f5a7d1e74fcc9228861c53d2ef60468e1e606e367b01efef339309'
                        ),
                        'version', 0,
                        'creationTimestamp', '2024-06-11T05:51:46.059292Z'
                    )
                )::text, 'UTF8'),
            1
        );

        -- Insert content data
        INSERT INTO "public"."extensions" ("name", "data", "version")
        VALUES (
            '/registry/content.halo.run/snapshots/' || snapshotName,
            convert_to(
                jsonb_build_object(
                    'spec', jsonb_build_object(
                        'subjectRef', jsonb_build_object(
                            'group', 'content.halo.run',
                            'version', 'v1alpha1',
                            'kind', 'Post',
                            'name', 'post-' || postNameIndex
                        ),
                        'rawType', 'HTML',
                        'rawPatch', '<p style=\"\">测试内容</p>',
                        'contentPatch', '<p style=\"\">测试内容</p>',
                        'lastModifyTime', '2024-06-11T06:01:25.748755Z',
                        'owner', 'admin',
                        'contributors', ARRAY['admin']
                    ),
                    'apiVersion', 'content.halo.run/v1alpha1',
                    'kind', 'Snapshot',
                    'metadata', jsonb_build_object(
                        'name', snapshotName,
                        'annotations', jsonb_build_object(
                            'content.halo.run/keep-raw', 'true'
                        ),
                        'creationTimestamp', '2024-06-11T06:01:25.748925Z'
                    )
                )::text, 'UTF8'),
            1
        );

        postNameIndex := postNameIndex + 1;
    END LOOP;
END $$;

Query posts using the following API:

curl 'http://localhost:8090/apis/api.console.halo.run/v1alpha1/posts?page=1&size=20&labelSelector=content.halo.run%2Fdeleted%3Dfalse&labelSelector=content.halo.run%2Fpublished%3Dtrue&fieldSelector=spec.categories%3Dcategory-1&fieldSelector=spec.tags%3Dc33ceabb-d8f1-4711-8991-bb8f5c92ad7c&fieldSelector=status.contributors%3Dadmin&fieldSelector=spec.visible%3DPUBLIC' \
--header 'Authorization: Basic YWRtaW46YWRtaW4='
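
The repeated `labelSelector` and `fieldSelector` parameters in the request above are URL-encoded `key=value` pairs. As a minimal Python sketch, the same URL and Authorization header can be rebuilt from readable values. The variable names are illustrative, and the `admin:admin` credentials are simply what the Base64 value in the header decodes to.

```python
import base64
from urllib.parse import urlencode

BASE_URL = "http://localhost:8090/apis/api.console.halo.run/v1alpha1/posts"

# A list of tuples (rather than a dict) lets the same key appear several times.
params = [
    ("page", "1"),
    ("size", "20"),
    ("labelSelector", "content.halo.run/deleted=false"),
    ("labelSelector", "content.halo.run/published=true"),
    ("fieldSelector", "spec.categories=category-1"),
    ("fieldSelector", "spec.tags=c33ceabb-d8f1-4711-8991-bb8f5c92ad7c"),
    ("fieldSelector", "status.contributors=admin"),
    ("fieldSelector", "spec.visible=PUBLIC"),
]

# urlencode percent-encodes "/" as %2F and "=" as %3D inside each value.
query_string = urlencode(params)
url = f"{BASE_URL}?{query_string}"

# HTTP Basic authentication is "Basic " + base64("username:password").
auth_header = "Basic " + base64.b64encode(b"admin:admin").decode("ascii")

print(url)
print(auth_header)
```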

Before:

[Screenshot: SCR-20240612-o20]

After:

[Screenshot: SCR-20240612-q1c]

Does this PR introduce a user-facing change?

Refactor the query and sorting logic of the index mechanism, improving overall performance by more than 50%.

@f2c-ci-robot f2c-ci-robot bot added the kind/improvement, do-not-merge/work-in-progress, and release-note labels Jun 5, 2024
@f2c-ci-robot f2c-ci-robot bot added this to the 2.17.x milestone Jun 5, 2024
@f2c-ci-robot f2c-ci-robot bot added the area/core label Jun 5, 2024
@f2c-ci-robot f2c-ci-robot bot requested review from ruibaby and wan92hen June 5, 2024 08:06

codecov bot commented Jun 5, 2024

Codecov Report

Attention: Patch coverage is 34.34650% with 216 lines in your changes missing coverage. Please review.

Project coverage is 54.82%. Comparing base (5fdf6c0) to head (a10e352).
Report is 244 commits behind head on main.

Current head a10e352 differs from pull request most recent head 6cd622e

Please upload reports for the commit 6cd622e to get more accurate results.

Files                                                  Patch %  Lines
.../app/extension/index/query/QueryIndexViewImpl.java  0.00%    80 Missing ⚠️
...lo/app/extension/index/IndexEntryOperatorImpl.java  0.00%    49 Missing ⚠️
...halo/app/extension/index/query/StringEndsWith.java  0.00%    9 Missing ⚠️
...lo/app/extension/index/query/StringStartsWith.java  0.00%    9 Missing ⚠️
.../main/java/run/halo/app/extension/ListOptions.java  0.00%    8 Missing ⚠️
...halo/app/extension/index/query/StringContains.java  11.11%   8 Missing ⚠️
...ava/run/halo/app/extension/index/query/IsNull.java  0.00%    7 Missing ⚠️
...a/run/halo/app/extension/index/query/NotEqual.java  0.00%    7 Missing ⚠️
...run/halo/app/extension/index/query/EqualQuery.java  0.00%    5 Missing ⚠️
...lo/app/extension/index/query/GreaterThanQuery.java  0.00%    5 Missing ⚠️
... and 11 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6039      +/-   ##
============================================
- Coverage     56.91%   54.82%   -2.10%     
- Complexity     3319     3400      +81     
============================================
  Files           587      626      +39     
  Lines         18968    21032    +2064     
  Branches       1401     1474      +73     
============================================
+ Hits          10795    11530     +735     
- Misses         7594     8915    +1321     
- Partials        579      587       +8     

@guqing guqing force-pushed the refactor/index branch 4 times, most recently from 0596246 to 2aa7ddf Compare June 5, 2024 09:58
@guqing guqing force-pushed the refactor/index branch 3 times, most recently from d211c16 to e1296b9 Compare June 13, 2024 10:34

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.1% Duplication on New Code

See analysis details on SonarCloud

@guqing guqing marked this pull request as ready for review June 14, 2024 03:52
@f2c-ci-robot f2c-ci-robot bot removed the do-not-merge/work-in-progress label Jun 14, 2024
@f2c-ci-robot f2c-ci-robot bot requested a review from LIlGG June 14, 2024 03:52
Member

ruibaby commented Jun 17, 2024

ping @halo-dev/sig-halo

Member

@ruibaby ruibaby left a comment

/lgtm

@f2c-ci-robot f2c-ci-robot bot added the lgtm label Jun 18, 2024
@f2c-ci-robot f2c-ci-robot bot removed the lgtm label Jun 20, 2024
@guqing guqing requested a review from JohnNiang June 21, 2024 03:40
Member

@JohnNiang JohnNiang left a comment

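Local test scenario:

  1. 1,000 random posts
  2. 10 users, each concurrently accessing the home page (random page number) 100 times

The comparison results are shown in the two images below:

  • Before:

    telegram-cloud-document-5-6314228865791168803

  • After:

    telegram-cloud-document-5-6314228865791168804

Overall performance improved by about 50%.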
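
A scenario like the one above can be approximated with a small harness. The following Python sketch only shows the shape of such a test and is not the tool used for the measurements in this review: `fetch_page` is a placeholder that does not contact a server, and the page count assumes 20 posts per page, which is not stated above.

```python
import random
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

USERS = 10
REQUESTS_PER_USER = 100
TOTAL_PAGES = 50  # assumption: 1,000 posts at 20 posts per page


def fetch_page(page):
    # Placeholder for a real HTTP request to the home page; replace it with
    # an actual client call when measuring a running instance.
    time.sleep(0.0001)
    return page


def simulate_user(seed):
    # Each simulated user requests random pages and records per-request latency.
    rng = random.Random(seed)
    latencies = []
    for _ in range(REQUESTS_PER_USER):
        page = rng.randint(1, TOTAL_PAGES)
        start = time.perf_counter()
        fetch_page(page)
        latencies.append(time.perf_counter() - start)
    return latencies


# One worker thread per user so that all users issue requests concurrently.
with ThreadPoolExecutor(max_workers=USERS) as pool:
    per_user = list(pool.map(simulate_user, range(USERS)))

samples = [latency for user in per_user for latency in user]
print(f"requests: {len(samples)}")
print(f"mean latency: {statistics.mean(samples) * 1000:.3f} ms")
```

Running the same harness against builds before and after the change, with identical seeds, keeps the sequence of requested pages the same for both measurements.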

/approve

@f2c-ci-robot f2c-ci-robot bot added the approved label Jun 21, 2024
Member

@ruibaby ruibaby left a comment

/lgtm

@f2c-ci-robot f2c-ci-robot bot added the lgtm label Jun 21, 2024

f2c-ci-robot bot commented Jun 21, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JohnNiang, ruibaby

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@f2c-ci-robot f2c-ci-robot bot merged commit c10862d into halo-dev:main Jun 21, 2024
7 checks passed
@guqing guqing deleted the refactor/index branch June 21, 2024 08:05
@ruibaby ruibaby modified the milestones: 2.17.x, 2.17.0 Jun 21, 2024
Labels
approved, area/core, kind/improvement, lgtm, release-note

3 participants