Lockfile Statistics Analysis - 2026-02-22 #17756
Closed
This discussion was automatically closed because it expired on 2026-02-23T16:48:15.271Z.
Analysis of all 158 .lock.yml files in .github/workflows/ as of 2026-02-22. Compared to the previous run (2026-02-21), the repository gained 1 new workflow (157 → 158 files). The aggregate lock file corpus now totals ~10.2 MB and reveals a mature, highly consistent agentic workflow platform.
Summary
File Size Distribution
Size extremes:
- Smallest: codex-github-remote-mcp-test.lock.yml (24.4 KB)
- Largest: smoke-claude.lock.yml (144.2 KB)
The tight clustering in the 50–100 KB band (93% of files) reflects the standardized, templated nature of the platform's generated lock files. The 4 outliers above 100 KB are smoke-test workflows that include multiple engine jobs in a single file.
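The banding described above can be sketched roughly as follows. This is a minimal illustration, not the report's actual analysis script: the function names and the exact band edges (for example, whether 50 KB and 100 KB are inclusive) are assumptions.

```python
from collections import Counter


def size_band(size_kb: float) -> str:
    """Map a file size in KB to a band label (band edges are assumed)."""
    if size_kb < 10:
        return "<10 KB"
    if size_kb < 50:
        return "10-50 KB"
    if size_kb <= 100:
        return "50-100 KB"
    return ">100 KB"


def band_shares(sizes_kb: list[float]) -> dict[str, tuple[int, float]]:
    """Return {band: (file count, percent of all files)}."""
    counts = Counter(size_band(s) for s in sizes_kb)
    total = len(sizes_kb)
    if total == 0:
        return {}
    return {band: (n, round(100 * n / total, 1)) for band, n in counts.items()}


# The two size extremes reported above fall in different bands.
print(size_band(24.4))   # smallest file
print(size_band(144.2))  # largest file
```

Applied to the reported band counts (7, 147, and 4 files out of 158), the same percentage arithmetic gives about 93% for the 50–100 KB band.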
Trigger Analysis
Most Popular Triggers
- workflow_dispatch
- schedule
- pull_request
- issue_comment
- issues
- pull_request_review_comment
- discussion_comment
- discussion
- workflow_run
- push
Common Trigger Combinations
- schedule + workflow_dispatch
- workflow_dispatch only
- pull_request + schedule + workflow_dispatch
- pull_request + workflow_dispatch
- discussion + discussion_comment + issue_comment + issues + pull_request + pull_request_review_comment
- issues only
- issue_comment + issues + pull_request
- issue_comment only
The dominant pattern (68% of all workflows) is schedule + workflow_dispatch: autonomous scheduled agents that also support manual re-runs.
Schedule Patterns (Top 10 Most Common Cron Expressions)
- 0 14 * * 1-5
- 0 13 * * 1-5
- 0 11 * * 1-5
- 0 9 * * 1-5
- 0 */6 * * *
- 0 15 * * 1-5
- 0 10 * * 1-5
- 0 16 * * 1-5
- 12 9 * * *
- 18 17 * * *
The strong preference for weekday-only schedules (Mon–Fri) indicates these are business-oriented workflows tied to team working hours. Schedules cluster in the 09:00–16:00 UTC window, suggesting US/EU business hour alignment. Total distinct schedule expressions: 101 across 117 scheduled workflows.
Safe Outputs Analysis
Output Types Distribution
- missing_data
- missing_tool
- create_discussion
- create_issue
- add_comment
- create_pull_request
- add_labels
- create_pull_request_review_comment
- update_issue
- push_to_pull_request_branch
- close_discussion
- submit_pull_request_review
- remove_labels
- close_pull_request
- link_sub_issue
- dispatch_workflow
- hide_comment
- update_pull_request
- create_code_scanning_alert
- create_agent_session
- close_issue
- create_project_status_update
- update_project
- assign_to_user
- update_release
- add_reviewer
- resolve_pull_request_review_thread
- unassign_from_user
Total distinct output types in use: 28. missing_data and missing_tool appear in 95.6% of workflows as universal error-handling primitives.
Discussion Categories
- audits
- announcements
- reports
- artifacts
- dev
- research
- NO_CATEGORY (omitted)
- agent-research
- daily-news
- security
audits is the dominant discussion category (72% of discussion-producing workflows), reflecting this repository's primary use as an operational monitoring and reporting platform.
Multi-Output Workflows (52 workflows use 2+ action types)
Example workflows combining multiple safe output types:
- agent-performance-analyzer: add_comment + create_discussion + create_issue
- ai-moderator: add_labels + hide_comment
- auto-triage-issues: add_labels + create_discussion
- bot-detection: create_issue + update_issue
- changeset: push_to_pull_request_branch + update_pull_request
- ci-doctor: add_comment + create_issue + update_issue
- cloclo: add_comment + create_pull_request
- code-scanning-fixer: add_labels + create_pull_request
- contribution-check: add_comment + add_labels + create_issue
- craft: add_comment + push_to_pull_request_branch
Structural Characteristics
Job Complexity
Most Common Job Names
- activation
- agent
- conclusion
- safe_outputs
- detection
- update_cache_memory
- pre_activation
- upload_assets
- push_repo_memory
All 158 workflows share the activation and agent jobs, the universal skeleton of the gh-aw platform.
Step Complexity
- Fewest steps: codex-github-remote-mcp-test, example-*, firewall, test-workflow
- Most steps: daily-copilot-token-report
Top 10 Most Complex Workflows (by step count)
1. daily-copilot-token-report
2. audit-workflows
3. deep-report
4. copilot-pr-nlp-analysis
5. smoke-claude
6. smoke-copilot-arm
7. smoke-copilot
8. unbloat-docs
9. copilot-session-insights
10. daily-news
Typical Lock File Structure
A representative gh-aw lock file has:
- Triggers: schedule + workflow_dispatch
- Permissions: contents: read, issues: write, discussions: write
Permission Patterns
Permission Frequency (across all job permission blocks)
- contents: read
- issues: write
- discussions: write
- contents: write
- issues: read
- pull-requests: write
- pull-requests: read
- actions: read
- discussions: read
- security-events: read
- actions: write
- security-events: write
Workflow-Level Permission Distribution
- contents: read (all jobs)
- issues: write
- discussions: write
- contents: write
- actions: read
- pull-requests: write
All 158 workflows set top-level permissions: {} (empty) and grant granular permissions only to individual jobs, a security best practice.
Engine Distribution
Detected via concurrency group naming conventions (gh-aw-copilot-*, gh-aw-claude-*, gh-aw-codex-*). Copilot is the dominant engine for the majority of scheduled workflows.
Tool and MCP Patterns
MCP Server Usage
- safeoutputs
- github-remote
- brave-search
The safeoutputs MCP server is effectively universal (95.6%), functioning as the platform's write-action gateway. Only 2 workflows use the experimental github-remote MCP server or brave-search.
Runner Distribution
- ubuntu-slim
- ubuntu-latest
- ubuntu-24.04-arm
ubuntu-slim is used more frequently overall, but ubuntu-latest is common for jobs requiring full toolchain support. One workflow tests on ARM architecture.
Timeout Patterns
Interesting Findings
- Universal skeleton: Every single one of the 158 workflows contains activation and agent jobs. The gh-aw platform enforces a rigidly standardized structure, making each lock file a parameterized instance of a common template.
- 28 distinct safe output types: The platform has grown a rich vocabulary of 28 action types. Beyond the ubiquitous error primitives (missing_data, missing_tool), create_discussion (60 workflows) and create_issue (47 workflows) are the primary communication channels agents use to surface findings.
- Business-hour scheduling bias: Among the 117 scheduled workflows, the majority run during UTC business hours (09:00–17:00) on weekdays. This reflects human review cycles: agents produce reports timed for team morning check-ins.
- Security hygiene: All 158 workflows use the permissions: {} + job-level grant pattern (principle of least privilege), and all include firewall/awf logging and artifact upload steps, indicating consistent security hardening across the entire platform.
- Memory and persistence infrastructure: 68 workflows (43%) include an update_cache_memory job, and 21 (13%) have a push_repo_memory job. This suggests nearly half of all workflows maintain persistent agent memory across runs.
- Copilot dominance but multi-engine design: Copilot powers 52.5% of workflows, Claude handles 21.5%, and Codex 5.1%. The remaining ~21% are engine-ambiguous. The platform's concurrency group abstraction cleanly isolates engine choice from workflow structure.
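Engine attribution from concurrency group names could look roughly like the sketch below. It assumes the engine name appears as a gh-aw-{engine}- prefix at the very start of the group string. The function names and the "unknown" label are hypothetical, not taken from the report's analysis script.

```python
import re
from collections import Counter

# Assumed convention: the group starts with gh-aw-<engine>- for known engines.
ENGINE_PREFIX = re.compile(r"^gh-aw-(copilot|claude|codex)-")


def detect_engine(concurrency_group: str) -> str:
    """Return the engine named in the group prefix, or "unknown"."""
    match = ENGINE_PREFIX.match(concurrency_group)
    return match.group(1) if match else "unknown"


def engine_shares(groups: list[str]) -> dict[str, float]:
    """Percent of workflows per detected engine, rounded to one decimal."""
    counts = Counter(detect_engine(g) for g in groups)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {engine: round(100 * n / total, 1) for engine, n in counts.items()}


print(detect_engine("gh-aw-claude-${{ github.workflow }}"))
print(detect_engine("gh-aw-${{ github.workflow }}"))
```

Groups without a recognized prefix fall into "unknown", which is how an engine-ambiguous bucket would arise under this convention.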
Historical Trends
Comparing with the previous analysis run (2026-02-21):
- schedule trigger
- workflow_dispatch
Growth over this one-day interval was incremental (+1 workflow). Trigger distribution ratios are stable, indicating the new workflow follows established platform patterns.
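A run-over-run comparison like the one above can be sketched as a simple per-metric difference between two snapshots. The snapshot layout here is an assumption: the metric name total_workflows and the flat dictionary shape are hypothetical, since the schema of the history JSON files is not shown.

```python
def snapshot_delta(previous: dict, current: dict) -> dict:
    """Return current minus previous for every numeric metric in either snapshot.

    Metrics missing from one snapshot are treated as 0 there.
    """
    deltas = {}
    for key in sorted(set(previous) | set(current)):
        before = previous.get(key, 0)
        after = current.get(key, 0)
        if isinstance(before, (int, float)) and isinstance(after, (int, float)):
            deltas[key] = after - before
    return deltas


# Hypothetical snapshots using the only totals stated in the report.
previous_run = {"total_workflows": 157}
current_run = {"total_workflows": 158}
print(snapshot_delta(previous_run, current_run))
```

In practice the two dictionaries would be loaded with json.load from the previous and current history files.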
Recommendations
- Investigate the 7 files in the 10–50 KB range: These are significantly smaller than the 64.5 KB average and may represent incomplete or minimal workflows that could benefit from full platform feature adoption (cache memory, repo memory, etc.).
- Clarify the 33 "unknown/other" engine workflows: These workflows lack the standard gh-aw-{engine}- concurrency group prefix. Consider standardizing the naming convention to make engine attribution unambiguous.
- Review the 2 create_discussion configs without a category: Discussions without an explicit category fall back to repository defaults, which may route them to unintended categories. Setting explicit categories improves discoverability.
- Diversify scheduling patterns: With 101 distinct schedule expressions but many clustering at similar times (09:00–16:00 UTC weekdays), consider staggering more workflows to reduce simultaneous runner demand and improve signal/noise for on-call teams.
- Consider formalizing the 28 safe output types: With 28 distinct output types now in use (including rarely-used ones like assign_to_user, unassign_from_user, update_release), a curated registry or documentation page would help workflow authors discover available capabilities.
Methodology
- Tools: yq for job structure queries
- Scope: .github/workflows/*.lock.yml
- This run: /tmp/gh-aw/cache-memory/history/2026-02-22.json; analysis script in /tmp/gh-aw/cache-memory/scripts/full_analysis.py
- Previous run: /tmp/gh-aw/cache-memory/history/2026-02-21.json