Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions MANIFEST.in
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,4 @@ include .pre-commit-config.yaml
recursive-include awscli/examples *.rst *.txt
recursive-include awscli/data *.json
recursive-include awscli/topics *.rst *.json
prune awsclilinter
27 changes: 27 additions & 0 deletions awsclilinter/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
*.py[co]
*.DS_Store

__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/
.pytest_cache/

# Unit test / coverage reports
.coverage
htmlcov/

# Virtualenvs
.venv/
venv/
env/

# Keep lockfiles
!*.lock

# Pyenv
.python-version
12 changes: 12 additions & 0 deletions awsclilinter/LICENSE.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
Copyright 2025 Amazon.com, Inc. or its affiliates. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"). You
may not use this file except in compliance with the License. A copy of
the License is located at

http://aws.amazon.com/apache2.0/

or in the "license" file accompanying this file. This file is
distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
ANY KIND, either express or implied. See the License for the specific
language governing permissions and limitations under the License.
7 changes: 7 additions & 0 deletions awsclilinter/MANIFEST.in
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
include README.md
include requirements.txt
include requirements-dev.txt
include requirements-dev-lock.txt
recursive-include awsclilinter *.py
recursive-exclude tests *
recursive-exclude examples *
36 changes: 36 additions & 0 deletions awsclilinter/Makefile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the purpose of this Makefile. It looks like it can be largely replaced with modern Python dev tools like uv and ruff (also curious why we're using black and isort here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, I'll look into uv and ruff, and potentially delete this Makefile and update the README dev guide

Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
.PHONY: setup test format lint clean

setup:
@echo "Setting up AWS CLI Linter..."
@if [ ! -d "venv" ]; then \
echo "Creating virtual environment..."; \
python3.12 -m venv venv; \
fi
@echo "Installing dependencies..."
@. venv/bin/activate && pip install --upgrade pip
@if [ -f "requirements-dev-lock.txt" ]; then \
echo "Using lockfile for reproducible builds..."; \
. venv/bin/activate && pip install -r requirements-dev-lock.txt; \
else \
. venv/bin/activate && pip install -r requirements-dev.txt; \
fi
@echo "Installing package..."
@. venv/bin/activate && pip install -e .
@echo "Setup complete! Activate the virtual environment with: source venv/bin/activate"

test:
pytest tests/ -v

format:
black awsclilinter tests
isort awsclilinter tests

lint:
black --check awsclilinter tests
isort --check awsclilinter tests

clean:
rm -rf venv
rm -rf build dist *.egg-info
find . -type d -name __pycache__ -exec rm -rf {} +
find . -type f -name "*.pyc" -delete
74 changes: 74 additions & 0 deletions awsclilinter/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
# AWS CLI v1-to-v2 Upgrade Linter

A CLI tool that lints bash scripts for AWS CLI v1 usage and updates them to avoid breaking
changes introduced in AWS CLI v2. Not all breaking changes can be detected statically,
thus not all of them are supported by this tool.

For a full list of the breaking changes introduced with AWS CLI v2, see
[Breaking changes between AWS CLI version 1 and AWS CLI version 2](https://docs.aws.amazon.com/cli/latest/userguide/cliv2-migration-changes.html#cliv2-migration-changes-breaking).

## Installation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the idea was to publish this as a separate PyPI package? Why are we instructing users to build from source?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so this was moreso intended for the devs contributing to this package. I agree most customers would be installing it from PyPi. I can try to make that more clear in the next revision with adding clarifications to this README, let me know if you have additional asks for this


Run `make setup` to set up a virtual environment with the linter installed. Alternatively,
you can follow the steps below.

1. Create a virtual environment:
```bash
python3.12 -m venv venv
source venv/bin/activate
```

2. Install dependencies:
```bash
pip install -r requirements.txt
pip install -r requirements-dev-lock.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we anticipate users to require dev dependencies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my last comment, this workflow was intended for dev contributors (i.e. us). The dev dependencies are for PyTest and format packages to test and format the package.

```

3. Install the package in development mode:
```bash
pip install -e .
```

## Usage

### Dry-run mode (default)
Display issues without modifying the script:
```bash
upgrade-aws-cli --script upload_s3_files.sh
```

### Fix mode
Automatically update the input script:
```bash
upgrade-aws-cli --script upload_s3_files.sh --fix
```

### Output mode
Create a new fixed script without modifying the original:
```bash
upgrade-aws-cli --script upload_s3_files.sh --output upload_s3_files_v2.sh
```

### Interactive mode
Review and accept/reject each change individually:
```bash
upgrade-aws-cli --script upload_s3_files.sh --interactive --output upload_s3_files_v2.sh
```

In interactive mode, you can:
- Press `y` to accept the current change
- Press `n` to skip the current change
- Press `u` to accept all remaining changes
- Press `q` to cancel and quit

## Development

### Running tests
```bash
make test
```

### Code formatting
```bash
make format
```
1 change: 1 addition & 0 deletions awsclilinter/awsclilinter/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
__version__ = "1.0.0"
162 changes: 162 additions & 0 deletions awsclilinter/awsclilinter/cli.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,162 @@
import argparse
import sys
from pathlib import Path
from typing import List

from awsclilinter.linter import ScriptLinter
from awsclilinter.rules.base64_rule import Base64BinaryFormatRule
from awsclilinter.rules_base import LintFinding

# ANSI color codes
RED = "\033[31m"
GREEN = "\033[32m"
CYAN = "\033[36m"
RESET = "\033[0m"

# The number of lines to show before an after a fix suggestion, for context within the script
CONTEXT_SIZE = 3


def get_user_choice(prompt: str) -> str:
"""Get user input for interactive mode."""
while True:
choice = input(prompt).lower().strip()
if choice in ["y", "n", "u", "q"]:
return choice
print("Invalid choice. Please enter y, n, u, or q.")


def display_finding(finding: LintFinding, index: int, total: int, script_content: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that's confusing is that a line can show up as a diff even if nothing changed. Using the example file in this commit:

-aws secretsmanager put-secret-value --secret-id secret1213 \
-  --secret-binary file://data.json
+aws secretsmanager put-secret-value --secret-id secret1213 \
+  --secret-binary file://data.json --cli-binary-format raw-in-base64-out

Is this intended behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tried to keep the diff-generation code simple while also giving a somewhat familiar experience to git diff, and I think this behavior is an artifact of that.

We can pursue more complex diffing logic to address this. If we go down this route, I'd likely utilize difflib which is included in the standard Python library. It has a unified_diff function which may produce what we're looking for.

Below is some simple code I tested locally, seems to show the desired behavior:

>>> src_lines = 'aws secretsmanager put-secret-value --secret-id secret1213 \\\n  --secret-binary file://data.json\n'.splitlines(keepends=True)
>>> dest_lines = 'aws secretsmanager put-secret-value --secret-id secret1213 \\\n  --secret-binary file://data.json --cli-binary-format raw-in-base64-out\n'.splitlines(keepends=True)
>>> diff2 = difflib.unified_diff(src_lines, dest_lines)
>>> sys.stdout.writelines(diff2)
--- 
+++ 
@@ -1,2 +1,2 @@
 aws secretsmanager put-secret-value --secret-id secret1213 \
-  --secret-binary file://data.json
+  --secret-binary file://data.json --cli-binary-format raw-in-base64-out

I'm leaning towards going this route, please share feedback on this. It would likely simplify some of our diff-generation-code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that looks great for our use case

"""Display a finding to the user with context."""
src_lines = script_content.splitlines()
dest_lines = finding.edit.inserted_text.splitlines()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To calculate new_lines_added, is there a reason we're not using LintFinding.line_start and LintFinding.line_end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this relates to a previous reply in another comment. I left open the possibility that there could be more "new lines" than "source lines". line_start and line_end are the line numbers in the source file.

If we end up deciding to be more rigid and force that the number of "new lines" equals the number of "source lines" with an assertion, we may be able use the variables you suggested instead.

With this information, please advise on if you want me to proceed with the more rigid approach, with the assertion that source == new lines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echoing my other reply: Yeah let's go ahead and make the assertion

start_line = finding.line_start
end_line = finding.line_end
src_lines_removed = end_line - start_line + 1
new_lines_added = len(dest_lines)
Comment on lines +35 to +36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding everything correctly, then src_lines_removed should always be equal to new_lines_added, right? If so, should we add validation on that to guard against potentially corrupting the source file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, yes I expect them to be equal. By calculating it like this it does not lock us in to that as a requirement.

I can imagine down the line us suggesting a 'fix' that takes up multiple lines, in that case they would not be equal. However, we currently have no plans for any of the linting rules to suggest fixes that take up multiple lines. It'll pretty much always be adding a flag to the last line or modifying the command or argument.

I can add this assertion, and there's always the option to remove it if we ever have suggestions for adding multi-line suggestions. Are you recommending it's added as a runtime check? Or introducing a test for this?

I'm leaning towards a runtime check, and raising RuntimeError if they're not equal, with comments explaining some things in this thread. That'll be my plan for next revision unless you share disagreements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we should just stick to line-by-line changes (ie always 1 line changed per 1 line affected). Go ahead and add the runtime check


# Create a map from line numbers to their indices within the full script file
line_positions = []
pos = 0
for i, line in enumerate(src_lines):
line_positions.append((pos, pos + len(line)))
pos += len(line) + 1

# Get context lines
context_start = max(0, start_line - CONTEXT_SIZE)
context_end = min(len(src_lines), end_line + CONTEXT_SIZE + 1)
src_context_size = context_end - context_start
dest_context_size = src_context_size + (new_lines_added - src_lines_removed)

print(f"\n[{index}/{total}] {finding.rule_name}")
print(f"{finding.description}")
print(
f"\n{CYAN}@@ -{context_start + 1},{src_context_size} "
f"+{context_start + 1},{dest_context_size} @@{RESET}"
)

for i in range(context_start, context_end):
line = src_lines[i] if i < len(src_lines) else ""

if start_line <= i <= end_line:
# This line is being modified
print(f"{RED}-{line}{RESET}")

if i == end_line:
line_start_pos, _ = line_positions[i]
start_pos_in_line = max(0, finding.edit.start_pos - line_start_pos)
end_pos_in_line = min(len(line), finding.edit.end_pos - line_start_pos)
new_line = line[:start_pos_in_line] + finding.edit.inserted_text + line[end_pos_in_line:]
# In case the inserted text takes up multiple lines,
# inject a + at the start of each line.
new_line = new_line.replace("\n", "\n+")
# Print the new line suggestion.
print(f"{GREEN}+{new_line}{RESET}")
else:
# Context line
print(f"{line}")


def interactive_mode(findings: List[LintFinding], script_content: str) -> List[LintFinding]:
"""Run interactive mode and return accepted findings."""
accepted = []
for i, finding in enumerate(findings, 1):
display_finding(finding, i, len(findings), script_content)
choice = get_user_choice("\nApply this fix? [y]es, [n]o, [u]pdate all, [q]uit: ")

if choice == "y":
accepted.append(finding)
elif choice == "u":
accepted.extend(findings[i - 1 :])
break
elif choice == "q":
print("Cancelled.")
sys.exit(0)
Comment on lines +92 to +94
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a bit confusing if a user accepts a few fixes and then presses q to exit early with the expectation that the accepted fixes would be applied.

  • We should explicitly document this behavior.
  • [non-blocking suggestion] And also consider adding a [s]ave option that "flushes" accepted fixes and applies them. If you think we should do this, okay with doing it in a followup PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. Now that I think about this, I think we can let q be the save and exit option that you're suggesting. Users can always use Ctrl+C to terminate the program immediately.

Let me know if you agree with this. In either case, I would like to do it in a followup PR though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm let's discuss this with the team. I can see advantages to different approaches here


return accepted


def main():
"""Main entry point for the CLI tool."""
parser = argparse.ArgumentParser(
description="Lint and upgrade bash scripts from AWS CLI v1 to v2"
)
parser.add_argument("--script", required=True, help="Path to the bash script to lint")
parser.add_argument(
"--fix", action="store_true", help="Apply fixes to the script (modifies in place)"
)
parser.add_argument("--output", help="Output path for the fixed script")
parser.add_argument(
"-i",
"--interactive",
action="store_true",
help="Interactive mode to review each change",
)

args = parser.parse_args()

if args.fix and args.output:
print("Error: Cannot use both --fix and --output")
sys.exit(1)

if args.fix and args.interactive:
print("Error: Cannot use both --fix and --interactive")
sys.exit(1)

script_path = Path(args.script)
if not script_path.exists():
print(f"Error: Script not found: {args.script}")
sys.exit(1)

script_content = script_path.read_text()

rules = [Base64BinaryFormatRule()]
linter = ScriptLinter(rules)
findings = linter.lint(script_content)

if not findings:
print("No issues found.")
return

if args.interactive:
findings = interactive_mode(findings, script_content)
if not findings:
print("No changes accepted.")
return

if args.fix or args.output or args.interactive:
# Interactive mode is functionally equivalent to --fix, except the user
# can select a subset of the changes to apply.
fixed_content = linter.apply_fixes(script_content, findings)
output_path = Path(args.output) if args.output else script_path
output_path.write_text(fixed_content)
print(f"Fixed script written to: {output_path}")
else:
print(f"\nFound {len(findings)} issue(s):\n")
for i, finding in enumerate(findings, 1):
display_finding(finding, i, len(findings), script_content)
print("\n\nRun with --fix to apply changes or --interactive to review each change.")


if __name__ == "__main__":
main()
26 changes: 26 additions & 0 deletions awsclilinter/awsclilinter/linter.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
from typing import List

from ast_grep_py import SgRoot

from awsclilinter.rules_base import LintFinding, LintRule


class ScriptLinter:
"""Linter for bash scripts to detect AWS CLI v1 to v2 migration issues."""

def __init__(self, rules: List[LintRule]):
self.rules = rules

def lint(self, script_content: str) -> List[LintFinding]:
"""Lint the script and return all findings."""
root = SgRoot(script_content, "bash")
findings = []
for rule in self.rules:
findings.extend(rule.check(root))
return sorted(findings, key=lambda f: (f.line_start, f.line_end))

def apply_fixes(self, script_content: str, findings: List[LintFinding]) -> str:
"""Apply fixes to the script content."""
root = SgRoot(script_content, "bash")
node = root.root()
return node.commit_edits([f.edit for f in findings])
3 changes: 3 additions & 0 deletions awsclilinter/awsclilinter/rules/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
from awsclilinter.rules_base import LintFinding, LintRule

__all__ = ["LintRule", "LintFinding"]
Loading