Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace regex with winnow based parser #255

Merged
merged 8 commits into from
Jan 21, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Changed

- Regex based parser has been removed in favor of a `winnow` based parser combinator, the format is now more strict. See [SPEC.md](spec.md) for more details. ([#255](https://github.com/heroku/buildpacks-procfile/pull/255))
- Keys starting with spaces now emit a warning
- Underscore key characters (`_`) are now converted to hyphens (`-`) and emit a warning
- Uppercase key characters are now converted to lowercase and emit a warning
- Invalid keys now error, previously they were ignored

## [3.2.0] - 2024-12-20

### Changed
Expand Down
4 changes: 4 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,10 @@ the recommendations and requirements for how to best contribute to Heroku
Cloud Native Buildpacks. We strive to obey these as best as possible. As
always, thanks for contributing.

## CNB Procfile format

The format of a Procfile is defined in the [SPEC.md](SPEC.md) file. If parsing behavior differs between that description it is a bug and either the specification or parser must be updated.

## Governance Model: Salesforce Sponsored

The intent and goal of open sourcing this project is to increase the contributor
Expand Down
31 changes: 27 additions & 4 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -14,13 +14,14 @@ pedantic = { level = "warn", priority = -1 }
unwrap_used = "warn"

[dependencies]
annotate-snippets = "0.11.5"
bullet_stream = "0.3"
fs-err = "3"
indoc = "2"
libcnb = { version = "0.26", features = ["trace"] }
libherokubuildpack = { version = "0.26", default-features = false, features = ["error", "log"] }
linked-hash-map = "0.5"
regex = "1"
winnow = "0.6.22"

[dev-dependencies]
libcnb-test = "0.26"
41 changes: 41 additions & 0 deletions SPEC.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# CNB Procfile format specification
schneems marked this conversation as resolved.
Show resolved Hide resolved

This document outlines the format of a `Procfile`, which defines names to process types. For example:

```
web: rails s
worker: bundle exec sidekiq
```

## Differences from Heroku "classic" Procfile

The classic `Procfile` has no formal specification. It is loosely defined based on a regex `"^[[:space:]]*([a-zA-Z0-9_-]+):?\\s+(.*)[[:space:]]*`. This specification is informed by the CNB specification for process names and [kubernetes](https://github.com/heroku/buildpacks-procfile/issues/251).

## Specification

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

- Spaces
- The term spaces refers to non-newline whitespace characters such as tab `'\t'` or space `'\s'` characters.
- Each line MUST contain a comment, empty line, or key/value pair.
- Comments
- A line MAY contain a comment
- Comments MUST contain a `#` as the first visible character on each line.
- A comment MAY be proceeded by one or more spaces.
- Empty line
- A Procfile MAY contain empty lines
- An empty line MUST be zero or more spaces followed by a line ending or end of file (EOF).
- Key/Value pairs
- A line MAY contain a key/value pair where the key represents the name of a process and the value represents a command
- A key MUST be separated from its value by a colon (`:`) followed by zero or more spaces.
schneems marked this conversation as resolved.
Show resolved Hide resolved
- Duplicate keys MUST be allowed and the last entry MUST take precedence. A warning SHOULD be issued.
- Key
- A key's first and last character MUST be a lowercase alphanumeric (a-z0-9) character (but not `-`).
- All other key (middle) characters MUST be lowercase alphanumeric (a-z0-9) characters or hyphen `-`.
- Key length MUST be within the range `1..=63`
- An implementation MAY accept `_` as a middle character provided it converts it to `-` and issues a warning.
- An implementation MAY accept an uppercase character provided it is converted to lowercase characters and issues a warning.
- A key MAY be preceded with zero or more spaces provided they are not included in the return key and a warning is issued.
- Value
schneems marked this conversation as resolved.
Show resolved Hide resolved
- A value MUST contain 1 or more non-whitespace characters.
- A value MUST be terminated by a newline or EOF.
17 changes: 12 additions & 5 deletions src/error.rs
Original file line number Diff line number Diff line change
@@ -1,12 +1,12 @@
use crate::launch::ProcfileConversionError;
use crate::procfile::ProcfileParsingError;
use crate::procfile::ProcfileError;
use bullet_stream::Print;
use indoc::formatdoc;

#[derive(Debug)]
pub(crate) enum ProcfileBuildpackError {
CannotReadProcfileContents(std::io::Error),
ProcfileParsingError(ProcfileParsingError),
ProcfileParsingError(ProcfileError),
ProcfileConversionError(ProcfileConversionError),
}

Expand All @@ -23,9 +23,16 @@ pub(crate) fn error_handler(buildpack_error: ProcfileBuildpackError) {
Underlying cause was: {io_error}
"});
}
// There are currently no ways in which parsing can fail, however we will add some in the future:
// https://github.com/heroku/buildpacks-procfile/issues/73
ProcfileBuildpackError::ProcfileParsingError(parsing_error) => match parsing_error {},
ProcfileBuildpackError::ProcfileParsingError(parsing_error) =>
build_output.error(formatdoc! {"
Invalid Procfile format

The provided `Procfile` contains an invalid format and the buildpack cannot continue.

To fix this problem please correct the following error and commit the results to git:

{parsing_error}
"}),
ProcfileBuildpackError::ProcfileConversionError(conversion_error) => match conversion_error
{
ProcfileConversionError::InvalidProcessType(libcnb_error) => {
Expand Down
13 changes: 7 additions & 6 deletions src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -43,12 +43,13 @@ impl Buildpack for ProcfileBuildpack {
.map_err(ProcfileBuildpackError::ProcfileParsingError)
})?;

if procfile.is_empty() {
bullet = bullet.sub_bullet("Empty file, no processes defined");
} else {
for (name, command) in &procfile.processes {
bullet = bullet.sub_bullet(format!("{name}: {}", style::command(command)));
}
let warning_prefix = style::important("WARNING:");
for message in &procfile.warnings {
bullet = bullet.sub_bullet(format!("{warning_prefix} {message}"));
}

for (name, command) in &procfile.processes {
bullet = bullet.sub_bullet(format!("{name}: {cmd}", cmd = style::command(command)));
}
bullet.done().done();

Expand Down
Loading