Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

files: percent-decode url path #2398

Merged
merged 14 commits into from
Jan 4, 2022

Conversation

aliemjay
Copy link
Member

@aliemjay aliemjay commented Oct 11, 2021

PR Type

Bug fix

PR Checklist

  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • A changelog entry has been made for the appropriate packages.
  • Format code with the latest stable rustfmt.
  • (Team) Label with affected crates and semver status.

Overview

re.match_info().path() may still have some percent-encoded chars.

This is partly a regression after upgrading actix-router because of actix/actix-net#357

@aliemjay aliemjay force-pushed the files/fix/percent-encoding branch from 408c62d to 941e241 Compare October 11, 2021 06:00
@aliemjay aliemjay marked this pull request as draft October 11, 2021 06:01
@aliemjay aliemjay force-pushed the files/fix/percent-encoding branch from 941e241 to cc3cc21 Compare October 11, 2021 06:05
@aliemjay aliemjay marked this pull request as ready for review October 11, 2021 06:35
@aliemjay aliemjay requested a review from robjtede October 11, 2021 06:36
@aliemjay aliemjay force-pushed the files/fix/percent-encoding branch from 1e4f084 to abb8ded Compare October 12, 2021 02:33
@svenstaro

This comment has been minimized.

@robjtede

This comment has been minimized.

@robjtede robjtede added this to the actix-web v4 milestone Nov 25, 2021
@robjtede robjtede added A-files project: actix-files B-semver-major breaking change requiring a major version bump labels Dec 6, 2021
@aliemjay

This comment has been minimized.

@aliemjay aliemjay requested a review from a team December 29, 2021 12:36
@robjtede robjtede removed the B-semver-major breaking change requiring a major version bump label Jan 4, 2022
@robjtede
Copy link
Member

robjtede commented Jan 4, 2022

After some thought about this, I think the best way to move this forward is to just decode the remaining chars we want to, excluding /. I just don't see a good reason to allow decoding this character and the security doubts make it a good choice to exclude in my opinion.

Since percent-encoding can't target character sets for decoding, we can use the Quoter from actix-router.

Thoughts on this patch @aliemjay ?

diff --git a/actix-files/Cargo.toml b/actix-files/Cargo.toml
index 745e3afee..8b7a83f96 100644
--- a/actix-files/Cargo.toml
+++ b/actix-files/Cargo.toml
@@ -25,6 +25,7 @@ experimental-io-uring = ["actix-web/experimental-io-uring", "tokio-uring"]
 actix-http = "3.0.0-beta.17"
 actix-service = "2"
 actix-utils = "3"
+actix-router = "0.5.0-beta.3"
 actix-web = { version = "4.0.0-beta.18", default-features = false }
 
 askama_escape = "0.10"
diff --git a/actix-files/src/error.rs b/actix-files/src/error.rs
index 242eda4be..a7da6452e 100644
--- a/actix-files/src/error.rs
+++ b/actix-files/src/error.rs
@@ -27,15 +27,14 @@ pub enum UriSegmentError {
     /// The segment started with the wrapped invalid character.
     #[display(fmt = "The segment started with the wrapped invalid character")]
     BadStart(char),
+
     /// The segment contained the wrapped invalid character.
     #[display(fmt = "The segment contained the wrapped invalid character")]
     BadChar(char),
+
     /// The segment ended with the wrapped invalid character.
     #[display(fmt = "The segment ended with the wrapped invalid character")]
     BadEnd(char),
-    /// The path is not a valid UTF-8 string after doing percent decoding.
-    #[display(fmt = "The path is not a valif UTF-8 string after percent-decoding")]
-    NotValidUtf8,
 }
 
 /// Return `BadRequest` for `UriSegmentError`
diff --git a/actix-files/src/lib.rs b/actix-files/src/lib.rs
index 5d141d6ea..a32e19d2a 100644
--- a/actix-files/src/lib.rs
+++ b/actix-files/src/lib.rs
@@ -809,7 +809,7 @@ mod tests {
             .uri("/test/%2F..%2F..%2Ftests%2Ftest.binary")
             .to_request();
         let res = test::call_service(&srv, req).await;
-        assert_eq!(res.status(), StatusCode::OK);
+        assert_eq!(res.status(), StatusCode::NOT_FOUND);
 
         let req = TestRequest::get().uri("/test/Cargo.toml%00").to_request();
         let res = test::call_service(&srv, req).await;
diff --git a/actix-files/src/path_buf.rs b/actix-files/src/path_buf.rs
index e46d3f232..2e6a68ab0 100644
--- a/actix-files/src/path_buf.rs
+++ b/actix-files/src/path_buf.rs
@@ -3,11 +3,17 @@ use std::{
     str::FromStr,
 };
 
+use actix_router::Quoter;
 use actix_utils::future::{ready, Ready};
 use actix_web::{dev::Payload, FromRequest, HttpRequest};
 
 use crate::error::UriSegmentError;
 
+thread_local! {
+    /// Replaces remaining reserved characters except `%2F` (forward slash "/").
+    static FILENAME_QUOTER: Quoter = Quoter::new(b"%+", b"/");
+}
+
 #[derive(Debug, PartialEq, Eq)]
 pub(crate) struct PathBufWrap(PathBuf);
 
@@ -26,10 +32,6 @@ impl PathBufWrap {
     pub fn parse_path(path: &str, hidden_files: bool) -> Result<Self, UriSegmentError> {
         let mut buf = PathBuf::new();
 
-        let path = percent_encoding::percent_decode_str(path)
-            .decode_utf8()
-            .map_err(|_| UriSegmentError::NotValidUtf8)?;
-
         for segment in path.split('/') {
             if segment == ".." {
                 buf.pop();
@@ -48,7 +50,11 @@ impl PathBufWrap {
             } else if cfg!(windows) && segment.contains('\\') {
                 return Err(UriSegmentError::BadChar('\\'));
             } else {
-                buf.push(segment)
+                let decoded_segment = FILENAME_QUOTER
+                    .with(|q| q.requote(segment.as_bytes()))
+                    .unwrap_or_else(|| segment.to_owned());
+
+                buf.push(decoded_segment)
             }
         }

@aliemjay
Copy link
Member Author

aliemjay commented Jan 4, 2022

We need then to ouright reject these paths. Leaving some encoded chars seems wrong to me.

Will look into it later today.

@robjtede
Copy link
Member

robjtede commented Jan 4, 2022

They are being rejected with the 404. The only edge case left is if the fs really does contain a file with %2F in the name. And if so, you're 1) asking for trouble 2) possibly really actually want to serve such a stupid file.

IMO, this is such a niche case that it is not worth the performance hit of re-scanning the decoded segment for "%2F". Plus, it's not really a security problem to leave it.

@aliemjay aliemjay requested a review from robjtede January 4, 2022 12:13
Copy link
Member

@robjtede robjtede left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, this approach is fine

// equivalent to `path.split('/').count()`
let mut segment_count = path.matches('/').count() + 1;

let path = percent_encoding::percent_decode_str(path)
Copy link
Member

@robjtede robjtede Jan 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a comment in here about why the "full" decode is acceptable

@aliemjay
Copy link
Member Author

aliemjay commented Jan 4, 2022

Files with %2F in the name should be already encoded to %252F, so this shouldn't be a
problem.

I also added some code to double-check the resulting path using std::path::Path because of Windows, which seems to have many edge cases.

One Idea I have now is to expose PathBufWrap (under a better name ofc).

@robjtede robjtede merged commit 374dc9b into actix:master Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-files project: actix-files B-semver-patch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants