Stripping base path from CLI paths (without CLI refactor) #11617

Closed
1 change: 1 addition & 0 deletions Changelog.md
@@ -5,6 +5,7 @@ Language Features:

Compiler Features:
* AssemblyStack: Also run opcode-based optimizer when compiling Yul code.
* Commandline Interface: Normalize paths specified on the command line and make them relative whenever files are located inside base path.
* Yul EVM Code Transform: Do not reuse stack slots that immediately become unreachable.
* Yul EVM Code Transform: Also pop unused argument slots for functions without return variables (under the same restrictions as for functions with return variables).
* Yul Optimizer: Move function arguments and return variables to memory with the experimental Stack Limit Evader (which is not enabled by default).
48 changes: 45 additions & 3 deletions docs/path-resolution.rst
@@ -71,8 +71,10 @@ The initial content of the VFS depends on how you invoke the compiler:

solc contract.sol /usr/local/dapp-bin/token.sol

The source unit name of a file loaded this way is simply the specified path after shell expansion
and with platform-specific separators converted to forward slashes.
The source unit name of a file loaded this way is constructed by converting its path to a
canonical form and making it relative to the base path if it is located inside.
See :ref:`Base Path Normalization and Stripping <base-path-normalization-and-stripping>` for
a detailed description of this process.

.. index:: standard JSON

@@ -313,6 +315,46 @@ interpreted as absolute paths on disk.
If the base path itself is relative, it is also interpreted as relative to the current working
directory of the compiler.

.. _base-path-normalization-and-stripping:

Base Path Normalization and Stripping
-------------------------------------

When source file paths are specified on the command line, the base path affects the source unit
names assigned to them in the compiler's VFS.
To compute the names, both base path and source file paths must first be converted to a canonical form.
This ensures that the result is predictable and as platform-independent as possible:

- If a path is relative, it is made absolute by prepending the current working directory to it.

- If the path to the working directory contains symbolic links, they are resolved into actual
directories.

- Internal ``.`` and ``..`` segments are collapsed.
- Platform-specific path separators are replaced with forward slashes.
- Sequences of multiple consecutive path separators are squashed into a single separator (unless
they are the leading slashes of a `UNC path <https://en.wikipedia.org/wiki/Path_(computing)#UNC>`_).
- If the path includes a root name (e.g. a drive letter on Windows) and the root is the same as the
root of the current working directory, the root is replaced with ``/``.
- Symbolic links in the path itself are **not** resolved.
- The original case of the path is preserved even if the filesystem is case-insensitive but
`case-preserving <https://en.wikipedia.org/wiki/Case_preservation>`_ and the actual case on
disk is different.

.. note::

There are situations where paths cannot be made platform-independent.
For example on Windows the compiler can avoid using drive letters by referring to the root
directory of the current drive as ``/`` but drive letters are still necessary for paths leading
to other drives.
You can avoid such situations by ensuring that all the files are available within a single
directory tree on the same drive.

Once canonicalized, the base path is stripped from all source file paths that start with it.
If the base path is empty (e.g. if it is not explicitly provided), it is treated as if it was equal
to the path to the current working directory with all symbolic links resolved.
The result becomes the source unit name.
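
The stripping step can be sketched in a few lines. This is a simplified illustration rather than the compiler's implementation: it uses C++17 ``std::filesystem`` in place of Boost, assumes that both paths already start at the filesystem root, and skips working directory resolution and root name handling. The name ``stripBasePath`` is hypothetical.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the compiler): returns _file relative to _base when
// _file is located inside _base, and the normalized _file otherwise.
// Assumption: both paths already start at the filesystem root.
fs::path stripBasePath(fs::path const& _base, fs::path const& _file)
{
	assert(_base.has_root_directory() && _file.has_root_directory());

	// Collapse "." and ".." segments and repeated separators without touching the disk.
	fs::path base = _base.lexically_normal();
	fs::path file = _file.lexically_normal();

	// A normalized directory path may end with a slash (empty file name). Drop it so
	// that "/project/" and "/project" behave the same.
	if (!base.has_filename())
		base = base.parent_path();

	fs::path relative = file.lexically_relative(base);

	// An empty result or a leading ".." means that the file is outside the base path.
	if (relative.empty() || *relative.begin() == "..")
		return file;

	return relative;
}
```

With a base path of ``/project``, ``stripBasePath("/project", "/project/contract.sol")`` yields ``contract.sol``, while a file outside the base path such as ``/contracts/token.sol`` keeps its full path.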

.. index:: ! remapping; import, ! import; remapping, ! remapping; context, ! remapping; prefix, ! remapping; target
.. _import-remapping:

@@ -414,7 +456,7 @@ Here are the detailed rules governing the behaviour of remappings:

.. code-block:: bash

solc /project/=/contracts/ /project/contract.sol --base-path /project # source unit name: /project/contract.sol
solc /project/=/contracts/ /project/contract.sol --base-path /project # source unit name: contract.sol

.. code-block:: solidity
:caption: /project/contract.sol
147 changes: 146 additions & 1 deletion libsolidity/interface/FileReader.cpp
@@ -22,6 +22,8 @@
#include <libsolutil/CommonIO.h>
#include <libsolutil/Exceptions.h>

#include <boost/algorithm/string/predicate.hpp>

using solidity::frontend::ReadCallback;
using solidity::langutil::InternalCompilerError;
using solidity::util::errinfo_comment;
@@ -31,9 +33,17 @@ using std::string;
namespace solidity::frontend
{

void FileReader::setBasePath(boost::filesystem::path const& _path)
{
m_basePath = (_path.empty() ? "" : normalizeCLIPathForVFS(_path));
}

void FileReader::setSource(boost::filesystem::path const& _path, SourceCode _source)
{
m_sourceCodes[_path.generic_string()] = std::move(_source);
boost::filesystem::path normalizedPath = normalizeCLIPathForVFS(_path);
boost::filesystem::path prefix = (m_basePath.empty() ? normalizeCLIPathForVFS(".") : m_basePath);
Member

Can remove the parentheses.

Member Author

I like it better with parentheses and I always use them when assigning a ternary expression. I find this more readable because they group things visually. Without them the left-hand side blends a bit too much with the rest of the expression.


m_sourceCodes[stripPrefixIfPresent(prefix, normalizedPath).generic_string()] = std::move(_source);
}

void FileReader::setSources(StringMap _sources)
@@ -92,5 +102,140 @@ ReadCallback::Result FileReader::readFile(string const& _kind, string const& _so
}
}

boost::filesystem::path FileReader::normalizeCLIPathForVFS(boost::filesystem::path const& _path)
{
// Detailed normalization rules:
// - Makes the path either be absolute or have slash as root (note that on Windows paths with
// slash as root are not considered absolute by Boost). If it is empty, it becomes
// the current working directory.
// - Collapses redundant . and .. segments.
// - Removes leading .. segments from an absolute path (i.e. /../../ becomes just /).
// - Squashes sequences of multiple path separators into one.
// - Ensures that forward slashes are used as path separators on all platforms.
// - Removes the root name (e.g. drive letter on Windows) when it matches the root name in the
// path to the current working directory.
//
// Also note that this function:
// - Does NOT resolve symlinks (except for symlinks in the path to the current working directory).
// - Does NOT check if the path refers to a file or a directory. If the path ends with a slash,
// the slash is preserved even if it's a file.
// - The only exceptions are paths where the file name is a dot (e.g. '.' or 'a/b/.'). These
// always have a trailing slash after normalization.
// - Preserves case. Even if the filesystem is case-insensitive but case-preserving and the
// case differs, the actual case from disk is NOT detected.

boost::filesystem::path canonicalWorkDir = boost::filesystem::weakly_canonical(boost::filesystem::current_path());

// NOTE: On UNIX systems the path returned from current_path() has symlinks resolved while on
// Windows it does not. To get consistent results we resolve them on all platforms.
boost::filesystem::path absolutePath = boost::filesystem::absolute(_path, canonicalWorkDir);

// NOTE: boost path preserves certain differences that are ignored by its operator ==.
// E.g. "a//b" vs "a/b" or "a/b/" vs "a/b/.". lexically_normal() does remove these differences.
boost::filesystem::path normalizedPath = absolutePath.lexically_normal();
solAssert(normalizedPath.is_absolute() || normalizedPath.root_path() == "/", "");

// If the path is on the same drive as the working dir, for portability we prefer not to
// include the root name. Do this only for non-UNC paths - my experiments show that on Windows
// when the working dir is a UNC path, / does not actually refer to the root of the UNC path.
boost::filesystem::path normalizedRootPath = normalizedPath.root_path();
if (!isUNCPath(normalizedPath))
{
boost::filesystem::path workingDirRootPath = canonicalWorkDir.root_path();
if (normalizedRootPath == workingDirRootPath)
normalizedRootPath = "/";
}

// lexically_normal() will not squash paths like "/../../" into "/". We have to do it manually.
boost::filesystem::path dotDotPrefix = absoluteDotDotPrefix(normalizedPath);

boost::filesystem::path normalizedPathNoDotDot = normalizedPath;
if (dotDotPrefix.empty())
normalizedPathNoDotDot = normalizedRootPath / normalizedPath.relative_path();
else
normalizedPathNoDotDot = normalizedRootPath / normalizedPath.lexically_relative(normalizedPath.root_path() / dotDotPrefix);
solAssert(!hasDotDotSegments(normalizedPathNoDotDot), "");

// NOTE: On Windows lexically_normal() converts all separators to forward slashes. Convert them back.
// Separators do not affect path comparison but remain in internal representation returned by native().
// This will also normalize the root name to start with // in UNC paths.
normalizedPathNoDotDot = normalizedPathNoDotDot.generic_string();

// For some reason boost considers "/." different than "/" even though for other directories
// the trailing dot is ignored.
if (normalizedPathNoDotDot == "/.")
return "/";

return normalizedPathNoDotDot;
}

bool FileReader::isPathPrefix(boost::filesystem::path _prefix, boost::filesystem::path const& _path)
{
solAssert(!_prefix.empty() && !_path.empty(), "");
// NOTE: On Windows paths starting with a slash (rather than a drive letter) are considered relative by boost.
solAssert(_prefix.is_absolute() || isUNCPath(_prefix) || _prefix.root_path() == "/", "");
solAssert(_path.is_absolute() || isUNCPath(_path) || _path.root_path() == "/", "");
solAssert(_prefix == _prefix.lexically_normal() && _path == _path.lexically_normal(), "");
solAssert(!hasDotDotSegments(_prefix) && !hasDotDotSegments(_path), "");

// Before 1.72.0 lexically_relative() was not handling paths with empty, dot and dot dot segments
// correctly (see https://github.com/boostorg/filesystem/issues/76). The only case where this
// is possible after our normalization is a directory name ending in a slash (filename is a dot).
if (_prefix.filename_is_dot())
_prefix.remove_filename();

boost::filesystem::path strippedPath = _path.lexically_relative(_prefix);
return !strippedPath.empty() && *strippedPath.begin() != "..";
}

boost::filesystem::path FileReader::stripPrefixIfPresent(boost::filesystem::path _prefix, boost::filesystem::path const& _path)
{
if (!isPathPrefix(_prefix, _path))
return _path;

if (_prefix.filename_is_dot())
_prefix.remove_filename();

boost::filesystem::path strippedPath = _path.lexically_relative(_prefix);
solAssert(strippedPath.empty() || *strippedPath.begin() != "..", "");
Member

So each element of strippedPath is a string?

Member Author

It's actually a path, but strings are implicitly convertible to that type.

Boost lets you iterate over path segments while skipping separators. E.g. in /a/bc/def.sol you can iterate over a, bc and def.sol.
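
To illustrate the point about segment iteration, here is a minimal sketch. It uses C++17 `std::filesystem` as a stand-in for `boost::filesystem` (an assumption; the PR itself uses Boost), and the helper names `pathSegments` and `firstSegment` are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Collects the segments of a path in iteration order. The root is skipped via
// relative_path() because iterating over a rooted path also yields the root itself.
std::vector<std::string> pathSegments(fs::path const& _path)
{
	std::vector<std::string> segments;
	for (fs::path const& segment: _path.relative_path())
		segments.push_back(segment.generic_string());
	return segments;
}

// Returns the first segment of a non-empty path. The element type is a path,
// so the result can be compared against a string literal such as "..".
fs::path firstSegment(fs::path const& _path)
{
	assert(!_path.empty());
	return *_path.begin();
}
```

A check such as `firstSegment(strippedPath) == ".."` compiles because the string literal is implicitly converted to a path before the comparison.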

return strippedPath;
}

boost::filesystem::path FileReader::absoluteDotDotPrefix(boost::filesystem::path const& _path)
{
solAssert(_path.is_absolute() || _path.root_path() == "/", "");

boost::filesystem::path _pathWithoutRoot = _path.relative_path();
boost::filesystem::path prefix;
for (boost::filesystem::path const& segment: _pathWithoutRoot)
if (segment.filename_is_dot_dot())
prefix /= segment;

return prefix;
}

bool FileReader::hasDotDotSegments(boost::filesystem::path const& _path)
{
for (boost::filesystem::path const& segment: _path)
if (segment.filename_is_dot_dot())
return true;

return false;
}

bool FileReader::isUNCPath(boost::filesystem::path const& _path)
{
string rootName = _path.root_name().string();

return (
rootName.size() == 2 ||
(rootName.size() > 2 && rootName[2] != rootName[1])
) && (
(rootName[0] == '/' && rootName[1] == '/')
Member

Why both?

Member Author

You mean "why both characters should be /" or "why both // and \\"?

If it's the former: this is meant to detect UNC paths of the form //host/a/bc/def.sol.
If the latter: Windows accepts both //host and \\host as UNC path root name. Outside of Windows only the one starting with // is treated specially.
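
The rule described in this reply can be sketched on plain strings. This is only an illustration under stated assumptions: the function under review inspects Boost's `root_name()` and selects the backslash branch with `#if defined(_WIN32)`, whereas the hypothetical `looksLikeUNCPath` below takes a whole path string and a runtime flag.

```cpp
#include <string>

// Hypothetical string-based check (an assumption for illustration only).
// A path is treated as UNC when it starts with exactly two identical separators.
bool looksLikeUNCPath(std::string const& _path, bool _acceptBackslashes = false)
{
	if (_path.size() < 2)
		return false;

	bool const forwardSlashes = (_path[0] == '/' && _path[1] == '/');
	bool const backslashes = (_acceptBackslashes && _path[0] == '\\' && _path[1] == '\\');
	if (!forwardSlashes && !backslashes)
		return false;

	// A third identical separator (e.g. "///a") means this is not a UNC root.
	return _path.size() == 2 || _path[2] != _path[1];
}
```

Here `//host/a/bc/def.sol` is accepted everywhere, while `\\host\share` is accepted only when the flag is set, which mirrors the Windows-only branch in the diff.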

#if defined(_WIN32)
|| (rootName[0] == '\\' && rootName[1] == '\\')
#endif
);
}

}
49 changes: 43 additions & 6 deletions libsolidity/interface/FileReader.h
@@ -45,27 +45,28 @@ class FileReader
boost::filesystem::path _basePath = {},
FileSystemPathSet _allowedDirectories = {}
):
m_basePath(std::move(_basePath)),
m_allowedDirectories(std::move(_allowedDirectories)),
m_sourceCodes()
{}
{
setBasePath(_basePath);
}

void setBasePath(boost::filesystem::path _path) { m_basePath = std::move(_path); }
void setBasePath(boost::filesystem::path const& _path);
boost::filesystem::path const& basePath() const noexcept { return m_basePath; }

void allowDirectory(boost::filesystem::path _path) { m_allowedDirectories.insert(std::move(_path)); }
FileSystemPathSet const& allowedDirectories() const noexcept { return m_allowedDirectories; }

StringMap const& sourceCodes() const noexcept { return m_sourceCodes; }

/// Retrieves the source code for a given source unit ID.
/// Retrieves the source code for a given source unit name.
SourceCode const& sourceCode(SourceUnitName const& _sourceUnitName) const { return m_sourceCodes.at(_sourceUnitName); }

/// Resets all sources to the given map of source unit ID to source codes.
/// Resets all sources to the given map of source unit name to source codes.
/// Does not enforce @a allowedDirectories().
void setSources(StringMap _sources);

/// Adds the source code for a given source unit ID.
/// Adds the source code under a source unit name created by normalizing the file path.
/// Does not enforce @a allowedDirectories().
void setSource(boost::filesystem::path const& _path, SourceCode _source);

@@ -83,7 +84,43 @@ class FileReader
return [this](std::string const& _kind, std::string const& _path) { return readFile(_kind, _path); };
}

/// Normalizes a filesystem path to make it include all components up to the filesystem root,
/// remove small, inconsequential differences that do not affect the meaning and make it look
/// the same on all platforms (if possible). Symlinks in the path are not resolved.
/// The resulting path uses forward slashes as path separators, has no redundant separators,
/// has no redundant . or .. segments and has no root name if removing it does not change the meaning.
/// The path does not have to actually exist.
static boost::filesystem::path normalizeCLIPathForVFS(boost::filesystem::path const& _path);

/// @returns true if all the path components of @a _prefix are present at the beginning of @a _path.
/// Both paths must be absolute (or have slash as root) and normalized (no . or .. segments, no
/// multiple consecutive slashes).
/// Paths are treated as case-sensitive. Does not require the path to actually exist in the
/// filesystem and does not follow symlinks. Only considers whole segments, e.g. /abc/d is not
/// considered a prefix of /abc/def. Both paths must be non-empty.
/// Ignores the trailing slash, i.e. /a/b/c.sol/ is treated as a valid prefix of /a/b/c.sol.
static bool isPathPrefix(boost::filesystem::path _prefix, boost::filesystem::path const& _path);
Member

Any particular reason _prefix is not const&?

Member Author

I'm modifying it in place inside the function. I could add a local variable but it was just more convenient to modify the parameter.


/// If @a _prefix is actually a prefix of @a _path, removes it from @a _path to make it relative.
/// @returns The path without the prefix or the unchanged path if there is no prefix.
/// If @a _path and @a _prefix are identical, the result is '.'.
static boost::filesystem::path stripPrefixIfPresent(boost::filesystem::path _prefix, boost::filesystem::path const& _path);
Member

Same here re const&

/// @returns true if the specified path is a UNC path.
/// UNC paths start with // followed by a name (on Windows they can also start with \\).
/// They are used for network shares on Windows. On UNIX systems they do not have the same
/// functionality but usually they are still recognized and treated in a special way.
static bool isUNCPath(boost::filesystem::path const& _path);

private:
/// If @a _path starts with a number of .. segments, returns a path consisting only of those
/// segments (root name is not included). Otherwise returns an empty path. @a _path must be
/// absolute (or have slash as root).
static boost::filesystem::path absoluteDotDotPrefix(boost::filesystem::path const& _path);

/// @returns true if the path contains any .. segments.
static bool hasDotDotSegments(boost::filesystem::path const& _path);

/// Base path, used for resolving relative paths in imports.
boost::filesystem::path m_basePath;

1 change: 1 addition & 0 deletions test/CMakeLists.txt
@@ -103,6 +103,7 @@ set(libsolidity_sources
libsolidity/SyntaxTest.h
libsolidity/ViewPureChecker.cpp
libsolidity/analysis/FunctionCallGraph.cpp
libsolidity/interface/FileReader.cpp
)
detect_stray_source_files("${libsolidity_sources}" "libsolidity/")

9 changes: 7 additions & 2 deletions test/FilesystemUtils.cpp
@@ -39,11 +39,16 @@ void solidity::test::createFileWithContent(boost::filesystem::path const& _path,

bool solidity::test::createSymlinkIfSupportedByFilesystem(
boost::filesystem::path const& _targetPath,
boost::filesystem::path const& _linkName
boost::filesystem::path const& _linkName,
bool directorySymlink
)
{
boost::system::error_code symlinkCreationError;
boost::filesystem::create_symlink(_targetPath, _linkName, symlinkCreationError);

if (directorySymlink)
boost::filesystem::create_directory_symlink(_targetPath, _linkName, symlinkCreationError);
else
boost::filesystem::create_symlink(_targetPath, _linkName, symlinkCreationError);

if (!symlinkCreationError)
return true;
6 changes: 5 additions & 1 deletion test/FilesystemUtils.h
@@ -34,12 +34,16 @@ void createFileWithContent(boost::filesystem::path const& _path, std::string con

/// Creates a symlink between two paths.
/// The target does not have to exist.
/// If @p directorySymlink is true, indicate to the operating system that this is a directory
/// symlink. On some systems (e.g. Windows) it's possible to create a non-directory symlink pointing
/// at a directory, which makes such symlinks unusable.
/// @returns true if the symlink has been successfully created, false if the filesystem does not
/// support symlinks.
/// Throws an exception if the operation fails for a different reason.
bool createSymlinkIfSupportedByFilesystem(
boost::filesystem::path const& _targetPath,
boost::filesystem::path const& _linkName
boost::filesystem::path const& _linkName,
bool directorySymlink
);

}