feat: migrate symbols to treesitter
Add basic symbol extraction using treesitter.
Querying the list of symbols is mostly supported now.
However, the symbol-under-cursor function is now quite limited:
previously clangd resolved the symbol, which treesitter cannot do.

Task-Id: KNUT-162
Change-Id: I5d70fa95d5a5aa154d8bedcf577e68709a9f95bb
Reviewed-on: https://codereview.kdab.com/c/knut/+/141041
Reviewed-by: Nicolas Arnaud-Cormos <nicolas@kdab.com>
LeonMatthesKDAB authored and narnaud committed May 19, 2024
1 parent 4700264 commit 34fdebb
Showing 24 changed files with 520 additions and 651 deletions.
10 changes: 9 additions & 1 deletion docs/API/script/rangemark.md
@@ -26,6 +26,7 @@ import Script 1.0
||**[remove](#remove)**()|
||**[replace](#replace)**(string text)|
||**[select](#select)**()|
|string |**[textExcept](#textExcept)**([RangeMark](../script/rangemark.md) other)|

## Detailed Description

@@ -55,7 +56,7 @@ This read-only property returns the text covered by the range.

Joins the two `RangeMark` and creates a new one.

-The new `RangeMark` is spaning from the minimum of the start to the maximum of the end.
+The new `RangeMark` is spanning from the minimum of the start to the maximum of the end.

#### <a name="remove"></a>**remove**()

@@ -68,3 +69,10 @@ Replaces the text defined by this range with the `text` string in the source doc
#### <a name="select"></a>**select**()

Selects the text defined by this range in the source document.

#### <a name="textExcept"></a>string **textExcept**([RangeMark](../script/rangemark.md) other)

Returns the text of this range without the text of the `other` range.
This assumes that both ranges overlap.

If they do not overlap, the entire text of this range is returned.
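
A minimal sketch of the overlap rule described above, written against plain `std::string` offsets rather than the real `RangeMark` class. The `Range` struct and the free function `textExcept` are hypothetical stand-ins, and the half-open `[start, end)` convention is an assumption, not something the documentation states.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical stand-in for a range over a document: [start, end) offsets.
struct Range {
    int start;
    int end;
};

// Returns the text covered by `self`, minus the part also covered by `other`.
// If the two ranges do not overlap, the full text of `self` is returned.
std::string textExcept(const std::string &document, Range self, Range other)
{
    assert(0 <= self.start && self.start <= self.end
           && self.end <= static_cast<int>(document.size()));

    const std::string text = document.substr(self.start, self.end - self.start);

    // Clamp `other` to the bounds of `self` to find the overlapping part.
    const int overlapStart = std::max(self.start, other.start);
    const int overlapEnd = std::min(self.end, other.end);
    if (overlapStart >= overlapEnd)
        return text; // no overlap: nothing to cut out

    // Keep what lies before and after the overlap.
    return text.substr(0, overlapStart - self.start)
        + text.substr(overlapEnd - self.start);
}
```

With the parameter text `const QString &name` and the range of the identifier `name`, this sketch yields `const QString &`. That matches how `argumentsFromQueryMatch` in this commit separates a parameter's type from its name before calling `simplified()`.
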
5 changes: 5 additions & 0 deletions docs/API/script/textrange.md
@@ -15,6 +15,7 @@ import Script 1.0
| | Name |
|-|-|
|int|**[end](#end)**|
|int|**[length](#length)**|
|int|**[start](#start)**|

## Property Documentation
@@ -23,6 +24,10 @@ import Script 1.0

This read-only property defines the end position of the range.

#### <a name="length"></a>int **length**

This read-only property returns the length of the range (`end - start`).

#### <a name="start"></a>int **start**

This read-only property defines the start position of the range.
15 changes: 14 additions & 1 deletion docs/getting-started/treesitter.md
@@ -153,7 +153,7 @@ Check if all arguments are "alike".
In this case "alike" means the arguments are all string-equal, after all white-space is removed.
-This is very useful when comparing strings that might span multiple lines or may be indented/formatted differently depening on preference.
+This is very useful when comparing strings that might span multiple lines or may be indented/formatted differently depending on preference.
E.g. `const QString&` could also be formatted as `const QString &`.
The `like?` predicate would match both of these variations.
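
The comparison rule can be sketched in a few lines of standard C++. This is an illustration of the documented behaviour only, not the predicate's actual implementation. The function names `withoutWhitespace` and `allAlike` are hypothetical, and the precondition that at least one argument is passed is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Removes every whitespace character from `text`.
std::string withoutWhitespace(std::string text)
{
    text.erase(std::remove_if(text.begin(), text.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               text.end());
    return text;
}

// True if all arguments are string-equal once whitespace is ignored.
bool allAlike(const std::vector<std::string> &arguments)
{
    assert(!arguments.empty()); // assumed precondition for this sketch
    const std::string reference = withoutWhitespace(arguments.front());
    return std::all_of(arguments.begin(), arguments.end(), [&](const std::string &argument) {
        return withoutWhitespace(argument) == reference;
    });
}
```

Because all white-space is dropped, not just normalized, a comparison of this kind would also treat `int x` and `intx` as alike. That is worth keeping in mind when writing queries.
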

@@ -194,6 +194,19 @@ Example usage to find all member function definitions of `MyClass`
body: (compound_statement) @body)
```
### `(#not_is? [capture]+ [node_type]+)`

Check that **none** of the captures are of any of the given node types.

This is especially useful when using the wildcard operators `(_)` and `_`.
These match any (named) node type. Combined with this predicate these can match any node type *but* the given types.

Example usage to find all member functions that return any type *other* than a primitive type:

``` treesitter
(function_definition
    type: (_) @type
    (#not_is? @type primitive_type)) @function
```
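
The check this predicate performs can be sketched as follows. This is a simplified model, not the code from `treesitter/predicates.h`: the `Capture` struct and the function `notIs` are hypothetical, and a capture is reduced to its node type.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified capture: only the node type matters here.
struct Capture {
    std::string name;
    std::string nodeType;
};

// True if none of the captures has one of the listed node types.
bool notIs(const std::vector<Capture> &captures, const std::vector<std::string> &nodeTypes)
{
    assert(!nodeTypes.empty()); // assumed: at least one node type is given
    return std::none_of(captures.begin(), captures.end(), [&](const Capture &capture) {
        return std::find(nodeTypes.begin(), nodeTypes.end(), capture.nodeType)
            != nodeTypes.end();
    });
}
```

In the query above, a `@type` capture of node type `type_identifier` would pass the check, while one of node type `primitive_type` would reject the match.
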

### `(#in_message_map? [capture]+)`
Check if the given capture is within a MFC message map.
10 changes: 7 additions & 3 deletions src/core/classsymbol.cpp
@@ -19,9 +19,8 @@ namespace Core {
* Returns the list of members (both data and functions) of this class.
*/

ClassSymbol::ClassSymbol(QObject *parent, const QString &name, const QString &description,
const QString &importLocation, Kind kind, TextRange range, TextRange selectionRange)
: Symbol(parent, name, description, importLocation, kind, range, selectionRange)
ClassSymbol::ClassSymbol(QObject *parent, const QueryMatch &match, Kind kind)
: Symbol(parent, match, kind)
{
}

@@ -47,6 +46,11 @@ const QVector<Symbol *> &ClassSymbol::members() const
return *m_members;
}

QString ClassSymbol::description() const
{
return "Class with " + QString::number(members().size()) + " members";
}

bool operator==(const ClassSymbol &left, const ClassSymbol &right)
{
return left.Symbol::operator==(right) && left.members() == right.members();
4 changes: 2 additions & 2 deletions src/core/classsymbol.h
@@ -17,14 +17,14 @@ class ClassSymbol : public Symbol

protected:
friend class Symbol;
ClassSymbol(QObject *parent, const QString &name, const QString &description, const QString &importLocation,
Kind kind, TextRange range, TextRange selectionRange);
ClassSymbol(QObject *parent, const QueryMatch &match, Kind kind);

// mutable for lazy initialization
mutable std::optional<QVector<Symbol *>> m_members;

public:
const QVector<Symbol *> &members() const;
QString description() const override;
};
bool operator==(const ClassSymbol &left, const ClassSymbol &right);

4 changes: 2 additions & 2 deletions src/core/cppdocument.cpp
@@ -1342,7 +1342,7 @@ void CppDocument::deleteMethodLocal(const QString &methodName, const QString &si
if (signature.isEmpty())
return !isFunction || symbol->name() != methodName;
else
return !isFunction || symbol->name() != methodName || symbol->description() != signature;
return !isFunction || symbol->name() != methodName || symbol->toFunction()->signature() != signature;
};

auto symbolList = symbols();
@@ -1442,7 +1442,7 @@ void CppDocument::deleteMethod()
spdlog::error(
"CppDocument::deleteMethod: Cursor is not currently within a function definition or declaration!");
} else {
deleteMethod(symbol->name(), symbol->description());
deleteMethod(symbol->name(), symbol->toFunction()->signature());
}
}
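
Since `deleteMethod` now compares against `signature()` instead of `description()`, the exact string shape matters to callers. The sketch below mirrors the `QString("%1 (%2)")` formatting from `FunctionSymbol::signature()` using only the standard library. The `Argument` struct and the free function `signature` are hypothetical stand-ins for `Core::FunctionArgument` and the member function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Core::FunctionArgument.
struct Argument {
    std::string type;
    std::string name;
};

// Builds "<returnType> (<type1>, <type2>, ...)": argument names are left out,
// only the types take part in the signature.
std::string signature(const std::string &returnType, const std::vector<Argument> &arguments)
{
    std::string types;
    for (std::size_t i = 0; i < arguments.size(); ++i) {
        if (i > 0)
            types += ", ";
        types += arguments[i].type;
    }
    return returnType + " (" + types + ")";
}
```

Under these assumptions, a function `void setName(const QString &name, int count)` would have the signature string `void (const QString &, int)`, including the space before the opening parenthesis.
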

131 changes: 30 additions & 101 deletions src/core/functionsymbol.cpp
@@ -82,135 +82,64 @@ FunctionArgument FunctionArgument::fromHover(const QString &parameter, Document:
* whitespace but everything else like comments.
*/

FunctionSymbol::FunctionSymbol(QObject *parent, const QString &name, const QString &description,
const QString &importLocation, Kind kind, TextRange range, TextRange selectionRange)
: Symbol(parent, name, description, importLocation, kind, range, selectionRange)
FunctionSymbol::FunctionSymbol(QObject *parent, const QueryMatch &match, Kind kind)
: Symbol(parent, match, kind)
{
if (m_description.isEmpty()) {
const auto args = arguments();
auto toType = [](const FunctionArgument &arg) {
return arg.type;
};
auto argumentTypes = kdalgorithms::transformed<QStringList>(args, toType);

m_description = QString("%1 (%2)").arg(returnType(), argumentTypes.join(", "));
}
}

QString FunctionSymbol::returnTypeFromDescription() const
QString FunctionSymbol::description() const
{
auto desc = m_description;
// TODO: Add logic to handle type-qualifiers.
// For now, discard type-qualifier, if found any.
if (desc.startsWith("static "))
desc.remove(0, 7);
desc.chop((desc.length() - desc.lastIndexOf(')') - 1));

return desc.left(desc.indexOf('(')).trimmed();
return signature();
}

std::optional<QString> FunctionSymbol::returnTypeFromLSP() const
QString FunctionSymbol::signature() const
{
if (auto lspdocument = document()) {
auto hover = lspdocument->hover(selectionRange().start);

auto lines = hover.split("\n");
while (!lines.isEmpty()) {
auto line = lines.front();
lines.pop_front();

if (line.startsWith("→ ")) {
line.remove(0, 2);
const auto args = arguments();
auto toType = [](const FunctionArgument &arg) {
return arg.type;
};
auto argumentTypes = kdalgorithms::transformed<QStringList>(args, toType);

return Lsp::Utils::removeTypeAliasInformation(line);
}
}

return "";
}

return {};
return QString("%1 (%2)").arg(returnType(), argumentTypes.join(", "));
}

QVector<FunctionArgument> FunctionSymbol::argumentsFromDescription() const
QString FunctionSymbol::returnTypeFromQueryMatch() const
{
int argStart = m_description.indexOf('(') + 1;
QString args = m_description.mid(argStart, m_description.lastIndexOf(')') - argStart);

const QStringList argsList = args.split(',', Qt::SkipEmptyParts);

QVector<FunctionArgument> arguments;
arguments.reserve(argsList.size());
for (const auto &arg : argsList) {
arguments.push_back(FunctionArgument {.type = arg.trimmed(), .name = ""});
}

return arguments;
}

std::optional<QVector<FunctionArgument>> FunctionSymbol::argumentsFromLSP() const
{
if (auto lspdocument = document()) {
auto hover = lspdocument->hover(selectionRange().start);

spdlog::debug("FunctionSymbol::argumentsFromLSP: Hover string:\n{}", hover.toStdString());

auto lines = hover.split("\n");
while (!lines.isEmpty() && !lines.first().contains("Parameters:")) {
lines.pop_front();
}
if (!lines.isEmpty()) {
// We found a Parameters listing
lines.pop_front(); // Remove the 'Parameters:' text

QVector<Core::FunctionArgument> arguments;
while (!lines.isEmpty() && lines.first().startsWith("- ")) {
auto parameter = lines.first();
lines.pop_front();

// remove the "- "
parameter.remove(0, 2);

arguments.emplaceBack(FunctionArgument::fromHover(parameter, document()->type()));
}

return arguments;
}

// No arguments found
return QVector<Core::FunctionArgument>();
} else {
spdlog::warn("Symbol '{}' doesn't have an associated document!", name().toStdString());
}

return {};
return m_queryMatch.getAllJoined("return").text();
}

QString FunctionSymbol::returnType() const
{
if (!m_returnType.has_value()) {
m_returnType = returnTypeFromLSP();
}

if (!m_returnType.has_value()) {
m_returnType = std::make_optional(returnTypeFromDescription());
m_returnType = std::make_optional(returnTypeFromQueryMatch());
}

return m_returnType.value();
}
const QVector<FunctionArgument> &FunctionSymbol::arguments() const
{
if (!m_arguments.has_value()) {
m_arguments = argumentsFromLSP();
}

if (!m_arguments.has_value()) {
m_arguments = std::make_optional(argumentsFromDescription());
m_arguments = std::make_optional(argumentsFromQueryMatch());
}

return m_arguments.value();
}

QVector<FunctionArgument> FunctionSymbol::argumentsFromQueryMatch() const
{
auto arguments = m_queryMatch.getAll("parameter");
auto to_function_arg = [this](const RangeMark &argument) {
auto result = document()->queryInRange(argument, "(identifier) @name");
auto nameRange = result.isEmpty() ? RangeMark() : result.first().get("name");
auto name = nameRange.text().simplified();
auto type = argument.textExcept(nameRange).simplified();

return FunctionArgument {.type = type, .name = name};
};

return kdalgorithms::transformed<QVector<FunctionArgument>>(arguments, to_function_arg);
}

bool operator==(const FunctionSymbol &left, const FunctionSymbol &right)
{
return left.Symbol::operator==(right) && left.returnType() == right.returnType()
13 changes: 5 additions & 8 deletions src/core/functionsymbol.h
@@ -36,22 +36,19 @@ class FunctionSymbol : public Symbol
// necessary so the constructor can be accessed from the Symbol class.
friend class Symbol;

FunctionSymbol(QObject *parent, const QString &name, const QString &description, const QString &importLocation,
Kind kind, TextRange range, TextRange selectionRange);
FunctionSymbol(QObject *parent, const QueryMatch &match, Kind kind);

mutable std::optional<QString> m_returnType;
mutable std::optional<QVector<FunctionArgument>> m_arguments;

// fallback heuristic, if `Hover` LSP call fails
QString returnTypeFromDescription() const;
QVector<FunctionArgument> argumentsFromDescription() const;

std::optional<QString> returnTypeFromLSP() const;
std::optional<QVector<FunctionArgument>> argumentsFromLSP() const;
QString returnTypeFromQueryMatch() const;
QVector<FunctionArgument> argumentsFromQueryMatch() const;

public:
QString returnType() const;
const QVector<FunctionArgument> &arguments() const;
QString signature() const;
QString description() const override;
};
bool operator==(const FunctionSymbol &left, const FunctionSymbol &right);

18 changes: 4 additions & 14 deletions src/core/lspdocument.cpp
@@ -1,12 +1,13 @@
#include "lspdocument.h"
#include "lspdocument_p.h"

#include "treesitter/predicates.h"

#include "astnode.h"
#include "logger.h"
#include "project.h"
#include "querymatch.h"
#include "rangemark.h"
#include "scriptmanager.h"
#include "symbol.h"
#include "textlocation.h"
#include "utils/json.h"
@@ -41,7 +42,6 @@ LspDocument::~LspDocument() = default;

LspDocument::LspDocument(Type type, QObject *parent)
: TextDocument(type, parent)
, m_cache(std::make_unique<LspCache>(this))
, m_treeSitterHelper(std::make_unique<TreeSitterHelper>(this))
{
connect(textEdit()->document(), &QTextDocument::contentsChange, this, &LspDocument::changeContent);
@@ -127,9 +127,7 @@ void LspDocument::deleteSymbol(const Symbol &symbol)
Core::SymbolList LspDocument::symbols() const
{
LOG("LspDocument::symbols");
if (!checkClient())
return {};
return m_cache->symbols();
return m_treeSitterHelper->symbols();
}

struct RegexpTransform
@@ -166,13 +164,6 @@ const Core::Symbol *LspDocument::symbolUnderCursor() const
return *symbolIter;
}

auto hover = hoverWithRange(textEdit()->textCursor().position());
if (hover.second) {
return m_cache->inferSymbol(hover.first, hover.second.value());
} else {
spdlog::warn("LspDocument::symbolUnderCursor: Cannot infer Symbol - Language Server did not return a range!");
}

return nullptr;
}

@@ -440,7 +431,7 @@ Symbol *LspDocument::findSymbol(const QString &name, int options) const
if (!checkClient())
return {};

const auto &symbols = m_cache->symbols();
auto symbols = this->symbols();
const auto regexp = (options & FindRegexp) ? createRegularExpression(name, options) : QRegularExpression {};
auto byName = [name, options, regexp](Symbol *symbol) {
if (options & FindWholeWords)
@@ -665,7 +656,6 @@ void LspDocument::changeContentLsp(int position, int charsRemoved, int charsAdde
Q_UNUSED(position)
Q_UNUSED(charsRemoved)
Q_UNUSED(charsAdded)
m_cache->clear();

// TODO: Keep copy of previous string around, so we can find the oldEndPosition.
// const auto document = textEdit()->document();