Support for YAML and Helm chart static analysis #582

Open
wants to merge 74 commits into
base: master
Changes from 70 commits
Commits
f21d3a0
Model layer of Yaml Parser
rizwankadhar Mar 26, 2022
aa0bd25
Added Model layer of Yaml Parser
rizwankadhar Mar 26, 2022
2faecd6
Added the parser layer of the YamlParser plugin
rizwankadhar Apr 17, 2022
d99ee72
Added the parser layer of the YamlParser plugin
rizwankadhar Apr 17, 2022
78433b2
Model layer of Yaml Parser
rizwankadhar Mar 26, 2022
168e5f5
Added the parser layer of the YamlParser plugin
rizwankadhar Apr 17, 2022
7db9f02
YamlParser persists whole Yaml file
rizwankadhar Apr 18, 2022
5fde274
Merge branch 'yaml' of https://github.com/rizwankadhar/CodeCompass in…
rizwankadhar Apr 18, 2022
4d00b66
Removed unnecessary columns
rizwankadhar Apr 18, 2022
2694d79
Added Yaml file usage identification
rizwankadhar Apr 23, 2022
617490a
Added service layer and web plugin
rizwankadhar May 3, 2022
03948e1
Added ryml code
rizwankadhar May 3, 2022
a1ea3bd
Fixed runtime issues with YAML service.
intjftw May 3, 2022
d1e0f57
Added a sample code to reach nested nodes.
intjftw May 5, 2022
542cd5c
Removed Unnecessary parameters and commented lines of code
rizwankadhar May 8, 2022
baa130d
Added parent column to the YamlContent table and populated it
rizwankadhar May 17, 2022
323d121
Fixed YamlService to return only the data of the clicked file, added …
rizwankadhar May 17, 2022
d2da47c
YamlInfo menuItem displays type, path, number of key-data pairs and i…
rizwankadhar May 21, 2022
ca9c294
YamlParser now traverses the sequences recursively
rizwankadhar May 21, 2022
621a333
Added a parent value for all the sequence elements
rizwankadhar May 21, 2022
05f9063
Pair with empty data field is also persisted. Removed unnecessary code
rizwankadhar May 22, 2022
1f1f728
Improved the view of the Yaml information displayed on Webserver-HTML
rizwankadhar May 22, 2022
19656ba
Organized code as per CodeCompass coding conventions
rizwankadhar May 24, 2022
013ff98
Starting to transform the YAML plugin to be a full language plugin.
intjftw Jun 15, 2022
a6c53db
Switched to yaml-cpp instead of rapidyaml. YamlAstNodes are built.
intjftw Jun 16, 2022
7c376fb
Recursive parsing of YAML files according to node types, exception ha…
intjftw Jun 17, 2022
3839a6f
Recursive parsing of YAML files enhanced, syntax highlight added (in …
intjftw Jun 21, 2022
2b35b0e
Node clicking error is corrected.
intjftw Jun 27, 2022
7518b36
Various modifications.
intjftw Jul 11, 2022
0e36da6
Info tree is ready for YAML nodes.
intjftw Jul 11, 2022
83ff400
Diagrams re-added, root keys are parsed with full values.
intjftw Jul 12, 2022
b35196d
Major refactor in service.
intjftw Jul 13, 2022
2a4e220
Multithreading works in parser.
intjftw Jul 18, 2022
8a0b964
Multithreading works in parser.
intjftw Jul 18, 2022
f70f203
Parsing microservices and showing microservice diagram on web.
intjftw Jul 18, 2022
ee47f79
Adding edges to store connections between microservices.
intjftw Jul 19, 2022
6449c4c
Added relation collector class.
intjftw Jul 25, 2022
89f4a50
Relation collector developed.
intjftw Aug 8, 2022
9c6fdf0
Relation collector error fixed, relations between automatically detec…
intjftw Aug 12, 2022
76319ff
Adding yaml-cpp to CI.
intjftw Aug 12, 2022
daee3fa
Adding microservice relations diagram to web service.
intjftw Aug 21, 2022
fec8e1a
Microservice relations diagram showing dynamic relation type.
intjftw Aug 30, 2022
8f4b246
Minor changes for PR.
intjftw Sep 16, 2022
42217ef
Edit processing according to specific Kubernetes rules.
intjftw Sep 28, 2022
f10b560
Merge branch 'master' into yaml
intjftw Sep 28, 2022
2ff2861
Early trials.
intjftw Sep 29, 2022
360adf1
Starting helm template parsing.
intjftw Sep 29, 2022
f6ec770
Started to implement helm template parsing.
intjftw Oct 12, 2022
ddf0220
Finished parsing Service templates.
intjftw Oct 13, 2022
d541690
Processing aliased microservices in integration chart.
intjftw Oct 13, 2022
78434f2
Transformed Yaml service to extend Language service, and added a new …
intjftw Oct 14, 2022
64d8137
Added event handler to the microservice navigator menu.
intjftw Oct 19, 2022
7764bac
Modified model, parsing and diagram generation.
intjftw Oct 20, 2022
2aae5cb
Finding arbitrary key in a yaml file.
intjftw Nov 6, 2022
da2d535
Implemented the rules of mount dependency type.
intjftw Nov 13, 2022
20c75ad
Implemented the rules of certificate dependency type, and added diffe…
intjftw Nov 15, 2022
006061d
Corrected an error of helm template ids in parsing.
intjftw Nov 16, 2022
ceb82b0
Fixed microservice detection in subcharts and removed debug messages.
intjftw Nov 16, 2022
e3282e4
Fixed multiplied arrows in k8s diagrams.
intjftw Nov 29, 2022
9ce0c99
Parsing resources.
intjftw Nov 30, 2022
d215660
Resource usage diagram added.
intjftw Nov 30, 2022
2c8697c
Guarantee that microservices will be processed independent of the fil…
intjftw Dec 1, 2022
9015118
Processing values.yaml files to get more service relations.
intjftw Dec 7, 2022
1250090
Complementing Service manifest diagram with back and forth dependencies.
intjftw Dec 7, 2022
f9e1c09
Fixing the detection of microservices to not depend on the order of f…
intjftw Dec 18, 2022
64db368
Documentation and other things.
intjftw Feb 4, 2023
385a361
Improving documentation.
intjftw Feb 14, 2023
b51de7d
Renamed the plugin to helm as it is more precise. Also renamed the va…
intjftw Feb 15, 2023
9155e7c
Complemented documentation and user guide.
intjftw Feb 16, 2023
71047a8
Merge branch 'master' into yaml
mcserep Aug 10, 2023
700bbf1
PR change requests: unified naming and more small modifications.
intjftw Aug 21, 2023
bbf4e81
PR change requests.
intjftw Aug 22, 2023
5d98aee
Merge branch 'master' into yaml
mcserep Sep 9, 2023
ef7bea8
Add missing inclusion dependency in HelmService for LanguageService f…
mcserep Sep 9, 2023
2 changes: 1 addition & 1 deletion .github/scripts/setup_build_ubuntu-20.04.sh
@@ -3,4 +3,4 @@
# Install required packages for CodeCompass build
sudo apt-get install -y git cmake make g++ libboost-all-dev llvm-10-dev clang-10 \
libclang-10-dev odb libodb-dev thrift-compiler libthrift-dev default-jdk libssl-dev \
libgraphviz-dev libmagic-dev libgit2-dev ctags doxygen libgtest-dev npm libldap2-dev
libgraphviz-dev libmagic-dev libgit2-dev ctags doxygen libgtest-dev npm libldap2-dev libyaml-cpp-dev
2 changes: 1 addition & 1 deletion .github/scripts/setup_runtime_ubuntu-20.04.sh
@@ -4,4 +4,4 @@
sudo apt-get install -y git cmake make g++ graphviz \
libboost-filesystem1.71.0 libboost-log1.71.0 libboost-program-options1.71.0 \
libllvm10 clang-10 libclang1-10 libthrift-0.13.0 default-jre libssl1.1 libmagic1 \
libgit2-28 ctags googletest libldap-2.4-2
libgit2-28 ctags googletest libldap-2.4-2 libyaml-cpp0.6
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -256,4 +256,4 @@ jobs:
if: ${{ github.repository == 'Ericsson/CodeCompass' && (github.ref_name == 'master' || startsWith(github.ref_name, 'release/') == true) }}
uses: ./.github/workflows/tarball.yml
secrets:
GITLAB_TRIGGER_TOKEN: ${{ secrets.GITLAB_TRIGGER_TOKEN }}
GITLAB_TRIGGER_TOKEN: ${{ secrets.GITLAB_TRIGGER_TOKEN }}
161 changes: 161 additions & 0 deletions doc/helm_documentation.md
@@ -0,0 +1,161 @@
# Helm chart static analysis plugin
Review comment (Collaborator): Shouldn't this file be linked somewhere in the upper documentation files? Currently it is quite hard to find.

## Developers' documentation

The microservice plugin analyzes Helm charts and stores data about the microservices and their relations in the database.
mcserep marked this conversation as resolved.

The plugin uses the _yaml-cpp_ library to parse YAML nodes.
The analysis is executed file-by-file.
**This plugin heavily relies on the rules and conventions of Helm and Kubernetes.**

## YAML file analysis steps

The microservice parser receives one or more root directories of Helm charts.
The directory is traversed in a recursive manner.
The plugin is capable of multi-threaded analysis, where each thread receives a file for analysis.

### File purpose analysis

First, the parser determines the purpose of the YAML file under analysis.
The following "purposes" (i.e. file types) are checked:

- _Helm charts_: the actual chart files which contain the name and other metadata of the microservice.
Within this category, _integration charts_ and _subcharts_ are distinguished.
Processing is somewhat different based on the subtype (see details below).
Files named _Chart.yaml_ or _Chart.yml_ are classified into this category.

- _Helm values_: files which contain vital information of microservice relations and default values (e.g. for environment variables).
Files named _values.yaml_ or _values.yml_ are classified into this category.
These files are collected in a container for further analysis.

- _Helm templates_: files that define microservice relations and metadata, such as ConfigMaps, Secrets, Certificates etc.
Files that are inside a directory named _templates_ are classified into this category.
These files are collected in a container for further analysis.

- _Docker compose files_: files named _compose.yml, compose.yaml, docker_compose.yml, or docker_compose.yaml_ are classified into this category.

- _CI files_: any file that does not belong to any category above and contains the substring _"ci"_ is classified as a continuous integration file.

_Note:_ At this point, the plugin is more specifically aimed at the analysis of Helm charts.
However, it can be used for generic YAML parsing as well, so the non-Helm types are left in.
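
The classification rules above can be sketched as a single function. This is a minimal illustration, not the plugin's actual code: the `FilePurpose` enum and the `classifyFile` function are hypothetical names, and it assumes that the _"ci"_ substring is looked for in the file name only.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical enum; the plugin's real type names may differ.
enum class FilePurpose
{
  HelmChart,
  HelmValues,
  HelmTemplate,
  DockerCompose,
  CI,
  Other
};

// Classifies a YAML file by its path, checking the categories
// in the same order as they are listed in the documentation.
inline FilePurpose classifyFile(const fs::path& path_)
{
  const std::string name = path_.filename().string();

  if (name == "Chart.yaml" || name == "Chart.yml")
    return FilePurpose::HelmChart;

  if (name == "values.yaml" || name == "values.yml")
    return FilePurpose::HelmValues;

  // Any directory component named "templates" marks a template file.
  const fs::path parent = path_.parent_path();
  if (std::find(parent.begin(), parent.end(), fs::path("templates"))
      != parent.end())
    return FilePurpose::HelmTemplate;

  if (name == "compose.yml" || name == "compose.yaml" ||
      name == "docker_compose.yml" || name == "docker_compose.yaml")
    return FilePurpose::DockerCompose;

  // Assumption: only the file name is searched for the "ci" substring.
  if (name.find("ci") != std::string::npos)
    return FilePurpose::CI;

  return FilePurpose::Other;
}
```

Because the checks run in a fixed order, a file named _Chart.yaml_ inside a _templates_ directory would still be treated as a chart in this sketch.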

### YAML node analysis

After the file type check, the keys and values in the file are analyzed in a recursive manner.
_yaml-cpp_ distinguishes the following node types: scalar, sequence, map, null, and undefined.
These types appear within the `AstType` field.
The nodes are processed according to their type.

Generally, every node is broken down to the smallest, scalar node which is persisted in the database.
However, a scalar holds the type of node in which it was included: e.g. if a scalar is originally a key in a map,
its `AstType` will be `MAP`. Maps and sequences need to be parsed in slightly different ways,
because their iterators are not interchangeable. In the end, all nodes are broken down to scalars,
which are then persisted in the database with their containing type.

To provide better syntax highlighting, a `SymbolType` field records the role of a node:
`Key`, `Value`, `NestedKey`, `NestedValue`, or `Other`.

The location of nodes within files is obtained from the `Mark()` method of _yaml-cpp_,
and the content of a node is extracted with the `Dump()` function.
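
The recursive breakdown into scalars can be illustrated without _yaml-cpp_. The sketch below uses a simplified stand-in node type; `Node`, `ScalarRecord` and `flatten` are hypothetical names, and the real parser works on `YAML::Node` objects and persists the results instead of collecting them in a vector.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for a parsed YAML node (NOT the yaml-cpp type).
struct Node
{
  enum class Kind { Scalar, Sequence, Map };

  Kind kind;
  std::string key;             // set when this node is a value inside a map
  std::string text;            // scalar content
  std::vector<Node> children;  // sequence items or map values
};

// One flattened scalar together with the type of its container,
// analogous to the containing type described above.
struct ScalarRecord
{
  std::string text;
  Node::Kind containedIn;
  bool isKey;
};

// Walks the tree depth-first and emits every scalar it reaches.
inline void flatten(
  const Node& node_,
  Node::Kind container_,
  std::vector<ScalarRecord>& out_)
{
  switch (node_.kind)
  {
    case Node::Kind::Scalar:
      out_.push_back({node_.text, container_, false});
      break;

    case Node::Kind::Sequence:
      for (const Node& item : node_.children)
        flatten(item, Node::Kind::Sequence, out_);
      break;

    case Node::Kind::Map:
      for (const Node& value : node_.children)
      {
        // A map key is itself a scalar that belongs to the map.
        out_.push_back({value.key, Node::Kind::Map, true});
        flatten(value, Node::Kind::Map, out_);
      }
      break;
  }
}
```

Maps and sequences get separate branches here for the same reason as in the parser: their elements are visited differently, since only map entries carry a key.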

### Exception handling

If any error happens during file analysis, the error will be saved as a `BuildLog` object.
The erroneous file will remain in _Not parsed_ status.

Errors usually happen in Helm charts if the user did not execute the `helm template` command on the charts before parsing,
or when there are syntax errors in the file.

### Chart analysis

As mentioned above, Helm charts are handled differently depending on whether they are integration charts or subcharts.

#### Integration charts

If the file is an integration chart (a chart which describes the entire software, not just a component),
the parser checks whether there is a _dependencies_ key in the file.
If there is none, the service that the file describes is registered in the database.
If there is one, the listed microservices are registered in the database.
If a listed service contains an _alias_ key, its value is persisted as the name of the service.
Otherwise, the value of the obligatory _name_ key is the name of the service.

_Note: this approach might be refined in the future to adapt to more possible architecture description formats._
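
The naming rule for integration charts can be sketched as follows. `DependencyEntry` and `registeredServices` are hypothetical names used for illustration only, and the sketch assumes that an empty dependency list is treated the same way as a missing _dependencies_ key.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified view of one entry under the "dependencies" key.
struct DependencyEntry
{
  std::string name;                  // obligatory in Chart.yaml
  std::optional<std::string> alias;  // optional
};

// Returns the service names that would be registered for an integration chart.
inline std::vector<std::string> registeredServices(
  const std::string& chartName_,
  const std::vector<DependencyEntry>& dependencies_)
{
  // No dependencies: the chart itself describes the service.
  if (dependencies_.empty())
    return {chartName_};

  std::vector<std::string> names;
  for (const DependencyEntry& entry : dependencies_)
    names.push_back(entry.alias.value_or(entry.name));  // alias wins over name

  return names;
}
```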

#### Subcharts

If a _Chart.yaml_ file is contained in a _charts_ directory, it is considered a subchart
(i.e. a component in a microservice-based software).
The described microservice is persisted in the database.

_Note: when parsing multiple projects, keep in mind that a full-value microservice (i.e.
a microservice which is not defined in a subchart, but is not necessarily an integration service)
can define a component within another project. Thus it can be a subchart in one project and
an integration chart on its own at the same time._
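
A possible path-based check for subcharts is shown below. The function name `isSubchart` is hypothetical, and the sketch assumes that any ancestor directory named _charts_ qualifies, not only the directory two levels above the file.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Returns true if path_ is a Chart.yaml (or Chart.yml) file that is
// located somewhere below a directory named "charts".
inline bool isSubchart(const fs::path& path_)
{
  const std::string name = path_.filename().string();
  if (name != "Chart.yaml" && name != "Chart.yml")
    return false;

  // Inspect every directory component leading to the file.
  for (const fs::path& part : path_.parent_path())
    if (part == fs::path("charts"))
      return true;

  return false;
}
```

As the note above points out, the same chart may be a subchart under one root and a standalone chart under another, so the result depends on the path through which the file was reached.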

### Template file analysis

This analysis is implemented in the `TemplateAnalyzer` class.

#### Dependencies

After basic YAML parsing, the previously collected template files are analyzed to collect
microservice dependencies.
The content of each template file is investigated.
Any template object should be persisted in the database as a `HelmTemplate` object with
the file that it is defined in.
Based on the content and its format, the analysis branches off in the following directions:

- _Service dependency:_ if the value of the `kind` key in the template file is _Service_.
This type defines additional services that derive from the original service.
In this case, the new service is persisted in the database as an _EXTERNAL_ service, and an edge
is created between the two services. The defined service depends on the defining service.
- _Mount dependencies:_ these types are usually collected within big deployment files.
The dependencies listed as the value of a _volumes_ key should be analyzed.
Please note that there can be multiple _volumes_ keys in a deployment file.
We consider two subtypes:
- _ConfigMaps_: these contain lots of important information that is needed for K8S pod definition.
- _Secrets_: these contain confidential information within services.
Both types can be identified by a `kind` key in the list.
- _Certificate dependencies:_ while the previous types can be identified by a `kind` key
within a file, a certificate can be defined by several template types.
An approximate heuristic is to check whether the value of the `kind` key contains the
"Certificate" or the "InternalUserCA" substring. (_WARNING:_ naming conventions may vary
between projects.) Certificate dependencies define further secrets, which should also
be persisted in the database.
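
The branching on the `kind` value can be sketched like this. The local `DependencyType` enum mirrors a subset of `HelmTemplate::DependencyType`; the `classifyKind` function is a hypothetical helper, and mapping both _ConfigMap_ and _Secret_ to a mount dependency is an assumption of this sketch.

```cpp
#include <string>

// Local mirror of a subset of HelmTemplate::DependencyType.
enum class DependencyType
{
  SERVICE,
  MOUNT,
  CERTIFICATE,
  OTHER
};

// Maps the value of a "kind" key to a dependency type.
inline DependencyType classifyKind(const std::string& kind_)
{
  if (kind_ == "Service")
    return DependencyType::SERVICE;

  // Assumption: both mount subtypes are reported as MOUNT.
  if (kind_ == "ConfigMap" || kind_ == "Secret")
    return DependencyType::MOUNT;

  // Approximate heuristic: substring match, since several
  // template kinds can define a certificate.
  if (kind_.find("Certificate") != std::string::npos ||
      kind_.find("InternalUserCA") != std::string::npos)
    return DependencyType::CERTIFICATE;

  return DependencyType::OTHER;
}
```

The substring match is deliberately loose, so project-specific kinds may need additional patterns.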

#### Resources

All template files are checked to see if they contain resource usage information.
The following resources are collected:

- _CPU:_ listed with `resources` and `requests` keys. This resource is measured in
CPU cores, which can be fractional.
- _Memory:_ listed with `resources` and `requests` keys. Calculated in gigabytes.
- _Storage:_ listed within a `volumeClaimTemplates` list, with a `storage` key.
Calculated in gigabytes.
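
Resource values in Kubernetes manifests are written as quantity strings such as `500m` or `512Mi`. The sketch below shows one way to normalize them; `cpuCores` and `gibibytes` are hypothetical helpers, only the `m`, `Ki`, `Mi` and `Gi` suffixes are handled, and reporting memory in units of 2^30 bytes is an assumption that may differ from the unit the plugin uses.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Converts a CPU quantity to cores: "2" -> 2.0, "500m" -> 0.5.
inline double cpuCores(const std::string& quantity_)
{
  // The "m" suffix means millicores.
  if (!quantity_.empty() && quantity_.back() == 'm')
    return std::stod(quantity_.substr(0, quantity_.size() - 1)) / 1000.0;

  return std::stod(quantity_);
}

// Converts a memory or storage quantity to units of 2^30 bytes.
// Assumption: only binary suffixes and plain byte counts are supported.
inline double gibibytes(const std::string& quantity_)
{
  std::size_t pos = 0;
  const double number = std::stod(quantity_, &pos);
  const std::string suffix = quantity_.substr(pos);

  if (suffix == "Gi")
    return number;
  if (suffix == "Mi")
    return number / 1024.0;
  if (suffix == "Ki")
    return number / (1024.0 * 1024.0);
  if (suffix.empty())
    return number / (1024.0 * 1024.0 * 1024.0);  // plain bytes

  throw std::invalid_argument("unsupported suffix: " + suffix);
}
```

Normalizing to a single unit per resource makes the values of different services directly comparable in the resource usage diagram.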

### Analysis of _values_ files

This analysis is implemented in the `ValueAnalyzer` class.

After basic YAML parsing, the previously collected _values.yaml_ files are analyzed to find
further connection points based on references.
Values files otherwise contain default values which might be overridden during deployment.

Every values file is checked for references to microservices.
If the analyzer finds a reference, a new `HelmTemplate` object is persisted in the database,
and a new edge is defined between the microservices.

## Frontend functionality

The plugin is capable of providing the following functionality:

- _Syntax highlighting:_ the YAML nodes are colored according to the symbol type.
- _InfoTree:_ some very basic metadata is available about the nodes (name, symbol type, value type).
- _Microservice navigator:_ there is an accordion menu on the left side in which the detected
microservices are listed. The diagrams are available by right-clicking the services.
- _Diagrams:_
- Dependent services
- Config maps
- Secrets
- Resource usage

15 changes: 14 additions & 1 deletion doc/usage.md
@@ -74,7 +74,7 @@ For full documentation see:
(`-E SQL_ASCII` flag is recommended!)
- [Start PostgreSQL database](https://www.postgresql.org/docs/12/app-postgres.html)

## 1. Generate compilation database
## 1/a. Generate compilation database
If you want to parse a C++ project, you have to create a [compilation database
file](http://clang.llvm.org/docs/JSONCompilationDatabase.html).

@@ -103,6 +103,19 @@ installation. The first command line argument is the output file name and the
second argument is the build command which compiles your project. This can be a
simple compiler invocation or starting a build system.

## 1/b. Prepare Helm charts
If you want to parse a Helm chart, and the chart files contain template files,
you need to run the `helm template` command on the
charts first, in order to bring the chart to a pure YAML format.

1. Run `helm template` on the chart. Save the output of the command in a separate directory,
e.g. if you run the command from the root of the chart,
`helm template <chart name> . --output-dir=./output`
2. Copy all files and directories from the chart to the output directory (except the original
template files), in the original directory order: Chart.yaml, values.yaml files, files
directories, etc. This is needed because of the recursive directory traversal.
3. Parse the project as described below, with the output directory as input.

## 2. Parse the project
For parsing a project with CodeCompass, the following command has to be issued:

1 change: 1 addition & 0 deletions docker/dev/Dockerfile
@@ -25,6 +25,7 @@ RUN set -x && apt-get update -qq \
libmagic-dev \
libsqlite3-dev \
libssl-dev \
libyaml-cpp-dev \
llvm-10 clang-10 llvm-10-dev libclang-10-dev \
npm \
thrift-compiler libthrift-dev \
1 change: 1 addition & 0 deletions docker/runtime/Dockerfile
@@ -71,6 +71,7 @@ RUN set -x && apt-get update -qq && \
libldap-2.4-2 \
libmagic-dev \
libthrift-dev \
libyaml-cpp-dev \
mcserep marked this conversation as resolved.
ctags \
tini && \
apt-get clean && \
7 changes: 7 additions & 0 deletions plugins/helm/CMakeLists.txt
@@ -0,0 +1,7 @@
find_package(yaml-cpp REQUIRED)
mcserep marked this conversation as resolved.

add_subdirectory(model)
add_subdirectory(parser)
add_subdirectory(service)

install_webplugin(webgui)
15 changes: 15 additions & 0 deletions plugins/helm/model/CMakeLists.txt
@@ -0,0 +1,15 @@
set(ODB_SOURCES
include/model/helmtemplate.h
include/model/microservice.h
include/model/microserviceedge.h
include/model/msresource.h
include/model/yamlastnode.h
include/model/yamlfile.h
include/model/yamlcontent.h)

generate_odb_files("${ODB_SOURCES}")

add_odb_library(yamlmodel ${ODB_CXX_SOURCES})
add_dependencies(yamlmodel model)

install_sql()
63 changes: 63 additions & 0 deletions plugins/helm/model/include/model/helmtemplate.h
@@ -0,0 +1,63 @@
#ifndef CC_MODEL_HELM_H
#define CC_MODEL_HELM_H

#include <cstdint>
#include <string>

#include <odb/core.hxx>
#include <odb/lazy-ptr.hxx>
#include <odb/nullable.hxx>

#include <model/file.h>
#include <model/microservice.h>

#include <util/hash.h>

namespace cc
{
namespace model
{

typedef std::uint64_t HelmTemplateId;

#pragma db object
struct HelmTemplate
{
enum class DependencyType
{
SERVICE,
MOUNT,
CERTIFICATE,
RESOURCE,
OTHER
};

#pragma db id
HelmTemplateId id;

#pragma db not_null
FileId file;

std::string name;

#pragma db not_null
DependencyType dependencyType;

#pragma db not_null
std::string kind;

#pragma db not_null
MicroserviceId depends;

bool operator==(HelmTemplate& rhs);
};

inline std::uint64_t createIdentifier(const HelmTemplate& helm_)
{
return util::fnvHash(
helm_.name +
helm_.kind +
std::to_string(helm_.depends) +
std::to_string(helm_.file));
}
}
}

#endif // CC_MODEL_HELM_H
47 changes: 47 additions & 0 deletions plugins/helm/model/include/model/microservice.h
@@ -0,0 +1,47 @@
#ifndef CC_MODEL_MICROSERVICE_H
#define CC_MODEL_MICROSERVICE_H

#include <cstdint>
#include <string>

#include <odb/core.hxx>
#include <odb/lazy-ptr.hxx>
#include <odb/nullable.hxx>

#include "model/file.h"

#include "util/hash.h"

namespace cc
{
namespace model
{

typedef std::uint64_t MicroserviceId;

#pragma db object
struct Microservice
{
enum class ServiceType
{
INTERNAL,
EXTERNAL
};

#pragma db id
MicroserviceId serviceId;

#pragma db not_null
std::string name;

FileId file;

ServiceType type;
};

inline std::uint64_t createIdentifier(const Microservice& service_)
{
return util::fnvHash(
service_.name);
}
}
}

#endif // CC_MODEL_MICROSERVICE_H