Python unit tests
- Clone cloud-custodian.
- Install Custodian using make (if it hasn't been done yet) and activate a virtual environment: source .venv/bin/activate
- Move ecc-aws-rulepack/.github/workflows/scripts/policy_as_test.py and ecc-aws-rulepack/.github/workflows/scripts/green_policy_test.py to the cloud-custodian folder. Move ecc-aws-rulepack/.github/workflows/scripts/gcp_common.py to the cloud-custodian/tools/c7n_gcp/tests/ folder.
- Create policy_test.py inside cloud-custodian/.
git clone https://github.com/cloud-custodian/cloud-custodian.git && cd cloud-custodian/
python3.8 -m venv .venv
source .venv/bin/activate
pip install poetry
make install
mv /path_to_repo/ecc-aws-rulepack/.github/workflows/scripts/policy_as_test.py .
mv /path_to_repo/ecc-aws-rulepack/.github/workflows/scripts/green_policy_test.py .
mv /path_to_repo/ecc-aws-rulepack/.github/workflows/scripts/gcp_common.py tools/c7n_gcp/tests/.
touch policy_test.py
ℹ️ Note: The actions described above only need to be performed once, when you set up your test environment.
Next time you want to write or run a test, you only need to activate your Python environment with source .venv/bin/activate and you're ready to go.
We use GitHub CI to catch bugs and errors early in the rule development cycle and to test the quality of our rules. GitHub pipelines execute the Python tests stored in tests/<policy_name>/red_policy_test.py, running each rule against the infrastructure captured in the tests/<policy_name>/{placebo-green,placebo-red} folders. At the end of its execution, the pipeline reports whether the tests passed or failed. The infrastructure is not live: we use the Cloud Custodian flight-data capability, which records the responses of the cloud API calls and replays them during the tests.
The pipeline configuration is written in the ci.yaml file stored in the root directory of the repository. Pipelines are triggered when your changes are committed to the remote branch and approved by project team members.
For more information read this – Github CI/CD
Before you start working on tests, create the required directories: in tests/<policy_name>/, create the red_policy_test.py file and the placebo-green and placebo-red folders. Write the tests while your Terraform infrastructure is up and running. The red and green infrastructures can be deployed at the same time or separately, one after the other.
The tests expect to find exactly one red resource in the pre-recorded flight data inside placebo-red and zero resources in the placebo-green flight data.
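The directory layout described above can be created by hand, but a small helper makes it repeatable. The following is only a sketch: the function name scaffold_policy_test and its parameters are hypothetical and are not part of the repository tooling.

```python
from pathlib import Path


def scaffold_policy_test(tests_root, policy_name):
    """Create tests/<policy_name>/ with the test file and flight-data folders."""
    policy_dir = Path(tests_root) / policy_name
    # Folders that will hold the pre-recorded flight data.
    for folder in ("placebo-green", "placebo-red"):
        (policy_dir / folder).mkdir(parents=True, exist_ok=True)
    # touch() creates an empty file and leaves an existing file's content as is.
    (policy_dir / "red_policy_test.py").touch(exist_ok=True)
    return policy_dir
```

For example, scaffold_policy_test("tests", "ecc-aws-191-eks_cluster_protected_endpoint_access") would prepare the folder for the EKS rule used later on this page.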
After running a test with the command python3 policy_as_test.py record <policy_file> <policy_test> <output_dir>
you'll get two folders with output files:
- A folder with the recorded responses to the cloud API calls that Custodian makes during policy execution. It is named after the policy and is located inside the directory you specified as output_dir, so its path is <output_dir>/<policy_name>. Each response is saved as an individual JSON file whose name starts with <cloud_resource_name>. (for example, eks.DescribeCluster_1.json), so from the file names we can find out which calls were made to which services during policy execution.
- A folder with the rule execution results, the same as the one produced by the custodian run command. It is stored in cloud-custodian/<policy_name> and contains the custodian-run.log, metadata.json and resources.json files.
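To see at a glance which services and calls were recorded, the file names can be split into their parts. This is a minimal sketch that assumes the names look like the examples on this page (eks.DescribeCluster_1.json, kms.ListAliases_1.json); the pattern and the function name are inferred for illustration and are not taken from the recording tool itself.

```python
import re

# Assumed pattern, inferred from names such as "eks.DescribeCluster_1.json".
FLIGHT_FILE_PATTERN = re.compile(
    r"^(?P<service>[^.]+)\.(?P<operation>\w+)_(?P<index>\d+)\.json$"
)


def parse_flight_file_name(file_name):
    """Split a recorded file name into (service, operation, call index)."""
    match = FLIGHT_FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected flight data file name: {file_name}")
    return match["service"], match["operation"], int(match["index"])


print(parse_flight_file_name("eks.DescribeCluster_1.json"))
```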
The cloud-custodian/policy_test.py file must always have the following base:
class PolicyTest(object):

    def test_resources(self, base_test, resources):
        base_test.assertEqual(len(resources), 1)
After the base assertion, add as many assert statements as the rule requires: every check in the policy should be covered by the test. To find the field names, look at resources.json in the output folder.
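When resources.json is large, listing its key paths can make the field names easier to find. The helper below is a hypothetical sketch (list_field_paths is not part of Custodian); it walks nested dictionaries and lists and returns dotted paths like the ones used in policy filters.

```python
import json


def list_field_paths(value, prefix=""):
    """Return the sorted dotted key paths found in nested dicts and lists."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths.update(list_field_paths(child, path))
    elif isinstance(value, list):
        # List items share the path of the list itself.
        for child in value:
            paths.update(list_field_paths(child, prefix))
    return sorted(paths)


# Inline sample; in practice the data would be loaded from resources.json.
sample = json.loads(
    '[{"name": "demo", "resourcesVpcConfig": {"endpointPublicAccess": true}}]'
)
print(list_field_paths(sample))
```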
ℹ️ Note: While you are testing and creating flight data, you work with cloud-custodian/policy_test.py. Once cloud-custodian/policy_test.py and the red and green flight data are tested and ready, copy them to the corresponding file and folders in cloud_repository/tests/<policy_name>/.
Documentation about UnitTest Assert Methods - unittest — Unit testing framework
The complete list of Python assert statements – Python 3.x UnitTest Assert Methods
Tests can be roughly divided into four types:
- Simple – the test uses the output of the main policy resource (specified in "resource: [cloud].[resource_name]").
- Complex – the test uses the output of a sub-resource of the policy (e.g. the policy returns EC2 instances but checks the Security Groups (SGs) attached to them; in this case, the SG is a sub-resource).
- Require green test – rules that check time or dates and fail if only red_policy_test.py is present, so green_policy_test.py must also exist. These rules often use the value_type: age or value_type: expiration filter, but they are not limited to these filters.
- With pagination (AWS) – the resource returns a large number of items, and the output is paginated by the AWS default configuration. Because of this, a single API call won't return all existing resources, and you need to retrieve the next set of items using Marker (NextToken).
Simple tests
A test is simple when all the data required for testing is already contained in resources.json, with no need for additional API calls.
All you need to do in red_policy_test.py is refer to the required parameter.
Let's take a look at an example for AWS. We have a policy that has two checks:
policies:
  - name: ecc-aws-191-eks_cluster_protected_endpoint_access
    description: |
      EKS cluster endpoint does not have protected access
    resource: aws.eks
    filters:
      - type: value
        key: resourcesVpcConfig.endpointPublicAccess
        value: true
      - type: value
        key: resourcesVpcConfig.publicAccessCidrs
        value: "0.0.0.0/0"
        value_type: swap
        op: in
Rule output for red resource (resources.json):
[
    {
        "name": "191_private_cluster_red",
        "arn": "arn:aws:eks:us-east-1:123456789012:cluster/191_private_cluster_red",
        "createdAt": "2022-10-07T14:08:04.908000+03:00",
        "version": "1.23",
        "endpoint": "https://11CD582111C81C7B0247079E52EE1FF8.gr7.us-east-1.eks.amazonaws.com",
        "roleArn": "arn:aws:iam::123456789012:role/191_role_red",
        "resourcesVpcConfig": {
            "subnetIds": [
                "subnet-0659eb5f4a6605a1e",
                "subnet-08a3cb3e27534e65f"
            ],
            "securityGroupIds": [],
            "clusterSecurityGroupId": "sg-0be0f81ec0f58f76d",
            "vpcId": "vpc-0cec019829a8f0c0d",
            "endpointPublicAccess": true,
            "endpointPrivateAccess": true,
            "publicAccessCidrs": [
                "0.0.0.0/0"
            ]
        },
        "kubernetesNetworkConfig": {
            "serviceIpv4Cidr": "172.20.0.0/16"
        },
        ...
    }
]
Recorded files in the placebo-red and placebo-green folders:
- eks.DescribeCluster_1.json (default)
- eks.ListClusters_1.json
eks.DescribeCluster_1.json is the default file for this policy because Custodian builds resources.json from it. By default, policy_test.py only sees the resource information contained in this default file.
Test file red_policy_test.py looks like this:
class PolicyTest(object):

    def test_resources(self, base_test, resources):
        base_test.assertEqual(len(resources), 1)
        base_test.assertTrue(resources[0]['resourcesVpcConfig']['endpointPublicAccess'])
        base_test.assertEqual(resources[0]['resourcesVpcConfig']['publicAccessCidrs'], ['0.0.0.0/0'])
It adds two assertions to the base one:
- assertTrue, which matches the first filter in the policy and checks whether the endpointPublicAccess parameter is True;
- assertEqual, which matches the second filter in the policy. The filter checks whether "0.0.0.0/0" is in publicAccessCidrs, and the test asserts that the list equals ['0.0.0.0/0'].
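To try the assertions quickly without the recording script, the test class can be called directly. This sketch assumes that base_test behaves like a unittest.TestCase instance, which is how the assert methods are used on this page; the check_resources helper and the trimmed red_resource dictionary are made up for illustration.

```python
import unittest


class PolicyTest(object):

    def test_resources(self, base_test, resources):
        base_test.assertEqual(len(resources), 1)
        base_test.assertTrue(resources[0]['resourcesVpcConfig']['endpointPublicAccess'])
        base_test.assertEqual(resources[0]['resourcesVpcConfig']['publicAccessCidrs'], ['0.0.0.0/0'])


def check_resources(resources):
    """Return True if all assertions pass, False if any of them fails."""
    base_test = unittest.TestCase()  # assumed stand-in for the real base_test
    try:
        PolicyTest().test_resources(base_test, resources)
    except AssertionError:
        return False
    return True


# Trimmed-down copy of the red resource shown above.
red_resource = {
    "name": "191_private_cluster_red",
    "resourcesVpcConfig": {
        "endpointPublicAccess": True,
        "publicAccessCidrs": ["0.0.0.0/0"],
    },
}
print(check_resources([red_resource]))
```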
Complex tests
For complex rules, where the test needs data from a non-default JSON file, we use local_session. In this type of test, local_session makes additional API calls to another resource so that the test can access the JSON file with that response data.
In this case, the base test template changes slightly:
class PolicyTest(object):

    def test_resources_with_client(self, base_test, resources, local_session):
        base_test.assertEqual(len(resources), 1)
Let's take a look at an example for AWS:
policies:
  - name: ecc-aws-258-efs_is_encrypted_using_managed_cmk
    description: |
      EFS file systems are not encrypted using KMS CMK
    resource: efs
    filters:
      - or:
        - type: value
          key: Encrypted
          value: false
        - and:
          - type: value
            key: Encrypted
            value: true
          - type: kms-key
            key: KeyManager
            value: AWS
Recorded files in the placebo-red and placebo-green folders:
- elasticfilesystem.DescribeFileSystems_1.json (default)
- kms.DescribeKey_1.json
- kms.ListAliases_1.json
- tagging.GetResources_1.json
Rule output for red resource (resources.json):
[
    {
        "OwnerId": "123123181212",
        "CreationToken": "258_efs_red",
        "FileSystemId": "fs-09cad158119e84920",
        "FileSystemArn": "arn:aws:elasticfilesystem:us-east-1:123123181212:file-system/fs-09cad158119e84920",
        ...
        "Encrypted": true,
        "KmsKeyId": "arn:aws:kms:us-east-1:123123181212:key/f1222765-672a-4ed9-9390-5dad09bbfd84",
        "ThroughputMode": "bursting",
        "c7n:MatchedFilters": [
            "Encrypted"
        ],
        "c7n:matched-kms-key": [
            "f1222765-672a-4ed9-9390-5dad09bbfd84"
        ]
    }
]
The main file the test uses for this rule is elasticfilesystem.DescribeFileSystems_1.json, but we also need the data in kms.DescribeKey_1.json. To reach it, we create a client with local_session.client("kms"), where "kms" is the name of the service we access, and call its describe_key method, whose recorded response is stored in that file. Once we have the response, we can use assert methods to test our infrastructure.
Test file red_policy_test.py looks like this:
class PolicyTest(object):

    def test_resources_with_client(self, base_test, resources, local_session):
        base_test.assertEqual(len(resources), 1)
        base_test.assertTrue(resources[0]["Encrypted"])
        kms_key_client = local_session.client("kms")
        key = kms_key_client.describe_key(KeyId=resources[0]["KmsKeyId"])
        base_test.assertNotEqual(key["KeyMetadata"]["KeyManager"], "CUSTOMER")
Note that complex tests use a different method (def test_resources_with_client) than simple ones (def test_resources).
To make an API call to another service, we first create a client using the local_session.client() method (the kms_key_client assignment).
Then we use this client to call a specific method, here describe_key. Some methods expect request parameters; in this example, it's KeyId.
Having obtained information about the specific resource, we can run an assert method on it (the final assertNotEqual).
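The flow of a complex test can be illustrated with stand-ins for local_session and the KMS client. Everything named Fake... below is hypothetical and only mimics the shape of the calls used in the test; the real local_session replays the recorded kms.DescribeKey_1.json response instead. unittest.TestCase() is again an assumed substitute for base_test.

```python
import unittest


class FakeKmsClient:
    """Hypothetical stand-in for the KMS client; returns a canned response."""

    def __init__(self, key_manager):
        self.key_manager = key_manager

    def describe_key(self, KeyId):
        return {"KeyMetadata": {"KeyId": KeyId, "KeyManager": self.key_manager}}


class FakeSession:
    """Hypothetical stand-in for local_session."""

    def __init__(self, key_manager):
        self.key_manager = key_manager

    def client(self, service_name):
        if service_name != "kms":
            raise ValueError(f"no fake client for {service_name}")
        return FakeKmsClient(self.key_manager)


class PolicyTest(object):

    def test_resources_with_client(self, base_test, resources, local_session):
        base_test.assertEqual(len(resources), 1)
        base_test.assertTrue(resources[0]["Encrypted"])
        kms_key_client = local_session.client("kms")
        key = kms_key_client.describe_key(KeyId=resources[0]["KmsKeyId"])
        base_test.assertNotEqual(key["KeyMetadata"]["KeyManager"], "CUSTOMER")


def run_complex_test(resources, session):
    # Raises AssertionError if any assertion in the test fails.
    PolicyTest().test_resources_with_client(unittest.TestCase(), resources, session)


red_efs = {"Encrypted": True, "KmsKeyId": "f1222765-672a-4ed9-9390-5dad09bbfd84"}
run_complex_test([red_efs], FakeSession("AWS"))  # AWS-managed key: passes silently
```

With FakeSession("CUSTOMER") the last assertion fails, which is the behavior expected for a file system encrypted with a customer-managed key.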
Green & Red tests
As mentioned earlier, some rules that check time or dates require green_policy_test.py. This test is used when the time needs to be frozen at a particular date.
For the green test you should use the following template:
class PolicyTest(object):

    def test_resources(self, base_test, resources):
        base_test.assertEqual(len(resources), 0)

    def mock_time(self):
        return <year>, <month>, <day>
The mock_time method should return a date at which the policy returns 0 resources for the deployed green Terraform infrastructure.
Let's take a look at an example for AWS. We have a policy that returns Workspace images that are older than 90 days:
policies:
  - name: ecc-aws-493-workspaces_images_not_older_than_90_days
    resource: aws.workspaces-image
    description: |
      Workspaces images are older than 90 days
    filters:
      - type: value
        key: Created
        value_type: age
        value: 90
        op: ge
API call output for green resource:
{
    "status_code": 200,
    "data": {
        "Images": [
            {
                "ImageId": "wsi-wtp207q3f",
                "Name": "493_workspace_image_green",
                "Description": "493_workspace_image_green",
                "OperatingSystem": {
                    "Type": "LINUX"
                },
                "State": "PENDING",
                "RequiredTenancy": "DEFAULT",
                "Created": {
                    "__class__": "datetime",
                    "year": 2022,
                    "month": 6,
                    "day": 7,
                    "hour": 18,
                    "minute": 10,
                    "second": 43,
                    "microsecond": 35000
                },
                "OwnerAccountId": "644160558196"
            }
        ],
        "ResponseMetadata": {}
    }
}
Rule output for red resource:
[
    {
        "ImageId": "wsi-wtp207q3f",
        "Name": "493_workspace_image_red",
        "Description": "493_workspace_image_red",
        "OperatingSystem": {
            "Type": "LINUX"
        },
        "State": "PENDING",
        "RequiredTenancy": "DEFAULT",
        "Created": "2022-01-06T18:10:43.035000+00:00",
        "OwnerAccountId": "644160558196",
        "Tags": [],
        "c7n:MatchedFilters": [
            "Created"
        ]
    }
]
Test file green_policy_test.py looks like this:
class PolicyTest(object):

    def test_resources(self, base_test, resources):
        base_test.assertEqual(len(resources), 0)

    def mock_time(self):
        return 2022, 6, 7
Test file red_policy_test.py looks like this:
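from datetime import datetime, timedelta


class PolicyTest(object):

    def test_resources(self, base_test, resources):
        base_test.assertEqual(len(resources), 1)
        # Creation date of the red Workspace image, taken from resources.json.
        created_date = datetime.fromisoformat(str(resources[0]['Created']))
        # Pseudo-current date used for the comparison.
        time_now = datetime.fromisoformat('2022-05-06T02:00:00+00:00')
        datetime_90_days_ago = time_now - timedelta(days=90)
        # The image must be at least 90 days old relative to time_now.
        base_test.assertFalse(created_date > datetime_90_days_ago)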
Note the dates: the green test is executed as if it were the day the Workspace image was created for the green infrastructure (2022.06.07), so the policy returns 0 resources.
The red test uses a pseudo-current date (2022.05.06) that is 4 months after the Workspace image was created for the red infrastructure (2022.01.06). The final assertion checks the difference between the date when the image was created (2022.01.06) and the pseudo-current date (2022.05.06). Because the difference is greater than 90 days, the policy returns 1 resource.
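The date arithmetic behind the two tests can be checked in isolation. The sketch below uses the timestamps from the examples above; is_older_than is a hypothetical helper, and treating the mock_time tuple as midnight UTC is an assumption made for illustration, so Custodian's own age calculation may differ in detail.

```python
from datetime import datetime, timedelta, timezone


def is_older_than(created_iso, now, days):
    """Return True if the resource is at least `days` old at the moment `now`."""
    created = datetime.fromisoformat(created_iso)
    return now - created >= timedelta(days=days)


# Red: created 2022-01-06, evaluated at the pseudo-current date 2022-05-06.
red_now = datetime.fromisoformat("2022-05-06T02:00:00+00:00")
red_matches = is_older_than("2022-01-06T18:10:43.035000+00:00", red_now, 90)

# Green: created 2022-06-07, evaluated at the mocked date 2022-06-07
# (assumed here to mean midnight UTC).
green_now = datetime(2022, 6, 7, tzinfo=timezone.utc)
green_matches = is_older_than("2022-06-07T18:10:43.035000+00:00", green_now, 90)

print(red_matches, green_matches)
```

The red image is roughly 119 days old at the pseudo-current date, so it matches the 90-day threshold, while the green image does not.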
AWS auto-pagination in tests
Some AWS resources have a large number of items, so AWS automatically paginates the API call output on the server side. By default, AWS uses a page size determined by the individual service and retrieves all available items. For example, Amazon S3 has a default page size of 1000.
If the number of items output is fewer than the total number of items returned by the underlying API calls, the output includes a Marker/NextToken that you can pass to a subsequent command to retrieve the next set of items. If the previous command does not return a Marker/NextToken value, there are no more items to return and you do not need to call the command again.
For more details, you can read here - Using AWS CLI pagination options
Let's take a look at an example for AWS. We have a policy whose second filter (the db-parameter filter) checks a specific parameter value from the RDS Parameter Group attached to the RDS instance:
policies:
  - name: ecc-aws-385-postgresql_log_connections_flag_enabled
    resource: aws.rds
    description: |
      The 'log_connections' flag is disabled for PostgreSQL
    filters:
      - and:
        - type: value
          key: Engine
          value: postgres
        - not:
          - type: db-parameter
            key: log_connections
            value: 1
An RDS Parameter Group has a large number of parameters, so in the Python test we have to make several consecutive API calls to the Parameter Group before we find the parameter we are searching for, in our case 'log_connections'.
Test file red_policy_test.py looks like this:
class PolicyTest(object):

    def test_resources_with_client(self, base_test, resources, local_session):
        base_test.assertEqual(len(resources), 1)
        base_test.assertEqual(resources[0]['Engine'], "postgres")
        parameter_group_name = resources[0]["DBParameterGroups"][0]["DBParameterGroupName"]
        rds_client = local_session.client("rds")
        describe_parameters = rds_client.describe_db_parameters(DBParameterGroupName=parameter_group_name)
        while True:
            # Check every page, including the first and the last one.
            for parameter in describe_parameters["Parameters"]:
                if parameter["ParameterName"] == "log_connections":
                    base_test.assertNotIn('ParameterValue', parameter)
            # A missing Marker means there are no more pages to request.
            marker = describe_parameters.get("Marker")
            if marker is None:
                break
            describe_parameters = rds_client.describe_db_parameters(
                DBParameterGroupName=parameter_group_name, Marker=marker)
First, we find the name of the Parameter Group attached to the RDS instance (parameter_group_name).
Next, we make the first describe_db_parameters call to get the first batch of parameters and the Marker, which we need to get the remaining items of the Parameter Group.
The while loop then makes call after call to find the 'log_connections' parameter; if the parameter is found, it's tested with the assertNotIn method.
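If several tests need the same Marker handling, the loop could be factored into a generator. This is only a sketch: iter_db_parameters and FakeRdsClient are hypothetical names, and the fake client serves two in-memory pages instead of recorded flight data. The describe_db_parameters call and its DBParameterGroupName and Marker parameters are the ones used in the test above.

```python
def iter_db_parameters(rds_client, parameter_group_name):
    """Yield every parameter of the group, following Marker across pages."""
    kwargs = {"DBParameterGroupName": parameter_group_name}
    while True:
        page = rds_client.describe_db_parameters(**kwargs)
        yield from page["Parameters"]
        marker = page.get("Marker")
        if marker is None:
            return  # last page reached
        kwargs["Marker"] = marker


class FakeRdsClient:
    """Hypothetical stand-in that serves two pages of parameters."""

    PAGES = {
        None: {"Parameters": [{"ParameterName": "application_name"}], "Marker": "page-2"},
        "page-2": {"Parameters": [{"ParameterName": "log_connections"}]},
    }

    def describe_db_parameters(self, DBParameterGroupName, Marker=None):
        return self.PAGES[Marker]


names = [p["ParameterName"] for p in iter_db_parameters(FakeRdsClient(), "demo-group")]
print(names)
```

In a test, the generator would receive local_session.client("rds") and the assertions would run inside a single for loop over its output.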
After you've prepared cloud-custodian/policy_test.py, you can record a set of calls, save them to data files and replay them later. Execute this command twice, once for the red and once for the green infrastructure, and do not forget to change the output folder so that the red and green output stay separate.
cd cloud-custodian
python3 policy_as_test.py record /path_to_cloud_repo/policies/policy.yml policy_test /path_to_output_dir
- policy_as_test.py – the script which is used for recording and replaying the tests
- record – the command for recording flight data
- policy.yml – path to rule that you're testing
- policy_test – python test file
- path_to_output_dir – is a path to a directory where you want responses with flight data in JSON to be stored
For example, you can run the commands like this:
cd cloud-custodian
python3 policy_as_test.py record ~/ecc-aws-rulepack/policies/ecc-aws-191-eks_cluster_protected_endpoint_access.yml policy_test ~/191_rule/green
python3 policy_as_test.py record ~/ecc-aws-rulepack/policies/ecc-aws-191-eks_cluster_protected_endpoint_access.yml policy_test ~/191_rule/red
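Because the same command is run for both green and red output folders, it can help to build it in one place. The helper below is a hypothetical convenience that only assembles the argument list in the order shown above (mode, policy file, test module, output directory); it does not run anything by itself, and the /tmp paths are placeholders.

```python
import shlex


def build_policy_test_command(mode, policy_file, output_dir, test_module="policy_test"):
    """Assemble the policy_as_test.py command line as a list of arguments."""
    if mode not in ("record", "replay"):
        raise ValueError(f"unsupported mode: {mode}")
    return ["python3", "policy_as_test.py", mode, policy_file, test_module, output_dir]


for kind in ("green", "red"):
    command = build_policy_test_command(
        "record", "policies/policy.yml", f"/tmp/191_rule/{kind}"
    )
    # shlex.join (Python 3.8+) renders the list as a copy-pasteable shell command.
    print(shlex.join(command))
```

The resulting list could be passed to subprocess.run(command, check=True, cwd="cloud-custodian") once the matching infrastructure is deployed.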
If you ran the test while both the red and green infrastructures were deployed at the same time, you have to delete the unwanted resources:
- Inside the placebo-green folder, edit each recorded JSON file to keep only created green infrastructure (remove all other resources except for green infrastructure).
- Inside the placebo-red folder, edit each recorded JSON file to keep only created red infrastructure (remove all other resources except for red infrastructure).
If the folder has multiple files with the same name but different numbering (ecs.ListClusters_1.json and ecs.ListClusters_2.json), find the file that contains the green/red infrastructure and delete the other files with the same name.
If the file with the green/red infrastructure is ecs.ListClusters_2.json, rename it to ecs.ListClusters_1.json.
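Editing the recorded files by hand is error-prone when they contain many resources. The sketch below shows one way to keep only the items that belong to one infrastructure, assuming the recorded file has the {"status_code": ..., "data": {...}} shape shown in the Workspaces example and that resource names carry a _green or _red suffix. The function name, its parameters and the sample image IDs are made up for illustration.

```python
def keep_only_matching(recorded, list_key, field, marker):
    """Return a copy of a recorded response that keeps only matching items."""
    data = dict(recorded["data"])
    data[list_key] = [
        item for item in data[list_key] if marker in str(item.get(field, ""))
    ]
    return {**recorded, "data": data}


recorded = {
    "status_code": 200,
    "data": {
        "Images": [
            {"ImageId": "wsi-1", "Name": "493_workspace_image_green"},
            {"ImageId": "wsi-2", "Name": "493_workspace_image_red"},
        ],
        "ResponseMetadata": {},
    },
}
green_only = keep_only_matching(recorded, "Images", "Name", "_green")
print([image["Name"] for image in green_only["data"]["Images"]])
```

The input dictionary is left unchanged, so the same recorded response can be filtered once for placebo-green and once for placebo-red.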
After you prepared files in the previous step, you have to replay them to check that everything is correct.
python3 policy_as_test.py replay /path_to_cloud_repo/policies/policy.yml policy_test /path_to_output_dir_green
For the green test, the correct output is 0 resources.
python3 policy_as_test.py replay /path_to_cloud_repo/policies/policy.yml policy_test /path_to_output_dir_red
For the red test, the correct output is 1 resource.
When you already have cloud-custodian/policy_test.py and red and green flight data tested and ready, your next step is to copy it to the corresponding file and folders in cloud_repository/tests/<policy_name>/.
- cloud-custodian/policy_test.py → cloud_repository/tests/<policy_name>/red_policy_test.py
- /path_to_output_dir_green/*.json → cloud_repository/tests/<policy_name>/placebo-green/*.json
- /path_to_output_dir_red/*.json → cloud_repository/tests/<policy_name>/placebo-red/*.json
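The copy step can also be scripted. The following sketch mirrors the three mappings listed above; publish_policy_test and its parameters are hypothetical names, and the paths are supplied by the caller.

```python
import shutil
from pathlib import Path


def publish_policy_test(policy_test_file, green_dir, red_dir, tests_root, policy_name):
    """Copy the test file and flight data into tests/<policy_name>/."""
    target = Path(tests_root) / policy_name
    target.mkdir(parents=True, exist_ok=True)
    # policy_test.py becomes red_policy_test.py in the rule repository.
    shutil.copyfile(policy_test_file, target / "red_policy_test.py")
    for source_dir, folder in ((green_dir, "placebo-green"), (red_dir, "placebo-red")):
        destination = target / folder
        destination.mkdir(exist_ok=True)
        # Only the recorded JSON responses are copied.
        for json_file in sorted(Path(source_dir).glob("*.json")):
            shutil.copy2(json_file, destination / json_file.name)
    return target
```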