Pktgen for CI #148

Merged
merged 52 commits into from
Aug 6, 2019
74676e5
Save performance in ci
kevindweb Jun 5, 2019
3a97ae4
Add flask to install
kevindweb Jun 5, 2019
ef255d0
Allowed for public linting of unauthorized users
koolzz Jun 5, 2019
8ba65de
Starting the queue
koolzz Jun 6, 2019
3a1874b
Start to fixing the queue problem
koolzz Jun 7, 2019
bf5e2a8
Merge branch 'develop' of https://github.com/kevindweb/openNetVM into…
koolzz Jun 8, 2019
17882a6
Added run mode for future advancements
koolzz Jun 8, 2019
1c2f52d
Major fixes to ci list system
koolzz Jun 8, 2019
29bbef6
Remove debug lines
koolzz Jun 8, 2019
a1749cb
Clean up definitions
koolzz Jun 8, 2019
0c3a90b
Remove exit statement
koolzz Jun 8, 2019
45fdf70
Merge branch 'develop' of https://github.com/sdnfv/openNetVM-dev into…
koolzz Jun 9, 2019
c10dbb4
Add error handling in run_ci and fix docs
kevindweb Jun 9, 2019
5b06b38
Create polling thread for requests
kevindweb Jun 10, 2019
12935e3
Delete new line
kevindweb Jun 11, 2019
b771745
Account for run mode from #140
kevindweb Jun 12, 2019
04380b1
Fixes to merge run mode conflict
kevindweb Jun 12, 2019
e7639ef
Merge branch 'ci_performance_update' into ci_queue
kevindweb Jun 12, 2019
ed65879
Errors are much more difficult in threads :(
kevindweb Jun 12, 2019
3b01660
Smarter than using tuples
kevindweb Jun 12, 2019
58a13ff
Merge branch 'develop' of https://github.com/kevindweb/openNetVM into…
kevindweb Jun 12, 2019
4bef1aa
Fix merge conflict
kevindweb Jun 12, 2019
05a89f8
Added event-based handling instead of polling
onvmstats Jun 17, 2019
24965be
Let request handler clears event
onvmstats Jun 19, 2019
7e71ec2
Initialize ci for pktgen
kevindweb Jun 22, 2019
ee2220c
Huge update to pktgen configuration
kevindweb Jun 24, 2019
1c68a5f
Updated gitignore for log files
kevindweb Jun 24, 2019
32766cd
Added speed tester and pktgen in conjunction
kevindweb Jun 25, 2019
709c61b
Fix redundancies and bugs
kevindweb Jun 28, 2019
a07c457
Merge branch 'develop' of https://github.com/sdnfv/openNetVM into ci-…
kevindweb Jun 29, 2019
89af507
Disabled flow lookup and created better benchmarks
kevindweb Jun 29, 2019
e57a6df
Fixed things so nn30 works with pktgen
kevindweb Jun 30, 2019
6f53017
Fixes from github comments
kevindweb Jul 4, 2019
b270224
Make results more stable by restarting manager
kevindweb Jul 6, 2019
5fa102d
Added fixes from review and redundancies
kevindweb Jul 7, 2019
8c48dd6
Fixes from github review
kevindweb Jul 9, 2019
cbd1a61
Additions to readme about CI developments
kevindweb Jul 9, 2019
eb39bc1
Add a section about Pktgen helper nodes
kevindweb Jul 10, 2019
e787caf
Merge branch 'develop' of https://github.com/sdnfv/openNetVM into ci-…
kevindweb Jul 10, 2019
d6774f3
Merge branch 'ci-pktgen' of https://github.com/kevindweb/openNetVM in…
kevindweb Jul 10, 2019
93b94ee
Fix missed line
kevindweb Jul 10, 2019
5a652cb
Add compatability for different interfaces
kevindweb Jul 13, 2019
d1e83bd
Fixes for user input and debugging
kevindweb Jul 14, 2019
4f8fc06
Fixes for Pktgen and binding interface
kevindweb Jul 25, 2019
fcd3b1a
Renaming a ton of files
kevindweb Aug 2, 2019
ee97651
Fixed symbolic link
kevindweb Aug 2, 2019
2d305f7
Added benchmark info to README and renamed file
kevindweb Aug 3, 2019
4f9928b
Fix Pktgen install requested changes
kevindweb Aug 5, 2019
fd6332d
Merge branch 'develop' of https://github.com/sdnfv/openNetVM into ci-…
kevindweb Aug 5, 2019
a9168f9
Address Github change requests
kevindweb Aug 6, 2019
0f3dfd2
Adjustments to interface configuration
kevindweb Aug 6, 2019
381f27d
Fixed manager debug
kevindweb Aug 6, 2019
6 changes: 4 additions & 2 deletions ci/.gitignore
@@ -1,4 +1,3 @@
access_log
config
webhook-config.json
encrypted_secret.bin
@@ -10,4 +9,7 @@ repository
linter-output.txt
*key
*.pub
*.stats
*stats
*out*
*log*
nimbnode*
37 changes: 34 additions & 3 deletions ci/README.md
@@ -69,7 +69,7 @@ The CI process can be broken into multiple steps:

5. Run linter on the checked out code

Runs the `run_linter` function in `helper-functions.sh`
Runs the `run_linter` function in `helper-manager-functions.sh`

6. Clean up and restart all worker nodes

@@ -83,11 +83,15 @@ The CI process can be broken into multiple steps:

Use paramiko to ssh and run `run-workload.py`

9. Acquire results from the worker nodes
9. Run modes are supplied to tell the worker which applications to test

Handle installation with `worker_files/worker.sh` for builds, and set up the manager for performance tests

10. Acquire results from the worker nodes

Use scp to copy the result stat file from worker

10. Submit results as a comment on github
11. Submit results as a comment on github

Uses the `post-msg.py` script

@@ -106,6 +110,33 @@ ProxyPassReverse /onvm-ci/ http://nimbnode44:8080/
```
(Also need to setup github webhook to post to **http://nimbus.seas.gwu.edu/onvm-ci/github-webhook**)

### Public and Private CI Runs

CI now accepts requests from unauthenticated users. A list of GitHub users is allowed to trigger a full run on the public project, and anyone who can view the private `-dev` repository can run CI there as well. In `openNetVM`, if a user is not on that list, only the linter and branch checks are executed, and the statistics calculations from the worker nodes are skipped.
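
The decision described above can be sketched as a small allow-list check. This is only an illustration: the list file format (one GitHub username per line) and both function names are assumptions, not something taken from the CI scripts.

```shell
#!/bin/bash
# Hypothetical sketch: decide how much of CI to run for a given GitHub user.
# Assumes an allow-list file with one username per line (format not from the PR).

# Succeeds if user $1 appears as a whole line in list file $2.
is_authorized_user() {
    local user="$1" list_file="$2"
    grep -Fxq -- "$user" "$list_file"
}

# Prints "full" for listed users and "lint-only" for everyone else.
ci_scope_for_user() {
    if is_authorized_user "$1" "$2"; then
        echo "full"
    else
        echo "lint-only"
    fi
}
```

Matching whole lines with `grep -Fx` avoids treating a username that is only a prefix of a listed name as authorized.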

### Setting Up a Connected Worker

Connecting two nodes is useful for measuring statistics with tools like Pktgen and the mTCP stack. Some setup is required to get a working connection. First, an SFP+ 10Gb Intel cable is required to connect the Network Interface Cards in the two machines. Once the cable is in place, bring up the correct interfaces for a stable connection. Some debugging might be required:
- If you don't know which `ifconfig -a` interface is correct, use `ethtool -p <interface name> 120`
- This will blink a light on the interface for 120 seconds (you have to be next to the machine for this to help)
- Do this on both machines to find the names of the interfaces that are linked
- Run `sudo ifconfig <interface name> 11.0.0.1/24 up` on the first machine and `sudo ifconfig <interface name> 11.0.0.2/24 up` on the second
- This will ensure `ping` understands what IP address it is supposed to talk to
- If `ping -I <interface> 11.0.0.2` on the first machine works, the link is up; if not, try changing the IP addresses or checking `dmesg`
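
The bring-up steps in the list above can be collected into one helper. This is a sketch under assumptions: the function name and the `DRY_RUN` switch are hypothetical, and by default the commands are only printed so the snippet is safe to run on any machine.

```shell
#!/bin/bash
# Sketch of the interface bring-up and ping check described above.
# link_bringup and DRY_RUN are assumed names, not part of the CI scripts.

# $1 = interface name, $2 = local address in CIDR form, $3 = peer address to ping
link_bringup() {
    local iface="$1" local_addr="$2" peer_addr="$3"
    local run="eval"
    # Print the commands instead of running them unless DRY_RUN=0 is set.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi

    $run "sudo ifconfig $iface $local_addr up"
    $run "ping -c 3 -I $iface $peer_addr"
}
```

On the first machine this would be called as `link_bringup <interface name> 11.0.0.1/24 11.0.0.2`, and on the second with the two addresses swapped.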

Now that the interfaces are connected, choose which machine will be the CI worker and which will be the helper (running Pktgen, for example). Install Pktgen on the helper by copying the `ci/install_pktgen` files to that machine's home folder. *Remember public keys must be created for all new machines*. Store these public keys in a folder with the server name; see the next section on statistics for more information. Run `chmod +x install-pktgen.sh` if it's not already executable, then run `./install-pktgen.sh` to install everything. If there are dependency errors, the machine might be running a different OS version, so try installing the necessary packages manually. Once everything is installed, test ONVM->Pktgen between the machines; if a connection is established, CI should work with no further setup!

### Advanced Statistics

As CI gained more programs to test with, benchmarks were added to track the average performance of each worker. In the future, CI will be able to handle multiple workers running many different tests. Server configurations differ, some with different hardware (Intel x710 vs. x520 NIC, for example), so performance will not be the same across nodes. What matters for CI is that the result of a run is the same or better, not globally across all nodes, but relative to the specific server it ran on. For each worker, create a folder in the `ci` directory named after the worker (the same IP or hostname the manager uses for it). For example, if `nimbnode17` is the current worker, a folder with path `/ci/nimbnode17/` should exist. This folder should contain at least three files. First, a `benchmarks` file (used by the manager) should look similar to this:

```
AVG_SPEED_TESTER_SPEED=40000000
AVG_PKTGEN_SPEED=10000000
AVG_MTCP_SPEED=.230
```
This is a configuration file, sourced by the manager to keep track of `nimbnode17`'s average performance for each test (currently Speed Tester, Pktgen, and mTCP). The other two files in the folder should be the two public keys, one for the worker, and the second for the worker's client server. Check the previous section on setting up a connection for more information.
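
As a rough illustration of the "same or better on this server" rule, the sketch below compares a measured rate with a sourced benchmark value. The function name and the optional tolerance argument are assumptions, and the comparison assumes a metric where higher is better (such as a packet rate); the analysis scripts the manager actually calls are not shown in this diff.

```shell
#!/bin/bash
# Sketch: compare a measured rate against a per-worker benchmark value.
# meets_benchmark and its tolerance argument are assumed, not part of the CI scripts.

# $1 = measured value, $2 = benchmark average, $3 = allowed shortfall in percent (default 0)
# Returns success when measured >= average * (1 - tolerance/100).
meets_benchmark() {
    awk -v m="$1" -v avg="$2" -v tol="${3:-0}" \
        'BEGIN { exit !(m >= avg * (1 - tol / 100)) }'
}

# Example use after sourcing a worker's benchmarks file:
#   . ./nimbnode17/benchmarks
#   meets_benchmark 10250000 "$AVG_PKTGEN_SPEED" && echo "Pktgen result is at or above average"
```

`awk` is used because the shell's built-in arithmetic does not handle the fractional values that can appear in the file.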

### Checking if Online

To check whether the CI build is offline, or to make sure it is listening for events, run `curl http://nimbus.seas.gwu.edu/onvm-ci/status`. If that URL returns 404, CI is offline. Otherwise it will display a message saying it is online.
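
A scripted version of this check might look like the sketch below. Only the URL and the 404 rule come from the README; the function names and the extra "unreachable" case (for when no HTTP response arrives at all) are assumptions.

```shell
#!/bin/bash
# Sketch of the status check; function names are hypothetical.

STATUS_URL="http://nimbus.seas.gwu.edu/onvm-ci/status"

# Maps an HTTP status code to a CI state. 404 means offline (README rule).
# "000" or an empty code means curl got no HTTP response (added assumption).
ci_state_from_code() {
    case "$1" in
        404) echo "offline" ;;
        000|"") echo "unreachable" ;;
        *) echo "online" ;;
    esac
}

# Fetches only the HTTP status code and reports the CI state.
check_ci_online() {
    local code
    code=$(curl -s -o /dev/null -w '%{http_code}' "$STATUS_URL") || true
    ci_state_from_code "$code"
}
```

Calling `check_ci_online` prints one of `online`, `offline`, or `unreachable`.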
2 changes: 1 addition & 1 deletion ci/ci_busy.sh
@@ -2,7 +2,7 @@

set -e

. helper-functions.sh
. helper-manager-functions.sh
SCRIPT_LOC=$(pwd)

print_header "Validating Config File and Sourcing Variables"
45 changes: 6 additions & 39 deletions ci/helper-functions.sh → ci/helper-manager-functions.sh
@@ -38,45 +38,6 @@ print_header() {
echo ""
}

# sets up dpdk, sets env variables, and runs the install script
install_env() {
git submodule sync
git submodule update --init

echo export ONVM_HOME=$(pwd) >> ~/.bashrc
export ONVM_HOME=$(pwd)

cd dpdk

echo export RTE_SDK=$(pwd) >> ~/.bashrc
export RTE_SDK=$(pwd)

echo export RTE_TARGET=x86_64-native-linuxapp-gcc >> ~/.bashrc
export RTE_TARGET=x86_64-native-linuxapp-gcc

echo export ONVM_NUM_HUGEPAGES=1024 >> ~/.bashrc
export ONVM_NUM_HUGEPAGES=1024

echo $RTE_SDK

sudo sh -c "echo 0 > /proc/sys/kernel/randomize_va_space"

cd ../
pwd
. ./scripts/install.sh
}

# makes all onvm code
build_onvm() {
cd onvm
make clean && make
cd ../

cd examples
make clean && make
cd ../
}

# obtains core config in cores.out file
obtain_core_config() {
cd scripts
@@ -140,3 +101,9 @@ run_linter() {
fi
done
}

# inputs are key_file, worker ip address, stats file - in that order 1,2,3
fetch_files() {
scp -i $1 -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null $2:$3 ./$2.$3
check_exit_code "ERROR: Failed to fetch results from $2"
}
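
To make the naming in `fetch_files` concrete: the local copy is written to `./<worker>.<stats file>`, which is the path the manager later passes to the analysis scripts. The sketch below only prints the resulting path and command so it can be run without a worker; both helper names are hypothetical.

```shell
#!/bin/bash
# Sketch only: shows how fetch_files names its local copy.
# local_stats_path and fetch_files_cmd are assumed helper names.

# $1 = worker ip/hostname, $2 = remote stats file name
local_stats_path() {
    echo "./$1.$2"
}

# Prints the scp command that fetch_files would run.
# $1 = key file, $2 = worker ip/hostname, $3 = remote stats file name
fetch_files_cmd() {
    printf 'scp -i %s -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null %s:%s %s\n' \
        "$1" "$2" "$3" "$(local_stats_path "$2" "$3")"
}
```

For example, `fetch_files_cmd key nimbnode17 pktgen_stats` prints a command whose destination is `./nimbnode17.pktgen_stats`.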
1 change: 1 addition & 0 deletions ci/install_pktgen/helper-worker-functions.sh
41 changes: 41 additions & 0 deletions ci/install_pktgen/install-pktgen.sh
@@ -0,0 +1,41 @@
#!/bin/bash

. helper-install-functions.sh

set -e

sudo rm -rf repository

git clone https://github.com/sdnfv/openNetVM.git repository
check_exit_code "ERROR: Failed cloning"

print_header "Installing Dependencies"
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install -y build-essential linux-headers-$(uname -r) git
sudo apt-get install -y libnuma1
sudo apt-get install -y libnuma-dev
sudo apt-get install -y libpcap-dev
sudo apt-get install -y libreadline-dev

print_header "Installing Lua"
curl -R -O http://www.lua.org/ftp/lua-5.3.5.tar.gz
tar zxf lua-5.3.5.tar.gz
cd lua-5.3.5
sudo make linux test
sudo make install

# return to the cloned repository (the current directory is still lua-5.3.5)
cd ~/repository

print_header "Installing Environment"
install_env $RUN_PKT
check_exit_code "ERROR: Installing environment failed"

print_header "Make pktgen-dpdk"
cd ~/repository/tools/Pktgen/pktgen-dpdk/
make

print_header "Updating lua script"
cp ~/pktgen-timed-config.lua ~/repository/tools/Pktgen/openNetVM-Scripts/pktgen-config.lua

print_header "Pktgen installed"
110 changes: 110 additions & 0 deletions ci/install_pktgen/pktgen-timed-config.lua
@@ -0,0 +1,110 @@
-- openNetVM
-- https://github.com/sdnfv/openNetVM
--
-- BSD LICENSE
--
-- Copyright(c)
-- 2015-2016 George Washington University
-- 2015-2016 University of California Riverside
-- All rights reserved.

-- Redistribution and use in source and binary forms, with or without
-- modification, are permitted provided that the following conditions
-- are met:

-- Redistributions of source code must retain the above copyright
-- notice, this list of conditions and the following disclaimer.
-- Redistributions in binary form must reproduce the above copyright
-- notice, this list of conditions and the following disclaimer in
-- the documentation and/or other materials provided with the
-- distribution.
-- The name of the author may not be used to endorse or promote
-- products derived from this software without specific prior
-- written permission.

-- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-- "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-- LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-- A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-- OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-- SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-- LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-- DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-- THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-- (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-- OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-- Change any of the settings below to configure Pktgen-DPDK

-- A list of the test script for Pktgen and Lua.
-- Each command somewhat mirrors the pktgen command line versions.
-- A couple of the arguments have been changed to be more like the others.

package.path = package.path ..";?.lua;test/?.lua;app/?.lua;"

require "Pktgen"

local function doWait(port, waitTime)
local idx;

pktgen.delay(1000);

pkt_rate_file = io.open("port_stats", "w");

if ( waitTime == 0 ) then
return;
end
waitTime = waitTime - 1;

-- Try to wait for the total number of packets to be sent.
local idx = 0;
while( idx < waitTime ) do
-- Write port stats to output file separated by line
pkt_rate_file:write(pktgen.portStats("all", "rate")[0]["pkts_rx"] .. "\n");
idx = idx + 1;

local sending = pktgen.isSending(port);
if ( sending[tonumber(port)] == "n" ) then
break;
end
pktgen.delay(1000);
end

pkt_rate_file:close()
end

printf("Lua Version : %s\n", pktgen.info.Lua_Version);
printf("Pktgen Version : %s\n", pktgen.info.Pktgen_Version);
printf("Pktgen Copyright : %s\n", pktgen.info.Pktgen_Copyright);

prints("pktgen.info", pktgen.info);

printf("Port Count %d\n", pktgen.portCount());
printf("Total port Count %d\n", pktgen.totalPorts());


-- set up a mac address to set flow to
--
-- TO DO LIST:
--
-- Please update this part with the destination mac address, source and destination ip address you would like to send packets to

pktgen.set_mac("0", "90:e2:ba:5e:73:21");
pktgen.set_ipaddr("0", "dst", "10.11.1.17");
pktgen.set_ipaddr("0", "src", "10.11.1.16");

pktgen.set_proto("all", "udp");
pktgen.set_type("all", "ipv4");

pktgen.set("all", "size", 64)
pktgen.set("all", "burst", 32);
pktgen.set("all", "sport", 1234);
pktgen.set("all", "dport", 1234);
pktgen.set("all", "count", 1000000000);
pktgen.set("all", "rate",100);

pktgen.vlan_id("all", "start", 1);

pktgen.start("all");
doWait("all", 30);
pktgen.quit();
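
The script above writes one `pkts_rx` rate per line to `port_stats`, roughly one sample per second. As a hedged sketch of how such a file could be reduced to a single number, the helper below averages the samples. It is not `pktgen-analysis.py`, whose contents are not shown in this diff, and `average_rate` is an assumed name.

```shell
#!/bin/bash
# Sketch: average the per-sample rx rates that doWait writes to port_stats.
# average_rate is a hypothetical helper, not the CI's analysis script.

# $1 = path to a file with one numeric rate per line; blank lines are skipped.
# Prints the truncated integer average, or 0 if the file has no samples.
average_rate() {
    awk 'NF { sum += $1; n++ } END { if (n) printf "%d\n", sum / n; else print 0 }' "$1"
}
```

Usage would be `average_rate port_stats`, with the result then compared to the worker's `AVG_PKTGEN_SPEED`.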
43 changes: 30 additions & 13 deletions ci/manager.sh
@@ -4,7 +4,7 @@
set -e

# source helper functions file
. helper-functions.sh
. helper-manager-functions.sh
SCRIPT_LOC=$(pwd)

print_header "Validating Config File and Sourcing Variables"
@@ -78,7 +78,7 @@ fi
print_header "Cleaning up Old Results"

sudo rm -f *.txt
sudo rm -rf stats
sudo rm -rf *stats
sudo rm -rf repository

print_header "Checking Worker and GitHub Creds Exist"
@@ -127,7 +127,6 @@ then
fi

print_header "Preparing Workers"

for worker_tuple in "${WORKER_LIST[@]}"
do
tuple_arr=($worker_tuple)
@@ -157,10 +156,16 @@ do
tuple_arr=($worker_tuple)
worker_ip="${tuple_arr[0]}"
worker_key_file="${tuple_arr[1]}"
scp -i $worker_key_file -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -r ./repository $worker_ip:
check_exit_code "ERROR: Failed to copy ONVM files to $worker_ip"
scp -i $worker_key_file -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null helper-functions.sh worker.sh $worker_ip:
# make sure the config file is updated with the correct run mode
sed -i "/WORKER_MODE*/c\\WORKER_MODE=\"${RUN_MODE}\"" worker_files/worker-config
# create directory for scp
mkdir temp
# put all files in one temporary folder for one scp
cp -r ./$worker_ip/* ./repository ./worker_files/* temp
scp -i $worker_key_file -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -r ./temp/* $worker_ip:
check_exit_code "ERROR: Failed to copy ONVM files to $worker_ip"
# get rid of the temp folder now for next worker
sudo rm -rf temp
done
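
A note on the `sed` call in this hunk: the pattern `/WORKER_MODE*/` is a regular expression in which `E*` means "zero or more E", so it matches any line containing `WORKER_MOD`. A stricter alternative, sketched below under the assumption that `worker-config` holds a line starting with `WORKER_MODE=`, anchors the match and uses the `s` command. The function name is hypothetical, and it prints the result instead of editing the file in place.

```shell
#!/bin/bash
# Sketch: rewrite the WORKER_MODE line of a config file.
# set_worker_mode is an assumed name; the real manager edits the file with sed -i.

# Prints config file $1 with its WORKER_MODE line replaced by mode $2.
set_worker_mode() {
    sed "s/^WORKER_MODE=.*/WORKER_MODE=\"$2\"/" "$1"
}
```

Lines that do not start with `WORKER_MODE=` pass through unchanged.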

print_header "Running Workloads on Workers"
@@ -174,18 +179,30 @@ do
done

print_header "Obtaining Performance Results from all workers"

rm -f results_summary.stats

for worker_tuple in "${WORKER_LIST[@]}"
do
tuple_arr=($worker_tuple)
worker_ip="${tuple_arr[0]}"
worker_key_file="${tuple_arr[1]}"
scp -i $worker_key_file -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null $worker_ip:stats ./$worker_ip.stats
check_exit_code "ERROR: Failed to fetch results from $worker_ip"
# TODO: this will overwrite results if we have more than 1 worker, investigate this case
python3 speed-tester-analysis.py ./$worker_ip.stats $worker_ip results_summary.stats
# get the benchmarks for each node (some servers are faster)
. ./$worker_ip/benchmarks
# TODO: this will overwrite results if we have more than 1 worker, investigate this case
if [[ "$RUN_MODE" -eq "0" ]]
then
# fetch pktgen stats
fetch_files $worker_key_file $worker_ip pktgen_stats
python3 pktgen-analysis.py ./$worker_ip.pktgen_stats $worker_ip pktgen_summary.stats $AVG_PKTGEN_SPEED
check_exit_code "Failed to parse Pktgen stats"
# fetch speed_tester stats
fetch_files $worker_key_file $worker_ip speed_stats
python3 speed-tester-analysis.py ./$worker_ip.speed_stats $worker_ip speed_summary.stats $AVG_SPEED_TESTER_SPEED
check_exit_code "Failed to parse Speed Tester stats"
else
# only fetch speed tester stats if mode is not 0
fetch_files $worker_key_file $worker_ip speed_stats
python3 speed-tester-analysis.py ./$worker_ip.speed_stats $worker_ip speed_summary.stats $AVG_SPEED_TESTER_SPEED
check_exit_code "Failed to parse Speed Tester stats"
fi
check_exit_code "ERROR: Failed to analyze results from $worker_ip"
done
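
Both branches of the run-mode check above fetch and analyze the Speed Tester stats; mode 0 additionally handles Pktgen. One way to express that without repeating the Speed Tester lines is to compute the list of stats files first. This is a sketch of a possible restructuring, not the merged code, and `stats_for_mode` is an assumed name.

```shell
#!/bin/bash
# Sketch: list the stats files to fetch for a given run mode.
# stats_for_mode is hypothetical; the file names match those used by the manager.

# $1 = run mode. Prints one stats file name per line.
stats_for_mode() {
    # Pktgen stats are only produced when the run mode is 0.
    if [ "$1" -eq 0 ]; then
        echo "pktgen_stats"
    fi
    # Speed Tester stats are fetched in every mode.
    echo "speed_stats"
}
```

The manager loop could then iterate over `$(stats_for_mode "$RUN_MODE")` and call `fetch_files` once per name.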
