DAOS-6923 test: Offline Reintegration - More tests #4835

Merged 56 commits, Mar 30, 2021
Changes from 8 commits
112adfa
DAOS-6923 test: Offline Reintegration - More tests
rpadma2 Mar 3, 2021
8e7300e
Merge branch 'master' into daos_6923
rpadma2 Mar 4, 2021
9bc2302
DAOS-6923 test: Added mdtest feature
rpadma2 Mar 5, 2021
d0a22c2
Merge branch 'master' into daos_6923
rpadma2 Mar 5, 2021
7225a5b
DAOS-6923 test: Fix checkpatch issues.
rpadma2 Mar 5, 2021
77f8c65
DAOS-6923 test: Fix the typo
rpadma2 Mar 5, 2021
b2cb8c6
DAOS-6923 test: Removed unwanted results parameter
rpadma2 Mar 7, 2021
078065e
DAOS-6923 test: Address checkpatch issues.
rpadma2 Mar 8, 2021
c072ccb
DAOS-6923 test: Run all the tests once now.
rpadma2 Mar 9, 2021
94f82fb
Merge branch 'master' into daos_6923
rpadma2 Mar 9, 2021
1ed510e
DAOS-6923 test: Fix checkpatch issues.
rpadma2 Mar 9, 2021
f2cedae
DAOS-6923 test: Added skipForTicket
rpadma2 Mar 9, 2021
3b2b7b7
DAOS-6923 test: Just run offline (server stop)
rpadma2 Mar 10, 2021
070af59
Merge branch 'master' into daos_6923
rpadma2 Mar 10, 2021
7b83df8
DAOS-6923 test: Add the loop testing methods.
rpadma2 Mar 11, 2021
68f9753
DAOS-6923 test : Added daos cont check support
rpadma2 Mar 11, 2021
77e73d2
Merge branch 'master' into daos_6923
rpadma2 Mar 11, 2021
85434c3
DAOS-6923 test: Merge with master, minor change.
rpadma2 Mar 11, 2021
b07b581
DAOS-6923 test: Code review script changes.
rpadma2 Mar 14, 2021
8298076
DAOS-6923 test: Fix minor checkpatch issues.
rpadma2 Mar 14, 2021
54115d6
Merge branch 'master' into daos_6923
rpadma2 Mar 14, 2021
01c9e28
Merge branch 'master' into daos_6923
rpadma2 Mar 15, 2021
5ac7f75
DAOS-6923 test: Update the container class
rpadma2 Mar 16, 2021
3a080fc
Merge branch 'master' into daos_6923
rpadma2 Mar 16, 2021
fecfde7
DAOS-6923 test: Fix checkpatch issues.
rpadma2 Mar 16, 2021
b507f47
DAOS-6923 test: Support single/multiple containers
rpadma2 Mar 17, 2021
4973847
Merge branch 'master' into daos_6923
rpadma2 Mar 17, 2021
f4b3a43
DAOS-6923 test: Minor changes to osa_utils.py
rpadma2 Mar 18, 2021
5b9e8e4
Merge branch 'master' into daos_6923
rpadma2 Mar 18, 2021
ff074fc
DAOS-6923 test: Fix minor checkpatch issues.
rpadma2 Mar 18, 2021
189be59
DAOS-6923 test: Fix the ior_thread issue.
rpadma2 Mar 19, 2021
a0f9a04
Merge branch 'master' into daos_6923
rpadma2 Mar 21, 2021
a557e80
DAOS-6923 test: Added skipForTicket (DAOS-6925)
rpadma2 Mar 22, 2021
a5e64fa
Merge branch 'master' into daos_6923
rpadma2 Mar 22, 2021
7684221
DAOS-6923 test: Removed unwanted variable.
rpadma2 Mar 22, 2021
33406af
DAOS-5758 pl: fixes for placement
Mar 16, 2021
0629c3a
Merge branch 'master' into daos_6923
rpadma2 Mar 24, 2021
6188da1
DAOS-6923 test: Merge with Di's branch.
rpadma2 Mar 24, 2021
4983019
DAOS-6923 test: Add skipForTicket-daos cont check
rpadma2 Mar 24, 2021
2e6461a
DAOS-6923 test: Run all the tests including weekly
rpadma2 Mar 25, 2021
9fe0d4b
DAOS-6923 test: Offline reintegration no checksum
rpadma2 Mar 25, 2021
396cdea
Merge branch 'master' into daos_6923
rpadma2 Mar 25, 2021
1ccb85e
DAOS-6923 test: Fix spell check checkpatch issue.
rpadma2 Mar 25, 2021
aa50cc5
DAOS-6923 test: skipforTicket DAOS-6807
rpadma2 Mar 25, 2021
29d61e9
Merge branch 'master' into daos_6923
rpadma2 Mar 25, 2021
f11419b
DAOS-6923 test: Testing without enabling checksum
rpadma2 Mar 26, 2021
ac0c39b
Merge branch 'master' into daos_6923
rpadma2 Mar 26, 2021
ff2148d
DAOS-6923 test: Perform IOR read after excludes
rpadma2 Mar 26, 2021
b440a11
Merge branch 'master' into daos_6923
rpadma2 Mar 26, 2021
a6e8545
Merge branch 'master' into daos_6923
rpadma2 Mar 28, 2021
8349efa
DAOS-6923 test: Enable daos cont check
rpadma2 Mar 28, 2021
3787a55
DAOS-6923 test: Seeing md_test failures.
rpadma2 Mar 28, 2021
de979b5
DAOS-6923 test: Enable daos cont check for tests
rpadma2 Mar 29, 2021
d17a081
DAOS-6923 test: Fix mdtest_test_base
rpadma2 Mar 29, 2021
3ef55fe
DAOS-6923 test: Add log messages
rpadma2 Mar 29, 2021
e975b4f
Merge branch 'master' into daos_6923
rpadma2 Mar 29, 2021
46 changes: 33 additions & 13 deletions src/tests/ftest/osa/osa_offline_drain.py
@@ -5,9 +5,10 @@
SPDX-License-Identifier: BSD-2-Clause-Patent
"""
import random
import time
Review comment (Collaborator):

(pylint-unused-import) Unused import time

from osa_utils import OSAUtils
from test_utils_pool import TestPool
from apricot import skipForTicket
from write_host_file import write_host_file


class OSAOfflineDrain(OSAUtils):
@@ -22,28 +23,38 @@ def setUp(self):
"""Set up for test case."""
super(OSAOfflineDrain, self).setUp()
self.dmg_command = self.get_dmg_command()
self.ior_test_sequence = self.params.get(
"ior_test_sequence", '/run/ior/iorflags/*')
# Recreate the client hostfile without slots defined
self.hostfile_clients = write_host_file(
self.hostlist_clients, self.workdir, None)

def run_offline_drain_test(self, num_pool, data=False):
def run_offline_drain_test(self, num_pool, data=False,
oclass=None, drain_during_aggregation=False):
"""Run the offline drain without data.
Args:
num_pool (int) : total pools to create for testing purposes.
data (bool) : whether pool has no data or to create
some data in pool. Defaults to False.
oclass (str): DAOS object class (eg: RP_2G1,etc)
drain_during_aggregation (bool) : Perform drain and aggregation
in parallel
"""
# Create a pool
pool = {}
pool_uuid = []
target_list = []
drain_servers = (len(self.hostlist_servers) * 2) - 1

if oclass is None:
oclass = self.ior_cmd.dfs_oclass.value

# Exclude target : random two targets (target idx : 0-7)
n = random.randint(0, 6)
target_list.append(n)
target_list.append(n+1)
t_string = "{},{}".format(target_list[0], target_list[1])

# Drain a rank (or server)
rank = random.randint(1, drain_servers)
# Drain a rank 1 (or server)
rank = 1

for val in range(0, num_pool):
pool[val] = TestPool(self.context, dmg_command=self.dmg_command)
@@ -54,17 +65,27 @@ def run_offline_drain_test(self, num_pool, data=False):
pool[val].nvme_size.value = int(pool[val].nvme_size.value /
num_pool)
pool[val].create()
pool_uuid.append(pool[val].uuid)
self.pool = pool[val]
if drain_during_aggregation is True:
test_seq = self.ior_test_sequence[1]
self.pool.set_property("reclaim", "disabled")
else:
test_seq = self.ior_test_sequence[0]

if data:
self.write_single_object()
self.run_ior_thread("Write", oclass, test_seq)
self.run_mdtest_thread()

# Drain the pool_uuid, rank and targets
# Drain rank and targets
for val in range(0, num_pool):
self.pool = pool[val]
rank = rank + val
self.pool.display_pool_daos_space("Pool space: Beginning")
pver_begin = self.get_pool_version()
self.log.info("Pool Version at the beginning %s", pver_begin)
if drain_during_aggregation is True:
self.pool.set_property("reclaim", "time")
time.sleep(90)
Review comment (Contributor):

Just curious, why sleep(90) here? Couldn't aggregation already have finished its job before the drain starts?

Reply (Contributor Author):

As discussed, I will loop the exclude and reintegration operations continuously for 180 seconds and see whether aggregation kicks in. Many aggregation scripts use this kind of time delay so that aggregation can kick in after an I/O operation is performed.

Review comment (Contributor, @saurabhtandan, Mar 12, 2021):

If aggregation scripts have sleeps in them, I think that is not a good approach. A sleep gives us no concrete way to determine whether aggregation has started or finished.
I would suggest using pool query results to verify aggregation, that is, to check whether the data has been deleted after aggregation was enabled.

Ravi> I am removing the sleep time. Instead we will run exclude and reintegrate in a loop for 100 seconds. That way, we are sure aggregation happens during the OSA exclude/reintegrate process.
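
Editor's sketch of the polling idea raised in this thread: instead of a fixed sleep, poll a condition until it holds or a timeout expires. Everything here is hypothetical and for illustration only. `wait_until` and `space_reclaimed` are not DAOS test-framework helpers, and `get_free_bytes` stands in for whatever pool query call a test would actually use.

```python
import time


def wait_until(predicate, timeout, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or timeout seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    The clock and sleep parameters exist so tests can inject fakes.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def space_reclaimed(get_free_bytes, baseline_free, min_gain):
    """Hypothetical check: free space grew by at least min_gain bytes.

    get_free_bytes is an assumed callable wrapping a pool query.
    """
    return get_free_bytes() - baseline_free >= min_gain
```

With a helper like this, the test fails on a timeout with a clear signal rather than passing or failing depending on how long aggregation happened to take.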

Reply (Contributor Author):

Removed the sleep and added the new method: write IOR data and delete the container to make sure aggregation happens.
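
Editor's sketch of the time-bounded exclude/reintegrate loop discussed in this thread. It is not the method added in the PR. `cycle_for` and its callable arguments are hypothetical names; in a real test the callables would wrap the dmg exclude and reintegrate commands.

```python
import time


def cycle_for(duration, exclude, reintegrate, clock=time.monotonic):
    """Alternate exclude() and reintegrate() until duration seconds pass.

    A cycle that has started always runs to completion, so the loop
    never exits with a rank still excluded.
    Returns the number of completed cycles.
    """
    deadline = clock() + duration
    cycles = 0
    while clock() < deadline:
        exclude()
        reintegrate()
        cycles += 1
    return cycles
```

The deadline is checked only between cycles, so the total run time can exceed `duration` by up to one full cycle.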

output = self.dmg_command.pool_drain(self.pool.uuid,
rank, t_string)
self.log.info(output)
@@ -82,9 +103,9 @@ def run_offline_drain_test(self, num_pool, data=False):
pool[val].display_pool_daos_space(display_string)

if data:
self.verify_single_object()
self.run_ior_thread("Read", oclass, test_seq)
self.run_mdtest_thread()

@skipForTicket("DAOS-6668")
def test_osa_offline_drain(self):
"""
JIRA ID: DAOS-4750
@@ -94,5 +115,4 @@ def test_osa_offline_drain(self):
:avocado: tags=all,daily_regression,hw,medium,ib2
:avocado: tags=osa,osa_drain,offline_drain
"""
for pool_num in range(1, 3):
self.run_offline_drain_test(pool_num, True)
self.run_offline_drain_test(1, True)
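
Editor's note on the target selection in the diff above: `random.randint(0, 6)` is inclusive at both ends, so the pair `n, n+1` always stays within target indices 0-7. A minimal standalone sketch of the same idea follows. `pick_adjacent_targets` and its `num_targets` parameter are hypothetical and do not appear in the PR.

```python
import random


def pick_adjacent_targets(num_targets=8, rng=random):
    """Return two adjacent target indices as a comma-separated string.

    randint is inclusive at both ends, so the upper bound is
    num_targets - 2 to keep first + 1 a valid index.
    """
    first = rng.randint(0, num_targets - 2)
    return "{},{}".format(first, first + 1)
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible when a failure needs to be replayed.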
47 changes: 45 additions & 2 deletions src/tests/ftest/osa/osa_offline_drain.yaml
@@ -48,8 +48,10 @@ pool:
svcn: 4
control_method: dmg
container:
properties:
enable_checksum: True
type: POSIX
control_method: daos
oclass: RP_2G1
properties: cksum:crc64,cksum_size:16384,srv_cksum:on
dkeys:
single:
no_of_dkeys:
@@ -62,3 +64,44 @@ record:
1KB:
length:
- 1024
ior:
clientslots:
slots: 48
test_file: /testFile
repetitions: 1
dfs_destroy: False
iorflags:
write_flags: "-w -F -k -G 1"
read_flags: "-F -r -R -k -G 1"
api: DFS
dfs_oclass: RP_2G1
dfs_dir_oclass: RP_2G1
ior_test_sequence:
# - [scmsize, nvmesize, transfersize, blocksize]
# The values are set to be in the multiples of 10.
# Values are appx GB.
- [6000000000, 54000000000, 500000, 500000000]
- [6000000000, 54000000000, 1000, 500000000]
mdtest:
api: DFS
client_processes:
np: 30
num_of_files_dirs: 4067 # creating total of 120K files
test_dir: "/"
iteration: 1
dfs_destroy: False
dfs_oclass: RP_2G1
dfs_dir_oclass: RP_2G1
manager: "MPICH"
flags: "-u"
wr_size:
32K:
write_bytes: 32768
read_bytes: 32768
verbosity_value: 1
depth: 0
test_obj_class:
oclass:
- RP_2G8
- RP_3G6
- RP_4G1
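
Editor's sketch for reading the YAML values above. The field order follows the `[scmsize, nvmesize, transfersize, blocksize]` comment in the file. The helper names are hypothetical. Treating `num_of_files_dirs` as a per-process count is an assumption, made because it matches the "120K files" comment when multiplied by `np: 30`.

```python
from collections import namedtuple

# Field order taken from the comment above ior_test_sequence.
IorSequence = namedtuple(
    "IorSequence", "scm_size nvme_size transfer_size block_size")


def transfers_per_block(seq):
    """Number of transfers needed to write one block."""
    return seq.block_size // seq.transfer_size


def mdtest_total_items(processes, items_per_process):
    """Total items created, assuming the count applies per process."""
    return processes * items_per_process


large = IorSequence(6000000000, 54000000000, 500000, 500000000)
small = IorSequence(6000000000, 54000000000, 1000, 500000000)

print(transfers_per_block(large))    # 1000
print(transfers_per_block(small))    # 500000
print(mdtest_total_items(30, 4067))  # 122010, roughly the 120K in the comment
```

The second row keeps the block size and shrinks the transfer size to 1000 bytes, which produces many more small writes; this is the row the test selects when `drain_during_aggregation` is set.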