Oardocker does not work on the latest docker version anymore #54
Comments
I think your cgroup hierarchy is in a bad state. The simplest way to clean it up is to reboot your machine.
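Before rebooting, it can help to see which cgroup hierarchies are currently mounted. The following is a minimal sketch, not something taken from oardocker or OAR: `list_cgroup_mounts` is a hypothetical helper that reads a mounts table in the standard `/proc/mounts` layout and prints the mount point, filesystem type and options of every cgroup entry.

```shell
# Hypothetical helper (not part of oardocker or OAR).
# Prints "<mount point> <fs type> <options>" for each cgroup (v1) or
# cgroup2 (v2) entry in a mounts table laid out like /proc/mounts.
list_cgroup_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # Fields in /proc/mounts: device, mount point, fs type, options, dump, pass
    awk '$3 == "cgroup" || $3 == "cgroup2" { print $2, $3, $4 }' "$mounts_file"
}
```

Calling `list_cgroup_mounts` with no argument reads `/proc/mounts`. A controller such as blkio that appears in several entries, or not at all, would be one hint that the hierarchy is not in the state the job resource manager expects.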
And the output of "docker config inspect" on my laptop is: Containers: 78
The bug still persists, but the error looks different now:
$ pip install git+https://github.com/oar-team/oar-docker.git
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
$ oardocker --version
oardocker, version 1.5.0.dev0
$ docker --version
Docker version 19.03.6, build 369ce74a3c
$ oardocker init -f build install git+https://github.com/oar-team/oar.git start connect frontend
OAR version is git hash:
docker@frontend ~
$ oarsub -I
[ADMISSION RULE] Set default walltime to 7200.
[ADMISSION RULE] Modify resource description with type constraints
OAR_JOB_ID=1
Interactive mode: waiting...
Starting...
ERROR: some resources did not respond
The following helps to work around it:
$ oardocker connect -l root server
root@server ~
vi /etc/oar/job_resource_manager_cgroups.pl
Then apply the following patch manually (it basically removes all blkio handling):
--- job_resource_manager_cgroups.pl 2020-03-19 11:56:51.257232670 +0100
+++ a.pl 2020-03-19 11:54:22.751497831 +0100
@@ -140,7 +140,7 @@
flock(LOCKFILE,LOCK_EX) or exit_myself(17,"flock failed: $!");
if (!(-r $Cgroup_directory_collection_links.'/cpuset/tasks')){
if (!(-r $OS_cgroups_path.'/cpuset/tasks')){
- my $cgroup_list = "cpuset,cpu,cpuacct,devices,freezer,blkio";
+ my $cgroup_list = "cpuset,cpu,cpuacct,devices,freezer";
$cgroup_list .= ",memory" if ($ENABLE_MEMCG eq "YES");
if (system('oardodo mkdir -p '.$Cgroup_mount_point.' &&
oardodo mount -t cgroup -o '.$cgroup_list.' none '.$Cgroup_mount_point.' || exit 1
@@ -152,7 +152,6 @@
oardodo ln -s '.$Cgroup_mount_point.' '.$Cgroup_directory_collection_links.'/cpuacct &&
oardodo ln -s '.$Cgroup_mount_point.' '.$Cgroup_directory_collection_links.'/devices &&
oardodo ln -s '.$Cgroup_mount_point.' '.$Cgroup_directory_collection_links.'/freezer &&
- oardodo ln -s '.$Cgroup_mount_point.' '.$Cgroup_directory_collection_links.'/blkio &&
[ "'.$ENABLE_MEMCG.'" = "YES" ] && oardodo ln -s '.$Cgroup_mount_point.' '.$Cgroup_directory_collection_links.'/memory || true
')){
exit_myself(4,"Failed to mount cgroup pseudo filesystem");
@@ -167,7 +166,6 @@
oardodo ln -s '.$OS_cgroups_path.'/cpuacct '.$Cgroup_directory_collection_links.'/cpuacct &&
oardodo ln -s '.$OS_cgroups_path.'/devices '.$Cgroup_directory_collection_links.'/devices &&
oardodo ln -s '.$OS_cgroups_path.'/freezer '.$Cgroup_directory_collection_links.'/freezer &&
- oardodo ln -s '.$OS_cgroups_path.'/blkio '.$Cgroup_directory_collection_links.'/blkio &&
[ "'.$ENABLE_MEMCG.'" = "YES" ] && oardodo ln -s '.$OS_cgroups_path.'/memory '.$Cgroup_directory_collection_links.'/memory || true
')){
exit_myself(4,"Failed to link existing OS cgroup pseudo filesystem");
@@ -183,8 +181,7 @@
done
/bin/echo 0 | cat > '.$Cgroup_directory_collection_links.'/cpuset/'.$Cpuset->{cpuset_path}.'/cpuset.cpu_exclusive &&
cat '.$Cgroup_directory_collection_links.'/cpuset/cpuset.mems > '.$Cgroup_directory_collection_links.'/cpuset/'.$Cpuset->{cpuset_path}.'/cpuset.mems &&
- cat '.$Cgroup_directory_collection_links.'/cpuset/cpuset.cpus > '.$Cgroup_directory_collection_links.'/cpuset/'.$Cpuset->{cpuset_path}.'/cpuset.cpus &&
- /bin/echo 1000 | cat > '.$Cgroup_directory_collection_links.'/blkio/'.$Cpuset->{cpuset_path}.'/blkio.weight
+ cat '.$Cgroup_directory_collection_links.'/cpuset/cpuset.cpus > '.$Cgroup_directory_collection_links.'/cpuset/'.$Cpuset->{cpuset_path}.'/cpuset.cpus
')){
exit_myself(4,"Failed to create cgroup $Cpuset->{cpuset_path}");
}
@@ -250,9 +247,6 @@
# TODO: Need to do more tests to validate so remove this feature
# Some values are not working when echoing
$IO_ratio = 1000;
- if (system( '/bin/echo '.$IO_ratio.' | cat > '.$Cgroup_directory_collection_links.'/blkio/'.$Cpuset_path_job.'/blkio.weight')){
- exit_myself(5,"Failed to set the blkio.weight to $IO_ratio");
- }
if ($ENABLE_DEVICESCG eq "YES"){
my @devices_deny = ();

It looks like "Blkio CG" should be an option, like MEMCG. I'll ask Pierre about this.

Yes, the interface of the blockio cgroup changed a bit with recent kernels. Commenting out the blockio lines in the job_resource_manager is the quick fix. The real fix would be to adapt to the latest blockio interface. PRs welcome.
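An intermediate option between the quick fix and the real fix would be to write `blkio.weight` only when the kernel actually exposes it. The sketch below assumes the failure comes from that file being absent or not writable, which this thread does not confirm. `set_blkio_weight_if_supported` is a hypothetical helper written for illustration, not code from OAR's job resource manager.

```shell
# Hypothetical helper (not OAR code): write a blkio weight only if the
# cgroup directory exposes a writable blkio.weight file, and skip with a
# warning otherwise instead of failing the whole job setup.
set_blkio_weight_if_supported() {
    cgroup_dir="$1"   # e.g. the job's directory under the blkio hierarchy
    weight="$2"       # the patch above uses 1000
    weight_file="$cgroup_dir/blkio.weight"
    if [ -w "$weight_file" ]; then
        echo "$weight" > "$weight_file"
    else
        echo "skipping blkio.weight: $weight_file is not writable" >&2
    fi
}
```

The same test could sit behind a configuration switch in the Perl script, in the spirit of the existing `$ENABLE_MEMCG` option, so that sites with a working blkio controller keep the current behaviour.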
To reproduce:
... It's using Debian stretch in oardocker, if this is useful.
It is only a guess that this is related to the Docker version: before an update, everything worked fine.