It has observed defunct containerd processes accumulating over
time while dockerd was permanently failing to restart containerd.
Due to a bug in the runContainerdDaemon() function, dockerd does not clean up
its child process if containerd already exits very soon after the (re)start.
The reproducer and analysis below comes from docker 1.12.x but bug
still applies on latest master.
- from libcontainerd/remote_linux.go:
329 func (r *remote) runContainerdDaemon() error {
:
: // start the containerd child process
:
403 if err := cmd.Start(); err != nil {
404 return err
405 }
:
: // If containerd exits very soon after (re)start, it is
possible
: // that containerd is already in defunct state at the time
when
: // dockerd gets here. The setOOMScore() function tries to
write
: // to /proc/PID_OF_CONTAINERD/oom_score_adj. However, this
fails
: // with errno EINVAL because containerd is defunct. Please see
: // snippets of kernel source code and further explanation
below.
:
407 if err := setOOMScore(cmd.Process.Pid, r.oomScore); err != nil
{
408 utils.KillProcess(cmd.Process.Pid)
:
: // Due to the error from write() we return here. As
the
: // goroutine that would clean up the child has not
been
: // started yet, containerd remains in the defunct
state
: // and never gets reaped.
:
409 return err
410 }
:
417 go func() {
418 cmd.Wait()
419 close(r.daemonWaitCh)
420 }() // Reap our child when needed
:
423 }
This is the kernel function that gets invoked when dockerd tries to
write
to /proc/PID_OF_CONTAINERD/oom_score_adj.
- from fs/proc/base.c:
1197 static ssize_t oom_score_adj_write(struct file *file, ...
1198 size_t count, loff_t
*ppos)
1199 {
:
1223 task = get_proc_task(file_inode(file));
:
: // The defunct containerd process does not have a virtual
: // address space anymore, i.e. task->mm is NULL. Thus the
: // following code returns errno EINVAL to dockerd.
:
1230 if (!task->mm) {
1231 err = -EINVAL;
1232 goto err_task_lock;
1233 }
:
1253 err_task_lock:
:
1257 return err < 0 ? err : count;
1258 }
The purpose of the following program is to demonstrate the behavior of
the oom_score_adj_write() function in connection with a defunct process.
$ cat defunct_test.c
\#include <unistd.h>
main()
{
pid_t pid = fork();
if (pid == 0)
// child
_exit(0);
// parent
pause();
}
$ make defunct_test
cc defunct_test.c -o defunct_test
$ ./defunct_test &
[1] 3142
$ ps -f | grep defunct_test | grep -v grep
root 3142 2956 0 13:04 pts/0 00:00:00 ./defunct_test
root 3143 3142 0 13:04 pts/0 00:00:00 [defunct_test] <defunct>
$ echo "ps 3143" | crash -s
PID PPID CPU TASK ST %MEM VSZ RSS COMM
3143 3142 2 ffff880035def300 ZO 0.0 0 0
defunct_test
$ echo "px ((struct task_struct *)0xffff880035def300)->mm" | crash -s
$1 = (struct mm_struct *) 0x0
^^^ task->mm is NULL
$ cat /proc/3143/oom_score_adj
0
$ echo 0 > /proc/3143/oom_score_adj
-bash: echo: write error: Invalid argument"
---
This patch fixes the above issue by making sure we start the reaper
goroutine as soon as possible.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
(cherry picked from commit 27087eacbf96e6ef9d48a6d3dc89c7c1cff155b4)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Signed-off-by: Daniel Nephin <dnephin@docker.com>
(cherry picked from commit f1ade82d82e6436971c6b7d08eb1da57ed9ba756)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Signed-off-by: Ying Li <ying.li@docker.com>
(cherry picked from commit d60f18204978d438d1eb336512576d47991c8ac1)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
If a container mount the socket the daemon is listening on into
container while the daemon is being shutdown, the socket will
not exist on the host, then daemon will assume it's a directory
and create it on the host, this will cause the daemon can't start
next time.
fix issue https://github.com/moby/moby/issues/30348
To reproduce this issue, you can add following code
```
--- a/daemon/oci_linux.go
+++ b/daemon/oci_linux.go
@@ -8,6 +8,7 @@ import (
"sort"
"strconv"
"strings"
+ "time"
"github.com/Sirupsen/logrus"
"github.com/docker/docker/container"
@@ -666,7 +667,8 @@ func (daemon *Daemon) createSpec(c *container.Container) (*libcontainerd.Spec, e
if err := daemon.setupIpcDirs(c); err != nil {
return nil, err
}
-
+ fmt.Printf("===please stop the daemon===\n")
+ time.Sleep(time.Second * 2)
ms, err := daemon.setupMounts(c)
if err != nil {
return nil, err
```
step1 run a container which has `--restart always` and `-v /var/run/docker.sock:/sock`
```
$ docker run -ti --restart always -v /var/run/docker.sock:/sock busybox
/ #
```
step2 exit the the container
```
/ # exit
```
and kill the daemon when you see
```
===please stop the daemon===
```
in the daemon log
The daemon can't restart again and fail with `can't create unix socket /var/run/docker.sock: is a directory`.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
(cherry picked from commit 7318eba5b2f8bb4b867ca943c3229260ca98a3bc)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
The health check process doesn't have all the environment
varialbes in the container or has them set incorrectly.
This patch should fix that problem.
Signed-off-by: Boaz Shuster <ripcurld.github@gmail.com>
(cherry picked from commit 5836d86ac4d617e837d94010aa60384648ab59ea)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
If configs are declared for a service and pointing on an old
daemon, error out properly (instead of "page not found").
If there is no configs declared, don't call convertServiceConfigObjs
to avoid having an error.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
(cherry picked from commit cf5550c426)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
If a service alias is copied to task, then the DNS resolution on the
service name will resolve to service VIP and all of Task-IPs and that
will break the concept of vip based load-balancing resulting in all the
dns-rr caching issues.
This is a regression introduced in #33130
Signed-off-by: Madhu Venugopal <madhu@docker.com>
(cherry picked from commit 38c15531501578b96d34be5ce7f33a0be6be078f)
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
(cherry picked from commit 097516fc76a2c9acf798dc523326904520ff1fc7)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
(cherry picked from commit a1b7f6f3407546ff41ed2e0d16d7e1b2a8ac0ef4)
Conflicts:
components/packaging/static/Makefile
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
So we can use it at will
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>
(cherry picked from commit 533a843393bd7c3674074ec9af73c8e666fc7484)
Conflicts:
components/packaging/static/Makefile
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
(cherry picked from commit f6d30c3631abee04aa3586cba36960901f705fe2)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
(cherry picked from commit ade95a2b6f8a16aa7e6c387dc9c9dd48bba9a672)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
(cherry picked from commit e14fca86f8230e10d79a7bfae2873a98a16bb44d)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Because of cherry-pick from commit 17ec46a24316f59c808c112e3ca46d7c442a785a
into components/cli/vendor/github.com/docker/docker
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
(cherry picked from commit 17ec46a24316f59c808c112e3ca46d7c442a785a)
Manually update pkg/term vendoring to match changes in moby
Signed-off-by: Tibor Vass <tibor@docker.com>
Add armhf dockerfiles for deb building
Signed-off-by: Eli Uriegas <seemethere101@gmail.com>
(cherry picked from commit f0c8cea1b79b049743cd1503f7ac4a34c265f476)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>