Commit Graph

102 Commits

Author SHA1 Message Date
0c4a3deec0 Merge pull request #25072 from mlaventure/oos-libcontainerd-client
Handle out-of-sync libcontainerd client on restore
Upstream-commit: 6401bd65b11931a27a6d2e1d3b6a9278ed4e8fc7
Component: engine
2016-08-05 14:23:25 -07:00
c1636749f7 libcontainerd: mark container exited after failed restart
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Upstream-commit: 9be0fb45c25e4d8d3cf0aa444da5ae41dd18f435
Component: engine
2016-08-03 17:44:30 -07:00
3eeb6b8c6e libcontainerd: wait for restart after state change
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Upstream-commit: 495448b2903c1a765cc17dff05afebe16a466917
Component: engine
2016-08-03 15:28:07 -07:00
8e4d2991bb Handle out-of-sync libcontainerd client on restore
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 51f21a1674c60108f97878815046c69f769cee48
Component: engine
2016-07-28 11:26:07 -07:00
665067dce6 Check if the container is running if no event
When there is no event for the container, the cause can be a crash
(for example, a machine crash) that leaves the container state on
persistent disk out of sync with what was in `/run`.

This situation creates an unkillable container in docker: containerd
does not see it and it is not running, but docker thinks it is, and you
cannot tell it anything different.

This fixes the issue by checking if containerd has the container running
if we do not have an event instead of just returning.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 2650b1b6abd87d7d888e27abd6110dea83dcd080
Component: engine
2016-07-28 11:17:02 -07:00
e4e65a85a5 Fix daemon panic on restoring containers
Signed-off-by: Lei Jitang <leijitang@huawei.com>
Upstream-commit: c75de8e33cc0db5236eef6146f2de06533b46aa8
Component: engine
2016-07-26 22:52:52 -04:00
0000093b60 Fix missing unlock in libcontainerd.Restore()
This was preventing the "exit" event from being correctly processed
during the restore process without live-restore enabled.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: ac068a1f9de2b20b145b5682cd514c1f6b1fac17
Component: engine
2016-07-22 15:21:10 -07:00
1dd4a36c0e Prepend libcontainerd log message with "libcontainerd:"
This will make it easier to pinpoint error messages in the daemon
logs.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 5231c5534679206e20672ca16bbee5c10d699319
Component: engine
2016-07-22 15:20:14 -07:00
ac1b563dd3 Update libcontainerd.AddProcess to accept a context
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: c02f82756e914081543bf05cb1815a48c02b1ebd
Component: engine
2016-07-19 08:24:39 -07:00
32ecbd59e9 Do not rely on "live" event anymore
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 64483c3bdaa1887b8b932e0564362fbbff025dc0
Component: engine
2016-07-19 08:24:39 -07:00
7262ef8faa Vendor in new containerd
This version introduces the following:
 - uses nanosecond timestamps for events
 - ensures events are sent once their effect is "live"

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 29b2714580d085533c29807fa337c2b7a302abb6
Component: engine
2016-07-18 11:44:24 -07:00
918edfff9f Wait for the reader fifo opening to block
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Upstream-commit: 0b2023130e285a0207be9fda4b22e1419997c552
Component: engine
2016-07-14 10:14:53 -07:00
8de09a35ce Merge pull request #24593 from mlaventure/fix-libcontainerd-data-race
Fix data race in libcontainerd
Upstream-commit: 0a96ba8a0f6b98d82fe2f8f4e07838785cb8d708
Component: engine
2016-07-14 17:27:24 +02:00
8a49e1f925 Fix data race in libcontainerd
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 8e9fbc8f5fc5759eb7f26ec998f227994ff6c642
Component: engine
2016-07-13 10:04:42 -07:00
53c33cc1be Add --oom-score-adjust to daemon
This adds an `--oom-score-adjust` flag to the daemon so that the value
provided can be set for the docker daemon's process.  The default value
for the flag is -500.  This makes the docker daemon less likely to be
killed before containers are.  The default value for
processes is 0 with a min/max of -1000/1000.

-500 is a good middle ground because it is less than the default for
most processes and still not -1000 which basically means never kill this
process in an OOM condition on the host machine.  The only processes on
my machine that have a score less than -500 are dbus at -900 and sshd
and xfce (my window manager) at -1000.  I don't think docker should be
set lower, by default, than dbus or sshd so that is why I chose -500.
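
A minimal sketch of what applying such a value could look like on Linux, assuming the score is written to `/proc/self/oom_score_adj`. The helper names `validateOOMScoreAdj` and `setOOMScoreAdj` are hypothetical and are not taken from the daemon's code; the range check uses the -1000/1000 bounds stated above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Bounds quoted in the commit message: min/max of -1000/1000.
const (
	minOOMScoreAdj = -1000
	maxOOMScoreAdj = 1000
)

// validateOOMScoreAdj (hypothetical helper) rejects out-of-range values.
func validateOOMScoreAdj(score int) error {
	if score < minOOMScoreAdj || score > maxOOMScoreAdj {
		return fmt.Errorf("oom score adjust %d out of range [%d, %d]",
			score, minOOMScoreAdj, maxOOMScoreAdj)
	}
	return nil
}

// setOOMScoreAdj (hypothetical helper) writes the score to path, which on
// Linux would normally be /proc/self/oom_score_adj. The path is a parameter
// so the function can be tried against a temporary file.
func setOOMScoreAdj(path string, score int) error {
	if err := validateOOMScoreAdj(score); err != nil {
		return err
	}
	return os.WriteFile(path, []byte(strconv.Itoa(score)), 0o644)
}

func main() {
	// Lowering the score typically requires privileges (CAP_SYS_RESOURCE),
	// so a failure here is reported rather than treated as fatal.
	if err := setOOMScoreAdj("/proc/self/oom_score_adj", -500); err != nil {
		fmt.Fprintln(os.Stderr, "could not adjust OOM score:", err)
	}
}
```

Validating before writing keeps a mistyped flag value from reaching the kernel interface at all.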

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: a894aec8d81de5484152a76d76b80809df9edd71
Component: engine
2016-07-12 15:53:15 -07:00
b02186b3ad Merge pull request #24156 from Microsoft/jjh/clearbasefs
Windows: Clear volume path for Hyper-V containers
Upstream-commit: efcf24f0c41412c196390d7208f908d6fc7b9ed6
Component: engine
2016-07-07 15:35:14 -07:00
0bdbf282f3 Fix spelling in comments, strings and documentation
Signed-off-by: Otto Kekäläinen <otto@seravo.fi>
Upstream-commit: 644a7426cc31c338fedb6574d2b88d1cc2f43a08
Component: engine
2016-07-03 20:58:11 +03:00
37fb9cfa09 Windows: Ensure VolumePath is not set for Hyper-V containers
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: fd4f5c23650799a7e76e193614bf82454b375fe3
Component: engine
2016-06-29 16:49:09 -07:00
5fab4e7492 Windows: Disable VM cloning for TP5 image
The Windows TP5 image is not compatible with the Hyper-V isolated
container clone feature. Detect old images and pass a flag specifying that
clone should not be enabled.

Signed-off-by: John Starks <jostarks@microsoft.com>
Upstream-commit: 8e3432225357128fc135c8c3cf0318bd944c0c3b
Component: engine
2016-06-24 16:12:44 -07:00
fed645b1f8 Merge pull request #23862 from LK4D4/fix_unused
all: fix usage of some variables
Upstream-commit: c9175a6deb70887afc757702a69bf750b9668fd4
Component: engine
2016-06-23 10:21:10 -07:00
ebb89af810 Merge pull request #23776 from Microsoft/ShutdownError
Windows: Prevent logging errors when shutting down an already shut down container
Upstream-commit: 138f9538f3a740ef56b1a6cd43ae537a78f4d896
Component: engine
2016-06-22 12:11:00 -07:00
9e92fb5474 all: fix usage of some variables
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Upstream-commit: 57e14714ee85e67f59d8c22aed23dc875cf2e58c
Component: engine
2016-06-22 10:40:32 -07:00
34ef68e15d Added stateinfo to WaitExit info to aid debugging daemon hangs
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: 9aa9bda1780b68946582e36892703fbb4f5f892d
Component: engine
2016-06-21 12:41:08 -07:00
da68531167 Windows: Prevent logging errors when shutting down an already shut down container
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: 79060e821228b9e86bbd9e9212756a61b0901c11
Component: engine
2016-06-20 16:57:08 -07:00
69096baccb Merge pull request #23532 from swernli/exitCodeFix
Fixing exit code return on error case in Windows.
Upstream-commit: a590a6b1800778c86f6649b64283d29d14d83024
Component: engine
2016-06-16 19:01:18 -07:00
b47fcaaeab Fixing exit code return on error case in Windows.
Right now, if we hit an error retrieving the exit code in HCS process.ExitCode, we return 0 along with that error.  Golang convention says that if an error is returned the other values should not be used, but the caller of ExitCode in libcontainerd has to fall through if an error is received.  Rather than return a success exit code in that failure case, we should return -1 to indicate a generic failure.

Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Upstream-commit: 17c1b9c061139a2655252f6fb5e36f616a8c5f5e
Component: engine
2016-06-14 10:19:55 -07:00
811cef6ca3 Add support for multiple runtimes
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 7b2e5216b89b4c454d67473f1fa06c52a4624680
Component: engine
2016-06-14 07:47:31 -07:00
9ada0cfc5f Merge pull request #23213 from crosbymichael/restore-option
Add --live-restore flag
Upstream-commit: 3020081e94277410984c62d12f88de3d4f258681
Component: engine
2016-06-13 20:57:19 -07:00
7af900395b Add --live-restore flag
This flag enables full support of daemonless containers in docker.  It
ensures that docker does not stop containers on shutdown or restore and
properly reconnects to the container when restarted.

This is not the default because of backwards compat but should be the
desired outcome for people running containers in prod.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: d705dab1b1bd0a946d647374325d61fac57736db
Component: engine
2016-06-13 19:16:26 -07:00
0a17038359 Merge pull request #23443 from swernli/servicing-async
Updating call sequence for servicing Windows containers
Upstream-commit: 50c7bcac1e22a6a3dd39bec4136aa96136f56eb2
Component: engine
2016-06-13 19:49:23 +02:00
763e6c326e *: fix logrus.Warn[f]
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Upstream-commit: 44ccbb317c2ca67fd8a88147b1ff80ce83d453cc
Component: engine
2016-06-11 19:42:38 +02:00
28d7534bc7 Updating call sequence for servicing Windows containers
This change adjusts the calling pattern for servicing containers to use waiting on the process instead of expecting start to block.  This is safer, as it avoids timeouts in the start code path for the potentially expensive update operation.

Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Upstream-commit: f2ad7be2c4aa13413d539887e8c13fb47bea7254
Component: engine
2016-06-10 15:19:10 -07:00
743a9e8b07 Increase containerd start-timeout to 2 minutes
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
(cherry picked from commit 4251e1e99e16ff7ff5557ee16e5bef26a14cd127)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 64a91ee74e73c956e92801447ae73ba82d168ed5
Component: engine
2016-06-10 16:22:19 +02:00
79ea898035 Fix some typos
Signed-off-by: Jannick Fahlbusch <git@jf-projects.de>
Upstream-commit: e3490cdcc0e2b1e4c4da125626430016a3048128
Component: engine
2016-06-08 21:59:34 +02:00
4524589dc5 Add support for user-defined healthchecks
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).

When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.
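
The status transitions described in this message can be sketched as a small state machine. This is not the daemon's implementation: the `health` type and its `record` method are hypothetical, timeouts are not modeled (a timed-out probe would simply be fed in as a failure), and treating exit code 2 during `starting` as "no change" is an assumption.

```go
package main

import "fmt"

const (
	statusStarting  = "starting"
	statusHealthy   = "healthy"
	statusUnhealthy = "unhealthy"
)

// health (hypothetical type) tracks the current status and the length of
// the current run of consecutive failures.
type health struct {
	status        string
	failingStreak int
	retries       int
}

// newHealth starts in the "starting" state, as described above.
func newHealth(retries int) *health {
	return &health{status: statusStarting, retries: retries}
}

// record applies one probe exit code and returns the resulting status.
func (h *health) record(exitCode int) string {
	switch {
	case exitCode == 0:
		// Any passing check makes the container healthy and resets the streak.
		h.failingStreak = 0
		h.status = statusHealthy
	case exitCode == 2 && h.status == statusStarting:
		// Assumption: "starting" while still starting leaves the state as-is.
	default:
		// Exit code 1, or 2 after leaving "starting", counts as a failure.
		h.failingStreak++
		if h.failingStreak >= h.retries {
			h.status = statusUnhealthy
		}
	}
	return h.status
}

func main() {
	h := newHealth(3)
	for _, code := range []int{2, 0, 1, 1, 1} {
		fmt.Println(code, "->", h.record(code))
	}
}
```

Keeping the failure streak separate from the status is what lets a single passing check recover the container regardless of how many failures preceded it.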

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: b6c7becbfe1d76b1250f6d8e991e645e13808a9c
Component: engine
2016-06-02 23:58:34 +02:00
3cbc347e63 Merge pull request #23148 from mlaventure/wait-for-containerd-before-restarting-it
Wait for containerd to die before restarting it
Upstream-commit: cb36dddad150a3bc0986736a877c8bdfcfbd346c
Component: engine
2016-06-01 10:35:31 -07:00
e66ae89f7b Merge pull request #23142 from Microsoft/ExtraCleanup
Windows: Remove a double free on hcs container handle
Upstream-commit: bcf0c8ca2883867ba7dcec4824a64359ee7cab12
Component: engine
2016-06-01 11:09:06 -04:00
a612268e81 Wait for containerd to die before restarting it
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: ce160b37e15ddb86c45314d080718f833e551aa3
Component: engine
2016-06-01 07:45:03 -07:00
9c661aedc4 Merge pull request #22989 from Microsoft/StartCleanup
Windows: Adding missing cleanup call when container start fails
Upstream-commit: c7aba69cc10faee84ba877ad4a94e4e150cb0932
Component: engine
2016-06-01 15:42:47 +02:00
b9c6d22ba9 Set --state-dir on containerd.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: 8b5e5c61955eba9af7c2975b959c4f4517485389
Component: engine
2016-05-31 11:48:05 -07:00
1ed90a29aa Windows: Remove a double free on hcs container handle
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: c8454394f76865777d4cd013c606d17dcd0f9600
Component: engine
2016-05-31 10:25:38 -07:00
7a3282f5c8 Windows: Adding missing cleanup call when container start fails
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: 054992e2913cf10171eecb5f41e5c19158cf04bc
Component: engine
2016-05-31 10:19:05 -07:00
c9cc850112 Fix a leaked process handle of the first container to start on Windows
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: 717209c9ffc5caa4782dfda39e7be9a7b581a42c
Component: engine
2016-05-25 21:33:50 -07:00
37e80ffc41 Windows: Use image version, not OS version for TTY fixup
A previous change added a TTY fixup for stdin on older Windows versions to
work around a Windows issue with backspace/delete behavior. This change
used the OS version to determine whether to activate the behavior.
However, the Windows bug is actually in the image, not the OS, so it
should have used the image's OS version.

This ensures that a Server TP5 container running on Windows 10 will have
reasonable console behavior.

Signed-off-by: John Starks <jostarks@microsoft.com>
Upstream-commit: 6508c015fe764fd59438cabffcbc6102c9cf04ef
Component: engine
2016-05-25 12:22:52 -07:00
ecdb255cd3 Merge pull request #22958 from Microsoft/hcs_rpc
Windows: Use the new HCS RPC API
Upstream-commit: c7ee50308290d56b70933dfd83bd70e3a9df93d5
Component: engine
2016-05-25 09:25:22 -07:00
d8cc018311 Change Docker to use the new HCS RPC API
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: 959c1a52bf11dd6b3e65f10bbaa867bfabba6838
Component: engine
2016-05-24 16:36:51 -07:00
12513175c8 Merge pull request #22091 from amitkris/build_solaris
Get the Docker Engine to build clean on Solaris
Upstream-commit: 86a7632d63bdddb95aaf1472648056a4fb737d38
Component: engine
2016-05-24 21:41:36 +02:00
d6f4430048 Merge pull request #22541 from crosbymichael/graph-restore
Implement graph driver restore on reboot
Upstream-commit: d7dfe9103bfc275494d936a5d89f3067b0aedbc9
Component: engine
2016-05-23 22:57:23 -07:00
3a35464d9d Get the Docker Engine to build clean on Solaris
Signed-off-by: Amit Krishnan <krish.amit@gmail.com>
Upstream-commit: 86d8758e2bb5e9d21d454ceda90b33feb8e74771
Component: engine
2016-05-23 16:37:12 -07:00
e6822e5504 Remove restart test
This test is not applicable anymore now that containers are not stopped
when the daemon is restored.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 31e903b0e17d01a4240f7890218a80088d32658c
Component: engine
2016-05-23 15:57:23 -07:00