Signed-off-by: John Howard <jhoward@microsoft.com>
While debugging #32838, it was found (https://github.com/moby/moby/issues/32838#issuecomment-356005845) that the utility VM in some circumstances was crashing. Unfortunately, this was silently thrown away, and as far as the build step (also applies to docker run) was concerned, the exit code was zero and the error was thrown away. Windows containers operate differently to containers on Linux, and there can be legitimate system errors during container shutdown after the init process exits. This PR handles this and passes the error all the way back to the client, and correctly causes a build step running a container which hits a system error to fail, rather than blindly trying to keep going, assuming all is good, and get a subsequent failure on a commit.
With this change, assuming an error occurs, here's an example of a failure which previous was reported as a commit error:
```
The command 'powershell -Command $ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue'; Install-WindowsFeature -Name Web-App-Dev ; Install-WindowsFeature -Name ADLDS; Install-WindowsFeature -Name Web-Mgmt-Compat; Install-WindowsFeature -Name Web-Mgmt-Service; Install-WindowsFeature -Name Web-Metabase; Install-WindowsFeature -Name Web-Lgcy-Scripting; Install-WindowsFeature -Name Web-WMI; Install-WindowsFeature -Name Web-WHC; Install-WindowsFeature -Name Web-Scripting-Tools; Install-WindowsFeature -Name Web-Net-Ext45; Install-WindowsFeature -Name Web-ASP; Install-WindowsFeature -Name Web-ISAPI-Ext; Install-WindowsFeature -Name Web-ISAPI-Filter; Install-WindowsFeature -Name Web-Default-Doc; Install-WindowsFeature -Name Web-Dir-Browsing; Install-WindowsFeature -Name Web-Http-Errors; Install-WindowsFeature -Name Web-Static-Content; Install-WindowsFeature -Name Web-Http-Redirect; Install-WindowsFeature -Name Web-DAV-Publishing; Install-WindowsFeature -Name Web-Health; Install-WindowsFeature -Name Web-Http-Logging; Install-WindowsFeature -Name Web-Custom-Logging; Install-WindowsFeature -Name Web-Log-Libraries; Install-WindowsFeature -Name Web-Request-Monitor; Install-WindowsFeature -Name Web-Http-Tracing; Install-WindowsFeature -Name Web-Stat-Compression; Install-WindowsFeature -Name Web-Dyn-Compression; Install-WindowsFeature -Name Web-Security; Install-WindowsFeature -Name Web-Windows-Auth; Install-WindowsFeature -Name Web-Basic-Auth; Install-WindowsFeature -Name Web-Url-Auth; Install-WindowsFeature -Name Web-WebSockets; Install-WindowsFeature -Name Web-AppInit; Install-WindowsFeature -Name NET-WCF-HTTP-Activation45; Install-WindowsFeature -Name NET-WCF-Pipe-Activation45; Install-WindowsFeature -Name NET-WCF-TCP-Activation45;' returned a non-zero code: 4294967295: container shutdown failed: container ba9c65054d42d4830fb25ef55e4ab3287550345aa1a2bb265df4e5bfcd79c78a encountered an error during WaitTimeout: failure in a Windows system call: The compute system exited unexpectedly. (0xc0370106)
```
Without this change, it would be incorrectly reported such as in this comment: https://github.com/moby/moby/issues/32838#issuecomment-309621097
```
Step 3/8 : ADD buildtools C:/buildtools
re-exec error: exit status 1: output: time="2017-06-20T11:37:38+10:00" level=error msg="hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3) layerId=\\\\?\\C:\\ProgramData\\docker\\windowsfilter\\b41d28c95f98368b73fc192cb9205700e21
6691495c1f9ac79b9b04ec4923ea2 flavour=1 folder=C:\\Windows\\TEMP\\hcs232661915"
hcsshim::ImportLayer failed in Win32: The system cannot find the path specified. (0x3) layerId=\\?\C:\ProgramData\docker\windowsfilter\b41d28c95f98368b73fc192cb9205700e216691495c1f9ac79b9b04ec4923ea2 flavour=1 folder=C:\Windows\TEMP\hcs232661915
```
Upstream-commit: 8c52560ea4593935322c1d056124be44e234b934
Component: engine
Currently, if a container removal has failed for some reason,
any client waiting for removal (e.g. `docker run --rm`) is
stuck, waiting for removal to succeed while it has failed already.
For more details and the reproducer, please check
https://github.com/moby/moby/issues/34945
This commit addresses that by allowing `ContainerWait()` with
`container.WaitCondition == "removed"` argument to return an
error in case of removal failure. The `ContainerWaitOKBody`
stucture returned to a client is amended with a pointer to `struct Error`,
containing an error message string, and the `Client.ContainerWait()`
is modified to return the error, if any, to the client.
Note that this feature is only available for API version >= 1.34.
In order for the old clients to be unstuck, we just close the connection
without writing anything -- this causes client's error.
Now, docker-cli would need a separate commit to bump the API to 1.34
and to show an error returned, if any.
[v2: recreate the waitRemove channel after closing]
[v3: document; keep legacy behavior for older clients]
[v4: convert Error from string to pointer to a struct]
[v5: don't emulate old behavior, send empty response in error case]
[v6: rename legacy* vars to include version suffix]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: f963500c544daa3c158c0ca3d2985295c875cb6b
Component: engine
Take an extra reference to rwlayer while the container is being
committed or exported to avoid the removal of that layer.
Also add some checks before commit/export.
Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
Upstream-commit: 8c32659979150630a2c4eae4e7da944806c46297
Component: engine
Do not change pause state when restoring container's
status, or status in docker will be different with
status in runc.
Signed-off-by: Fengtu Wang <wangfengtu@huawei.com>
Upstream-commit: 977c4046fd2147d7c04f4b513a94138013ca0dd6
Component: engine
Description:
1. start a container with restart=always.
`docker run -d --restart=always ubuntu sleep 3`
2. container init process exits.
3. use `docker pause <id>` to pause this container.
if the pause action is before cgroup data is removed and after the init process died.
`Pause` operation will success to write cgroup data, but actually do not freeze any process.
And then docker received pause event and stateExit event from
containerd, the docker state will be Running(paused), but the container
is free running.
Then we can not remove it, stop it , pause it and unpause it.
Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
Upstream-commit: fe1b4cfba6320793373c5397641d743d9fe94cf8
Component: engine
Commit abd72d4008dde7ee8249170d49eb4bc963c51e24 added
a "FIXME" comment to the container "State", mentioning
that a container cannot be both "Running" and "Paused".
This comment was incorrect, because containers on
Linux actually _must_ be running in order to be
paused.
This patch adds additional information both in a
comment, and in the API documentation to clarify
that these booleans are not mutually exclusive.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: b654b6244d6d63a0758029488a95feb446e089bd
Component: engine
This patch adds the untilRemoved option to the ContainerWait API which
allows the client to wait until the container is not only exited but
also removed.
This patch also adds some more CLI integration tests for waiting for a
created container and waiting with the new --until-removed flag.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Handle detach sequence in CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Update Container Wait Conditions
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Apply container wait changes to API 1.30
The set of changes to the containerWait API missed the cut for the
Docker 17.05 release (API version 1.29). This patch bumps the version
checks to use 1.30 instead.
This patch also makes a minor update to a testfile which was added to
the builder/dockerfile package.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Remove wait changes from CLI
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address minor nits on wait changes
- Changed the name of the tty Proxy wrapper to `escapeProxy`
- Removed the unnecessary Error() method on container.State
- Fixes a typo in comment (repeated word)
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Use router.WithCancel in the containerWait handler
This handler previously added this functionality manually but now uses
the existing wrapper which does it for us.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Add WaitCondition constants to api/types/container
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
- Update ContainerWait backend interface to not return pointer values
for container.StateStatus type.
- Updated container state's Wait() method comments to clarify that a
context MUST be used for cancelling the request, setting timeouts,
and to avoid goroutine leaks.
- Removed unnecessary buffering when making channels in the client's
ContainerWait methods.
- Renamed result and error channels in client's ContainerWait methods
to clarify that only a single result or error value would be sent
on the channel.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Move container.WaitCondition type to separate file
... to avoid conflict with swagger-generated code for API response
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Address more ContainerWait review comments
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Upstream-commit: 4921171587c09d0fcd8086a62a25813332f44112
Component: engine
This patch consolidates the two WaitStop and WaitWithContext methods
on the container.State type. Now there is a single method, Wait, which
takes a context and a bool specifying whether to wait for not just a
container exit but also removal.
The behavior has been changed slightly so that a wait call during a
Created state will not return immediately but instead wait for the
container to be started and then exited.
The interface has been changed to no longer block, but instead returns
a channel on which the caller can receive a *StateStatus value which
indicates the ExitCode or an error if there was one (like a context
timeout or state transition error).
These changes have been propagated through the rest of the deamon to
preserve all other existing behavior.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Upstream-commit: cfdf84d5d04c8ee656e5c4ad3db993c258e52674
Component: engine
This removes the SetStoppedLocking, and
SetRestartingLocking functions, which
were not used anywhere.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: a28c389da109808c5b39da02fdfd24b9e36137fe
Component: engine
Those are needed in order to reload their value upon docker daemon
restart.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: 2998945a54577e24a6414d576bc861e58fa87359
Component: engine
As described in our ROADMAP.md, introduce new Swarm management API
endpoints relying on swarmkit to deploy services. It currently vendors
docker/engine-api changes.
This PR is fully backward compatible (joining a Swarm is an optional
feature of the Engine, and existing commands are not impacted).
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Upstream-commit: 534a90a99367af6f6bba1ddcc7eb07506e41f774
Component: engine
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.
The `HEALTHCHECK` instruction has two forms:
* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)
The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.
When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.
The options that can appear before `CMD` are:
* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)
The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.
If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.
It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.
There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.
The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).
The command's exit status indicates the health status of the container.
The possible values are:
- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly
If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.
For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:
HEALTHCHECK --interval=5m --timeout=3s \
CMD curl -f http://localhost/ || exit 1
To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).
When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.
Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: b6c7becbfe1d76b1250f6d8e991e645e13808a9c
Component: engine
Remove function `WaitRunning` because it's actually not necessary, also
remove wait channel for state "running" to avoid mixed use of the state
wait channel.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: a0191a23419121544a2bae941970ff09a0d272bb
Component: engine
Moving all strings to the errors package wasn't a good idea after all.
Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:
```go
func GetErrorMessage(err error) string {
switch err.(type) {
case errcode.Error:
e, _ := err.(errcode.Error)
return e.Message
case errcode.ErrorCode:
ec, _ := err.(errcode.ErrorCode)
return ec.Message()
default:
return err.Error()
}
}
```
This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.
Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors.
Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API:
```go
switch err.(type) {
case errcode.ErrorCode:
daError, _ := err.(errcode.ErrorCode)
statusCode = daError.Descriptor().HTTPStatusCode
errMsg = daError.Message()
case errcode.Error:
// For reference, if you're looking for a particular error
// then you can do something like :
// import ( derr "github.com/docker/docker/errors" )
// if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }
daError, _ := err.(errcode.Error)
statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
errMsg = daError.Message
default:
// This part of will be removed once we've
// converted everything over to use the errcode package
// FIXME: this is brittle and should not be necessary.
// If we need to differentiate between different possible error types,
// we should create appropriate error types with clearly defined meaning
errStr := strings.ToLower(err.Error())
for keyword, status := range map[string]int{
"not found": http.StatusNotFound,
"no such": http.StatusNotFound,
"bad parameter": http.StatusBadRequest,
"conflict": http.StatusConflict,
"impossible": http.StatusNotAcceptable,
"wrong login/password": http.StatusUnauthorized,
"hasn't been activated": http.StatusForbidden,
} {
if strings.Contains(errStr, keyword) {
statusCode = status
break
}
}
}
```
You can notice two things in that code:
1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation.
This change removes all our status errors from the errors package and puts them back in their specific contexts.
IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface:
```go
type errorWithStatus interface {
HTTPErrorStatusCode() int
}
```
This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method.
I included helper functions to generate errors that use custom status code in `errors/errors.go`.
By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it.
Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors
Signed-off-by: David Calavera <david.calavera@gmail.com>
Upstream-commit: a793564b2591035aec5412fbcbcccf220c773a4c
Component: engine
Add `--restart` flag for `update` command, so we can change restart
policy for a container no matter it's running or stopped.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: ff3ea4c90f2ede5cccc6b49c4d2aad7201c91a4c
Component: engine
Currently if we exec a restarting container, client will fail silently,
and daemon will print error that container can't be found which is not a
very meaningful prompt to user.
This commit will stop user from exec a restarting container and gives
more explicit error message.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: 1d2208fed90da9ab72ded3b1c65bb2a71b66ce93
Component: engine
Don't involve code waiting for blocking channel in locked critical
section because it has potential risk of hanging forever.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: 1326f0cba5f933674e23769de1385d3b0841e758
Component: engine
So other packages don't need to import the daemon package when they
want to use this struct.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
Upstream-commit: 6bb0d1816acd8d4f7a542a6aac047da2b874f476
Component: engine