Currently, if a container removal has failed for some reason,
any client waiting for removal (e.g. `docker run --rm`) is
stuck, waiting for removal to succeed while it has failed already.
For more details and the reproducer, please check
https://github.com/moby/moby/issues/34945
This commit addresses that by allowing `ContainerWait()` with
`container.WaitCondition == "removed"` argument to return an
error in case of removal failure. The `ContainerWaitOKBody`
stucture returned to a client is amended with a pointer to `struct Error`,
containing an error message string, and the `Client.ContainerWait()`
is modified to return the error, if any, to the client.
Note that this feature is only available for API version >= 1.34.
In order for the old clients to be unstuck, we just close the connection
without writing anything -- this causes client's error.
Now, docker-cli would need a separate commit to bump the API to 1.34
and to show an error returned, if any.
[v2: recreate the waitRemove channel after closing]
[v3: document; keep legacy behavior for older clients]
[v4: convert Error from string to pointer to a struct]
[v5: don't emulate old behavior, send empty response in error case]
[v6: rename legacy* vars to include version suffix]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: f963500c544daa3c158c0ca3d2985295c875cb6b
Component: engine
The shutdown timeout for containers in insufficient on Windows. If the daemon is shutting down, and a container takes longer than expected to shut down, this can cause the container to remain in a bad state after restart, and never be able to start again. Increasing the timeout makes this less likely to occur.
Signed-off-by: Darren Stahl <darst@microsoft.com>
Upstream-commit: ed74ee127f42f32ee98be7b908e1562b1c0554d7
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
Upstream-commit: 0380fbff37922cadf294851b1546f4c212c7f364
Component: engine
Take an extra reference to rwlayer while the container is being
committed or exported to avoid the removal of that layer.
Also add some checks before commit/export.
Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
Upstream-commit: 8c32659979150630a2c4eae4e7da944806c46297
Component: engine
The promise package represents a simple enough concurrency pattern that
replicating it in place is sufficient. To end the propagation of this
package, it has been removed and the uses have been inlined.
While this code could likely be refactored to be simpler without the
package, the changes have been minimized to reduce the possibility of
defects. Someone else may want to do further refactoring to remove
closures and reduce the number of goroutines in use.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: 0cd4ab3f9a3f242468484fc62b46e632fdba5e13
Component: engine
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Upstream-commit: 7a7357dae1bcccb17e9b2d4c7c8f5c025fce56ca
Component: engine
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: ebcb7d6b406fe50ea9a237c73004d75884184c33
Component: engine
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 7120976d74195a60334c688a061270a4d95f9aeb
Component: engine
Delete needs to release names related to a container even if that
container isn't present in the db. However, slightly overzealous error
checking causes the transaction to get rolled back. Ignore the error
from Delete on the container itself, since it may not be present.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Upstream-commit: 1d9546fc62c559dbcbb3dbdce40318fb7c4d67a2
Component: engine
Currently, names are maintained by a separate system called "registrar".
This means there is no way to atomically snapshot the state of
containers and the names associated with them.
We can add this atomicity and simplify the code by storing name
associations in the memdb. This removes the need for pkg/registrar, and
makes snapshots a lot less expensive because they no longer need to copy
all the names. This change also avoids some problematic behavior from
pkg/registrar where it returns slices which may be modified later on.
Note that while this change makes the *snapshotting* atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Upstream-commit: 1128fc1add66a849c12d2045aed39605e673abc6
Component: engine
Do not change pause state when restoring container's
status, or status in docker will be different with
status in runc.
Signed-off-by: Fengtu Wang <wangfengtu@huawei.com>
Upstream-commit: 977c4046fd2147d7c04f4b513a94138013ca0dd6
Component: engine
Do not allow sharing of container network with hyperv containers
Signed-off-by: Madhan Raj Mookkandy <madhanm@microsoft.com>
Upstream-commit: 349913ce9fde34d8acd08fad5ce866401f4d135e
Component: engine
Description:
1. start a container with restart=always.
`docker run -d --restart=always ubuntu sleep 3`
2. container init process exits.
3. use `docker pause <id>` to pause this container.
if the pause action is before cgroup data is removed and after the init process died.
`Pause` operation will success to write cgroup data, but actually do not freeze any process.
And then docker received pause event and stateExit event from
containerd, the docker state will be Running(paused), but the container
is free running.
Then we can not remove it, stop it , pause it and unpause it.
Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
Upstream-commit: fe1b4cfba6320793373c5397641d743d9fe94cf8
Component: engine
If a container doesn't exist in the memdb, First will return nil, not an
error. This should be checked for before using the result.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Upstream-commit: c26b0cdfd1a026af88fcfbed9d3c3acdd6d171a0
Component: engine
initHealthMonitor and updateHealthMonitor can cause container state to
be changed (State.Health).
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: 04bd768a889f94a4dc6ad25e2a014dffd0a4e04e
Component: engine
Reuse existing structures and rely on json serialization to deep copy
Container objects.
Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: edad52707c536116363031002e6633e3fec16af5
Component: engine
it is already being saved (with a lock held) on the subsequent
operations.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: f668af4475980e32c99503c4a513668c24ea2da6
Component: engine
Also hide ViewDB behind an inteface.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: aacddda89df05b88a6d15fb33c42864760385ab2
Component: engine
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: eed4c7b73f0cf98cf48943da1c082f3210b28c82
Component: engine
This can be used by readers/queries so they don't need locks.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: 054728b1f555892c6c0bfd7abfbaeb2fedbc8f10
Component: engine
The Solaris version (previously daemon/inspect_solaris.go) was
apparently missing some fields that should be available on that
platform.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: cfc404a375817125e4b32a9cd6a4ec7e3c55dc4e
Component: engine