Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
Upstream-commit: 0380fbff37922cadf294851b1546f4c212c7f364
Component: engine
Take an extra reference to rwlayer while the container is being
committed or exported to avoid the removal of that layer.
Also add some checks before commit/export.
Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
Upstream-commit: 8c32659979150630a2c4eae4e7da944806c46297
Component: engine
The promise package represents a simple enough concurrency pattern that
replicating it in place is sufficient. To end the propagation of this
package, it has been removed and the uses have been inlined.
While this code could likely be refactored to be simpler without the
package, the changes have been minimized to reduce the possibility of
defects. Someone else may want to do further refactoring to remove
closures and reduce the number of goroutines in use.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: 0cd4ab3f9a3f242468484fc62b46e632fdba5e13
Component: engine
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Upstream-commit: 7a7357dae1bcccb17e9b2d4c7c8f5c025fce56ca
Component: engine
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: ebcb7d6b406fe50ea9a237c73004d75884184c33
Component: engine
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 7120976d74195a60334c688a061270a4d95f9aeb
Component: engine
Delete needs to release names related to a container even if that
container isn't present in the db. However, slightly overzealous error
checking causes the transaction to get rolled back. Ignore the error
from Delete on the container itself, since it may not be present.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Upstream-commit: 1d9546fc62c559dbcbb3dbdce40318fb7c4d67a2
Component: engine
Currently, names are maintained by a separate system called "registrar".
This means there is no way to atomically snapshot the state of
containers and the names associated with them.
We can add this atomicity and simplify the code by storing name
associations in the memdb. This removes the need for pkg/registrar, and
makes snapshots a lot less expensive because they no longer need to copy
all the names. This change also avoids some problematic behavior from
pkg/registrar where it returns slices which may be modified later on.
Note that while this change makes the *snapshotting* atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Upstream-commit: 1128fc1add66a849c12d2045aed39605e673abc6
Component: engine
Do not change pause state when restoring container's
status, or status in docker will be different with
status in runc.
Signed-off-by: Fengtu Wang <wangfengtu@huawei.com>
Upstream-commit: 977c4046fd2147d7c04f4b513a94138013ca0dd6
Component: engine
Do not allow sharing of container network with hyperv containers
Signed-off-by: Madhan Raj Mookkandy <madhanm@microsoft.com>
Upstream-commit: 349913ce9fde34d8acd08fad5ce866401f4d135e
Component: engine
Description:
1. start a container with restart=always.
`docker run -d --restart=always ubuntu sleep 3`
2. container init process exits.
3. use `docker pause <id>` to pause this container.
if the pause action is before cgroup data is removed and after the init process died.
`Pause` operation will success to write cgroup data, but actually do not freeze any process.
And then docker received pause event and stateExit event from
containerd, the docker state will be Running(paused), but the container
is free running.
Then we can not remove it, stop it , pause it and unpause it.
Signed-off-by: Wentao Zhang <zhangwentao234@huawei.com>
Upstream-commit: fe1b4cfba6320793373c5397641d743d9fe94cf8
Component: engine
If a container doesn't exist in the memdb, First will return nil, not an
error. This should be checked for before using the result.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Upstream-commit: c26b0cdfd1a026af88fcfbed9d3c3acdd6d171a0
Component: engine
initHealthMonitor and updateHealthMonitor can cause container state to
be changed (State.Health).
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: 04bd768a889f94a4dc6ad25e2a014dffd0a4e04e
Component: engine
Reuse existing structures and rely on json serialization to deep copy
Container objects.
Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: edad52707c536116363031002e6633e3fec16af5
Component: engine
it is already being saved (with a lock held) on the subsequent
operations.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: f668af4475980e32c99503c4a513668c24ea2da6
Component: engine
Also hide ViewDB behind an inteface.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: aacddda89df05b88a6d15fb33c42864760385ab2
Component: engine
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: eed4c7b73f0cf98cf48943da1c082f3210b28c82
Component: engine
This can be used by readers/queries so they don't need locks.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: 054728b1f555892c6c0bfd7abfbaeb2fedbc8f10
Component: engine
The Solaris version (previously daemon/inspect_solaris.go) was
apparently missing some fields that should be available on that
platform.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: cfc404a375817125e4b32a9cd6a4ec7e3c55dc4e
Component: engine
Doing a chown/chmod automatically can cause `EPERM` in some cases (e.g.
with an NFS mount). Currently Docker will always call chown+chmod on a
volume path unless `:nocopy` is passed in, but we don't need to make
these calls if the perms and ownership already match and potentially
avoid an uneccessary `EPERM`.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: f05a023760493dbd41fbfc1bb76ad334b579e94e
Component: engine
The use of pools.Copy avoids io.Copy's internal buffer allocation.
This commit replaces io.Copy with pools.Copy to avoid the allocation of
buffers in io.Copy.
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
Upstream-commit: 014095e6a07748d0e1ce2f759f5c4f4b3783e765
Component: engine