containerd has two objects with regard to containers.
There is a "container" object which is metadata and a "task" which is
manging the actual runtime state.
When docker starts a container, it creartes both the container metadata
and the task at the same time. So when a container exits, docker deletes
both of these objects as well.
This ensures that if, on start, when we go to create the container metadata object
in containerd, if there is an error due to a name conflict that we go
ahead and clean that up and try again.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 5ba30cd1dc6000ee53b34f628cbff91d7f6d7231)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 1d0353548a21b0b60595ad44859a7072f90ea6c6
Component: engine
imageService provides the backend for the image API and handles the
imageStore, and referenceStore.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: 0dab53ff3cb0aae91aae068a3f1f2fd32339e23b
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
The re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
Upstream-commit: ce8e529e182bde057cdfafded62c210b7293b8ba
Component: engine
It's a common scenario for admins and/or monitoring applications to
mount in the daemon root dir into a container. When doing so all mounts
get coppied into the container, often with private references.
This can prevent removal of a container due to the various mounts that
must be configured before a container is started (for example, for
shared /dev/shm, or secrets) being leaked into another namespace,
usually with private references.
This is particularly problematic on older kernels (e.g. RHEL < 7.4)
where a mount may be active in another namespace and attempting to
remove a mountpoint which is active in another namespace fails.
This change moves all container resource mounts into a common directory
so that the directory can be made unbindable.
What this does is prevents sub-mounts of this new directory from leaking
into other namespaces when mounted with `rbind`... which is how all
binds are handled for containers.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: eaa5192856c1ad09614318e88030554b96bb6e81
Component: engine
Instead of having to create a bunch of custom error types that are doing
nothing but wrapping another error in sub-packages, use a common helper
to create errors of the requested type.
e.g. instead of re-implementing this over and over:
```go
type notFoundError struct {
cause error
}
func(e notFoundError) Error() string {
return e.cause.Error()
}
func(e notFoundError) NotFound() {}
func(e notFoundError) Cause() error {
return e.cause
}
```
Packages can instead just do:
```
errdefs.NotFound(err)
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 87a12421a94faac294079bebc97c8abb4180dde5
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated)
methods/functions to os. This makes it much clearer to disambiguate with the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
Upstream-commit: 0380fbff37922cadf294851b1546f4c212c7f364
Component: engine
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Upstream-commit: 7a7357dae1bcccb17e9b2d4c7c8f5c025fce56ca
Component: engine
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: ebcb7d6b406fe50ea9a237c73004d75884184c33
Component: engine
Since the commit d88fe447df0e8 ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.
Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).
This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:
- 'shareable': enables sharing this container's IPC with others
(this used to be the implicit default);
- 'private': disables sharing this container's IPC.
In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.
While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:
> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...
...so here's yet yet another mode:
- 'none': no /dev/shm mount inside the container (though it still
has its own private IPC namespace).
Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.
Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).
Some other changes this patch introduces are:
1. A mount for /dev/shm is added to default OCI Linux spec.
2. IpcMode.Valid() is simplified to remove duplicated code that parsed
'container:ID' form. Note the old version used to check that ID does
not contain a semicolon -- this is no longer the case (tests are
modified accordingly). The motivation is we should either do a
proper check for container ID validity, or don't check it at all
(since it is checked in other places anyway). I chose the latter.
3. IpcMode.Container() is modified to not return container ID if the
mode value does not start with "container:", unifying the check to
be the same as in IpcMode.IsContainer().
3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
to add checks for newly added values.
[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 7120976d74195a60334c688a061270a4d95f9aeb
Component: engine
Reuse existing structures and rely on json serialization to deep copy
Container objects.
Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: edad52707c536116363031002e6633e3fec16af5
Component: engine
Also hide ViewDB behind an inteface.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: aacddda89df05b88a6d15fb33c42864760385ab2
Component: engine
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: eed4c7b73f0cf98cf48943da1c082f3210b28c82
Component: engine
After running the test suite with the race detector enabled I found
these gems that need to be fixed.
This is just round one, sadly lost my test results after I built the
binary to test this... (whoops)
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 7917a36cc787ada58987320e67cc6d96858f3b55
Component: engine
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
Added an apparmorEnabled boolean in the Daemon struct to indicate if AppArmor is enabled or not. It is set in NewDaemon using sysInfo information.
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
gofmt'd
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
change the function name to something more adequate and changed the behaviour to show empty value on an apparmor disabled system.
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
go fmt
Signed-off-by: Roberto Muñoz Fernández <robertomf@gmail.com>
Upstream-commit: d97a00dfd5ec884a98e087b1fc6e705459ca81e9
Component: engine
Validation is still done by swarmkit on the service side.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: ef39256dfb711f8382a5c021b85d6c7d613282b0
Component: engine
Fixes#22564
When an error occurs on mount, there should not be any call later to
unmount. This can throw off refcounting in the underlying driver
unexpectedly.
Consider these two cases:
```
$ docker run -v foo:/bar busybox true
```
```
$ docker run -v foo:/bar -w /foo busybox true
```
In the first case, if mounting `foo` fails, the volume driver will not
get a call to unmount (this is the incorrect behavior).
In the second case, the volume driver will not get a call to unmount
(correct behavior).
This occurs because in the first case, `/bar` does not exist in the
container, and as such there is no call to `volume.Mount()` during the
`create` phase. It will error out during the `start` phase.
In the second case `/bar` is created before dealing with the volume
because of the `-w`. Because of this, when the volume is being setup
docker will try to copy the image path contents in the volume, in which
case it will attempt to mount the volume and fail. This happens during
the `create` phase. This makes it so the container will not be created
(or at least fully created) and the user gets the error on `create`
instead of `start`. The error handling is different in these two phases.
Changed to only send `unmount` if the volume is mounted.
While investigating the cause of the reported issue I found some odd
behavior in unmount calls so I've cleaned those up a bit here as well.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 9a2d0bc3adc0c21c82cd1974be45ea0449f9f224
Component: engine
Convert this to lower before checking the message of the error.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 47637b49a0fbd62a25702859f0993666c63ff562
Component: engine
This adds a metrics packages that creates additional metrics. Add the
metrics endpoint to the docker api server under `/metrics`.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add metrics to daemon package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
api: use standard way for metrics route
Also add "type" query parameter
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Convert timers to ms
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 3343d234f3b131d4be1d4ca84385e184633a79bd
Component: engine
This moves the types for the `engine-api` repo to the existing types
package.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 91e197d614547f0202e6ae9b8a24d88ee131d950
Component: engine
Before this, container's auto-removal after exit is done in a goroutine,
this commit will get ContainerRm out of the goroutine.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: 1537dbe2d617013c94aa42c28744feb07a09fb63
Component: engine
`--rm` is a client side flag which caused lots of problems:
1. if client lost connection to daemon, including client crash or be
killed, there's no way to clean garbage container.
2. if docker stop a `--rm` container, this container won't be
autoremoved.
3. if docker daemon restart, container is also left over.
4. bug: `docker run --rm busybox fakecmd` will exit without cleanup.
In a word, client side `--rm` flag isn't sufficient for garbage
collection. Move the `--rm` flag to daemon will be more reasonable.
What this commit do is:
1. implement a `--rm` on daemon side, adding one flag `AutoRemove` into
HostConfig.
2. Allow `run --rm -d`, no conflicting `--rm` and `-d` any more,
auto-remove can work on detach mode.
3. `docker restart` a `--rm` container will succeed, the container won't
be autoremoved.
This commit will help a lot for daemon to do garbage collection for
temporary containers.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: 3c2886d8a45d8e79b00ab413d91f1af871b58d0a
Component: engine
In order to keep a little bit of "sanity" on the API side, validate
hostname only starting from v1.24 API version.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: 6daf3d2a783fd042e870c8af8bbd19fc28989505
Component: engine
Currently `start` will hide some errors and throw a consolidated error,
which will make it hard to debug because developer can't find the
original error.
This commit allow daemon to log original errors first.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: b4740e3021a39a502ffc82f0f70e9b7ed2f0875f
Component: engine