For some reason, shared mount propagation between the host
and a container does not work for btrfs, unless container
root directory (i.e. graphdriver home) is a bind mount.
The above issue was reproduced on SLES 12sp3 + btrfs using
the following script:
#!/bin/bash
set -eux -o pipefail
# DIR should not be under a subvolume
DIR=${DIR:-/lib}
MNT=$DIR/my-mnt
FILE=$MNT/file
ID=$(docker run -d --privileged -v $DIR:$DIR:rshared ubuntu sleep 24h)
docker exec $ID mkdir -p $MNT
docker exec $ID mount -t tmpfs tmpfs $MNT
docker exec $ID touch $FILE
ls -l $FILE
umount $MNT
docker rm -f $ID
which fails this way:
+ ls -l /lib/my-mnt/file
ls: cannot access '/lib/my-mnt/file': No such file or directory
meaning the mount performed inside a priviledged container is not
propagated back to the host (even if all the mounts have "shared"
propagation mode).
The remedy to the above is to make graphdriver home a bind mount.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 16d822bba8ac5ab22c8697750f700403bca3dbf3)
Upstream-commit: fa8ac946165b8004a15e85744e774ed6ba99fd38
Component: engine
[18.09] backport fix denial of service with large numbers in cpuset-cpus and cpuset-mems
Upstream-commit: 4b8336f7cf091fd5c4742286bda1e34c45667d78
Component: engine
Using a value such as `--cpuset-mems=1-9223372036854775807` would cause
`dockerd` to run out of memory allocating a map of the values in the
validation code. Set limits to the normal limit of the number of CPUs,
and improve the error handling.
Reported by Huawei PSIRT.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f8e876d7616469d07b8b049ecb48967eeb8fa7a5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 0922d32bce74657266aff213f83dfa638e8077f4
Component: engine
We check for max value for -default-addr-pool-mask-length param as 32.
But There won't be enough addresses on the overlay network. Hence we are
keeping it 29 so that we would be having atleast 8 addresses in /29 network.
Signed-off-by: selansen <elango.siva@docker.com>
(cherry picked from commit d25c5df80e60cdbdc23fe3d0e2a6808123643dc7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 9406f3622d18a0d9b6c438190e8fdd8be53d3b22
Component: engine
Addressing few review comments as part of code refactoring.
Also moved validation logic from CLI to Moby.
Signed-off-by: selansen <elango.siva@docker.com>
(cherry picked from commit 148ff00a0a800fad99de11ee3021d4c5d4869157)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 9816bfcaf58a609d64d648043c10817c27dcfa36
Component: engine
Similar to a related issue where previously, private Hyper-V networks
would each add 15 secs to the daemon startup, non-hns governed internal
networks are reported by hns as network type "internal" which is not
mapped to any network plugin (and thus we get the same plugin load retry
loop as before).
This issue hits Docker for Desktop because we setup such a network for
the Linux VM communication.
Signed-off-by: Simon Ferquel <simon.ferquel@docker.com>
(cherry picked from commit 6a1a4f97217b0a8635bc21fc86628f48bf8824d1)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 54bd14a3fe1d4925c6fa88b24949063d99067c07
Component: engine
The parameter name was wrong, which may mislead a user.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit c378fb774e413ba8bf5cadf655d2b67e9c94245a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: c9ddc6effc444c54def41d498b359a9a986ad79d
Component: engine
This allows to run the daemon in environments that have upstream containerd installed.
Signed-off-by: Tibor Vass <tibor@docker.com>
(cherry picked from commit 34eede0296bce6a9c335cb429f10728ae3f4252d)
Signed-off-by: Tibor Vass <tibor@docker.com>
Upstream-commit: b3bb2aabb8ed5a8af0a9f48fb5aba3f39af38e0d
Component: engine
The `/etc/docker` directory is used both by the dockerd daemon
and the docker cli (if installed on the saem host as the daemon).
In situations where the `/etc/docker` directory does not exist,
and an initial `key.json` (legacy trust key) is generated (at the
default location), the `/etc/docker/` directory was created with
0700 permissions, making the directory only accessible by `root`.
Given that the `0600` permissions on the key itself already protect
it from being used by other users, the permissions of `/etc/docker`
can be less restrictive.
This patch changes the permissions for the directory to `0755`, so
that the CLI (if executed as non-root) can also access this directory.
> **NOTE**: "strictly", this patch is only needed for situations where no _custom_
> location for the trustkey is specified (not overridden with `--deprecated-key-path`),
> but setting the permissions only for the "default" case would make
> this more complicated.
```bash
make binary shell
make install
ls -la /etc/ | grep docker
dockerd
^C
ls -la /etc/ | grep docker
drwxr-xr-x 2 root root 4096 Sep 14 12:11 docker
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cecd9817177093be99c1c9bb0dcf43ccec14ad1d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: fc576226b24e8b5db6e95e48967d56c5808f9fe9
Component: engine
This should test that
- all the messages produced are delivered (i.e. not lost)
- followLogs() exits
Loosely based on the test having the same name by Brian Goff, see
https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f845d76d047760c91dc0c7076aea840291fbdbc5)
Upstream-commit: 2a82480df9ad91593d59be4b5283917dbea2da39
Component: engine
When daemon.ContainerLogs() is called with options.follow=true
(as in "docker logs --follow"), the "loggerutils.followLogs()"
function never returns (even then the logs consumer is gone).
As a result, all the resources associated with it (including
an opened file descriptor for the log file being read, two FDs
for a pipe, and two FDs for inotify watch) are never released.
If this is repeated (such as by running "docker logs --follow"
and pressing Ctrl-C a few times), this results in DoS caused by
either hitting the limit of inotify watches, or the limit of
opened files. The only cure is daemon restart.
Apparently, what happens is:
1. logs producer (a container) is gone, calling (*LogWatcher).Close()
for all its readers (daemon/logger/jsonfilelog/jsonfilelog.go:175).
2. WatchClose() is properly handled by a dedicated goroutine in
followLogs(), cancelling the context.
3. Upon receiving the ctx.Done(), the code in followLogs()
(daemon/logger/loggerutils/logfile.go#L626-L638) keeps to
send messages _synchronously_ (which is OK for now).
4. Logs consumer is gone (Ctrl-C is pressed on a terminal running
"docker logs --follow"). Method (*LogWatcher).Close() is properly
called (see daemon/logs.go:114). Since it was called before and
due to to once.Do(), nothing happens (which is kinda good, as
otherwise it will panic on closing a closed channel).
5. A goroutine (see item 3 above) keeps sending log messages
synchronously to the logWatcher.Msg channel. Since the
channel reader is gone, the channel send operation blocks forever,
and resource cleanup set up in defer statements at the beginning
of followLogs() never happens.
Alas, the fix is somewhat complicated:
1. Distinguish between close from logs producer and logs consumer.
To that effect,
- yet another channel is added to LogWatcher();
- {Watch,}Close() are renamed to {Watch,}ProducerGone();
- {Watch,}ConsumerGone() are added;
*NOTE* that ProducerGone()/WatchProducerGone() pair is ONLY needed
in order to stop ConsumerLogs(follow=true) when a container is stopped;
otherwise we're not interested in it. In other words, we're only
using it in followLogs().
2. Code that was doing (logWatcher*).Close() is modified to either call
ProducerGone() or ConsumerGone(), depending on the context.
3. Code that was waiting for WatchClose() is modified to wait for
either ConsumerGone() or ProducerGone(), or both, depending on the
context.
4. followLogs() are modified accordingly:
- context cancellation is happening on WatchProducerGone(),
and once it's received the FileWatcher is closed and waitRead()
returns errDone on EOF (i.e. log rotation handling logic is disabled);
- due to this, code that was writing synchronously to logWatcher.Msg
can be and is removed as the code above it handles this case;
- function returns once ConsumerGone is received, freeing all the
resources -- this is the bugfix itself.
While at it,
1. Let's also remove the ctx usage to simplify the code a bit.
It was introduced by commit a69a59ffc7e3d ("Decouple removing the
fileWatcher from reading") in order to fix a bug. The bug was actually
a deadlock in fsnotify, and the fix was just a workaround. Since then
the fsnofify bug has been fixed, and a new fsnotify was vendored in.
For more details, please see
https://github.com/moby/moby/pull/27782#issuecomment-416794490
2. Since `(*filePoller).Close()` is fixed to remove all the files
being watched, there is no need to explicitly call
fileWatcher.Remove(name) anymore, so get rid of the extra code.
Should fix https://github.com/moby/moby/issues/37391
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 916eabd459fe707b5c4a86377d12e2ad1871b353)
Upstream-commit: 84a5b528aede5579861201e869870d10fc98c07c
Component: engine
This test case checks that followLogs() exits once the reader is gone.
Currently it does not (i.e. this test is supposed to fail) due to #37391.
[kolyshkin@: test case Brian Goff, changelog and all bugs are by me]
Source: https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit d37a11bfbab83ab42b1160f116e863daac046192)
Upstream-commit: 511741735e0aa2fe68a66d99384c00d187d1a157
Component: engine
This code has many return statements, for some of them the
"end logs" or "end stream" message was not printed, giving
the impression that this "for" loop never ended.
Make sure that "begin logs" is to be followed by "end logs".
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 2e4c2a6bf9cb47fd07e42f9c043024ed3dbcd04d)
Upstream-commit: 2b8bc86679b7153bb4ace063a858637df0f16a2e
Component: engine
[18.09] backport: fix regression when filtering container names using a leading slash
Upstream-commit: b8a4fe5f8f58bccfa3763ca75c6465447c8ccd06
Component: engine
In case a volume is specified via Mounts API, and SELinux is enabled,
the following error happens on container start:
> $ docker volume create testvol
> $ docker run --rm --mount source=testvol,target=/tmp busybox true
> docker: Error response from daemon: error setting label on mount
> source '': no such file or directory.
The functionality to relabel the source of a local mount specified via
Mounts API was introduced in commit 5bbf5cc and later broken by commit
e4b6adc, which removed setting mp.Source field.
With the current data structures, the host dir is already available in
v.Mountpoint, so let's just use it.
Fixes: e4b6adc
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 4032b6778df39f53fda0e6e54f0256c9a3b1d618
Component: engine
Commit 5c8da2e96738addd8a175e5b9a23b8c37d4a4dfe updated the filtering behavior
to match container-names without having to specify the leading slash.
This change caused a regression in situations where a regex was provided as
filter, using an explicit leading slash (`--filter name=^/mycontainername`).
This fix changes the filters to match containers both with, and without the
leading slash, effectively making the leading slash optional when filtering.
With this fix, filters with and without a leading slash produce the same result:
$ docker ps --filter name=^a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
21afd6362b0c busybox "sh" 2 minutes ago Up 2 minutes a2
56e53770e316 busybox "sh" 2 minutes ago Up 2 minutes a1
$ docker ps --filter name=^/a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
21afd6362b0c busybox "sh" 2 minutes ago Up 2 minutes a2
56e53770e316 busybox "sh" 3 minutes ago Up 3 minutes a1
$ docker ps --filter name=^b
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b69003b6a6fe busybox "sh" About a minute ago Up About a minute b1
$ docker ps --filter name=^/b
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b69003b6a6fe busybox "sh" 56 seconds ago Up 54 seconds b1
$ docker ps --filter name=/a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
21afd6362b0c busybox "sh" 3 minutes ago Up 3 minutes a2
56e53770e316 busybox "sh" 4 minutes ago Up 4 minutes a1
$ docker ps --filter name=a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
21afd6362b0c busybox "sh" 3 minutes ago Up 3 minutes a2
56e53770e316 busybox "sh" 4 minutes ago Up 4 minutes a1
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 6f9b5ba810e9314ab9335490cd41316940a32b5e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 5fa80da2d385520f2c3c91e8ce23d181e4f93097
Component: engine
The remote API allows full privilege escalation and is equivalent to
having root access on the host. Because of this, the API should never
be accessible through an insecure connection (TCP without TLS, or TCP
without TLS verification).
Although a warning is already logged on startup if the daemon uses an
insecure configuration, this warning is not very visible (unless someone
decides to read the logs).
This patch attempts to make insecure configuration more visible by sending
back warnings through the API (which will be printed when using `docker info`).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 547b993e07330f3e74cba935975fce05e8661381
Component: engine
When requesting information about the daemon's configuration through the `/info`
endpoint, missing features (or non-recommended settings) may have to be presented
to the user.
Detecting these situations, and printing warnings currently is handled by the
cli, which results in some complications:
- duplicated effort: each client has to re-implement detection and warnings.
- it's not possible to generate warnings for reasons outside of the information
returned in the `/info` response.
- cli-side detection has to be updated for new conditions. This means that an
older cli connecting to a new daemon may not print all warnings (due to
it not detecting the new conditions)
- some warnings (in particular, warnings about storage-drivers) depend on
driver-status (`DriverStatus`) information. The format of the information
returned in this field is not part of the API specification and can change
over time, resulting in cli-side detection no longer being functional.
This patch adds a new `Warnings` field to the `/info` response. This field is
to return warnings to be presented by the user.
Existing warnings that are currently handled by the CLI are copied to the daemon
as part of this patch; This change is backward-compatible with existing
clients; old client can continue to use the client-side warnings, whereas new
clients can skip client-side detection, and print warnings that are returned by
the daemon.
Example response with this patch applied;
```bash
curl --unix-socket /var/run/docker.sock http://localhost/info | jq .Warnings
```
```json
[
"WARNING: bridge-nf-call-iptables is disabled",
"WARNING: bridge-nf-call-ip6tables is disabled"
]
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: a3d4238b9ce653d2863fbc93057ed4162a83221e
Component: engine
Blocks the execution of tasks during the Prepare phase until there
exists an IP address for every overlay network in use by the task. This
prevents a task from starting before the NetworkAttachment containing
the IP address has been sent down to the node.
Includes a basic test for the correct use case.
Signed-off-by: Drew Erny <drew.erny@docker.com>
Upstream-commit: 3c81dc3103d9c88cb333c80e0810f80ab80c374e
Component: engine
This feature allows user to specify list of subnets for global
default address pool. User can configure subnet list using
'swarm init' command. Daemon passes the information to swarmkit.
We validate the information in swarmkit, then store it in cluster
object. when IPAM init is called, we pass subnet list to IPAM driver.
Signed-off-by: selansen <elango.siva@docker.com>
Upstream-commit: f7ad95cab9cc7ba8925673a933028d53284c13f5
Component: engine
* Expose license status in Info
This wires up a new field in the Info payload that exposes the license.
For moby this is hardcoded to always report a community edition.
Downstream enterprise dockerd will have additional licensing logic wired
into this function to report details about the current license status.
Signed-off-by: Daniel Hiltgen <daniel.hiltgen@docker.com>
* Code review comments
Signed-off-by: Daniel Hiltgen <daniel.hiltgen@docker.com>
* Add windows autogen support
Signed-off-by: Daniel Hiltgen <daniel.hiltgen@docker.com>
Upstream-commit: 896d1b1c61a48e2df1a7b4644ddde6ee97db6111
Component: engine
This driver uses protobuf to store log messages and has better defaults
for log file handling (e.g. compression and file rotation enabled by
default).
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: a351b38e7217af059eb2f8fc3dba14dc03836a45
Component: engine