[18.09 backport] Graphdriver: fix "device" mode not being detected if "character-device" bit is set
Upstream-commit: 2e4c5c57c30522dc4b33b5cd5371f294ae3fee82
Component: engine
Due to a bug in Golang (github.com/golang#27640), the "character device"
bit was omitted when checking file-modes with `os.ModeType`.
This bug was resolved in Go 1.12, but as a result, graphdrivers
would no longer recognize "device" files, causing pulling of
images that have a file with this filemode to fail;
failed to register layer:
unknown file type for /var/lib/docker/vfs/dir/.../dev/console
The current code checked for an exact match of Modes to be set. The
`os.ModeCharDevice` and `os.ModeDevice` bits will always be set in
tandem, however, because the code was only looking for an exact
match, this detection broke now that `os.ModeCharDevice` was added.
This patch changes the code to be more defensive, and instead
check if the `os.ModeDevice` bit is set (either with, or without
the `os.ModeCharDevice` bit).
In addition, some information was added to the error-message if
no type was matched, to assist debugging in case additional types
are added in future.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c7a38c2c06f7ab844a48c6c447942913131b83d6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 3744b45ba8ad93f1a21cbc80420856b04efc4593
Component: engine
containerd has two objects with regard to containers.
There is a "container" object which is metadata and a "task" which is
manging the actual runtime state.
When docker starts a container, it creartes both the container metadata
and the task at the same time. So when a container exits, docker deletes
both of these objects as well.
This ensures that if, on start, when we go to create the container metadata object
in containerd, if there is an error due to a name conflict that we go
ahead and clean that up and try again.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 5ba30cd1dc6000ee53b34f628cbff91d7f6d7231)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 1d0353548a21b0b60595ad44859a7072f90ea6c6
Component: engine
for windows all networks are re-populated in the store during network controller initialization. In current version it also regenerate network Ids which may be referenced by other components and it may cause broken references to a networks. This commit avoids regeneration of network ids.
Signed-off-by: Andrey Kolomentsev <andrey.kolomentsev@docker.com>
(cherry picked from commit e017717d96540dd263d95f90fdb2457928909924)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 317e0acc4e602f978e4d9c0130a113d179026c8e
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
This is a fix for a few related scenarios where it's impossible to remove layers or containers
until the host is rebooted. Generally (or at least easiest to repro) through a forced daemon kill
while a container is running.
Possibly slightly worse than that, as following a host reboot, the scratch layer would possibly be leaked and
left on disk under the dataroot\windowsfilter directory after the container is removed.
One such example of a failure:
1. run a long running container with the --rm flag
docker run --rm -d --name test microsoft/windowsservercore powershell sleep 30
2. Force kill the daemon not allowing it to cleanup. Simulates a crash or a host power-cycle.
3. (re-)Start daemon
4. docker ps -a
PS C:\control> docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7aff773d782b malloc "powershell start-sl…" 11 seconds ago Removal In Progress malloc
5. Try to remove
PS C:\control> docker rm 7aff
Error response from daemon: container 7aff773d782bbf35d95095369ffcb170b7b8f0e6f8f65d5aff42abf61234855d: driver "windowsfilter" failed to remove root filesystem: rename C:\control\windowsfilter\7aff773d782bbf35d95095369ffcb170b7b8f0e6f8f65d5aff42abf61234855d C:\control\windowsfilter\7aff773d782bbf35d95095369ffcb170b7b8f0e6f8f65d5aff42abf61234855d-removing: Access is denied.
PS C:\control>
Step 5 fails.
(cherry picked from commit efdad5374465a2a889d7572834a2dcca147af4fb)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 02fe71843e0e45ddc986a6c8182370b042349a27
Component: engine
The CloudWatch Logs API defines its limits in terms of bytes, but its
inputs in terms of UTF-8 encoded strings. Byte-sequences which are not
valid UTF-8 encodings are normalized to the Unicode replacement
character U+FFFD, which is a 3-byte sequence in UTF-8. This replacement
can cause the input to grow, exceeding the API limit and causing failed
API calls.
This commit adds logic for counting the effective byte length after
normalization and splitting input without splitting valid UTF-8
byte-sequences into two invalid byte-sequences.
Fixes https://github.com/moby/moby/issues/37747
Signed-off-by: Samuel Karp <skarp@amazon.com>
(cherry picked from commit 1e8ef386279e2e28aff199047e798fad660efbdd)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 757650e8dcca87f95ba083a80639769a0b6ca1cc
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
This removes the need for an SVM in the LCOW driver to ApplyDiff.
This change relates to a fix for https://github.com/moby/moby/issues/36353
However, it found another issue, tracked by https://github.com/moby/moby/issues/37955
(cherry picked from commit bde99960657818efabf313991867c42921882275)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 9cf6464b639aef09df9b16999cc340a1276d0bf8
Component: engine
4MB client side limit was introduced in vendoring go-grpc#1165 (v1.4.0)
making these requests likely to produce errors
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit 489b8eda6674523df8b82a210399b7d2954427d0)
Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
Upstream-commit: 6ca0546f2571cf4acdc1f541bccfac23a78cb8d2
Component: engine
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
(cherry picked from commit f962bd06ed8824d1f75d8546b428965cd61bdf7f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 5591f0b1ee7dec101b490228258613cd7caf64ee
Component: engine
For some reason, shared mount propagation between the host
and a container does not work for btrfs, unless container
root directory (i.e. graphdriver home) is a bind mount.
The above issue was reproduced on SLES 12sp3 + btrfs using
the following script:
#!/bin/bash
set -eux -o pipefail
# DIR should not be under a subvolume
DIR=${DIR:-/lib}
MNT=$DIR/my-mnt
FILE=$MNT/file
ID=$(docker run -d --privileged -v $DIR:$DIR:rshared ubuntu sleep 24h)
docker exec $ID mkdir -p $MNT
docker exec $ID mount -t tmpfs tmpfs $MNT
docker exec $ID touch $FILE
ls -l $FILE
umount $MNT
docker rm -f $ID
which fails this way:
+ ls -l /lib/my-mnt/file
ls: cannot access '/lib/my-mnt/file': No such file or directory
meaning the mount performed inside a priviledged container is not
propagated back to the host (even if all the mounts have "shared"
propagation mode).
The remedy to the above is to make graphdriver home a bind mount.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 16d822bba8ac5ab22c8697750f700403bca3dbf3)
Upstream-commit: fa8ac946165b8004a15e85744e774ed6ba99fd38
Component: engine
The `overlay` storage driver is deprecated in favor of the `overlay2` storage
driver, which has all the benefits of `overlay`, without its limitations (excessive
inode consumption). The legacy `overlay` storage driver will be removed in a future
release. Users of the `overlay` storage driver should migrate to the `overlay2`
storage driver.
The legacy `overlay` storage driver allowed using overlayFS-backed filesystems
on pre 4.x kernels. Now that all supported distributions are able to run `overlay2`
(as they are either on kernel 4.x, or have support for multiple lowerdirs
backported), there is no reason to keep maintaining the `overlay` storage driver.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 31be4e0ba11532e5d6df8401f5d459e292c58f36)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: c20e8dffbbb4dd328e4c00d18e04eecb80cf4d4e
Component: engine
The `devicemapper` storage driver is deprecated in favor of `overlay2`, and will
be removed in a future release. Users of the `devicemapper` storage driver are
recommended to migrate to a different storage driver, such as `overlay2`, which
is now the default storage driver.
The `devicemapper` storage driver facilitates running Docker on older (3.x) kernels
that have no support for other storage drivers (such as overlay2, or AUFS).
Now that support for `overlay2` is added to all supported distros (as they are
either on kernel 4.x, or have support for multiple lowerdirs backported), there
is no reason to continue maintenance of the `devicemapper` storage driver.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 06fcabbaa0d4f25db0580d3604bbed708223fb06)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 734e7a8e55e41b064025273776dc5a76e45bf2e1
Component: engine
[18.09] backport fix denial of service with large numbers in cpuset-cpus and cpuset-mems
Upstream-commit: 4b8336f7cf091fd5c4742286bda1e34c45667d78
Component: engine
As pointed out in https://github.com/moby/moby/issues/37970,
Docker overlay driver can't work with index=on feature of
the Linux kernel "overlay" filesystem. In case the global
default is set to "yes", Docker will fail with EBUSY when
trying to mount, like this:
> error creating overlay mount to ...../merged: device or resource busy
and the kernel log should contain something like:
> overlayfs: upperdir is in-use by another mount, mount with
> '-o index=off' to override exclusive upperdir protection.
A workaround is to set index=off in overlay kernel module
parameters, or even recompile the kernel with
CONFIG_OVERLAY_FS_INDEX=n in .config. Surely this is not
always practical or even possible.
The solution, as pointed out my Amir Goldstein (as well as
the above kernel message:) is to use 'index=off' option
when mounting.
NOTE since older (< 4.13rc1) kernels do not support "index="
overlayfs parameter, try to figure out whether the option
is supported. In case it's not possible to figure out,
assume it is not.
NOTE the default can be changed anytime (by writing to
/sys/module/overlay/parameters/index) so we need to always
use index=off.
[v2: move the detection code to Init()]
[v3: don't set index=off if stat() failed]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 8422d85087bfa770b62ef4e1daaca95ee6783d86)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 690e097fedd7362f3b2781c32ca872ad966d286e
Component: engine
This simplifies the code a lot.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit a55d32546a8556f9e6cabbc99836b573b9944f0c)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: dc0a4db7c9dc593a8568a8e30e4e21e118c2839d
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
(cherry picked from commit c907c2486c0850030f8ac1819ac8c87631472c68)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 7184074c0880c656be00645007588a00ec2266cd
Component: engine
Using a value such as `--cpuset-mems=1-9223372036854775807` would cause
`dockerd` to run out of memory allocating a map of the values in the
validation code. Set limits to the normal limit of the number of CPUs,
and improve the error handling.
Reported by Huawei PSIRT.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit f8e876d7616469d07b8b049ecb48967eeb8fa7a5)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 0922d32bce74657266aff213f83dfa638e8077f4
Component: engine
With containerd reaching 1.0, the runtime now
has a stable API, so there's no need to do a check
if the installed version matches the expected version.
Current versions of Docker now also package containerd
and runc separately, and can be _updated_ separately.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit c65f0bd13c85d29087419fa555281311091825e7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 054c3c2931cec5dca8bb84af97f1457c343ec02f
Component: engine
We check for max value for -default-addr-pool-mask-length param as 32.
But There won't be enough addresses on the overlay network. Hence we are
keeping it 29 so that we would be having atleast 8 addresses in /29 network.
Signed-off-by: selansen <elango.siva@docker.com>
(cherry picked from commit d25c5df80e60cdbdc23fe3d0e2a6808123643dc7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 9406f3622d18a0d9b6c438190e8fdd8be53d3b22
Component: engine
Addressing few review comments as part of code refactoring.
Also moved validation logic from CLI to Moby.
Signed-off-by: selansen <elango.siva@docker.com>
(cherry picked from commit 148ff00a0a800fad99de11ee3021d4c5d4869157)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 9816bfcaf58a609d64d648043c10817c27dcfa36
Component: engine
Similar to a related issue where previously, private Hyper-V networks
would each add 15 secs to the daemon startup, non-hns governed internal
networks are reported by hns as network type "internal" which is not
mapped to any network plugin (and thus we get the same plugin load retry
loop as before).
This issue hits Docker for Desktop because we setup such a network for
the Linux VM communication.
Signed-off-by: Simon Ferquel <simon.ferquel@docker.com>
(cherry picked from commit 6a1a4f97217b0a8635bc21fc86628f48bf8824d1)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 54bd14a3fe1d4925c6fa88b24949063d99067c07
Component: engine
The parameter name was wrong, which may mislead a user.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit c378fb774e413ba8bf5cadf655d2b67e9c94245a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: c9ddc6effc444c54def41d498b359a9a986ad79d
Component: engine
This allows to run the daemon in environments that have upstream containerd installed.
Signed-off-by: Tibor Vass <tibor@docker.com>
(cherry picked from commit 34eede0296bce6a9c335cb429f10728ae3f4252d)
Signed-off-by: Tibor Vass <tibor@docker.com>
Upstream-commit: b3bb2aabb8ed5a8af0a9f48fb5aba3f39af38e0d
Component: engine
The `/etc/docker` directory is used both by the dockerd daemon
and the docker cli (if installed on the saem host as the daemon).
In situations where the `/etc/docker` directory does not exist,
and an initial `key.json` (legacy trust key) is generated (at the
default location), the `/etc/docker/` directory was created with
0700 permissions, making the directory only accessible by `root`.
Given that the `0600` permissions on the key itself already protect
it from being used by other users, the permissions of `/etc/docker`
can be less restrictive.
This patch changes the permissions for the directory to `0755`, so
that the CLI (if executed as non-root) can also access this directory.
> **NOTE**: "strictly", this patch is only needed for situations where no _custom_
> location for the trustkey is specified (not overridden with `--deprecated-key-path`),
> but setting the permissions only for the "default" case would make
> this more complicated.
```bash
make binary shell
make install
ls -la /etc/ | grep docker
dockerd
^C
ls -la /etc/ | grep docker
drwxr-xr-x 2 root root 4096 Sep 14 12:11 docker
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit cecd9817177093be99c1c9bb0dcf43ccec14ad1d)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: fc576226b24e8b5db6e95e48967d56c5808f9fe9
Component: engine
This should test that
- all the messages produced are delivered (i.e. not lost)
- followLogs() exits
Loosely based on the test having the same name by Brian Goff, see
https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f845d76d047760c91dc0c7076aea840291fbdbc5)
Upstream-commit: 2a82480df9ad91593d59be4b5283917dbea2da39
Component: engine
When daemon.ContainerLogs() is called with options.follow=true
(as in "docker logs --follow"), the "loggerutils.followLogs()"
function never returns (even then the logs consumer is gone).
As a result, all the resources associated with it (including
an opened file descriptor for the log file being read, two FDs
for a pipe, and two FDs for inotify watch) are never released.
If this is repeated (such as by running "docker logs --follow"
and pressing Ctrl-C a few times), this results in DoS caused by
either hitting the limit of inotify watches, or the limit of
opened files. The only cure is daemon restart.
Apparently, what happens is:
1. logs producer (a container) is gone, calling (*LogWatcher).Close()
for all its readers (daemon/logger/jsonfilelog/jsonfilelog.go:175).
2. WatchClose() is properly handled by a dedicated goroutine in
followLogs(), cancelling the context.
3. Upon receiving the ctx.Done(), the code in followLogs()
(daemon/logger/loggerutils/logfile.go#L626-L638) keeps to
send messages _synchronously_ (which is OK for now).
4. Logs consumer is gone (Ctrl-C is pressed on a terminal running
"docker logs --follow"). Method (*LogWatcher).Close() is properly
called (see daemon/logs.go:114). Since it was called before and
due to to once.Do(), nothing happens (which is kinda good, as
otherwise it will panic on closing a closed channel).
5. A goroutine (see item 3 above) keeps sending log messages
synchronously to the logWatcher.Msg channel. Since the
channel reader is gone, the channel send operation blocks forever,
and resource cleanup set up in defer statements at the beginning
of followLogs() never happens.
Alas, the fix is somewhat complicated:
1. Distinguish between close from logs producer and logs consumer.
To that effect,
- yet another channel is added to LogWatcher();
- {Watch,}Close() are renamed to {Watch,}ProducerGone();
- {Watch,}ConsumerGone() are added;
*NOTE* that ProducerGone()/WatchProducerGone() pair is ONLY needed
in order to stop ConsumerLogs(follow=true) when a container is stopped;
otherwise we're not interested in it. In other words, we're only
using it in followLogs().
2. Code that was doing (logWatcher*).Close() is modified to either call
ProducerGone() or ConsumerGone(), depending on the context.
3. Code that was waiting for WatchClose() is modified to wait for
either ConsumerGone() or ProducerGone(), or both, depending on the
context.
4. followLogs() are modified accordingly:
- context cancellation is happening on WatchProducerGone(),
and once it's received the FileWatcher is closed and waitRead()
returns errDone on EOF (i.e. log rotation handling logic is disabled);
- due to this, code that was writing synchronously to logWatcher.Msg
can be and is removed as the code above it handles this case;
- function returns once ConsumerGone is received, freeing all the
resources -- this is the bugfix itself.
While at it,
1. Let's also remove the ctx usage to simplify the code a bit.
It was introduced by commit a69a59ffc7e3d ("Decouple removing the
fileWatcher from reading") in order to fix a bug. The bug was actually
a deadlock in fsnotify, and the fix was just a workaround. Since then
the fsnofify bug has been fixed, and a new fsnotify was vendored in.
For more details, please see
https://github.com/moby/moby/pull/27782#issuecomment-416794490
2. Since `(*filePoller).Close()` is fixed to remove all the files
being watched, there is no need to explicitly call
fileWatcher.Remove(name) anymore, so get rid of the extra code.
Should fix https://github.com/moby/moby/issues/37391
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 916eabd459fe707b5c4a86377d12e2ad1871b353)
Upstream-commit: 84a5b528aede5579861201e869870d10fc98c07c
Component: engine
This test case checks that followLogs() exits once the reader is gone.
Currently it does not (i.e. this test is supposed to fail) due to #37391.
[kolyshkin@: test case Brian Goff, changelog and all bugs are by me]
Source: https://gist.github.com/cpuguy83/e538793de18c762608358ee0eaddc197
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit d37a11bfbab83ab42b1160f116e863daac046192)
Upstream-commit: 511741735e0aa2fe68a66d99384c00d187d1a157
Component: engine
This code has many return statements, for some of them the
"end logs" or "end stream" message was not printed, giving
the impression that this "for" loop never ended.
Make sure that "begin logs" is to be followed by "end logs".
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 2e4c2a6bf9cb47fd07e42f9c043024ed3dbcd04d)
Upstream-commit: 2b8bc86679b7153bb4ace063a858637df0f16a2e
Component: engine