We just introduced a new tunable dm.xfs_nospace_max_retries. But this tunable
will work only on new kernels where xfs supports this feature. On older
kernels xfs does not allow tuning this behavior.
There are two issues. First one is that if xfsSetNospaceRetries() fails,
it returns error but leaves the device activated and mounted. We should
be unmounting the device and deactivate it before returning.
Second issue is, if docker is started on older kernel, with
dm.xfs_nospace_max_retries specified, then docker will silently ignore the
fact that /sys file to tweak this behavior is not present and will continue.
But I think it might be better to fail container creation/start if kernel
does not support this feature.
This patch fixes it. After this patch, user will get an error like following
when container is run.
# docker run -ti fedora bash
docker: Error response from daemon: devmapper: user specified daemon option dm.xfs_nospace_max_retries but it does not seem to be supported on this system :open /sys/fs/xfs/dm-5/error/metadata/ENOSPC/max_retries: no such file or directory.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 6cc55dd65b363fe520c2ab29a9303f79afd4cadb
Component: engine
When xfs filesystem is being used on top of thin pool, xfs can get ENOSPC
errors from thin pool when thin pool is full. As of now xfs retries the
IO and keeps on retrying and does not give up. This can result in container
application being stuck for a very long time. In fact I have seen instances
of unkillable processes. So that means once thin pool is full and process
gets stuck, container can't be stopped/killed either and only option left
seems to be power recycle of the box.
In another instance, writer did not block but failed after a while. But
when I tried to exit/stop the container, unmounting xfs hanged and only
thing I could do was power cycle the machine.
Now upstream kernel has committed patches where it allows user space to
customize user space behavior in case of errors. One of the knobs is
max_retries, which specifies how many times an IO should be retried
when ENOSPC is encountered.
This patch sets provides a tunable knob (dm.xfs_nospace_max_retries) so
that user can specify value for max_retries and tune xfs behavior. If
one sets this value to 0, xfs will not retry IO when ENOSPC error is
encountered. It will instead give up and shutdown filesystem.
This knob can be useful if one is running into unkillable
processes/containers issue on top of xfs.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 4f0017b9ad7dfa2e9dcdee69d000b98595893e60
Component: engine
In the common case where the user is using /var/lib/docker and
an image with less than 60 layers, forking is not needed. Calculate
whether absolute paths can be used and avoid forking to mount in
those cases.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Upstream-commit: c13a985fa1196a5ed782d5ac68a4bbb68dd529ca
Component: engine
If we are running in a user namespace, don't try to mknod as
it won't be allowed. libcontainer will bind-mount the host's
devices over files in the container anyway, so it's not needed.
The chrootarchive package does a chroot (without mounting /proc) before
its work, so we cannot check /proc/self/uid_map when we need to. So
compute it in advance and pass it along with the tar options.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Upstream-commit: 617c352e9225b1d598e893aa5f89a8863808e4f2
Component: engine
Problem Description:
An example scenario that involves deferred removal
1. A new base image gets created (e.g. 'docker load -i'). The base device is activated and
mounted at some point in time during image creation.
2. While image creation is in progress, a privileged container is started
from another image and the host's mount name space is shared with this
container ('docker run --privileged -v /:/host').
3. Image creation completes and the base device gets unmounted. However,
as the privileged container still holds a reference on the base image
mount point, the base device cannot be removed right away. So it gets
flagged for deferred removal.
4. Next, the privileged container terminates and thus its reference to the
base image mount point gets released. The base device (which is flagged
for deferred removal) may now be cleaned up by the device-mapper. This
opens up an opportunity for a race between a 'kworker' thread (executing
the do_deferred_remove() function) and the Docker daemon (executing the
CreateSnapDevice() function).
This PR cancel the deferred removal, if the device is marked for it. And reschedule the
deferred removal later after the device is resumed successfully.
Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>
Upstream-commit: 0e633ee14aca32480ac4735675222c35f4e11d8c
Component: engine
Truncated dir name can't give any useful information, print whole dir
name will.
Bad debug log is like this:
```
DEBU[2449] aufs error unmounting /var/lib/doc: no such file or directory
```
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: af8359562c9561afad0a05e66386588d17788804
Component: engine
Use module name logrus instead of log.
Use logrus.[Error|Warn|Debug|Fatal|Panic|Info]f instead of w/o f
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Upstream-commit: 6a1183b3ae512961f41a8ccfc8205e08294216f4
Component: engine
Add option to skip kernel check for older kernels which have been patched to support multiple lower directories in overlayfs.
Fixes#24023
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Upstream-commit: ff98da0607c4d6a94a2356d9ccaa64cc9d7f6a78
Component: engine
Now that Windows base images can be loaded directly into docker via "docker load" of a specialized tar file (with docker pull support on the horizon) we no longer have need of the custom images code path that loads images from a shared central location. Removing that code and it's call points.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Upstream-commit: 3e109f349ff42ea1a0f555b6e645c51d9bc1539b
Component: engine
Currently when overlay creates a whiteout file then the overlay2 layer is archived,
the correct tar header will be created for the whiteout file, but the tar logic will then attempt to open the file causing a failure.
When tar encounters such failures the file is skipped and excluded for the archive, causing the whiteout to be ignored.
By skipping the copy of empty files, no open attempt will be made on whiteout files.
Fixes#23863
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Upstream-commit: bd13c53f8dc1504c9e681dfabef2afc1685ccec7
Component: engine
This patch introduces a new experimental engine-level plugin management
with a new API and command line. Plugins can be distributed via a Docker
registry, and their lifecycle is managed by the engine.
This makes plugins a first-class construct.
For more background, have a look at issue #20363.
Documentation is in a separate commit. If you want to understand how the
new plugin system works, you can start by reading the documentation.
Note: backwards compatibility with existing plugins is maintained,
albeit they won't benefit from the advantages of the new system.
Signed-off-by: Tibor Vass <tibor@docker.com>
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Upstream-commit: f37117045c5398fd3dca8016ea8ca0cb47e7312b
Component: engine
Symlinks are currently not getting cleaned up when removing layers since only the root directory is removed.
On remove, read the link file and remove the associated link from the link directory.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Upstream-commit: e6f2e7646c813e8225b0bc16d3a0c13c76a9cd97
Component: engine
We added docs about ecryptfs check but not in code side.
Also refactor code to make it clean.
Signed-off-by: Kai Qiang Wu(Kennan) <wkqwu@cn.ibm.com>
Upstream-commit: 136323b04315eceffbed61d680878ed23cecc015
Component: engine
Diff apply is sometimes producing a different change list causing the tests to fail.
Overlay has a known issue calculating diffs of files which occur within the same second they were created.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Upstream-commit: 0e74aabbb9aa5cea0b6bf7342f9e325f989468fa
Component: engine
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
This fix fixes#23459.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: a72b45dbec3caeb3237d1af5aedd04adeb083571
Component: engine
Adds a new overlay driver which uses multiple lower directories to create the union fs.
Additionally it uses symlinks and relative mount paths to allow a depth of 128 and stay within the mount page size limit.
Diffs and done directly over a single directory allowing diffs to be done efficiently and without the need fo the naive diff driver.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Upstream-commit: 23e5c94cfb26eb72c097892712d3dbaa93ee9bc0
Component: engine
device Base should not exists on failure:
--- FAIL: TestDevmapperCreateBase (0.06s)
graphtest_unix.go:122: stat
/tmp/docker-graphtest-079240530/devicemapper/mnt/Base/rootfs/a subdir:
no such file or directory
--- FAIL: TestDevmapperCreateSnap (0.00s)
graphtest_unix.go:219: devmapper: device Base already
exists.
it should be:
--- FAIL: TestDevmapperCreateBase (0.25s)
graphtest_unix.go:122: stat
/tmp/docker-graphtest-828994195/devicemapper/mnt/Base/rootfs/a subdir:
no such file or directory
--- FAIL: TestDevmapperCreateSnap (0.13s)
graphtest_unix.go:122: stat
/tmp/docker-graphtest-828994195/devicemapper/mnt/Snap/rootfs/a subdir:
no such file or directory
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Upstream-commit: b18062122d76a9f53822889874aa12103d491372
Component: engine