Commit Graph

814 Commits

Author SHA1 Message Date
ad01430349 Merge pull request #35829 from cpuguy83/no_private_mount_for_plugins
Perform plugin mounts in the runtime
Upstream-commit: 20028325daab4fcbee9c8e28f43dbfb2b1c5d568
Component: engine
2018-02-21 12:28:13 +01:00
39cad2aa10 Merge pull request #36256 from wcwxyz/fix-refcounter-memory-leak
graphdriver: Fix RefCounter memory leak
Upstream-commit: 733ed2ddd3c621dadafbb74feb7b80d20fd3fd6f
Component: engine
2018-02-19 10:32:14 -08:00
8bb8847f9c Merge pull request #36237 from cpuguy83/zfs_do_not_unmount
Do not recursive unmount on cleanup of zfs/btrfs
Upstream-commit: 68c3201626439d5be5c24d14d4fe7e27fe93954d
Component: engine
2018-02-14 09:49:17 -05:00
4adc380b90 Fix typos in daemon
Signed-off-by: bin liu <liubin0329@gmail.com>
Upstream-commit: b00a67be6e3d3f241879110bd342abaa8e23cbac
Component: engine
2018-02-10 19:42:54 +08:00
f74751654b graphdriver: Fix RefCounter memory leak
Signed-off-by: WANG Chao <chao.wang@ucloud.cn>
Upstream-commit: 9015a05606a9bb80f0d8d2e3d43b0b682ca53db4
Component: engine
2018-02-09 10:26:06 +08:00
90450b2044 Ensure plugin returns correctly scoped paths
Before this change, volume management was relying on the fact that
everything the plugin mounts is visible on the host within the plugin's
rootfs. In practice this caused some issues with mount leaks, so we
changed the behavior such that mounts are not visible on the plugin's
rootfs, but available outside of it, which breaks volume management.

To fix the issue, allow the plugin to scope the path correctly rather
than assuming that everything is visible in `p.Rootfs`.
In practice this is just scoping the `PropagatedMount` paths to the
correct host path.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 0e5eaf8ee32662182147f5f62c1bfebef66f5c47
Component: engine
2018-02-07 15:48:27 -05:00
3928c278e5 Plugins perform propagated mount in runtime spec
Setting up the mounts on the host increases chances of mount leakage and
makes for more cleanup after the plugin has stopped.
With this change all mounts for the plugin are performed by the
container runtime and automatically cleaned up when the container exits.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: a53930a04fa81b082aa78e66b342ff19cc63cc5f
Component: engine
2018-02-07 15:48:27 -05:00
33ddc6d172 Do not recursive unmount on cleanup of zfs/btrfs
This was added in #36047 just as a way to make sure the tree is fully
unmounted on shutdown.

For ZFS this could be a breaking change since there was no unmount before.
Someone could have setup the zfs tree themselves. It would be better, if
we really do want the cleanup to actually the unpacked layers checking
for mounts rather than a blind recursive unmount of the root.

BTRFS does not use mounts and does not need to unmount anyway.
These was only an unmount to begin with because for some reason the
btrfs tree was being moutned with `private` propagation.

For the other graphdrivers that still have a recursive unmount here...
these were already being unmounted and performing the recursive unmount
shouldn't break anything. If anyone had anything mounted at the
graphdriver location it would have been unmounted on shutdown anyway.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 2fe4f888bee52b1f256d6fa5e20f9b061d30221c
Component: engine
2018-02-07 15:08:17 -05:00
be83c11fb0 Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: 4f0d95fa6ee7f865597c03b9e63702cdcb0f7067
Component: engine
2018-02-05 16:51:57 -05:00
6c313c03fe Merge pull request #36114 from Microsoft/jjh/fixdeadlock-lcowdriver-hotremove
LCOW: Graphdriver fix deadlock in hotRemoveVHDs
Upstream-commit: 03a1df95369ddead968e48697038904c84578d00
Component: engine
2018-01-29 09:57:43 -08:00
db2a10168c Merge pull request #36047 from cpuguy83/graphdriver_improvements
Do not make graphdriver homes private mounts.
Upstream-commit: 2c05aefc99d33edde47b08e38978b6c2f4178648
Component: engine
2018-01-26 13:54:30 -05:00
a002f8068e LCOW: Graphdriver fix deadlock
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: a44fcd3d27c06aaa60d8d1cbce169f0d982e74b1
Component: engine
2018-01-26 08:57:52 -08:00
c49971b835 Merge pull request #36052 from Microsoft/jjh/no-overlay-off-only-one-disk
LCOW: Regular mount if only one layer
Upstream-commit: a8d0e36d0329063af9205b4848d1f5c09bd4c3be
Component: engine
2018-01-25 15:46:16 -08:00
12ceea25e6 Merge pull request #36051 from Microsoft/jjh/remotefs-read-return-error
LCOW remotefs - return error in Read() implementation
Upstream-commit: 3c9d023af3428f49241a2e2385dae43151185466
Component: engine
2018-01-19 11:27:13 -08:00
ebd586c561 LCOW remotefs - return error in Read() implementation
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: 6112ad6e7d5d7f5afc698447da80f91bdbf62720
Component: engine
2018-01-18 17:46:58 -08:00
942fd3c62c LCOW: Regular mount if only one layer
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: 420dc4eeb48b155e6b83fccf62f8727ce4bf5b21
Component: engine
2018-01-18 12:01:58 -08:00
852153685d LCOW: Refactor to multiple layer-stores based on feedback
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: afd305c4b5682fbc297e1685e2b7a49628b7c7f0
Component: engine
2018-01-18 08:31:05 -08:00
9b47a9d16f Do not make graphdriver homes private mounts.
The idea behind making the graphdrivers private is to prevent leaking
mounts into other namespaces.
Unfortunately this is not really what happens.

There is one case where this does work, and that is when the namespace
was created before the daemon's namespace.
However with systemd each system servie winds up with it's own mount
namespace. This causes a race betwen daemon startup and other system
services as to if the mount is actually private.

This also means there is a negative impact when other system services
are started while the daemon is running.

Basically there are too many things that the daemon does not have
control over (nor should it) to be able to protect against these kinds
of leakages. One thing is certain, setting the graphdriver roots to
private disconnects the mount ns heirarchy preventing propagation of
unmounts... new mounts are of course not propagated either, but the
behavior is racey (or just bad in the case of restarting services)... so
it's better to just be able to keep mount propagation in tact.

It also does not protect situations like `-v
/var/lib/docker:/var/lib/docker` where all mounts are recursively bound
into the container anyway.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 9803272f2db84df7955b16c0d847ad72cdc494d1
Component: engine
2018-01-18 09:34:00 -05:00
5a20e1240c LCOW: Fix OpenFile parameters
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: 141b9a74716c016029badf16aca21dc96975aaac
Component: engine
2018-01-17 07:58:18 -08:00
4a656e30d0 Improve zfs init log message for zfs
Signed-off-by: Drew Hubl <drew.hubl@gmail.com>
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 27b002f4a02e2d9f6eded9004b82cb81f121264f
Component: engine
2018-01-16 21:42:05 -05:00
d7084e8097 Merge pull request #35638 from cpuguy83/error_helpers2
Add helpers to create errdef errors
Upstream-commit: c36274da8326c59aaa12c48196671b41dcb89e5b
Component: engine
2018-01-15 10:56:46 -08:00
621388138c Golint: remove redundant ifs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: b4a63139696aea2c73ec361a9af8b36a118f0423
Component: engine
2018-01-15 00:42:25 +01:00
d4d0b5c268 Move api/errdefs to errdefs
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: d453fe35b9b8b52d0677fe0c3cc8373f2f5d30d0
Component: engine
2018-01-11 21:21:43 -05:00
58f87a5e9a Cleanup graphdriver/quota test
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: e3a6323419c8b2a81a21134ed49a1624eb483c81
Component: engine
2018-01-08 13:41:45 -05:00
f1638ade2e Merge pull request #35674 from kolyshkin/zfs-rmdir
zfs: fix ebusy on umount etc
Upstream-commit: daded8da9178ec097e59a6b0b2bf12754b5472d8
Component: engine
2017-12-28 15:35:19 -08:00
0a947da27f Merge pull request #35827 from kolyshkin/vfs-quota
Fix VFS vs quota regression
Upstream-commit: 3e1df952b7d693ac3961f492310852bdf3106240
Component: engine
2017-12-22 12:32:03 -06:00
f0d368cb33 vfs gd: ignore quota setup errors
This is a fix to regression in vfs graph driver introduced by
commit 7a1618ced359a3ac92 ("add quota support to VFS graphdriver").

On some filesystems, vfs fails to init with the following error:

> Error starting daemon: error initializing graphdriver: Failed to mknod
> /go/src/github.com/docker/docker/bundles/test-integration/d6bcf6de610e9/root/vfs/backingFsBlockDev:
> function not implemented

As quota is not essential for vfs, let's ignore (but log as a warning) any error
from quota init.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 1e8a087850aa9f96c5000a3ad90757d2e9c0499f
Component: engine
2017-12-19 13:55:04 -08:00
c5f1cdf92c projectquota: treat ENOSYS as quota unsupported
If mknod() returns ENOSYS, it most probably means quota is not supported
here, so return the appropriate error.

This is a conservative* fix to regression in vfs graph driver introduced
by commit 7a1618ced359a3ac92 ("add quota support to VFS graphdriver").
On some filesystems, vfs fails to init with the following error:

> Error starting daemon: error initializing graphdriver: Failed to mknod
> /go/src/github.com/docker/docker/bundles/test-integration/d6bcf6de610e9/root/vfs/backingFsBlockDev:
> function not implemented

Reported-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 2dd39b7841bdb9968884bbedc5db97ff77d4fe3e
Component: engine
2017-12-18 10:32:36 -08:00
fd08bae89c Remove redundant build-tags
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 6ed1163c98703f8dd0693cecbadc84d2cda811c3
Component: engine
2017-12-18 17:41:53 +01:00
feccd72a49 Fix overlay2 storage driver inside a user namespace
The overlay2 driver was not setting up the archive.TarOptions field
properly like other storage backend routes to "applyTarLayer"
functionality. The InUserNS field is populated now for overlay2 using
the same query function used by the other storage drivers.

Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Upstream-commit: 05b8d59015f8a5ce26c8bbaa8053b5bc7cb1a77b
Component: engine
2017-12-13 23:38:22 -05:00
ef0d4a2bec devmapper: Log fstype and mount options during mount error
Right now we only log source and destination (and demsg) if mount operation
fails. fstype and mount options are available easily. It probably is a good
idea to log these as well. Especially sometimes failures can happen due to
mount options.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: f728d74ac5d185adaa5f1a88eadc71217806859f
Component: engine
2017-12-11 13:27:14 -05:00
e8e8613267 Merge pull request #35514 from thaJeztah/disable-overlay-without-d_type
Remove support for overlay/overlay2 without d_type
Upstream-commit: bd8a9c25ee257384ca24cf32e61b6b0ef71f521d
Component: engine
2017-12-06 14:13:58 -08:00
0ec86524ed fixed typo (reliablity -> reliability)
Signed-off-by: Jihyun Hwang <jhhwang@telcoware.com>
Upstream-commit: 518c50c9b21225ee991d5147cccb687ea8640afc
Component: engine
2017-12-06 09:51:53 +09:00
0c8a47d019 Allow existing setups to continue using d_type
Even though it's highly discouraged, there are existing
installs that are running overlay/overlay2 on filesystems
without d_type support.

This patch allows the daemon to start in such cases, instead of
refusing to start without an option to override.

For fresh installs, backing filesystems without d_type support
will still cause the overlay/overlay2 drivers to be marked as
"unsupported", and skipped during the automatic selection.

This feature is only to keep backward compatibility, but
will be removed at some point.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 0a4e793a3da9ba6d20bccfb83f7c48e20a76d895
Component: engine
2017-12-04 18:41:25 -08:00
327b80ad82 Merge pull request #35537 from sargun/vfs-use-copy_file_range
Have VFS graphdriver use accelerated in-kernel copy
Upstream-commit: 4047cede65862aa0ea5616297d7c0f3b12526ad4
Component: engine
2017-12-04 19:34:56 -06:00
e4dde67875 Remove support for overlay/overlay2 without d_type
Support for running overlay/overlay2 on a backing filesystem
without d_type support (most likely: xfs, as ext4 supports
this by default), was deprecated for some time.

Running without d_type support is problematic, and can
lead to difficult to debug issues ("invalid argument" errors,
or unable to remove files from the container's filesystem).

This patch turns the warning that was previously printed
into an "unsupported" error, so that the overlay/overlay2
drivers are not automatically selected when detecting supported
storage drivers.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 0abb8dec3f730f3ad2cc9a161c97968a6bfd0631
Component: engine
2017-12-04 17:10:20 -08:00
403fcf5047 Perform fsmagic detection on driver's home-dir if it exists
The fsmagic check was always performed on "data-root" (`/var/lib/docker`),
not on the storage-driver's home directory (e.g. `/var/lib/docker/<somedriver>`).

This caused detection to be done on the wrong filesystem in situations
where `/var/lib/docker/<somedriver>` was a mount, and a different
filesystem than `/var/lib/docker` itself.

This patch checks if the storage-driver's home directory exists, and only
falls back to `/var/lib/docker` if it doesn't exist.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: f9c8fa305e1501d8056f8744cb193a720aab0e13
Component: engine
2017-12-04 17:10:07 -08:00
f1cfd038e2 zfs: fix ebusy on umount
This commit is a set of fixes and improvement for zfs graph driver,
in particular:

1. Remove mount point after umount in `Get()` error path, as well
   as in `Put()` (with `MNT_DETACH` flag). This should solve "failed
   to remove root filesystem for <ID> .... dataset is busy" error
   reported in Moby issue 35642.

To reproduce the issue:

   - start dockerd with zfs
   - docker run -d --name c1 --rm busybox top
   - docker run -d --name c2 --rm busybox top
   - docker stop c1
   - docker rm c1

Output when the bug is present:
```
Error response from daemon: driver "zfs" failed to remove root
filesystem for XXX : exit status 1: "/sbin/zfs zfs destroy -r
scratch/docker/YYY" => cannot destroy 'scratch/docker/YYY':
dataset is busy
```

Output when the bug is fixed:
```
Error: No such container: c1
```
(as the container has been successfully autoremoved on stop)

2. Fix/improve error handling in `Get()` -- do not try to umount
   if `refcount` > 0

3. Simplifies unmount in `Get()`. Specifically, remove call to
   `graphdriver.Mounted()` (which checks if fs is mounted using
   `statfs()` and check for fs type) and `mount.Unmount()` (which
   parses `/proc/self/mountinfo`). Calling `unix.Unmount()` is
   simple and sufficient.

4. Add unmounting of driver's home to `Cleanup()`.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: a450a575a672b90f02a4b1d3d9300ce1d70a6311
Component: engine
2017-12-01 13:46:47 -08:00
20a2865e53 Fix setting mtimes on directories
Previously, the code would set the mtime on the directories before
creating files in the directory itself. This was problematic
because it resulted in the mtimes on the directories being
incorrectly set. This change makes it so that the mtime is
set only _after_ all of the files have been created.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 77a2bc3e5bbc9be3fe166ed8321b7cd04e7bd097
Component: engine
2017-12-01 09:12:43 -08:00
87f648cd90 Merge pull request #35618 from kolyshkin/mkdir-all
Fix MkdirAll* and its usage
Upstream-commit: 72e45fd54e13256c813fdb39b18e26a0de980733
Component: engine
2017-11-30 11:19:29 -05:00
a88b24bfeb Merge pull request #35483 from thaJeztah/disallow-nfs-backing-for-overlay
Disallow overlay/overlay2 on top of NFS
Upstream-commit: bdd9668b489c65eb1ef7272d38ad877ffda2041c
Component: engine
2017-11-29 19:24:58 -08:00
006ed302fe Merge pull request #35527 from thaJeztah/feature-detect-overlay2
Detect overlay2 support on pre-4.0 kernels
Upstream-commit: 11e07e7da6023b872788b1e05f83e147f0984fb2
Component: engine
2017-11-29 13:51:26 -08:00
87cfb875f7 Merge pull request #34948 from euank/public-mounts
Fix EBUSY errors under overlayfs and v4.13+ kernels
Upstream-commit: 09eb7bcc3624f5bd70135d6d24021b40c1095b46
Component: engine
2017-11-29 12:48:39 +09:00
877f5d0f1f Fix bug, where copy_file_range was still calling legacy copy
There was a small issue here, where it copied the data using
traditional mechanisms, even when copy_file_range was successful.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 0eac562281782257e6f69d58bcbc13fa889f1759
Component: engine
2017-11-28 14:59:56 -08:00
608a03b9d5 Have VFS graphdriver use accelerated in-kernel copy
This change makes the VFS graphdriver use the kernel-accelerated
(copy_file_range) mechanism of copying files, which is able to
leverage reflinks.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: d2b71b26604370620630d8d3f35aba75ae474f3f
Component: engine
2017-11-28 14:59:56 -08:00
85f5db8154 Fix copying hardlinks in graphdriver/copy
Previously, graphdriver/copy would improperly copy hardlinks as just regular
files. This patch changes that behaviour, and instead the code now keeps
track of inode numbers, and if it sees the same inode number again
during the copy loop, it hardlinks it, instead of copying it.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: b467f8b2ef21dc2239dcd136a29283ea6c3a0aee
Component: engine
2017-11-28 14:59:56 -08:00
efd7b774aa Detect overlay2 support on pre-4.0 kernels
The overlay2 storage-driver requires multiple lower dir
support for overlayFs. Support for this feature was added
in kernel 4.x, but some distros (RHEL 7.4, CentOS 7.4) ship with
an older kernel with this feature backported.

This patch adds feature-detection for multiple lower dirs,
and will perform this feature-detection on pre-4.x kernels
with overlayFS support.

With this patch applied, daemons running on a kernel
with multiple lower dir support will now select "overlay2"
as storage-driver, instead of falling back to "overlay".

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 955c1f881ac94af19c99f0f7d5635e6a574789f2
Component: engine
2017-11-28 13:55:33 -08:00
2ddb9ceeba Merge pull request #35528 from thaJeztah/ignore-empty-graphdirs
Skip empty directories on prior graphdriver detection
Upstream-commit: 9ae697167c8b90b40d5248e99a8f0a0fc304637e
Component: engine
2017-11-27 17:37:39 -08:00
bc89af9929 Simplify/fix MkdirAll usage
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:

 - for `Mkdir()`, `IsExist` error should (usually) be ignored
   (unless you want to make sure directory was not there before)
   as it means "the destination directory was already there"

 - for `MkdirAll()`, `IsExist` error should NEVER be ignored.

Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.

Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.

NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).

For more details, a quote from my runc commit 6f82d4b (July 2015):

    TL;DR: check for IsExist(err) after a failed MkdirAll() is both
    redundant and wrong -- so two reasons to remove it.

    Quoting MkdirAll documentation:

    > MkdirAll creates a directory named path, along with any necessary
    > parents, and returns nil, or else returns an error. If path
    > is already a directory, MkdirAll does nothing and returns nil.

    This means two things:

    1. If a directory to be created already exists, no error is
    returned.

    2. If the error returned is IsExist (EEXIST), it means there exists
    a non-directory with the same name as MkdirAll need to use for
    directory. Example: we want to MkdirAll("a/b"), but file "a"
    (or "a/b") already exists, so MkdirAll fails.

    The above is a theory, based on quoted documentation and my UNIX
    knowledge.

    3. In practice, though, current MkdirAll implementation [1] returns
    ENOTDIR in most of cases described in #2, with the exception when
    there is a race between MkdirAll and someone else creating the
    last component of MkdirAll argument as a file. In this very case
    MkdirAll() will indeed return EEXIST.

    Because of #1, IsExist check after MkdirAll is not needed.

    Because of #2 and #3, ignoring IsExist error is just plain wrong,
    as directory we require is not created. It's cleaner to report
    the error now.

    Note this error is all over the tree, I guess due to copy-paste,
    or trying to follow the same usage pattern as for Mkdir(),
    or some not quite correct examples on the Internet.

    [1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 516010e92d56cfcd6d1e343bdc02b6f04bc43039
Component: engine
2017-11-27 17:32:12 -08:00
7830091c4b graphdriver/overlay{,2}: remove 'merged' on umount
This removes and recreates the merged dir with each umount/mount
respectively.
This is done to make the impact of leaking mountpoints have less
user-visible impact.

It's fairly easy to accidentally leak mountpoints (even if moby doesn't,
other tools on linux like 'unshare' are quite able to incidentally do
so).

As of recently, overlayfs reacts to these mounts being leaked (see

One trick to force an unmount is to remove the mounted directory and
recreate it. Devicemapper now does this, overlay can follow suit.

Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
Upstream-commit: af0d589623eff9f8cefced8b527dbd7cf221ce61
Component: engine
2017-11-22 14:32:30 -08:00