Unlike the other storage backends that route to the "applyTarLayer"
functionality, the overlay2 driver was not setting up
archive.TarOptions properly. The InUserNS field is now populated for
overlay2 using the same query function used by the other storage drivers.
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Upstream-commit: 05b8d59015f8a5ce26c8bbaa8053b5bc7cb1a77b
Component: engine
Right now we only log the source and destination (and dmesg) if a mount
operation fails. The fstype and mount options are easily available, so it
is probably a good idea to log these as well, especially since failures
can sometimes be caused by mount options.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: f728d74ac5d185adaa5f1a88eadc71217806859f
Component: engine
Even though it's highly discouraged, there are existing
installs that are running overlay/overlay2 on filesystems
without d_type support.
This patch allows the daemon to start in such cases, instead of
refusing to start without an option to override.
For fresh installs, backing filesystems without d_type support
will still cause the overlay/overlay2 drivers to be marked as
"unsupported", and skipped during the automatic selection.
This feature is only to keep backward compatibility, but
will be removed at some point.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 0a4e793a3da9ba6d20bccfb83f7c48e20a76d895
Component: engine
Support for running overlay/overlay2 on a backing filesystem
without d_type support (most likely xfs, as ext4 supports
this by default) has been deprecated for some time.
Running without d_type support is problematic, and can
lead to difficult-to-debug issues ("invalid argument" errors,
or being unable to remove files from the container's filesystem).
This patch turns the warning that was previously printed
into an "unsupported" error, so that the overlay/overlay2
drivers are not automatically selected when detecting supported
storage drivers.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 0abb8dec3f730f3ad2cc9a161c97968a6bfd0631
Component: engine
The fsmagic check was always performed on "data-root" (`/var/lib/docker`),
not on the storage-driver's home directory (e.g. `/var/lib/docker/<somedriver>`).
This caused detection to be done on the wrong filesystem in situations
where `/var/lib/docker/<somedriver>` was a mount, and a different
filesystem than `/var/lib/docker` itself.
This patch checks if the storage-driver's home directory exists, and only
falls back to `/var/lib/docker` if it doesn't exist.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: f9c8fa305e1501d8056f8744cb193a720aab0e13
Component: engine
Previously, the code would set the mtime on the directories before
creating files in the directory itself. This was problematic
because it resulted in the mtimes on the directories being
incorrectly set. This change makes it so that the mtime is
set only _after_ all of the files have been created.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 77a2bc3e5bbc9be3fe166ed8321b7cd04e7bd097
Component: engine
There was a small issue here, where it copied the data using
traditional mechanisms, even when copy_file_range was successful.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 0eac562281782257e6f69d58bcbc13fa889f1759
Component: engine
This change makes the VFS graphdriver use the kernel-accelerated
(copy_file_range) mechanism of copying files, which is able to
leverage reflinks.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: d2b71b26604370620630d8d3f35aba75ae474f3f
Component: engine
Previously, graphdriver/copy would improperly copy hardlinks as just regular
files. This patch changes that behaviour, and instead the code now keeps
track of inode numbers, and if it sees the same inode number again
during the copy loop, it hardlinks it, instead of copying it.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: b467f8b2ef21dc2239dcd136a29283ea6c3a0aee
Component: engine
The overlay2 storage-driver requires multiple lower dir
support for overlayFs. Support for this feature was added
in kernel 4.x, but some distros (RHEL 7.4, CentOS 7.4) ship with
an older kernel with this feature backported.
This patch adds feature-detection for multiple lower dirs,
and will perform this feature-detection on pre-4.x kernels
with overlayFS support.
With this patch applied, daemons running on a kernel
with multiple lower dir support will now select "overlay2"
as storage-driver, instead of falling back to "overlay".
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 955c1f881ac94af19c99f0f7d5635e6a574789f2
Component: engine
This subtle bug keeps creeping in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different with respect to `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there"
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.
Also, there are a couple of cases where IsExist is handled as
"directory already exists", which is wrong. As a result, some code
that never worked as intended is now removed.
NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).
For more details, a quote from my runc commit 6f82d4b (July 2015):
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as the one MkdirAll needs to use for
a directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, the current MkdirAll implementation [1] returns
ENOTDIR in most of the cases described in #2, with the exception of when
there is a race between MkdirAll and someone else creating the
last component of the MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 516010e92d56cfcd6d1e343bdc02b6f04bc43039
Component: engine
This removes and recreates the merged dir with each umount/mount
respectively.
This is done to reduce the user-visible impact of leaked mountpoints.
It's fairly easy to accidentally leak mountpoints (even if moby doesn't,
other tools on linux like 'unshare' are quite able to incidentally do
so).
As of recently, overlayfs reacts to these mounts being leaked (see
One trick to force an unmount is to remove the mounted directory and
recreate it. Devicemapper now does this, overlay can follow suit.
Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
Upstream-commit: af0d589623eff9f8cefced8b527dbd7cf221ce61
Component: engine
When starting the daemon, the `/var/lib/docker` directory
is scanned for existing directories, so that the previously
selected graphdriver will automatically be used.
In some situations, empty directories are present (those
directories can be created during feature detection of
graph-drivers), in which case the daemon refuses to start.
This patch improves detection, and skips empty directories,
so that leftover directories don't cause the daemon to
fail.
Before this change:
$ mkdir /var/lib/docker /var/lib/docker/aufs /var/lib/docker/overlay2
$ dockerd
...
Error starting daemon: error initializing graphdriver: /var/lib/docker contains several valid graphdrivers: overlay2, aufs; Please cleanup or explicitly choose storage driver (-s <DRIVER>)
With this patch applied:
$ mkdir /var/lib/docker /var/lib/docker/aufs /var/lib/docker/overlay2
$ dockerd
...
INFO[2017-11-16T17:26:43.207739140Z] Docker daemon commit=ab90bc296 graphdriver(s)=overlay2 version=dev
INFO[2017-11-16T17:26:43.208033095Z] Daemon has completed initialization
And on restart (prior graphdriver is still picked up):
$ dockerd
...
INFO[2017-11-16T17:27:52.260361465Z] [graphdriver] using prior storage driver: overlay2
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 1262c57714e694193be6bbcbed83e859dc246c2f
Component: engine
Commit 7a1618ced359a3ac921d8a05903d62f544ff17d0 regresses running Docker
in user namespaces. The new check for whether quotas are supported calls
NewControl() which in turn calls makeBackingFsDev() which tries to
mknod(). Skip quota tests when we detect that we are running in a user
namespace and return ErrQuotaNotSupported to the caller. This just
restores the status quo.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Upstream-commit: 7e35df0e0484118740dbf01e7db9b482a1827ef1
Component: engine
Add a way to specify a custom graphdriver priority list
during build. This can be done with something like
go build -ldflags "-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,devicemapper"
As ldflags are already used by the engine build process, and it seems
that only one (last) `-ldflags` argument is taken into account by go,
an environment variable `DOCKER_LDFLAGS` is introduced in order to
be able to append some text to `-ldflags`. With this in place,
using the feature becomes
make DOCKER_LDFLAGS="-X github.com/docker/docker/daemon/graphdriver.priority=overlay2,devicemapper" dynbinary
The idea behind this is, the priority list might be different
for different distros, so vendors are now able to change it
without patching the source code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 17708e72a7ef29fb1d4b03fbded1c5e4c08105fd
Component: engine
Make it possible to disable overlay and overlay2 separately.
With this commit, we now have `exclude_graphdriver_overlay` and
`exclude_graphdriver_overlay2` build tags for the engine, which
is in line with any other graph driver.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: d014be5426c869d429c1a11cad9e76321dd7a326
Component: engine
From https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt:
> The lower filesystem can be any filesystem supported by Linux and does
> not need to be writable. The lower filesystem can even be another
> overlayfs. The upper filesystem will normally be writable and if it
> is it must support the creation of trusted.* extended attributes, and
> must provide valid d_type in readdir responses, so NFS is not suitable.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 90dfb1d0cc59d79ccb272997d735864615010785
Component: engine
In order to avoid reverting our fix for mount leakage in devicemapper,
add a test which checks that devicemapper's Get() and Put() cycle can
survive having a command running in an rprivate mount propagation setup
in-between. While this is quite rudimentary, it should be sufficient.
We have to skip this test for pre-3.18 kernels.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Upstream-commit: 1af8ea681fba1935c60c11edbbe19b894c9b286f
Component: engine
This patch adds the capability for the VFS graphdriver to use
XFS project quotas. It reuses the existing quota management
code that was created by overlay2 on XFS.
It doesn't rely on a filesystem whitelist, but instead
the quota-capability detection code.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 7a1618ced359a3ac921d8a05903d62f544ff17d0
Component: engine
This adds a mechanism (read-only) to check for project quota support
in a standard way. This mechanism is leveraged by the tests, which
test for the following:
1. Can we get a quota controller?
2. Can we set the quota for a particular directory?
3. Is the quota being over-enforced?
4. Is the quota being under-enforced?
5. Can we retrieve the quota?
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 6966dc0aa9134c518babcbf1f02684cae5374843
Component: engine
Do not print "Data file" and "Metadata file" if they're
not used, and sort/group output.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 8f702de9b705ced68b6244239ac81d86ebdd6b0a
Component: engine
This changeset allows Docker's VFS and Overlay drivers to take advantage
of Linux's zerocopy APIs.
The copy function first tries to use the ficlone ioctl, because clones:
- do not allow partial success (aka short writes)
- are expected to be a fast metadata operation
See: http://oss.sgi.com/archives/xfs/2015-12/msg00356.html
If the clone fails, we fall back to copy_file_range, which internally
may fall back to splice, which has an upper limit on the size
of copy it can perform. Given that, we have to loop until the copy
is done.
For a given dirCopy operation, if the clone fails, we will not try
it again during any other file copy. Same is true with copy_file_range.
If all else fails, we fall back to traditional copy.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: 3ec4ec2857c714387e7b59c2cf324565f6ae55e2
Component: engine
For obvious reasons that it is not really supported now.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 5a9b5f10cf967f31f0856871ad08f9a0286b4a46
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
This PR has the API changes described in https://github.com/moby/moby/issues/34617.
Specifically, it adds an HTTP header "X-Requested-Platform" which is a JSON-encoded
OCI Image-spec `Platform` structure.
In addition, it renames (almost all) uses of a string variable platform (and associated
methods/functions) to os. This makes it much easier to disambiguate from the swarm
"platform" which is really os/arch. This is a stepping stone to getting the daemon towards
fully multi-platform/arch-aware, and makes it clear when "operating system" is being
referred to rather than "platform" which is misleadingly used - sometimes in the swarm
meaning, but more often as just the operating system.
Upstream-commit: 0380fbff37922cadf294851b1546f4c212c7f364
Component: engine
When overlay2 is used as the graphdriver and the kernel is built with
`CONFIG_OVERLAY_FS_REDIRECT_DIR=y`, renaming a directory that lives in a
lower layer sets an xattr that redirects it to the source directory.
This makes the image layer unportable. This patch falls back to the
naive diff driver when the kernel enables CONFIG_OVERLAY_FS_REDIRECT_DIR.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
Upstream-commit: 49c3a7c4bac2877265ef8c4eaf210159560f08b4
Component: engine
The change in 7a7357dae1bcccb17e9b2d4c7c8f5c025fce56ca inadvertently
changed the `defer` error code into a no-op. This restores its behavior
prior to that code change, and also introduces a little more error
logging.
Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
Upstream-commit: 639ab92f011245e17e9a293455a8dae1eb034022
Component: engine