The idea behind making the graphdrivers private is to prevent leaking
mounts into other namespaces.
Unfortunately this is not really what happens.
There is one case where this does work, and that is when the namespace
was created before the daemon's namespace.
However with systemd each system servie winds up with it's own mount
namespace. This causes a race betwen daemon startup and other system
services as to if the mount is actually private.
This also means there is a negative impact when other system services
are started while the daemon is running.
Basically there are too many things that the daemon does not have
control over (nor should it) to be able to protect against these kinds
of leakages. One thing is certain, setting the graphdriver roots to
private disconnects the mount ns heirarchy preventing propagation of
unmounts... new mounts are of course not propagated either, but the
behavior is racey (or just bad in the case of restarting services)... so
it's better to just be able to keep mount propagation in tact.
It also does not protect situations like `-v
/var/lib/docker:/var/lib/docker` where all mounts are recursively bound
into the container anyway.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 9803272f2db84df7955b16c0d847ad72cdc494d1
Component: engine
The fsmagic check was always performed on "data-root" (`/var/lib/docker`),
not on the storage-driver's home directory (e.g. `/var/lib/docker/<somedriver>`).
This caused detection to be done on the wrong filesystem in situations
where `/var/lib/docker/<somedriver>` was a mount, and a different
filesystem than `/var/lib/docker` itself.
This patch checks if the storage-driver's home directory exists, and only
falls back to `/var/lib/docker` if it doesn't exist.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: f9c8fa305e1501d8056f8744cb193a720aab0e13
Component: engine
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt to `EEXIST`/`IsExist`:
- for `Mkdir()`, `IsExist` error should (usually) be ignored
(unless you want to make sure directory was not there before)
as it means "the destination directory was already there"
- for `MkdirAll()`, `IsExist` error should NEVER be ignored.
Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.
Also, there are a couple of cases then IsExist is handled as
"directory already exist" which is wrong. As a result, some code
that never worked as intended is now removed.
NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).
For more details, a quote from my runc commit 6f82d4b (July 2015):
TL;DR: check for IsExist(err) after a failed MkdirAll() is both
redundant and wrong -- so two reasons to remove it.
Quoting MkdirAll documentation:
> MkdirAll creates a directory named path, along with any necessary
> parents, and returns nil, or else returns an error. If path
> is already a directory, MkdirAll does nothing and returns nil.
This means two things:
1. If a directory to be created already exists, no error is
returned.
2. If the error returned is IsExist (EEXIST), it means there exists
a non-directory with the same name as MkdirAll need to use for
directory. Example: we want to MkdirAll("a/b"), but file "a"
(or "a/b") already exists, so MkdirAll fails.
The above is a theory, based on quoted documentation and my UNIX
knowledge.
3. In practice, though, current MkdirAll implementation [1] returns
ENOTDIR in most of cases described in #2, with the exception when
there is a race between MkdirAll and someone else creating the
last component of MkdirAll argument as a file. In this very case
MkdirAll() will indeed return EEXIST.
Because of #1, IsExist check after MkdirAll is not needed.
Because of #2 and #3, ignoring IsExist error is just plain wrong,
as directory we require is not created. It's cleaner to report
the error now.
Note this error is all over the tree, I guess due to copy-paste,
or trying to follow the same usage pattern as for Mkdir(),
or some not quite correct examples on the Internet.
[1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 516010e92d56cfcd6d1e343bdc02b6f04bc43039
Component: engine
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.
Signed-off-by: Akash Gupta <akagup@microsoft.com>
Upstream-commit: 7a7357dae1bcccb17e9b2d4c7c8f5c025fce56ca
Component: engine
Addresses some comments on 276b44608b04f08bdf46ce7c816b1f744bf24b7d
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 0a98025d4b2910c2089325b87d28c32d05803e13
Component: engine
In d42dbdd3d48d0134f8bba7ead92a7067791dffab the code was re-arranged to
better report errors, and ignore non-errors.
In doing so we removed a deferred remove of the AUFS diff path, but did
not replace it with a non-deferred one.
This fixes the issue and makes the code a bit more readable.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 276b44608b04f08bdf46ce7c816b1f744bf24b7d
Component: engine
Specifically, none of the graphdrivers are supposed to return a
not-exist type of error on remove (or at least that's how they are
currently handled).
Found that AUFS still had one case where a not-exist error could escape,
when checking if the directory is mounted we call a `Statfs` on the
path.
This fixes AUFS to not return an error in this case, but also
double-checks at the daemon level on layer remove that the error is not
a `not-exist` type of error.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: d42dbdd3d48d0134f8bba7ead92a7067791dffab
Component: engine
Changes most references of syscall to golang.org/x/sys/
Ones aren't changes include, Errno, Signal and SysProcAttr
as they haven't been implemented in /x/sys/.
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
[s390x] switch utsname from unsigned to signed
per 33267e036f
char in s390x in the /x/sys/unix package is now signed, so
change the buildtags
Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
Upstream-commit: 069fdc8a083cb1663e4f86fe3fd9b9a1aebc3e54
Component: engine
We add ",dirperm1" but only increase length by len("dirperm1").
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Upstream-commit: 1a6bf8248a32c160347e4daf3dd4f15023357889
Component: engine
Before this, if `forceRemove` is set the container data will be removed
no matter what, including if there are issues with removing container
on-disk state (rw layer, container root).
In practice this causes a lot of issues with leaked data sitting on
disk that users are not able to clean up themselves.
This is particularly a problem while the `EBUSY` errors on remove are so
prevalent. So for now let's not keep this behavior.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 54dcbab25ea4771da303fa95e0c26f2d39487b49
Component: engine
Since it was introduced no reports were made and lsof seems to cause
issues on some systems.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Upstream-commit: eac66b67be0710322d0e21a67a55f5731be48f68
Component: engine
text does not appear to contain a placeholder
Signed-off-by: Helen Xie <chenjg@harmonycloud.cn>
Upstream-commit: 2a8d6368d4a930203b93f75914173ab65bf3b0bc
Component: engine
fix typo I found AMAP in integration-cli/*
fix typo mentioned by Allencloud
Signed-off-by: Aaron.L.Xu <likexu@harmonycloud.cn>
Upstream-commit: 40af5691648c5b9d07b1231e3ed3be29fd66521a
Component: engine
This allows for easy extension of adding more parameters to existing
parameters list. Otherwise adding a single parameter changes code
at so many places.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: b937aa8e6968d805527d163e6f477d496ceb88d7
Component: engine
Allow built images to be squash to scratch.
Squashing does not destroy any images or layers, and preserves the
build cache.
Introduce a new CLI argument --squash to docker build
Introduce a new param to the build API endpoint `squash`
Once the build is complete, docker creates a new image loading the diffs
from each layer into a single new layer and references all the parent's
layers.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 362369b4bbea38881402d281ee2015d16e8b10ce
Component: engine
This fix tries to fix logrus formatting by removing `f` from
`logrus.[Error|Warn|Debug|Fatal|Panic|Info]f` when formatting string
is not present.
Fixed issue #23459
Signed-off-by: Daehyeok Mun <daehyeok@gmail.com>
Upstream-commit: fa710e504b0e3e51d4031790c18621b02dcd2600
Component: engine
The `archive` package defines aliases for `io.ReadCloser` and
`io.Reader`. These don't seem to provide an benefit other than type
decoration. Per this change, several unnecessary type cases were
removed.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: aa2cc18745cbe0231c33782f0fa764f657e3fb88
Component: engine
Truncated dir name can't give any useful information, print whole dir
name will.
Bad debug log is like this:
```
DEBU[2449] aufs error unmounting /var/lib/doc: no such file or directory
```
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: af8359562c9561afad0a05e66386588d17788804
Component: engine
use a consistent approach for checking if the
backing filesystem is compatible with the
storage driver.
also add an error-message for the AUFS driver if
an incompatible combination is found.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 1fc0acc9ae77752858057d1f6f8487ccd82372be
Component: engine
If aufs is already modprobe'd but we are in a user namespace, the
aufs driver will happily load but then get eperm when it actually tries
to do something. So detect that condition.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Upstream-commit: 2a71f28a4e1167dee32aa16ddbc819c9d9e77f71
Component: engine
On aufs, auplink is run before the Unmount. Irrespective of the
result, we proceed to issue a Unmount syscall. In which case,
demote erros on auplink to warning.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Upstream-commit: dbd9b7e121c2e20e5429fdc97421c9510746161e
Component: engine
Since the layer store was introduced, the level above the graphdriver
now differentiates between read/write and read-only layers. This
distinction is useful for graphdrivers that need to take special steps
when creating a layer based on whether it is read-only or not.
Adding this parameter allows the graphdrivers to differentiate, which
in the case of the Windows graphdriver, removes our dependence on parsing
the id of the parent for "-init" in order to infer this information.
This will also set the stage for unblocking some of the layer store
unit tests in the next preview build of Windows.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Upstream-commit: ef5bfad3210a9e9c8b761f2c11c0c6289490ebff
Component: engine
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 65d79e3e5e537039b244afd7eda29e721a93d84f
Component: engine
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 563d0711f83952e561a0d7d5c48fef9810b4f010
Component: engine
Check whether or not the file system type of a mountpoint is aufs
by calling statfs() instead of parsing mountinfo. This assumes
that aufs graph driver does not allow aufs as a backing file
system.
Signed-off-by: Tatsushi Inagaki <e29253@jp.ibm.com>
Upstream-commit: e8513675a20e2756e6c2915604605236d1a94d65
Component: engine
This allows a graph driver to provide a custom FileGetter for tar-split
to use. Windows will use this to provide a more efficient implementation
in a follow-up change.
Signed-off-by: John Starks <jostarks@microsoft.com>
Upstream-commit: 58bec40d16265362fd4e41dbd652e6fba903794d
Component: engine