1. The journald client library initializes inotify watch(es)
during the first call to sd_journal_get_fd(), and it make sense
to open it earlier in order to not lose any journal file rotation
events.
2. It only makes sense to call this if we're going to use it
later on -- so add a check for config.Follow.
3. Remove the redundant call to sd_journal_get_fd().
NOTE that any subsequent calls to sd_journal_get_fd() return
the same file descriptor, so there's no real need to save it
for later use in wait_for_data_cancelable().
Based on earlier patch by Nalin Dahyabhai <nalin@redhat.com>.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 981c01665bcb2c9fc5e555c5b976995f31c2a6b4)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 349e199eab5337f03b442f38720293143e1b1fca
Component: engine
In case the LogConsumer is gone, the code that sends the message can
stuck forever. Wrap the code in select case, as all other loggers do.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 79039720c8b7352691350bd56be3cc226d67f205)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 56a8a516127a2630c3a68ab31c416f223b91e9df
Component: engine
In case Tail=N parameter is requested, we need to show N lines.
It does not make sense to walk backwards one by one if we can
do it at once. Now, if Since=T is also provided, make sure we
haven't jumped too far (before T), and if we did, move forward.
The primary motivation for this was to make the code simpler.
This also fixes a tiny bug in the "since" implementation.
Before this commit:
> $ docker logs -t --tail=6000 --since="2019-03-10T03:54:25.00" $ID | head
> 2019-03-10T03:54:24.999821000Z 95981
After:
> $ docker logs -t --tail=6000 --since="2019-03-10T03:54:25.00" $ID | head
> 2019-03-10T03:54:25.000013000Z 95982
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit ff3cd167ea4d089b7695a263ba2fc4caa0a0750c)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 2a124db7da440f1efd4c2957320d8b25d9d9ce36
Component: engine
Minor code simplification.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit e8f6166791c097deb15c39f8dddf6f97be65b224)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 1d336dc53bd7bf5941596ffeb253d102de609a51
Component: engine
Clean up a deferred function call in the journal reading logic.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
(cherry picked from commit 1ada3e85bf89201910c28f2ff6892c00cee0f137)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: e700930ca521d0c004b6a3ed8bdd35a2d538aa15
Component: engine
As in other similar drivers (jsonlog, local), use a set
(i.e. `map[whatever]struct{}`), making the code simpler.
While at it, make sure we remove the reader from the set
after calling `ProducerGone()` on it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit b2b169f13f681cd0d591ccb06d6cfff97933db77)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: fe85c72a2eac4cbf249d2c4c754684bb447eefdd
Component: engine
There are a few more places, apparently, that List operations against
Swarm exist, besides just in the List methods. This increases the max
received message size in those places.
Signed-off-by: Drew Erny <drew.erny@docker.com>
(cherry picked from commit a84a78e9767d82abd4744dad9ce4fb3f64141a8f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 41da428d065ab15fd2d8aba1fbd0d0056136a43b
Component: engine
Protect access to q.quotas map, and lock around changing nextProjectID.
Techinically, the lock in findNextProjectID() is not needed as it is
only called during initialization, but one can never be too careful.
Fixes: 52897d1c092 ("projectquota: utility class for project quota controls")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 1ac0a66a64a906911d0708cd0e5fa397a2f0b595)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 7027bb9bedae63879c1e41894739ba0ea2deedc1
Component: engine
This should eliminate a bunch of new (go-1.11 related) validation
errors telling that the code is not formatted with `gofmt -s`.
No functional change, just whitespace (i.e.
`git show --ignore-space-change` shows nothing).
Patch generated with:
> git ls-files | grep -v ^vendor/ | grep .go$ | xargs gofmt -s -w
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 9b0097a69900009ab5c2480e047952cba60462a7)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: ee28567c7066368207a947e02c6242db7a4adb16
Component: engine
Before 7a7357da, archive.TarResourceRebase was being used to copy files
and folders from the container. That function splits the source path
into a dirname + basename pair to support copying a file:
if you wanted to tar `dir/file` it would tar from `dir` the file `file`
(as part of the IncludedFiles option).
However, that path splitting logic was kept for folders as well, which
resulted in weird inputs to archive.TarWithOptions:
if you wanted to tar `dir1/dir2` it would tar from `dir1` the directory
`dir2` (as part of IncludedFiles option).
Although it was weird, it worked fine until we started chrooting into
the container rootfs when doing a `docker cp` with container source set
to `/` (cf 3029e765).
The fix is to only do the path splitting logic if the source is a file.
Unfortunately, 7a7357da added support for LCOW by duplicating some of
this subtle logic. Ideally we would need to do more refactoring of the
archive codebase to properly encapsulate these behaviors behind well-
documented APIs.
This fix does not do that. Instead, it fixes the issue inline.
Signed-off-by: Tibor Vass <tibor@docker.com>
(cherry picked from commit 171538c190ee3a1a8211946ab8fa78cdde54b47a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 70a837138aa1c5c0260a005b470ac0c5717984b4
Component: engine
Make sure adapter.removeNetworks executes during task Remove
adapter.removeNetworks was being skipped for cases when
isUnknownContainer(err) was true after adapter.remove was executed
This fix eliminates the nil return case forcing the function
to continue executing unless there is a true error
Fixes https://github.com/moby/moby/issues/39225
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
(cherry picked from commit 70fa7b6a3fd9aaada582ae02c50710f218b54d1a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 75887d37e1ddbef579e239ff0b1b7a2508e486fd
Component: engine
make sure the LB sandbox is removed when a service is updated
with a --network-rm option
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
(cherry picked from commit 680d0ba4abfb00c8ac959638c72003dba0826b46)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 280b8dff7df137dd21fcf27cacf6d12ed531770a
Component: engine
[18.09 backport] bugfix: fetch the right device number which great than 255
Upstream-commit: b236a1e78d35a62b164d00f4e19f59a556da8831
Component: engine
Running a bundled aufs benchmark sometimes results in this warning:
> WARN[0001] Couldn't run auplink before unmount /tmp/aufs-tests/aufs/mnt/XXXXX error="exit status 22" storage-driver=aufs
If we take a look at what aulink utility produces on stderr, we'll see:
> auplink:proc_mnt.c:96: /tmp/aufs-tests/aufs/mnt/XXXXX: Invalid argument
and auplink exits with exit code of 22 (EINVAL).
Looking into auplink source code, what happens is it tries to find a
record in /proc/self/mounts corresponding to the mount point (by using
setmntent()/getmntent_r() glibc functions), and it fails.
Some manual testing, as well as runtime testing with lots of printf
added on mount/unmount, as well as calls to check the superblock fs
magic on mount point (as in graphdriver.Mounted(graphdriver.FsMagicAufs, target)
confirmed that this record is in fact there, but sometimes auplink
can't find it. I was also able to reproduce the same error (inability
to find a mount in /proc/self/mounts that should definitely be there)
using a small C program, mocking what `auplink` does:
```c
#include <stdio.h>
#include <err.h>
#include <mntent.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
FILE *fp;
struct mntent m, *p;
char a[4096];
char buf[4096 + 1024];
int found =0, lines = 0;
if (argc != 2) {
fprintf(stderr, "Usage: %s <mountpoint>\n", argv[0]);
exit(1);
}
fp = setmntent("/proc/self/mounts", "r");
if (!fp) {
err(1, "setmntent");
}
setvbuf(fp, a, _IOLBF, sizeof(a));
while ((p = getmntent_r(fp, &m, buf, sizeof(buf)))) {
lines++;
if (!strcmp(p->mnt_dir, argv[1])) {
found++;
}
}
printf("found %d entries for %s (%d lines seen)\n", found, argv[1], lines);
return !found;
}
```
I have also wrote a few other C proggies -- one that reads
/proc/self/mounts directly, one that reads /proc/self/mountinfo instead.
They are also prone to the same occasional error.
It is not perfectly clear why this happens, but so far my best theory
is when a lot of mounts/unmounts happen in parallel with reading
contents of /proc/self/mounts, sometimes the kernel fails to provide
continuity (i.e. it skips some part of file or mixes it up in some
other way). In other words, this is a kernel bug (which is probably
hard to fix unless some other interface to get a mount entry is added).
Now, there is no real fix, and a workaround I was able to come up
with is to retry when we got EINVAL. It usually works on the second
attempt, although I've once seen it took two attempts to go through.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit ae431b10a9508e2bf3b1782e9d435855e3e13977)
Upstream-commit: c303e63ca6c3d25d29b8992451898734fa6e4e7c
Component: engine
Do not use filepath.Walk() as there's no requirement to recursively
go into every directory under mnt -- a (non-recursive) list of
directories in mnt is sufficient.
With filepath.Walk(), in case some container will fail to unmount,
it'll go through the whole container filesystem which is both
excessive and useless.
This is similar to commit f1a459229724f5e8e440b49f ("devmapper.shutdown:
optimize")
While at it, raise the priority of "unmount error" message from debug
to a warning. Note we don't have to explicitly add `m` as unmount error (from
pkg/mount) will have it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 8fda12c6078ed5c86be0822a7a980c6fbc9220bf)
Upstream-commit: b85d4a4f09badf40201ef7fcbd5b930de216190d
Component: engine
In case there are a big number of layers, so that mount data won't fit
into a single memory page (4096 bytes on most platforms, which is good
enough for about 40 layers, depending on how long graphdriver root path
is), we supply additional layers with O_REMOUNT, as described in aufs
documentation.
Problem is, the current implementation does that one layer at a time
(i.e. there is one mount syscall per each additional layer).
Optimize the code to supply as many layers as we can fit in one page
(basically reusing the same code as for the original mount).
Note, per aufs docs, "[a]t remount-time, the options are interpreted
in the given order, e.g. left to right" so we should be good.
Tested on an image with ~100 layers.
Before (35 syscalls):
> [pid 22756] 1556919088.686955 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", "aufs", 0, "br:/mnt/volume_sfo2_09/docker-au"...) = 0 <0.000504>
> [pid 22756] 1556919088.687643 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", 0xc000c451b0, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.000105>
> [pid 22756] 1556919088.687851 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", 0xc000c451ba, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.000098>
> ..... (~30 lines skipped for clarity)
> [pid 22756] 1556919088.696182 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/a86f8c9dd0ec2486293119c20b0ec026e19bbc4d51332c554f7cf05d777c9866", 0xc000c45310, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.000266>
After (2 syscalls):
> [pid 24352] 1556919361.799889 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/8e7ba189e347a834e99eea4ed568f95b86cec809c227516afdc7c70286ff9a20", "aufs", 0, "br:/mnt/volume_sfo2_09/docker-au"...) = 0 <0.001717>
> [pid 24352] 1556919361.801761 mount("none", "/mnt/volume_sfo2_09/docker-aufs/aufs/mnt/8e7ba189e347a834e99eea4ed568f95b86cec809c227516afdc7c70286ff9a20", 0xc000dbecb0, MS_REMOUNT, "append:/mnt/volume_sfo2_09/docke"...) = 0 <0.001358>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit d58c434bffef76e48bff75ede290937874488dd3)
Upstream-commit: 75488521735325e4564e66d9cf9c2b1bc1c2c64b
Component: engine
Apparently there is some kind of race in aufs kernel module code,
which leads to the errors like:
[98221.158606] aufs au_xino_create2:186:dockerd[25801]: aufs.xino create err -17
[98221.162128] aufs au_xino_set:1229:dockerd[25801]: I/O Error, failed creating xino(-17).
[98362.239085] aufs au_xino_create2:186:dockerd[6348]: aufs.xino create err -17
[98362.243860] aufs au_xino_set:1229:dockerd[6348]: I/O Error, failed creating xino(-17).
[98373.775380] aufs au_xino_create:767:dockerd[27435]: open /dev/shm/aufs.xino(-17)
[98389.015640] aufs au_xino_create2:186:dockerd[26753]: aufs.xino create err -17
[98389.018776] aufs au_xino_set:1229:dockerd[26753]: I/O Error, failed creating xino(-17).
[98424.117584] aufs au_xino_create:767:dockerd[27105]: open /dev/shm/aufs.xino(-17)
So, we have to have a lock around mount syscall.
While at it, don't call the whole Unmount() on an error path, as
it leads to bogus error from auplink flush.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 5cd62852fa199f272b542d828c8c5d1db427ea53)
Upstream-commit: bb4b9fe29eba57ed71d8105ec0e5dacc8e72129e
Component: engine
1. Use mount.Unmount() which ignores EINVAL ("not mounted") error,
and provides better error diagnostics (so we don't have to explicitly
add target to error messages).
2. Since we're ignoring "not mounted" error, we can call
multiple unmounts without any locking -- but since "auplink flush"
is still involved and can produce an error in logs, let's keep
the check for fs being mounted (it's just a statfs so should be fast).
2. While at it, improve the "can't unmount" error message in Put().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 4beee98026feabe4f4f0468215b8fd9b56f90d5e)
Upstream-commit: 5b68d00abc4cad8b707a9a01dde9a420ca6eb995
Component: engine
Both mount and unmount calls are already protected by fine-grained
(per id) locks in Get()/Put() introduced in commit fc1cf1911bb
("Add more locking to storage drivers"), so there's no point in
having a global lock in mount/unmount.
The only place from which unmount is called without any locking
is Cleanup() -- this is to be addressed in the next patch.
This reverts commit 824c24e6802ad3ed7e26b4f16e5ae81869b98185.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit f93750b2c4d5f6144f0790ffa89291da3c097b80)
Upstream-commit: 701939112efc71a0c867a626f0a56df9c56030db
Component: engine
The function is not needed as it's just a shallow wrapper around
unix.Mount().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 2f98b5f51fb0a21e1bd930c0e660c4a7c4918aa5)
Upstream-commit: 023b63a0f276b79277c1c961d8c98d4d5b39525d
Component: engine
The errors returned from Mount and Unmount functions are raw
syscall.Errno errors (like EPERM or EINVAL), which provides
no context about what has happened and why.
Similar to os.PathError type, introduce mount.Error type
with some context. The error messages will now look like this:
> mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted
or
> mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted
Before this patch, it was just
> operation not permitted
[v2: add Cause()]
[v3: rename MountError to Error, document Cause()]
[v4: fixes; audited all users]
[v5: make Error type private; changes after @cpuguy83 reviews]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 65331369617e89ce54cc9be080dba70f3a883d1c)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 7f1c6bf5a745c8faeba695d3556dff4c4ff5f473
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
(cherry picked from commit 293c74ba79f0008f48073985507b34af59b45fa6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 91f5be57af815df372371e1e989d00963ce4d02f
Component: engine
As standard mount.Unmount does what we need, let's use it.
In addition, this adds ignoring "not mounted" condition, which
was previously implemented (see PR#33329, commit cfa2591d3f26)
via a very expensive call to mount.Mounted().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 77bc327e24a60791fe7e87980faf704cf7273cf9)
Upstream-commit: 893b24b80db170279d5c9532ed508a81c328de5e
Component: engine
Increases the max recieved gRPC message size for Node and Secret list
operations. This has already been done for the other swarm types, but
was not done for these.
Signed-off-by: Drew Erny <drew.erny@docker.com>
(cherry picked from commit a0903e1fa3eca32065c7dbfda8d1e0879cbfbb8f)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: a987f31fbc42526bf2cccf1381066f075b18dfe0
Component: engine
Previously only unpack operations were supported with chroot.
This adds chroot support for packing operations.
This prevents potential breakouts when copying data from a container.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 3029e765e241ea2b5249868705dbf9095bc4d529)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 61e0459053c359e322b8d5c017e855f616fd34c0
Component: engine
This is useful for preventing CVE-2018-15664 where a malicious container
process can take advantage of a race on symlink resolution/sanitization.
Before this change chrootarchive would chroot to the destination
directory which is attacker controlled. With this patch we always chroot
to the container's root which is not attacker controlled.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit d089b639372a8f9301747ea56eaf0a42df24016a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 155939994f453559676656bc4b05635e83ebef56
Component: engine
Today `$ docker service create --limit-cpu` configures a containers
`CpuPeriod` and `CpuQuota` variables, this commit switches this to
configure a containers `NanoCpu` variable instead.
Signed-off-by: Olly Pomeroy <olly@docker.com>
(cherry picked from commit 8a60a1e14a5e98b7ed0fbea57369615a766a1ae9)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 675f4256ae2b85556b4243da44ff627900660463
Component: engine
When manually stopping a container with a restart-policy, the container
would show as "restarting" in `docker ps` whereas its actual state
is "exited".
Stopping a container with a restart policy shows the container as "restarting"
docker run -d --name test --restart unless-stopped busybox false
docker stop test
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7e07409fa1d3 busybox "false" 5 minutes ago Restarting (1) 4 minutes ago test
However, inspecting the same container shows that it's exited:
docker inspect test --format '{{ json .State }}'
{
"Status": "exited",
"Running": false,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 0,
"ExitCode": 1,
"Error": "",
"StartedAt": "2019-02-14T13:26:27.6091648Z",
"FinishedAt": "2019-02-14T13:26:27.689427Z"
}
And killing the container confirms this;
docker kill test
Error response from daemon: Cannot kill container: test: Container 7e07409fa1d36dc8d8cb8f25cf12ee1168ad9040183b85fafa73ee2c1fcf9361 is not running
docker run -d --name test --restart unless-stopped busybox false
docker stop test
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d0595237054a busybox "false" 5 minutes ago Restarting (1) 4 minutes ago exit
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8c0ecb638705e89746a81fa1320aafaa7ff701b2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 00f0b9df0dde80b2c40023b12a29dd476f2a77e2
Component: engine
When running a container in the host's network namespace, the container
gets a copy of the host's resolv.conf (copied to `/etc/resolv.conf` inside
the container).
The current code always used the default (`/etc/resolv.conf`) path on the
host, irregardless if `systemd-resolved` was used or not.
This patch uses the correct file if `systemd-resolved` was detected
to be running.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 8364d1c9d590d4266871cd820b76ef12e2b934ed)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 04ae160eca4a8bc134ca9abafa044191b893080b
Component: engine