If "ps" fails, in many cases it prints a meaningful error message
which a user can benefit from. Let's use it.
While at it, let's use errdefs.System to classify the error,
as well as errors.Wrap.
Before:
> $ docker top $CT <any bad ps options>
> Error response from daemon: Error running ps: exit status 1
After:
> $ docker top $CT auxm
> Error response from daemon: ps: error: thread display conflicts with forest display
or
> $ docker top $CT saur
> Error response from daemon: ps: error: conflicting format options
or, if there's no meaningful error on stderr, same as before:
> $ docker top $CT 1234
> Error response from daemon: ps: exit status 1
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: a41328d5704b8d1adbcd099fb4bb0697060df806
Component: engine
Current ContainerTop (a.k.a. docker top) implementation uses "ps"
to get the info about *all* running processes, then parses it, then
filters the results to only contain PIDs used by the container.
Collecting data only to throw most of it away is inefficient,
especially on a system running many containers (or processes).
For example, "docker top" on a container with a single process
can take up to 0.5 seconds to execute (on a mostly idle system)
which is noticeably slow.
Since the containers PIDs are known beforehand, let's use ps's
"-q" option to provide it with a list of PIDs we want info about.
The problem with this approach is, some ps options can't be used
with "-q" (the only one I'm aware of is "f" ("forest view") but
there might be more). As the list of such options is not known,
in case ps fails, it is executed again without "q" (retaining
the old behavior).
Next, the data produced by "ps" is filtered in the same way as before.
The difference here is, in case "-q" worked, the list is much shorter.
I ran some benchmarks on my laptop, with about 8000 "sleep" processes
running to amplify the savings.
The improvement in "docker top" execution times is 5x to 10x (roughly
0.05s vs 0.5s).
The improvement in ContainerTop() execution time is up to 100x
(roughly 3ms vs 300ms).
I haven't measured the memory or the CPU time savings, guess those
are not that critical.
NOTE that busybox ps does not implement -q so the fallback is always
used, but AFAIK it is not usable anyway and Docker expects a normal
ps to be on the system (say the list of fields produced by
"busybox ps -ef" differs from normal "ps -ef" etc.).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: a076badb8b33f1ecdc5d46f0a3701f10c0579f73
Component: engine
remove unnescessary import aliases, brackets, and so on.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: f23c00d8701e4bd0f2372a586dacbf66a26f9a51
Component: engine
It does not make sense to keep looking for PID once
we found it, so let's give it a break.
The side effect of this patch is, if there's more than one column
titled "PID", the last (rightmost) column was used before, and now
the first (leftmost) column is used. Should make no practical
difference whatsoever.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 654a7625fc9f0b7b04da0e0e4d151af04a65cc7f
Component: engine
Adds functionality to parse and return network attachment spec
information. Network attachment tasks are phony tasks created in
swarmkit to deal with unmanaged containers attached to swarmkit. Before
this change, attempting `docker inspect` on the task id of a network
attachment task would result in an empty task object. After this change,
a full task object is returned
Fixes#26548 the correct way.
Signed-off-by: Drew Erny <drew.erny@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 5b69ff466e61fa168f24869710f2070c742a5565
Component: engine
This prevents the following test case failures "go test" is run
as non-root in the daemon/ directory:
> --- FAIL: TestContainerInitDNS (0.02s)
> daemon_test.go:209: chown /tmp/docker-container-test-054812199/volumes: operation not permitted
>
> --- FAIL: TestDaemonReloadNetworkDiagnosticPort (0.00s)
> reload_test.go:525: mkdir /var/lib/docker/network/files/: permission denied
> --- FAIL: TestRootMountCleanup (0.00s)
> daemon_linux_test.go:240: assertion failed: error is not nil: operation not permitted
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 16670ed4842b1ee4853ba39b6ebf2b771d28db9a
Component: engine
In case aufs driver is not supported because supportsAufs() said so,
it is not possible to get a real reason from the logs.
To fix, log the error returned.
Note we're not using WithError here as the error message itself is the
sole message we want to print (i.e. there's nothing to add to it).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 91f85d1c784f3dc9d892b2af2f51d6b6f3b0be69
Component: engine
Simplify the code by using a single logger instance.
While at it, use WithError in Umount.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: c6e2af54256211b8ac757e9b25caa6fb6c9b3c6e
Component: engine
ext4 support d_type by default, but filetype feature is a tunable so
there is still a chance to disable it for some reasons. In this case,
print additional message to explicitly tell how to support d_type.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Upstream-commit: 8a21b128d4deb874c05eb81ebbc1265175ad69ba
Component: engine
In cases where a logging plugin has crashed when the daemon tries to
copy the container stdio to the logging plugin it returns a broken pipe
error and any log entries that occurr while the plugin is down are lost.
Fix this by opening read+write in the daemon so logs are not lost while
the plugin is down.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: e7479e3ab8128f9e84cc640f0bed4e77b268a6e9
Component: engine
- Check errors.Cause(err) when comparing errors
- Fix bug where oldest log file is not actually removed. This in
particular causes issues when compression is enabled. On rotate it just
overwrites the data in the log file corrupting it.
- Use O_TRUNC to open new gzip files to ensure we don't corrupt log
files as happens without the above fix.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: e87e9e6ad6ba501cc42a2ef47ac18c88a68f258f
Component: engine
Closing the log driver was in a defer meanwhile logs are
collected asyncronously, so the log driver was being closed before reads
were actually finished.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 2c252a48c252749d41079cf8c450d00a5c18296e
Component: engine
This was broken by bf6a790f00ab96bb8857e3c73502909ee5e36217
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: b16b125bb4326e2eec9948fd54ca8c5d83eba36a
Component: engine
This function was added in 23e5c94cfb26eb72c097892712d3dbaa93ee9bc0 but never used
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 8a8ec00c1fece033b20548b7a5ec2f3a1f340834
Component: engine
The overlay storage driver currently does not support any option, but was silently
ignoring any option that was passed.
This patch verifies that no options are passed, and if they are passed will produce
an error.
Before this change:
dockerd --storage-driver=overlay --storage-opt dm.thinp_percent=95
INFO[2018-05-11T11:40:40.996597152Z] libcontainerd: started new docker-containerd process pid=256
....
INFO[2018-05-11T11:40:41.135392535Z] Daemon has completed initialization
INFO[2018-05-11T11:40:41.141035093Z] API listen on /var/run/docker.sock
After this change:
dockerd --storage-driver=overlay --storage-opt dm.thinp_percent=95
INFO[2018-05-11T11:39:21.632610319Z] libcontainerd: started new docker-containerd process pid=233
....
Error starting daemon: error initializing graphdriver: overlay: unknown option dm.thinp_percent
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 30f15d2bdc551870464d1cd024a92341cf1ae4aa
Component: engine
These network operations really don't have anything to do with the
container but rather are setting up the networking.
Ideally these wouldn't get shoved into the daemon package, but doing
something else (e.g. extract a network service into a new package) but
there's a lot more work to do in that regard.
In reality, this probably simplifies some of that work as it moves all
the network operations to the same place.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: cc8f358c23d307d19b06cd5e7d039d8fad01a947
Component: engine
A recent optimization in getSourceMount() made it return an error
in case when the found mount point is "/". This prevented bind-mounted
volumes from working in such cases.
A (rather trivial but adeqate) unit test case is added.
Fixes: 871c957242 ("getSourceMount(): simplify")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: d8fd6137a1f6d95a2bcdfeb6e1dfa6b816790c5e
Component: engine
The Partial property of the Logger message
was replaced by PLogMetaData, causing the build to fail.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: daaef83cd24f6eb335b77e9d8d692235eff1b201
Component: engine
It does not make sense to copy a slice element by element, then discard
the source one. Let's do copy in place instead which is way more
efficient.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: d4c94e83ca58b5ffd99840a8e3cbf22f1ab331c0
Component: engine