If container will run as non root user, drop permitted, effective caps early
Upstream-commit: b67c1e078c7eeb20199dce301e95fa8999c98109
Component: engine
As soon as the initial executable in the container is executed as a non root user,
permitted and effective capabilities are dropped. Drop them earlier than this, so
that they are dropped before executing the file. The main effect of this is that
if `CAP_DAC_OVERRIDE` is set (the default) the user will not be able to execute
files they do not have permission to execute, which previously they could.
The old behaviour was somewhat surprising and the new one is definitely correct,
but it is not in any meaningful way exploitable, and I do not think it is
necessary to backport this fix. It is unlikely to have any negative effects as
almost all executables have world execute permission anyway.
Use the bounding set not the effective set as the canonical set of capabilities, as
effective will now vary.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Upstream-commit: 15ff09395c001bcb0f284461abbc404a1d8bab4d
Component: engine
Commit fd0e24b7189374e0fe7c55b6d26ee916d3ee1655 changed
the stats collection loop to use a `sleep()` instead
of `time.Tick()` in the for-loop.
This change caused a regression in situations where
no stats are being collected, or an error is hit
in the loop (in which case the loop would `continue`,
and the `sleep()` is not hit).
This patch puts the sleep at the start of the loop
to guarantee it's always hit.
This will delay the sampling, which is similar to the
behavior before fd0e24b7189374e0fe7c55b6d26ee916d3ee1655.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 481b8e54b45955e40075f49a9af321afce439320
Component: engine
This PR adds support for compressibility of log file.
I added a new option conpression for the jsonfile log driver,
this option allows the user to specify compression algorithm to
compress the log files. By default, the log files will be
not compressed. At present, only support 'gzip'.
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
'docker logs' can read from compressed files
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
Add Metadata to the gzip header, optmize 'readlog'
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
Upstream-commit: f69f09f44ce9fedbc9d70f11980c1fc8d7f77cec
Component: engine
Commit 7a7357dae1bccc ("LCOW: Implemented support for docker cp + build")
changed `container.BaseFS` from being a string (that could be empty but
can't lead to nil pointer dereference) to containerfs.ContainerFS,
which could be be `nil` and so nil dereference is at least theoretically
possible, which leads to panic (i.e. engine crashes).
Such a panic can be avoided by carefully analysing the source code in all
the places that dereference a variable, to make the variable can't be nil.
Practically, this analisys are impossible as code is constantly
evolving.
Still, we need to avoid panics and crashes. A good way to do so is to
explicitly check that a variable is non-nil, returning an error
otherwise. Even in case such a check looks absolutely redundant,
further changes to the code might make it useful, and having an
extra check is not a big price to pay to avoid a panic.
This commit adds such checks for all the places where it is not obvious
that container.BaseFS is not nil (which in this case means we do not
call daemon.Mount() a few lines earlier).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: d6ea46cedaca0098c15843c5254a337d087f5cd6
Component: engine
In case ContainerExport() is called for an unmounted container, it leads
to a daemon panic as container.BaseFS, which is dereferenced here, is
nil.
To fix, do not rely on container.BaseFS; use the one returned from
rwlayer.Mount().
Fixes: 7a7357dae1bccc ("LCOW: Implemented support for docker cp + build")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 81f6307eda44ab3a91de6e29304810a976161d74
Component: engine
In info, we only need the number of images, but `CountImages` was
getting the whole map of images and then grabbing the length from that.
This causes a lot of unnecessary CPU usage and memory allocations, which
increases with O(n) on the number of images.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: f6a7763b6f3256bed9a7352021745189d0ca8dc9
Component: engine
Ingress networks will no longer automatically remove their
load-balancing endpoint (and sandbox) automatically when the network is
otherwise upopulated. This is to prevent automatic removal of the
ingress networks when all the containers leave them. Therefore
explicit removal of an ingress network also requires explicit removal
of its load-balancing endpoint.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Upstream-commit: 3da4ebf355d3494d1403b2878a1ae6958b2724e9
Component: engine
It has been pointed out that if --read-only flag is given, /dev/shm
also becomes read-only in case of --ipc private.
This happens because in this case the mount comes from OCI spec
(since commit 7120976d74195), and is a regression caused by that
commit.
The meaning of --read-only flag is to only have a "main" container
filesystem read-only, not the auxiliary stuff (that includes /dev/shm,
other mounts and volumes, --tmpfs, /proc, /dev and so on).
So, let's make sure /dev/shm that comes from OCI spec is not made
read-only.
Fixes: 7120976d74195 ("Implement none, private, and shareable ipc modes")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: cad74056c09f6276b0f4a996a1511553177cd3d7
Component: engine
The test case checks that in case of IpcMode: private and
ReadonlyRootfs: true (as in "docker run --ipc private --read-only")
the resulting /dev/shm mount is NOT made read-only.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 33dd562e3acff71ee18a2543d14fcbecf9bf0e62
Component: engine
To avoid noise in sampling CPU usage metrics, we now sample the system
usage closer to the actual response from the underlying runtime. Because
the response from the runtime may be delayed, this makes the sampling
more resilient in loaded conditions. In addition to this, we also
replace the tick with a sleep to avoid situations where ticks can backup
under loaded conditions.
The trade off here is slightly more load reading the system CPU usage
for each container. There may be an optimization required for large
amounts of containers but the cost is on the order of 15 ms per 1000
containers. If this becomes a problem, we can time slot the sampling,
but the complexity may not be worth it unless we can test further.
Unfortunately, there aren't really any good tests for this condition.
Triggering this behavior is highly system dependent. As a matter of
course, we should qualify the fix with the users that are affected.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: fd0e24b7189374e0fe7c55b6d26ee916d3ee1655
Component: engine
With the inclusion of PR 30897, creating service for host network
fails in 18.02. Modified IsPreDefinedNetwork check and return
NetworkNameError instead of errdefs.Forbidden to address this issue
Signed-off-by: selansen <elango.siva@docker.com>
Upstream-commit: 7cf8b20762cc9491f52ff3f3d94c880378183696
Component: engine
While a `types.go` file is handly when there are a lot of record types,
it is completely obnoxious when used for concrete, utility types with a
struct, new function and method set in the same file. This change
removes the `types.go` file in favor of the simpler approach.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: 244e59e94f153af82e6c3bd8a6c200a48d3cea60
Component: engine
Move the "unmount and deactivate" code into a separate method, and
optimize it a bit:
1. Do not use filepath.Walk() as there's no requirement to recursively
go into every directory under home/mnt; a list of directories in mnt
is sufficient. With filepath.Walk(), in case some container will fail
to unmount, it'll go through the whole container filesystem which is
excessive and useless.
2. Do not use GetMounts() and check if a directory is mounted; just
unmount it and ignore "not mounted" error. Note the same error
is returned in case of wrong flags set, but as flags are hardcoded
we can safely ignore such case.
While at it, promote "can't unmount" log level from debug to warning.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: f1a459229724f5e8e440b49f058167c2eeeb2dc6
Component: engine
1. Make sure it's clear the error is from unmount.
2. Simplify the code a bit to make it more readable.
[v2: use errors.Wrap]
[v3: use errors.Wrapf]
[v4: lowercase the error message]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 9d00aedebc25507042c5afd4ab8fc6b333ca7c53
Component: engine
Log the error returned from logdriver.Log() instead of the logdriver
itself.
Signed-off-by: Cody Roseborough <crrosebo@amazon.com>
Upstream-commit: a1956b5623fad186ad39ae8aca998284003b0cd3
Component: engine
1. Replace EnsureRemoveAll() with Rmdir(), as here we are removing
the container's mount point, which is already properly unmounted
and is therefore an empty directory.
2. Ignore the Rmdir() error (but log it unless it's ENOENT). This
is a mount point, currently unmounted (i.e. an empty directory),
and an older kernel can return EBUSY if e.g. the mount was
leaked to other mount namespaces.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 732dd9b848bec70a2ecb5b4998918886a0cec497
Component: engine
Exec processes do not automatically inherit AppArmor
profiles from the container.
This patch sets the AppArmor profile for the exec
process.
Before this change:
apparmor_parser -q -r <<EOF
#include <tunables/global>
profile deny-write flags=(attach_disconnected) {
#include <abstractions/base>
file,
network,
deny /tmp/** w,
capability,
}
EOF
docker run -dit --security-opt "apparmor=deny-write" --name aa busybox
docker exec aa sh -c 'mkdir /tmp/test'
(no error)
With this change applied:
docker exec aa sh -c 'mkdir /tmp/test'
mkdir: can't create directory '/tmp/test': Permission denied
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 8f3308ae10ec9ad0dd4edfb46fde53a0e1e19b34
Component: engine
It looks like no one uses this function.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 0450f61cb995c8fc2f41a6909526be6ed4093565
Component: engine
imageService provides the backend for the image API and handles the
imageStore, and referenceStore.
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: 0dab53ff3cb0aae91aae068a3f1f2fd32339e23b
Component: engine