We currently drop the global lock while holding a per-device lock when
waiting for device removal, and then we re-aquire it when the sleep is done.
This is causing a AB-BA deadlock if anyone at the same time tries to do any
operation on that device like this:
thread A: thread B
grabs global lock
grabs device lock
releases global lock
sleeps
grabs global lock
blocks on device lock
wakes up
blocks on global lock
To trigger this you can for instance do:
ID=`docker run -d fedora sleep 5`
cd /var/lib/docker/devicemapper/mnt/$ID
docker wait $ID
docker rm $ID &
docker rm $ID
The unmount will fail due to the mount being busy thus causing the
timeout and the second rm will then trigger the deadlock.
We fix this by adding a lock ordering such that the device locks
are always grabbed before the global lock. This is safe since the
device lookups now have a separate lock.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 2ffef1b7eb618162673c6ffabccb9ca57c7dfce3
Component: engine
Currently access to the Devices map is serialized by the main
DeviceSet lock, but we need to access it outside that lock, so we
add a separate lock for this and grab that everywhere we modify
or read the map.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 70826e8b3fee27b971852aad89053507c6866d3e
Component: engine
This centralizes the lookup of devices so it is only done in one place.
This will be needed later when we change the locking for it.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: e01b71cebeb96755641a18762dea5b843f107bee
Component: engine
We already have this at the caller, no need to look up again.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 74edcaf1e84aa8bf35e496b2bead833172a79fca
Component: engine
We already have the info in most cases, no need to look this up multiple times.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 5955846774c9b43291d6de0584fa8c3f62414c43
Component: engine
All the callers already have the info, no need for an extra lookup.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 8e39b35c7cd02bbb644b7faf2a434de0098e6dea
Component: engine
There is no need to look this up again, we have it already in all callers.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: e5394e35c7a8f730ac76d24dee74d769049a0428
Component: engine
Fixes#4741
Right now volumes from expected a dir and not a file so when the drivers
tried to do the bind mount, the destination was a dir, not a file so it
fails to run.
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
Upstream-commit: a57900e35f2c30026a070fdfdbdb0ce99b35e1ff
Component: engine
The change in commit a9fa1a13c3b0a654a96be01ff7ec19e8009b2094
made us only deactivate devices that were mounted. Unfortunately
this made us not deactivate the base device. Which caused
us to not be able to deactivate the pool.
This fixes that by always just deactivating the base device.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 66c5e19f9bd057644fab475499ea45bb428ba2b2
Component: engine
If an admin mounts all file systems as -rshared (Default on RHEL and Fedora)
we see a scaling problem as the number of container increase.
Basically every new container needs to have it new mounts in /var/lib/docker
shared to all other containers, this ends up with us only able to scale to
around 100 containers, before the system slows down.
By simply bind mounting /var/lib/docker on its and then setting it private,
the scaling issue goes away.
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Upstream-commit: 792bb41e524615486ef8266b7bf4804b4fe178f1
Component: engine
This implements cgroup.Apply() using the systemd apis.
We create a transient unit called "docker-$id.scope" that contains
the container processes. We also have a way to set unit specific
properties, currently only defining the Slice to put the
scope in.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 6c7835050e53b733181ddfca6152c358fd625400
Component: engine
In order to handle special configuration for different drivers we
make the Config field a map to string array. This lets
us use it for lxc, by using the "lxc" key for those, and we can
later extend it easily for other backend-specific options.
Docker-DCO-1.1-Signed-off-by: Alexander Larsson <alexl@redhat.com> (github: alexlarsson)
Upstream-commit: 7a3070a6000963d12be9dcd2698d911b848a33b6
Component: engine
docker will run the process(es) within the container with an SELinux label and will label
all of the content within the container with mount label. Any temporary file systems
created within the container need to be mounted with the same mount label.
The user can override the process label by specifying
-Z With a string of space separated options.
-Z "user=unconfined_u role=unconfined_r type=unconfined_t level=s0"
Would cause the process label to run with unconfined_u:unconfined_r:unconfined_t:s0"
By default the processes will run execute within the container as svirt_lxc_net_t.
All of the content in the container as svirt_sandbox_file_t.
The process mcs level is based of the PID of the docker process that is creating the container.
If you run the container in --priv mode, the labeling will be disabled.
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
Upstream-commit: 4c4356692580afb3971094e322aea64abe0e2500
Component: engine