Handle the case of systemd-resolved, and if in place
use a different resolv.conf source.
Set appropriately the option on libnetwork.
Move unix specific code to container_operation_unix
Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Upstream-commit: e353e7e3f0ce8eceeff657393cba2876375403fa
Component: engine
These network operations really don't have anything to do with the
container but rather are setting up the networking.
Ideally these wouldn't get shoved into the daemon package, but doing
something else (e.g. extract a network service into a new package) but
there's a lot more work to do in that regard.
In reality, this probably simplifies some of that work as it moves all
the network operations to the same place.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: cc8f358c23d307d19b06cd5e7d039d8fad01a947
Component: engine
This call was added as part of commit a042e5a20 and at the time was
useful. sandbox.DisableService() basically calls
endpoint.deleteServiceInfoFromCluster() for every endpoint in the
sandbox. However, with the libnetwork change, endpoint.sbLeave()
invokes endpoint.deleteServiceInfoFromCluster(). The releaseNetwork()
call invokes sandbox.Delete() immediately after
sandbox.DisableService(). The sandbox.Delete() in turn ultimately
invokes endpoint.sbLeave() for every endpoint in the sandbox which thus
removes the endpoint's load balancing entry via
endpoint.deleteServiceInfoFromCluster(). So the call to
sandbox.DisableService() is now redundant.
It is noteworthy that, while redundant, the presence of the call would
not cause errors. It would just be sub-optimal. The DisableService()
call would cause libnetwork to down-weight the load balancing entries
while the call to sandbox.Delete() would cause it to remove the entries
immediately afterwards. Aside from the wasted computation, the extra
call would also propagate an extra state change in the networkDB gossip
messages. So, overall, it is much better to just avoid the extra
overhead.
Signed-off-by: Chris Telfer <ctelfer@docker.com>
Upstream-commit: c27417aa7de46daa415600b39fc8a9c411c8c493
Component: engine
Attachable networks are networks created on the cluster which can then
be attached to by non-swarm containers. These networks are lazily
created on the node that wants to attach to that network.
When no container is currently attached to one of these networks on a
node, and then multiple containers which want that network are started
concurrently, this can cause a race condition in the network attachment
where essentially we try to attach the same network to the node twice.
To easily reproduce this issue you must use a multi-node cluster with a
worker node that has lots of CPUs (I used a 36 CPU node).
Repro steps:
1. On manager, `docker network create -d overlay --attachable test`
2. On worker, `docker create --restart=always --network test busybox
top`, many times... 200 is a good number (but not much more due to
subnet size restrictions)
3. Restart the daemon
When the daemon restarts, it will attempt to start all those containers
simultaneously. Note that you could try to do this yourself over the API,
but it's harder to trigger due to the added latency from going over
the API.
The error produced happens when the daemon tries to start the container
upon allocating the network resources:
```
attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
```
What happens here is the worker makes a network attachment request to
the manager. This is an async call which in the happy case would cause a
task to be placed on the node, which the worker is waiting for to get
the network configuration.
In the case of this race, the error ocurrs on the manager like this:
```
task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e
```
The task is not created and the worker times out waiting for the task.
---
The mitigation for this is to make sure that only one attachment reuest
is in flight for a given network at a time *when the network doesn't
already exist on the node*. If the network already exists on the node
there is no need for synchronization because the network is already
allocated and on the node so there is no need to request it from the
manager.
This basically comes down to a race with `Find(network) ||
Create(network)` without any sort of syncronization.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: c379d2681ffe8495a888fb1d0f14973fbdbdc969
Component: engine
This fix tries to address the issue raised in 33661 where
network alias does not work when connect to a network the second time.
This fix address the issue.
This fix fixes 33661.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: d63a5a1ff593f14957f3e0a9678633e8237defc9
Component: engine
This PR contains a fix for moby/moby#30321. There was a moby/moby#31142
PR intending to fix the issue by adding a delay between disabling the
service in the cluster and the shutdown of the tasks. However
disabling the service was not deleting the service info in the cluster.
Added a fix to delete service info from cluster and verified using siege
to ensure there is zero downtime on rolling update of a service.In order
to support it and ensure consitency of enabling and disable service knob
from the daemon, we need to ensure we disable service when we release
the network from the container. This helps in making the enable and
disable service less racy. The corresponding part of libnetwork fix is
part of docker/libnetwork#1824
Signed-off-by: abhi <abhi@docker.com>
Upstream-commit: a042e5a20a7801efc936daf7a639487bb37ca966
Component: engine
This fix is a follow up to 30397, with `FindUniqueNetwork`
changed to `FindNetwork` based on the review feedback.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: ccc2ed01894a1950eaf47db2ad0860ad87cd78d1
Component: engine
Instead of having to create a bunch of custom error types that are doing
nothing but wrapping another error in sub-packages, use a common helper
to create errors of the requested type.
e.g. instead of re-implementing this over and over:
```go
type notFoundError struct {
cause error
}
func(e notFoundError) Error() string {
return e.cause.Error()
}
func(e notFoundError) NotFound() {}
func(e notFoundError) Cause() error {
return e.cause
}
```
Packages can instead just do:
```
errdefs.NotFound(err)
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 87a12421a94faac294079bebc97c8abb4180dde5
Component: engine
This fix is part of the effort to address 30242 where
issue arise because of the fact that multiple networks
may share the same name (within or across local/swarm scopes).
The focus of this fix is to allow creation of service
when a network in local scope has the same name as the
service network.
An integration test has been added.
This fix fixes 30242.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: cafed80cd019a8b40025eaa5e5b37459362607fb
Component: engine
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: ebcb7d6b406fe50ea9a237c73004d75884184c33
Component: engine
Do not allow sharing of container network with hyperv containers
Signed-off-by: Madhan Raj Mookkandy <madhanm@microsoft.com>
Upstream-commit: 349913ce9fde34d8acd08fad5ce866401f4d135e
Component: engine
Fix a deadlock caused by re-entrant locks on container objects.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: 37addf0a50ccba51630368c6ed09eb08166d6f48
Component: engine
Reuse existing structures and rely on json serialization to deep copy
Container objects.
Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: edad52707c536116363031002e6633e3fec16af5
Component: engine
Also hide ViewDB behind an inteface.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: aacddda89df05b88a6d15fb33c42864760385ab2
Component: engine
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: eed4c7b73f0cf98cf48943da1c082f3210b28c82
Component: engine
`ConnectToNetwork` is modfying the container but is not locking the
object.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 4d0888e32bccfd8c0f27a7b66b2a5607d42e2698
Component: engine
- If the nw sbox is not there, then there is nothing to deactivate.
Signed-off-by: Alessandro Boch <aboch@docker.com>
Upstream-commit: 2418f257675807e5ae578137af48a2622797b746
Component: engine
Both method are trying to detach the container from a cluster
network. The code is exactly the same, this removes the duplication.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: cb6832c6d34cdce33341c4f71ba8f22a23608c70
Component: engine
This also moves some cli specific in `cmd/dockerd` as it does not
really belong to the `daemon/config` package.
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: db63f9370e26d725357c703cbaf9ab63cc7b6d0a
Component: engine
This fix tries to address the issue raised in 29129 where
"--hostname" not working when running in "--net=host" for
`docker run`.
The fix fixes the issue by not resetting the `container.Config.Hostname`
if the `Hostname` has already been assigned through `--hostname`.
An integration test has been added to cover the changes.
This fix fixes 29129.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: b0a7b0120f4461daa34527a743087e73ef8f5963
Component: engine
When a container is attached to an "--attachable" network, it strictly
forms the attacherKey using either the network-id or network-name
because at the time of attachment, the daemon may not have the network
downloaded locally from the manager. Hence, when the NetworkDettach is
called, it should use either network-name or network-id. This fix
addresses the missing network-id based dettachment case.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Upstream-commit: 5f17e0f6c91b36a8f33d1efa1be879d6eb80132f
Component: engine
When trying to attach to swarm scope network for an unmanaged container
sometimes even if attaching to network succeeds, we may not find the
network because some other container which was using the network went
down and removed the network. So if it is not found, try to detach and
reattach to re-download the network from the manager.
Fixes#26588
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Upstream-commit: 849e345e2c366d875b03f8b33b31308ba16fb4fd
Component: engine
When a container is run on a --attachable network, the endpoint
configs passed by the user were incorrectly overwritten.
Copy the relevant configs instead of overwriting the entire configs.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Upstream-commit: c5dd4d70c6ea3371d61409112a45c0573280111d
Component: engine
This adds a metrics packages that creates additional metrics. Add the
metrics endpoint to the docker api server under `/metrics`.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add metrics to daemon package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
api: use standard way for metrics route
Also add "type" query parameter
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
Convert timers to ms
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 3343d234f3b131d4be1d4ca84385e184633a79bd
Component: engine