Commit Graph

78 Commits

Author SHA1 Message Date
1d8c80a782 Fixes for resolv.conf
Handle the case of systemd-resolved, and if in place
use a different resolv.conf source.
Set appropriately the option on libnetwork.
Move unix specific code to container_operation_unix

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Upstream-commit: e353e7e3f0ce8eceeff657393cba2876375403fa
Component: engine
2018-07-26 11:17:56 -07:00
c6a8d2ea17 Move network operations out of container package
These network operations really don't have anything to do with the
container but rather are setting up the networking.

Ideally these wouldn't get shoved into the daemon package, but doing
something else (e.g. extract a network service into a new package) but
there's a lot more work to do in that regard.
In reality, this probably simplifies some of that work as it moves all
the network operations to the same place.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: cc8f358c23d307d19b06cd5e7d039d8fad01a947
Component: engine
2018-05-10 17:16:00 -04:00
e260e8f94d Remove (now) extra call to sb.DisableService()
This call was added as part of commit a042e5a20 and at the time was
useful.  sandbox.DisableService() basically calls
endpoint.deleteServiceInfoFromCluster() for every endpoint in the
sandbox.  However, with the libnetwork change, endpoint.sbLeave()
invokes endpoint.deleteServiceInfoFromCluster(). The releaseNetwork()
call invokes sandbox.Delete() immediately after
sandbox.DisableService().  The sandbox.Delete() in turn ultimately
invokes endpoint.sbLeave() for every endpoint in the sandbox which thus
removes the endpoint's load balancing entry via
endpoint.deleteServiceInfoFromCluster().  So the call to
sandbox.DisableService() is now redundant.

It is noteworthy that, while redundant, the presence of the call would
not cause errors.  It would just be sub-optimal.  The DisableService()
call would cause libnetwork to down-weight the load balancing entries
while the call to sandbox.Delete() would cause it to remove the entries
immediately afterwards.  Aside from the wasted computation, the extra
call would also propagate an extra state change in the networkDB gossip
messages.  So, overall, it is much better to just avoid the extra
overhead.

Signed-off-by: Chris Telfer <ctelfer@docker.com>
Upstream-commit: c27417aa7de46daa415600b39fc8a9c411c8c493
Component: engine
2018-03-28 14:16:31 -04:00
be83c11fb0 Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: 4f0d95fa6ee7f865597c03b9e63702cdcb0f7067
Component: engine
2018-02-05 16:51:57 -05:00
efc7d712c1 Fix race in attachable network attachment
Attachable networks are networks created on the cluster which can then
be attached to by non-swarm containers. These networks are lazily
created on the node that wants to attach to that network.

When no container is currently attached to one of these networks on a
node, and then multiple containers which want that network are started
concurrently, this can cause a race condition in the network attachment
where essentially we try to attach the same network to the node twice.

To easily reproduce this issue you must use a multi-node cluster with a
worker node that has lots of CPUs (I used a 36 CPU node).

Repro steps:

1. On manager, `docker network create -d overlay --attachable test`
2. On worker, `docker create --restart=always --network test busybox
top`, many times... 200 is a good number (but not much more due to
subnet size restrictions)
3. Restart the daemon

When the daemon restarts, it will attempt to start all those containers
simultaneously. Note that you could try to do this yourself over the API,
but it's harder to trigger due to the added latency from going over
the API.

The error produced happens when the daemon tries to start the container
upon allocating the network resources:

```
attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
```

What happens here is the worker makes a network attachment request to
the manager. This is an async call which in the happy case would cause a
task to be placed on the node, which the worker is waiting for to get
the network configuration.
In the case of this race, the error ocurrs on the manager like this:

```
task allocation failure" error="failed during network allocation for task n7bwwwbymj2o2h9asqkza8gom: failed to allocate network IP for task n7bwwwbymj2o2h9asqkza8gom network rj4szie2zfauqnpgh4eri1yue: could not find an available IP" module=node node.id=u3489c490fx1df8onlyfo1v6e
```

The task is not created and the worker times out waiting for the task.

---

The mitigation for this is to make sure that only one attachment reuest
is in flight for a given network at a time *when the network doesn't
already exist on the node*. If the network already exists on the node
there is no need for synchronization because the network is already
allocated and on the node so there is no need to request it from the
manager.

This basically comes down to a race with `Find(network) ||
Create(network)` without any sort of syncronization.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: c379d2681ffe8495a888fb1d0f14973fbdbdc969
Component: engine
2018-02-02 13:46:23 -05:00
74da78f854 Fix network alias issue
This fix tries to address the issue raised in 33661 where
network alias does not work when connect to a network the second time.

This fix address the issue.

This fix fixes 33661.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: d63a5a1ff593f14957f3e0a9678633e8237defc9
Component: engine
2018-01-23 01:04:33 +00:00
2012c45c5a Disable service on release network
This PR contains a fix for moby/moby#30321. There was a moby/moby#31142
PR intending to fix the issue by adding a delay between disabling the
service in the cluster and the shutdown of the tasks. However
disabling the service was not deleting the service info in the cluster.
Added a fix to delete service info from cluster and verified using siege
to ensure there is zero downtime on rolling update of a service.In order
to support it and ensure consitency of enabling and disable service knob
from the daemon, we need to ensure we disable service when we release
the network from the container. This helps in making the enable and
disable service less racy. The corresponding part of libnetwork fix is
part of docker/libnetwork#1824

Signed-off-by: abhi <abhi@docker.com>
Upstream-commit: a042e5a20a7801efc936daf7a639487bb37ca966
Component: engine
2018-01-17 14:19:51 -08:00
a9d5589889 Merge pull request #36021 from yongtang/30897-follow-up
Rename FindUniqueNetwork to FindNetwork
Upstream-commit: be146652108f7145e902c275807f9ef71464b031
Component: engine
2018-01-16 09:38:16 +01:00
72b954bd11 Rename FindUniqueNetwork to FindNetwork
This fix is a follow up to 30397, with `FindUniqueNetwork`
changed to `FindNetwork` based on the review feedback.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: ccc2ed01894a1950eaf47db2ad0860ad87cd78d1
Component: engine
2018-01-15 17:34:40 +00:00
d4d0b5c268 Move api/errdefs to errdefs
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: d453fe35b9b8b52d0677fe0c3cc8373f2f5d30d0
Component: engine
2018-01-11 21:21:43 -05:00
952c29f8da Add helpers to create errdef errors
Instead of having to create a bunch of custom error types that are doing
nothing but wrapping another error in sub-packages, use a common helper
to create errors of the requested type.

e.g. instead of re-implementing this over and over:

```go
type notFoundError struct {
  cause error
}

func(e notFoundError) Error() string {
  return e.cause.Error()
}

func(e notFoundError) NotFound() {}

func(e notFoundError) Cause() error {
  return e.cause
}
```

Packages can instead just do:

```
  errdefs.NotFound(err)
```

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 87a12421a94faac294079bebc97c8abb4180dde5
Component: engine
2018-01-11 21:21:43 -05:00
df9a679f90 Update FindUniqueNetwork to address network name duplications
This fix is part of the effort to address 30242 where
issue arise because of the fact that multiple networks
may share the same name (within or across local/swarm scopes).

The focus of this fix is to allow creation of service
when a network in local scope has the same name as the
service network.

An integration test has been added.

This fix fixes 30242.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: cafed80cd019a8b40025eaa5e5b37459362607fb
Component: engine
2018-01-06 01:55:28 +00:00
30f1b651e2 Remove string checking in API error handling
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.

Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: ebcb7d6b406fe50ea9a237c73004d75884184c33
Component: engine
2017-08-15 16:01:11 -04:00
d659edcaf5 Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Upstream-commit: 1009e6a40b295187e038b67e184e9c0384d95538
Component: engine
2017-07-31 13:16:46 -07:00
455cc50b83 Include Endpoint List for Shared Endpoints
Do not allow sharing of container network with hyperv containers

Signed-off-by: Madhan Raj Mookkandy <madhanm@microsoft.com>
Upstream-commit: 349913ce9fde34d8acd08fad5ce866401f4d135e
Component: engine
2017-07-06 12:19:17 -07:00
55b5e8b06f Net operations already hold locks to containers
Fix a deadlock caused by re-entrant locks on container objects.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: 37addf0a50ccba51630368c6ed09eb08166d6f48
Component: engine
2017-06-23 07:52:35 -07:00
9755d1918d avoid re-reading json files when copying containers
Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: a43be3431e51b914ab170569b9e58239d64b199c
Component: engine
2017-06-23 07:52:34 -07:00
940ce3d71d save deep copies of Container in the replica store
Reuse existing structures and rely on json serialization to deep copy
Container objects.

Also consolidate all "save" operations on container.CheckpointTo, which
now both saves a serialized json to disk, and replicates state to the
ACID in-memory store.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: edad52707c536116363031002e6633e3fec16af5
Component: engine
2017-06-23 07:52:33 -07:00
ae145558ae Move checkpointing to the Container object
Also hide ViewDB behind an inteface.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: aacddda89df05b88a6d15fb33c42864760385ab2
Component: engine
2017-06-23 07:52:32 -07:00
8889d56b70 keep a consistent view of containers rendered
Replicate relevant mutations to the in-memory ACID store. Readers will
then be able to query container state without locking.

Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
Upstream-commit: eed4c7b73f0cf98cf48943da1c082f3210b28c82
Component: engine
2017-06-23 07:52:31 -07:00
103f3041e5 Lock container while connecting to a new network.
`ConnectToNetwork` is modfying the container but is not locking the
object.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 4d0888e32bccfd8c0f27a7b66b2a5607d42e2698
Component: engine
2017-05-31 15:13:04 -04:00
bad834206e Do not error out on serv bind deactivation if no sbox is found
- If the nw sbox is not there, then there is nothing to deactivate.

Signed-off-by: Alessandro Boch <aboch@docker.com>
Upstream-commit: 2418f257675807e5ae578137af48a2622797b746
Component: engine
2017-04-10 09:13:41 -07:00
09b605e914 Fix start/restart of detached container
Signed-off-by: Alessandro Boch <aboch@docker.com>
Upstream-commit: 4ca7d4f0c1490845cbf04220a408f67c006af248
Component: engine
2017-03-22 02:38:26 -07:00
65906b40c8 Fix nw sandbox leak when stopping detached container
Signed-off-by: Ryan Liu <ryanlyy@me.com>
Upstream-commit: 786f30107b0c63e8a42ca4b0d8d40621f45861c6
Component: engine
2017-03-21 23:51:52 -07:00
709434a772 (*) Support --net:container:<containername/id> for windows
(*) (vdemeester) Removed duplicate code across Windows and Unix wrt Net:Containers
(*) Return unsupported error for network sharing for hyperv isolation containers

Signed-off-by: Madhan Raj Mookkandy <MadhanRaj.Mookkandy@microsoft.com>
Upstream-commit: 040afcce8f3f54c64d328929c5115128f623deb1
Component: engine
2017-02-28 20:03:43 -08:00
aeb62e6edf Merge pull request #31384 from allencloud/validate-extrahosts-in-deamon-side
validate extraHosts in daemon side
Upstream-commit: 40f390e67eead490fa0b8e1684fc2dc8528cd8c8
Component: engine
2017-02-28 18:28:10 +01:00
7a647c8187 Extract common code from disconnectFromNetwork and releaseNetwork
Both method are trying to detach the container from a cluster
network. The code is exactly the same, this removes the duplication.

Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: cb6832c6d34cdce33341c4f71ba8f22a23608c70
Component: engine
2017-02-28 11:11:59 +01:00
37a58aacb7 validate extraHosts in daemon side
Signed-off-by: allencloud <allen.sun@daocloud.io>
Upstream-commit: d524dd95ccadf0b3dea2e04f5813abbc1423a84e
Component: engine
2017-02-28 10:37:59 +08:00
dd8010e320 Extract daemon configuration and discovery to their own package
This also moves some cli specific in `cmd/dockerd` as it does not
really belong to the `daemon/config` package.

Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: db63f9370e26d725357c703cbaf9ab63cc7b6d0a
Component: engine
2017-02-08 09:53:38 +01:00
b304d83344 Merge pull request #30117 from msabansal/natfix
Added support for dns-search and fixes #30102
Upstream-commit: c0a1d2e0d88ff3cae6802dfbd128c7739e8c2bcc
Component: engine
2017-01-31 11:05:29 +01:00
377b3251ff Fix some typos
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: 827bbe90a0fc53342a85d2653d96b8ba55658adf
Component: engine
2017-01-19 15:29:28 +08:00
3fdf20b049 Added support for dns-search and fixes #30102
Signed-off-by: msabansal <sabansal@microsoft.com>
Upstream-commit: e6962481a032c7278bc17c8fdcc42831c6d0b88f
Component: engine
2017-01-13 12:01:10 -08:00
aa92b07a0f fix nit in comments
Signed-off-by: allencloud <allen.sun@daocloud.io>
Upstream-commit: 847de59934bd1f9134b235ce70a5c2e1085a470e
Component: engine
2017-01-08 21:32:30 +08:00
a7d9e53a66 display network name when disconnecting network error
Signed-off-by: allencloud <allen.sun@daocloud.io>
Upstream-commit: f0844de8f00eef8fe42503986a87933516b11b46
Component: engine
2016-12-27 13:37:54 +08:00
e76537d57c Fix issue for --hostname when running in "--net=host"
This fix tries to address the issue raised in 29129 where
"--hostname" not working when running in "--net=host" for
`docker run`.

The fix fixes the issue by not resetting the `container.Config.Hostname`
if the `Hostname` has already been assigned through `--hostname`.

An integration test has been added to cover the changes.

This fix fixes 29129.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: b0a7b0120f4461daa34527a743087e73ef8f5963
Component: engine
2016-12-06 07:29:45 -08:00
0fab75c2a1 Fix grammar on error message
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Upstream-commit: cd5c8e9c2dc181cd624bdb10947a19197215d730
Component: engine
2016-11-25 14:58:20 +00:00
e5fe6e7966 Merge pull request #27466 from mrjana/net
Retry AttachNetwork when it fails to find network
Upstream-commit: 9a61bd05f8c86be2a7f03d57f29991abaae20e5a
Component: engine
2016-11-08 18:25:45 +01:00
393d61cc88 dynamic service binding.
Signed-off-by: Dong Chen <dongluo.chen@docker.com>
Upstream-commit: ca81f6ee7c74a3bf27dc9b044742961f4ef78094
Component: engine
2016-11-04 21:50:56 -07:00
c790a14c2d Handle NetworkDettach for the case of network-id
When a container is attached to an "--attachable" network, it strictly
forms the attacherKey using either the network-id or network-name
because at the time of attachment, the daemon may not have the network
downloaded locally from the manager. Hence, when the NetworkDettach is
called, it should use either network-name or network-id. This fix
addresses the missing network-id based dettachment case.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
Upstream-commit: 5f17e0f6c91b36a8f33d1efa1be879d6eb80132f
Component: engine
2016-11-03 15:56:35 -07:00
42f7097187 Retry AttachNetwork when it fails to find network
When trying to attach to swarm scope network for an unmanaged container
sometimes even if attaching to network succeeds, we may not find the
network because some other container which was using the network went
down and removed the network. So if it is not found, try to detach and
reattach to re-download the network from the manager.

Fixes #26588

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Upstream-commit: 849e345e2c366d875b03f8b33b31308ba16fb4fd
Component: engine
2016-11-01 21:46:24 -07:00
11be31b1b8 Copy only the relevant endpoint configs from Attachable config
When a container is run on a --attachable network, the endpoint
configs passed by the user were incorrectly overwritten.
Copy the relevant configs instead of overwriting the entire configs.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
Upstream-commit: c5dd4d70c6ea3371d61409112a45c0573280111d
Component: engine
2016-10-29 17:11:30 -07:00
2a9003b823 Add basic prometheus support
This adds a metrics packages that creates additional metrics.  Add the
metrics endpoint to the docker api server under `/metrics`.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add metrics to daemon package

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

api: use standard way for metrics route

Also add "type" query parameter

Signed-off-by: Alexander Morozov <lk4d4@docker.com>

Convert timers to ms

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 3343d234f3b131d4be1d4ca84385e184633a79bd
Component: engine
2016-10-27 10:34:38 -07:00
75231e91af Delete a redundant error return
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>

update

Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>

update

Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
Upstream-commit: 2d126f190d5a9c07b3d3aad93244055d8c9d5fc8
Component: engine
2016-10-22 08:53:57 +08:00
763c2d8e2f Windows: Factor out unused fields in container
Signed-off-by: John Howard <jhoward@microsoft.com>
Upstream-commit: 600f0ad21142f4085330107f629a80099af0490f
Component: engine
2016-10-13 14:51:10 -07:00
dc54335e17 Delete the redundant function 'errClusterNetworkOnRun'
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
Upstream-commit: 9c3d1236d22504410bf7363acbad382917e3a630
Component: engine
2016-09-24 11:24:48 +08:00
29a3433f96 Merge pull request #26805 from miaoyq/refactor-allocateNetwork
Replace two array with a map type, make it easier to understand.
Upstream-commit: 8643903e49c2d25272e59fe09a75803aa05ed841
Component: engine
2016-09-23 10:15:37 +02:00
0afb3aa46c Merge pull request #25987 from msabansal/dnssupport
Support for Windows service discovery
Upstream-commit: d3139fc84af390142341a29d3512462d1a449045
Component: engine
2016-09-22 20:56:03 -07:00
792dff1fb0 Replace two array with a map type, make it easier to understand.
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>

update

Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>

update

Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
Upstream-commit: 1989b1b58cc8c416e430a1fd824626b259dab5f6
Component: engine
2016-09-23 09:52:05 +08:00
6bf922c44e Changes required to support windows service discovery
Signed-off-by: msabansal <sabansal@microsoft.com>
Upstream-commit: d1e0a78614d4efa768c88c9db3868bc9e7782efc
Component: engine
2016-09-22 12:21:21 -07:00
51c529dd35 Fixed support for docker compose by allowing connect/disconnect on stopped containers
Signed-off-by: msabansal <sabansal@microsoft.com>
Upstream-commit: 50f02b585c6e5866cf458c7f91a2e99e82ea9a2d
Component: engine
2016-09-21 13:29:17 -07:00