Commit Graph

20 Commits

Author SHA1 Message Date
4524589dc5 Add support for user-defined healthchecks
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).

When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: b6c7becbfe1d76b1250f6d8e991e645e13808a9c
Component: engine
2016-06-02 23:58:34 +02:00
b2ac99b3fa Remove static errors from errors package.
Moving all strings to the errors package wasn't a good idea after all.

Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:

```go
func GetErrorMessage(err error) string {
	switch err.(type) {
	case errcode.Error:
		e, _ := err.(errcode.Error)
		return e.Message

	case errcode.ErrorCode:
		ec, _ := err.(errcode.ErrorCode)
		return ec.Message()

	default:
		return err.Error()
	}
}
```

This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.

Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors.

Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API:

```go
	switch err.(type) {
	case errcode.ErrorCode:
		daError, _ := err.(errcode.ErrorCode)
		statusCode = daError.Descriptor().HTTPStatusCode
		errMsg = daError.Message()

	case errcode.Error:
		// For reference, if you're looking for a particular error
		// then you can do something like :
		//   import ( derr "github.com/docker/docker/errors" )
		//   if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }

		daError, _ := err.(errcode.Error)
		statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
		errMsg = daError.Message

	default:
		// This part of will be removed once we've
		// converted everything over to use the errcode package

		// FIXME: this is brittle and should not be necessary.
		// If we need to differentiate between different possible error types,
		// we should create appropriate error types with clearly defined meaning
		errStr := strings.ToLower(err.Error())
		for keyword, status := range map[string]int{
			"not found":             http.StatusNotFound,
			"no such":               http.StatusNotFound,
			"bad parameter":         http.StatusBadRequest,
			"conflict":              http.StatusConflict,
			"impossible":            http.StatusNotAcceptable,
			"wrong login/password":  http.StatusUnauthorized,
			"hasn't been activated": http.StatusForbidden,
		} {
			if strings.Contains(errStr, keyword) {
				statusCode = status
				break
			}
		}
	}
```

You can notice two things in that code:

1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation.

This change removes all our status errors from the errors package and puts them back in their specific contexts.
IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface:

```go
type errorWithStatus interface {
	HTTPErrorStatusCode() int
}
```

This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method.

I included helper functions to generate errors that use custom status code in `errors/errors.go`.

By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it.

Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors

Signed-off-by: David Calavera <david.calavera@gmail.com>
Upstream-commit: a793564b2591035aec5412fbcbcccf220c773a4c
Component: engine
2016-02-26 15:49:09 -05:00
68f96de053 Remove redundant error message
Currently some commands including `kill`, `pause`, `restart`, `rm`,
`rmi`, `stop`, `unpause`, `udpate`, `wait` will print a lot of error
message on client side, with a lot of redundant messages, this commit is
trying to remove the unuseful and redundant information for user.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: 894266c1bbdfeb53bf278f3cb762945bac69e592
Component: engine
2016-02-03 15:45:20 +08:00
2336bc4aed Correct the info message when stop container
Signed-off-by: Lei Jitang <leijitang@huawei.com>
Upstream-commit: 6716a3a167fcd0a9abc013ce536117175922af96
Component: engine
2016-01-27 03:06:45 -05:00
2cee7ddb46 Rename Daemon.Get to Daemon.GetContainer.
This is more aligned with `Daemon.GetImage` and less confusing.

Signed-off-by: David Calavera <david.calavera@gmail.com>
Upstream-commit: d7d512bb927023b76c3c01f54a3655ee7c341637
Component: engine
2015-12-11 12:39:28 -05:00
d010c48ce4 Move Container to its own package.
So other packages don't need to import the daemon package when they
want to use this struct.

Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
Upstream-commit: 6bb0d1816acd8d4f7a542a6aac047da2b874f476
Component: engine
2015-12-03 17:39:49 +01:00
cf2d677f4e Decouple daemon and container to log events.
Create a supervisor interface to let the container monitor to emit events.

Signed-off-by: David Calavera <david.calavera@gmail.com>
Upstream-commit: ca5ede2d0a23cb84cac3b863c363d0269e6438df
Component: engine
2015-11-04 12:27:48 -05:00
41d5167da1 Decouple daemon and container to stop and kill containers.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Upstream-commit: 4f2a5ba360d0b00213d31f50a5be074c89124c52
Component: engine
2015-11-04 12:27:47 -05:00
69db6279aa Revert "Merge pull request #16228 from duglin/ContextualizeEvents"
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.

Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).

This reverts commit de4164043546d2b9ee3bf323dbc41f4979c84480, reversing
changes made to 7daeecd42d7bb112bfe01532c8c9a962bb0c7967.

Signed-off-by: Tibor Vass <tibor@docker.com>

Conflicts:
	api/server/container.go
	builder/internals.go
	daemon/container_unix.go
	daemon/create.go
Upstream-commit: b08f071e18043abe8ce15f56826d38dd26bedb78
Component: engine
2015-09-29 14:26:51 -04:00
bf44c732da Add context.RequestID to event stream
This PR adds a "request ID" to each event generated, the 'docker events'
stream now looks like this:

```
2015-09-10T15:02:50.000000000-07:00 [reqid: c01e3534ddca] de7c5d4ca927253cf4e978ee9c4545161e406e9b5a14617efb52c658b249174a: (from ubuntu) create
```
Note the `[reqID: c01e3534ddca]` part, that's new.

Each HTTP request will generate its own unique ID. So, if you do a
`docker build` you'll see a series of events all with the same reqID.
This allow for log processing tools to determine which events are all related
to the same http request.

I didn't propigate the context to all possible funcs in the daemon,
I decided to just do the ones that needed it in order to get the reqID
into the events. I'd like to have people review this direction first, and
if we're ok with it then I'll make sure we're consistent about when
we pass around the context - IOW, make sure that all funcs at the same level
have a context passed in even if they don't call the log funcs - this will
ensure we're consistent w/o passing it around for all calls unnecessarily.

ping @icecrime @calavera @crosbymichael

Signed-off-by: Doug Davis <dug@us.ibm.com>
Upstream-commit: 26b1064967d9fcefd4c35f60e96bf6d7c9a3b5f8
Component: engine
2015-09-24 11:56:37 -07:00
3904dd3167 Move api/errors/ to errors/
Per @calavera's suggestion: https://github.com/docker/docker/pull/16355#issuecomment-141139220

Signed-off-by: Doug Davis <dug@us.ibm.com>
Upstream-commit: a283a30fb026aad4434a9f2e34f7ce955d27a957
Component: engine
2015-09-17 11:54:14 -07:00
6295202aba Convert some "daemon" static error strings to the new errocode package format
Signed-off-by: Doug Davis <dug@us.ibm.com>
Upstream-commit: f7d4b4fe2b130a522dee847a657218806180fa52
Component: engine
2015-09-16 16:16:42 -07:00
1870e3919c golint fixes for daemon/ package
- some method names were changed to have a 'Locking' suffix, as the
 downcased versions already existed, and the existing functions simply
 had locks around the already downcased version.
 - deleting unused functions
 - package comment
 - magic numbers replaced by golang constants
 - comments all over

Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
Upstream-commit: abd72d4008dde7ee8249170d49eb4bc963c51e24
Component: engine
2015-08-27 22:07:42 -07:00
4f79291859 Cleanup container LogEvent calls
Move some calls to container.LogEvent down lower so that there's
less of a chance of them being missed. Also add a few more events
that appear to have been missed.

Added testcases for new events: commit, copy, resize, attach, rename, top

Signed-off-by: Doug Davis <dug@us.ibm.com>
Upstream-commit: 8232312c1e705753d3db82dca3d9bb23e59c3b52
Component: engine
2015-06-01 12:39:28 -07:00
d05177503f Remove job from stop
Signed-off-by: Antonio Murdaca <me@runcom.ninja>
Upstream-commit: 04cc6c6aa4f8ea656d23268b7bff0136e4fc6c8d
Component: engine
2015-04-12 00:41:16 +02:00
a16f3d6cb4 Remove engine.Status and replace it with standard go error
Signed-off-by: Antonio Murdaca <me@runcom.ninja>
Upstream-commit: c79b9bab541673af121d829ebc3b29ff1b01efa2
Component: engine
2015-03-25 22:32:08 +01:00
685b876322 Closes #9311 Handles container id/name collisions against daemon functionalities according to #8069
Signed-off-by: Andrew C. Bodine <acbodine@us.ibm.com>
Upstream-commit: d25a65375c880017ac0c516389b0b7afde810517
Component: engine
2015-01-21 17:11:31 -08:00
13e3017817 Use State as embedded to Container
Signed-off-by: Alexandr Morozov <lk4d4math@gmail.com>
Upstream-commit: e0339d4b88989a31b72be02582eee72433d2f0ec
Component: engine
2014-09-03 00:01:11 +04:00
a195939eec Separate events subsystem
* Events subsystem merged from `server/events.go` and
  `utils/jsonmessagepublisher.go` and moved to `events/events.go`
* Only public interface for this subsystem is engine jobs
* There is two new engine jobs - `log_event` and `subscribers_count`
* There is auxiliary function `container.LogEvent` for logging events for
  containers

Docker-DCO-1.1-Signed-off-by: Alexandr Morozov <lk4d4math@gmail.com> (github: LK4D4)
[solomon@docker.com: resolve merge conflicts]
Signed-off-by: Solomon Hykes <solomon@docker.com>
Upstream-commit: 8d056423f8c433927089bd7eb6bc97abbc1ed502
Component: engine
2014-08-06 10:08:19 +00:00
4b7b997c6f Move "stop" to daemon/stop.go
This is part of an effort to break apart the deprecated server/ package

Docker-DCO-1.1-Signed-off-by: Solomon Hykes <solomon@docker.com> (github: shykes)
Upstream-commit: 03c07617cce5167ac854ff84637be9cccef39328
Component: engine
2014-08-01 14:16:59 -04:00