Commit Graph

15 Commits

Author SHA1 Message Date
3daa4b4cdd Fix stats collector spinning CPU if no stats are collected
Commit fd0e24b7189374e0fe7c55b6d26ee916d3ee1655 changed
the stats collection loop to use a `sleep()` instead
of `time.Tick()` in the for-loop.

This change caused a regression in situations where
no stats are being collected, or an error is hit
in the loop (in which case the loop would `continue`,
and the `sleep()` is not hit).

This patch puts the sleep at the start of the loop
to guarantee it's always hit.

This will delay the sampling, which is similar to the
behavior before fd0e24b7189374e0fe7c55b6d26ee916d3ee1655.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 481b8e54b45955e40075f49a9af321afce439320
Component: engine
2018-03-15 17:56:15 +01:00
e11a0c3a06 Merge pull request #36519 from stevvooe/resilient-cpu-sampling
daemon/stats: more resilient cpu sampling
Upstream-commit: 623b1a5c3c7d6b4d6d5943b64bd9ae6a7813786e
Component: engine
2018-03-09 14:34:45 -08:00
aa0ca25049 daemon/stats: more resilient cpu sampling
To avoid noise in sampling CPU usage metrics, we now sample the system
usage closer to the actual response from the underlying runtime. Because
the response from the runtime may be delayed, this makes the sampling
more resilient in loaded conditions. In addition to this, we also
replace the tick with a sleep to avoid situations where ticks can backup
under loaded conditions.

The trade off here is slightly more load reading the system CPU usage
for each container. There may be an optimization required for large
amounts of containers but the cost is on the order of 15 ms per 1000
containers. If this becomes a problem, we can time slot the sampling,
but the complexity may not be worth it unless we can test further.

Unfortunately, there aren't really any good tests for this condition.
Triggering this behavior is highly system dependent. As a matter of
course, we should qualify the fix with the users that are affected.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: fd0e24b7189374e0fe7c55b6d26ee916d3ee1655
Component: engine
2018-03-07 13:20:21 -08:00
804775ddc9 daemon/stats: remove obnoxious types file
While a `types.go` file is handly when there are a lot of record types,
it is completely obnoxious when used for concrete, utility types with a
struct, new function and method set in the same file. This change
removes the `types.go` file in favor of the simpler approach.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Upstream-commit: 244e59e94f153af82e6c3bd8a6c200a48d3cea60
Component: engine
2018-03-05 15:59:04 -08:00
be83c11fb0 Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
Upstream-commit: 4f0d95fa6ee7f865597c03b9e63702cdcb0f7067
Component: engine
2018-02-05 16:51:57 -05:00
fd08bae89c Remove redundant build-tags
Files that are suffixed with `_linux.go` or `_windows.go` are
already only built on Linux / Windows, so these build-tags
were redundant.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 6ed1163c98703f8dd0693cecbadc84d2cda811c3
Component: engine
2017-12-18 17:41:53 +01:00
eefbd135ae Remove solaris build tag and `contrib/mkimage/solaris
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
Upstream-commit: 4785f1a7ab7ec857dc3ca849ee6ecadf519ef30e
Component: engine
2017-11-02 00:01:46 +00:00
d78181e968 Remove solaris files
For obvious reasons that it is not really supported now.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Upstream-commit: 5a9b5f10cf967f31f0856871ad08f9a0286b4a46
Component: engine
2017-10-24 15:39:34 -04:00
30f1b651e2 Remove string checking in API error handling
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.

Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: ebcb7d6b406fe50ea9a237c73004d75884184c33
Component: engine
2017-08-15 16:01:11 -04:00
d659edcaf5 Update logrus to v1.0.1
Fixes case sensitivity issue

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Upstream-commit: 1009e6a40b295187e038b67e184e9c0384d95538
Component: engine
2017-07-31 13:16:46 -07:00
9e646d8386 Return an empty stats if "container not found"
If we get "container not found" error from containerd, it's possibly
because that this container has already been stopped. It will be ok to
ignore this error and just return an empty stats.

Signed-off-by: Yuanhong Peng <pengyuanhong@huawei.com>
Upstream-commit: 4a6cbf9bcb78d38c48ef963f585f0fadf733e101
Component: engine
2017-07-10 16:30:48 +08:00
008c8eb206 Do not treat C.sysconf(C._SC_NPROCESSORS_ONLN) non-zero errno as error
Treat return code -1 as error instead.

People from glibc say that errno is undefined in case of successful
sysconf call according to POSIX standard:
Glibc bug: https://sourceware.org/bugzilla/show_bug.cgi?id=21536

More over in sysconf man it is wrongly said that "errno is not changed"
on success. So I've created a bug to man-pages:
https://bugzilla.kernel.org/show_bug.cgi?id=195955

Background: Glibc's sysconf(_SC_NPROCESSORS_ONLN) changes errno to
ENOENT, if there is no /sys/devices/system/cpu/online file, while
the call itself is successful. In Virtuozzo containers we prohibit
most of sysfs files for security reasons. So we have Run():daemon
/stats/collector.go infinitely loop never actualy collecting stats
from publisher pairs.

v2: add comment

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Upstream-commit: dec084962eab41eb20b1808955de34cfec4fc8b3
Component: engine
2017-06-01 18:23:49 +03:00
6f5f3ec34f Correct CPU usage calculation in presence of offline CPUs and newer Linux
In https://github.com/torvalds/linux/commit/5ca3726 (released in v4.7-rc1) the
content of the `cpuacct.usage_percpu` file in sysfs was changed to include both
online and offline cpus. This broke the arithmetic in the stats helpers used by
`docker stats`, since it was using the length of the PerCPUUsage array as a
proxy for the number of online CPUs.

Add current number of online CPUs to types.StatsJSON and use it in the
calculation.

Keep a fallback to `len(v.CPUStats.CPUUsage.PercpuUsage)` so this code
continues to work when talking to an older daemon. An old client talking to a
new daemon will ignore the new field and behave as before.

Fixes #28941.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Upstream-commit: 115f91d7575d6de6c7781a96a082f144fd17e400
Component: engine
2017-03-10 10:24:33 +00:00
50715b9350 Send "Name" and "ID" when stating stopped containers
When `docker stats` stopped containers, client will get empty stats data,
this commit will gurantee client always get "Name" and "ID" field, so
that it can format with `ID` and `Name` fields successfully.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Upstream-commit: eb3a7c232908377f4d4c0b0ec6fd35b0be138fdd
Component: engine
2017-02-09 09:46:59 +08:00
22fd058892 Extract daemon statsCollector to its own package
Signed-off-by: Vincent Demeester <vincent@sbr.pm>
Upstream-commit: 835971c6fdaf6ea35a0e7e45f6d9a09fd5f03ce1
Component: engine
2017-01-04 18:18:30 +01:00