The mountinfo parser implemented via `fmt.Sscanf()` is slower than the one
using `strings.Split()` and `strconv.Atoi()`. This rewrite speeds it
up by roughly a factor of 8; here are the results from `go test -bench`:
> BenchmarkParsingScanf-4 300 22294112 ns/op
> BenchmarkParsingSplit-4 3000 2780703 ns/op
I tried other approaches, such as using `fmt.Sscanf()` for the first
three (integer) fields and `strings.Split()` for the rest, but it slows
things down considerably:
> BenchmarkParsingMixed-4 1000 8827058 ns/op
Note the old code uses `fmt.Sscanf()`, then a linear search for the '-'
separator field, then a split for the last 3 fields. The new code relies
on a single split.
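For illustration, here is a rough sketch of the split-based approach
(simplified; the real parser fills in every mountinfo field):
```go
import (
	"fmt"
	"strconv"
	"strings"
)

// parseLine is a simplified sketch of the split-based approach: a single
// strings.Split per line plus strconv.Atoi for the integer fields, instead
// of fmt.Sscanf. The real parser fills in every mountinfo field.
func parseLine(line string) (id, parent int, mountPoint, fstype string, err error) {
	fields := strings.Split(line, " ")
	if len(fields) < 10 {
		err = fmt.Errorf("too few fields in %q", line)
		return
	}
	if id, err = strconv.Atoi(fields[0]); err != nil {
		return
	}
	if parent, err = strconv.Atoi(fields[1]); err != nil {
		return
	}
	mountPoint = fields[4]
	// Optional fields end at the "-" separator; fstype follows it.
	for i := 6; i < len(fields)-1; i++ {
		if fields[i] == "-" {
			fstype = fields[i+1]
			break
		}
	}
	return
}
```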
I have also added more comments to aid in future development.
Finally, the test data is fixed not to have white space before the first field.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: c611f18a7f16d8aa878a5a5c7537d23a0937c40a
Component: engine
The flow of getSourceMount was:
1. get all entries from /proc/self/mountinfo
2. do a linear search for the `source` directory
3. if found, return its data
4. get the parent directory of `source`, goto 2
The repeated linear search through the whole mountinfo (which can have
thousands of records) is inefficient. Instead, let's just:
1. collect all the relevant records (only those mount points
that can be a parent of `source`)
2. find the record with the longest mount path and return its data, as
sketched below
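A minimal sketch of that idea (illustrative names and types, not the
exact code):
```go
import "strings"

// longestParentMount is a sketch of the new flow: among the mount points
// that could be parents of `source`, pick the one with the longest path.
// mounts maps mount point -> mount options (a stand-in for the real entry type).
func longestParentMount(source string, mounts map[string]string) (mnt, opts string) {
	for mp, o := range mounts {
		// Keep only mount points that are a parent of (or equal to) source.
		if mp == "/" || source == mp || strings.HasPrefix(source, mp+"/") {
			if len(mp) > len(mnt) {
				mnt, opts = mp, o
			}
		}
	}
	return mnt, opts
}
```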
This was tested manually with something like
```go
func TestGetSourceMount(t *testing.T) {
	mnt, flags, err := getSourceMount("/sys/devices/msr/")
	assert.NoError(t, err)
	t.Logf("mnt: %v, flags: %v", mnt, flags)
}
```
...but it relies on specific mount points being present on the system
used for testing.
[v2: add unit tests for ParentsFilter]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 871c957242df9f8c74faf751a2f14eb5178d4140
Component: engine
Functions `GetMounts()` and `parseMountTable()` return all the entries
as read and parsed from /proc/self/mountinfo. In many cases the caller
is interested in only one or a few entries, not all of them.
One good example is the `Mounted()` function, which looks for a specific
entry only. Another example is `RecursiveUnmount()`, which is only
interested in mounts under a specific path.
This commit adds a `filter` argument to `GetMounts()` to implement
two things:
1. filter out entries the caller is not interested in
2. stop processing once the caller has found what it wanted
`nil` can be passed to get a backward-compatible behavior, i.e. return
all the entries.
A few filters are implemented:
- `PrefixFilter`: filters out all entries not under `prefix`
- `SingleEntryFilter`: looks for a specific entry
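For illustration, the two filters can be sketched roughly like this
(assuming a filter signature that returns skip/stop flags, matching the
two behaviors listed earlier; types are simplified):
```go
import "strings"

// Info is a stand-in for the parsed mountinfo entry type; the real one
// carries every mountinfo field.
type Info struct {
	Mountpoint string
}

// FilterFunc mirrors the two behaviors described above: for each entry it
// reports whether to skip it and whether to stop parsing altogether.
type FilterFunc func(*Info) (skip, stop bool)

// PrefixFilter discards any entry whose mount point is not under prefix.
func PrefixFilter(prefix string) FilterFunc {
	return func(m *Info) (bool, bool) {
		skip := !strings.HasPrefix(m.Mountpoint, prefix)
		return skip, false // never stop early: more matches may follow
	}
}

// SingleEntryFilter keeps only the entry for mountpoint and stops once found.
func SingleEntryFilter(mountpoint string) FilterFunc {
	return func(m *Info) (bool, bool) {
		if m.Mountpoint == mountpoint {
			return false, true // keep this entry and stop parsing
		}
		return true, false
	}
}
```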
Finally, `Mounted()` is modified to use `SingleEntryFilter()`, and
`RecursiveUnmount()` to use `PrefixFilter()`.
Unit tests are added to check filters are working.
[v2: ditch NoFilter, use nil]
[v3: ditch GetMountsFiltered()]
[v4: add unit test for filters]
[v5: switch to gotestyourself]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: bb934c6aca3e77541dd4fd51b9ab2706294dadda
Component: engine
Prevent the tar output from changing by setting the format to
PAX and keeping the times truncated.
Without this change the archiver would produce tar archives with
different hashes under Go 1.10.
The addition of access and change time timestamps would
also cause diff comparisons to fail.
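A minimal sketch of the idea, assuming the Go 1.10 `archive/tar` API:
```go
import (
	"archive/tar"
	"time"
)

// normalizeHeader pins a header to the PAX format and truncates timestamps
// so archives stay byte-for-byte reproducible across Go versions.
func normalizeHeader(hdr *tar.Header) {
	hdr.Format = tar.FormatPAX // Header.Format was added in Go 1.10
	hdr.ModTime = hdr.ModTime.Truncate(time.Second)
	// Drop access/change times that Go 1.10 would otherwise emit as PAX
	// records, changing the archive's hash.
	hdr.AccessTime = time.Time{}
	hdr.ChangeTime = time.Time{}
}
```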
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: fb170206ba12752214630b269a40ac7be6115ed4
Component: engine
Remove invalid flush calls; a flush should only occur when the file
has been completely written, which is already handled, so remove these calls.
Ensure data gets written after EOF in the correct order and before close.
Remove gname and uname from the sum for hash compatibility.
Update the tarsum tests for the gname/uname removal.
Return a valid length after EOF.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: a422774e593b33bd287d9890544ad9e09b380d8c
Component: engine
When the authz response buffer limit is hit, perform a flush.
This prevents excessive buffer sizes, especially on large responses
(e.g. `/containers/<id>/archive` or `/containers/<id>/export`).
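A rough sketch of the idea (hypothetical writer type, not the exact
authz middleware):
```go
import (
	"bytes"
	"net/http"
)

// limitFlushWriter is a hypothetical sketch: it buffers the response for the
// authz plugin to inspect, but flushes to the client once the buffer grows
// past maxBuf so large responses do not pile up in memory.
type limitFlushWriter struct {
	rw     http.ResponseWriter
	buf    bytes.Buffer
	maxBuf int
}

func (w *limitFlushWriter) Write(p []byte) (int, error) {
	n, err := w.buf.Write(p)
	if err != nil {
		return n, err
	}
	if w.buf.Len() >= w.maxBuf {
		// Buffer limit hit: push what we have to the real writer.
		if _, err := w.rw.Write(w.buf.Bytes()); err != nil {
			return n, err
		}
		w.buf.Reset()
		if f, ok := w.rw.(http.Flusher); ok {
			f.Flush()
		}
	}
	return n, nil
}
```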
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 74f8e47352e71aad4015d8d9dea8f16e7a055863
Component: engine
Makes sure that if the user cancels a request, the daemon stops
trying to traverse the directory.
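A minimal sketch of the pattern, checking the request context inside the
walk callback (illustrative, not the exact daemon code):
```go
import (
	"context"
	"os"
	"path/filepath"
)

// walkDir aborts the directory traversal as soon as the request's context
// is cancelled instead of walking the whole tree.
func walkDir(ctx context.Context, root string) error {
	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // request cancelled: stop traversing
		default:
		}
		// ... process path ...
		return nil
	})
}
```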
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 9d46c4c138d7b3f7778c13fe84857712bd6c97a9
Component: engine
Sorting by mount point length can be implemented in a more
straightforward fashion, since Go 1.8 introduced sort.Slice()
with the ability to provide a less() function inline.
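For example (a sketch over a slice of entries with a `Mountpoint` field):
```go
import "sort"

// sortByMountpointLen orders the entries so the longest (deepest) mount
// point comes first, using sort.Slice with an inline less function.
func sortByMountpointLen(mounts []*Info) {
	sort.Slice(mounts, func(i, j int) bool {
		return len(mounts[i].Mountpoint) > len(mounts[j].Mountpoint)
	})
}
```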
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: a00310b54c0cdcafb402aeea92feca865da9fdf3
Component: engine
This makes `go test .` pass when run as a non-root user, skipping
those tests that require superuser privileges (for `mount`).
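A minimal sketch of the pattern (illustrative helper name):
```go
import (
	"os"
	"testing"
)

// requiresRoot skips a test when not running as root, so `go test .`
// still passes for unprivileged users.
func requiresRoot(t *testing.T) {
	t.Helper()
	if os.Getuid() != 0 {
		t.Skip("test requires root privileges to mount")
	}
}
```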
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Upstream-commit: 4aae77602a7540b4f977572f3fbdc0891ac57cab
Component: engine
dm_task_deferred_remove is not supported by all distributions, due to
outdated versions of devicemapper. However, in the case where the
devicemapper library was updated without rebuilding Docker (which can
happen in some distributions), we should attempt to dynamically load
the relevant object rather than try to link to it.
This can only be done if Docker was built dynamically, for obvious
reasons.
To avoid issues arising when dlsym(3) is unnecessary, the whole
dlsym(3) logic is gated behind a build flag that is disabled by
default (libdm_dlsym_deferred_remove).
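A rough sketch of what the dlsym(3) path can look like behind that build
flag (illustrative package and function names, not the exact upstream
wrapper):
```go
//go:build linux && cgo && libdm_dlsym_deferred_remove

package devicemapper

/*
#cgo CFLAGS: -D_GNU_SOURCE
#cgo LDFLAGS: -ldl
#include <dlfcn.h>

// Resolve dm_task_deferred_remove at runtime rather than linking against it,
// so a libdevmapper that gained the symbol after build time still works.
static int call_dm_task_deferred_remove(void *task)
{
	static int (*fn)(void *);

	if (fn == NULL)
		fn = (int (*)(void *)) dlsym(RTLD_DEFAULT, "dm_task_deferred_remove");
	if (fn == NULL)
		return 0; // symbol not found: deferred remove unsupported
	return fn(task);
}
*/
import "C"

import "unsafe"

// dmTaskDeferredRemove returns 0 when the loaded libdevmapper does not
// provide dm_task_deferred_remove, and the library's return value otherwise.
func dmTaskDeferredRemove(task unsafe.Pointer) int {
	return int(C.call_dm_task_deferred_remove(task))
}
```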
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Upstream-commit: 98fe4bd8f1e35f8e498e268f653a43cbfa31e751
Component: engine
Before this change, volume management was relying on the fact that
everything the plugin mounts is visible on the host within the plugin's
rootfs. In practice this caused some issues with mount leaks, so we
changed the behavior such that mounts are not visible on the plugin's
rootfs, but available outside of it, which breaks volume management.
To fix the issue, allow the plugin to scope the path correctly rather
than assuming that everything is visible in `p.Rootfs`.
In practice this is just scoping the `PropagatedMount` paths to the
correct host path.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: 0e5eaf8ee32662182147f5f62c1bfebef66f5c47
Component: engine
The plugin spec says that plugins can live in one of:
- /var/run/docker/plugins/<name>.sock
- /var/run/docker/plugins/<name>/<name>.sock
- /etc/docker/plugins/<name>.<json,spec>
- /etc/docker/plugins/<name>/<name>.<json,spec>
- /usr/lib/docker/plugins/<name>.<json,spec>
- /usr/lib/docker/plugins/<name>/<name>.<json,spec>
However, the plugin scanner, which is used by the volume list API, was
doing a `filepath.Walk`, which walks the entire tree for each of the
supported paths.
This means that even v2 plugins in
`/var/run/docker/plugins/<id>/<name>.sock` were being detected as v1
plugins.
When the v1 plugin loader tried to load such a plugin, it would log an
error that it couldn't find it, because the path doesn't match one of the
supported patterns (e.g. when in a subdirectory, the subdirectory name
must match the plugin name for the socket).
There is no behavior change, as the error only occurs on the `Scan()`
call, which passes names to the plugin registry when someone calls the
volume list API.
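A rough sketch of scanning only the locations the spec allows, instead of
walking the whole tree (hypothetical helper, simplified):
```go
import (
	"os"
	"path/filepath"
	"strings"
)

// scanPluginDir is a hypothetical sketch: look only at <root>/<name>.<ext>
// and <root>/<name>/<name>.<ext>, the two layouts the spec allows, instead
// of filepath.Walk-ing arbitrarily deep (which picks up v2 plugin dirs).
func scanPluginDir(root string, exts []string) ([]string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		if os.IsNotExist(err) {
			return nil, nil
		}
		return nil, err
	}
	var names []string
	for _, e := range entries {
		if e.IsDir() {
			// Only <name>/<name>.<ext> counts; deeper paths are ignored.
			for _, ext := range exts {
				p := filepath.Join(root, e.Name(), e.Name()+ext)
				if _, err := os.Stat(p); err == nil {
					names = append(names, e.Name())
					break
				}
			}
			continue
		}
		ext := filepath.Ext(e.Name())
		for _, want := range exts {
			if ext == want {
				names = append(names, strings.TrimSuffix(e.Name(), ext))
				break
			}
		}
	}
	return names, nil
}
```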
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: b27f70d45a0fbb744c17dda02f597ffa6a47d4d9
Component: engine
Signed-off-by: John Howard <jhoward@microsoft.com>
This re-coalesces the daemon stores which were split as part of the
original LCOW implementation.
This is part of the work discussed in https://github.com/moby/moby/issues/34617,
in particular see the document linked to in that issue.
Upstream-commit: ce8e529e182bde057cdfafded62c210b7293b8ba
Component: engine
This protects the daemon from volume plugins that are slow or
deadlocked.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: b15f8d2d4f054a87052a7065c50441f7e8479fa9
Component: engine
The built-in Go gzip library is single-threaded and fairly slow
at decompressing. It also only decompresses on demand, rather than
pipelining decompression.
This change switches to using the external pigz command
for gzip decompression, as opposed to using the built-in
Go implementation. The pigz binary is not vendored; it is only used
if it is autodetected on the host OS.
This also switches to using a context, rather than a manually
managed channel, for cancellation and synchronization.
There is a little bit of awkwardness around having
to cancel manually in the error cases.
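A rough sketch of the approach (illustrative helper, not the exact
upstream code):
```go
import (
	"compress/gzip"
	"context"
	"io"
	"os/exec"
)

// pigzReader wraps the pigz stdout pipe so that Close also reaps the process.
type pigzReader struct {
	io.ReadCloser
	cmd *exec.Cmd
}

func (r *pigzReader) Close() error {
	r.ReadCloser.Close()
	return r.cmd.Wait()
}

// gzipDecompress pipes the input through an external pigz process for
// parallel decompression and falls back to the standard library when pigz
// is not installed.
func gzipDecompress(ctx context.Context, in io.Reader) (io.ReadCloser, error) {
	if _, err := exec.LookPath("pigz"); err != nil {
		// pigz not detected on the host: use the built-in serial decoder.
		return gzip.NewReader(in)
	}
	cmd := exec.CommandContext(ctx, "pigz", "-d", "-c")
	cmd.Stdin = in
	out, err := cmd.StdoutPipe()
	if err != nil {
		return nil, err
	}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	// Cancelling ctx kills pigz; Close on the returned reader reaps it.
	return &pigzReader{ReadCloser: out, cmd: cmd}, nil
}
```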
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Upstream-commit: fd35494a251a497c359f706f61f33e689e2af678
Component: engine
When a recursive unmount fails, don't bother parsing the mount table to check
if what we expected to be a mountpoint is still mounted. `EINVAL` is
returned when you try to unmount something that is not a mountpoint; the
other cases of `EINVAL` would not apply here unless everything is just
wrong. Parsing the mount table over and over is relatively expensive,
especially in the code path that it's in.
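A minimal sketch of the pattern, using golang.org/x/sys/unix:
```go
import "golang.org/x/sys/unix"

// unmountIgnoreNotMounted treats EINVAL ("not a mount point") as success
// instead of re-parsing /proc/self/mountinfo to double-check.
func unmountIgnoreNotMounted(target string) error {
	err := unix.Unmount(target, 0)
	if err == nil || err == unix.EINVAL {
		return nil
	}
	return err
}
```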
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Upstream-commit: dd2108766017c13a19bdbfd1a56cd1358580e0bb
Component: engine