Right now devicemapper mounts thin device using online discards by default
and passes mount option "discard". Generally people discourage usage of
online discards as they can be a drain on performance. Instead it is
recommended to use fstrim once in a while to reclaim the space.
In case of containers, we recommend to keep data volumes separate. So
there might not be lot of rm, unlink operations going on and there might
not be lot of space being freed by containers. So it might not matter
much if we don't reclaim that free space in pool.
User can still pass mount option explicitly using dm.mountopt=discard to
enable discards if they would like to.
So this is more like setting the containers by default for better performance
instead of better space efficiency in pool. And user can change the behavior
if they don't like default behavior.
Reported-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 04adaaf1ee1688ff48cb8f541dcb80e965f45080
Component: engine
If device is being reactivated before it could go away and deferred
deactivation is scheduled on it, cancel it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: ddc8acebecfdc7dbc0357f5c009fb3ee0a2ae906
Component: engine
This will help with debugging as one could just do "docker info" and figure
out of deferred removal is enabled or not.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 66a53819aea2ab1ab0d50be1f8d32fcb2427cd78
Component: engine
Make use of deferred removal of devices.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: e37c7203bb1d840e9383ac08bf87afda3e722344
Component: engine
Provide a new command line knob dm.deferred_device_removal which will enable
deferred device deactivation if driver and library support it.
This patch also checks for library support and driver version.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 15c158b20725fd62e2ee0a72ffaf1617852cd0d9
Component: engine
This provides an override for forcing the daemon to still attempt
running the devicemapper driver even when udev sync is not supported.
Intended to be a very clear impairment for those choosing to use it. If
udev sync is false, there will still be an error in the daemon logs,
even when the override is in place. The docs have an explicit WARNING.
Including link to the docs for users that encounter this daemon error
during an upgrade.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Upstream-commit: 0e21782de5c038dfa3cfdfc7655b9e6b143baa7b
Component: engine
Right now we try device removal at the interval of 10ms and keep on trying
till either device is removed or 10 seconds are over. That means if device
is busy, we will try 1000 times in those 10 seconds.
Sounds too high a frequency of deivce removal retrial. All the logs are
filled easily. I think it is a good idea to slow down a bit and retry at
the interval of 100ms instead of 10ms.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: c737800b7faced4b53854c8cb6766ebe58a3c3e9
Component: engine
During device removal, we are first waiting for device to close() in a tight
loop for 10 seconds. I am not sure why do we need it. First of all we come
here once the umount() is successful so device should be free. For some reason
of device is temporarily busy, then removeDevice() logic retries device removal
logic in a loop for 10 seconds and that should cover it. Can't see why one
more 10 seoncds loop is required before attempting device removal.
One loop should be able to cover all the temporary device busy conditions and
if condition is not temporary then 10 seconds loop is not going to help anyway.
So instead of two loops of 10 seconds each, I am converting it to a single
loop of 20 seconds. May be 10 second loop is good enough but for now I am
keeping it 20 seconds to avoid any regressions.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: f74d12012c21349b2bd51d9c395a99331ff0a9a5
Component: engine
Currently in device removal path (device deactivation), we wait
for 10 seconds for devive to actually go away. waitRemove().
In current code this is not required. If dm removal task has completed
and one has done the wait on udev cookie, then device is gone and there
is no need to write another loop to wait for device removal.
This patch removes the waitRemove() which waits for 10 seconds after
device removal. This seems unnecessary.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: dbf04ec4e2a6b4fe73f7f300918a906c0ff1a37b
Component: engine
devmapper graph driver retries device removal 1000 times in case of failure
and if this fills up console with 1000 messages (when daemon is running in
debug mode). So remove these debug messages.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: cb7c893275c32ddfa775c3f22869a9c211024c71
Component: engine
There are issues with libdm logging. Right now if docker daemon is run
in debug mode, logging by libdm is too verbose. And if a device can't
be removed, thousands of messages fill the console and one can not see
what's going on.
This patch removes devicemapper.LogInitVerbose() call as that call will
only work if docker was not registering its own log handler with libdm.
For some reason docker registers one with libdm and libdm hands over
all the messages to docker (including debug ones). And now it is up to
devmapper backend to figure out which ones should go to console and
which ones should not.
So by default log only fatal messages from libdm. One can easily modify
the code to change it for debugging purposes.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: e07d3cd9acf14219f33e12375fb8c2e3fe02ad0c
Component: engine
Any file which starts with "." is not a valid metadata file. Skip it
during device map construction.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 080a6f1e4b7c33ce9b7b1b90cbf8fd90658226c2
Component: engine
moar information for the information gods
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Upstream-commit: 4cfe9df0a9c206c368a90f460fea8fab197265d9
Component: engine
when initializing the devmapper driver, attempt to sync udev and device
mapper. If udev sync is not supported, print a warning. Eventually we'll
likely bail here to avoid unpredictable behavior for users.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Upstream-commit: 022e1232f84966c4b70a612bc35463ebb58e3137
Component: engine
Presenly the "Data file:" shows either the loopback _file_ or the block device.
With this, the "Data file:" will always show the device, and if it is a
loopback, then there will additionally be a "Data loop file:".
(Same for "Metadata file:")
Signed-off-by: Vincent Batts <vbatts@redhat.com>
Upstream-commit: 09c033ff872334cdcc45172ac57dbf21573481ef
Component: engine
syscall.Unmount failed sometimes when user interrupted exporting,
for example a Ctrl-C, or pipe to commands which closed the pipe early,
like "docker export <container_name> | file -"; this syscall.Unmount
could sometimes return EBUSY and didn't actually umount the filesystem;
which would cause a following export command fail to mount;
change to lazy Unmount with MNT_DETACH can fix the problem, this is
the same behavior as in Shutdown;
```text
time="2015-01-03T21:27:26Z" level=error msg="Warning: error unmounting device
34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e: device or resource busy"
```
```
$ docker export thirsty_ardinghelli | file -
/dev/stdin: POSIX tar archive
time="2015-01-03T21:58:17Z" level=fatal msg="write /dev/stdout: broken pipe"
$ docker export thirsty_ardinghelli
time="2015-01-03T21:54:33Z" level=fatal msg="Error: thirsty_ardinghelli: Error getting container
34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e from driver devicemapper:
Error mounting '/dev/mapper/docker-253:0-3148372-34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e'
on '/var/lib/docker/devicemapper/mnt/34a3e77cdbca17ceffd0636aee0415bb412996adb12360bfe2585ce30467fa8e': device or resource busy"
```
Signed-off-by: Derek Che <drc@yahoo-inc.com>
Upstream-commit: 9bbed5ab4ceaff5e78c21f0fa2d84de5ffd41f94
Component: engine
Use transaction logic during device deletion and do rollback if transaction
is not complete. Following is the sequence of events.
- Open transaction and save to metafile
- Delete device from pool
- Delete device metadata file from disk
- Close Transaction
If docker crashes without closing transaction then rollback will take
place upon next docker start.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 17b75a21a667a27a9a27565ab282cd615dbdb66e
Component: engine
Finally this patch uses the notion of transaction for device or snapshot
device creation.
Following is sequence of event.
- Open a trasaction and save details in a file.
- Create a new device/snapshot device
- If a new device id is used, refresh transaction with new device id details.
- Create device metadata file
- Close transaction.
If docker crashes anywhere in between without closing transaction, then
upon next start, docker will figure out that there was a pending transaction
and it will roll back transaction. That is it will do following.
- Delete Device from pool
- Delete device metadata file
- Remove transaction file to mark no transaction is pending.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: c115c4aa45ba82f27859b0afba5724d437857879
Component: engine
Finally, we seem to have all the bits to keep track of all used device
Ids and find a free device Id to use when creating a new device. Start
using it.
Ideally we should completely move away from retry logic when pool returns
-EEXISTS. For now I have retained that logic and I simply output a warning.
When things are stable, we should be able to get rid of it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: e28a419e1197bf50bbb378b02f0226c3115edeaa
Component: engine
Open code createDevice() and createSnapDevice() and move all the logic
in the caller.
This is a sheer code reorganization so that all device Id allocation
logic is in one function. That way in case of erros, one can easily
cleanup and mark device Id free again. (Later patches benefit from
it).
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 14d0dd855ee1e7cd1a3185c3d5a00e7afccb5c43
Component: engine
Right now we are accessing devices.NextDeviceId directly and also
incrementing it at various places.
Instead provide a helper function which is responsile for
incrementing NextDeviceId and return next deviceId.
This is just code structuring. This will help later once we
convert this function to find a free device Id and it goes
through a bitmap of used/free device Ids.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: a44c23fe6604d1de59c64bbb9dc234c7c3dbede9
Component: engine
When docker starts, build a used/free Device Id map from the per
device meta files we already have. These meta files have the data
which device Ids are in use. Parse these files and mark device as
used in the map.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 39dc7829dea87d4be8e6e9b2a598fb354ebf4ba0
Component: engine
Currently devicemapper backend does not keep track of used device Ids in
the pool. It tries a device Id and if that device Id exists in pool, it
tries with a different Id and keeps on doing this in a loop till it succeeds.
This worked fine so far but now we are moving to transaction based
device creation and deletion. We will keep deviceId information in
transaction which will be rolled back if docker crashed before transaction
was complete.
If we store a deviceId in transaction and later figure out it already
existed in pool and docker crashed, then we will rollback and remove
that existing device Id from pool (which we should not have).
That means, we should know free device Id in pool in advance before
we put that device Id in transaction.
Hence this patch creates a bitmap (one bit each for a deviceId), and
sets the bit if device Id is used otherwise resets it. This patch
is just preparing the ground right now. Actual usage will follow
in later patches.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 4d39e056aac2fadffcb8560101f3c31a2b7db3ae
Component: engine
Right now setupBaseImage() uses deleteDevice() to delete uninitialized
base image while rest of the code uses DeleteDevice(). Change it and
use a common function everywhere for the sake of uniformity.
I can't see what harm can be done by doing little extra locking done
by DeleteDevice().
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 359a38b26a164f430c79fe542babb77c6e48dcc3
Component: engine
Very soon we will have the notion of an open transaction and keep its
details in a metafile.
When a new transaction is opened, we allocate a new transaction Id,
do the device creation/deletion and then we will close the transaction.
I thought that OpenTransactionId better represents the semantics of
transaction Id associated with an open transaction instead of NewtransactionId.
This patch just does the renaming. No functionality change.
I have also introduced a structure "Transaction" which will keep all
the details associated with a transaction. Later patches will add more
fields in this structure.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: f078bcd8e50913fd8b05022ebd047c5a1f2e3d52
Component: engine
Currently new transaction Id is created using allocateTransactionId()
function. This function takes NewTransactionId and bumps up by one
to create NewTransactionId.
I think ideally we should be bumping up devices.TransactionId by 1
to come up with NewTransactionId. Because idea is that devices.TransactionId
contains the current pool transaction Id and to come up with a new
transaction Id bump it up by one.
Current code is not wrong as we are keeping NewTransactionId and
TransactionId in sync. But it will be more direct if we look at
devices.TransactionId to come up with NewTransactionId. That way
we don't have to even initialize NewTransactionId during startup
as first time somebody wants to do a transaction, it will be
allocated fresh.
So simplify the code a bit. No functionality change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 7b0a1b814b8f13e30df466dd66c3fdc2114eac28
Component: engine
Currently updatePoolTransactionId() checks if NewTransactionId and
TransactionId are not same only then update the transaction Id in pool. This
check is redundant. Currently we call updatePoolTransactionId() only from
two places and both of these first allocate a new transaction Id.
Also updatePoolTransactionId() should only be called after allocating
new transaction Id otherwise it does not make any sense.
Remove the redundant check and reduce confusion.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: 6d347aeb6984ebdcb1051212ab3103880ef69ab0
Component: engine
Create two new helper functions for device and snap device creation. These
functions will not only create the device and also register the device.
Again, makes the code structure better and keeps all transaction logic
contained to functions instead of spilling over into functions like
setupBaseImage or AddDevice().
Just the code reorganization. No functionality change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Upstream-commit: ad9118c696c0953ec48eec15ea4b7546296d7c20
Component: engine