mirror of
https://github.com/torvalds/linux.git
synced 2025-04-09 14:45:27 +00:00

-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90r2wAKCRCRxhvAZXjc ouC6AQCk3MoqskN0WeNcaZT23dB7dHbEhf/7YXOFC9MFRMKXqQD9Fbn95+GuIe3U nBVPbVyQfDtfXE08ml6gbDJrCsbkkQI= =Xm1C -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount namespace updates from Christian Brauner: "This expands the ability of anonymous mount namespaces: - Creating detached mounts from detached mounts Currently, detached mounts can only be created from attached mounts. This limitaton prevents various use-cases. For example, the ability to mount a subdirectory without ever having to make the whole filesystem visible first. The current permission modelis: (1) Check that the caller is privileged over the owning user namespace of it's current mount namespace. (2) Check that the caller is located in the mount namespace of the mount it wants to create a detached copy of. While it is not strictly necessary to do it this way it is consistently applied in the new mount api. This model will also be used when allowing the creation of detached mount from another detached mount. The (1) requirement can simply be met by performing the same check as for the non-detached case, i.e., verify that the caller is privileged over its current mount namespace. To meet the (2) requirement it must be possible to infer the origin mount namespace that the anonymous mount namespace of the detached mount was created from. The origin mount namespace of an anonymous mount is the mount namespace that the mounts that were copied into the anonymous mount namespace originate from. In order to check the origin mount namespace of an anonymous mount namespace the sequence number of the original mount namespace is recorded in the anonymous mount namespace. With this in place it is possible to perform an equivalent check (2') to (2). The origin mount namespace of the anonymous mount namespace must be the same as the caller's mount namespace. To establish this the sequence number of the caller's mount namespace and the origin sequence number of the anonymous mount namespace are compared. The caller is always located in a non-anonymous mount namespace since anonymous mount namespaces cannot be setns()ed into. The caller's mount namespace will thus always have a valid sequence number. The owning namespace of any mount namespace, anonymous or non-anonymous, can never change. A mount attached to a non-anonymous mount namespace can never change mount namespace. If the sequence number of the non-anonymous mount namespace and the origin sequence number of the anonymous mount namespace match, the owning namespaces must match as well. Hence, the capability check on the owning namespace of the caller's mount namespace ensures that the caller has the ability to copy the mount tree. - Allow mount detached mounts on detached mounts Currently, detached mounts can only be mounted onto attached mounts. This limitation makes it impossible to assemble a new private rootfs and move it into place. Instead, a detached tree must be created, attached, then mounted open and then either moved or detached again. Lift this restriction. In order to allow mounting detached mounts onto other detached mounts the same permission model used for creating detached mounts from detached mounts can be used (cf. above). Allowing to mount detached mounts onto detached mounts leaves three cases to consider: (1) The source mount is an attached mount and the target mount is a detached mount. This would be equivalent to moving a mount between different mount namespaces. A caller could move an attached mount to a detached mount. The detached mount can now be freely attached to any mount namespace. This changes the current delegatioh model significantly for no good reason. So this will fail. (2) Anonymous mount namespaces are always attached fully, i.e., it is not possible to only attach a subtree of an anoymous mount namespace. This simplifies the implementation and reasoning. Consequently, if the anonymous mount namespace of the source detached mount and the target detached mount are the identical the mount request will fail. (3) The source mount's anonymous mount namespace is different from the target mount's anonymous mount namespace. In this case the source anonymous mount namespace of the source mount tree must be freed after its mounts have been moved to the target anonymous mount namespace. The source anonymous mount namespace must be empty afterwards. By allowing to mount detached mounts onto detached mounts a caller may do the following: fd_tree1 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE) fd_tree2 = open_tree(-EBADF, "/tmp", OPEN_TREE_CLONE) fd_tree1 and fd_tree2 refer to two different detached mount trees that belong to two different anonymous mount namespace. It is important to note that fd_tree1 and fd_tree2 both refer to the root of their respective anonymous mount namespaces. By allowing to mount detached mounts onto detached mounts the caller may now do: move_mount(fd_tree1, "", fd_tree2, "", MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_EMPTY_PATH) This will cause the detached mount referred to by fd_tree1 to be mounted on top of the detached mount referred to by fd_tree2. Thus, the detached mount fd_tree1 is moved from its separate anonymous mount namespace into fd_tree2's anonymous mount namespace. It also means that while fd_tree2 continues to refer to the root of its respective anonymous mount namespace fd_tree1 doesn't anymore. This has the consequence that only fd_tree2 can be moved to another anonymous or non-anonymous mount namespace. Moving fd_tree1 will now fail as fd_tree1 doesn't refer to the root of an anoymous mount namespace anymore. Now fd_tree1 and fd_tree2 refer to separate detached mount trees referring to the same anonymous mount namespace. This is conceptually fine. The new mount api does allow for this to happen already via: mount -t tmpfs tmpfs /mnt mkdir -p /mnt/A mount -t tmpfs tmpfs /mnt/A fd_tree3 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE | AT_RECURSIVE) fd_tree4 = open_tree(-EBADF, "/mnt/A", 0) Both fd_tree3 and fd_tree4 refer to two different detached mount trees but both detached mount trees refer to the same anonymous mount namespace. An as with fd_tree1 and fd_tree2, only fd_tree3 may be moved another mount namespace as fd_tree3 refers to the root of the anonymous mount namespace just while fd_tree4 doesn't. However, there's an important difference between the fd_tree3/fd_tree4 and the fd_tree1/fd_tree2 example. Closing fd_tree4 and releasing the respective struct file will have no further effect on fd_tree3's detached mount tree. However, closing fd_tree3 will cause the mount tree and the respective anonymous mount namespace to be destroyed causing the detached mount tree of fd_tree4 to be invalid for further mounting. By allowing to mount detached mounts on detached mounts as in the fd_tree1/fd_tree2 example both struct files will affect each other. Both fd_tree1 and fd_tree2 refer to struct files that have FMODE_NEED_UNMOUNT set. To handle this we use the fact that @fd_tree1 will have a parent mount once it has been attached to @fd_tree2. When dissolve_on_fput() is called the mount that has been passed in will refer to the root of the anonymous mount namespace. If it doesn't it would mean that mounts are leaked. So before allowing to mount detached mounts onto detached mounts this would be a bug. Now that detached mounts can be mounted onto detached mounts it just means that the mount has been attached to another anonymous mount namespace and thus dissolve_on_fput() must not unmount the mount tree or free the anonymous mount namespace as the file referring to the root of the namespace hasn't been closed yet. If it had been closed yet it would be obvious because the mount namespace would be NULL, i.e., the @fd_tree1 would have already been unmounted. If @fd_tree1 hasn't been unmounted yet and has a parent mount it is safe to skip any cleanup as closing @fd_tree2 will take care of all cleanup operations. - Allow mount propagation for detached mount trees In commit ee2e3f50629f ("mount: fix mounting of detached mounts onto targets that reside on shared mounts") I fixed a bug where propagating the source mount tree of an anonymous mount namespace into a target mount tree of a non-anonymous mount namespace could be used to trigger an integer overflow in the non-anonymous mount namespace causing any new mounts to fail. The cause of this was that the propagation algorithm was unable to recognize mounts from the source mount tree that were already propagated into the target mount tree and then reappeared as propagation targets when walking the destination propagation mount tree. When fixing this I disabled mount propagation into anonymous mount namespaces. Make it possible for anonymous mount namespace to receive mount propagation events correctly. This is now also a correctness issue now that we allow mounting detached mount trees onto detached mount trees. Mark the source anonymous mount namespace with MNTNS_PROPAGATING indicating that all mounts belonging to this mount namespace are currently in the process of being propagated and make the propagation algorithm discard those if they appear as propagation targets" * tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits) selftests: test subdirectory mounting selftests: add test for detached mount tree propagation fs: namespace: fix uninitialized variable use mount: handle mount propagation for detached mount trees fs: allow creating detached mounts from fsmount() file descriptors selftests: seventh test for mounting detached mounts onto detached mounts selftests: sixth test for mounting detached mounts onto detached mounts selftests: fifth test for mounting detached mounts onto detached mounts selftests: fourth test for mounting detached mounts onto detached mounts selftests: third test for mounting detached mounts onto detached mounts selftests: second test for mounting detached mounts onto detached mounts selftests: first test for mounting detached mounts onto detached mounts fs: mount detached mounts onto detached mounts fs: support getname_maybe_null() in move_mount() selftests: create detached mounts from detached mounts fs: create detached mounts from detached mounts fs: add may_copy_tree() fs: add fastpath for dissolve_on_fput() fs: add assert for move_mount() fs: add mnt_ns_empty() helper ...
643 lines
16 KiB
C
643 lines
16 KiB
C
// SPDX-License-Identifier: GPL-2.0-only
|
|
/*
|
|
* linux/fs/pnode.c
|
|
*
|
|
* (C) Copyright IBM Corporation 2005.
|
|
* Author : Ram Pai (linuxram@us.ibm.com)
|
|
*/
|
|
#include <linux/mnt_namespace.h>
|
|
#include <linux/mount.h>
|
|
#include <linux/fs.h>
|
|
#include <linux/nsproxy.h>
|
|
#include <uapi/linux/mount.h>
|
|
#include "internal.h"
|
|
#include "pnode.h"
|
|
|
|
/* return the next shared peer mount of @p */
|
|
static inline struct mount *next_peer(struct mount *p)
|
|
{
|
|
return list_entry(p->mnt_share.next, struct mount, mnt_share);
|
|
}
|
|
|
|
static inline struct mount *first_slave(struct mount *p)
|
|
{
|
|
return list_entry(p->mnt_slave_list.next, struct mount, mnt_slave);
|
|
}
|
|
|
|
static inline struct mount *last_slave(struct mount *p)
|
|
{
|
|
return list_entry(p->mnt_slave_list.prev, struct mount, mnt_slave);
|
|
}
|
|
|
|
static inline struct mount *next_slave(struct mount *p)
|
|
{
|
|
return list_entry(p->mnt_slave.next, struct mount, mnt_slave);
|
|
}
|
|
|
|
static struct mount *get_peer_under_root(struct mount *mnt,
|
|
struct mnt_namespace *ns,
|
|
const struct path *root)
|
|
{
|
|
struct mount *m = mnt;
|
|
|
|
do {
|
|
/* Check the namespace first for optimization */
|
|
if (m->mnt_ns == ns && is_path_reachable(m, m->mnt.mnt_root, root))
|
|
return m;
|
|
|
|
m = next_peer(m);
|
|
} while (m != mnt);
|
|
|
|
return NULL;
|
|
}
|
|
|
|
/*
|
|
* Get ID of closest dominating peer group having a representative
|
|
* under the given root.
|
|
*
|
|
* Caller must hold namespace_sem
|
|
*/
|
|
int get_dominating_id(struct mount *mnt, const struct path *root)
|
|
{
|
|
struct mount *m;
|
|
|
|
for (m = mnt->mnt_master; m != NULL; m = m->mnt_master) {
|
|
struct mount *d = get_peer_under_root(m, mnt->mnt_ns, root);
|
|
if (d)
|
|
return d->mnt_group_id;
|
|
}
|
|
|
|
return 0;
|
|
}
|
|
|
|
static int do_make_slave(struct mount *mnt)
|
|
{
|
|
struct mount *master, *slave_mnt;
|
|
|
|
if (list_empty(&mnt->mnt_share)) {
|
|
if (IS_MNT_SHARED(mnt)) {
|
|
mnt_release_group_id(mnt);
|
|
CLEAR_MNT_SHARED(mnt);
|
|
}
|
|
master = mnt->mnt_master;
|
|
if (!master) {
|
|
struct list_head *p = &mnt->mnt_slave_list;
|
|
while (!list_empty(p)) {
|
|
slave_mnt = list_first_entry(p,
|
|
struct mount, mnt_slave);
|
|
list_del_init(&slave_mnt->mnt_slave);
|
|
slave_mnt->mnt_master = NULL;
|
|
}
|
|
return 0;
|
|
}
|
|
} else {
|
|
struct mount *m;
|
|
/*
|
|
* slave 'mnt' to a peer mount that has the
|
|
* same root dentry. If none is available then
|
|
* slave it to anything that is available.
|
|
*/
|
|
for (m = master = next_peer(mnt); m != mnt; m = next_peer(m)) {
|
|
if (m->mnt.mnt_root == mnt->mnt.mnt_root) {
|
|
master = m;
|
|
break;
|
|
}
|
|
}
|
|
list_del_init(&mnt->mnt_share);
|
|
mnt->mnt_group_id = 0;
|
|
CLEAR_MNT_SHARED(mnt);
|
|
}
|
|
list_for_each_entry(slave_mnt, &mnt->mnt_slave_list, mnt_slave)
|
|
slave_mnt->mnt_master = master;
|
|
list_move(&mnt->mnt_slave, &master->mnt_slave_list);
|
|
list_splice(&mnt->mnt_slave_list, master->mnt_slave_list.prev);
|
|
INIT_LIST_HEAD(&mnt->mnt_slave_list);
|
|
mnt->mnt_master = master;
|
|
return 0;
|
|
}
|
|
|
|
/*
|
|
* vfsmount lock must be held for write
|
|
*/
|
|
void change_mnt_propagation(struct mount *mnt, int type)
|
|
{
|
|
if (type == MS_SHARED) {
|
|
set_mnt_shared(mnt);
|
|
return;
|
|
}
|
|
do_make_slave(mnt);
|
|
if (type != MS_SLAVE) {
|
|
list_del_init(&mnt->mnt_slave);
|
|
mnt->mnt_master = NULL;
|
|
if (type == MS_UNBINDABLE)
|
|
mnt->mnt.mnt_flags |= MNT_UNBINDABLE;
|
|
else
|
|
mnt->mnt.mnt_flags &= ~MNT_UNBINDABLE;
|
|
}
|
|
}
|
|
|
|
/*
|
|
* get the next mount in the propagation tree.
|
|
* @m: the mount seen last
|
|
* @origin: the original mount from where the tree walk initiated
|
|
*
|
|
* Note that peer groups form contiguous segments of slave lists.
|
|
* We rely on that in get_source() to be able to find out if
|
|
* vfsmount found while iterating with propagation_next() is
|
|
* a peer of one we'd found earlier.
|
|
*/
|
|
static struct mount *propagation_next(struct mount *m,
|
|
struct mount *origin)
|
|
{
|
|
/* are there any slaves of this mount? */
|
|
if (!IS_MNT_PROPAGATED(m) && !list_empty(&m->mnt_slave_list))
|
|
return first_slave(m);
|
|
|
|
while (1) {
|
|
struct mount *master = m->mnt_master;
|
|
|
|
if (master == origin->mnt_master) {
|
|
struct mount *next = next_peer(m);
|
|
return (next == origin) ? NULL : next;
|
|
} else if (m->mnt_slave.next != &master->mnt_slave_list)
|
|
return next_slave(m);
|
|
|
|
/* back at master */
|
|
m = master;
|
|
}
|
|
}
|
|
|
|
static struct mount *skip_propagation_subtree(struct mount *m,
|
|
struct mount *origin)
|
|
{
|
|
/*
|
|
* Advance m such that propagation_next will not return
|
|
* the slaves of m.
|
|
*/
|
|
if (!IS_MNT_PROPAGATED(m) && !list_empty(&m->mnt_slave_list))
|
|
m = last_slave(m);
|
|
|
|
return m;
|
|
}
|
|
|
|
static struct mount *next_group(struct mount *m, struct mount *origin)
|
|
{
|
|
while (1) {
|
|
while (1) {
|
|
struct mount *next;
|
|
if (!IS_MNT_PROPAGATED(m) && !list_empty(&m->mnt_slave_list))
|
|
return first_slave(m);
|
|
next = next_peer(m);
|
|
if (m->mnt_group_id == origin->mnt_group_id) {
|
|
if (next == origin)
|
|
return NULL;
|
|
} else if (m->mnt_slave.next != &next->mnt_slave)
|
|
break;
|
|
m = next;
|
|
}
|
|
/* m is the last peer */
|
|
while (1) {
|
|
struct mount *master = m->mnt_master;
|
|
if (m->mnt_slave.next != &master->mnt_slave_list)
|
|
return next_slave(m);
|
|
m = next_peer(master);
|
|
if (master->mnt_group_id == origin->mnt_group_id)
|
|
break;
|
|
if (master->mnt_slave.next == &m->mnt_slave)
|
|
break;
|
|
m = master;
|
|
}
|
|
if (m == origin)
|
|
return NULL;
|
|
}
|
|
}
|
|
|
|
/* all accesses are serialized by namespace_sem */
|
|
static struct mount *last_dest, *first_source, *last_source, *dest_master;
|
|
static struct hlist_head *list;
|
|
|
|
static inline bool peers(const struct mount *m1, const struct mount *m2)
|
|
{
|
|
return m1->mnt_group_id == m2->mnt_group_id && m1->mnt_group_id;
|
|
}
|
|
|
|
static int propagate_one(struct mount *m, struct mountpoint *dest_mp)
|
|
{
|
|
struct mount *child;
|
|
int type;
|
|
/* skip ones added by this propagate_mnt() */
|
|
if (IS_MNT_PROPAGATED(m))
|
|
return 0;
|
|
/* skip if mountpoint isn't covered by it */
|
|
if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root))
|
|
return 0;
|
|
if (peers(m, last_dest)) {
|
|
type = CL_MAKE_SHARED;
|
|
} else {
|
|
struct mount *n, *p;
|
|
bool done;
|
|
for (n = m; ; n = p) {
|
|
p = n->mnt_master;
|
|
if (p == dest_master || IS_MNT_MARKED(p))
|
|
break;
|
|
}
|
|
do {
|
|
struct mount *parent = last_source->mnt_parent;
|
|
if (peers(last_source, first_source))
|
|
break;
|
|
done = parent->mnt_master == p;
|
|
if (done && peers(n, parent))
|
|
break;
|
|
last_source = last_source->mnt_master;
|
|
} while (!done);
|
|
|
|
type = CL_SLAVE;
|
|
/* beginning of peer group among the slaves? */
|
|
if (IS_MNT_SHARED(m))
|
|
type |= CL_MAKE_SHARED;
|
|
}
|
|
|
|
child = copy_tree(last_source, last_source->mnt.mnt_root, type);
|
|
if (IS_ERR(child))
|
|
return PTR_ERR(child);
|
|
read_seqlock_excl(&mount_lock);
|
|
mnt_set_mountpoint(m, dest_mp, child);
|
|
if (m->mnt_master != dest_master)
|
|
SET_MNT_MARK(m->mnt_master);
|
|
read_sequnlock_excl(&mount_lock);
|
|
last_dest = m;
|
|
last_source = child;
|
|
hlist_add_head(&child->mnt_hash, list);
|
|
return count_mounts(m->mnt_ns, child);
|
|
}
|
|
|
|
/*
|
|
* mount 'source_mnt' under the destination 'dest_mnt' at
|
|
* dentry 'dest_dentry'. And propagate that mount to
|
|
* all the peer and slave mounts of 'dest_mnt'.
|
|
* Link all the new mounts into a propagation tree headed at
|
|
* source_mnt. Also link all the new mounts using ->mnt_list
|
|
* headed at source_mnt's ->mnt_list
|
|
*
|
|
* @dest_mnt: destination mount.
|
|
* @dest_dentry: destination dentry.
|
|
* @source_mnt: source mount.
|
|
* @tree_list : list of heads of trees to be attached.
|
|
*/
|
|
int propagate_mnt(struct mount *dest_mnt, struct mountpoint *dest_mp,
|
|
struct mount *source_mnt, struct hlist_head *tree_list)
|
|
{
|
|
struct mount *m, *n;
|
|
int ret = 0;
|
|
|
|
/*
|
|
* we don't want to bother passing tons of arguments to
|
|
* propagate_one(); everything is serialized by namespace_sem,
|
|
* so globals will do just fine.
|
|
*/
|
|
last_dest = dest_mnt;
|
|
first_source = source_mnt;
|
|
last_source = source_mnt;
|
|
list = tree_list;
|
|
dest_master = dest_mnt->mnt_master;
|
|
|
|
/* all peers of dest_mnt, except dest_mnt itself */
|
|
for (n = next_peer(dest_mnt); n != dest_mnt; n = next_peer(n)) {
|
|
ret = propagate_one(n, dest_mp);
|
|
if (ret)
|
|
goto out;
|
|
}
|
|
|
|
/* all slave groups */
|
|
for (m = next_group(dest_mnt, dest_mnt); m;
|
|
m = next_group(m, dest_mnt)) {
|
|
/* everything in that slave group */
|
|
n = m;
|
|
do {
|
|
ret = propagate_one(n, dest_mp);
|
|
if (ret)
|
|
goto out;
|
|
n = next_peer(n);
|
|
} while (n != m);
|
|
}
|
|
out:
|
|
read_seqlock_excl(&mount_lock);
|
|
hlist_for_each_entry(n, tree_list, mnt_hash) {
|
|
m = n->mnt_parent;
|
|
if (m->mnt_master != dest_mnt->mnt_master)
|
|
CLEAR_MNT_MARK(m->mnt_master);
|
|
}
|
|
read_sequnlock_excl(&mount_lock);
|
|
return ret;
|
|
}
|
|
|
|
static struct mount *find_topper(struct mount *mnt)
|
|
{
|
|
/* If there is exactly one mount covering mnt completely return it. */
|
|
struct mount *child;
|
|
|
|
if (!list_is_singular(&mnt->mnt_mounts))
|
|
return NULL;
|
|
|
|
child = list_first_entry(&mnt->mnt_mounts, struct mount, mnt_child);
|
|
if (child->mnt_mountpoint != mnt->mnt.mnt_root)
|
|
return NULL;
|
|
|
|
return child;
|
|
}
|
|
|
|
/*
|
|
* return true if the refcount is greater than count
|
|
*/
|
|
static inline int do_refcount_check(struct mount *mnt, int count)
|
|
{
|
|
return mnt_get_count(mnt) > count;
|
|
}
|
|
|
|
/**
|
|
* propagation_would_overmount - check whether propagation from @from
|
|
* would overmount @to
|
|
* @from: shared mount
|
|
* @to: mount to check
|
|
* @mp: future mountpoint of @to on @from
|
|
*
|
|
* If @from propagates mounts to @to, @from and @to must either be peers
|
|
* or one of the masters in the hierarchy of masters of @to must be a
|
|
* peer of @from.
|
|
*
|
|
* If the root of the @to mount is equal to the future mountpoint @mp of
|
|
* the @to mount on @from then @to will be overmounted by whatever is
|
|
* propagated to it.
|
|
*
|
|
* Context: This function expects namespace_lock() to be held and that
|
|
* @mp is stable.
|
|
* Return: If @from overmounts @to, true is returned, false if not.
|
|
*/
|
|
bool propagation_would_overmount(const struct mount *from,
|
|
const struct mount *to,
|
|
const struct mountpoint *mp)
|
|
{
|
|
if (!IS_MNT_SHARED(from))
|
|
return false;
|
|
|
|
if (IS_MNT_PROPAGATED(to))
|
|
return false;
|
|
|
|
if (to->mnt.mnt_root != mp->m_dentry)
|
|
return false;
|
|
|
|
for (const struct mount *m = to; m; m = m->mnt_master) {
|
|
if (peers(from, m))
|
|
return true;
|
|
}
|
|
|
|
return false;
|
|
}
|
|
|
|
/*
|
|
* check if the mount 'mnt' can be unmounted successfully.
|
|
* @mnt: the mount to be checked for unmount
|
|
* NOTE: unmounting 'mnt' would naturally propagate to all
|
|
* other mounts its parent propagates to.
|
|
* Check if any of these mounts that **do not have submounts**
|
|
* have more references than 'refcnt'. If so return busy.
|
|
*
|
|
* vfsmount lock must be held for write
|
|
*/
|
|
int propagate_mount_busy(struct mount *mnt, int refcnt)
|
|
{
|
|
struct mount *m, *child, *topper;
|
|
struct mount *parent = mnt->mnt_parent;
|
|
|
|
if (mnt == parent)
|
|
return do_refcount_check(mnt, refcnt);
|
|
|
|
/*
|
|
* quickly check if the current mount can be unmounted.
|
|
* If not, we don't have to go checking for all other
|
|
* mounts
|
|
*/
|
|
if (!list_empty(&mnt->mnt_mounts) || do_refcount_check(mnt, refcnt))
|
|
return 1;
|
|
|
|
for (m = propagation_next(parent, parent); m;
|
|
m = propagation_next(m, parent)) {
|
|
int count = 1;
|
|
child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
|
|
if (!child)
|
|
continue;
|
|
|
|
/* Is there exactly one mount on the child that covers
|
|
* it completely whose reference should be ignored?
|
|
*/
|
|
topper = find_topper(child);
|
|
if (topper)
|
|
count += 1;
|
|
else if (!list_empty(&child->mnt_mounts))
|
|
continue;
|
|
|
|
if (do_refcount_check(child, count))
|
|
return 1;
|
|
}
|
|
return 0;
|
|
}
|
|
|
|
/*
|
|
* Clear MNT_LOCKED when it can be shown to be safe.
|
|
*
|
|
* mount_lock lock must be held for write
|
|
*/
|
|
void propagate_mount_unlock(struct mount *mnt)
|
|
{
|
|
struct mount *parent = mnt->mnt_parent;
|
|
struct mount *m, *child;
|
|
|
|
BUG_ON(parent == mnt);
|
|
|
|
for (m = propagation_next(parent, parent); m;
|
|
m = propagation_next(m, parent)) {
|
|
child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
|
|
if (child)
|
|
child->mnt.mnt_flags &= ~MNT_LOCKED;
|
|
}
|
|
}
|
|
|
|
static void umount_one(struct mount *mnt, struct list_head *to_umount)
|
|
{
|
|
CLEAR_MNT_MARK(mnt);
|
|
mnt->mnt.mnt_flags |= MNT_UMOUNT;
|
|
list_del_init(&mnt->mnt_child);
|
|
list_del_init(&mnt->mnt_umounting);
|
|
move_from_ns(mnt, to_umount);
|
|
}
|
|
|
|
/*
|
|
* NOTE: unmounting 'mnt' naturally propagates to all other mounts its
|
|
* parent propagates to.
|
|
*/
|
|
static bool __propagate_umount(struct mount *mnt,
|
|
struct list_head *to_umount,
|
|
struct list_head *to_restore)
|
|
{
|
|
bool progress = false;
|
|
struct mount *child;
|
|
|
|
/*
|
|
* The state of the parent won't change if this mount is
|
|
* already unmounted or marked as without children.
|
|
*/
|
|
if (mnt->mnt.mnt_flags & (MNT_UMOUNT | MNT_MARKED))
|
|
goto out;
|
|
|
|
/* Verify topper is the only grandchild that has not been
|
|
* speculatively unmounted.
|
|
*/
|
|
list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
|
|
if (child->mnt_mountpoint == mnt->mnt.mnt_root)
|
|
continue;
|
|
if (!list_empty(&child->mnt_umounting) && IS_MNT_MARKED(child))
|
|
continue;
|
|
/* Found a mounted child */
|
|
goto children;
|
|
}
|
|
|
|
/* Mark mounts that can be unmounted if not locked */
|
|
SET_MNT_MARK(mnt);
|
|
progress = true;
|
|
|
|
/* If a mount is without children and not locked umount it. */
|
|
if (!IS_MNT_LOCKED(mnt)) {
|
|
umount_one(mnt, to_umount);
|
|
} else {
|
|
children:
|
|
list_move_tail(&mnt->mnt_umounting, to_restore);
|
|
}
|
|
out:
|
|
return progress;
|
|
}
|
|
|
|
static void umount_list(struct list_head *to_umount,
|
|
struct list_head *to_restore)
|
|
{
|
|
struct mount *mnt, *child, *tmp;
|
|
list_for_each_entry(mnt, to_umount, mnt_list) {
|
|
list_for_each_entry_safe(child, tmp, &mnt->mnt_mounts, mnt_child) {
|
|
/* topper? */
|
|
if (child->mnt_mountpoint == mnt->mnt.mnt_root)
|
|
list_move_tail(&child->mnt_umounting, to_restore);
|
|
else
|
|
umount_one(child, to_umount);
|
|
}
|
|
}
|
|
}
|
|
|
|
static void restore_mounts(struct list_head *to_restore)
|
|
{
|
|
/* Restore mounts to a clean working state */
|
|
while (!list_empty(to_restore)) {
|
|
struct mount *mnt, *parent;
|
|
struct mountpoint *mp;
|
|
|
|
mnt = list_first_entry(to_restore, struct mount, mnt_umounting);
|
|
CLEAR_MNT_MARK(mnt);
|
|
list_del_init(&mnt->mnt_umounting);
|
|
|
|
/* Should this mount be reparented? */
|
|
mp = mnt->mnt_mp;
|
|
parent = mnt->mnt_parent;
|
|
while (parent->mnt.mnt_flags & MNT_UMOUNT) {
|
|
mp = parent->mnt_mp;
|
|
parent = parent->mnt_parent;
|
|
}
|
|
if (parent != mnt->mnt_parent) {
|
|
mnt_change_mountpoint(parent, mp, mnt);
|
|
mnt_notify_add(mnt);
|
|
}
|
|
}
|
|
}
|
|
|
|
static void cleanup_umount_visitations(struct list_head *visited)
|
|
{
|
|
while (!list_empty(visited)) {
|
|
struct mount *mnt =
|
|
list_first_entry(visited, struct mount, mnt_umounting);
|
|
list_del_init(&mnt->mnt_umounting);
|
|
}
|
|
}
|
|
|
|
/*
|
|
* collect all mounts that receive propagation from the mount in @list,
|
|
* and return these additional mounts in the same list.
|
|
* @list: the list of mounts to be unmounted.
|
|
*
|
|
* vfsmount lock must be held for write
|
|
*/
|
|
int propagate_umount(struct list_head *list)
|
|
{
|
|
struct mount *mnt;
|
|
LIST_HEAD(to_restore);
|
|
LIST_HEAD(to_umount);
|
|
LIST_HEAD(visited);
|
|
|
|
/* Find candidates for unmounting */
|
|
list_for_each_entry_reverse(mnt, list, mnt_list) {
|
|
struct mount *parent = mnt->mnt_parent;
|
|
struct mount *m;
|
|
|
|
/*
|
|
* If this mount has already been visited it is known that it's
|
|
* entire peer group and all of their slaves in the propagation
|
|
* tree for the mountpoint has already been visited and there is
|
|
* no need to visit them again.
|
|
*/
|
|
if (!list_empty(&mnt->mnt_umounting))
|
|
continue;
|
|
|
|
list_add_tail(&mnt->mnt_umounting, &visited);
|
|
for (m = propagation_next(parent, parent); m;
|
|
m = propagation_next(m, parent)) {
|
|
struct mount *child = __lookup_mnt(&m->mnt,
|
|
mnt->mnt_mountpoint);
|
|
if (!child)
|
|
continue;
|
|
|
|
if (!list_empty(&child->mnt_umounting)) {
|
|
/*
|
|
* If the child has already been visited it is
|
|
* know that it's entire peer group and all of
|
|
* their slaves in the propgation tree for the
|
|
* mountpoint has already been visited and there
|
|
* is no need to visit this subtree again.
|
|
*/
|
|
m = skip_propagation_subtree(m, parent);
|
|
continue;
|
|
} else if (child->mnt.mnt_flags & MNT_UMOUNT) {
|
|
/*
|
|
* We have come across a partially unmounted
|
|
* mount in a list that has not been visited
|
|
* yet. Remember it has been visited and
|
|
* continue about our merry way.
|
|
*/
|
|
list_add_tail(&child->mnt_umounting, &visited);
|
|
continue;
|
|
}
|
|
|
|
/* Check the child and parents while progress is made */
|
|
while (__propagate_umount(child,
|
|
&to_umount, &to_restore)) {
|
|
/* Is the parent a umount candidate? */
|
|
child = child->mnt_parent;
|
|
if (list_empty(&child->mnt_umounting))
|
|
break;
|
|
}
|
|
}
|
|
}
|
|
|
|
umount_list(&to_umount, &to_restore);
|
|
restore_mounts(&to_restore);
|
|
cleanup_umount_visitations(&visited);
|
|
list_splice_tail(&to_umount, list);
|
|
|
|
return 0;
|
|
}
|