Allocation of radix_tree_node objects can be easily triggered from
userspace, so we should account them to memory cgroup. Besides, we need
them accounted for making shadow node shrinker per memcg (see
mm/workingset.c).
A tricky thing about accounting radix_tree_node objects is that they are
mostly allocated through radix_tree_preload(), so we can't just set
SLAB_ACCOUNT for radix_tree_node_cachep - that would likely result in a
lot of unrelated cgroups using objects from each other's caches.
One way to overcome this would be making radix tree preloads per memcg,
but that would probably look cumbersome and overcomplicated.
Instead, we make radix_tree_node_alloc() first try to allocate from the
cache with __GFP_ACCOUNT, no matter if the caller has preloaded or not,
and only if it fails fall back on using per cpu preloads. This should
make most allocations accounted.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
if (!gfpflags_allow_blocking(gfp_mask) && !in_interrupt()) {
struct radix_tree_preload *rtp;
+ /*
+ * Even if the caller has preloaded, try to allocate from the
+ * cache first for the new node to get accounted.
+ */
+ ret = kmem_cache_alloc(radix_tree_node_cachep,
+ gfp_mask | __GFP_ACCOUNT | __GFP_NOWARN);
+ if (ret)
+ goto out;
+
/*
* Provided the caller has preloaded here, we will always
* succeed in getting a node here (and never reach
* for debugging.
*/
kmemleak_update_trace(ret);
+ goto out;
}
- if (ret == NULL)
- ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
-
+ ret = kmem_cache_alloc(radix_tree_node_cachep,
+ gfp_mask | __GFP_ACCOUNT);
+out:
BUG_ON(radix_tree_is_indirect_ptr(ret));
return ret;
}