Mirror of https://github.com/torvalds/linux.git (synced 2025-04-06 09:13:43 +00:00)
Scheduler updates for v6.15

[ Merge note: these two commits are identical:
    - f3fa0e40df17 ("sched/clock: Don't define sched_clock_irqtime as static key")
    - b9f2b29b9494 ("sched: Don't define sched_clock_irqtime as static key")
  The first one is a cherry-picked version of the second, and the first one is
  already upstream. ]

Merge tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 "Core & fair scheduler changes:
   - Cancel the slice protection of the idle entity (Zihan Zhou)
   - Reduce the default slice to avoid tasks getting an extra tick (Zihan Zhou)
   - Force propagating min_slice of cfs_rq when {en,de}queue tasks (Tianchen Ding)
   - Refactor can_migrate_task() to eliminate looping (I Hsin Cheng)
   - Add unlikely branch hints to several system calls (Colin Ian King)
   - Optimize current_clr_polling() on certain architectures (Yujun Dong)

  Deadline scheduler: (Juri Lelli)
   - Remove redundant dl_clear_root_domain call
   - Move dl_rebuild_rd_accounting to cpuset.h

  Uclamp:
   - Use the uclamp_is_used() helper instead of open-coding it (Xuewen Yan)
   - Optimize sched_uclamp_used static key enabling (Xuewen Yan)

  Scheduler topology support: (Juri Lelli)
   - Ignore special tasks when rebuilding domains
   - Add wrappers for sched_domains_mutex
   - Generalize unique visiting of root domains
   - Rebuild root domain accounting after every update
   - Remove partition_and_rebuild_sched_domains
   - Stop exposing partition_sched_domains_locked

  RSEQ: (Michael Jeanson)
   - Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y
   - Fix segfault on registration when rseq_cs is non-zero
   - selftests: Add rseq syscall errors test
   - selftests: Ensure the rseq ABI TLS is actually 1024 bytes

  Membarriers:
   - Fix redundant load of membarrier_state (Nysal Jan K.A.)

  Scheduler debugging:
   - Introduce and use preempt_model_str() (Sebastian Andrzej Siewior)
   - Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar)

  Fixes and cleanups:
   - Always save/restore x86 TSC sched_clock() on suspend/resume (Guilherme G. Piccoli)
   - Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian Andrzej Siewior)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>

* tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()
  sched/debug: Remove CONFIG_SCHED_DEBUG
  sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files
  sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from documentation
  sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional
  sched/debug: Make 'const_debug' tunables unconditional __read_mostly
  sched/debug: Change SCHED_WARN_ON() to WARN_ON_ONCE()
  rseq/selftests: Fix namespace collision with rseq UAPI header
  include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h
  sched/topology: Stop exposing partition_sched_domains_locked
  cgroup/cpuset: Remove partition_and_rebuild_sched_domains
  sched/topology: Remove redundant dl_clear_root_domain call
  sched/deadline: Rebuild root domain accounting after every update
  sched/deadline: Generalize unique visiting of root domains
  sched/topology: Wrappers for sched_domains_mutex
  sched/deadline: Ignore special tasks when rebuilding domains
  tracing: Use preempt_model_str()
  xtensa: Rely on generic printing of preemption model
  x86: Rely on generic printing of preemption model
  s390: Rely on generic printing of preemption model
  ...
Commit: 32b22538be
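A large share of the churn in the diff below is the mechanical SCHED_WARN_ON() -> WARN_ON_ONCE() conversion that becomes possible once CONFIG_SCHED_DEBUG is unconditional. For context, the old wrapper in kernel/sched/sched.h looked roughly like this (paraphrased from memory, not part of this diff):

#ifdef CONFIG_SCHED_DEBUG
# define SCHED_WARN_ON(x)	WARN_ONCE(x, #x)
#else
# define SCHED_WARN_ON(x)	({ (void)(x), 0; })
#endif

With the debug code always built in, the wrapper adds nothing over WARN_ON_ONCE(), hence the bulk rename.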
@@ -2,7 +2,7 @@
 Scheduler debugfs
 =================
 
-Booting a kernel with CONFIG_SCHED_DEBUG=y will give access to
+Booting a kernel with debugfs enabled will give access to
 scheduler specific debug files under /sys/kernel/debug/sched. Some of
 those files are described below.
 
@@ -96,7 +96,7 @@ picked and the current task is preempted.
 CFS uses nanosecond granularity accounting and does not rely on any jiffies or
 other HZ detail. Thus the CFS scheduler has no notion of "timeslices" in the
 way the previous scheduler had, and has no heuristics whatsoever. There is
-only one central tunable (you have to switch on CONFIG_SCHED_DEBUG):
+only one central tunable:
 
   /sys/kernel/debug/sched/base_slice_ns
 
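(An aside, not part of the patch: the tunable is an ordinary debugfs file, so a minimal userspace C sketch like the one below can read it back. The path assumes debugfs is mounted at the usual /sys/kernel/debug, which normally requires root.)

/*
 * Minimal sketch: print the current CFS base slice from debugfs.
 */
#include <stdio.h>

int main(void)
{
	unsigned long long ns;
	FILE *f = fopen("/sys/kernel/debug/sched/base_slice_ns", "r");

	if (!f || fscanf(f, "%llu", &ns) != 1) {
		perror("base_slice_ns");
		return 1;
	}
	fclose(f);
	printf("base slice: %llu ns (%.2f ms)\n", ns, ns / 1e6);
	return 0;
}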
@@ -73,9 +73,8 @@ Architectures may override the generic domain builder and the default SD flags
 for a given topology level by creating a sched_domain_topology_level array and
 calling set_sched_topology() with this array as the parameter.
 
-The sched-domains debugging infrastructure can be enabled by enabling
-CONFIG_SCHED_DEBUG and adding 'sched_verbose' to your cmdline. If you
-forgot to tweak your cmdline, you can also flip the
+The sched-domains debugging infrastructure can be enabled by 'sched_verbose'
+to your cmdline. If you forgot to tweak your cmdline, you can also flip the
 /sys/kernel/debug/sched/verbose knob. This enables an error checking parse of
 the sched domains which should catch most possible errors (described above). It
 also prints out the domain structure in a visual format.
@@ -107,8 +107,7 @@ detailed information:
 	nr_rejected			: 0
 	enable_seq			: 1
 
-If ``CONFIG_SCHED_DEBUG`` is set, whether a given task is on sched_ext can
-be determined as follows:
+Whether a given task is on sched_ext can be determined as follows:
 
 .. code-block:: none
 
@@ -88,7 +88,7 @@ One of these is produced per domain for each cpu described. (Note that if
 CONFIG_SMP is not defined, *no* domains are utilized and these lines
 will not appear in the output. <name> is an extension to the domain field
 that prints the name of the corresponding sched domain. It can appear in
-schedstat version 17 and above, and requires CONFIG_SCHED_DEBUG.)
+schedstat version 17 and above.)
 
 domain<N> <name> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
 
@@ -112,7 +112,7 @@ CFS usa una granularidad de nanosegundos y no depende de ningún
 jiffy o detalles como HZ. De este modo, el gestor de tareas CFS no tiene
 noción de "ventanas de tiempo" de la forma en que tenía el gestor de
 tareas previo, y tampoco tiene heurísticos. Únicamente hay un parámetro
-central ajustable (se ha de cambiar en CONFIG_SCHED_DEBUG):
+central ajustable:
 
   /sys/kernel/debug/sched/base_slice_ns
 
@@ -258,13 +258,6 @@ void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
 		barrier();
 }
 
-#ifdef CONFIG_PREEMPT
-#define S_PREEMPT " PREEMPT"
-#elif defined(CONFIG_PREEMPT_RT)
-#define S_PREEMPT " PREEMPT_RT"
-#else
-#define S_PREEMPT ""
-#endif
 #ifdef CONFIG_SMP
 #define S_SMP " SMP"
 #else
@@ -282,8 +275,8 @@ static int __die(const char *str, int err, struct pt_regs *regs)
 	static int die_counter;
 	int ret;
 
-	pr_emerg("Internal error: %s: %x [#%d]" S_PREEMPT S_SMP S_ISA "\n",
-		 str, err, ++die_counter);
+	pr_emerg("Internal error: %s: %x [#%d]" S_SMP S_ISA "\n",
+		 str, err, ++die_counter);
 
 	/* trap and error numbers are mostly meaningless on ARM */
 	ret = notify_die(DIE_OOPS, str, regs, err, tsk->thread.trap_no, SIGSEGV);
@@ -172,14 +172,6 @@ static void dump_kernel_instr(const char *lvl, struct pt_regs *regs)
 	printk("%sCode: %s\n", lvl, str);
 }
 
-#ifdef CONFIG_PREEMPT
-#define S_PREEMPT " PREEMPT"
-#elif defined(CONFIG_PREEMPT_RT)
-#define S_PREEMPT " PREEMPT_RT"
-#else
-#define S_PREEMPT ""
-#endif
-
 #define S_SMP " SMP"
 
 static int __die(const char *str, long err, struct pt_regs *regs)
@@ -187,7 +179,7 @@ static int __die(const char *str, long err, struct pt_regs *regs)
 	static int die_counter;
 	int ret;
 
-	pr_emerg("Internal error: %s: %016lx [#%d]" S_PREEMPT S_SMP "\n",
+	pr_emerg("Internal error: %s: %016lx [#%d] " S_SMP "\n",
 		 str, err, ++die_counter);
 
 	/* trap and error numbers are mostly meaningless on ARM */
@@ -263,10 +263,9 @@ static int __die(const char *str, struct pt_regs *regs, long err)
 {
 	printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter);
 
-	printk("%s PAGE_SIZE=%luK%s%s%s%s%s%s %s\n",
+	printk("%s PAGE_SIZE=%luK%s %s%s%s%s %s\n",
 	       IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN) ? "LE" : "BE",
 	       PAGE_SIZE / 1024, get_mmu_str(),
-	       IS_ENABLED(CONFIG_PREEMPT) ? " PREEMPT" : "",
 	       IS_ENABLED(CONFIG_SMP) ? " SMP" : "",
 	       IS_ENABLED(CONFIG_SMP) ? (" NR_CPUS=" __stringify(NR_CPUS)) : "",
 	       debug_pagealloc_enabled() ? " DEBUG_PAGEALLOC" : "",
@@ -198,13 +198,8 @@ void __noreturn die(struct pt_regs *regs, const char *str)
 	console_verbose();
 	spin_lock_irq(&die_lock);
 	bust_spinlocks(1);
-	printk("%s: %04x ilc:%d [#%d] ", str, regs->int_code & 0xffff,
+	printk("%s: %04x ilc:%d [#%d]", str, regs->int_code & 0xffff,
 	       regs->int_code >> 17, ++die_counter);
-#ifdef CONFIG_PREEMPT
-	pr_cont("PREEMPT ");
-#elif defined(CONFIG_PREEMPT_RT)
-	pr_cont("PREEMPT_RT ");
-#endif
 	pr_cont("SMP ");
 	if (debug_pagealloc_enabled())
 		pr_cont("DEBUG_PAGEALLOC");
@@ -395,18 +395,13 @@ NOKPROBE_SYMBOL(oops_end);
 
 static void __die_header(const char *str, struct pt_regs *regs, long err)
 {
-	const char *pr = "";
-
 	/* Save the regs of the first oops for the executive summary later. */
 	if (!die_counter)
 		exec_summary_regs = *regs;
 
-	if (IS_ENABLED(CONFIG_PREEMPTION))
-		pr = IS_ENABLED(CONFIG_PREEMPT_RT) ? " PREEMPT_RT" : " PREEMPT";
-
 	printk(KERN_DEFAULT
-	       "Oops: %s: %04lx [#%d]%s%s%s%s%s\n", str, err & 0xffff,
-	       ++die_counter, pr,
+	       "Oops: %s: %04lx [#%d]%s%s%s%s\n", str, err & 0xffff,
+	       ++die_counter,
 	       IS_ENABLED(CONFIG_SMP) ? " SMP" : "",
 	       debug_pagealloc_enabled() ? " DEBUG_PAGEALLOC" : "",
 	       IS_ENABLED(CONFIG_KASAN) ? " KASAN" : "",
@@ -959,7 +959,7 @@ static unsigned long long cyc2ns_suspend;
 
 void tsc_save_sched_clock_state(void)
 {
-	if (!sched_clock_stable())
+	if (!static_branch_likely(&__use_tsc) && !sched_clock_stable())
 		return;
 
 	cyc2ns_suspend = sched_clock();
@@ -979,7 +979,7 @@ void tsc_restore_sched_clock_state(void)
 	unsigned long flags;
 	int cpu;
 
-	if (!sched_clock_stable())
+	if (!static_branch_likely(&__use_tsc) && !sched_clock_stable())
 		return;
 
 	local_irq_save(flags);
@@ -629,15 +629,11 @@ DEFINE_SPINLOCK(die_lock);
 void __noreturn die(const char * str, struct pt_regs * regs, long err)
 {
 	static int die_counter;
-	const char *pr = "";
-
-	if (IS_ENABLED(CONFIG_PREEMPTION))
-		pr = IS_ENABLED(CONFIG_PREEMPT_RT) ? " PREEMPT_RT" : " PREEMPT";
 
 	console_verbose();
 	spin_lock_irq(&die_lock);
 
-	pr_info("%s: sig: %ld [#%d]%s\n", str, err, ++die_counter, pr);
+	pr_info("%s: sig: %ld [#%d]\n", str, err, ++die_counter);
 	show_regs(regs);
 	if (!user_mode(regs))
 		show_stack(NULL, (unsigned long *)regs->areg[1], KERN_INFO);
@@ -1489,7 +1489,6 @@ static const struct file_operations proc_fail_nth_operations = {
 #endif
 
 
-#ifdef CONFIG_SCHED_DEBUG
 /*
  * Print out various scheduling related per-task fields:
  */
@@ -1539,8 +1538,6 @@ static const struct file_operations proc_pid_sched_operations = {
 	.release = single_release,
 };
 
-#endif
-
 #ifdef CONFIG_SCHED_AUTOGROUP
 /*
  * Print out autogroup related information:
@@ -3331,9 +3328,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	ONE("status", S_IRUGO, proc_pid_status),
 	ONE("personality", S_IRUSR, proc_pid_personality),
 	ONE("limits", S_IRUGO, proc_pid_limits),
-#ifdef CONFIG_SCHED_DEBUG
 	REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
-#endif
 #ifdef CONFIG_SCHED_AUTOGROUP
 	REG("autogroup", S_IRUGO|S_IWUSR, proc_pid_sched_autogroup_operations),
 #endif
@@ -3682,9 +3677,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	ONE("status", S_IRUGO, proc_pid_status),
 	ONE("personality", S_IRUSR, proc_pid_personality),
 	ONE("limits", S_IRUGO, proc_pid_limits),
-#ifdef CONFIG_SCHED_DEBUG
 	REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
-#endif
 	NOD("comm", S_IFREG|S_IRUGO|S_IWUSR,
 	    &proc_tid_comm_inode_operations,
 	    &proc_pid_set_comm_operations, {}),
|
@ -125,9 +125,11 @@ static inline int cpuset_do_page_mem_spread(void)
|
||||
|
||||
extern bool current_cpuset_is_being_rebound(void);
|
||||
|
||||
extern void dl_rebuild_rd_accounting(void);
|
||||
extern void rebuild_sched_domains(void);
|
||||
|
||||
extern void cpuset_print_current_mems_allowed(void);
|
||||
extern void cpuset_reset_sched_domains(void);
|
||||
|
||||
/*
|
||||
* read_mems_allowed_begin is required when making decisions involving
|
||||
@ -259,11 +261,20 @@ static inline bool current_cpuset_is_being_rebound(void)
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void dl_rebuild_rd_accounting(void)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void rebuild_sched_domains(void)
|
||||
{
|
||||
partition_sched_domains(1, NULL, NULL);
|
||||
}
|
||||
|
||||
static inline void cpuset_reset_sched_domains(void)
|
||||
{
|
||||
partition_sched_domains(1, NULL, NULL);
|
||||
}
|
||||
|
||||
static inline void cpuset_print_current_mems_allowed(void)
|
||||
{
|
||||
}
|
||||
|
@ -240,9 +240,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
|
||||
struct em_perf_state *ps;
|
||||
int i;
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
WARN_ONCE(!rcu_read_lock_held(), "EM: rcu read lock needed\n");
|
||||
#endif
|
||||
|
||||
if (!sum_util)
|
||||
return 0;
|
||||
|
@ -515,6 +515,8 @@ static inline bool preempt_model_rt(void)
|
||||
return IS_ENABLED(CONFIG_PREEMPT_RT);
|
||||
}
|
||||
|
||||
extern const char *preempt_model_str(void);
|
||||
|
||||
/*
|
||||
* Does the preemption model allow non-cooperative preemption?
|
||||
*
|
||||
|
@ -382,6 +382,11 @@ enum uclamp_id {
|
||||
#ifdef CONFIG_SMP
|
||||
extern struct root_domain def_root_domain;
|
||||
extern struct mutex sched_domains_mutex;
|
||||
extern void sched_domains_mutex_lock(void);
|
||||
extern void sched_domains_mutex_unlock(void);
|
||||
#else
|
||||
static inline void sched_domains_mutex_lock(void) { }
|
||||
static inline void sched_domains_mutex_unlock(void) { }
|
||||
#endif
|
||||
|
||||
struct sched_param {
|
||||
|
@ -34,7 +34,11 @@ static inline bool dl_time_before(u64 a, u64 b)
|
||||
struct root_domain;
|
||||
extern void dl_add_task_root_domain(struct task_struct *p);
|
||||
extern void dl_clear_root_domain(struct root_domain *rd);
|
||||
extern void dl_clear_root_domain_cpu(int cpu);
|
||||
|
||||
#endif /* CONFIG_SMP */
|
||||
|
||||
extern u64 dl_cookie;
|
||||
extern bool dl_bw_visited(int cpu, u64 cookie);
|
||||
|
||||
#endif /* _LINUX_SCHED_DEADLINE_H */
|
||||
|
@ -35,12 +35,10 @@ extern void show_stack(struct task_struct *task, unsigned long *sp,
|
||||
|
||||
extern void sched_show_task(struct task_struct *p);
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
struct seq_file;
|
||||
extern void proc_sched_show_task(struct task_struct *p,
|
||||
struct pid_namespace *ns, struct seq_file *m);
|
||||
extern void proc_sched_set_task(struct task_struct *p);
|
||||
#endif
|
||||
|
||||
/* Attach to any functions which should be ignored in wchan output. */
|
||||
#define __sched __section(".sched.text")
|
||||
|
@ -79,6 +79,21 @@ static __always_inline bool __must_check current_clr_polling_and_test(void)
|
||||
return unlikely(tif_need_resched());
|
||||
}
|
||||
|
||||
static __always_inline void current_clr_polling(void)
|
||||
{
|
||||
__current_clr_polling();
|
||||
|
||||
/*
|
||||
* Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
|
||||
* Once the bit is cleared, we'll get IPIs with every new
|
||||
* TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
|
||||
* fold.
|
||||
*/
|
||||
smp_mb__after_atomic(); /* paired with resched_curr() */
|
||||
|
||||
preempt_fold_need_resched();
|
||||
}
|
||||
|
||||
#else
|
||||
static inline void __current_set_polling(void) { }
|
||||
static inline void __current_clr_polling(void) { }
|
||||
@ -91,21 +106,15 @@ static inline bool __must_check current_clr_polling_and_test(void)
|
||||
{
|
||||
return unlikely(tif_need_resched());
|
||||
}
|
||||
#endif
|
||||
|
||||
static __always_inline void current_clr_polling(void)
|
||||
{
|
||||
__current_clr_polling();
|
||||
|
||||
/*
|
||||
* Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
|
||||
* Once the bit is cleared, we'll get IPIs with every new
|
||||
* TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
|
||||
* fold.
|
||||
*/
|
||||
smp_mb(); /* paired with resched_curr() */
|
||||
|
||||
preempt_fold_need_resched();
|
||||
}
|
||||
#endif
|
||||
|
||||
#endif /* _LINUX_SCHED_IDLE_H */
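(Worth noting for the smp_mb__after_atomic() variant above, as a reading of the change rather than anything stated in the patch itself: __current_clr_polling() clears TIF_POLLING_NRFLAG with an atomic bitop, and on architectures where atomic read-modify-write operations already imply a full barrier — x86 is the usual example — smp_mb__after_atomic() collapses to a compiler barrier, so the polling path no longer pays for an extra full memory barrier. Architectures without that guarantee still get a real smp_mb(), which keeps the documented pairing with resched_curr() intact.)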
|
||||
|
@ -531,6 +531,13 @@ enum {
|
||||
|
||||
static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
|
||||
{
|
||||
/*
|
||||
* The atomic_read() below prevents CSE. The following should
|
||||
* help the compiler generate more efficient code on architectures
|
||||
* where sync_core_before_usermode() is a no-op.
|
||||
*/
|
||||
if (!IS_ENABLED(CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE))
|
||||
return;
|
||||
if (current->mm != mm)
|
||||
return;
|
||||
if (likely(!(atomic_read(&mm->membarrier_state) &
|
||||
|
@ -25,16 +25,12 @@ enum {
|
||||
};
|
||||
#undef SD_FLAG
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
|
||||
struct sd_flag_debug {
|
||||
unsigned int meta_flags;
|
||||
char *name;
|
||||
};
|
||||
extern const struct sd_flag_debug sd_flag_debug[];
|
||||
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SCHED_SMT
|
||||
static inline int cpu_smt_flags(void)
|
||||
{
|
||||
@ -166,10 +162,6 @@ static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
|
||||
return to_cpumask(sd->span);
|
||||
}
|
||||
|
||||
extern void partition_sched_domains_locked(int ndoms_new,
|
||||
cpumask_var_t doms_new[],
|
||||
struct sched_domain_attr *dattr_new);
|
||||
|
||||
extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
|
||||
struct sched_domain_attr *dattr_new);
|
||||
|
||||
@ -210,12 +202,6 @@ extern void __init set_sched_topology(struct sched_domain_topology_level *tl);
|
||||
|
||||
struct sched_domain_attr;
|
||||
|
||||
static inline void
|
||||
partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
|
||||
struct sched_domain_attr *dattr_new)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void
|
||||
partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
|
||||
struct sched_domain_attr *dattr_new)
|
||||
|
@ -193,9 +193,7 @@ static inline long __trace_sched_switch_state(bool preempt,
|
||||
{
|
||||
unsigned int state;
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
BUG_ON(p != current);
|
||||
#endif /* CONFIG_SCHED_DEBUG */
|
||||
|
||||
/*
|
||||
* Preemption ignores task state, therefore preempted tasks are always
|
||||
|
@ -953,10 +953,12 @@ static void dl_update_tasks_root_domain(struct cpuset *cs)
|
||||
css_task_iter_end(&it);
|
||||
}
|
||||
|
||||
static void dl_rebuild_rd_accounting(void)
|
||||
void dl_rebuild_rd_accounting(void)
|
||||
{
|
||||
struct cpuset *cs = NULL;
|
||||
struct cgroup_subsys_state *pos_css;
|
||||
int cpu;
|
||||
u64 cookie = ++dl_cookie;
|
||||
|
||||
lockdep_assert_held(&cpuset_mutex);
|
||||
lockdep_assert_cpus_held();
|
||||
@ -964,11 +966,12 @@ static void dl_rebuild_rd_accounting(void)
|
||||
|
||||
rcu_read_lock();
|
||||
|
||||
/*
|
||||
* Clear default root domain DL accounting, it will be computed again
|
||||
* if a task belongs to it.
|
||||
*/
|
||||
dl_clear_root_domain(&def_root_domain);
|
||||
for_each_possible_cpu(cpu) {
|
||||
if (dl_bw_visited(cpu, cookie))
|
||||
continue;
|
||||
|
||||
dl_clear_root_domain_cpu(cpu);
|
||||
}
|
||||
|
||||
cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) {
|
||||
|
||||
@ -989,16 +992,6 @@ static void dl_rebuild_rd_accounting(void)
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
static void
|
||||
partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
|
||||
struct sched_domain_attr *dattr_new)
|
||||
{
|
||||
mutex_lock(&sched_domains_mutex);
|
||||
partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
|
||||
dl_rebuild_rd_accounting();
|
||||
mutex_unlock(&sched_domains_mutex);
|
||||
}
|
||||
|
||||
/*
|
||||
* Rebuild scheduler domains.
|
||||
*
|
||||
@ -1060,7 +1053,7 @@ void rebuild_sched_domains_locked(void)
|
||||
ndoms = generate_sched_domains(&doms, &attr);
|
||||
|
||||
/* Have scheduler rebuild the domains */
|
||||
partition_and_rebuild_sched_domains(ndoms, doms, attr);
|
||||
partition_sched_domains(ndoms, doms, attr);
|
||||
}
|
||||
#else /* !CONFIG_SMP */
|
||||
void rebuild_sched_domains_locked(void)
|
||||
@ -1082,6 +1075,13 @@ void rebuild_sched_domains(void)
|
||||
cpus_read_unlock();
|
||||
}
|
||||
|
||||
void cpuset_reset_sched_domains(void)
|
||||
{
|
||||
mutex_lock(&cpuset_mutex);
|
||||
partition_sched_domains(1, NULL, NULL);
|
||||
mutex_unlock(&cpuset_mutex);
|
||||
}
|
||||
|
||||
/**
|
||||
* cpuset_update_tasks_cpumask - Update the cpumasks of tasks in the cpuset.
|
||||
* @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
|
||||
|
kernel/rseq.c: 142 changed lines
@ -78,24 +78,24 @@ efault:
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
static void rseq_set_ro_fields(struct task_struct *t, u32 cpu_id_start, u32 cpu_id,
|
||||
u32 node_id, u32 mm_cid)
|
||||
{
|
||||
rseq_kernel_fields(t)->cpu_id_start = cpu_id;
|
||||
rseq_kernel_fields(t)->cpu_id = cpu_id;
|
||||
rseq_kernel_fields(t)->node_id = node_id;
|
||||
rseq_kernel_fields(t)->mm_cid = mm_cid;
|
||||
}
|
||||
/*
|
||||
* Update an rseq field and its in-kernel copy in lock-step to keep a coherent
|
||||
* state.
|
||||
*/
|
||||
#define rseq_unsafe_put_user(t, value, field, error_label) \
|
||||
do { \
|
||||
unsafe_put_user(value, &t->rseq->field, error_label); \
|
||||
rseq_kernel_fields(t)->field = value; \
|
||||
} while (0)
|
||||
|
||||
#else
|
||||
static int rseq_validate_ro_fields(struct task_struct *t)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void rseq_set_ro_fields(struct task_struct *t, u32 cpu_id_start, u32 cpu_id,
|
||||
u32 node_id, u32 mm_cid)
|
||||
{
|
||||
}
|
||||
#define rseq_unsafe_put_user(t, value, field, error_label) \
|
||||
unsafe_put_user(value, &t->rseq->field, error_label)
|
||||
#endif
|
||||
|
||||
/*
|
||||
@ -173,17 +173,18 @@ static int rseq_update_cpu_node_id(struct task_struct *t)
|
||||
WARN_ON_ONCE((int) mm_cid < 0);
|
||||
if (!user_write_access_begin(rseq, t->rseq_len))
|
||||
goto efault;
|
||||
unsafe_put_user(cpu_id, &rseq->cpu_id_start, efault_end);
|
||||
unsafe_put_user(cpu_id, &rseq->cpu_id, efault_end);
|
||||
unsafe_put_user(node_id, &rseq->node_id, efault_end);
|
||||
unsafe_put_user(mm_cid, &rseq->mm_cid, efault_end);
|
||||
|
||||
rseq_unsafe_put_user(t, cpu_id, cpu_id_start, efault_end);
|
||||
rseq_unsafe_put_user(t, cpu_id, cpu_id, efault_end);
|
||||
rseq_unsafe_put_user(t, node_id, node_id, efault_end);
|
||||
rseq_unsafe_put_user(t, mm_cid, mm_cid, efault_end);
|
||||
|
||||
/*
|
||||
* Additional feature fields added after ORIG_RSEQ_SIZE
|
||||
* need to be conditionally updated only if
|
||||
* t->rseq_len != ORIG_RSEQ_SIZE.
|
||||
*/
|
||||
user_write_access_end();
|
||||
rseq_set_ro_fields(t, cpu_id, cpu_id, node_id, mm_cid);
|
||||
trace_rseq_update(t);
|
||||
return 0;
|
||||
|
||||
@ -195,6 +196,7 @@ efault:
|
||||
|
||||
static int rseq_reset_rseq_cpu_node_id(struct task_struct *t)
|
||||
{
|
||||
struct rseq __user *rseq = t->rseq;
|
||||
u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED, node_id = 0,
|
||||
mm_cid = 0;
|
||||
|
||||
@ -202,40 +204,61 @@ static int rseq_reset_rseq_cpu_node_id(struct task_struct *t)
|
||||
* Validate read-only rseq fields.
|
||||
*/
|
||||
if (rseq_validate_ro_fields(t))
|
||||
return -EFAULT;
|
||||
/*
|
||||
* Reset cpu_id_start to its initial state (0).
|
||||
*/
|
||||
if (put_user(cpu_id_start, &t->rseq->cpu_id_start))
|
||||
return -EFAULT;
|
||||
/*
|
||||
* Reset cpu_id to RSEQ_CPU_ID_UNINITIALIZED, so any user coming
|
||||
* in after unregistration can figure out that rseq needs to be
|
||||
* registered again.
|
||||
*/
|
||||
if (put_user(cpu_id, &t->rseq->cpu_id))
|
||||
return -EFAULT;
|
||||
/*
|
||||
* Reset node_id to its initial state (0).
|
||||
*/
|
||||
if (put_user(node_id, &t->rseq->node_id))
|
||||
return -EFAULT;
|
||||
/*
|
||||
* Reset mm_cid to its initial state (0).
|
||||
*/
|
||||
if (put_user(mm_cid, &t->rseq->mm_cid))
|
||||
return -EFAULT;
|
||||
goto efault;
|
||||
|
||||
rseq_set_ro_fields(t, cpu_id_start, cpu_id, node_id, mm_cid);
|
||||
if (!user_write_access_begin(rseq, t->rseq_len))
|
||||
goto efault;
|
||||
|
||||
/*
|
||||
* Reset all fields to their initial state.
|
||||
*
|
||||
* All fields have an initial state of 0 except cpu_id which is set to
|
||||
* RSEQ_CPU_ID_UNINITIALIZED, so that any user coming in after
|
||||
* unregistration can figure out that rseq needs to be registered
|
||||
* again.
|
||||
*/
|
||||
rseq_unsafe_put_user(t, cpu_id_start, cpu_id_start, efault_end);
|
||||
rseq_unsafe_put_user(t, cpu_id, cpu_id, efault_end);
|
||||
rseq_unsafe_put_user(t, node_id, node_id, efault_end);
|
||||
rseq_unsafe_put_user(t, mm_cid, mm_cid, efault_end);
|
||||
|
||||
/*
|
||||
* Additional feature fields added after ORIG_RSEQ_SIZE
|
||||
* need to be conditionally reset only if
|
||||
* t->rseq_len != ORIG_RSEQ_SIZE.
|
||||
*/
|
||||
user_write_access_end();
|
||||
return 0;
|
||||
|
||||
efault_end:
|
||||
user_write_access_end();
|
||||
efault:
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
/*
|
||||
* Get the user-space pointer value stored in the 'rseq_cs' field.
|
||||
*/
|
||||
static int rseq_get_rseq_cs_ptr_val(struct rseq __user *rseq, u64 *rseq_cs)
|
||||
{
|
||||
if (!rseq_cs)
|
||||
return -EFAULT;
|
||||
|
||||
#ifdef CONFIG_64BIT
|
||||
if (get_user(*rseq_cs, &rseq->rseq_cs))
|
||||
return -EFAULT;
|
||||
#else
|
||||
if (copy_from_user(rseq_cs, &rseq->rseq_cs, sizeof(*rseq_cs)))
|
||||
return -EFAULT;
|
||||
#endif
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* If the rseq_cs field of 'struct rseq' contains a valid pointer to
|
||||
* user-space, copy 'struct rseq_cs' from user-space and validate its fields.
|
||||
*/
|
||||
static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs)
|
||||
{
|
||||
struct rseq_cs __user *urseq_cs;
|
||||
@ -244,17 +267,16 @@ static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs)
|
||||
u32 sig;
|
||||
int ret;
|
||||
|
||||
#ifdef CONFIG_64BIT
|
||||
if (get_user(ptr, &t->rseq->rseq_cs))
|
||||
return -EFAULT;
|
||||
#else
|
||||
if (copy_from_user(&ptr, &t->rseq->rseq_cs, sizeof(ptr)))
|
||||
return -EFAULT;
|
||||
#endif
|
||||
ret = rseq_get_rseq_cs_ptr_val(t->rseq, &ptr);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
/* If the rseq_cs pointer is NULL, return a cleared struct rseq_cs. */
|
||||
if (!ptr) {
|
||||
memset(rseq_cs, 0, sizeof(*rseq_cs));
|
||||
return 0;
|
||||
}
|
||||
/* Check that the pointer value fits in the user-space process space. */
|
||||
if (ptr >= TASK_SIZE)
|
||||
return -EINVAL;
|
||||
urseq_cs = (struct rseq_cs __user *)(unsigned long)ptr;
|
||||
@ -330,7 +352,7 @@ static int rseq_need_restart(struct task_struct *t, u32 cs_flags)
|
||||
return !!event_mask;
|
||||
}
|
||||
|
||||
static int clear_rseq_cs(struct task_struct *t)
|
||||
static int clear_rseq_cs(struct rseq __user *rseq)
|
||||
{
|
||||
/*
|
||||
* The rseq_cs field is set to NULL on preemption or signal
|
||||
@ -341,9 +363,9 @@ static int clear_rseq_cs(struct task_struct *t)
|
||||
* Set rseq_cs to NULL.
|
||||
*/
|
||||
#ifdef CONFIG_64BIT
|
||||
return put_user(0UL, &t->rseq->rseq_cs);
|
||||
return put_user(0UL, &rseq->rseq_cs);
|
||||
#else
|
||||
if (clear_user(&t->rseq->rseq_cs, sizeof(t->rseq->rseq_cs)))
|
||||
if (clear_user(&rseq->rseq_cs, sizeof(rseq->rseq_cs)))
|
||||
return -EFAULT;
|
||||
return 0;
|
||||
#endif
|
||||
@ -375,11 +397,11 @@ static int rseq_ip_fixup(struct pt_regs *regs)
|
||||
* Clear the rseq_cs pointer and return.
|
||||
*/
|
||||
if (!in_rseq_cs(ip, &rseq_cs))
|
||||
return clear_rseq_cs(t);
|
||||
return clear_rseq_cs(t->rseq);
|
||||
ret = rseq_need_restart(t, rseq_cs.flags);
|
||||
if (ret <= 0)
|
||||
return ret;
|
||||
ret = clear_rseq_cs(t);
|
||||
ret = clear_rseq_cs(t->rseq);
|
||||
if (ret)
|
||||
return ret;
|
||||
trace_rseq_ip_fixup(ip, rseq_cs.start_ip, rseq_cs.post_commit_offset,
|
||||
@ -453,6 +475,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
|
||||
int, flags, u32, sig)
|
||||
{
|
||||
int ret;
|
||||
u64 rseq_cs;
|
||||
|
||||
if (flags & RSEQ_FLAG_UNREGISTER) {
|
||||
if (flags & ~RSEQ_FLAG_UNREGISTER)
|
||||
@ -507,6 +530,19 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
|
||||
return -EINVAL;
|
||||
if (!access_ok(rseq, rseq_len))
|
||||
return -EFAULT;
|
||||
|
||||
/*
|
||||
* If the rseq_cs pointer is non-NULL on registration, clear it to
|
||||
* avoid a potential segfault on return to user-space. The proper thing
|
||||
* to do would have been to fail the registration but this would break
|
||||
* older libcs that reuse the rseq area for new threads without
|
||||
* clearing the fields.
|
||||
*/
|
||||
if (rseq_get_rseq_cs_ptr_val(rseq, &rseq_cs))
|
||||
return -EFAULT;
|
||||
if (rseq_cs && clear_rseq_cs(rseq))
|
||||
return -EFAULT;
|
||||
|
||||
#ifdef CONFIG_DEBUG_RSEQ
|
||||
/*
|
||||
* Initialize the in-kernel rseq fields copy for validation of
|
||||
|
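To make the registration fix concrete, here is a hedged userspace sketch of raw rseq registration (illustrative only, not from this patch; the signature value and the glibc tunable named in the comment are assumptions, and the structures come from the UAPI header). With the change above, re-registering an rseq area whose rseq_cs still points at an old critical section gets the stale pointer cleared by the kernel instead of faulting the thread on return to userspace:

/*
 * Register an rseq area via the raw syscall and read back the cpu_id the
 * kernel maintains.  On a glibc that already registers rseq per thread this
 * fails with EBUSY unless that is disabled (e.g.
 * GLIBC_TUNABLES=glibc.pthread.rseq=0 -- tunable name from memory).
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/rseq.h>

#define MY_RSEQ_SIG 0x53053053	/* assumed signature; only abort handlers care */

/* struct rseq is declared 32-byte aligned in the UAPI header; one per thread. */
static __thread struct rseq rs;

int main(void)
{
	/* A stale rseq_cs left behind in a reused area is what the fix tolerates. */
	memset(&rs, 0, sizeof(rs));

	if (syscall(__NR_rseq, &rs, sizeof(rs), 0, MY_RSEQ_SIG)) {
		perror("rseq register");
		return 1;
	}
	printf("rseq registered, currently on CPU %u\n", rs.cpu_id);
	return 0;
}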
@ -68,9 +68,7 @@
|
||||
# include "cpufreq_schedutil.c"
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
# include "debug.c"
|
||||
#endif
|
||||
#include "debug.c"
|
||||
|
||||
#ifdef CONFIG_SCHEDSTATS
|
||||
# include "stats.c"
|
||||
|
@ -91,7 +91,6 @@
|
||||
#include "autogroup.h"
|
||||
#include "pelt.h"
|
||||
#include "smp.h"
|
||||
#include "stats.h"
|
||||
|
||||
#include "../workqueue_internal.h"
|
||||
#include "../../io_uring/io-wq.h"
|
||||
@ -119,7 +118,6 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
|
||||
|
||||
DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
/*
|
||||
* Debugging: various feature bits
|
||||
*
|
||||
@ -129,7 +127,7 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
|
||||
*/
|
||||
#define SCHED_FEAT(name, enabled) \
|
||||
(1UL << __SCHED_FEAT_##name) * enabled |
|
||||
const_debug unsigned int sysctl_sched_features =
|
||||
__read_mostly unsigned int sysctl_sched_features =
|
||||
#include "features.h"
|
||||
0;
|
||||
#undef SCHED_FEAT
|
||||
@ -143,13 +141,12 @@ const_debug unsigned int sysctl_sched_features =
|
||||
*/
|
||||
__read_mostly int sysctl_resched_latency_warn_ms = 100;
|
||||
__read_mostly int sysctl_resched_latency_warn_once = 1;
|
||||
#endif /* CONFIG_SCHED_DEBUG */
|
||||
|
||||
/*
|
||||
* Number of tasks to iterate in a single balance run.
|
||||
* Limited because this is done with IRQs disabled.
|
||||
*/
|
||||
const_debug unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
|
||||
__read_mostly unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
|
||||
|
||||
__read_mostly int scheduler_running;
|
||||
|
||||
@ -800,11 +797,10 @@ void update_rq_clock(struct rq *rq)
|
||||
if (rq->clock_update_flags & RQCF_ACT_SKIP)
|
||||
return;
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
if (sched_feat(WARN_DOUBLE_CLOCK))
|
||||
SCHED_WARN_ON(rq->clock_update_flags & RQCF_UPDATED);
|
||||
WARN_ON_ONCE(rq->clock_update_flags & RQCF_UPDATED);
|
||||
rq->clock_update_flags |= RQCF_UPDATED;
|
||||
#endif
|
||||
|
||||
clock = sched_clock_cpu(cpu_of(rq));
|
||||
scx_rq_clock_update(rq, clock);
|
||||
|
||||
@ -1720,7 +1716,7 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
|
||||
|
||||
bucket = &uc_rq->bucket[uc_se->bucket_id];
|
||||
|
||||
SCHED_WARN_ON(!bucket->tasks);
|
||||
WARN_ON_ONCE(!bucket->tasks);
|
||||
if (likely(bucket->tasks))
|
||||
bucket->tasks--;
|
||||
|
||||
@ -1740,7 +1736,7 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
|
||||
* Defensive programming: this should never happen. If it happens,
|
||||
* e.g. due to future modification, warn and fix up the expected value.
|
||||
*/
|
||||
SCHED_WARN_ON(bucket->value > rq_clamp);
|
||||
WARN_ON_ONCE(bucket->value > rq_clamp);
|
||||
if (bucket->value >= rq_clamp) {
|
||||
bkt_clamp = uclamp_rq_max_value(rq, clamp_id, uc_se->value);
|
||||
uclamp_rq_set(rq, clamp_id, bkt_clamp);
|
||||
@ -1757,7 +1753,7 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
|
||||
* The condition is constructed such that a NOP is generated when
|
||||
* sched_uclamp_used is disabled.
|
||||
*/
|
||||
if (!static_branch_unlikely(&sched_uclamp_used))
|
||||
if (!uclamp_is_used())
|
||||
return;
|
||||
|
||||
if (unlikely(!p->sched_class->uclamp_enabled))
|
||||
@ -1784,7 +1780,7 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
|
||||
* The condition is constructed such that a NOP is generated when
|
||||
* sched_uclamp_used is disabled.
|
||||
*/
|
||||
if (!static_branch_unlikely(&sched_uclamp_used))
|
||||
if (!uclamp_is_used())
|
||||
return;
|
||||
|
||||
if (unlikely(!p->sched_class->uclamp_enabled))
|
||||
@ -1942,12 +1938,12 @@ static int sysctl_sched_uclamp_handler(const struct ctl_table *table, int write,
|
||||
}
|
||||
|
||||
if (update_root_tg) {
|
||||
static_branch_enable(&sched_uclamp_used);
|
||||
sched_uclamp_enable();
|
||||
uclamp_update_root_tg();
|
||||
}
|
||||
|
||||
if (old_min_rt != sysctl_sched_uclamp_util_min_rt_default) {
|
||||
static_branch_enable(&sched_uclamp_used);
|
||||
sched_uclamp_enable();
|
||||
uclamp_sync_util_min_rt_default();
|
||||
}
|
||||
|
||||
@ -2122,7 +2118,7 @@ void activate_task(struct rq *rq, struct task_struct *p, int flags)
|
||||
|
||||
void deactivate_task(struct rq *rq, struct task_struct *p, int flags)
|
||||
{
|
||||
SCHED_WARN_ON(flags & DEQUEUE_SLEEP);
|
||||
WARN_ON_ONCE(flags & DEQUEUE_SLEEP);
|
||||
|
||||
WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
|
||||
ASSERT_EXCLUSIVE_WRITER(p->on_rq);
|
||||
@ -2727,7 +2723,7 @@ __do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
|
||||
* XXX do further audits, this smells like something putrid.
|
||||
*/
|
||||
if (ctx->flags & SCA_MIGRATE_DISABLE)
|
||||
SCHED_WARN_ON(!p->on_cpu);
|
||||
WARN_ON_ONCE(!p->on_cpu);
|
||||
else
|
||||
lockdep_assert_held(&p->pi_lock);
|
||||
|
||||
@ -3292,7 +3288,6 @@ void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
|
||||
|
||||
void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
|
||||
{
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
unsigned int state = READ_ONCE(p->__state);
|
||||
|
||||
/*
|
||||
@ -3330,7 +3325,6 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
|
||||
WARN_ON_ONCE(!cpu_online(new_cpu));
|
||||
|
||||
WARN_ON_ONCE(is_migration_disabled(p));
|
||||
#endif
|
||||
|
||||
trace_sched_migrate_task(p, new_cpu);
|
||||
|
||||
@ -4191,7 +4185,7 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
|
||||
* - we're serialized against set_special_state() by virtue of
|
||||
* it disabling IRQs (this allows not taking ->pi_lock).
|
||||
*/
|
||||
SCHED_WARN_ON(p->se.sched_delayed);
|
||||
WARN_ON_ONCE(p->se.sched_delayed);
|
||||
if (!ttwu_state_match(p, state, &success))
|
||||
goto out;
|
||||
|
||||
@ -4485,7 +4479,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
|
||||
INIT_LIST_HEAD(&p->se.group_node);
|
||||
|
||||
/* A delayed task cannot be in clone(). */
|
||||
SCHED_WARN_ON(p->se.sched_delayed);
|
||||
WARN_ON_ONCE(p->se.sched_delayed);
|
||||
|
||||
#ifdef CONFIG_FAIR_GROUP_SCHED
|
||||
p->se.cfs_rq = NULL;
|
||||
@ -5573,7 +5567,6 @@ unsigned long long task_sched_runtime(struct task_struct *p)
|
||||
return ns;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
static u64 cpu_resched_latency(struct rq *rq)
|
||||
{
|
||||
int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
|
||||
@ -5618,9 +5611,6 @@ static int __init setup_resched_latency_warn_ms(char *str)
|
||||
return 1;
|
||||
}
|
||||
__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
|
||||
#else
|
||||
static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
|
||||
#endif /* CONFIG_SCHED_DEBUG */
|
||||
|
||||
/*
|
||||
* This function gets called by the timer code, with HZ frequency.
|
||||
@ -5741,7 +5731,7 @@ static void sched_tick_remote(struct work_struct *work)
|
||||
* we are always sure that there is no proxy (only a
|
||||
* single task is running).
|
||||
*/
|
||||
SCHED_WARN_ON(rq->curr != rq->donor);
|
||||
WARN_ON_ONCE(rq->curr != rq->donor);
|
||||
update_rq_clock(rq);
|
||||
|
||||
if (!is_idle_task(curr)) {
|
||||
@ -5961,7 +5951,7 @@ static inline void schedule_debug(struct task_struct *prev, bool preempt)
|
||||
preempt_count_set(PREEMPT_DISABLED);
|
||||
}
|
||||
rcu_sleep_check();
|
||||
SCHED_WARN_ON(ct_state() == CT_STATE_USER);
|
||||
WARN_ON_ONCE(ct_state() == CT_STATE_USER);
|
||||
|
||||
profile_hit(SCHED_PROFILING, __builtin_return_address(0));
|
||||
|
||||
@ -6714,9 +6704,7 @@ static void __sched notrace __schedule(int sched_mode)
|
||||
picked:
|
||||
clear_tsk_need_resched(prev);
|
||||
clear_preempt_need_resched();
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
rq->last_seen_need_resched_ns = 0;
|
||||
#endif
|
||||
|
||||
if (likely(prev != next)) {
|
||||
rq->nr_switches++;
|
||||
@ -6807,7 +6795,7 @@ static inline void sched_submit_work(struct task_struct *tsk)
|
||||
* deadlock if the callback attempts to acquire a lock which is
|
||||
* already acquired.
|
||||
*/
|
||||
SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
|
||||
WARN_ON_ONCE(current->__state & TASK_RTLOCK_WAIT);
|
||||
|
||||
/*
|
||||
* If we are going to sleep and we have plugged IO queued,
|
||||
@ -7090,7 +7078,7 @@ asmlinkage __visible void __sched preempt_schedule_irq(void)
|
||||
int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
|
||||
void *key)
|
||||
{
|
||||
WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~(WF_SYNC|WF_CURRENT_CPU));
|
||||
WARN_ON_ONCE(wake_flags & ~(WF_SYNC|WF_CURRENT_CPU));
|
||||
return try_to_wake_up(curr->private, mode, wake_flags);
|
||||
}
|
||||
EXPORT_SYMBOL(default_wake_function);
|
||||
@ -7644,10 +7632,57 @@ PREEMPT_MODEL_ACCESSOR(lazy);
|
||||
|
||||
#else /* !CONFIG_PREEMPT_DYNAMIC: */
|
||||
|
||||
#define preempt_dynamic_mode -1
|
||||
|
||||
static inline void preempt_dynamic_init(void) { }
|
||||
|
||||
#endif /* CONFIG_PREEMPT_DYNAMIC */
|
||||
|
||||
const char *preempt_modes[] = {
|
||||
"none", "voluntary", "full", "lazy", NULL,
|
||||
};
|
||||
|
||||
const char *preempt_model_str(void)
|
||||
{
|
||||
bool brace = IS_ENABLED(CONFIG_PREEMPT_RT) &&
|
||||
(IS_ENABLED(CONFIG_PREEMPT_DYNAMIC) ||
|
||||
IS_ENABLED(CONFIG_PREEMPT_LAZY));
|
||||
static char buf[128];
|
||||
|
||||
if (IS_ENABLED(CONFIG_PREEMPT_BUILD)) {
|
||||
struct seq_buf s;
|
||||
|
||||
seq_buf_init(&s, buf, sizeof(buf));
|
||||
seq_buf_puts(&s, "PREEMPT");
|
||||
|
||||
if (IS_ENABLED(CONFIG_PREEMPT_RT))
|
||||
seq_buf_printf(&s, "%sRT%s",
|
||||
brace ? "_{" : "_",
|
||||
brace ? "," : "");
|
||||
|
||||
if (IS_ENABLED(CONFIG_PREEMPT_DYNAMIC)) {
|
||||
seq_buf_printf(&s, "(%s)%s",
|
||||
preempt_dynamic_mode > 0 ?
|
||||
preempt_modes[preempt_dynamic_mode] : "undef",
|
||||
brace ? "}" : "");
|
||||
return seq_buf_str(&s);
|
||||
}
|
||||
|
||||
if (IS_ENABLED(CONFIG_PREEMPT_LAZY)) {
|
||||
seq_buf_printf(&s, "LAZY%s",
|
||||
brace ? "}" : "");
|
||||
return seq_buf_str(&s);
|
||||
}
|
||||
|
||||
return seq_buf_str(&s);
|
||||
}
|
||||
|
||||
if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY_BUILD))
|
||||
return "VOLUNTARY";
|
||||
|
||||
return "NONE";
|
||||
}
|
||||
|
||||
int io_schedule_prepare(void)
|
||||
{
|
||||
int old_iowait = current->in_iowait;
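(Reading the preempt_model_str() helper added above, the strings it produces come out roughly as follows: a plain CONFIG_PREEMPT build yields "PREEMPT", CONFIG_PREEMPT_RT alone yields "PREEMPT_RT", a PREEMPT_DYNAMIC kernel yields "PREEMPT(full)" or "PREEMPT(voluntary)" depending on the boot-time mode, RT combined with DYNAMIC gives the braced form such as "PREEMPT_{RT,(full)}", and non-preemptible builds fall back to "VOLUNTARY" or "NONE". This is a derivation from the code shown here, not a quote from the changelog.)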
|
||||
@ -7762,10 +7797,9 @@ void show_state_filter(unsigned int state_filter)
|
||||
sched_show_task(p);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
if (!state_filter)
|
||||
sysrq_sched_debug_show();
|
||||
#endif
|
||||
|
||||
rcu_read_unlock();
|
||||
/*
|
||||
* Only show locks if all tasks are dumped:
|
||||
@ -8180,7 +8214,7 @@ static void cpuset_cpu_active(void)
|
||||
* operation in the resume sequence, just build a single sched
|
||||
* domain, ignoring cpusets.
|
||||
*/
|
||||
partition_sched_domains(1, NULL, NULL);
|
||||
cpuset_reset_sched_domains();
|
||||
if (--num_cpus_frozen)
|
||||
return;
|
||||
/*
|
||||
@ -8199,7 +8233,7 @@ static void cpuset_cpu_inactive(unsigned int cpu)
|
||||
cpuset_update_active_cpus();
|
||||
} else {
|
||||
num_cpus_frozen++;
|
||||
partition_sched_domains(1, NULL, NULL);
|
||||
cpuset_reset_sched_domains();
|
||||
}
|
||||
}
|
||||
|
||||
@ -8421,9 +8455,9 @@ void __init sched_init_smp(void)
|
||||
* CPU masks are stable and all blatant races in the below code cannot
|
||||
* happen.
|
||||
*/
|
||||
mutex_lock(&sched_domains_mutex);
|
||||
sched_domains_mutex_lock();
|
||||
sched_init_domains(cpu_active_mask);
|
||||
mutex_unlock(&sched_domains_mutex);
|
||||
sched_domains_mutex_unlock();
|
||||
|
||||
/* Move init over to a non-isolated CPU */
|
||||
if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
|
||||
@ -9185,7 +9219,7 @@ static void cpu_util_update_eff(struct cgroup_subsys_state *css)
|
||||
unsigned int clamps;
|
||||
|
||||
lockdep_assert_held(&uclamp_mutex);
|
||||
SCHED_WARN_ON(!rcu_read_lock_held());
|
||||
WARN_ON_ONCE(!rcu_read_lock_held());
|
||||
|
||||
css_for_each_descendant_pre(css, top_css) {
|
||||
uc_parent = css_tg(css)->parent
|
||||
@ -9277,7 +9311,7 @@ static ssize_t cpu_uclamp_write(struct kernfs_open_file *of, char *buf,
|
||||
if (req.ret)
|
||||
return req.ret;
|
||||
|
||||
static_branch_enable(&sched_uclamp_used);
|
||||
sched_uclamp_enable();
|
||||
|
||||
guard(mutex)(&uclamp_mutex);
|
||||
guard(rcu)();
|
||||
@ -10520,7 +10554,7 @@ static void task_mm_cid_work(struct callback_head *work)
|
||||
struct mm_struct *mm;
|
||||
int weight, cpu;
|
||||
|
||||
SCHED_WARN_ON(t != container_of(work, struct task_struct, cid_work));
|
||||
WARN_ON_ONCE(t != container_of(work, struct task_struct, cid_work));
|
||||
|
||||
work->next = work; /* Prevent double-add */
|
||||
if (t->flags & PF_EXITING)
|
||||
|
@ -65,7 +65,7 @@ static unsigned long sched_core_update_cookie(struct task_struct *p,
|
||||
* a cookie until after we've removed it, we must have core scheduling
|
||||
* enabled here.
|
||||
*/
|
||||
SCHED_WARN_ON((p->core_cookie || cookie) && !sched_core_enabled(rq));
|
||||
WARN_ON_ONCE((p->core_cookie || cookie) && !sched_core_enabled(rq));
|
||||
|
||||
if (sched_core_enqueued(p))
|
||||
sched_core_dequeue(rq, p, DEQUEUE_SAVE);
|
||||
|
@ -166,14 +166,14 @@ static inline unsigned long dl_bw_capacity(int i)
|
||||
}
|
||||
}
|
||||
|
||||
static inline bool dl_bw_visited(int cpu, u64 gen)
|
||||
bool dl_bw_visited(int cpu, u64 cookie)
|
||||
{
|
||||
struct root_domain *rd = cpu_rq(cpu)->rd;
|
||||
|
||||
if (rd->visit_gen == gen)
|
||||
if (rd->visit_cookie == cookie)
|
||||
return true;
|
||||
|
||||
rd->visit_gen = gen;
|
||||
rd->visit_cookie = cookie;
|
||||
return false;
|
||||
}
|
||||
|
||||
@ -207,7 +207,7 @@ static inline unsigned long dl_bw_capacity(int i)
|
||||
return SCHED_CAPACITY_SCALE;
|
||||
}
|
||||
|
||||
static inline bool dl_bw_visited(int cpu, u64 gen)
|
||||
bool dl_bw_visited(int cpu, u64 cookie)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
@ -249,8 +249,8 @@ void __add_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
|
||||
|
||||
lockdep_assert_rq_held(rq_of_dl_rq(dl_rq));
|
||||
dl_rq->running_bw += dl_bw;
|
||||
SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
|
||||
SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
|
||||
WARN_ON_ONCE(dl_rq->running_bw < old); /* overflow */
|
||||
WARN_ON_ONCE(dl_rq->running_bw > dl_rq->this_bw);
|
||||
/* kick cpufreq (see the comment in kernel/sched/sched.h). */
|
||||
cpufreq_update_util(rq_of_dl_rq(dl_rq), 0);
|
||||
}
|
||||
@ -262,7 +262,7 @@ void __sub_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
|
||||
|
||||
lockdep_assert_rq_held(rq_of_dl_rq(dl_rq));
|
||||
dl_rq->running_bw -= dl_bw;
|
||||
SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
|
||||
WARN_ON_ONCE(dl_rq->running_bw > old); /* underflow */
|
||||
if (dl_rq->running_bw > old)
|
||||
dl_rq->running_bw = 0;
|
||||
/* kick cpufreq (see the comment in kernel/sched/sched.h). */
|
||||
@ -276,7 +276,7 @@ void __add_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
|
||||
|
||||
lockdep_assert_rq_held(rq_of_dl_rq(dl_rq));
|
||||
dl_rq->this_bw += dl_bw;
|
||||
SCHED_WARN_ON(dl_rq->this_bw < old); /* overflow */
|
||||
WARN_ON_ONCE(dl_rq->this_bw < old); /* overflow */
|
||||
}
|
||||
|
||||
static inline
|
||||
@ -286,10 +286,10 @@ void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
|
||||
|
||||
lockdep_assert_rq_held(rq_of_dl_rq(dl_rq));
|
||||
dl_rq->this_bw -= dl_bw;
|
||||
SCHED_WARN_ON(dl_rq->this_bw > old); /* underflow */
|
||||
WARN_ON_ONCE(dl_rq->this_bw > old); /* underflow */
|
||||
if (dl_rq->this_bw > old)
|
||||
dl_rq->this_bw = 0;
|
||||
SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
|
||||
WARN_ON_ONCE(dl_rq->running_bw > dl_rq->this_bw);
|
||||
}
|
||||
|
||||
static inline
|
||||
@ -2956,7 +2956,7 @@ void dl_add_task_root_domain(struct task_struct *p)
|
||||
struct dl_bw *dl_b;
|
||||
|
||||
raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
|
||||
if (!dl_task(p)) {
|
||||
if (!dl_task(p) || dl_entity_is_special(&p->dl)) {
|
||||
raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
|
||||
return;
|
||||
}
|
||||
@ -2981,18 +2981,22 @@ void dl_clear_root_domain(struct root_domain *rd)
|
||||
rd->dl_bw.total_bw = 0;
|
||||
|
||||
/*
|
||||
* dl_server bandwidth is only restored when CPUs are attached to root
|
||||
* domains (after domains are created or CPUs moved back to the
|
||||
* default root doamin).
|
||||
* dl_servers are not tasks. Since dl_add_task_root_domain ignores
|
||||
* them, we need to account for them here explicitly.
|
||||
*/
|
||||
for_each_cpu(i, rd->span) {
|
||||
struct sched_dl_entity *dl_se = &cpu_rq(i)->fair_server;
|
||||
|
||||
if (dl_server(dl_se) && cpu_active(i))
|
||||
rd->dl_bw.total_bw += dl_se->dl_bw;
|
||||
__dl_add(&rd->dl_bw, dl_se->dl_bw, dl_bw_cpus(i));
|
||||
}
|
||||
}
|
||||
|
||||
void dl_clear_root_domain_cpu(int cpu)
|
||||
{
|
||||
dl_clear_root_domain(cpu_rq(cpu)->rd);
|
||||
}
|
||||
|
||||
#endif /* CONFIG_SMP */
|
||||
|
||||
static void switched_from_dl(struct rq *rq, struct task_struct *p)
|
||||
@ -3171,15 +3175,18 @@ DEFINE_SCHED_CLASS(dl) = {
|
||||
#endif
|
||||
};
|
||||
|
||||
/* Used for dl_bw check and update, used under sched_rt_handler()::mutex */
|
||||
static u64 dl_generation;
|
||||
/*
|
||||
* Used for dl_bw check and update, used under sched_rt_handler()::mutex and
|
||||
* sched_domains_mutex.
|
||||
*/
|
||||
u64 dl_cookie;
|
||||
|
||||
int sched_dl_global_validate(void)
|
||||
{
|
||||
u64 runtime = global_rt_runtime();
|
||||
u64 period = global_rt_period();
|
||||
u64 new_bw = to_ratio(period, runtime);
|
||||
u64 gen = ++dl_generation;
|
||||
u64 cookie = ++dl_cookie;
|
||||
struct dl_bw *dl_b;
|
||||
int cpu, cpus, ret = 0;
|
||||
unsigned long flags;
|
||||
@ -3192,7 +3199,7 @@ int sched_dl_global_validate(void)
|
||||
for_each_online_cpu(cpu) {
|
||||
rcu_read_lock_sched();
|
||||
|
||||
if (dl_bw_visited(cpu, gen))
|
||||
if (dl_bw_visited(cpu, cookie))
|
||||
goto next;
|
||||
|
||||
dl_b = dl_bw_of(cpu);
|
||||
@ -3229,7 +3236,7 @@ static void init_dl_rq_bw_ratio(struct dl_rq *dl_rq)
|
||||
void sched_dl_do_global(void)
|
||||
{
|
||||
u64 new_bw = -1;
|
||||
u64 gen = ++dl_generation;
|
||||
u64 cookie = ++dl_cookie;
|
||||
struct dl_bw *dl_b;
|
||||
int cpu;
|
||||
unsigned long flags;
|
||||
@ -3240,7 +3247,7 @@ void sched_dl_do_global(void)
|
||||
for_each_possible_cpu(cpu) {
|
||||
rcu_read_lock_sched();
|
||||
|
||||
if (dl_bw_visited(cpu, gen)) {
|
||||
if (dl_bw_visited(cpu, cookie)) {
|
||||
rcu_read_unlock_sched();
|
||||
continue;
|
||||
}
|
||||
@ -3567,9 +3574,7 @@ void dl_bw_free(int cpu, u64 dl_bw)
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
void print_dl_stats(struct seq_file *m, int cpu)
|
||||
{
|
||||
print_dl_rq(m, cpu, &cpu_rq(cpu)->dl);
|
||||
}
|
||||
#endif /* CONFIG_SCHED_DEBUG */
|
||||
|
@ -244,11 +244,13 @@ static ssize_t sched_dynamic_write(struct file *filp, const char __user *ubuf,
|
||||
|
||||
static int sched_dynamic_show(struct seq_file *m, void *v)
|
||||
{
|
||||
static const char * preempt_modes[] = {
|
||||
"none", "voluntary", "full", "lazy",
|
||||
};
|
||||
int j = ARRAY_SIZE(preempt_modes) - !IS_ENABLED(CONFIG_ARCH_HAS_PREEMPT_LAZY);
|
||||
int i = IS_ENABLED(CONFIG_PREEMPT_RT) * 2;
|
||||
int j;
|
||||
|
||||
/* Count entries in NULL terminated preempt_modes */
|
||||
for (j = 0; preempt_modes[j]; j++)
|
||||
;
|
||||
j -= !IS_ENABLED(CONFIG_ARCH_HAS_PREEMPT_LAZY);
|
||||
|
||||
for (; i < j; i++) {
|
||||
if (preempt_dynamic_mode == i)
|
||||
@ -292,7 +294,7 @@ static ssize_t sched_verbose_write(struct file *filp, const char __user *ubuf,
|
||||
bool orig;
|
||||
|
||||
cpus_read_lock();
|
||||
mutex_lock(&sched_domains_mutex);
|
||||
sched_domains_mutex_lock();
|
||||
|
||||
orig = sched_debug_verbose;
|
||||
result = debugfs_write_file_bool(filp, ubuf, cnt, ppos);
|
||||
@ -304,7 +306,7 @@ static ssize_t sched_verbose_write(struct file *filp, const char __user *ubuf,
|
||||
sd_dentry = NULL;
|
||||
}
|
||||
|
||||
mutex_unlock(&sched_domains_mutex);
|
||||
sched_domains_mutex_unlock();
|
||||
cpus_read_unlock();
|
||||
|
||||
return result;
|
||||
@ -515,9 +517,9 @@ static __init int sched_init_debug(void)
|
||||
debugfs_create_u32("migration_cost_ns", 0644, debugfs_sched, &sysctl_sched_migration_cost);
|
||||
debugfs_create_u32("nr_migrate", 0644, debugfs_sched, &sysctl_sched_nr_migrate);
|
||||
|
||||
mutex_lock(&sched_domains_mutex);
|
||||
sched_domains_mutex_lock();
|
||||
update_sched_domain_debugfs();
|
||||
mutex_unlock(&sched_domains_mutex);
|
||||
sched_domains_mutex_unlock();
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_NUMA_BALANCING
|
||||
|
@ -2472,7 +2472,7 @@ static bool task_can_run_on_remote_rq(struct task_struct *p, struct rq *rq,
|
||||
{
|
||||
int cpu = cpu_of(rq);
|
||||
|
||||
SCHED_WARN_ON(task_cpu(p) == cpu);
|
||||
WARN_ON_ONCE(task_cpu(p) == cpu);
|
||||
|
||||
/*
|
||||
* If @p has migration disabled, @p->cpus_ptr is updated to contain only
|
||||
|
@ -74,12 +74,12 @@ unsigned int sysctl_sched_tunable_scaling = SCHED_TUNABLESCALING_LOG;
|
||||
/*
|
||||
* Minimal preemption granularity for CPU-bound tasks:
|
||||
*
|
||||
* (default: 0.75 msec * (1 + ilog(ncpus)), units: nanoseconds)
|
||||
* (default: 0.70 msec * (1 + ilog(ncpus)), units: nanoseconds)
|
||||
*/
|
||||
unsigned int sysctl_sched_base_slice = 750000ULL;
|
||||
static unsigned int normalized_sysctl_sched_base_slice = 750000ULL;
|
||||
unsigned int sysctl_sched_base_slice = 700000ULL;
|
||||
static unsigned int normalized_sysctl_sched_base_slice = 700000ULL;
|
||||
|
||||
const_debug unsigned int sysctl_sched_migration_cost = 500000UL;
|
||||
__read_mostly unsigned int sysctl_sched_migration_cost = 500000UL;
|
||||
|
||||
static int __init setup_sched_thermal_decay_shift(char *str)
|
||||
{
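(A quick worked example of the hunk above, based only on the comment's formula: the effective default slice is the base value scaled by (1 + ilog(ncpus)), so on a 4-CPU system, where ilog(4) = 2, the default drops from 0.75 ms x 3 = 2.25 ms to 0.70 ms x 3 = 2.10 ms — this is the "reduce the default slice to avoid tasks getting an extra tick" item from the merge summary.)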
|
||||
@ -399,7 +399,7 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
|
||||
|
||||
static inline void assert_list_leaf_cfs_rq(struct rq *rq)
|
||||
{
|
||||
SCHED_WARN_ON(rq->tmp_alone_branch != &rq->leaf_cfs_rq_list);
|
||||
WARN_ON_ONCE(rq->tmp_alone_branch != &rq->leaf_cfs_rq_list);
|
||||
}
|
||||
|
||||
/* Iterate through all leaf cfs_rq's on a runqueue */
|
||||
@ -696,7 +696,7 @@ static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
|
||||
{
|
||||
s64 vlag, limit;
|
||||
|
||||
SCHED_WARN_ON(!se->on_rq);
|
||||
WARN_ON_ONCE(!se->on_rq);
|
||||
|
||||
vlag = avg_vruntime(cfs_rq) - se->vruntime;
|
||||
limit = calc_delta_fair(max_t(u64, 2*se->slice, TICK_NSEC), se);
|
||||
@ -883,6 +883,26 @@ struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
|
||||
return __node_2_se(left);
|
||||
}
|
||||
|
||||
/*
|
||||
* HACK, stash a copy of deadline at the point of pick in vlag,
|
||||
* which isn't used until dequeue.
|
||||
*/
|
||||
static inline void set_protect_slice(struct sched_entity *se)
|
||||
{
|
||||
se->vlag = se->deadline;
|
||||
}
|
||||
|
||||
static inline bool protect_slice(struct sched_entity *se)
|
||||
{
|
||||
return se->vlag == se->deadline;
|
||||
}
|
||||
|
||||
static inline void cancel_protect_slice(struct sched_entity *se)
|
||||
{
|
||||
if (protect_slice(se))
|
||||
se->vlag = se->deadline + 1;
|
||||
}
|
||||
|
||||
/*
|
||||
* Earliest Eligible Virtual Deadline First
|
||||
*
|
||||
@@ -919,11 +939,7 @@ static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
 	if (curr && (!curr->on_rq || !entity_eligible(cfs_rq, curr)))
 		curr = NULL;

-	/*
-	 * Once selected, run a task until it either becomes non-eligible or
-	 * until it gets a new slice. See the HACK in set_next_entity().
-	 */
-	if (sched_feat(RUN_TO_PARITY) && curr && curr->vlag == curr->deadline)
+	if (sched_feat(RUN_TO_PARITY) && curr && protect_slice(curr))
 		return curr;

 	/* Pick the leftmost entity if it's eligible */
@@ -967,7 +983,6 @@ found:
 	return best;
 }

-#ifdef CONFIG_SCHED_DEBUG
 struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq)
 {
 	struct rb_node *last = rb_last(&cfs_rq->tasks_timeline.rb_root);
@@ -994,7 +1009,6 @@ int sched_update_scaling(void)
 	return 0;
 }
 #endif
-#endif

 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se);

@@ -3301,7 +3315,7 @@ static void task_numa_work(struct callback_head *work)
 	bool vma_pids_skipped;
 	bool vma_pids_forced = false;

-	SCHED_WARN_ON(p != container_of(work, struct task_struct, numa_work));
+	WARN_ON_ONCE(p != container_of(work, struct task_struct, numa_work));

 	work->next = work;
 	/*
@@ -4020,7 +4034,7 @@ static inline bool load_avg_is_decayed(struct sched_avg *sa)
	 * Make sure that rounding and/or propagation of PELT values never
	 * break this.
	 */
-	SCHED_WARN_ON(sa->load_avg ||
+	WARN_ON_ONCE(sa->load_avg ||
		      sa->util_avg ||
		      sa->runnable_avg);

@@ -5444,7 +5458,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	clear_buddies(cfs_rq, se);

 	if (flags & DEQUEUE_DELAYED) {
-		SCHED_WARN_ON(!se->sched_delayed);
+		WARN_ON_ONCE(!se->sched_delayed);
 	} else {
 		bool delay = sleep;
 		/*
@@ -5454,7 +5468,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		if (flags & DEQUEUE_SPECIAL)
 			delay = false;

-		SCHED_WARN_ON(delay && se->sched_delayed);
+		WARN_ON_ONCE(delay && se->sched_delayed);

 		if (sched_feat(DELAY_DEQUEUE) && delay &&
 		    !entity_eligible(cfs_rq, se)) {
@@ -5530,15 +5544,12 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		update_stats_wait_end_fair(cfs_rq, se);
 		__dequeue_entity(cfs_rq, se);
 		update_load_avg(cfs_rq, se, UPDATE_TG);
-		/*
-		 * HACK, stash a copy of deadline at the point of pick in vlag,
-		 * which isn't used until dequeue.
-		 */
-		se->vlag = se->deadline;
+
+		set_protect_slice(se);
 	}

 	update_stats_curr_start(cfs_rq, se);
-	SCHED_WARN_ON(cfs_rq->curr);
+	WARN_ON_ONCE(cfs_rq->curr);
 	cfs_rq->curr = se;

 	/*
@@ -5579,7 +5590,7 @@ pick_next_entity(struct rq *rq, struct cfs_rq *cfs_rq)
 	if (sched_feat(PICK_BUDDY) &&
 	    cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next)) {
 		/* ->next will never be delayed */
-		SCHED_WARN_ON(cfs_rq->next->sched_delayed);
+		WARN_ON_ONCE(cfs_rq->next->sched_delayed);
 		return cfs_rq->next;
 	}

@@ -5615,7 +5626,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 		/* in !on_rq case, update occurred at dequeue */
 		update_load_avg(cfs_rq, prev, 0);
 	}
-	SCHED_WARN_ON(cfs_rq->curr != prev);
+	WARN_ON_ONCE(cfs_rq->curr != prev);
 	cfs_rq->curr = NULL;
 }

@@ -5838,7 +5849,7 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)

 		cfs_rq->throttled_clock_self = 0;

-		if (SCHED_WARN_ON((s64)delta < 0))
+		if (WARN_ON_ONCE((s64)delta < 0))
 			delta = 0;

 		cfs_rq->throttled_clock_self_time += delta;
@@ -5858,7 +5869,7 @@ static int tg_throttle_down(struct task_group *tg, void *data)
 		cfs_rq->throttled_clock_pelt = rq_clock_pelt(rq);
 		list_del_leaf_cfs_rq(cfs_rq);

-		SCHED_WARN_ON(cfs_rq->throttled_clock_self);
+		WARN_ON_ONCE(cfs_rq->throttled_clock_self);
 		if (cfs_rq->nr_queued)
 			cfs_rq->throttled_clock_self = rq_clock(rq);
 	}
@@ -5967,7 +5978,7 @@ done:
	 * throttled-list.  rq->lock protects completion.
	 */
 	cfs_rq->throttled = 1;
-	SCHED_WARN_ON(cfs_rq->throttled_clock);
+	WARN_ON_ONCE(cfs_rq->throttled_clock);
 	if (cfs_rq->nr_queued)
 		cfs_rq->throttled_clock = rq_clock(rq);
 	return true;
@@ -6123,7 +6134,7 @@ static inline void __unthrottle_cfs_rq_async(struct cfs_rq *cfs_rq)
 	}

 	/* Already enqueued */
-	if (SCHED_WARN_ON(!list_empty(&cfs_rq->throttled_csd_list)))
+	if (WARN_ON_ONCE(!list_empty(&cfs_rq->throttled_csd_list)))
 		return;

 	first = list_empty(&rq->cfsb_csd_list);
@@ -6142,7 +6153,7 @@ static void unthrottle_cfs_rq_async(struct cfs_rq *cfs_rq)
 {
 	lockdep_assert_rq_held(rq_of(cfs_rq));

-	if (SCHED_WARN_ON(!cfs_rq_throttled(cfs_rq) ||
+	if (WARN_ON_ONCE(!cfs_rq_throttled(cfs_rq) ||
 			  cfs_rq->runtime_remaining <= 0))
 		return;

@@ -6178,7 +6189,7 @@ static bool distribute_cfs_runtime(struct cfs_bandwidth *cfs_b)
 			goto next;

 		/* By the above checks, this should never be true */
-		SCHED_WARN_ON(cfs_rq->runtime_remaining > 0);
+		WARN_ON_ONCE(cfs_rq->runtime_remaining > 0);

 		raw_spin_lock(&cfs_b->lock);
 		runtime = -cfs_rq->runtime_remaining + 1;
@@ -6199,7 +6210,7 @@ static bool distribute_cfs_runtime(struct cfs_bandwidth *cfs_b)
			 * We currently only expect to be unthrottling
			 * a single cfs_rq locally.
			 */
-			SCHED_WARN_ON(!list_empty(&local_unthrottle));
+			WARN_ON_ONCE(!list_empty(&local_unthrottle));
 			list_add_tail(&cfs_rq->throttled_csd_list,
 				      &local_unthrottle);
 		}
@@ -6224,7 +6235,7 @@ next:

 		rq_unlock_irqrestore(rq, &rf);
 	}
-	SCHED_WARN_ON(!list_empty(&local_unthrottle));
+	WARN_ON_ONCE(!list_empty(&local_unthrottle));

 	rcu_read_unlock();

@@ -6776,7 +6787,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
 {
 	struct sched_entity *se = &p->se;

-	SCHED_WARN_ON(task_rq(p) != rq);
+	WARN_ON_ONCE(task_rq(p) != rq);

 	if (rq->cfs.h_nr_queued > 1) {
 		u64 ran = se->sum_exec_runtime - se->prev_sum_exec_runtime;
@@ -6887,8 +6898,8 @@ requeue_delayed_entity(struct sched_entity *se)
	 * Because a delayed entity is one that is still on
	 * the runqueue competing until elegibility.
	 */
-	SCHED_WARN_ON(!se->sched_delayed);
-	SCHED_WARN_ON(!se->on_rq);
+	WARN_ON_ONCE(!se->sched_delayed);
+	WARN_ON_ONCE(!se->on_rq);

 	if (sched_feat(DELAY_ZERO)) {
 		update_entity_lag(cfs_rq, se);
@@ -6991,6 +7002,8 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		update_cfs_group(se);

 		se->slice = slice;
+		if (se != cfs_rq->curr)
+			min_vruntime_cb_propagate(&se->run_node, NULL);
 		slice = cfs_rq_min_slice(cfs_rq);

 		cfs_rq->h_nr_runnable += h_nr_runnable;
@@ -7120,6 +7133,8 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 		update_cfs_group(se);

 		se->slice = slice;
+		if (se != cfs_rq->curr)
+			min_vruntime_cb_propagate(&se->run_node, NULL);
 		slice = cfs_rq_min_slice(cfs_rq);

 		cfs_rq->h_nr_runnable -= h_nr_runnable;
@@ -7144,8 +7159,8 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 		rq->next_balance = jiffies;

 	if (p && task_delayed) {
-		SCHED_WARN_ON(!task_sleep);
-		SCHED_WARN_ON(p->on_rq != 1);
+		WARN_ON_ONCE(!task_sleep);
+		WARN_ON_ONCE(p->on_rq != 1);

 		/* Fix-up what dequeue_task_fair() skipped */
 		hrtick_update(rq);
@@ -8723,7 +8738,7 @@ static inline void set_task_max_allowed_capacity(struct task_struct *p) {}
 static void set_next_buddy(struct sched_entity *se)
 {
 	for_each_sched_entity(se) {
-		if (SCHED_WARN_ON(!se->on_rq))
+		if (WARN_ON_ONCE(!se->on_rq))
 			return;
 		if (se_is_idle(se))
 			return;
@@ -8783,8 +8798,15 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
	 * Preempt an idle entity in favor of a non-idle entity (and don't preempt
	 * in the inverse case).
	 */
-	if (cse_is_idle && !pse_is_idle)
+	if (cse_is_idle && !pse_is_idle) {
+		/*
+		 * When non-idle entity preempt an idle entity,
+		 * don't give idle entity slice protection.
+		 */
+		cancel_protect_slice(se);
 		goto preempt;
+	}

 	if (cse_is_idle != pse_is_idle)
 		return;
@@ -8803,8 +8825,8 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
	 * Note that even if @p does not turn out to be the most eligible
	 * task at this moment, current's slice protection will be lost.
	 */
-	if (do_preempt_short(cfs_rq, pse, se) && se->vlag == se->deadline)
-		se->vlag = se->deadline + 1;
+	if (do_preempt_short(cfs_rq, pse, se))
+		cancel_protect_slice(se);

 	/*
	 * If @p has become the most eligible task, force preemption.
@@ -9417,12 +9439,11 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 			return 0;

 		/* Prevent to re-select dst_cpu via env's CPUs: */
-		for_each_cpu_and(cpu, env->dst_grpmask, env->cpus) {
-			if (cpumask_test_cpu(cpu, p->cpus_ptr)) {
-				env->flags |= LBF_DST_PINNED;
-				env->new_dst_cpu = cpu;
-				break;
-			}
+		cpu = cpumask_first_and_and(env->dst_grpmask, env->cpus, p->cpus_ptr);
+
+		if (cpu < nr_cpu_ids) {
+			env->flags |= LBF_DST_PINNED;
+			env->new_dst_cpu = cpu;
 		}

 		return 0;
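Aside: the refactor above replaces the explicit loop with cpumask_first_and_and(), which returns the first CPU set in the intersection of the three masks, or a value >= nr_cpu_ids when the intersection is empty. A standalone sketch of that semantic using a plain 64-bit word in place of struct cpumask (illustration only, not the kernel API):

#include <stdint.h>
#include <stdio.h>

#define NR_CPU_IDS 64

/* First set bit of (a & b & c), or NR_CPU_IDS if the intersection is empty. */
static unsigned int first_and_and(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t m = a & b & c;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (m & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

int main(void)
{
	uint64_t dst_grpmask = 0x0f;	/* CPUs 0-3 */
	uint64_t env_cpus    = 0x0c;	/* CPUs 2-3 */
	uint64_t cpus_ptr    = 0x08;	/* CPU 3 only */
	unsigned int cpu = first_and_and(dst_grpmask, env_cpus, cpus_ptr);

	if (cpu < NR_CPU_IDS)
		printf("new_dst_cpu = %u\n", cpu);	/* prints 3 */
	else
		printf("no eligible dst CPU\n");
	return 0;
}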
@@ -12461,7 +12482,7 @@ unlock:

 void nohz_balance_exit_idle(struct rq *rq)
 {
-	SCHED_WARN_ON(rq != this_rq());
+	WARN_ON_ONCE(rq != this_rq());

 	if (likely(!rq->nohz_tick_stopped))
 		return;
@@ -12497,7 +12518,7 @@ void nohz_balance_enter_idle(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);

-	SCHED_WARN_ON(cpu != smp_processor_id());
+	WARN_ON_ONCE(cpu != smp_processor_id());

 	/* If this CPU is going down, then nothing needs to be done: */
 	if (!cpu_active(cpu))
@@ -12580,7 +12601,7 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags)
 	int balance_cpu;
 	struct rq *rq;

-	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
+	WARN_ON_ONCE((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);

 	/*
	 * We assume there will be no idle load after this update and clear
@@ -13020,7 +13041,7 @@ bool cfs_prio_less(const struct task_struct *a, const struct task_struct *b,
 	struct cfs_rq *cfs_rqb;
 	s64 delta;

-	SCHED_WARN_ON(task_rq(b)->core != rq->core);
+	WARN_ON_ONCE(task_rq(b)->core != rq->core);

 #ifdef CONFIG_FAIR_GROUP_SCHED
 	/*
@@ -13223,7 +13244,7 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)

 static void switched_to_fair(struct rq *rq, struct task_struct *p)
 {
-	SCHED_WARN_ON(p->se.sched_delayed);
+	WARN_ON_ONCE(p->se.sched_delayed);

 	attach_task_cfs_rq(p);

@@ -13258,7 +13279,7 @@ static void __set_next_task_fair(struct rq *rq, struct task_struct *p, bool firs
 	if (!first)
 		return;

-	SCHED_WARN_ON(se->sched_delayed);
+	WARN_ON_ONCE(se->sched_delayed);

 	if (hrtick_enabled_fair(rq))
 		hrtick_start_fair(rq, p);
@@ -13645,7 +13666,6 @@ DEFINE_SCHED_CLASS(fair) = {
 #endif
 };

-#ifdef CONFIG_SCHED_DEBUG
 void print_cfs_stats(struct seq_file *m, int cpu)
 {
 	struct cfs_rq *cfs_rq, *pos;
@@ -13679,7 +13699,6 @@ void show_numa_stats(struct task_struct *p, struct seq_file *m)
 	rcu_read_unlock();
 }
 #endif /* CONFIG_NUMA_BALANCING */
-#endif /* CONFIG_SCHED_DEBUG */

 __init void init_sched_fair_class(void)
 {
@@ -169,9 +169,8 @@ static void destroy_rt_bandwidth(struct rt_bandwidth *rt_b)

 static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
 {
-#ifdef CONFIG_SCHED_DEBUG
 	WARN_ON_ONCE(!rt_entity_is_task(rt_se));
-#endif
+
 	return container_of(rt_se, struct task_struct, rt);
 }

@@ -1713,7 +1712,7 @@ static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq)
 	BUG_ON(idx >= MAX_RT_PRIO);

 	queue = array->queue + idx;
-	if (SCHED_WARN_ON(list_empty(queue)))
+	if (WARN_ON_ONCE(list_empty(queue)))
 		return NULL;
 	next = list_entry(queue->next, struct sched_rt_entity, run_list);

@@ -2910,6 +2909,7 @@ static int sched_rt_handler(const struct ctl_table *table, int write, void *buff
 	int ret;

 	mutex_lock(&mutex);
+	sched_domains_mutex_lock();
 	old_period = sysctl_sched_rt_period;
 	old_runtime = sysctl_sched_rt_runtime;

@@ -2936,6 +2936,7 @@ undo:
 		sysctl_sched_rt_period = old_period;
 		sysctl_sched_rt_runtime = old_runtime;
 	}
+	sched_domains_mutex_unlock();
 	mutex_unlock(&mutex);

 	return ret;
@@ -2967,7 +2968,6 @@ static int sched_rr_handler(const struct ctl_table *table, int write, void *buff
 }
 #endif /* CONFIG_SYSCTL */

-#ifdef CONFIG_SCHED_DEBUG
 void print_rt_stats(struct seq_file *m, int cpu)
 {
 	rt_rq_iter_t iter;
@@ -2978,4 +2978,3 @@ void print_rt_stats(struct seq_file *m, int cpu)
 		print_rt_rq(m, cpu, rt_rq);
 	rcu_read_unlock();
 }
-#endif /* CONFIG_SCHED_DEBUG */
@@ -91,12 +91,6 @@ struct cpuidle_state;
 #include "cpupri.h"
 #include "cpudeadline.h"

-#ifdef CONFIG_SCHED_DEBUG
-# define SCHED_WARN_ON(x)	WARN_ONCE(x, #x)
-#else
-# define SCHED_WARN_ON(x)	({ (void)(x), 0; })
-#endif
-
 /* task_struct::on_rq states: */
 #define TASK_ON_RQ_QUEUED	1
 #define TASK_ON_RQ_MIGRATING	2
@@ -998,7 +992,7 @@ struct root_domain {
	 * Also, some corner cases, like 'wrap around' is dangerous, but given
	 * that u64 is 'big enough'. So that shouldn't be a concern.
	 */
-	u64 visit_gen;
+	u64 visit_cookie;

 #ifdef HAVE_RT_PUSH_IPI
 	/*
@@ -1180,10 +1174,8 @@ struct rq {

 	atomic_t		nr_iowait;

-#ifdef CONFIG_SCHED_DEBUG
 	u64			last_seen_need_resched_ns;
 	int			ticks_without_resched;
-#endif

 #ifdef CONFIG_MEMBARRIER
 	int			membarrier_state;
@@ -1571,7 +1563,7 @@ static inline void update_idle_core(struct rq *rq) { }

 static inline struct task_struct *task_of(struct sched_entity *se)
 {
-	SCHED_WARN_ON(!entity_is_task(se));
+	WARN_ON_ONCE(!entity_is_task(se));
 	return container_of(se, struct task_struct, se);
 }

@@ -1652,7 +1644,7 @@ static inline void assert_clock_updated(struct rq *rq)
	 * The only reason for not seeing a clock update since the
	 * last rq_pin_lock() is if we're currently skipping updates.
	 */
-	SCHED_WARN_ON(rq->clock_update_flags < RQCF_ACT_SKIP);
+	WARN_ON_ONCE(rq->clock_update_flags < RQCF_ACT_SKIP);
 }

 static inline u64 rq_clock(struct rq *rq)
@@ -1699,7 +1691,7 @@ static inline void rq_clock_cancel_skipupdate(struct rq *rq)
 static inline void rq_clock_start_loop_update(struct rq *rq)
 {
 	lockdep_assert_rq_held(rq);
-	SCHED_WARN_ON(rq->clock_update_flags & RQCF_ACT_SKIP);
+	WARN_ON_ONCE(rq->clock_update_flags & RQCF_ACT_SKIP);
 	rq->clock_update_flags |= RQCF_ACT_SKIP;
 }

@@ -1712,14 +1704,12 @@ static inline void rq_clock_stop_loop_update(struct rq *rq)
 struct rq_flags {
 	unsigned long flags;
 	struct pin_cookie cookie;
-#ifdef CONFIG_SCHED_DEBUG
 	/*
	 * A copy of (rq::clock_update_flags & RQCF_UPDATED) for the
	 * current pin context is stashed here in case it needs to be
	 * restored in rq_repin_lock().
	 */
 	unsigned int clock_update_flags;
-#endif
 };

 extern struct balance_callback balance_push_callback;
@@ -1770,21 +1760,18 @@ static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
 {
 	rf->cookie = lockdep_pin_lock(__rq_lockp(rq));

-#ifdef CONFIG_SCHED_DEBUG
 	rq->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP);
 	rf->clock_update_flags = 0;
-# ifdef CONFIG_SMP
-	SCHED_WARN_ON(rq->balance_callback && rq->balance_callback != &balance_push_callback);
-# endif
+#ifdef CONFIG_SMP
+	WARN_ON_ONCE(rq->balance_callback && rq->balance_callback != &balance_push_callback);
+#endif
 }

 static inline void rq_unpin_lock(struct rq *rq, struct rq_flags *rf)
 {
-#ifdef CONFIG_SCHED_DEBUG
 	if (rq->clock_update_flags > RQCF_ACT_SKIP)
 		rf->clock_update_flags = RQCF_UPDATED;
-#endif

 	scx_rq_clock_invalidate(rq);
 	lockdep_unpin_lock(__rq_lockp(rq), rf->cookie);
 }
@@ -1793,12 +1780,10 @@ static inline void rq_repin_lock(struct rq *rq, struct rq_flags *rf)
 {
 	lockdep_repin_lock(__rq_lockp(rq), rf->cookie);

-#ifdef CONFIG_SCHED_DEBUG
 	/*
	 * Restore the value we stashed in @rf for this pin context.
	 */
 	rq->clock_update_flags |= rf->clock_update_flags;
-#endif
 }

 extern
@@ -2072,9 +2057,7 @@ struct sched_group_capacity {
 	unsigned long		next_update;
 	int			imbalance;	/* XXX unrelated to capacity but shared group state */

-#ifdef CONFIG_SCHED_DEBUG
 	int			id;
-#endif

 	unsigned long		cpumask[];	/* Balance mask */
 };
@@ -2114,13 +2097,8 @@ static inline struct cpumask *group_balance_mask(struct sched_group *sg)

 extern int group_balance_cpu(struct sched_group *sg);

-#ifdef CONFIG_SCHED_DEBUG
 extern void update_sched_domain_debugfs(void);
 extern void dirty_sched_domain_sysctl(int cpu);
-#else
-static inline void update_sched_domain_debugfs(void) { }
-static inline void dirty_sched_domain_sysctl(int cpu) { }
-#endif

 extern int sched_update_scaling(void);

@@ -2200,13 +2178,8 @@ static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
 }

 /*
- * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
+ * Tunables:
  */
-#ifdef CONFIG_SCHED_DEBUG
-# define const_debug __read_mostly
-#else
-# define const_debug const
-#endif

 #define SCHED_FEAT(name, enabled)	\
 	__SCHED_FEAT_##name ,
@@ -2218,13 +2191,11 @@ enum {

 #undef SCHED_FEAT

-#ifdef CONFIG_SCHED_DEBUG
-
 /*
  * To support run-time toggling of sched features, all the translation units
  * (but core.c) reference the sysctl_sched_features defined in core.c.
  */
-extern const_debug unsigned int sysctl_sched_features;
+extern __read_mostly unsigned int sysctl_sched_features;

 #ifdef CONFIG_JUMP_LABEL

@@ -2246,24 +2217,6 @@ extern struct static_key sched_feat_keys[__SCHED_FEAT_NR];

 #endif /* !CONFIG_JUMP_LABEL */

-#else /* !SCHED_DEBUG: */
-
-/*
- * Each translation unit has its own copy of sysctl_sched_features to allow
- * constants propagation at compile time and compiler optimization based on
- * features default.
- */
-#define SCHED_FEAT(name, enabled)	\
-	(1UL << __SCHED_FEAT_##name) * enabled |
-static const_debug __maybe_unused unsigned int sysctl_sched_features =
-#include "features.h"
-	0;
-#undef SCHED_FEAT
-
-#define sched_feat(x) !!(sysctl_sched_features & (1UL << __SCHED_FEAT_##x))
-
-#endif /* !SCHED_DEBUG */
-
 extern struct static_key_false sched_numa_balancing;
 extern struct static_key_false sched_schedstats;

|
||||
|
||||
static inline struct cpuidle_state *idle_get_state(struct rq *rq)
|
||||
{
|
||||
SCHED_WARN_ON(!rcu_read_lock_held());
|
||||
WARN_ON_ONCE(!rcu_read_lock_held());
|
||||
|
||||
return rq->idle_state;
|
||||
}
|
||||
@ -2843,12 +2796,11 @@ extern void wakeup_preempt(struct rq *rq, struct task_struct *p, int flags);
|
||||
# define SCHED_NR_MIGRATE_BREAK 32
|
||||
#endif
|
||||
|
||||
extern const_debug unsigned int sysctl_sched_nr_migrate;
|
||||
extern const_debug unsigned int sysctl_sched_migration_cost;
|
||||
extern __read_mostly unsigned int sysctl_sched_nr_migrate;
|
||||
extern __read_mostly unsigned int sysctl_sched_migration_cost;
|
||||
|
||||
extern unsigned int sysctl_sched_base_slice;
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
extern int sysctl_resched_latency_warn_ms;
|
||||
extern int sysctl_resched_latency_warn_once;
|
||||
|
||||
@ -2859,7 +2811,6 @@ extern unsigned int sysctl_numa_balancing_scan_period_min;
|
||||
extern unsigned int sysctl_numa_balancing_scan_period_max;
|
||||
extern unsigned int sysctl_numa_balancing_scan_size;
|
||||
extern unsigned int sysctl_numa_balancing_hot_threshold;
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SCHED_HRTICK
|
||||
|
||||
@ -2932,7 +2883,6 @@ unsigned long arch_scale_freq_capacity(int cpu)
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
/*
|
||||
* In double_lock_balance()/double_rq_lock(), we use raw_spin_rq_lock() to
|
||||
* acquire rq lock instead of rq_lock(). So at the end of these two functions
|
||||
@ -2947,9 +2897,6 @@ static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2)
|
||||
rq2->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP);
|
||||
#endif
|
||||
}
|
||||
#else
|
||||
static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2) { }
|
||||
#endif
|
||||
|
||||
#define DEFINE_LOCK_GUARD_2(name, type, _lock, _unlock, ...) \
|
||||
__DEFINE_UNLOCK_GUARD(name, type, _unlock, type *lock2; __VA_ARGS__) \
|
||||
@ -3162,7 +3109,6 @@ extern struct sched_entity *__pick_root_entity(struct cfs_rq *cfs_rq);
|
||||
extern struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq);
|
||||
extern struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq);
|
||||
|
||||
#ifdef CONFIG_SCHED_DEBUG
|
||||
extern bool sched_debug_verbose;
|
||||
|
||||
extern void print_cfs_stats(struct seq_file *m, int cpu);
|
||||
@ -3173,15 +3119,12 @@ extern void print_rt_rq(struct seq_file *m, int cpu, struct rt_rq *rt_rq);
|
||||
extern void print_dl_rq(struct seq_file *m, int cpu, struct dl_rq *dl_rq);
|
||||
|
||||
extern void resched_latency_warn(int cpu, u64 latency);
|
||||
# ifdef CONFIG_NUMA_BALANCING
|
||||
#ifdef CONFIG_NUMA_BALANCING
|
||||
extern void show_numa_stats(struct task_struct *p, struct seq_file *m);
|
||||
extern void
|
||||
print_numa_stats(struct seq_file *m, int node, unsigned long tsf,
|
||||
unsigned long tpf, unsigned long gsf, unsigned long gpf);
|
||||
# endif /* CONFIG_NUMA_BALANCING */
|
||||
#else /* !CONFIG_SCHED_DEBUG: */
|
||||
static inline void resched_latency_warn(int cpu, u64 latency) { }
|
||||
#endif /* !CONFIG_SCHED_DEBUG */
|
||||
#endif /* CONFIG_NUMA_BALANCING */
|
||||
|
||||
extern void init_cfs_rq(struct cfs_rq *cfs_rq);
|
||||
extern void init_rt_rq(struct rt_rq *rt_rq);
|
||||
@@ -3394,6 +3337,31 @@ static inline bool update_other_load_avgs(struct rq *rq) { return false; }

 unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);

+/*
+ * When uclamp is compiled in, the aggregation at rq level is 'turned off'
+ * by default in the fast path and only gets turned on once userspace performs
+ * an operation that requires it.
+ *
+ * Returns true if userspace opted-in to use uclamp and aggregation at rq level
+ * hence is active.
+ */
+static inline bool uclamp_is_used(void)
+{
+	return static_branch_likely(&sched_uclamp_used);
+}
+
+/*
+ * Enabling static branches would get the cpus_read_lock(),
+ * check whether uclamp_is_used before enable it to avoid always
+ * calling cpus_read_lock(). Because we never disable this
+ * static key once enable it.
+ */
+static inline void sched_uclamp_enable(void)
+{
+	if (!uclamp_is_used())
+		static_branch_enable(&sched_uclamp_used);
+}
+
 static inline unsigned long uclamp_rq_get(struct rq *rq,
 					  enum uclamp_id clamp_id)
 {
@@ -3417,7 +3385,7 @@ static inline bool uclamp_rq_is_capped(struct rq *rq)
 	unsigned long rq_util;
 	unsigned long max_util;

-	if (!static_branch_likely(&sched_uclamp_used))
+	if (!uclamp_is_used())
 		return false;

 	rq_util = cpu_util_cfs(cpu_of(rq)) + cpu_util_rt(rq);
@@ -3426,19 +3394,6 @@ static inline bool uclamp_rq_is_capped(struct rq *rq)
 	return max_util != SCHED_CAPACITY_SCALE && rq_util >= max_util;
 }

-/*
- * When uclamp is compiled in, the aggregation at rq level is 'turned off'
- * by default in the fast path and only gets turned on once userspace performs
- * an operation that requires it.
- *
- * Returns true if userspace opted-in to use uclamp and aggregation at rq level
- * hence is active.
- */
-static inline bool uclamp_is_used(void)
-{
-	return static_branch_likely(&sched_uclamp_used);
-}
-
 #define for_each_clamp_id(clamp_id) \
 	for ((clamp_id) = 0; (clamp_id) < UCLAMP_CNT; (clamp_id)++)

@@ -3486,6 +3441,8 @@ static inline bool uclamp_is_used(void)
 	return false;
 }

+static inline void sched_uclamp_enable(void) {}
+
 static inline unsigned long
 uclamp_rq_get(struct rq *rq, enum uclamp_id clamp_id)
 {
@@ -3619,6 +3576,7 @@ extern int preempt_dynamic_mode;
 extern int sched_dynamic_mode(const char *str);
 extern void sched_dynamic_update(int mode);
 #endif
+extern const char *preempt_modes[];

 #ifdef CONFIG_SCHED_MM_CID

@@ -144,7 +144,7 @@ static inline void psi_enqueue(struct task_struct *p, int flags)

 	if (p->se.sched_delayed) {
 		/* CPU migration of "sleeping" task */
-		SCHED_WARN_ON(!(flags & ENQUEUE_MIGRATED));
+		WARN_ON_ONCE(!(flags & ENQUEUE_MIGRATED));
 		if (p->in_memstall)
 			set |= TSK_MEMSTALL;
 		if (p->in_iowait)
@@ -368,7 +368,7 @@ static int uclamp_validate(struct task_struct *p,
	 * blocking operation which obviously cannot be done while holding
	 * scheduler locks.
	 */
-	static_branch_enable(&sched_uclamp_used);
+	sched_uclamp_enable();

 	return 0;
 }
@@ -875,7 +875,7 @@ do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
 {
 	struct sched_param lparam;

-	if (!param || pid < 0)
+	if (unlikely(!param || pid < 0))
 		return -EINVAL;
 	if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
 		return -EFAULT;
@@ -984,7 +984,7 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
 	struct sched_attr attr;
 	int retval;

-	if (!uattr || pid < 0 || flags)
+	if (unlikely(!uattr || pid < 0 || flags))
 		return -EINVAL;

 	retval = sched_copy_attr(uattr, &attr);
@@ -1049,7 +1049,7 @@ SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
 	struct task_struct *p;
 	int retval;

-	if (!param || pid < 0)
+	if (unlikely(!param || pid < 0))
 		return -EINVAL;

 	scoped_guard (rcu) {
@@ -1085,8 +1085,8 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	struct task_struct *p;
 	int retval;

-	if (!uattr || pid < 0 || usize > PAGE_SIZE ||
-	    usize < SCHED_ATTR_SIZE_VER0 || flags)
+	if (unlikely(!uattr || pid < 0 || usize > PAGE_SIZE ||
+		     usize < SCHED_ATTR_SIZE_VER0 || flags))
 		return -EINVAL;

 	scoped_guard (rcu) {
@@ -6,13 +6,19 @@
 #include <linux/bsearch.h>

 DEFINE_MUTEX(sched_domains_mutex);
+void sched_domains_mutex_lock(void)
+{
+	mutex_lock(&sched_domains_mutex);
+}
+void sched_domains_mutex_unlock(void)
+{
+	mutex_unlock(&sched_domains_mutex);
+}

 /* Protected by sched_domains_mutex: */
 static cpumask_var_t sched_domains_tmpmask;
 static cpumask_var_t sched_domains_tmpmask2;

-#ifdef CONFIG_SCHED_DEBUG
-
 static int __init sched_debug_setup(char *str)
 {
	sched_debug_verbose = true;
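Aside: the wrappers above let code outside the scheduler serialize against domain rebuilds without the mutex itself having to be visible to every caller. A minimal user-space sketch of the same pattern, assuming pthreads as a stand-in for the kernel mutex API (names are illustrative, not the kernel's):

#include <pthread.h>
#include <stdio.h>

/* The lock stays private to this translation unit ... */
static pthread_mutex_t domains_mutex = PTHREAD_MUTEX_INITIALIZER;

/* ... and only these wrappers are visible to callers elsewhere. */
void domains_mutex_lock(void)   { pthread_mutex_lock(&domains_mutex); }
void domains_mutex_unlock(void) { pthread_mutex_unlock(&domains_mutex); }

int main(void)
{
	domains_mutex_lock();
	printf("rebuild domains under the lock\n");
	domains_mutex_unlock();
	return 0;
}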
@@ -151,15 +157,6 @@ static void sched_domain_debug(struct sched_domain *sd, int cpu)
 		break;
 	}
 }
-#else /* !CONFIG_SCHED_DEBUG */
-
-# define sched_debug_verbose 0
-# define sched_domain_debug(sd, cpu) do { } while (0)
-static inline bool sched_debug(void)
-{
-	return false;
-}
-#endif /* CONFIG_SCHED_DEBUG */

 /* Generate a mask of SD flags with the SDF_NEEDS_GROUPS metaflag */
 #define SD_FLAG(name, mflags) (name * !!((mflags) & SDF_NEEDS_GROUPS)) |
@@ -560,7 +557,7 @@ static int init_rootdomain(struct root_domain *rd)
 	rd->rto_push_work = IRQ_WORK_INIT_HARD(rto_push_irq_work_func);
 #endif

-	rd->visit_gen = 0;
+	rd->visit_cookie = 0;
 	init_dl_bw(&rd->dl_bw);
 	if (cpudl_init(&rd->cpudl) != 0)
 		goto free_rto_mask;
@@ -2275,9 +2272,7 @@ static int __sdt_alloc(const struct cpumask *cpu_map)
 			if (!sgc)
 				return -ENOMEM;

-#ifdef CONFIG_SCHED_DEBUG
 			sgc->id = j;
-#endif

 			*per_cpu_ptr(sdd->sgc, j) = sgc;
 		}
@@ -2680,7 +2675,7 @@ static int dattrs_equal(struct sched_domain_attr *cur, int idx_cur,
  *
  * Call with hotplug lock and sched_domains_mutex held
  */
-void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
+static void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
 				    struct sched_domain_attr *dattr_new)
 {
 	bool __maybe_unused has_eas = false;
@@ -2712,21 +2707,8 @@ void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
 	for (i = 0; i < ndoms_cur; i++) {
 		for (j = 0; j < n && !new_topology; j++) {
 			if (cpumask_equal(doms_cur[i], doms_new[j]) &&
-			    dattrs_equal(dattr_cur, i, dattr_new, j)) {
-				struct root_domain *rd;
-
-				/*
-				 * This domain won't be destroyed and as such
-				 * its dl_bw->total_bw needs to be cleared.
-				 * Tasks contribution will be then recomputed
-				 * in function dl_update_tasks_root_domain(),
-				 * dl_servers contribution in function
-				 * dl_restore_server_root_domain().
-				 */
-				rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;
-				dl_clear_root_domain(rd);
-
+			    dattrs_equal(dattr_cur, i, dattr_new, j))
 				goto match1;
-			}
 		}
 		/* No match - a current sched domain not in new doms_new[] */
 		detach_destroy_domains(doms_cur[i]);
@@ -2783,6 +2765,7 @@ match3:
 	ndoms_cur = ndoms_new;

 	update_sched_domain_debugfs();
+	dl_rebuild_rd_accounting();
 }

 /*
@@ -2791,7 +2774,7 @@ match3:
 void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
 			     struct sched_domain_attr *dattr_new)
 {
-	mutex_lock(&sched_domains_mutex);
+	sched_domains_mutex_lock();
 	partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
-	mutex_unlock(&sched_domains_mutex);
+	sched_domains_mutex_unlock();
 }
@@ -4100,12 +4100,7 @@ print_trace_header(struct seq_file *m, struct trace_iterator *iter)
 		   entries,
 		   total,
 		   buf->cpu,
-		   preempt_model_none()      ? "server" :
-		   preempt_model_voluntary() ? "desktop" :
-		   preempt_model_full()      ? "preempt" :
-		   preempt_model_lazy()      ? "lazy"    :
-		   preempt_model_rt()        ? "preempt_rt" :
-		   "unknown",
+		   preempt_model_str(),
 		   /* These are reserved for later use */
 		   0, 0, 0, 0);
 #ifdef CONFIG_SMP
@@ -1321,15 +1321,6 @@ endmenu # "Debug lockups and hangs"

 menu "Scheduler Debugging"

-config SCHED_DEBUG
-	bool "Collect scheduler debugging info"
-	depends on DEBUG_KERNEL && DEBUG_FS
-	default y
-	help
-	  If you say Y here, the /sys/kernel/debug/sched file will be provided
-	  that can help debug the scheduler. The runtime overhead of this
-	  option is minimal.
-
 config SCHED_INFO
 	bool
 	default n
@@ -54,7 +54,7 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
  */
 void dump_stack_print_info(const char *log_lvl)
 {
-	printk("%sCPU: %d UID: %u PID: %d Comm: %.20s %s%s %s %.*s" BUILD_ID_FMT "\n",
+	printk("%sCPU: %d UID: %u PID: %d Comm: %.20s %s%s %s %.*s %s " BUILD_ID_FMT "\n",
 	       log_lvl, raw_smp_processor_id(),
 	       __kuid_val(current_real_cred()->euid),
 	       current->pid, current->comm,
@@ -62,7 +62,7 @@ void dump_stack_print_info(const char *log_lvl)
 	       print_tainted(),
 	       init_utsname()->release,
 	       (int)strcspn(init_utsname()->version, " "),
-	       init_utsname()->version, BUILD_ID_VAL);
+	       init_utsname()->version, preempt_model_str(), BUILD_ID_VAL);

 	if (get_taint())
 		printk("%s%s\n", log_lvl, print_tainted_verbose());
tools/testing/selftests/rseq/.gitignore (vendored)
@@ -9,3 +9,4 @@ param_test_compare_twice
 param_test_mm_cid
 param_test_mm_cid_benchmark
 param_test_mm_cid_compare_twice
+syscall_errors_test

@@ -16,11 +16,12 @@ OVERRIDE_TARGETS = 1

 TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \
 		param_test_benchmark param_test_compare_twice param_test_mm_cid \
-		param_test_mm_cid_benchmark param_test_mm_cid_compare_twice
+		param_test_mm_cid_benchmark param_test_mm_cid_compare_twice \
+		syscall_errors_test

 TEST_GEN_PROGS_EXTENDED = librseq.so

-TEST_PROGS = run_param_test.sh
+TEST_PROGS = run_param_test.sh run_syscall_errors_test.sh

 TEST_FILES := settings

@@ -54,3 +55,7 @@ $(OUTPUT)/param_test_mm_cid_benchmark: param_test.c $(TEST_GEN_PROGS_EXTENDED) \
 $(OUTPUT)/param_test_mm_cid_compare_twice: param_test.c $(TEST_GEN_PROGS_EXTENDED) \
 					rseq.h rseq-*.h
	$(CC) $(CFLAGS) -DBUILDOPT_RSEQ_PERCPU_MM_CID -DRSEQ_COMPARE_TWICE $< $(LDLIBS) -lrseq -o $@
+
+$(OUTPUT)/syscall_errors_test: syscall_errors_test.c $(TEST_GEN_PROGS_EXTENDED) \
+					rseq.h rseq-*.h
+	$(CC) $(CFLAGS) $< $(LDLIBS) -lrseq -o $@
@@ -71,9 +71,20 @@ static int rseq_ownership;
 /* Original struct rseq allocation size is 32 bytes. */
 #define ORIG_RSEQ_ALLOC_SIZE	32

+/*
+ * Use a union to ensure we allocate a TLS area of 1024 bytes to accomodate an
+ * rseq registration that is larger than the current rseq ABI.
+ */
+union rseq_tls {
+	struct rseq_abi abi;
+	char dummy[RSEQ_THREAD_AREA_ALLOC_SIZE];
+};
+
 static
-__thread struct rseq_abi __rseq_abi __attribute__((tls_model("initial-exec"), aligned(RSEQ_THREAD_AREA_ALLOC_SIZE))) = {
-	.cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
+__thread union rseq_tls __rseq __attribute__((tls_model("initial-exec"))) = {
+	.abi = {
+		.cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
+	},
 };

 static int sys_rseq(struct rseq_abi *rseq_abi, uint32_t rseq_len,
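Aside: the union above makes the thread-local object as large as its biggest member, so a kernel that writes an extended rseq area past the 32-byte ABI struct still stays inside the registered allocation. A standalone sketch of the sizing trick with hypothetical names (RSEQ_THREAD_AREA_ALLOC_SIZE is 1024 in the selftests; the struct below is a stand-in, not the real rseq ABI):

#include <stdio.h>

#define THREAD_AREA_ALLOC_SIZE 1024	/* mirrors RSEQ_THREAD_AREA_ALLOC_SIZE */

/* Stand-in for struct rseq_abi: only 32 bytes on its own. */
struct small_abi {
	unsigned int cpu_id_start;
	unsigned int cpu_id;
	unsigned long long rseq_cs;
	unsigned int flags;
	char padding[12];
};

/* The union is as large as its largest member, i.e. 1024 bytes. */
union tls_area {
	struct small_abi abi;
	char dummy[THREAD_AREA_ALLOC_SIZE];
};

_Static_assert(sizeof(union tls_area) == THREAD_AREA_ALLOC_SIZE,
	       "TLS area padded to the full allocation size");

static __thread union tls_area my_area;	/* per-thread, like __rseq in the diff */

int main(void)
{
	printf("sizeof(struct small_abi) = %zu\n", sizeof(struct small_abi));
	printf("sizeof(union tls_area)   = %zu\n", sizeof(union tls_area));
	my_area.abi.cpu_id = 0;
	return 0;
}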
@@ -87,7 +98,7 @@ static int sys_getcpu(unsigned *cpu, unsigned *node)
 	return syscall(__NR_getcpu, cpu, node, NULL);
 }

-int rseq_available(void)
+bool rseq_available(void)
 {
 	int rc;

@@ -96,9 +107,9 @@ int rseq_available(void)
 		abort();
 	switch (errno) {
 	case ENOSYS:
-		return 0;
+		return false;
 	case EINVAL:
-		return 1;
+		return true;
 	default:
 		abort();
 	}
@@ -149,7 +160,7 @@ int rseq_register_current_thread(void)
 		/* Treat libc's ownership as a successful registration. */
 		return 0;
 	}
-	rc = sys_rseq(&__rseq_abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG);
+	rc = sys_rseq(&__rseq.abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG);
 	if (rc) {
 		/*
		 * After at least one thread has registered successfully
@@ -183,7 +194,7 @@ int rseq_unregister_current_thread(void)
 		/* Treat libc's ownership as a successful unregistration. */
 		return 0;
 	}
-	rc = sys_rseq(&__rseq_abi, get_rseq_min_alloc_size(), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
+	rc = sys_rseq(&__rseq.abi, get_rseq_min_alloc_size(), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
 	if (rc)
 		return -1;
 	return 0;
@@ -249,7 +260,7 @@ void rseq_init(void)
 	rseq_ownership = 1;

 	/* Calculate the offset of the rseq area from the thread pointer. */
-	rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
+	rseq_offset = (void *)&__rseq.abi - rseq_thread_pointer();

 	/* rseq flags are deprecated, always set to 0. */
 	rseq_flags = 0;
@@ -159,6 +159,11 @@ int32_t rseq_fallback_current_cpu(void);
  */
 int32_t rseq_fallback_current_node(void);

+/*
+ * Returns true if rseq is supported.
+ */
+bool rseq_available(void);
+
 /*
  * Values returned can be either the current CPU number, -1 (rseq is
  * uninitialized), or -2 (rseq initialization has failed).
tools/testing/selftests/rseq/run_syscall_errors_test.sh (new executable file)
@@ -0,0 +1,5 @@
+#!/bin/bash
+# SPDX-License-Identifier: MIT
+# SPDX-FileCopyrightText: 2024 Michael Jeanson <mjeanson@efficios.com>
+
+GLIBC_TUNABLES="${GLIBC_TUNABLES:-}:glibc.pthread.rseq=0" ./syscall_errors_test
tools/testing/selftests/rseq/syscall_errors_test.c (new file)
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: MIT
+// SPDX-FileCopyrightText: 2024 Michael Jeanson <mjeanson@efficios.com>
+
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif
+
+#include <assert.h>
+#include <stdint.h>
+#include <syscall.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "rseq.h"
+
+static int sys_rseq(void *rseq_abi, uint32_t rseq_len,
+		    int flags, uint32_t sig)
+{
+	return syscall(__NR_rseq, rseq_abi, rseq_len, flags, sig);
+}
+
+/*
+ * Check the value of errno on some expected failures of the rseq syscall.
+ */
+
+int main(void)
+{
+	struct rseq_abi *global_rseq = rseq_get_abi();
+	int ret;
+	int errno_copy;
+
+	if (!rseq_available()) {
+		fprintf(stderr, "rseq syscall unavailable");
+		goto error;
+	}
+
+	/* The current thread is NOT registered. */
+
+	/* EINVAL */
+	errno = 0;
+	ret = sys_rseq(global_rseq, 32, -1, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Registration with invalid flag fails with errno set to EINVAL (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EINVAL)
+		goto error;
+
+	errno = 0;
+	ret = sys_rseq((char *) global_rseq + 1, 32, 0, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Registration with unaligned rseq_abi fails with errno set to EINVAL (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EINVAL)
+		goto error;
+
+	errno = 0;
+	ret = sys_rseq(global_rseq, 31, 0, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Registration with invalid size fails with errno set to EINVAL (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EINVAL)
+		goto error;
+
+
+#if defined(__LP64__) && (!defined(__s390__) && !defined(__s390x__))
+	/*
+	 * We haven't found a reliable way to find an invalid address when
+	 * running a 32bit userspace on a 64bit kernel, so only run this test
+	 * on 64bit builds for the moment.
+	 *
+	 * Also exclude architectures that select
+	 * CONFIG_ALTERNATE_USER_ADDRESS_SPACE where the kernel and userspace
+	 * have their own address space and this failure can't happen.
+	 */
+
+	/* EFAULT */
+	errno = 0;
+	ret = sys_rseq((void *) -4096UL, 32, 0, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Registration with invalid address fails with errno set to EFAULT (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EFAULT)
+		goto error;
+#endif
+
+	errno = 0;
+	ret = sys_rseq(global_rseq, 32, 0, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Registration succeeds for the current thread (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret != 0 && errno != 0)
+		goto error;
+
+	/* The current thread is registered. */
+
+	/* EBUSY */
+	errno = 0;
+	ret = sys_rseq(global_rseq, 32, 0, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Double registration fails with errno set to EBUSY (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EBUSY)
+		goto error;
+
+	/* EPERM */
+	errno = 0;
+	ret = sys_rseq(global_rseq, 32, RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG + 1);
+	errno_copy = errno;
+	fprintf(stderr, "Unregistration with wrong RSEQ_SIG fails with errno to EPERM (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EPERM)
+		goto error;
+
+	errno = 0;
+	ret = sys_rseq(global_rseq, 32, RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Unregistration succeeds for the current thread (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret != 0)
+		goto error;
+
+	errno = 0;
+	ret = sys_rseq(global_rseq, 32, RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
+	errno_copy = errno;
+	fprintf(stderr, "Double unregistration fails with errno set to EINVAL (ret = %d, errno = %s)\n", ret, strerrorname_np(errno_copy));
+	if (ret == 0 || errno_copy != EINVAL)
+		goto error;
+
+	return 0;
+error:
+	return -1;
+}
@@ -1 +1 @@
-CONFIG_SCHED_DEBUG=y
+# empty

@@ -1,4 +1,3 @@
-CONFIG_SCHED_DEBUG=y
 CONFIG_SCHED_CLASS_EXT=y
 CONFIG_CGROUPS=y
 CONFIG_CGROUP_SCHED=y

@@ -27,7 +27,6 @@ CONFIG_DEBUG_KMEMLEAK=y
 CONFIG_DEBUG_STACK_USAGE=y
 CONFIG_DEBUG_SHIRQ=y
 CONFIG_WQ_WATCHDOG=y
-CONFIG_SCHED_DEBUG=y
 CONFIG_SCHED_INFO=y
 CONFIG_SCHEDSTATS=y
 CONFIG_SCHED_STACK_END_CHECK=y