1. 15 Jun, 2021 22 commits
    • Philippe Gerum's avatar
      ARM: irq_pipeline: preserve lockdep irq state across kernel entries · 4c475643
      Philippe Gerum authored
      When lockdep is enabled, per_cpu(hardirqs_enabled) may not be in sync
      with the in-band stall bit on kernel entry upon preemption by an
      IRQ. This may happen since the relevant local_irq_* calls do not
      manipulate the stall bit and the lockdep irq state atomically (for a
      good reason). In addition, the raw_local_irq API may be used directly,
      without lockdep tracking whatsoever (e.g. when manipulating raw spinlocks).
      As a result, the kernel may observe a stalled in-band stage, with
      per_cpu(hardirqs_enabled) not mirroring the interrupt state, e.g.:
      /* in-band, irqs_disabled=0, percpu(hardirqs_enabled)=1 */
      raw_local_irq_irqsave /* e.g. raw_spin_lock */
      /* irqs_disabled=1, percpu(hardirqs_enabled)=1 */
      	trace_hardirqs_off_pipelined /* on entry */
      		/* irqs_disabled=1, percpu(hardirqs_enabled)=0 */
      	trace_hardirqs_on_pipelined /* irqs_disabled on exit -> skips trace_hardirqs_on */
      	</IRQ> /* percpu(hardirqs_enabled) still 0 */
      WARN_ON(lockdep_assert_irqs_enabled()); /* TRIGGERS! */
      kentry_enter_pipelined and kentry_exit_pipelined are introduced to
      preserve the full irq state for the in-band stage across a kernel
      entry (IRQ and fault), which comprises the stall bit and the
      lockdep irq state (per_cpu(hardirqs_enabled)).
      These helpers are normally called from the kernel entry/exit code in
      the asm section by architectures which do not use the generic kernel
      entry code, in order to save the interrupt and lockdep states for the
      in-band stage on entry, restoring them when leaving the kernel.
      While at it, the pipelined fault entry/exit routines are simplified
      by relying on these helpers for preserving the virtual interrupt
      state across the fault handling code.
      This fixes random kernel splats with CONFIG_PROVE_LOCKING enabled,
      such as:
      [   25.735750] WARNING: CPU: 0 PID: 65 at kernel/softirq.c:175 __local_bh_enable_ip+0x1e4/0x264
      [   25.747380] Modules linked in:
      [   25.750529] CPU: 0 PID: 65 Comm: kworker/u3:1 Not tainted 5.10.42-00593-g5753d0a33341-dirty #5
      [   25.759307] Hardware name: Generic AM33XX (Flattened Device Tree)
      [   25.765463] IRQ stage: Linux
      [   25.768473] Workqueue: xprtiod xs_stream_data_receive_workfn
      [   25.774237] [<c030fe14>] (unwind_backtrace) from [<c030c3f8>] (show_stack+0x10/0x14)
      [   25.782129] [<c030c3f8>] (show_stack) from [<c033cf30>] (__warn+0x118/0x11c)
      [   25.789317] [<c033cf30>] (__warn) from [<c033cfe4>] (warn_slowpath_fmt+0xb0/0xb8)
      [   25.796944] [<c033cfe4>] (warn_slowpath_fmt) from [<c0342cc0>] (__local_bh_enable_ip+0x1e4/0x264)
      [   25.805904] [<c0342cc0>] (__local_bh_enable_ip) from [<c0f778b0>] (tcp_recvmsg+0x31c/0xa54)
      [   25.814402] [<c0f778b0>] (tcp_recvmsg) from [<c0fb17a8>] (inet_recvmsg+0x48/0x70)
      [   25.822024] [<c0fb17a8>] (inet_recvmsg) from [<c1072b90>] (xs_sock_recvmsg.constprop.9+0x24/0x40)
      [   25.831042] [<c1072b90>] (xs_sock_recvmsg.constprop.9) from [<c1073e34>] (xs_stream_data_receive_workfn+0xe0/0x630)
      [   25.841652] [<c1073e34>] (xs_stream_data_receive_workfn) from [<c035b008>] (process_one_work+0x2f8/0x7b4)
      [   25.851367] [<c035b008>] (process_one_work) from [<c035b508>] (worker_thread+0x44/0x594)
      [   25.859605] [<c035b508>] (worker_thread) from [<c0361c6c>] (kthread+0x16c/0x184)
      [   25.867142] [<c0361c6c>] (kthread) from [<c0300184>] (ret_from_fork+0x14/0x30)
      [   25.874431] Exception stack(0xc406ffb0 to 0xc406fff8)
      [   25.879610] ffa0:                                     00000000 00000000 00000000 00000000
      [   25.887946] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [   25.896197] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
      [   25.902936] irq event stamp: 81142
      [   25.906460] hardirqs last  enabled at (81152): [<c0394420>] console_unlock+0x374/0x5cc
      [   25.914451] hardirqs last disabled at (81159): [<c03943f0>] console_unlock+0x344/0x5cc
      [   25.922516] softirqs last  enabled at (80912): [<c0ec18a4>] lock_sock_nested+0x30/0x84
      [   25.930572] softirqs last disabled at (80915): [<c0ec494c>] release_sock+0x18/0x98
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      lockdep: irq_pipeline: fix ambiguous naming · 14d1ae2f
      Philippe Gerum authored
      lockdep_save/restore_irqs_state might convey the wrong information:
      this is not about saving+disabling then conditionally re-enabling the
      tracked state, but merely about reading/writing such state
      unconditionally. Let's switch to unambiguous names.
    • Philippe Gerum's avatar
      genirq: irq_pipeline: synchronize log on irq exit to kernel · 8e9eadfe
      Philippe Gerum authored
      We must make sure to play any IRQ which might be pending in the
      in-band log before leaving an interrupt frame for a preempted kernel
      context.
      This completes "irq_pipeline: Account for stage migration across
      faults", so that we synchronize the log once the in-band stage is
      unstalled. In addition, we make sure to do this before
      preempt_schedule_irq() runs, so that we will not miss any rescheduling
      request which might have been triggered by some IRQ we just played.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      Suggested-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Philippe Gerum's avatar
      irq_pipeline: genirq: Mark inband stalled also on exceptions · d0c1c621
      Philippe Gerum authored
      Make sure that inband is marked stalled when entering from user mode
      upon taking an exception.
      This affects x86, which is currently the only arch using the generic
      irqentry_enter_from_user_mode on exceptions. It fixes this lockdep
      splat:
      WARNING: CPU: 2 PID: 1477 at ../kernel/locking/lockdep.c:4129 lockdep_hardirqs_on_prepare+0x160/0x1a0
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Philippe Gerum's avatar
      irq_pipeline: genirq: fix order of hard+virtual irq disable sequence · 19ca214a
      Philippe Gerum authored
      The way local_irq_disable_full() works may inadvertently cause
      interrupt events to lag in the interrupt log if the code path does
      not synchronize that log afterwards. As a result, some interrupts may
      not get played when they should, causing breakage.
      Since calling inband_irq_disable() with hard irqs off is deemed ok
      (unlike with inband_irq_enable()), invert the two operations so that
      hard irqs are disabled before the in-band stage is stalled, preventing
      any interrupt from being logged in between.
      See https://xenomai.org/pipermail/xenomai/2021-June/045476.html.
      This fixes the issue discussed in that thread.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      Reported-by: Florian Bezdeka <florian.bezdeka@siemens.com>
    • Jan Kiszka's avatar
      x86: dovetail: Fix TS flag reservation · 8994faca
      Jan Kiszka authored
      We had an overlap with compat flags so that, e.g., TS_COMPAT_RESTART
      made a 32-bit standard task also a dovetail one.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Jan Kiszka's avatar
      irq_pipeline: Warn when calling irqentry_enter with oob stalled · 9dd32928
      Jan Kiszka authored
      Something must have gone wrong if we enter for an IRQ or an exception
      over oob with this stage stalled. Warn about this when debugging is
      enabled.
      Suggested-by: Philippe Gerum <rpm@xenomai.org>
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Jan Kiszka's avatar
      irq_pipeline: Account for stage migration across faults · ce4b7c6e
      Jan Kiszka authored
      We need to unstall the inband stage when we entered for a fault over
      OOB and then migrated to inband. So far we kept the inband stage
      stalled, thereby corrupting state.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Jan Kiszka's avatar
      irq_pipeline: Clean up stage_info field and users · bf1038cf
      Jan Kiszka authored
      This field represents mutually exclusive states. Encode them as an
      enum and test against its values, rather than against state bits that
      would suggest they could be combined.
      Also flip the inverted naming of INBAND_STALLED vs. INBAND_UNSTALLED:
      certain actions need to be taken on exit only when we entered under
      INBAND_UNSTALLED.
      Finally, document the stage_info field of irqentry_state.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Philippe Gerum's avatar
      evl/trace: fix thread mode printer · 65c0db97
      Philippe Gerum authored
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/trace: do not trace trace helpers · 189cd35a
      Philippe Gerum authored
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/irq: fix SMP kernel boot on uniprocessor machine · 3995461c
      Philippe Gerum authored
      We may be running an SMP kernel on a uniprocessor machine whose
      interrupt controller supports no IPI. We should attempt to hook IPIs
      only if the hardware can support multiple CPUs; otherwise doing so is
      unneeded and bound to fail.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/sched: refine tracepoints · 54848cde
      Philippe Gerum authored
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/syscall: remove indirection via pointer table · 8d1858b0
      Philippe Gerum authored
      We have only very few syscalls, so prefer a plain switch to a pointer
      indirection, which ends up being fairly costly due to exploit
      mitigations affecting indirect calls.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl: kconfig: clarify wording · 2c69b296
      Philippe Gerum authored
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/wait: display waitqueue name in trace · 7b0812b3
      Philippe Gerum authored
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl: kconfig: introduce high per-CPU concurrency switch · 382242af
      Philippe Gerum authored
      EVL_HIGH_PERCPU_CONCURRENCY optimizes the implementation for
      applications with many real-time threads running concurrently on any
      given CPU core (typically when eight or more threads may be sharing a
      single CPU core). This is a combination of the scalable scheduler and
      rb-tree timer indexing as a single configuration switch, since both
      aspects are normally coupled.
      If the application system runs only a few EVL threads per CPU core,
      then this option should be turned off, in order to minimize the cache
      footprint of the queuing operations performed by the scheduler and
      timer subsystems. Otherwise, it should be turned on in order to have
      constant-time queuing operations for a large number of runnable
      threads and outstanding timers.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/sched: enable fast linear thread scheduler (non-scalable) · 5029c829
      Philippe Gerum authored
      For applications with only a few runnable tasks at any point in time,
      a linear queue ordering them for scheduling delivers better
      performance on low-end systems due to a smaller CPU cache footprint,
      compared to the multi-level queue used by the scalable scheduler.
      Allow users to select between the lightning-fast and the scalable
      scheduler implementations depending on the runtime profile of the
      application.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
    • Philippe Gerum's avatar
      evl/timer: add linear indexing method · 48a9426d
      Philippe Gerum authored
      Add (back) the ability to index timers either in an rb-tree or in a
      basic linked list.
      The latter delivers lower latency to application systems with very
      few active timers at any point in time (typically fewer than 10
      active timers, e.g. not more than a couple of timed loops, very few
      timed operations).
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
  2. 15 May, 2021 2 commits
  3. 03 May, 2021 16 commits