Merging mlinuxguy contributed realtime patches

This commit is contained in:
Mauro Ribeiro 2013-04-14 22:21:21 -03:00
parent 8b8d0c948a
commit 6741d7a6a2
354 changed files with 13830 additions and 1693 deletions

View file

@ -0,0 +1,64 @@
Introduction:
-------------
The module hwlat_detector is a special purpose kernel module that is used to
detect large system latencies induced by the behavior of certain underlying
hardware or firmware, independent of Linux itself. The code was originally
developed to detect SMIs (System Management Interrupts) on x86 systems,
however there is nothing x86 specific about this patchset. It was
originally written for use by the "RT" patch since the Real Time
kernel is highly latency sensitive.

SMIs are usually not serviced by the Linux kernel, which typically does not
even know that they are occurring. SMIs are instead set up by BIOS code
and are serviced by BIOS code, usually for "critical" events such as
management of thermal sensors and fans. Sometimes though, SMIs are used for
other tasks and those tasks can spend an inordinate amount of time in the
handler (sometimes measured in milliseconds). Obviously this is a problem if
you are trying to keep event service latencies down in the microsecond range.

The hardware latency detector works by hogging all of the CPUs for configurable
amounts of time (by calling stop_machine()), polling the CPU Time Stamp Counter
for some period, then looking for gaps in the TSC data. Any gap indicates a
time when the polling was interrupted, and since the machine is stopped and
interrupts are turned off, the only thing that could do that would be an SMI.

Note that the SMI detector should *NEVER* be used in a production environment.
It is intended to be run manually to determine if the hardware platform has a
problem with long system firmware service routines.
Usage:
------
Loading the hwlat_detector module with the parameter "enabled=1" (or toggling
the "enable" entry in its debugfs directory to 1 after loading) is the only
step required to start the hwlat_detector. The threshold in microseconds (us)
above which latency spikes will be taken into account can be redefined via the
parameter "threshold=".
Example:
# modprobe hwlat_detector enabled=1 threshold=100
After the module is loaded, it creates a directory named "hwlat_detector" under
the debugfs mountpoint ("/debug/hwlat_detector" in this text). It is necessary
to have debugfs mounted; on many systems it is mounted at /sys/kernel/debug.
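If debugfs is not yet mounted, a minimal sketch (the mountpoint is
illustrative; adjust it to your system):

# mount -t debugfs nodev /sys/kernel/debug
# ls /sys/kernel/debug/hwlat_detector
count  enable  max  sample  threshold  width  window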
The /debug/hwlat_detector interface contains the following files:
count - number of latency spikes observed since last reset
enable - a global enable/disable toggle (0/1), resets count
max - maximum hardware latency actually observed (usecs)
sample - a pipe from which to read current raw sample data
in the format <timestamp> <latency observed usecs>
(can be opened O_NONBLOCK for a single sample)
threshold - minimum latency value to be considered (usecs)
width - time period to sample with CPUs held (usecs)
must be less than the total window size (enforced)
window - total period of sampling, width being inside (usecs)
By default we will set width to 500,000 and window to 1,000,000, meaning that
we will sample every 1,000,000 usecs (1s) for 500,000 usecs (0.5s). If we
observe any latencies that exceed the threshold (initially 100 usecs),
then we write to a global sample ring buffer of 8K samples, which is
consumed by reading from the "sample" (pipe) debugfs file interface.
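A sketch of a typical session (the threshold and the sample values shown are
illustrative, not measured):

# modprobe hwlat_detector enabled=1 threshold=10
# cat /sys/kernel/debug/hwlat_detector/sample
1413888157.0684425779	18
1413888159.0679876351	12
# cat /sys/kernel/debug/hwlat_detector/max
18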

View file

@ -1182,6 +1182,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
See comment before ip2_setup() in
drivers/char/ip2/ip2base.c.
irqaffinity= [SMP] Set the default irq affinity mask
Format:
<cpu number>,...,<cpu number>
or
<cpu number>-<cpu number>
(must be a positive range in ascending order)
or a mixture
<cpu number>,...,<cpu number>-<cpu number>
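For example, a mixed specification (the CPU
numbers are illustrative):
irqaffinity=0,1,4-6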
irqfixup [HW]
When an interrupt is not handled search all handlers
for it. Intended to get systems with badly broken

View file

@ -57,10 +57,17 @@ On PowerPC - Press 'ALT - Print Screen (or F13) - <command key>,
On other - If you know of the key combos for other architectures, please
let me know so I can add them to this section.
On all - write a character to /proc/sysrq-trigger. e.g.:
On all - write a character to /proc/sysrq-trigger, e.g.:
echo t > /proc/sysrq-trigger
On all - Enable network SysRq by writing a cookie to icmp_echo_sysrq, e.g.
echo 0x01020304 >/proc/sys/net/ipv4/icmp_echo_sysrq
Send an ICMP echo request with this pattern plus the particular
SysRq command key. Example:
# ping -c1 -s57 -p0102030468
will trigger the SysRq-H (help) command.
* What are the 'command' keys?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'b' - Will immediately reboot the system without syncing or unmounting

View file

@ -0,0 +1,186 @@
Using the Linux Kernel Latency Histograms
This document gives a short explanation of how to enable, configure and
latency histograms. Latency histograms are primarily relevant in the
context of real-time enabled kernels (CONFIG_PREEMPT/CONFIG_PREEMPT_RT)
and are used in the quality management of the Linux real-time
capabilities.
* Purpose of latency histograms
A latency histogram continuously accumulates the frequencies of latency
data. There are two types of histograms:
- potential sources of latencies
- effective latencies
* Potential sources of latencies
Potential sources of latencies are code segments where interrupts,
preemption or both are disabled (aka critical sections). To create
histograms of potential sources of latency, the kernel stores the time
stamp at the start of a critical section, determines the time elapsed
when the end of the section is reached, and increments the frequency
counter of that latency value - irrespective of whether any concurrently
running process is affected by latency or not.
- Configuration items (in the Kernel hacking/Tracers submenu)
CONFIG_INTERRUPT_OFF_LATENCY
CONFIG_PREEMPT_OFF_LATENCY
* Effective latencies
Effective latencies are those actually occurring during wakeup of a process. To
determine effective latencies, the kernel stores the time stamp when a
process is scheduled to be woken up, and determines the duration of the
wakeup time shortly before control is passed over to this process. Note
that the apparent latency in user space may be somewhat longer, since the
process may be interrupted after control is passed over to it but before
the execution in user space takes place. Simply measuring the interval
between enqueuing and wakeup may also not be appropriate in cases when a
process is scheduled as a result of a timer expiration. The timer may have
missed its deadline, e.g. due to disabled interrupts, but this latency
would not be registered. Therefore, the offsets of missed timers are
recorded in a separate histogram. If both wakeup latency and missed timer
offsets are configured and enabled, a third histogram may be enabled that
records the overall latency as a sum of the timer latency, if any, and the
wakeup latency. This histogram is called "timerandwakeup".
- Configuration items (in the Kernel hacking/Tracers submenu)
CONFIG_WAKEUP_LATENCY
CONFIG_MISSED_TIMER_OFFSETS
* Usage
The interface to the administration of the latency histograms is located
in the debugfs file system. To mount it, either enter
mount -t sysfs nodev /sys
mount -t debugfs nodev /sys/kernel/debug
from shell command line level, or add
nodev /sys sysfs defaults 0 0
nodev /sys/kernel/debug debugfs defaults 0 0
to the file /etc/fstab. All latency histogram related files are then
available in the directory /sys/kernel/debug/tracing/latency_hist. A
particular histogram type is enabled by writing non-zero to the related
variable in the /sys/kernel/debug/tracing/latency_hist/enable directory.
Select "preemptirqsoff" for the histograms of potential sources of
latencies and "wakeup" for histograms of effective latencies etc. The
histogram data - one per CPU - are available in the files
/sys/kernel/debug/tracing/latency_hist/preemptoff/CPUx
/sys/kernel/debug/tracing/latency_hist/irqsoff/CPUx
/sys/kernel/debug/tracing/latency_hist/preemptirqsoff/CPUx
/sys/kernel/debug/tracing/latency_hist/wakeup/CPUx
/sys/kernel/debug/tracing/latency_hist/wakeup/sharedprio/CPUx
/sys/kernel/debug/tracing/latency_hist/missed_timer_offsets/CPUx
/sys/kernel/debug/tracing/latency_hist/timerandwakeup/CPUx
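A minimal usage sketch (assuming debugfs is mounted at /sys/kernel/debug and
wakeup histogram support is configured in):

# cd /sys/kernel/debug/tracing/latency_hist
# echo 1 >enable/wakeup
# cat wakeup/CPU0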
The histograms are reset by writing non-zero to the file "reset" in a
particular latency directory. To reset all latency data, use
#!/bin/sh

TRACINGDIR=/sys/kernel/debug/tracing
HISTDIR=$TRACINGDIR/latency_hist

if test -d $HISTDIR
then
  cd $HISTDIR
  for i in `find . | grep /reset$`
  do
    echo 1 >$i
  done
fi
* Data format
Latency data are stored with a resolution of one microsecond. The
maximum latency is 10,240 microseconds. The data are only valid if the
overflow register is empty. Every output line contains the latency in
microseconds in the first column and the number of samples in the second
column. To display only lines with a positive latency count, use, for
example,
grep -v " 0$" /sys/kernel/debug/tracing/latency_hist/preemptoff/CPU0
#Minimum latency: 0 microseconds.
#Average latency: 0 microseconds.
#Maximum latency: 25 microseconds.
#Total samples: 3104770694
#There are 0 samples greater or equal than 10240 microseconds
#usecs samples
0 2984486876
1 49843506
2 58219047
3 5348126
4 2187960
5 3388262
6 959289
7 208294
8 40420
9 4485
10 14918
11 18340
12 25052
13 19455
14 5602
15 969
16 47
17 18
18 14
19 1
20 3
21 2
22 5
23 2
25 1
* Wakeup latency of a selected process
To only collect wakeup latency data of a particular process, write the
PID of the requested process to
/sys/kernel/debug/tracing/latency_hist/wakeup/pid
PIDs are not considered if this variable is set to 0.
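For example (the PID is illustrative):

# echo 1234 >/sys/kernel/debug/tracing/latency_hist/wakeup/pid
# echo 0 >/sys/kernel/debug/tracing/latency_hist/wakeup/pid    # back to all PIDs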
* Details of the process with the highest wakeup latency so far
Selected data of the process that suffered from the highest wakeup
latency occurring on a particular CPU are available in the file
/sys/kernel/debug/tracing/latency_hist/wakeup/max_latency-CPUx.
In addition, other relevant system data at the time when the
latency occurred are given.
The format of the data is (all in one line):
<PID> <Priority> <Latency> (<Timeroffset>) <Command> \
<- <PID> <Priority> <Command> <Timestamp>
The value of <Timeroffset> is only relevant in the combined timer
and wakeup latency recording. In the wakeup recording, it is
always 0; in the missed_timer_offsets recording, it is the same
as <Latency>.
When retrospectively searching for the origin of a latency and
tracing was not enabled, it may be helpful to know the name and
some basic data of the task that (finally) was switching to the
late real-time task. In addition to the victim's data, the
data of the possible culprit are therefore also displayed after the
"<-" symbol.
Finally, the timestamp of the time when the latency occurred
in <seconds>.<microseconds> after the most recent system boot
is provided.
These data are also reset when the wakeup histogram is reset.
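A hypothetical example of such a line, following the format above (all
values are invented for illustration only):

47 98 12 (0) cyclictest <- 2563 120 kworker/0:1 1278.929088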

View file

@ -6,6 +6,7 @@ config OPROFILE
tristate "OProfile system profiling"
depends on PROFILING
depends on HAVE_OPROFILE
depends on !PREEMPT_RT_FULL
select RING_BUFFER
select RING_BUFFER_ALLOW_SWAP
help

View file

@ -108,7 +108,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
/* If we're in an interrupt context, or have no user context,
we must not take the fault. */
if (!mm || in_atomic())
if (!mm || pagefault_disabled())
goto no_context;
#ifdef CONFIG_ALPHA_LARGE_VMALLOC

View file

@ -17,6 +17,7 @@ config ARM
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select HARDIRQS_SW_RESEND
select IRQ_FORCED_THREADING
select HAVE_AOUT
select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
select HAVE_ARCH_KGDB
@ -46,6 +47,7 @@ config ARM
select HAVE_MEMBLOCK
select HAVE_OPROFILE if (HAVE_PERF_EVENTS)
select HAVE_PERF_EVENTS
select HAVE_PREEMPT_LAZY
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_UID16

File diff suppressed because it is too large Load diff

View file

@ -3,6 +3,14 @@
#include <linux/thread_info.h>
#if defined CONFIG_PREEMPT_RT_FULL && defined CONFIG_HIGHMEM
void switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p);
#else
static inline void
switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p) { }
#endif
/*
* switch_to(prev, next) should switch from task `prev' to `next'
* `prev' will never be the same as `next'. schedule() itself
@ -12,6 +20,7 @@ extern struct task_struct *__switch_to(struct task_struct *, struct thread_info
#define switch_to(prev,next,last) \
do { \
switch_kmaps(prev, next); \
last = __switch_to(prev,task_thread_info(prev), task_thread_info(next)); \
} while (0)

View file

@ -50,6 +50,7 @@ struct cpu_context_save {
struct thread_info {
unsigned long flags; /* low level flags */
int preempt_count; /* 0 => preemptable, <0 => bug */
int preempt_lazy_count; /* 0 => preemptable, <0 => bug */
mm_segment_t addr_limit; /* address limit */
struct task_struct *task; /* main task structure */
struct exec_domain *exec_domain; /* execution domain */
@ -148,6 +149,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
#define TIF_SIGPENDING 0
#define TIF_NEED_RESCHED 1
#define TIF_NOTIFY_RESUME 2 /* callback before returning to user */
#define TIF_NEED_RESCHED_LAZY 3
#define TIF_SYSCALL_TRACE 8
#define TIF_SYSCALL_AUDIT 9
#define TIF_SYSCALL_TRACEPOINT 10
@ -160,6 +162,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY)
#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)

View file

@ -50,6 +50,7 @@ int main(void)
BLANK();
DEFINE(TI_FLAGS, offsetof(struct thread_info, flags));
DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count));
DEFINE(TI_PREEMPT_LAZY, offsetof(struct thread_info, preempt_lazy_count));
DEFINE(TI_ADDR_LIMIT, offsetof(struct thread_info, addr_limit));
DEFINE(TI_TASK, offsetof(struct thread_info, task));
DEFINE(TI_EXEC_DOMAIN, offsetof(struct thread_info, exec_domain));

View file

@ -29,28 +29,17 @@ static void early_console_write(struct console *con, const char *s, unsigned n)
early_write(s, n);
}
static struct console early_console = {
static struct console early_console_dev = {
.name = "earlycon",
.write = early_console_write,
.flags = CON_PRINTBUFFER | CON_BOOT,
.index = -1,
};
asmlinkage void early_printk(const char *fmt, ...)
{
char buf[512];
int n;
va_list ap;
va_start(ap, fmt);
n = vscnprintf(buf, sizeof(buf), fmt, ap);
early_write(buf, n);
va_end(ap);
}
static int __init setup_early_printk(char *buf)
{
register_console(&early_console);
early_console = &early_console_dev;
register_console(&early_console_dev);
return 0;
}

View file

@ -216,11 +216,18 @@ __irq_svc:
#ifdef CONFIG_PREEMPT
get_thread_info tsk
ldr r8, [tsk, #TI_PREEMPT] @ get preempt count
ldr r0, [tsk, #TI_FLAGS] @ get flags
teq r8, #0 @ if preempt count != 0
bne 1f @ return from exception
ldr r0, [tsk, #TI_FLAGS] @ get flags
tst r0, #_TIF_NEED_RESCHED @ if NEED_RESCHED is set
blne svc_preempt @ preempt!
ldr r8, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count
teq r8, #0 @ if preempt lazy count != 0
movne r0, #0 @ force flags to 0
tst r0, #_TIF_NEED_RESCHED
tst r0, #_TIF_NEED_RESCHED_LAZY
blne svc_preempt
1:
#endif
#ifdef CONFIG_TRACE_IRQFLAGS
@ -240,6 +247,8 @@ svc_preempt:
1: bl preempt_schedule_irq @ irq en/disable is done inside
ldr r0, [tsk, #TI_FLAGS] @ get new tasks TI_FLAGS
tst r0, #_TIF_NEED_RESCHED
bne 1b
tst r0, #_TIF_NEED_RESCHED_LAZY
moveq pc, r8 @ go again
b 1b
#endif

View file

@ -118,7 +118,8 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)
continue;
}
err = request_irq(irq, handler, IRQF_NOBALANCING, "arm-pmu",
err = request_irq(irq, handler,
IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
cpu_pmu);
if (err) {
pr_err("unable to request IRQ%d for ARM PMU counters\n",

View file

@ -459,6 +459,31 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
}
#ifdef CONFIG_MMU
/*
* CONFIG_SPLIT_PTLOCK_CPUS results in a page->ptl lock. If the lock is not
* initialized by pgtable_page_ctor() then a coredump of the vector page will
* fail.
*/
static int __init vectors_user_mapping_init_page(void)
{
struct page *page;
unsigned long addr = 0xffff0000;
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd;
pgd = pgd_offset_k(addr);
pud = pud_offset(pgd, addr);
pmd = pmd_offset(pud, addr);
page = pmd_page(*(pmd));
pgtable_page_ctor(page);
return 0;
}
late_initcall(vectors_user_mapping_init_page);
/*
* The vectors page is always readable from user space for the
* atomic helpers and the signal restart code. Insert it into the

View file

@ -638,7 +638,8 @@ asmlinkage int
do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall)
{
do {
if (likely(thread_flags & _TIF_NEED_RESCHED)) {
if (likely(thread_flags & (_TIF_NEED_RESCHED |
_TIF_NEED_RESCHED_LAZY))) {
schedule();
} else {
if (unlikely(!user_mode(regs)))

View file

@ -134,6 +134,7 @@ clkevt32k_mode(enum clock_event_mode mode, struct clock_event_device *dev)
break;
case CLOCK_EVT_MODE_SHUTDOWN:
case CLOCK_EVT_MODE_UNUSED:
remove_irq(AT91_ID_SYS, &at91rm9200_timer_irq);
case CLOCK_EVT_MODE_RESUME:
irqmask = 0;
break;

View file

@ -77,7 +77,7 @@ static struct clocksource pit_clk = {
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};
static struct irqaction at91sam926x_pit_irq;
/*
* Clockevent device: interrupts every 1/HZ (== pit_cycles * MCK/16)
*/
@ -86,6 +86,8 @@ pit_clkevt_mode(enum clock_event_mode mode, struct clock_event_device *dev)
{
switch (mode) {
case CLOCK_EVT_MODE_PERIODIC:
/* Set up irq handler */
setup_irq(AT91_ID_SYS, &at91sam926x_pit_irq);
/* update clocksource counter */
pit_cnt += pit_cycle * PIT_PICNT(pit_read(AT91_PIT_PIVR));
pit_write(AT91_PIT_MR, (pit_cycle - 1) | AT91_PIT_PITEN
@ -98,6 +100,7 @@ pit_clkevt_mode(enum clock_event_mode mode, struct clock_event_device *dev)
case CLOCK_EVT_MODE_UNUSED:
/* disable irq, leaving the clocksource active */
pit_write(AT91_PIT_MR, (pit_cycle - 1) | AT91_PIT_PITEN);
remove_irq(AT91_ID_SYS, &at91sam926x_pit_irq);
break;
case CLOCK_EVT_MODE_RESUME:
break;

View file

@ -442,7 +442,19 @@ config MACH_HKDK4412
select EXYNOS4_SETUP_DWMCI
select SAMSUNG_DEV_ADC
help
Machine support for Odroid-X based on Samsung EXYNOS4412
Machine support for ODROIDs based on Samsung EXYNOS4412 (X/X2/U2)
config ODROID_X
bool "Enable Support for ODROID-X"
depends on MACH_HKDK4412
config ODROID_X2
bool "Enable support for ODROID-X2"
depends on MACH_HKDK4412
config ODROID_U2
bool "Enable support for ODROID-U2"
depends on MACH_HKDK4412
endif

View file

@ -177,11 +177,15 @@ static struct regulator_init_data __initdata max77686_buck1_data = {
.constraints = {
.name = "VDD_MIF_1.0V",
.min_uV = 800000,
.max_uV = 1050000,
.max_uV = 1100000,
.always_on = 1,
.boot_on = 1,
.valid_ops_mask = REGULATOR_CHANGE_VOLTAGE |
REGULATOR_CHANGE_STATUS,
.valid_ops_mask = REGULATOR_CHANGE_VOLTAGE | REGULATOR_CHANGE_STATUS,
.state_mem = {
.uV = 1100000,
.mode = REGULATOR_MODE_NORMAL,
.enabled = 1,
},
},
.num_consumer_supplies = ARRAY_SIZE(max77686_buck1_consumer),
.consumer_supplies = max77686_buck1_consumer,
@ -191,7 +195,7 @@ static struct regulator_init_data __initdata max77686_buck2_data = {
.constraints = {
.name = "VDD_ARM_1.3V",
.min_uV = 800000,
.max_uV = 1350000,
.max_uV = 1500000,
.always_on = 1,
.boot_on = 1,
.valid_ops_mask = REGULATOR_CHANGE_VOLTAGE,
@ -1226,7 +1230,13 @@ static void __init hkdk4412_machine_init(void)
platform_add_devices(hkdk4412_devices, ARRAY_SIZE(hkdk4412_devices));
}
MACHINE_START(ODROIDX, "ODROID-X")
#if defined(CONFIG_ODROID_X)
MACHINE_START(ODROIDX, "ODROIDX")
#elif defined(CONFIG_ODROID_X2)
MACHINE_START(ODROIDX, "ODROIDX2")
#elif defined(CONFIG_ODROID_U2)
MACHINE_START(ODROIDX, "ODROIDU2"
#endif
/* Maintainer: Dongjin Kim <dongjin.kim@agreeyamobiity.net> */
.atag_offset = 0x100,
.smp = smp_ops(exynos_smp_ops),

View file

@ -72,7 +72,7 @@ static void __iomem *scu_base_addr(void)
return (void __iomem *)(S5P_VA_SCU);
}
static DEFINE_SPINLOCK(boot_lock);
static DEFINE_RAW_SPINLOCK(boot_lock);
static void __cpuinit exynos_secondary_init(unsigned int cpu)
{
@ -92,8 +92,8 @@ static void __cpuinit exynos_secondary_init(unsigned int cpu)
/*
* Synchronise with the boot thread.
*/
spin_lock(&boot_lock);
spin_unlock(&boot_lock);
raw_spin_lock(&boot_lock);
raw_spin_unlock(&boot_lock);
}
static int __cpuinit exynos_boot_secondary(unsigned int cpu, struct task_struct *idle)
@ -105,7 +105,7 @@ static int __cpuinit exynos_boot_secondary(unsigned int cpu, struct task_struct
* Set synchronisation state between this boot processor
* and the secondary one
*/
spin_lock(&boot_lock);
raw_spin_lock(&boot_lock);
/*
* The secondary processor is waiting to be released from
@ -134,7 +134,7 @@ static int __cpuinit exynos_boot_secondary(unsigned int cpu, struct task_struct
if (timeout == 0) {
printk(KERN_ERR "cpu1 power enable failed");
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return -ETIMEDOUT;
}
}
@ -173,7 +173,7 @@ static int __cpuinit exynos_boot_secondary(unsigned int cpu, struct task_struct
* now the secondary core is starting up let it run its
* calibrations, then wait for it to finish
*/
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return pen_release != -1 ? -ENOSYS : 0;
}

View file

@ -31,7 +31,7 @@
extern void msm_secondary_startup(void);
static DEFINE_SPINLOCK(boot_lock);
static DEFINE_RAW_SPINLOCK(boot_lock);
static inline int get_core_count(void)
{
@ -58,8 +58,8 @@ static void __cpuinit msm_secondary_init(unsigned int cpu)
/*
* Synchronise with the boot thread.
*/
spin_lock(&boot_lock);
spin_unlock(&boot_lock);
raw_spin_lock(&boot_lock);
raw_spin_unlock(&boot_lock);
}
static __cpuinit void prepare_cold_cpu(unsigned int cpu)
@ -96,7 +96,7 @@ static int __cpuinit msm_boot_secondary(unsigned int cpu, struct task_struct *id
* set synchronisation state between this boot processor
* and the secondary one
*/
spin_lock(&boot_lock);
raw_spin_lock(&boot_lock);
/*
* The secondary processor is waiting to be released from
@ -130,7 +130,7 @@ static int __cpuinit msm_boot_secondary(unsigned int cpu, struct task_struct *id
* now the secondary core is starting up let it run its
* calibrations, then wait for it to finish
*/
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return pen_release != -1 ? -ENOSYS : 0;
}

View file

@ -45,7 +45,7 @@ u16 pm44xx_errata;
/* SCU base address */
static void __iomem *scu_base;
static DEFINE_SPINLOCK(boot_lock);
static DEFINE_RAW_SPINLOCK(boot_lock);
void __iomem *omap4_get_scu_base(void)
{
@ -76,8 +76,8 @@ static void __cpuinit omap4_secondary_init(unsigned int cpu)
/*
* Synchronise with the boot thread.
*/
spin_lock(&boot_lock);
spin_unlock(&boot_lock);
raw_spin_lock(&boot_lock);
raw_spin_unlock(&boot_lock);
}
static int __cpuinit omap4_boot_secondary(unsigned int cpu, struct task_struct *idle)
@ -90,7 +90,7 @@ static int __cpuinit omap4_boot_secondary(unsigned int cpu, struct task_struct *
* Set synchronisation state between this boot processor
* and the secondary one
*/
spin_lock(&boot_lock);
raw_spin_lock(&boot_lock);
/*
* Update the AuxCoreBoot0 with boot state for secondary core.
@ -163,7 +163,7 @@ static int __cpuinit omap4_boot_secondary(unsigned int cpu, struct task_struct *
* Now the secondary core is starting up let it run its
* calibrations, then wait for it to finish
*/
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return 0;
}

View file

@ -46,7 +46,7 @@
static void __iomem *wakeupgen_base;
static void __iomem *sar_base;
static DEFINE_SPINLOCK(wakeupgen_lock);
static DEFINE_RAW_SPINLOCK(wakeupgen_lock);
static unsigned int irq_target_cpu[MAX_IRQS];
static unsigned int irq_banks = MAX_NR_REG_BANKS;
static unsigned int max_irqs = MAX_IRQS;
@ -134,9 +134,9 @@ static void wakeupgen_mask(struct irq_data *d)
{
unsigned long flags;
spin_lock_irqsave(&wakeupgen_lock, flags);
raw_spin_lock_irqsave(&wakeupgen_lock, flags);
_wakeupgen_clear(d->irq, irq_target_cpu[d->irq]);
spin_unlock_irqrestore(&wakeupgen_lock, flags);
raw_spin_unlock_irqrestore(&wakeupgen_lock, flags);
}
/*
@ -146,9 +146,9 @@ static void wakeupgen_unmask(struct irq_data *d)
{
unsigned long flags;
spin_lock_irqsave(&wakeupgen_lock, flags);
raw_spin_lock_irqsave(&wakeupgen_lock, flags);
_wakeupgen_set(d->irq, irq_target_cpu[d->irq]);
spin_unlock_irqrestore(&wakeupgen_lock, flags);
raw_spin_unlock_irqrestore(&wakeupgen_lock, flags);
}
#ifdef CONFIG_HOTPLUG_CPU
@ -189,7 +189,7 @@ static void wakeupgen_irqmask_all(unsigned int cpu, unsigned int set)
{
unsigned long flags;
spin_lock_irqsave(&wakeupgen_lock, flags);
raw_spin_lock_irqsave(&wakeupgen_lock, flags);
if (set) {
_wakeupgen_save_masks(cpu);
_wakeupgen_set_all(cpu, WKG_MASK_ALL);
@ -197,7 +197,7 @@ static void wakeupgen_irqmask_all(unsigned int cpu, unsigned int set)
_wakeupgen_set_all(cpu, WKG_UNMASK_ALL);
_wakeupgen_restore_masks(cpu);
}
spin_unlock_irqrestore(&wakeupgen_lock, flags);
raw_spin_unlock_irqrestore(&wakeupgen_lock, flags);
}
#endif

View file

@ -21,7 +21,7 @@
#include <mach/spear.h>
#include <mach/generic.h>
static DEFINE_SPINLOCK(boot_lock);
static DEFINE_RAW_SPINLOCK(boot_lock);
static void __iomem *scu_base = IOMEM(VA_SCU_BASE);
@ -44,8 +44,8 @@ static void __cpuinit spear13xx_secondary_init(unsigned int cpu)
/*
* Synchronise with the boot thread.
*/
spin_lock(&boot_lock);
spin_unlock(&boot_lock);
raw_spin_lock(&boot_lock);
raw_spin_unlock(&boot_lock);
}
static int __cpuinit spear13xx_boot_secondary(unsigned int cpu, struct task_struct *idle)
@ -56,7 +56,7 @@ static int __cpuinit spear13xx_boot_secondary(unsigned int cpu, struct task_stru
* set synchronisation state between this boot processor
* and the secondary one
*/
spin_lock(&boot_lock);
raw_spin_lock(&boot_lock);
/*
* The secondary processor is waiting to be released from
@ -83,7 +83,7 @@ static int __cpuinit spear13xx_boot_secondary(unsigned int cpu, struct task_stru
* now the secondary core is starting up let it run its
* calibrations, then wait for it to finish
*/
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return pen_release != -1 ? -ENOSYS : 0;
}

View file

@ -50,7 +50,7 @@ static void __iomem *scu_base_addr(void)
return NULL;
}
static DEFINE_SPINLOCK(boot_lock);
static DEFINE_RAW_SPINLOCK(boot_lock);
static void __cpuinit ux500_secondary_init(unsigned int cpu)
{
@ -70,8 +70,8 @@ static void __cpuinit ux500_secondary_init(unsigned int cpu)
/*
* Synchronise with the boot thread.
*/
spin_lock(&boot_lock);
spin_unlock(&boot_lock);
raw_spin_lock(&boot_lock);
raw_spin_unlock(&boot_lock);
}
static int __cpuinit ux500_boot_secondary(unsigned int cpu, struct task_struct *idle)
@ -82,7 +82,7 @@ static int __cpuinit ux500_boot_secondary(unsigned int cpu, struct task_struct *
* set synchronisation state between this boot processor
* and the secondary one
*/
spin_lock(&boot_lock);
raw_spin_lock(&boot_lock);
/*
* The secondary processor is waiting to be released from
@ -103,7 +103,7 @@ static int __cpuinit ux500_boot_secondary(unsigned int cpu, struct task_struct *
* now the secondary core is starting up let it run its
* calibrations, then wait for it to finish
*/
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return pen_release != -1 ? -ENOSYS : 0;
}

View file

@ -279,7 +279,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
/*

View file

@ -38,6 +38,7 @@ EXPORT_SYMBOL(kunmap);
void *kmap_atomic(struct page *page)
{
pte_t pte = mk_pte(page, kmap_prot);
unsigned int idx;
unsigned long vaddr;
void *kmap;
@ -76,7 +77,10 @@ void *kmap_atomic(struct page *page)
* in place, so the contained TLB flush ensures the TLB is updated
* with the new mapping.
*/
set_top_pte(vaddr, mk_pte(page, kmap_prot));
#ifdef CONFIG_PREEMPT_RT_FULL
current->kmap_pte[type] = pte;
#endif
set_top_pte(vaddr, pte);
return (void *)vaddr;
}
@ -93,12 +97,15 @@ void __kunmap_atomic(void *kvaddr)
if (cache_is_vivt())
__cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE);
#ifdef CONFIG_PREEMPT_RT_FULL
current->kmap_pte[type] = __pte(0);
#endif
#ifdef CONFIG_DEBUG_HIGHMEM
BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx));
set_top_pte(vaddr, __pte(0));
#else
(void) idx; /* to kill a warning */
#endif
set_top_pte(vaddr, __pte(0));
kmap_atomic_idx_pop();
} else if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) {
/* this address was obtained through kmap_high_get() */
@ -110,6 +117,7 @@ EXPORT_SYMBOL(__kunmap_atomic);
void *kmap_atomic_pfn(unsigned long pfn)
{
pte_t pte = pfn_pte(pfn, kmap_prot);
unsigned long vaddr;
int idx, type;
@ -121,7 +129,10 @@ void *kmap_atomic_pfn(unsigned long pfn)
#ifdef CONFIG_DEBUG_HIGHMEM
BUG_ON(!pte_none(get_top_pte(vaddr)));
#endif
set_top_pte(vaddr, pfn_pte(pfn, kmap_prot));
#ifdef CONFIG_PREEMPT_RT_FULL
current->kmap_pte[type] = pte;
#endif
set_top_pte(vaddr, pte);
return (void *)vaddr;
}
@ -135,3 +146,29 @@ struct page *kmap_atomic_to_page(const void *ptr)
return pte_page(get_top_pte(vaddr));
}
#if defined CONFIG_PREEMPT_RT_FULL
void switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p)
{
int i;
/*
* Clear @prev's kmap_atomic mappings
*/
for (i = 0; i < prev_p->kmap_idx; i++) {
int idx = i + KM_TYPE_NR * smp_processor_id();
set_top_pte(__fix_to_virt(FIX_KMAP_BEGIN + idx), __pte(0));
}
/*
* Restore @next_p's kmap_atomic mappings
*/
for (i = 0; i < next_p->kmap_idx; i++) {
int idx = i + KM_TYPE_NR * smp_processor_id();
if (!pte_none(next_p->kmap_pte[i]))
set_top_pte(__fix_to_virt(FIX_KMAP_BEGIN + idx),
next_p->kmap_pte[i]);
}
}
#endif

View file

@ -32,7 +32,7 @@ static void __cpuinit write_pen_release(int val)
outer_clean_range(__pa(&pen_release), __pa(&pen_release + 1));
}
static DEFINE_SPINLOCK(boot_lock);
static DEFINE_RAW_SPINLOCK(boot_lock);
void __cpuinit versatile_secondary_init(unsigned int cpu)
{
@ -52,8 +52,8 @@ void __cpuinit versatile_secondary_init(unsigned int cpu)
/*
* Synchronise with the boot thread.
*/
spin_lock(&boot_lock);
spin_unlock(&boot_lock);
raw_spin_lock(&boot_lock);
raw_spin_unlock(&boot_lock);
}
int __cpuinit versatile_boot_secondary(unsigned int cpu, struct task_struct *idle)
@ -64,7 +64,7 @@ int __cpuinit versatile_boot_secondary(unsigned int cpu, struct task_struct *idl
* Set synchronisation state between this boot processor
* and the secondary one
*/
spin_lock(&boot_lock);
raw_spin_lock(&boot_lock);
/*
* This is really belt and braces; we hold unintended secondary
@ -94,7 +94,7 @@ int __cpuinit versatile_boot_secondary(unsigned int cpu, struct task_struct *idl
* now the secondary core is starting up let it run its
* calibrations, then wait for it to finish
*/
spin_unlock(&boot_lock);
raw_spin_unlock(&boot_lock);
return pen_release != -1 ? -ENOSYS : 0;
}

View file

@ -81,7 +81,7 @@ asmlinkage void do_page_fault(unsigned long ecr, struct pt_regs *regs)
* If we're in an interrupt or have no user context, we must
* not take the fault...
*/
if (in_atomic() || !mm || regs->sr & SYSREG_BIT(GM))
if (!mm || regs->sr & SYSREG_BIT(GM) || pagefault_disabled())
goto no_context;
local_irq_enable();

View file

@ -25,8 +25,6 @@ extern struct console *bfin_earlyserial_init(unsigned int port,
extern struct console *bfin_jc_early_init(void);
#endif
static struct console *early_console;
/* Default console */
#define DEFAULT_PORT 0
#define DEFAULT_CFLAG CS8|B57600

View file

@ -114,7 +114,7 @@ do_page_fault(unsigned long address, struct pt_regs *regs,
* user context, we must not take the fault.
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
retry:

View file

@ -78,7 +78,7 @@ asmlinkage void do_page_fault(int datammu, unsigned long esr0, unsigned long ear
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
down_read(&mm->mmap_sem);

View file

@ -98,7 +98,7 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
/*
* If we're in an interrupt or have no user context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
#ifdef CONFIG_VIRTUAL_MEM_MAP

View file

@ -114,7 +114,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code,
* If we're in an interrupt or have no user context or are running in an
* atomic region then we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto bad_area_nosemaphore;
/* When running in the kernel we expect faults to occur only to

View file

@ -85,7 +85,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
retry:

View file

@ -21,7 +21,6 @@
#include <asm/setup.h>
#include <asm/prom.h>
static u32 early_console_initialized;
static u32 base_addr;
#ifdef CONFIG_SERIAL_UARTLITE_CONSOLE
@ -109,27 +108,11 @@ static struct console early_serial_uart16550_console = {
};
#endif /* CONFIG_SERIAL_8250_CONSOLE */
static struct console *early_console;
void early_printk(const char *fmt, ...)
{
char buf[512];
int n;
va_list ap;
if (early_console_initialized) {
va_start(ap, fmt);
n = vscnprintf(buf, 512, fmt, ap);
early_console->write(early_console, buf, n);
va_end(ap);
}
}
int __init setup_early_printk(char *opt)
{
int version = 0;
if (early_console_initialized)
if (early_console)
return 1;
base_addr = of_early_console(&version);
@ -159,7 +142,6 @@ int __init setup_early_printk(char *opt)
}
register_console(early_console);
early_console_initialized = 1;
return 0;
}
return 1;
@ -169,7 +151,7 @@ int __init setup_early_printk(char *opt)
* only for early console because of performance degression */
void __init remap_early_printk(void)
{
if (!early_console_initialized || !early_console)
if (!early_console)
return;
printk(KERN_INFO "early_printk_console remapping from 0x%x to ",
base_addr);
@ -195,9 +177,9 @@ void __init remap_early_printk(void)
void __init disable_early_printk(void)
{
if (!early_console_initialized || !early_console)
if (!early_console)
return;
printk(KERN_WARNING "disabling early console\n");
unregister_console(early_console);
early_console_initialized = 0;
early_console = NULL;
}

View file

@ -108,7 +108,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
if ((error_code & 0x13) == 0x13 || (error_code & 0x11) == 0x11)
is_write = 0;
if (unlikely(in_atomic() || !mm)) {
if (unlikely(!mm || pagefault_disabled())) {
if (kernel_mode(regs))
goto bad_area_nosemaphore;

View file

@ -2102,7 +2102,7 @@ config CPU_R4400_WORKAROUNDS
#
config HIGHMEM
bool "High Memory Support"
depends on 32BIT && CPU_SUPPORTS_HIGHMEM && SYS_SUPPORTS_HIGHMEM
depends on 32BIT && CPU_SUPPORTS_HIGHMEM && SYS_SUPPORTS_HIGHMEM && !PREEMPT_RT_FULL
config CPU_SUPPORTS_HIGHMEM
bool

View file

@ -8,6 +8,7 @@
* written by Ralf Baechle (ralf@linux-mips.org)
*/
#include <linux/console.h>
#include <linux/printk.h>
#include <linux/init.h>
#include <asm/setup.h>
@ -25,20 +26,18 @@ early_console_write(struct console *con, const char *s, unsigned n)
}
}
static struct console early_console __initdata = {
static struct console early_console_prom = {
.name = "early",
.write = early_console_write,
.flags = CON_PRINTBUFFER | CON_BOOT,
.index = -1
};
static int early_console_initialized __initdata;
void __init setup_early_printk(void)
{
if (early_console_initialized)
if (early_console)
return;
early_console_initialized = 1;
early_console = &early_console_prom;
register_console(&early_console);
register_console(&early_console_prom);
}

View file

@ -601,6 +601,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, void *unused,
__u32 thread_info_flags)
{
local_irq_enable();
preempt_check_resched();
/* deal with pending signal delivery */
if (thread_info_flags & _TIF_SIGPENDING)

View file

@ -89,7 +89,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs, unsigned long writ
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto bad_area_nosemaphore;
retry:

View file

@ -168,7 +168,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long fault_code,
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
retry:

View file

@ -176,7 +176,7 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
unsigned long acc_type;
int fault;
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
down_read(&mm->mmap_sem);

View file

@ -60,10 +60,11 @@ config LOCKDEP_SUPPORT
config RWSEM_GENERIC_SPINLOCK
bool
default y if PREEMPT_RT_FULL
config RWSEM_XCHGADD_ALGORITHM
bool
default y
default y if !PREEMPT_RT_FULL
config GENERIC_LOCKBREAK
bool
@ -141,6 +142,7 @@ config PPC
select GENERIC_CLOCKEVENTS
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select HAVE_PREEMPT_LAZY
select HAVE_MOD_ARCH_SPECIFIC
select MODULES_USE_ELF_RELA
select CLONE_BACKWARDS
@ -290,7 +292,7 @@ menu "Kernel options"
config HIGHMEM
bool "High memory support"
depends on PPC32
depends on PPC32 && !PREEMPT_RT_FULL
source kernel/Kconfig.hz
source kernel/Kconfig.preempt

View file

@ -43,6 +43,8 @@ struct thread_info {
int cpu; /* cpu we're on */
int preempt_count; /* 0 => preemptable,
<0 => BUG */
int preempt_lazy_count; /* 0 => preemptable,
<0 => BUG */
struct restart_block restart_block;
unsigned long local_flags; /* private flags for thread */
@ -97,7 +99,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_PERFMON_CTXSW 6 /* perfmon needs ctxsw calls */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SINGLESTEP 8 /* singlestepping active */
#define TIF_MEMDIE 9 /* is terminating due to OOM killer */
#define TIF_NEED_RESCHED_LAZY 9 /* lazy rescheduling necessary */
#define TIF_SECCOMP 10 /* secure computing */
#define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */
#define TIF_NOERROR 12 /* Force successful syscall return */
@ -106,6 +108,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_SYSCALL_TRACEPOINT 15 /* syscall tracepoint instrumentation */
#define TIF_EMULATE_STACK_STORE 16 /* Is an instruction emulation
for stack store? */
#define TIF_MEMDIE 17 /* is terminating due to OOM killer */
/* as above, but as bit values */
#define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE)
@ -124,12 +127,15 @@ static inline struct thread_info *current_thread_info(void)
#define _TIF_UPROBE (1<<TIF_UPROBE)
#define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT)
#define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE)
#define _TIF_NEED_RESCHED_LAZY (1<<TIF_NEED_RESCHED_LAZY)
#define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
#define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
_TIF_NOTIFY_RESUME | _TIF_UPROBE)
_TIF_NOTIFY_RESUME | _TIF_UPROBE | \
_TIF_NEED_RESCHED_LAZY)
#define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)
/* Bits in local_flags */
/* Don't move TLF_NAPPING without adjusting the code in entry_32.S */

View file

@ -124,6 +124,7 @@ int main(void)
DEFINE(TI_FLAGS, offsetof(struct thread_info, flags));
DEFINE(TI_LOCAL_FLAGS, offsetof(struct thread_info, local_flags));
DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count));
DEFINE(TI_PREEMPT_LAZY, offsetof(struct thread_info, preempt_lazy_count));
DEFINE(TI_TASK, offsetof(struct thread_info, task));
DEFINE(TI_CPU, offsetof(struct thread_info, cpu));

View file

@ -892,7 +892,14 @@ resume_kernel:
cmpwi 0,r0,0 /* if non-zero, just restore regs and return */
bne restore
andi. r8,r8,_TIF_NEED_RESCHED
bne+ 1f
lwz r0,TI_PREEMPT_LAZY(r9)
cmpwi 0,r0,0 /* if non-zero, just restore regs and return */
bne restore
lwz r0,TI_FLAGS(r9)
andi. r0,r0,_TIF_NEED_RESCHED_LAZY
beq+ restore
1:
lwz r3,_MSR(r1)
andi. r0,r3,MSR_EE /* interrupts off? */
beq restore /* don't schedule if so */
@ -903,11 +910,11 @@ resume_kernel:
*/
bl trace_hardirqs_off
#endif
1: bl preempt_schedule_irq
2: bl preempt_schedule_irq
CURRENT_THREAD_INFO(r9, r1)
lwz r3,TI_FLAGS(r9)
andi. r0,r3,_TIF_NEED_RESCHED
bne- 1b
andi. r0,r3,_TIF_NEED_RESCHED_MASK
bne- 2b
#ifdef CONFIG_TRACE_IRQFLAGS
/* And now, to properly rebalance the above, we tell lockdep they
* are being turned back on, which will happen when we return
@ -1228,7 +1235,7 @@ global_dbcr0:
#endif /* !(CONFIG_4xx || CONFIG_BOOKE) */
do_work: /* r10 contains MSR_KERNEL here */
andi. r0,r9,_TIF_NEED_RESCHED
andi. r0,r9,_TIF_NEED_RESCHED_MASK
beq do_user_signal
do_resched: /* r10 contains MSR_KERNEL here */
@ -1249,7 +1256,7 @@ recheck:
MTMSRD(r10) /* disable interrupts */
CURRENT_THREAD_INFO(r9, r1)
lwz r9,TI_FLAGS(r9)
andi. r0,r9,_TIF_NEED_RESCHED
andi. r0,r9,_TIF_NEED_RESCHED_MASK
bne- do_resched
andi. r0,r9,_TIF_USER_WORK_MASK
beq restore_user

View file

@ -592,7 +592,7 @@ _GLOBAL(ret_from_except_lite)
andi. r0,r4,_TIF_USER_WORK_MASK
beq restore
andi. r0,r4,_TIF_NEED_RESCHED
andi. r0,r4,_TIF_NEED_RESCHED_MASK
beq 1f
bl .restore_interrupts
bl .schedule
@ -642,10 +642,16 @@ resume_kernel:
#ifdef CONFIG_PREEMPT
/* Check if we need to preempt */
andi. r0,r4,_TIF_NEED_RESCHED
beq+ restore
/* Check that preempt_count() == 0 and interrupts are enabled */
lwz r8,TI_PREEMPT(r9)
andi. r0,r4,_TIF_NEED_RESCHED
bne+ check_count
andi. r0,r4,_TIF_NEED_RESCHED_LAZY
beq+ restore
lwz r8,TI_PREEMPT_LAZY(r9)
/* Check that preempt_count() == 0 and interrupts are enabled */
check_count:
cmpwi cr1,r8,0
ld r0,SOFTE(r1)
cmpdi r0,0
@ -662,7 +668,7 @@ resume_kernel:
/* Re-test flags and eventually loop */
CURRENT_THREAD_INFO(r9, r1)
ld r4,TI_FLAGS(r9)
andi. r0,r4,_TIF_NEED_RESCHED
andi. r0,r4,_TIF_NEED_RESCHED_MASK
bne 1b
/*

View file

@ -584,6 +584,7 @@ void irq_ctx_init(void)
}
}
#ifndef CONFIG_PREEMPT_RT_FULL
static inline void do_softirq_onstack(void)
{
struct thread_info *curtp, *irqtp;
@ -620,6 +621,7 @@ void do_softirq(void)
local_irq_restore(flags);
}
#endif
irq_hw_number_t virq_to_hw(unsigned int virq)
{

View file

@ -36,6 +36,7 @@
.text
#ifndef CONFIG_PREEMPT_RT_FULL
_GLOBAL(call_do_softirq)
mflr r0
stw r0,4(r1)
@ -46,6 +47,7 @@ _GLOBAL(call_do_softirq)
lwz r0,4(r1)
mtlr r0
blr
#endif
_GLOBAL(call_handle_irq)
mflr r0

View file

@ -29,6 +29,7 @@
.text
#ifndef CONFIG_PREEMPT_RT_FULL
_GLOBAL(call_do_softirq)
mflr r0
std r0,16(r1)
@ -39,6 +40,7 @@ _GLOBAL(call_do_softirq)
ld r0,16(r1)
mtlr r0
blr
#endif
_GLOBAL(call_handle_irq)
ld r8,0(r6)

View file

@ -156,15 +156,13 @@ static struct console udbg_console = {
.index = 0,
};
static int early_console_initialized;
/*
* Called by setup_system after ppc_md->probe and ppc_md->early_init.
* Call it again after setting udbg_putc in ppc_md->setup_arch.
*/
void __init register_early_udbg_console(void)
{
if (early_console_initialized)
if (early_console)
return;
if (!udbg_putc)
@ -174,7 +172,7 @@ void __init register_early_udbg_console(void)
printk(KERN_INFO "early console immortal !\n");
udbg_console.flags &= ~CON_BOOT;
}
early_console_initialized = 1;
early_console = &udbg_console;
register_console(&udbg_console);
}

View file

@ -259,7 +259,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
if (!arch_irq_disabled_regs(regs))
local_irq_enable();
if (in_atomic() || mm == NULL) {
if (!mm || pagefault_disabled()) {
if (!user_mode(regs))
return SIGSEGV;
/* in_atomic() in user mode is really bad,

View file

@ -43,6 +43,7 @@ static irqreturn_t timebase_interrupt(int irq, void *dev)
static struct irqaction tbint_irqaction = {
.handler = timebase_interrupt,
.flags = IRQF_NO_THREAD,
.name = "tbint",
};

View file

@ -120,6 +120,7 @@ static irqreturn_t cpm_error_interrupt(int irq, void *dev)
static struct irqaction cpm_error_irqaction = {
.handler = cpm_error_interrupt,
.flags = IRQF_NO_THREAD,
.name = "error",
};

View file

@ -333,6 +333,8 @@ static int fsl_of_msi_remove(struct platform_device *ofdev)
return 0;
}
static struct lock_class_key fsl_msi_irq_class;
static int fsl_msi_setup_hwirq(struct fsl_msi *msi, struct platform_device *dev,
int offset, int irq_index)
{
@ -351,7 +353,7 @@ static int fsl_msi_setup_hwirq(struct fsl_msi *msi, struct platform_device *dev,
dev_err(&dev->dev, "No memory for MSI cascade data\n");
return -ENOMEM;
}
irq_set_lockdep_class(virt_msir, &fsl_msi_irq_class);
msi->msi_virqs[irq_index] = virt_msir;
cascade_data->index = offset;
cascade_data->msi_data = msi;

View file

@ -296,7 +296,8 @@ static inline int do_exception(struct pt_regs *regs, int access)
* user context.
*/
fault = VM_FAULT_BADCONTEXT;
if (unlikely(!user_space_fault(trans_exc_code) || in_atomic() || !mm))
if (unlikely(!user_space_fault(trans_exc_code) ||
!mm || pagefault_disabled()))
goto out;
address = trans_exc_code & __FAIL_ADDR_MASK;
@ -435,7 +436,8 @@ void __kprobes do_asce_exception(struct pt_regs *regs)
clear_tsk_thread_flag(current, TIF_PER_TRAP);
trans_exc_code = regs->int_parm_long;
if (unlikely(!user_space_fault(trans_exc_code) || in_atomic() || !mm))
if (unlikely(!user_space_fault(trans_exc_code) || !mm ||
pagefault_disabled()))
goto no_context;
down_read(&mm->mmap_sem);

View file

@ -72,7 +72,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto bad_area_nosemaphore;
down_read(&mm->mmap_sem);

View file

@ -149,6 +149,7 @@ void irq_ctx_exit(int cpu)
hardirq_ctx[cpu] = NULL;
}
#ifndef CONFIG_PREEMPT_RT_FULL
asmlinkage void do_softirq(void)
{
unsigned long flags;
@ -191,6 +192,7 @@ asmlinkage void do_softirq(void)
local_irq_restore(flags);
}
#endif
#else
static inline void handle_one_irq(unsigned int irq)
{

View file

@ -144,8 +144,6 @@ static struct console bios_console = {
.index = -1,
};
static struct console *early_console;
static int __init setup_early_printk(char *buf)
{
int keep_early = 0;

View file

@ -440,7 +440,7 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
* If we're in an interrupt, have no user context or are running
* in an atomic region then we must not take the fault:
*/
if (unlikely(in_atomic() || !mm)) {
if (unlikely(!mm || pagefault_disabled())) {
bad_area_nosemaphore(regs, error_code, address);
return;
}

View file

@ -698,6 +698,7 @@ void __irq_entry handler_irq(int pil, struct pt_regs *regs)
set_irq_regs(old_regs);
}
#ifndef CONFIG_PREEMPT_RT_FULL
void do_softirq(void)
{
unsigned long flags;
@ -723,6 +724,7 @@ void do_softirq(void)
local_irq_restore(flags);
}
#endif
#ifdef CONFIG_HOTPLUG_CPU
void fixup_irqs(void)

View file

@ -64,7 +64,7 @@ int of_set_property(struct device_node *dp, const char *name, void *val, int len
err = -ENODEV;
mutex_lock(&of_set_property_mutex);
write_lock(&devtree_lock);
raw_spin_lock(&devtree_lock);
prevp = &dp->properties;
while (*prevp) {
struct property *prop = *prevp;
@ -91,7 +91,7 @@ int of_set_property(struct device_node *dp, const char *name, void *val, int len
}
prevp = &(*prevp)->next;
}
write_unlock(&devtree_lock);
raw_spin_unlock(&devtree_lock);
mutex_unlock(&of_set_property_mutex);
/* XXX Upate procfs if necessary... */

View file

@ -309,6 +309,7 @@ void __init setup_arch(char **cmdline_p)
boot_flags_init(*cmdline_p);
early_console = &prom_early_console;
register_console(&prom_early_console);
printk("ARCH: ");

View file

@ -551,6 +551,12 @@ static void __init init_sparc64_elf_hwcap(void)
pause_patch();
}
static inline void register_prom_console(void)
{
early_console = &prom_early_console;
register_console(&prom_early_console);
}
void __init setup_arch(char **cmdline_p)
{
/* Initialize PROM console and command line. */
@ -562,7 +568,7 @@ void __init setup_arch(char **cmdline_p)
#ifdef CONFIG_EARLYFB
if (btext_find_display())
#endif
register_console(&prom_early_console);
register_prom_console();
if (tlb_type == hypervisor)
printk("ARCH: SUN4V\n");

View file

@ -200,7 +200,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto no_context;
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);

View file

@ -321,7 +321,7 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
* If we're in an interrupt or have no user
* context, we must not take the fault..
*/
if (in_atomic() || !mm)
if (!mm || pagefault_disabled())
goto intr_or_no_mm;
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);

View file

@ -17,6 +17,7 @@
#include <linux/init.h>
#include <linux/string.h>
#include <linux/irqflags.h>
#include <linux/printk.h>
#include <asm/setup.h>
#include <hv/hypervisor.h>
@ -33,25 +34,8 @@ static struct console early_hv_console = {
};
/* Direct interface for emergencies */
static struct console *early_console = &early_hv_console;
static int early_console_initialized;
static int early_console_complete;
static void early_vprintk(const char *fmt, va_list ap)
{
char buf[512];
int n = vscnprintf(buf, sizeof(buf), fmt, ap);
early_console->write(early_console, buf, n);
}
void early_printk(const char *fmt, ...)
{
va_list ap;
va_start(ap, fmt);
early_vprintk(fmt, ap);
va_end(ap);
}
void early_panic(const char *fmt, ...)
{
va_list ap;
@ -69,14 +53,13 @@ static int __initdata keep_early;
static int __init setup_early_printk(char *str)
{
if (early_console_initialized)
if (early_console)
return 1;
if (str != NULL && strncmp(str, "keep", 4) == 0)
keep_early = 1;
early_console = &early_hv_console;
early_console_initialized = 1;
register_console(early_console);
return 0;
@ -85,12 +68,12 @@ static int __init setup_early_printk(char *str)
void __init disable_early_printk(void)
{
early_console_complete = 1;
if (!early_console_initialized || !early_console)
if (!early_console)
return;
if (!keep_early) {
early_printk("disabling early console\n");
unregister_console(early_console);
early_console_initialized = 0;
early_console = NULL;
} else {
early_printk("keeping early console\n");
}
@ -98,7 +81,7 @@ void __init disable_early_printk(void)
void warn_early_printk(void)
{
if (early_console_complete || early_console_initialized)
if (early_console_complete || early_console)
return;
early_printk("\
Machine shutting down before console output is fully initialized.\n\

View file

@ -360,7 +360,7 @@ static int handle_page_fault(struct pt_regs *regs,
* If we're in an interrupt, have no user context or are running in an
* atomic region then we must not take the fault.
*/
if (in_atomic() || !mm) {
if (!mm || pagefault_disabled()) {
vma = NULL; /* happy compiler */
goto bad_area_nosemaphore;
}

View file

@ -16,7 +16,7 @@ static void early_console_write(struct console *con, const char *s, unsigned int
um_early_printk(s, n);
}
static struct console early_console = {
static struct console early_console_dev = {
.name = "earlycon",
.write = early_console_write,
.flags = CON_BOOT,
@ -25,8 +25,10 @@ static struct console early_console = {
static int __init setup_early_printk(char *buf)
{
register_console(&early_console);
if (!early_console) {
early_console = &early_console_dev;
register_console(&early_console_dev);
}
return 0;
}

View file

@ -39,7 +39,7 @@ int handle_page_fault(unsigned long address, unsigned long ip,
* If the fault was during atomic operation, don't take the fault, just
* fail.
*/
if (in_atomic())
if (pagefault_disabled())
goto out_nosemaphore;
retry:

View file

@ -33,21 +33,17 @@ static struct console early_ocd_console = {
.index = -1,
};
/* Direct interface for emergencies */
static struct console *early_console = &early_ocd_console;
static int __initdata keep_early;
static int __init setup_early_printk(char *buf)
{
if (!buf)
int keep_early;
if (!buf || early_console)
return 0;
if (strstr(buf, "keep"))
keep_early = 1;
if (!strncmp(buf, "ocd", 3))
early_console = &early_ocd_console;
early_console = &early_ocd_console;
if (keep_early)
early_console->flags &= ~CON_BOOT;

View file

@ -108,6 +108,7 @@ config X86
select KTIME_SCALAR if X86_32
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select HAVE_PREEMPT_LAZY
select HAVE_CONTEXT_TRACKING if X86_64
select HAVE_IRQ_TIME_ACCOUNTING
select MODULES_USE_ELF_REL if X86_32
@ -173,8 +174,11 @@ config ARCH_MAY_HAVE_PC_FDC
def_bool y
depends on ISA_DMA_API
config RWSEM_GENERIC_SPINLOCK
def_bool PREEMPT_RT_FULL
config RWSEM_XCHGADD_ALGORITHM
def_bool y
def_bool !RWSEM_GENERIC_SPINLOCK && !PREEMPT_RT_FULL
config GENERIC_CALIBRATE_DELAY
def_bool y
@ -772,7 +776,7 @@ config IOMMU_HELPER
config MAXSMP
bool "Enable Maximum number of SMP Processors and NUMA Nodes"
depends on X86_64 && SMP && DEBUG_KERNEL && EXPERIMENTAL
select CPUMASK_OFFSTACK
select CPUMASK_OFFSTACK if !PREEMPT_RT_FULL
---help---
Enable maximum number of CPUS and NUMA Nodes for this architecture.
If unsure, say N.

View file

@ -250,14 +250,14 @@ static int ecb_encrypt(struct blkcipher_desc *desc,
err = blkcipher_walk_virt(desc, &walk);
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_ecb_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK);
nbytes & AES_BLOCK_MASK);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = blkcipher_walk_done(desc, &walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -274,14 +274,14 @@ static int ecb_decrypt(struct blkcipher_desc *desc,
err = blkcipher_walk_virt(desc, &walk);
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_ecb_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = blkcipher_walk_done(desc, &walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -298,14 +298,14 @@ static int cbc_encrypt(struct blkcipher_desc *desc,
err = blkcipher_walk_virt(desc, &walk);
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK, walk.iv);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = blkcipher_walk_done(desc, &walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -322,14 +322,14 @@ static int cbc_decrypt(struct blkcipher_desc *desc,
err = blkcipher_walk_virt(desc, &walk);
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
kernel_fpu_begin();
while ((nbytes = walk.nbytes)) {
kernel_fpu_begin();
aesni_cbc_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK, walk.iv);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = blkcipher_walk_done(desc, &walk, nbytes);
}
kernel_fpu_end();
return err;
}
@ -362,18 +362,20 @@ static int ctr_crypt(struct blkcipher_desc *desc,
err = blkcipher_walk_virt_block(desc, &walk, AES_BLOCK_SIZE);
desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
kernel_fpu_begin();
while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) {
kernel_fpu_begin();
aesni_ctr_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr,
nbytes & AES_BLOCK_MASK, walk.iv);
kernel_fpu_end();
nbytes &= AES_BLOCK_SIZE - 1;
err = blkcipher_walk_done(desc, &walk, nbytes);
}
if (walk.nbytes) {
kernel_fpu_begin();
ctr_crypt_final(ctx, &walk);
kernel_fpu_end();
err = blkcipher_walk_done(desc, &walk, 0);
}
kernel_fpu_end();
return err;
}

View file

@ -51,8 +51,8 @@
#define ACPI_ASM_MACROS
#define BREAKPOINT3
#define ACPI_DISABLE_IRQS() local_irq_disable()
#define ACPI_ENABLE_IRQS() local_irq_enable()
#define ACPI_DISABLE_IRQS() local_irq_disable_nort()
#define ACPI_ENABLE_IRQS() local_irq_enable_nort()
#define ACPI_FLUSH_CPU_CACHE() wbinvd()
int __acpi_acquire_global_lock(unsigned int *lock);

View file

@ -14,12 +14,21 @@
#define IRQ_STACK_ORDER 2
#define IRQ_STACK_SIZE (PAGE_SIZE << IRQ_STACK_ORDER)
#define STACKFAULT_STACK 1
#define DOUBLEFAULT_STACK 2
#define NMI_STACK 3
#define DEBUG_STACK 4
#define MCE_STACK 5
#define N_EXCEPTION_STACKS 5 /* hw limit: 7 */
#ifdef CONFIG_PREEMPT_RT_FULL
# define STACKFAULT_STACK 0
# define DOUBLEFAULT_STACK 1
# define NMI_STACK 2
# define DEBUG_STACK 0
# define MCE_STACK 3
# define N_EXCEPTION_STACKS 3 /* hw limit: 7 */
#else
# define STACKFAULT_STACK 1
# define DOUBLEFAULT_STACK 2
# define NMI_STACK 3
# define DEBUG_STACK 4
# define MCE_STACK 5
# define N_EXCEPTION_STACKS 5 /* hw limit: 7 */
#endif
#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT)
#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1))

View file

@ -23,6 +23,19 @@ typedef struct {
unsigned long sig[_NSIG_WORDS];
} sigset_t;
/*
* Because some traps use the IST stack, we must keep preemption
* disabled while calling do_trap(), but do_trap() may call
* force_sig_info(), which grabs the signal spinlocks for the
* task; in PREEMPT_RT_FULL those are mutexes. With
* ARCH_RT_DELAYS_SIGNAL_SEND defined, force_sig_info() instead
* sets TIF_NOTIFY_RESUME and queues the signal to be sent on
* exit from the trap.
*/
#if defined(CONFIG_PREEMPT_RT_FULL) && defined(CONFIG_X86_64)
#define ARCH_RT_DELAYS_SIGNAL_SEND
#endif
#ifndef CONFIG_COMPAT
typedef sigset_t compat_sigset_t;
#endif
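The sending side that this define enables lives in the core signal code later in this series. Sketched roughly (not the exact patch): when called from atomic context, force_sig_info() stashes the siginfo on the task and defers delivery rather than taking the sighand lock, which is a sleeping lock on RT:

#ifdef ARCH_RT_DELAYS_SIGNAL_SEND
	if (in_atomic()) {
		if (WARN_ON_ONCE(t != current))
			return 0;
		if (WARN_ON_ONCE(t->forced_info.si_signo))
			return 0;
		t->forced_info = *info;	/* picked up by do_notify_resume() */
		set_tsk_thread_flag(t, TIF_NOTIFY_RESUME);
		return 0;
	}
#endif

The matching delivery side is in the do_notify_resume() hunk further down.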

View file

@ -57,7 +57,7 @@
*/
static __always_inline void boot_init_stack_canary(void)
{
u64 canary;
u64 uninitialized_var(canary);
u64 tsc;
#ifdef CONFIG_X86_64
@ -68,8 +68,16 @@ static __always_inline void boot_init_stack_canary(void)
* of randomness. The TSC only matters for very early init,
* where it already has some randomness on most systems. Later
* on during bootup the random pool has true entropy too.
*
* For preempt-rt we need to weaken the randomness a bit, as
* we can't call into the random generator from atomic context
* due to locking constraints. We just leave the canary
* uninitialized and use the TSC-based randomness on top of
* it.
*/
#ifndef CONFIG_PREEMPT_RT_FULL
get_random_bytes(&canary, sizeof(canary));
#endif
tsc = __native_read_tsc();
canary += tsc + (tsc << 32UL);

View file

@ -31,6 +31,8 @@ struct thread_info {
__u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable,
<0 => BUG */
int preempt_lazy_count; /* 0 => lazy preemptable,
<0 => BUG */
mm_segment_t addr_limit;
struct restart_block restart_block;
void __user *sysenter_return;
@ -82,6 +84,7 @@ struct thread_info {
#define TIF_SYSCALL_EMU 6 /* syscall emulation active */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SECCOMP 8 /* secure computing */
#define TIF_NEED_RESCHED_LAZY 9 /* lazy rescheduling necessary */
#define TIF_MCE_NOTIFY 10 /* notify userspace of an MCE */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
@ -107,6 +110,7 @@ struct thread_info {
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY)
#define _TIF_MCE_NOTIFY (1 << TIF_MCE_NOTIFY)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
#define _TIF_UPROBE (1 << TIF_UPROBE)
@ -157,6 +161,8 @@ struct thread_info {
#define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
#define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW|_TIF_DEBUG)
#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)
#define PREEMPT_ACTIVE 0x10000000
#ifdef CONFIG_X86_32
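These are the x86 hooks for the series' lazy-preemption scheme: SCHED_OTHER wakeups set TIF_NEED_RESCHED_LAZY, which is honored only once preempt_lazy_count drops to zero, while TIF_NEED_RESCHED keeps its immediate semantics. A rough C rendering of the kernel-return check that the 32-bit entry assembly below implements (the interrupts-off test is omitted; ti is the current thread_info):

	if (!ti->preempt_count) {
		if (ti->flags & _TIF_NEED_RESCHED)
			preempt_schedule_irq();
		else if (!ti->preempt_lazy_count &&
			 (ti->flags & _TIF_NEED_RESCHED_LAZY))
			preempt_schedule_irq();
	}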

View file

@ -2428,7 +2428,8 @@ static bool io_apic_level_ack_pending(struct irq_cfg *cfg)
static inline bool ioapic_irqd_mask(struct irq_data *data, struct irq_cfg *cfg)
{
/* If we are moving the irq we need to mask it */
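	/* With forced-threaded handlers the irq may still be in progress
	 * when the affinity change arrives; defer the mask-and-move to the
	 * next occurrence of the interrupt in that case. */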
if (unlikely(irqd_is_setaffinity_pending(data))) {
if (unlikely(irqd_is_setaffinity_pending(data) &&
!irqd_irq_inprogress(data))) {
mask_ioapic(cfg);
return true;
}

View file

@ -33,6 +33,7 @@ void common(void) {
OFFSET(TI_status, thread_info, status);
OFFSET(TI_addr_limit, thread_info, addr_limit);
OFFSET(TI_preempt_count, thread_info, preempt_count);
OFFSET(TI_preempt_lazy_count, thread_info, preempt_lazy_count);
BLANK();
OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);

View file

@ -1103,7 +1103,9 @@ DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
*/
static const unsigned int exception_stack_sizes[N_EXCEPTION_STACKS] = {
[0 ... N_EXCEPTION_STACKS - 1] = EXCEPTION_STKSZ,
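	/*
	 * DEBUG_STACK is defined to 0 on PREEMPT_RT_FULL (see the exception
	 * stack hunk above), so this initializer must be compiled out to
	 * avoid an invalid [-1] array index.
	 */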
#if DEBUG_STACK > 0
[DEBUG_STACK - 1] = DEBUG_STKSZ
#endif
};
static DEFINE_PER_CPU_PAGE_ALIGNED(char, exception_stacks

View file

@ -41,6 +41,7 @@
#include <linux/debugfs.h>
#include <linux/irq_work.h>
#include <linux/export.h>
#include <linux/jiffies.h>
#include <asm/processor.h>
#include <asm/mce.h>
@ -1259,7 +1260,7 @@ void mce_log_therm_throt_event(__u64 status)
static unsigned long check_interval = 5 * 60; /* 5 minutes */
static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
static DEFINE_PER_CPU(struct timer_list, mce_timer);
static DEFINE_PER_CPU(struct hrtimer, mce_timer);
static unsigned long mce_adjust_timer_default(unsigned long interval)
{
@ -1269,13 +1270,10 @@ static unsigned long mce_adjust_timer_default(unsigned long interval)
static unsigned long (*mce_adjust_timer)(unsigned long interval) =
mce_adjust_timer_default;
static void mce_timer_fn(unsigned long data)
static enum hrtimer_restart mce_timer_fn(struct hrtimer *timer)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
unsigned long iv;
WARN_ON(smp_processor_id() != data);
if (mce_available(__this_cpu_ptr(&cpu_info))) {
machine_check_poll(MCP_TIMESTAMP,
&__get_cpu_var(mce_poll_banks));
@ -1296,9 +1294,10 @@ static void mce_timer_fn(unsigned long data)
__this_cpu_write(mce_next_interval, iv);
/* Might have become 0 after CMCI storm subsided */
if (iv) {
t->expires = jiffies + iv;
add_timer_on(t, smp_processor_id());
hrtimer_forward_now(timer, ns_to_ktime(jiffies_to_usecs(iv) * 1000ULL));
return HRTIMER_RESTART;
}
return HRTIMER_NORESTART;
}
/*
@ -1306,28 +1305,37 @@ static void mce_timer_fn(unsigned long data)
*/
void mce_timer_kick(unsigned long interval)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
unsigned long when = jiffies + interval;
struct hrtimer *t = &__get_cpu_var(mce_timer);
unsigned long iv = __this_cpu_read(mce_next_interval);
if (timer_pending(t)) {
if (time_before(when, t->expires))
mod_timer_pinned(t, when);
if (hrtimer_active(t)) {
s64 exp;
s64 intv_us;
intv_us = jiffies_to_usecs(interval);
exp = ktime_to_us(hrtimer_expires_remaining(t));
if (intv_us < exp) {
hrtimer_cancel(t);
hrtimer_start_range_ns(t,
ns_to_ktime(intv_us * 1000),
0, HRTIMER_MODE_REL_PINNED);
}
} else {
t->expires = round_jiffies(when);
add_timer_on(t, smp_processor_id());
hrtimer_start_range_ns(t,
ns_to_ktime(jiffies_to_usecs(interval) * 1000),
0, HRTIMER_MODE_REL_PINNED);
}
if (interval < iv)
__this_cpu_write(mce_next_interval, interval);
}
/* Must not be called in IRQ context where del_timer_sync() can deadlock */
/* Must not be called in IRQ context where hrtimer_cancel() can deadlock */
static void mce_timer_delete_all(void)
{
int cpu;
for_each_online_cpu(cpu)
del_timer_sync(&per_cpu(mce_timer, cpu));
hrtimer_cancel(&per_cpu(mce_timer, cpu));
}
static void mce_do_trigger(struct work_struct *work)
@ -1632,7 +1640,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
}
}
static void mce_start_timer(unsigned int cpu, struct timer_list *t)
static void mce_start_timer(unsigned int cpu, struct hrtimer *t)
{
unsigned long iv = mce_adjust_timer(check_interval * HZ);
@ -1641,16 +1649,17 @@ static void mce_start_timer(unsigned int cpu, struct timer_list *t)
if (mca_cfg.ignore_ce || !iv)
return;
t->expires = round_jiffies(jiffies + iv);
add_timer_on(t, smp_processor_id());
hrtimer_start_range_ns(t, ns_to_ktime(jiffies_to_usecs(iv) * 1000),
0, HRTIMER_MODE_REL_PINNED);
}
static void __mcheck_cpu_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
struct hrtimer *t = &__get_cpu_var(mce_timer);
unsigned int cpu = smp_processor_id();
setup_timer(t, mce_timer_fn, cpu);
hrtimer_init(t, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
t->function = mce_timer_fn;
mce_start_timer(cpu, t);
}
@ -2307,6 +2316,8 @@ static void __cpuinit mce_disable_cpu(void *h)
if (!mce_available(__this_cpu_ptr(&cpu_info)))
return;
hrtimer_cancel(&__get_cpu_var(mce_timer));
if (!(action & CPU_TASKS_FROZEN))
cmci_clear();
for (i = 0; i < mca_cfg.banks; i++) {
@ -2333,6 +2344,7 @@ static void __cpuinit mce_reenable_cpu(void *h)
if (b->init)
wrmsrl(MSR_IA32_MCx_CTL(i), b->ctl);
}
__mcheck_cpu_init_timer();
}
/* Get notified when a cpu comes on/off. Be hotplug friendly. */
@ -2340,7 +2352,6 @@ static int __cpuinit
mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
{
unsigned int cpu = (unsigned long)hcpu;
struct timer_list *t = &per_cpu(mce_timer, cpu);
switch (action & ~CPU_TASKS_FROZEN) {
case CPU_ONLINE:
@ -2356,11 +2367,9 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
break;
case CPU_DOWN_PREPARE:
smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
del_timer_sync(t);
break;
case CPU_DOWN_FAILED:
smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
mce_start_timer(cpu, t);
break;
}
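The conversion above replaces the jiffies-based timer_list, whose callback runs from the (on RT, threaded) timer softirq, with a CPU-pinned, self-rearming hrtimer. The general shape of such a conversion, with do_poll() as a hypothetical payload and poll_ns an assumed interval:

static struct hrtimer poll_timer;
static u64 poll_ns = 5ULL * 60 * NSEC_PER_SEC;	/* assumed: 5 min */

static enum hrtimer_restart poll_fn(struct hrtimer *t)
{
	do_poll();	/* runs in interrupt context: must not sleep */
	hrtimer_forward_now(t, ns_to_ktime(poll_ns));
	return HRTIMER_RESTART;	/* rearm, replacing mod_timer() */
}

static void poll_start(void)
{
	hrtimer_init(&poll_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	poll_timer.function = poll_fn;
	hrtimer_start_range_ns(&poll_timer, ns_to_ktime(poll_ns), 0,
			       HRTIMER_MODE_REL_PINNED);
}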

View file

@ -108,6 +108,7 @@ struct intel_shared_regs {
struct er_account regs[EXTRA_REG_MAX];
int refcnt; /* per-core: #HT threads */
unsigned core_id; /* per-core: core id */
struct rcu_head rcu;
};
#define MAX_LBR_ENTRIES 16

View file

@ -1715,7 +1715,7 @@ static void intel_pmu_cpu_dying(int cpu)
pc = cpuc->shared_regs;
if (pc) {
if (pc->core_id == -1 || --pc->refcnt == 0)
kfree(pc);
kfree_rcu(pc, rcu);
cpuc->shared_regs = NULL;
}

View file

@ -2636,7 +2636,7 @@ static void __cpuinit uncore_cpu_dying(int cpu)
box = *per_cpu_ptr(pmu->box, cpu);
*per_cpu_ptr(pmu->box, cpu) = NULL;
if (box && atomic_dec_and_test(&box->refcnt))
kfree(box);
kfree_rcu(box, rcu);
}
}
}
@ -2666,7 +2666,8 @@ static int __cpuinit uncore_cpu_starting(int cpu)
if (exist && exist->phys_id == phys_id) {
atomic_inc(&exist->refcnt);
*per_cpu_ptr(pmu->box, cpu) = exist;
kfree(box);
if (box)
kfree_rcu(box, rcu);
box = NULL;
break;
}

View file

@ -421,6 +421,7 @@ struct intel_uncore_box {
struct hrtimer hrtimer;
struct list_head list;
struct intel_uncore_extra_reg shared_regs[0];
struct rcu_head rcu;
};
#define UNCORE_BOX_FLAG_INITIATED 0
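kfree_rcu() requires a struct rcu_head embedded in the object, named by its second argument, and defers the actual kfree() past an RCU grace period so concurrent readers under rcu_read_lock() can never touch freed memory. A minimal sketch of the idiom (struct box_like is illustrative):

struct box_like {
	atomic_t refcnt;
	struct rcu_head rcu;	/* bookkeeping consumed by kfree_rcu() */
};

static void put_box(struct box_like *b)
{
	if (b && atomic_dec_and_test(&b->refcnt))
		kfree_rcu(b, rcu);	/* freed only after a grace period */
}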

View file

@ -21,10 +21,14 @@
(N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
static char x86_stack_ids[][8] = {
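	/* On PREEMPT_RT_FULL, DEBUG_STACK and STACKFAULT_STACK are 0, so
	 * their designated initializers must be compiled out. */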
#if DEBUG_STACK > 0
[ DEBUG_STACK-1 ] = "#DB",
#endif
[ NMI_STACK-1 ] = "NMI",
[ DOUBLEFAULT_STACK-1 ] = "#DF",
#if STACKFAULT_STACK > 0
[ STACKFAULT_STACK-1 ] = "#SS",
#endif
[ MCE_STACK-1 ] = "#MC",
#if DEBUG_STKSZ > EXCEPTION_STKSZ
[ N_EXCEPTION_STACKS ...

View file

@ -169,25 +169,9 @@ static struct console early_serial_console = {
.index = -1,
};
/* Direct interface for emergencies */
static struct console *early_console = &early_vga_console;
static int __initdata early_console_initialized;
asmlinkage void early_printk(const char *fmt, ...)
{
char buf[512];
int n;
va_list ap;
va_start(ap, fmt);
n = vscnprintf(buf, sizeof(buf), fmt, ap);
early_console->write(early_console, buf, n);
va_end(ap);
}
static inline void early_console_register(struct console *con, int keep_early)
{
if (early_console->index != -1) {
if (con->index != -1) {
printk(KERN_CRIT "ERROR: earlyprintk= %s already used\n",
con->name);
return;
@ -207,9 +191,8 @@ static int __init setup_early_printk(char *buf)
if (!buf)
return 0;
if (early_console_initialized)
if (early_console)
return 0;
early_console_initialized = 1;
keep = (strstr(buf, "keep") != NULL);

View file

@ -364,14 +364,22 @@ ENTRY(resume_kernel)
DISABLE_INTERRUPTS(CLBR_ANY)
cmpl $0,TI_preempt_count(%ebp) # non-zero preempt_count ?
jnz restore_all
need_resched:
movl TI_flags(%ebp), %ecx # need_resched set ?
testb $_TIF_NEED_RESCHED, %cl
jnz 1f
cmpl $0,TI_preempt_lazy_count(%ebp) # non-zero preempt_lazy_count ?
jnz restore_all
testl $_TIF_NEED_RESCHED_LAZY, %ecx
jz restore_all
testl $X86_EFLAGS_IF,PT_EFLAGS(%esp) # interrupts off (exception path) ?
1: testl $X86_EFLAGS_IF,PT_EFLAGS(%esp) # interrupts off (exception path) ?
jz restore_all
call preempt_schedule_irq
jmp need_resched
movl TI_flags(%ebp), %ecx # need_resched set ?
testl $_TIF_NEED_RESCHED_MASK, %ecx
jnz 1b
jmp restore_all
END(resume_kernel)
#endif
CFI_ENDPROC
@ -607,7 +615,7 @@ ENDPROC(system_call)
ALIGN
RING0_PTREGS_FRAME # can't unwind into user space anyway
work_pending:
testb $_TIF_NEED_RESCHED, %cl
testl $_TIF_NEED_RESCHED_MASK, %ecx
jz work_notifysig
work_resched:
call schedule
@ -620,7 +628,7 @@ work_resched:
andl $_TIF_WORK_MASK, %ecx # is there any work to be done other
# than syscall tracing?
jz restore_all
testb $_TIF_NEED_RESCHED, %cl
testl $_TIF_NEED_RESCHED_MASK, %ecx
jnz work_resched
work_notifysig: # deal with pending signals and

View file

@ -673,8 +673,8 @@ sysret_check:
/* Handle reschedules */
/* edx: work, edi: workmask */
sysret_careful:
bt $TIF_NEED_RESCHED,%edx
jnc sysret_signal
testl $_TIF_NEED_RESCHED_MASK,%edx
jz sysret_signal
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushq_cfi %rdi
@ -786,8 +786,8 @@ GLOBAL(int_with_check)
/* First do a reschedule test. */
/* edx: work, edi: workmask */
int_careful:
bt $TIF_NEED_RESCHED,%edx
jnc int_very_careful
testl $_TIF_NEED_RESCHED_MASK,%edx
jz int_very_careful
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushq_cfi %rdi
@ -1094,8 +1094,8 @@ bad_iret:
/* edi: workmask, edx: work */
retint_careful:
CFI_RESTORE_STATE
bt $TIF_NEED_RESCHED,%edx
jnc retint_signal
testl $_TIF_NEED_RESCHED_MASK,%edx
jz retint_signal
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushq_cfi %rdi
@ -1128,9 +1128,15 @@ retint_signal:
ENTRY(retint_kernel)
cmpl $0,TI_preempt_count(%rcx)
jnz retint_restore_args
bt $TIF_NEED_RESCHED,TI_flags(%rcx)
bt $TIF_NEED_RESCHED,TI_flags(%rcx)
jc 1f
cmpl $0,TI_preempt_lazy_count(%rcx)
jnz retint_restore_args
bt $TIF_NEED_RESCHED_LAZY,TI_flags(%rcx)
jnc retint_restore_args
bt $9,EFLAGS-ARGOFFSET(%rsp) /* interrupts off? */
1: bt $9,EFLAGS-ARGOFFSET(%rsp) /* interrupts off? */
jnc retint_restore_args
call preempt_schedule_irq
jmp exit_intr
@ -1337,6 +1343,7 @@ bad_gs:
jmp 2b
.previous
#ifndef CONFIG_PREEMPT_RT_FULL
/* Call softirq on interrupt stack. Interrupts are off. */
ENTRY(call_softirq)
CFI_STARTPROC
@ -1356,6 +1363,7 @@ ENTRY(call_softirq)
ret
CFI_ENDPROC
END(call_softirq)
#endif
#ifdef CONFIG_XEN
zeroentry xen_hypervisor_callback xen_do_hypervisor_callback
@ -1520,7 +1528,7 @@ paranoid_userspace:
movq %rsp,%rdi /* &pt_regs */
call sync_regs
movq %rax,%rsp /* switch stack for scheduling */
testl $_TIF_NEED_RESCHED,%ebx
testl $_TIF_NEED_RESCHED_MASK,%ebx
jnz paranoid_schedule
movl %ebx,%edx /* arg3: thread flags */
TRACE_IRQS_ON

View file

@ -8,6 +8,7 @@
#include <linux/slab.h>
#include <linux/hpet.h>
#include <linux/init.h>
#include <linux/dmi.h>
#include <linux/cpu.h>
#include <linux/pm.h>
#include <linux/io.h>
@ -573,6 +574,30 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
#define RESERVE_TIMERS 0
#endif
static int __init dmi_disable_hpet_msi(const struct dmi_system_id *d)
{
hpet_msi_disable = 1;
return 0;
}
static struct dmi_system_id __initdata dmi_hpet_table[] = {
/*
* MSI-based per-CPU timers lose interrupts when intel_idle()
* is enabled, independent of the C-state. With idle=poll the
* problem cannot be observed. We have no idea yet whether
* this is a W510-specific issue or a general chipset oddity.
*/
{
.callback = dmi_disable_hpet_msi,
.ident = "Lenovo W510",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad W510"),
},
},
{}
};
static void hpet_msi_capability_lookup(unsigned int start_timer)
{
unsigned int id;
@ -580,6 +605,8 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
unsigned int num_timers_used = 0;
int i;
dmi_check_system(dmi_hpet_table);
if (hpet_msi_disable)
return;

View file

@ -149,6 +149,7 @@ void __cpuinit irq_ctx_init(int cpu)
cpu, per_cpu(hardirq_ctx, cpu), per_cpu(softirq_ctx, cpu));
}
#ifndef CONFIG_PREEMPT_RT_FULL
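/* On PREEMPT_RT_FULL softirqs are executed from task context by the
 * generic softirq code in this series, so the arch-private softirq
 * stack switching below is compiled out. */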
asmlinkage void do_softirq(void)
{
unsigned long flags;
@ -179,6 +180,7 @@ asmlinkage void do_softirq(void)
local_irq_restore(flags);
}
#endif
bool handle_irq(unsigned irq, struct pt_regs *regs)
{

View file

@ -88,7 +88,7 @@ bool handle_irq(unsigned irq, struct pt_regs *regs)
return true;
}
#ifndef CONFIG_PREEMPT_RT_FULL
extern void call_softirq(void);
asmlinkage void do_softirq(void)
@ -108,3 +108,4 @@ asmlinkage void do_softirq(void)
}
local_irq_restore(flags);
}
#endif

View file

@ -18,6 +18,7 @@ void smp_irq_work_interrupt(struct pt_regs *regs)
irq_exit();
}
#ifndef CONFIG_PREEMPT_RT_FULL
void arch_irq_work_raise(void)
{
#ifdef CONFIG_X86_LOCAL_APIC
@ -28,3 +29,4 @@ void arch_irq_work_raise(void)
apic_wait_icr_idle();
#endif
}
#endif

View file

@ -36,6 +36,7 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <linux/kdebug.h>
#include <linux/highmem.h>
#include <asm/pgtable.h>
#include <asm/ldt.h>
@ -216,6 +217,35 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
}
EXPORT_SYMBOL_GPL(start_thread);
#ifdef CONFIG_PREEMPT_RT_FULL
static void switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p)
{
int i;
/*
* Clear @prev_p's kmap_atomic mappings
*/
for (i = 0; i < prev_p->kmap_idx; i++) {
int idx = i + KM_TYPE_NR * smp_processor_id();
pte_t *ptep = kmap_pte - idx;
kpte_clear_flush(ptep, __fix_to_virt(FIX_KMAP_BEGIN + idx));
}
/*
* Restore @next_p's kmap_atomic mappings
*/
for (i = 0; i < next_p->kmap_idx; i++) {
int idx = i + KM_TYPE_NR * smp_processor_id();
if (!pte_none(next_p->kmap_pte[i]))
set_pte(kmap_pte - idx, next_p->kmap_pte[i]);
}
}
#else
static inline void
switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p) { }
#endif
/*
* switch_to(x,y) should switch tasks from x to y.
@ -295,6 +325,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
__switch_to_xtra(prev_p, next_p, tss);
switch_kmaps(prev_p, next_p);
/*
* Leave lazy mode, flushing any hypercalls made here.
* This must be done before restoring TLS segments so
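switch_kmaps() replays per-task state that the RT kmap_atomic implementation (elsewhere in this series) records, since on RT a task holding a kmap_atomic slot may be preempted. The save side looks roughly like this sketch; the field names follow the code above, the rest is assumption:

	/* sketch: remember the pte per task so switch_kmaps() can replay it */
	type = current->kmap_idx++;
	idx  = type + KM_TYPE_NR * smp_processor_id();
	current->kmap_pte[type] = pte;
	set_pte(kmap_pte - idx, pte);
	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);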

View file

@ -808,6 +808,14 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
mce_notify_process();
#endif /* CONFIG_X86_64 && CONFIG_X86_MCE */
#ifdef ARCH_RT_DELAYS_SIGNAL_SEND
if (unlikely(current->forced_info.si_signo)) {
struct task_struct *t = current;
force_sig_info(t->forced_info.si_signo, &t->forced_info, t);
t->forced_info.si_signo = 0;
}
#endif
if (thread_info_flags & _TIF_UPROBE)
uprobe_notify_resume(regs);

View file

@ -85,9 +85,21 @@ static inline void conditional_sti(struct pt_regs *regs)
local_irq_enable();
}
static inline void preempt_conditional_sti(struct pt_regs *regs)
static inline void conditional_sti_ist(struct pt_regs *regs)
{
#ifdef CONFIG_X86_64
/*
* X86_64 uses a per-CPU IST stack for certain traps such as
* int3. A task cannot be preempted while using one of these
* stacks; thus preemption must be disabled, otherwise the
* stack can be corrupted if the task is scheduled out and
* another task comes in and uses the same stack.
*
* On x86_32 the task keeps its own stack and it is OK if the
* task schedules out.
*/
inc_preempt_count();
#endif
if (regs->flags & X86_EFLAGS_IF)
local_irq_enable();
}
@ -98,11 +110,13 @@ static inline void conditional_cli(struct pt_regs *regs)
local_irq_disable();
}
static inline void preempt_conditional_cli(struct pt_regs *regs)
static inline void conditional_cli_ist(struct pt_regs *regs)
{
if (regs->flags & X86_EFLAGS_IF)
local_irq_disable();
#ifdef CONFIG_X86_64
dec_preempt_count();
#endif
}
static int __kprobes
@ -229,9 +243,9 @@ dotraplinkage void do_stack_segment(struct pt_regs *regs, long error_code)
exception_enter(regs);
if (notify_die(DIE_TRAP, "stack segment", regs, error_code,
X86_TRAP_SS, SIGBUS) != NOTIFY_STOP) {
preempt_conditional_sti(regs);
conditional_sti_ist(regs);
do_trap(X86_TRAP_SS, SIGBUS, "stack segment", regs, error_code, NULL);
preempt_conditional_cli(regs);
conditional_cli_ist(regs);
}
exception_exit(regs);
}
@ -331,9 +345,9 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
* as we may switch to the interrupt stack.
*/
debug_stack_usage_inc();
preempt_conditional_sti(regs);
conditional_sti_ist(regs);
do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, NULL);
preempt_conditional_cli(regs);
conditional_cli_ist(regs);
debug_stack_usage_dec();
exit:
exception_exit(regs);
@ -438,12 +452,12 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
debug_stack_usage_inc();
/* It's safe to allow irq's after DR6 has been saved */
preempt_conditional_sti(regs);
conditional_sti_ist(regs);
if (regs->flags & X86_VM_MASK) {
handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code,
X86_TRAP_DB);
preempt_conditional_cli(regs);
conditional_cli_ist(regs);
debug_stack_usage_dec();
goto exit;
}
@ -463,7 +477,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
si_code = get_si_code(tsk->thread.debugreg6);
if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
send_sigtrap(tsk, regs, error_code, si_code);
preempt_conditional_cli(regs);
conditional_cli_ist(regs);
debug_stack_usage_dec();
exit:

View file

@ -5242,6 +5242,13 @@ int kvm_arch_init(void *opaque)
goto out;
}
#ifdef CONFIG_PREEMPT_RT_FULL
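	/*
	 * Assumed rationale, paraphrased from this series: without an
	 * invariant TSC, KVM must track CPU-frequency changes in long
	 * preempt-disabled sections, which is unacceptable on RT, so
	 * initialization is refused up front.
	 */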
if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
printk(KERN_ERR "RT requires X86_FEATURE_CONSTANT_TSC\n");
return -EOPNOTSUPP;
}
#endif
r = kvm_mmu_module_init();
if (r)
goto out_free_percpu;

Some files were not shown because too many files have changed in this diff.