aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2009-09-03perf_counter: Fix output-sharing error pathIngo Molnar
We forget to release the fd in the PERF_FLAG_FD_OUTPUT error path. Reorganize the error flow here to be a clean fall-through logic. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03perf trace: Fix read_string()Ingo Molnar
We did not account for the enclosing \0. Depending on what malloc() gave us this resulted in corrupted version string printouts. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03perf trace: Print out in nanosecondsIngo Molnar
Print out more accurate timestamps - usecs does not cut it anymore on fast enough boxes ;-) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03perf tools: Seek to the end of the header areaIngo Molnar
Leave the input fd at the data area. It does not matter right now - but seeking at the end of it certainly did not make sense. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03perf trace: Fix parsing of perf.dataIngo Molnar
We started parsing perf.data at head 0. This caused -D to segfault and it could possibly also case incorrect trace entries to be displayed. Parse it at data_offset instead. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03perf trace: Sample timestamps as wellIngo Molnar
Before: perf-21082 [013] 0.000000: sched_wakeup_new: task perf:21083 [120] success=1 [015] perf-21082 [013] 0.000000: sched_migrate_task: task perf:21082 [120] from: 13 to: 15 perf-21082 [013] 0.000000: sched_process_fork: parent perf:21082 child perf:21083 true-21083 [015] 0.000000: sched_wakeup: task migration/15:33 [0] success=1 [015] perf-21082 [013] 0.000000: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140] true-21083 [015] 0.000000: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0] true-21083 [011] 0.000000: sched_process_exit: task true:21083 [120] After: perf-21082 [013] 14674.797613: sched_wakeup_new: task perf:21083 [120] success=1 [015] perf-21082 [013] 14674.797506: sched_migrate_task: task perf:21082 [120] from: 13 to: 15 perf-21082 [013] 14674.797610: sched_process_fork: parent perf:21082 child perf:21083 true-21083 [015] 14674.797725: sched_wakeup: task migration/15:33 [0] success=1 [015] perf-21082 [013] 14674.797722: sched_switch: task perf:21082 [120] (S) ==> swapper:0 [140] true-21083 [015] 14674.797729: sched_switch: task perf:21083 [120] (R) ==> migration/15:33 [0] true-21083 [011] 14674.798159: sched_process_exit: task true:21083 [120] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint accessIngo Molnar
I want to sample inherited tracepoint workloads as a normal user and the CAP_SYS_ADMIN check prevents me from doing that right now. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02Merge branch 'perfcounters/urgent' into perfcounters/coreIngo Molnar
Merge reason: We are going to modify a place modified by perfcounters/urgent. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02perf trace: Sample the CPU tooIngo Molnar
Sample, record, parse and print the CPU field - it had all zeroes before. Before (watch the second column, the CPU values): perf-32685 [000] 0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011] perf-32685 [000] 0.000000: sched_migrate_task: task perf:32685 [120] from: 1 to: 11 perf-32685 [000] 0.000000: sched_process_fork: parent perf:32685 child perf:32686 true-32686 [000] 0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011] true-32686 [000] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] true-32686 [000] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] perf-32685 [000] 0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140] true-32686 [000] 0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0] true-32686 [000] 0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125] true-32686 [000] 0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125] true-32686 [000] 0.000000: sched_process_exit: task true:32686 [120] true-32686 [000] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns] true-32686 [000] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns] true-32686 [000] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns] true-32686 [000] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns] After: perf-32685 [001] 0.000000: sched_wakeup_new: task perf:32686 [120] success=1 [011] perf-32685 [001] 0.000000: sched_migrate_task: task perf:32685 [120] from: 1 to: 11 perf-32685 [001] 0.000000: sched_process_fork: parent perf:32685 child perf:32686 true-32686 [011] 0.000000: sched_wakeup: task migration/11:25 [0] success=1 [011] true-32686 [015] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] true-32686 [015] 0.000000: sched_wakeup: task distccd:12793 [125] success=1 [015] perf-32685 [001] 0.000000: sched_switch: task perf:32685 [120] (S) ==> swapper:0 [140] true-32686 [011] 0.000000: sched_switch: task perf:32686 [120] (R) ==> migration/11:25 [0] true-32686 [015] 0.000000: sched_switch: task perf:32686 [120] (R) ==> distccd:12793 [125] true-32686 [015] 0.000000: sched_switch: task true:32686 [120] (R) ==> distccd:12793 [125] true-32686 [015] 0.000000: sched_process_exit: task true:32686 [120] true-32686 [015] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767985949080 [ns] true-32686 [015] 0.000000: sched_stat_wait: task: distccd:12793 wait: 6767986139446 [ns] true-32686 [015] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 132844 [ns] true-32686 [015] 0.000000: sched_stat_sleep: task: distccd:12793 sleep: 131724 [ns] So we can now see how this workload migrated between CPUs. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02perf tools: Work around strict aliasing related warningsIngo Molnar
Older versions of GCC are rather stupid about strict aliasing: util/trace-event-parse.c: In function 'parse_cmdlines': util/trace-event-parse.c:93: warning: dereferencing type-punned pointer will break strict-aliasing rules util/trace-event-parse.c: In function 'parse_proc_kallsyms': util/trace-event-parse.c:155: warning: dereferencing type-punned pointer will break strict-aliasing rules util/trace-event-parse.c:157: warning: dereferencing type-punned pointer will break strict-aliasing rules util/trace-event-parse.c:158: warning: dereferencing type-punned pointer will break strict-aliasing rules util/trace-event-parse.c: In function 'parse_ftrace_printk': util/trace-event-parse.c:294: warning: dereferencing type-punned pointer will break strict-aliasing rules util/trace-event-parse.c:295: warning: dereferencing type-punned pointer will break strict-aliasing rules make: *** [util/trace-event-parse.o] Error 1 Make it clear to GCC that we intend with those pointers, by passing them through via an explicit (void *) cast. We might want to add -fno-strict-aliasing as well, like the kernel itself does. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02perf tools: Clean up warnings list in the MakefileIngo Molnar
Make it easier to turn warnings on/off by using a separate line for each warning added. Some of the warnings have too much of a nuisance factor and we might want to turn them off in the future. Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31perf tools: Complete support for dynamic stringsFrederic Weisbecker
Complete support for __str_loc type strings of ftrace events which have dynamic offsets values set for each of them inside their sammples. Before: geany-5759 [000] 0.000000: lock_release: name geany-5759 [000] 0.000000: lock_release: name geany-5759 [000] 0.000000: lock_release: name kondemand/0-362 [000] 0.000000: lock_release: name pdflush-421 [000] 0.000000: lock_release: name After: geany-5759 [000] 0.000000: lock_release: &u->lock geany-5759 [000] 0.000000: lock_release: key geany-5759 [000] 0.000000: lock_release: &group->notification_mutex kondemand/0-362 [000] 0.000000: lock_release: &rq->lock pdflush-421 [000] 0.000000: lock_release: &rq->lock Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1251693921-6579-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-31perf tools: Unify swapper tasks namingFrederic Weisbecker
In perf tools, we hardcode the pid 0 cmdline resolving to "idle" because the init task is not included in the COMM events. But the idle tasks secondary cpus are resolved into their "init" name through the COMM events. We have then such strange result in perf report (ditto with trace): 19.66% init [kernel] [k] acpi_idle_enter_c1 17.32% [idle] [kernel] [k] acpi_idle_enter_c1 It's then better to unify the swapper tasks into a single init name. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1251693921-6579-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2009-08-31perf tools: Resolve idle thread cmdline for perf traceFrederic Weisbecker
The cmd-trace tool used the cmdline file and resolved the idle thread using a hardcoded check for the 0 task pid. Now we have a centralized way to do that from perf using register_idle_thread() API. Before: :0-0 [000] 0.000000: irq_handler_entry: irq=0 handler=name :0-0 [000] 0.000000: irq_handler_entry: irq=0 handler=name After: [idle]-0 [000] 0.000000: irq_handler_entry: irq=0 handler=name [idle]-0 [000] 0.000000: irq_handler_entry: irq=0 handler=name Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1251693921-6579-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31perf tools: Librarize idle thread registrationFrederic Weisbecker
Librarize register_idle_thread() used by annotate and report. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1251693921-6579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31perf tools: Add missing parameters documentationFrederic Weisbecker
Add missing documentation for the following parameters: - perf record -R - perf report -g Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1251682323-10395-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31Merge branch 'perfcounters/tracing' into perfcounters/coreIngo Molnar
Merge reason: this topic is ready now to merge into the main development branch for .32, with functional perf trace output. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-29perf_counter: Fix /0 bug in swcountersPeter Zijlstra
We have a race in the swcounter stuff where we can start counting a counter that has never been enabled, this leads to a /0 situation. The below avoids the /0 but doesn't close the race, this would need a new counter state. The race is due to perf_swcounter_is_counting() which cannot discern between disabled due to scheduled out, and disabled for any other reason. Such a crash has been seen by Ingo: [ 967.092372] divide error: 0000 [#1] SMP [ 967.096499] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map [ 967.104846] CPU 5 [ 967.106965] Modules linked in: [ 967.110169] Pid: 3351, comm: hackbench Not tainted 2.6.31-rc8-tip-01158-gd940a54-dirty #1568 X8DTN [ 967.119456] RIP: 0010:[<ffffffff810c0aba>] [<ffffffff810c0aba>] perf_swcounter_ctx_event+0x127/0x1af [ 967.129137] RSP: 0018:ffff8801a95abd70 EFLAGS: 00010046 [ 967.134699] RAX: 0000000000000002 RBX: ffff8801bd645c00 RCX: 0000000000000002 [ 967.142162] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801bd645d40 [ 967.149584] RBP: ffff8801a95abdb0 R08: 0000000000000001 R09: ffff8801a95abe00 [ 967.157042] R10: 0000000000000037 R11: ffff8801aa1245f8 R12: ffff8801a95abe00 [ 967.164481] R13: ffff8801a95abe00 R14: ffff8801aa1c0e78 R15: 0000000000000001 [ 967.171953] FS: 0000000000000000(0000) GS:ffffc90000a00000(0063) knlGS:00000000f7f486c0 [ 967.180406] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b [ 967.186374] CR2: 000000004822c0ac CR3: 00000001b19a2000 CR4: 00000000000006e0 [ 967.193770] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 967.201224] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 967.208692] Process hackbench (pid: 3351, threadinfo ffff8801a95aa000, task ffff8801a96b0000) [ 967.217607] Stack: [ 967.219711] 0000000000000000 0000000000000037 0000000200000001 ffffc90000a1107c [ 967.227296] <0> ffff8801a95abe00 0000000000000001 0000000000000001 0000000000000037 [ 967.235333] <0> ffff8801a95abdf0 ffffffff810c0c20 0000000200a14f30 ffff8801a95abe40 [ 967.243532] Call Trace: [ 967.246103] [<ffffffff810c0c20>] do_perf_swcounter_event+0xde/0xec [ 967.252635] [<ffffffff810c0ca7>] perf_tpcounter_event+0x79/0x7b [ 967.258957] [<ffffffff81037f73>] ftrace_profile_sched_switch+0xc0/0xcb [ 967.265791] [<ffffffff8155f22d>] schedule+0x429/0x4c4 [ 967.271156] [<ffffffff8100c01e>] int_careful+0xd/0x14 Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1251472247.17617.74.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28perf tools: do not complain if root is owning perf.dataPierre Habouzit
This improves patch fa6963b24 so that perf.data stuff that has been dumped as root can be read (annotate/report) by a user without the use of the --force. Rationale is that root has plenty of ways to screw us (usually) that do not require twisted schemes involving specially crafting a perf.data. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <20090827075902.GF19653@laphroaig.corp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28perf_counters: Increase paranoia levelIngo Molnar
Per-cpu counters are an ASLR information leak as they show the execution other tasks do. Increase the paranoia level to 1, which disallows per-cpu counters. (they still allow counting/profiling of own tasks - and admin can profile everything.) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28perf tools: Fix missing string field printing in perf traceFrederic Weisbecker
Some string fields are not printed because of a missing printf in the post-processing. Before: perf-10070 [000] 0.000000: sched_switch: task :10070 [120] (R) ==> :5720 [120] geany-5720 [000] 0.000000: sched_switch: task :5720 [120] (S) ==> :10070 [120] perf-10070 [000] 0.000000: sched_switch: task :10070 [120] (R) ==> :5720 [120] geany-5720 [000] 0.000000: sched_switch: task :5720 [120] (S) ==> :10070 [120] <idle>-0 [000] 0.000000: sched_switch: task :0 [140] (R) ==> :361 [115] After: perf-10070 [000] 0.000000: sched_switch: task perf:10070 [120] (R) ==> geany:5720 [120] geany-5720 [000] 0.000000: sched_switch: task geany:5720 [120] (S) ==> perf:10070 [120] perf-10070 [000] 0.000000: sched_switch: task perf:10070 [120] (R) ==> geany:5720 [120] geany-5720 [000] 0.000000: sched_switch: task geany:5720 [120] (S) ==> perf:10070 [120] <idle>-0 [000] 0.000000: sched_switch: task swapper:0 [140] (R) ==> kondemand/1:361 [115] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1251427567-10551-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28perf tools: Only save the event formats we needFrederic Weisbecker
While opening a trace event counter, every events are saved in the trace.info file. But we only want to save the specifications of the events we are using. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1251421798-9101-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-27Linux 2.6.31-rc8Linus Torvalds
2009-08-27module: workaround duplicate section namesJames Bottomley
The root cause is a duplicate section name (.text); is this legal? [ Amerigo Wang: "AFAIK, yes." ] However, there's a problem with commit 6d76013381ed28979cd122eb4b249a88b5e384fa in that if you fail to allocate a mod->sect_attrs (in this case it's null because of the duplication), it still gets used without checking in add_notes_attrs() This should fix it [ This patch leaves other problems, particularly the sections directory, but recent parisc toolchains seem to produce these modules and this prevents a crash and is a minimal change -- RR ] Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27module: fix BUG_ON() for powerpc (and other function descriptor archs)Rusty Russell
The rarely-used symbol_put_addr() needs to use dereference_function_descriptor on powerpc. Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27xenfb: connect to backend before registering fbJeremy Fitzhardinge
As soon as the framebuffer is registered, our methods may be called by the kernel. This leads to a crash as xenfb_refresh() gets called before we have the irq. Connect to the backend before registering our framebuffer with the kernel. [ Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=14059 ] Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds
* 'for-linus' of git://git.infradead.org/users/eparis/notify: inotify: Ensure we alwasy write the terminating NULL. inotify: fix locking around inotify watching in the idr inotify: do not BUG on idr entries at inotify destruction inotify: seperate new watch creation updating existing watches
2009-08-27lmb: Remove __init from lmb_end_of_DRAM()Benjamin Herrenschmidt
We call lmb_end_of_DRAM() to test whether a DMA mask is ok on a machine without IOMMU, but this function is marked as __init. I don't think there's a clean way to get the top of RAM max_pfn doesn't appear to include highmem or I missed (or we have a bug :-) so for now, let's just avoid having a broken 2.6.31 by making this function non-__init and we can revisit later. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: update documentation pointers 9p: remove unnecessary v9fses->options which duplicates the mount string net/9p: insulate the client against an invalid error code sent by a 9p server 9p: Add missing cast for the error return value in v9fs_get_inode 9p: Remove redundant inode uid/gid assignment 9p: Fix possible regressions when ->get_sb fails. 9p: Fix v9fs show_options 9p: Fix possible memleak in v9fs_inode_from fid. 9p: minor comment fixes 9p: Fix possible inode leak in v9fs_get_inode. 9p: Check for error in return value of v9fs_fid_add
2009-08-27ipv4: make ip_append_data() handle NULL routing tableJulien TINNES
Add a check in ip_append_data() for NULL *rtp to prevent future bugs in callers from being exploitable. Signed-off-by: Julien Tinnes <julien@cr0.org> Signed-off-by: Tavis Ormandy <taviso@sdf.lonestar.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27AFS: Stop readlink() on AFS crashing due to NULL 'file' ptrDavid Howells
kAFS crashes when asked to read a symbolic link because page_getlink() passes a NULL file pointer to read_mapping_page(), but afs_readpage() expects a file pointer from which to extract a key. Modify afs_readpage() to request the appropriate key from the calling process's keyrings if a file struct is not supplied with one attached. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-27inotify: Ensure we alwasy write the terminating NULL.Eric W. Biederman
Before the rewrite copy_event_to_user always wrote a terqminating '\0' byte to user space after the filename. Since the rewrite that terminating byte was skipped if your filename is exactly a multiple of event_size. Ouch! So add one byte to name_size before we round up and use clear_user to set userspace to zero like /dev/zero does instead of copying the strange nul_inotify_event. I can't quite convince myself len_to_zero will never exceed 16 and even if it doesn't clear_user should be more efficient and a more accurate reflection of what the code is trying to do. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-27inotify: fix locking around inotify watching in the idrEric Paris
The are races around the idr storage of inotify watches. It's possible that a watch could be found from sys_inotify_rm_watch() in the idr, but it could be removed from the idr before that code does it's removal. Move the locking and the refcnt'ing so that these have to happen atomically. Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-27inotify: do not BUG on idr entries at inotify destructionEric Paris
If an inotify watch is left in the idr when an fsnotify group is destroyed this will lead to a BUG. This is not a dangerous situation and really indicates a programming bug and leak of memory. This patch changes it to use a WARN and a printk rather than killing people's boxes. Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-27inotify: seperate new watch creation updating existing watchesEric Paris
There is nothing known wrong with the inotify watch addition/modification but this patch seperates the two code paths to make them each easy to verify as correct. Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: virtio: net refill on out-of-memory smc91x: fix compilation on SMP
2009-08-26Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/ps3: Update ps3_defconfig powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration
2009-08-27powerpc/ps3: Update ps3_defconfigGeoff Levand
Update ps3_defconfig. o Refresh for 2.6.31. o Remove MTD support. o Add more HID drivers. Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-27powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registrationGeert Uytterhoeven
On non-PS3, we get: | kernel BUG at drivers/rtc/rtc-ps3.c:36! because the rtc-ps3 platform device is registered unconditionally in a kernel with builtin support for PS3. Reported-by: Sachin Sant <sachinp@in.ibm.com> Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: IMA: iint put in ima_counts_get and put
2009-08-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k,m68knommu: Wire up rt_tgsigqueueinfo and perf_counter_open m68k: Fix redefinition of pgprot_noncached arch/m68k/include/asm/motorola_pgalloc.h: fix kunmap arg m68k: cnt reaches -1, not 0 m68k: count can reach 51, not 50
2009-08-26leds: after setting inverted attribute, we must update the LEDThadeu Lima de Souza Cascardo
If we change the inverted attribute to another value, the LED will not be inverted until we change the GPIO state. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Cc: Samuel R. C. Vale <srcvale@holoscopio.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26leds: fix multiple requests and releases of IRQ for GPIO LED TriggerThadeu Lima de Souza Cascardo
When setting the same GPIO number, multiple IRQ shared requests will be done without freing the previous request. It will also try to free a failed request or an already freed IRQ if 0 was written to the gpio file. All these oops and leaks were fixed with the following solution: keep the previous allocated GPIO (if any) still allocated in case the new request fails. The alternative solution would desallocate the previous allocated GPIO and set gpio as 0. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Samuel R. C. Vale <srcvale@holoscopio.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26acpi processor: remove superfluous warning messageFrans Pop
This failure is very common on many platforms. Handling it in the ACPI processor driver is enough, and we don't need a warning message unless CONFIG_ACPI_DEBUG is set. Based on a patch from Zhang Rui. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389 Signed-off-by: Frans Pop <elendil@planet.nl> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26ACPI processor: force throttling state when BIOS returns incorrect valueFrans Pop
If the BIOS reports an invalid throttling state (which seems to be fairly common after system boot), a reset is done to state T0. Because of a check in acpi_processor_get_throttling_ptc(), the reset never actually gets executed, which results in the error reoccurring on every access of for example /proc/acpi/processor/CPU0/throttling. Add a 'force' option to acpi_processor_set_throttling() to ensure the reset really takes effect. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389 This patch, together with the next one, fixes a regression introduced in 2.6.30, listed on the regression list. They have been available for 2.5 months now in bugzilla, but have not been picked up, despite various reminders and without any reason given. Google shows that numerous people are hitting this issue. The issue is in itself relatively minor, but the bug in the code is clear. The patches have been in all my kernels and today testing has shown that throttling works correctly with the patches applied when the system overheats (http://bugzilla.kernel.org/show_bug.cgi?id=13918#c14). Signed-off-by: Frans Pop <elendil@planet.nl> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26wmi: fix kernel panic when stack protection enabled.Costantino Leandro
Summary: Kernel panic arise when stack protection is enabled, since strncat will add a null terminating byte '\0'; So in functions like this one (wmi_query_block): char wc[4]="WC"; .... strncat(method, block->object_id, 2); ... the length of wc should be n+1 (wc[5]) or stack protection fault will arise. This is not noticeable when stack protection is disabled,but , isn't good either. Config used: [CONFIG_CC_STACKPROTECTOR_ALL=y, CONFIG_CC_STACKPROTECTOR=y] Panic Trace ------------ .... stack-protector: kernel stack corrupted in : fa7b182c 2.6.30-rc8-obelisco-generic call_trace: [<c04a6c40>] ? panic+0x45/0xd9 [<c012925d>] ? __stack_chk_fail+0x1c/0x40 [<fa7b182c>] ? wmi_query_block+0x15a/0x162 [wmi] [<fa7b182c>] ? wmi_query_block+0x15a/0x162 [wmi] [<fa7e7000>] ? acer_wmi_init+0x00/0x61a [acer_wmi] [<fa7e7135>] ? acer_wmi_init+0x135/0x61a [acer_wmi] [<c0101159>] ? do_one_initcall+0x50+0x126 Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13514 Signed-off-by: Costantino Leandro <lcostantino@gmail.com> Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk> Cc: Len Brown <len.brown@intel.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26acpi: don't call acpi_processor_init if acpi is disabledYinghai Lu
Jens reported early_ioremap messages with old ASUS board... > [ 1.507461] pci 0000:00:09.0: Firmware left e100 interrupts enabled; disabling > [ 1.532778] early_ioremap(3fffd080, 0000005c) [0] => Pid: 1, comm: swapper Not tainted 2.6.31-rc4 #36 > [ 1.561007] Call Trace: > [ 1.568638] [<c136e48b>] ? printk+0x18/0x1d > [ 1.581734] [<c15513ff>] __early_ioremap+0x74/0x1e9 > [ 1.596898] [<c15515aa>] early_ioremap+0x1a/0x1c > [ 1.611270] [<c154a187>] __acpi_map_table+0x18/0x1a > [ 1.626451] [<c135a7f8>] acpi_os_map_memory+0x1d/0x25 > [ 1.642129] [<c119459c>] acpi_tb_verify_table+0x20/0x49 > [ 1.658321] [<c1193e50>] acpi_get_table_with_size+0x53/0xa1 > [ 1.675553] [<c1193eae>] acpi_get_table+0x10/0x15 > [ 1.690192] [<c155cc19>] acpi_processor_init+0x23/0xab > [ 1.706126] [<c1001043>] do_one_initcall+0x33/0x180 > [ 1.721279] [<c155cbf6>] ? acpi_processor_init+0x0/0xab > [ 1.737479] [<c106893a>] ? register_irq_proc+0xaa/0xc0 > [ 1.753411] [<c10689b7>] ? init_irq_proc+0x67/0x80 > [ 1.768316] [<c15405e7>] kernel_init+0x120/0x176 > [ 1.782678] [<c15404c7>] ? kernel_init+0x0/0x176 > [ 1.797062] [<c10038b7>] kernel_thread_helper+0x7/0x10 > [ 1.812984] 00000080 + ffe00000 that is rather later. acpi_gbl_permanent_mmap should be set in acpi_early_init() if acpi is not disabled and we have > [ 0.000000] ASUS P2B-DS detected: force use of acpi=ht just don't load acpi_processor_init... Reported-and-tested-by: Jens Rosenboom <jens@leia.mcbone.net> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26thermal_sys: check get_temp return valueMichael Brunner
The return value of the get_temp function is not checked when doing a thermal zone update. This may lead to a critical shutdown if get_temp fails and the content of the temp variable is incorrectly set higher than the critical trip point. This has been observed on a system with incorrect ACPI implementation where the corresponding methods were not serialized and therefore sometimes triggered ACPI errors (AE_ALREADY_EXISTS). The following critical shutdowns indicated a temperature of 2097 C, which was obviously wrong. The patch adds a return value check that jumps over all trip point evaluations printing a warning if get_temp fails. The trip points are evaluated again on the next polling interval with successful get_temp execution. Signed-off-by: Michael Brunner <mibru@gmx.de> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26clone(): fix race between copy_process() and de_thread()Oleg Nesterov
Spotted by Hiroshi Shimamoto who also provided the test-case below. copy_process() uses signal->count as a reference counter, but it is not. This test case #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <stdio.h> #include <errno.h> #include <pthread.h> void *null_thread(void *p) { for (;;) sleep(1); return NULL; } void *exec_thread(void *p) { execl("/bin/true", "/bin/true", NULL); return null_thread(p); } int main(int argc, char **argv) { for (;;) { pid_t pid; int ret, status; pid = fork(); if (pid < 0) break; if (!pid) { pthread_t tid; pthread_create(&tid, NULL, exec_thread, NULL); for (;;) pthread_create(&tid, NULL, null_thread, NULL); } do { ret = waitpid(pid, &status, 0); } while (ret == -1 && errno == EINTR); } return 0; } quickly creates an unkillable task. If copy_process(CLONE_THREAD) races with de_thread() copy_signal()->atomic(signal->count) breaks the signal->notify_count logic, and the execing thread can hang forever in kernel space. Change copy_process() to increment count/live only when we know for sure we can't fail. In this case the forked thread will take care of its reference to signal correctly. If copy_process() fails, check CLONE_THREAD flag. If it it set - do nothing, the counters were not changed and current belongs to the same thread group. If it is not set, ->signal must be released in any case (and ->count must be == 1), the forked child is the only thread in the thread group. We need more cleanups here, in particular signal->count should not be used by de_thread/__exit_signal at all. This patch only fixes the bug. Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Tested-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-26mm: fix for infinite churning of mlocked pagesMinchan Kim
An mlocked page might lose the isolatation race. This causes the page to clear PG_mlocked while it remains in a VM_LOCKED vma. This means it can be put onto the [in]active list. We can rescue it by using try_to_unmap() in shrink_page_list(). But now, As Wu Fengguang pointed out, vmscan has a bug. If the page has PG_referenced, it can't reach try_to_unmap() in shrink_page_list() but is put into the active list. If the page is referenced repeatedly, it can remain on the [in]active list without being moving to the unevictable list. This patch fixes it. Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KOSAKI Motohiro <<kosaki.motohiro@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>