aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2008-10-14tracing/ftrace: give an entry on the config for boot tracerFrédéric Weisbecker
Bring the entry to choose the boot tracer on the kernel config. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracing/ftrace: make tracing suitable to run the boot tracerFrédéric Weisbecker
The tracing engine have now to be init in early_initcall to set the boot tracer. Only the debugfs settings will be initialized at fs_initcall time. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracing/ftrace: add the boot tracerFrédéric Weisbecker
Add the boot/initcall tracer. It's primary purpose is to be able to trace the initcalls. It is intended to be used with scripts/bootgraph.pl after some small improvements. Note that it is not active after its init. To avoid tracing (and so crashing) before the whole tracing engine init, you have to explicitly call start_boot_trace() after do_pre_smp_initcalls() to enable it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14markers: bit-field is not thread-safe nor smp-safeLai Jiangshan
bit-field is not thread-safe nor smp-safe. struct marker_entry.rcu_pending is not protected by any lock in rcu-callback free_old_closure(). so we must turn it into a safe type. detail: I suppose rcu_pending and ptype are store in struct marker_entry.tmp1 free_old_closure() side: change ptype side: | load struct marker_entry.tmp1 --------------------------------|-------------------------------- | change ptype bit in tmp1 load struct marker_entry.tmp1 | change rcu_pending bit in tmp1 | store tmp1 | --------------------------------|-------------------------------- | store tmp1 now this result equals that free_old_closure() do not change rcu_pending bit, bug! This bug will cause redundant rcu_barrier_sched() called. not too harmful. ----- corresponding: free_old_closure() side: change ptype side: load struct marker_entry.tmp1 | --------------------------------|-------------------------------- | load struct marker_entry.tmp1 change rcu_pending bit in tmp1 | | change ptype bit in tmp1 | store tmp1 --------------------------------|-------------------------------- store tmp1 | now this result equals that change ptype side do not change ptype bit, bug! this bug cause marker_probe_cb() access to invalid memory. oops! see also: http://en.wikipedia.org/wiki/Bit_field Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14markers: fix unchecked formatLai Jiangshan
when the second, third... probe is registered, its format is not checked, this patch fixes it. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14markers: re-enable fast batch registrationMathieu Desnoyers
Lai Jiangshan discovered a reentrancy issue with markers and fixed it by adding synchronize_sched() calls at each registration/unregistraiton. It works, but it removes the ability to do batch registration/unregistration and can cause registration of ~100 markers to take about 30 seconds on a loaded machine (synchronize_sched() is much slower on such workloads). This patch implements a version of the fix which won't slow down marker batch registration/unregistration. It also go back to the original non-synchronized reg/unreg. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14markers: fix unregister bug and reenter bug, cleanupMathieu Desnoyers
Use the new rcu_read_lock_sched/unlock_sched() in marker code around the call site instead of preempt_disable/enable(). It helps reviewing the code more easily. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracepoints: fix reentrancyMathieu Desnoyers
The tracepoints had the same problem markers did have wrt reentrancy. Apply a similar fix using a rcu_barrier after each tracepoint mutex lock. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracepoints: use rcu schedMathieu Desnoyers
Make tracepoints use rcu sched. (cleanup) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14markers: fix unregister bug and reenter bugLai Jiangshan
unregister bug: codes using makers are typically calling marker_probe_unregister() and then destroying the data that marker_probe_func needs(or unloading this module). This is bug when the corresponding marker_probe_func is still running(on other cpus), it is using the destroying/ed data. we should call synchronize_sched() after marker_update_probes(). reenter bug: marker_probe_register(), marker_probe_unregister() and marker_probe_unregister_private_data() are not reentrant safe functions. these 3 functions release markers_mutex and then require it again and do "entry->oldptr = old; ...", but entry->oldptr maybe is using now for these 3 functions may reenter when markers_mutex is released. we use synchronize_sched() instead of call_rcu_sched() to fix this bug. actually we can do: " if (entry->rcu_pending) rcu_barrier_sched(); " after require markers_mutex again. but synchronize_sched() is better and simpler. For these 3 functions are not critical path. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: warn on failure to disable mcount callersSteven Rostedt
With the recent updates to ftrace, there should not be any failures when modifying the code. If there is, then we need to warn about it. This patch has a cleaned up version of the code that I used to discover that the weak symbols were causing failures. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracing/ftrace: replace none tracer by nop tracerFrédéric Weisbecker
Replace "none" tracer by the recently created "nop" tracer. Both are pretty similar except that nop accepts TRACE_PRINT or TRACE_SPECIAL entries. And as a consequence, changing the size of the ring buffer now requires that tracing has already been disabled. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracing/ftrace: tracing engine depends on Nop TracerFrédéric Weisbecker
Now that the nop tracer is used as the default tracer by replacing the "none" tracer, tracing engine depends on it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracing/ftrace: make nop tracer reset previous entriesFrédéric Weisbecker
If nop tracer is selected, some old entries from the previous tracer could still be enqueued. Tracing have to be reset. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14trace: remove pointless ifdefsSteven Noonan
The functions are already 'extern' anyway, so there's no problem with linkage. Removing these ifdefs also helps find any potential compiler errors. Suggested by Andrew Morton. Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: mcount_addr defined but not usedSteven Noonan
When CONFIG_DYNAMIC_FTRACE isn't used, neither is mcount_addr. This patch eliminates that warning. Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: add nop tracerSteven Noonan
A no-op tracer which can serve two purposes: 1. A template for development of a new tracer. 2. A convenient way to see ftrace_printk() calls without an irrelevant trace making the output messy. [ mingo@elte.hu: resolved conflicts ] Signed-off-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: inject markers via trace_marker filePekka Paalanen
Allow a user to inject a marker (TRACE_PRINT entry) into the trace ring buffer. The related file operations are derived from code by Frédéric Weisbecker <fweisbec@gmail.com>. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14mmiotrace: handle TRACE_PRINT entriesPekka Paalanen
Also make trace_seq_print_cont() non-static, and add a newline if the seq buffer can't hold all data. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14x86 mmiotrace: implement mmiotrace_printk()Pekka Paalanen
Offer mmiotrace users a function to inject markers from inside the kernel. This depends on the trace_vprintk() patch. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: add trace_vprintk()Pekka Paalanen
trace_vprintk() for easier implementation of tracer specific *_printk functions. Add check check for no_tracer, and implement __ftrace_printk() as a wrapper. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: move mmiotrace functions out of trace.cPekka Paalanen
Moves the mmiotrace specific functions from trace.c to trace_mmiotrace.c. Functions trace_wake_up(), tracing_get_trace_entry(), and tracing_generic_entry_update() are therefore made available outside trace.c. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: fix unlocking of hashSteven Rostedt
This must be brown paper bag week for Steven Rostedt! While working on ftrace for PPC, I discovered that the hash locking done when CONFIG_FTRACE_MCOUNT_RECORD is not set, is totally incorrect. With a cut and paste error, I had the hash lock macro to lock for both hash_lock _and_ hash_unlock! This bug did not affect x86 since this bug was introduced when CONFIG_FTRACE_MCOUNT_RECORD was added to x86. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: make it depend on DEBUG_KERNELIngo Molnar
make most of the tracers depend on DEBUG_KERNEL - that's their intended purpose. (most distributions have DEBUG_KERNEL enabled anyway so this is not a practical limitation - but it simplifies the tracing menu in the normal case) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: sched_switch: show the wakee's cpuPeter Zijlstra
While profiling the smp behaviour of the scheduler it was needed to know to which cpu a task got woken. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: make ftrace_printk usable with the other tracersPeter Zijlstra
Currently ftrace_printk only works with the ftrace tracer, switch it to an iter_ctrl setting so we can make us of them with other tracers too. [rostedt@redhat.com: tweak to the disable condition] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: print continue index fixSteven Rostedt
An item in the trace buffer that is bigger than one entry may be split up using the TRACE_CONT entry. This makes it a virtual single entry. The current code increments the iterator index even while traversing TRACE_CONT entries, making it look like the iterator is further than it actually is. This patch adds code to not increment the iterator index while skipping over TRACE_CONT entries. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: binary and not logical for continue testSteven Rostedt
Peter Zijlstra provided me with a nice brown paper bag while letting me know that I was doing a logical AND and not a binary one, making a condition true more often than it should be. Luckily, a false true is handled by the calling function and no harm is done. But this needs to be fixed regardless. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: make output nicely spaced for up to 999 cpusMichael Ellerman
Currently some of the ftrace output goes skewiff if you have more than 9 cpus, and some if you have more than 99. Twiddle with the headers and format strings to make up to 999 cpus display without causing spacing problems. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14stack tracer: depends on DEBUG_KERNELIngo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: stack trace add indexesSteven Rostedt
This patch adds indexes into the stack that the functions in the stack dump were found at. As an added bonus, I also added a diff to show which function is the most notorious consumer of the stack. The output now looks like this: # cat /debug/tracing/stack_trace Depth Size Location (48 entries) ----- ---- -------- 0) 2476 212 blk_recount_segments+0x39/0x59 1) 2264 12 bio_phys_segments+0x16/0x1d 2) 2252 20 blk_rq_bio_prep+0x23/0xaf 3) 2232 12 init_request_from_bio+0x74/0x77 4) 2220 56 __make_request+0x294/0x331 5) 2164 136 generic_make_request+0x34f/0x37d 6) 2028 56 submit_bio+0xe7/0xef 7) 1972 28 submit_bh+0xd1/0xf0 8) 1944 112 block_read_full_page+0x299/0x2a9 9) 1832 8 blkdev_readpage+0x14/0x16 10) 1824 28 read_cache_page_async+0x7e/0x109 11) 1796 16 read_cache_page+0x11/0x49 12) 1780 32 read_dev_sector+0x3c/0x72 13) 1748 48 read_lba+0x4d/0xaa 14) 1700 168 efi_partition+0x85/0x61b 15) 1532 72 rescan_partitions+0x10e/0x266 16) 1460 40 do_open+0x1c7/0x24e 17) 1420 292 __blkdev_get+0x79/0x84 18) 1128 12 blkdev_get+0x12/0x14 19) 1116 20 register_disk+0xd1/0x11e 20) 1096 28 add_disk+0x34/0x90 21) 1068 52 sd_probe+0x2b1/0x366 22) 1016 20 driver_probe_device+0xa5/0x120 23) 996 8 __device_attach+0xd/0xf 24) 988 32 bus_for_each_drv+0x3e/0x68 25) 956 24 device_attach+0x56/0x6c 26) 932 16 bus_attach_device+0x26/0x4d 27) 916 64 device_add+0x380/0x4b4 28) 852 28 scsi_sysfs_add_sdev+0xa1/0x1c9 29) 824 160 scsi_probe_and_add_lun+0x919/0xa2a 30) 664 36 __scsi_add_device+0x88/0xae 31) 628 44 ata_scsi_scan_host+0x9e/0x21c 32) 584 28 ata_host_register+0x1cb/0x1db 33) 556 24 ata_host_activate+0x98/0xb5 34) 532 192 ahci_init_one+0x9bd/0x9e9 35) 340 20 pci_device_probe+0x3e/0x5e 36) 320 20 driver_probe_device+0xa5/0x120 37) 300 20 __driver_attach+0x3f/0x5e 38) 280 36 bus_for_each_dev+0x40/0x62 39) 244 12 driver_attach+0x19/0x1b 40) 232 28 bus_add_driver+0x9c/0x1af 41) 204 28 driver_register+0x76/0xd2 42) 176 20 __pci_register_driver+0x44/0x71 43) 156 8 ahci_init+0x14/0x16 44) 148 100 _stext+0x42/0x122 45) 48 20 kernel_init+0x175/0x1dc 46) 28 28 kernel_thread_helper+0x7/0x10 The first column is simply an index starting from the inner most function and counting down to the outer most. The next column is the location that the function was found on the stack. The next column is the size of the stack for that function. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: remove direct reference to mcount in trace codeSteven Rostedt
The mcount record method of ftrace scans objdump for references to mcount. Using mcount as the reference to test if the calls to mcount being replaced are indeed calls to mcount, this use of mcount was also caught as a location to change. Using a variable that points to the mcount address moves this reference into the data section that is not scanned, and we do not use a false location to try and modify. The warn on code was what was used to detect this bug. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: add stack tracerSteven Rostedt
This is another tracer using the ftrace infrastructure, that examines at each function call the size of the stack. If the stack use is greater than the previous max it is recorded. You can always see (and set) the max stack size seen. By setting it to zero will start the recording again. The backtrace is also available. For example: # cat /debug/tracing/stack_max_size 1856 # cat /debug/tracing/stack_trace [<c027764d>] stack_trace_call+0x8f/0x101 [<c021b966>] ftrace_call+0x5/0x8 [<c02553cc>] clocksource_get_next+0x12/0x48 [<c02542a5>] update_wall_time+0x538/0x6d1 [<c0245913>] do_timer+0x23/0xb0 [<c0257657>] tick_do_update_jiffies64+0xd9/0xf1 [<c02576b9>] tick_sched_timer+0x4a/0xad [<c0250fe6>] __run_hrtimer+0x3e/0x75 [<c02518ed>] hrtimer_interrupt+0xf1/0x154 [<c022c870>] smp_apic_timer_interrupt+0x71/0x84 [<c021b7e9>] apic_timer_interrupt+0x2d/0x34 [<c0238597>] finish_task_switch+0x29/0xa0 [<c05abd13>] schedule+0x765/0x7be [<c05abfca>] schedule_timeout+0x1b/0x90 [<c05ab4d4>] wait_for_common+0xab/0x101 [<c05ab5ac>] wait_for_completion+0x12/0x14 [<c033cfc3>] blk_execute_rq+0x84/0x99 [<c0402470>] scsi_execute+0xc2/0x105 [<c040250a>] scsi_execute_req+0x57/0x7f [<c043afe0>] sr_test_unit_ready+0x3e/0x97 [<c043bbd6>] sr_media_change+0x43/0x205 [<c046b59f>] media_changed+0x48/0x77 [<c046b5ff>] cdrom_media_changed+0x31/0x37 [<c043b091>] sr_block_media_changed+0x16/0x18 [<c02b9e69>] check_disk_change+0x1b/0x63 [<c046f4c3>] cdrom_open+0x7a1/0x806 [<c043b148>] sr_block_open+0x78/0x8d [<c02ba4c0>] do_open+0x90/0x257 [<c02ba869>] blkdev_open+0x2d/0x56 [<c0296a1f>] __dentry_open+0x14d/0x23c [<c0296b32>] nameidata_to_filp+0x24/0x38 [<c02a1c68>] do_filp_open+0x347/0x626 [<c02967ef>] do_sys_open+0x47/0xbc [<c02968b0>] sys_open+0x23/0x2b [<c021aadd>] sysenter_do_call+0x12/0x26 I've tested this on both x86_64 and i386. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: clean up macro usageIngo Molnar
enclose the argument in parenthesis. (especially since we cast it, which is a high prio operation) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: fix build failureStephen Rothwell
After disabling FTRACE_MCOUNT_RECORD via a patch, a dormant build failure surfaced: kernel/trace/ftrace.c: In function 'ftrace_record_ip': kernel/trace/ftrace.c:416: error: incompatible type for argument 1 of '_spin_lock_irqsave' kernel/trace/ftrace.c:433: error: incompatible type for argument 1 of '_spin_lock_irqsave' Introduced by commit 6dad8e07f4c10b17b038e84d29f3ca41c2e55cd0 ("ftrace: add necessary locking for ftrace records"). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: add necessary locking for ftrace recordsSteven Rostedt
The new design of pre-recorded mcounts and updating the code outside of kstop_machine has changed the way the records themselves are protected. This patch uses the ftrace_lock to protect the records. Note, the lock still does not need to be taken within calls that are only called via kstop_machine, since the that code can not run while the spin lock is held. Also removed the hash_lock needed for the daemon when MCOUNT_RECORD is configured. Also did a slight cleanup of an unused variable. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: do not init module on ftrace disabledSteven Rostedt
If one of the self tests of ftrace has disabled the function tracer, do not run the code to convert the mcount calls in modules. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: fix some mistakes in error messagesFrédéric Weisbecker
This patch fixes some mistakes on the tracer in warning messages when debugfs fails to create tracing files. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: srostedt@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: dump out ftrace buffers to console on panicSteven Rostedt
At OLS I had a lot of interest to be able to have the ftrace buffers dumped on panic. Usually one would expect to uses kexec and examine the buffers after a new kernel is loaded. But sometimes the resources do not permit kdump and kexec, so having an option to still see the sequence of events up to the crash is very advantageous. This patch adds the option to have the ftrace buffers dumped to the console in the latency_trace format on a panic. When the option is set, the default entries per CPU buffer are lowered to 16384, since the writing to the serial (if that is the console) may take an awful long time otherwise. [ Changes since -v1: Got alpine to send correctly (as well as spell check working). Removed config option. Moved the static variables into ftrace_dump itself. Gave printk a log level. ] Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: ftrace_printk doc movedSteven Rostedt
Based on Randy Dunlap's suggestion, the ftrace_printk kernel-doc belongs with the ftrace_printk macro that should be used. Not with the __ftrace_printk internal function. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: printk formatting infrastructureSteven Rostedt
This patch adds a feature that can help kernel developers debug their code using ftrace. int ftrace_printk(const char *fmt, ...); This records into the ftrace buffer using printf formatting. The entry size in the buffers are still a fixed length. A new type has been added that allows for more entries to be used for a single recording. The start of the print is still the same as the other entries. It returns the number of characters written to the ftrace buffer. For example: Having a module with the following code: static int __init ftrace_print_test(void) { ftrace_printk("jiffies are %ld\n", jiffies); return 0; } Gives me: insmod-5441 3...1 7569us : ftrace_print_test: jiffies are 4296626666 for the latency_trace file and: insmod-5441 [03] 1959.370498: ftrace_print_test jiffies are 4296626666 for the trace file. Note: Only the infrastructure should go into the kernel. It is to help facilitate debugging for other kernel developers. Calls to ftrace_printk is not intended to be left in the kernel, and should be frowned upon just like scattering printks around in the code. But having this easily at your fingertips helps the debugging go faster and bugs be solved quicker. Maybe later on, we can hook this with markers and have their printf format be sucked into ftrace output. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: new continue entry - separate out from trace_entrySteven Rostedt
Some tracers will need to work with more than one entry. In order to do this the trace_entry structure was split into two fields. One for the start of all entries, and one to continue an existing entry. The trace_entry structure now has a "field" entry that consists of the previous content of the trace_entry, and a "cont" entry that is just a string buffer the size of the "field" entry. Thanks to Andrew Morton for suggesting this idea. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: remove old pointers to mcountSteven Rostedt
When a mcount pointer is recorded into a table, it is used to add or remove calls to mcount (replacing them with nops). If the code is removed via removing a module, the pointers still exist. At modifying the code a check is always made to make sure the code being replaced is the code expected. In-other-words, the code being replaced is compared to what it is expected to be before being replaced. There is a very small chance that the code being replaced just happens to look like code that calls mcount (very small since the call to mcount is relative). To remove this chance, this patch adds ftrace_release to allow module unloading to remove the pointers to mcount within the module. Another change for init calls is made to not trace calls marked with __init. The tracing can not be started until after init is done anyway. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: do not show freed records in available_filter_functionsSteven Rostedt
Seems that freed records can appear in the available_filter_functions list. This patch fixes that. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: enable mcount recording for modulesSteven Rostedt
This patch enables the loading of the __mcount_section of modules and changing all the callers of mcount into nops. The modification is done before the init_module function is called, so again, we do not need to use kstop_machine to make these changes. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: mcount call site on boot nops coreSteven Rostedt
This is the infrastructure to the converting the mcount call sites recorded by the __mcount_loc section into nops on boot. It also allows for using these sites to enable tracing as normal. When the __mcount_loc section is used, the "ftraced" kernel thread is disabled. This uses the current infrastructure to record the mcount call sites as well as convert them to nops. The mcount function is kept as a stub on boot up and not converted to the ftrace_record_ip function. We use the ftrace_record_ip to only record from the table. This patch does not handle modules. That comes with a later patch. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: create __mcount_loc sectionSteven Rostedt
This patch creates a section in the kernel called "__mcount_loc". This will hold a list of pointers to the mcount relocation for each call site of mcount. For example: objdump -dr init/main.o [...] Disassembly of section .text: 0000000000000000 <do_one_initcall>: 0: 55 push %rbp [...] 000000000000017b <init_post>: 17b: 55 push %rbp 17c: 48 89 e5 mov %rsp,%rbp 17f: 53 push %rbx 180: 48 83 ec 08 sub $0x8,%rsp 184: e8 00 00 00 00 callq 189 <init_post+0xe> 185: R_X86_64_PC32 mcount+0xfffffffffffffffc [...] We will add a section to point to each function call. .section __mcount_loc,"a",@progbits [...] .quad .text + 0x185 [...] The offset to of the mcount call site in init_post is an offset from the start of the section, and not the start of the function init_post. The mcount relocation is at the call site 0x185 from the start of the .text section. .text + 0x185 == init_post + 0xa We need a way to add this __mcount_loc section in a way that we do not lose the relocations after final link. The .text section here will be attached to all other .text sections after final link and the offsets will be meaningless. We need to keep track of where these .text sections are. To do this, we use the start of the first function in the section. do_one_initcall. We can make a tmp.s file with this function as a reference to the start of the .text section. .section __mcount_loc,"a",@progbits [...] .quad do_one_initcall + 0x185 [...] Then we can compile the tmp.s into a tmp.o gcc -c tmp.s -o tmp.o And link it into back into main.o. ld -r main.o tmp.o -o tmp_main.o mv tmp_main.o main.o But we have a problem. What happens if the first function in a section is not exported, and is a static function. The linker will not let the tmp.o use it. This case exists in main.o as well. Disassembly of section .init.text: 0000000000000000 <set_reset_devices>: 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: e8 00 00 00 00 callq 9 <set_reset_devices+0x9> 5: R_X86_64_PC32 mcount+0xfffffffffffffffc The first function in .init.text is a static function. 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices The lowercase 't' means that set_reset_devices is local and is not exported. If we simply try to link the tmp.o with the set_reset_devices we end up with two symbols: one local and one global. .section __mcount_loc,"a",@progbits .quad set_reset_devices + 0x10 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices U set_reset_devices We still have an undefined reference to set_reset_devices, and if we try to compile the kernel, we will end up with an undefined reference to set_reset_devices, or even worst, it could be exported someplace else, and then we will have a reference to the wrong location. To handle this case, we make an intermediate step using objcopy. We convert set_reset_devices into a global exported symbol before linking it with tmp.o and set it back afterwards. 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 T set_reset_devices 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 T set_reset_devices 00000000000000a8 t __setup_set_reset_devices 000000000000105f t __setup_str_set_reset_devices 0000000000000000 t set_reset_devices Now we have a section in main.o called __mcount_loc that we can place somewhere in the kernel using vmlinux.ld.S and access it to convert all these locations that call mcount into nops before starting SMP and thus, eliminating the need to do this with kstop_machine. Note, A well documented perl script (scripts/recordmcount.pl) is used to do all this in one location. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14ftrace: ignore functions that cannot be kprobe-edIngo Molnar
kprobes already has an extensive list of annotations for functions that should not be instrumented. Add notrace annotations to these functions as well. This is particularly useful for functions called by the NMI path. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracepoints: use TABLE_SIZE macroMathieu Desnoyers
Steven Rostedt suggested: | Wouldn't it look nicer to have: (TRACEPOINT_TABLE_SIZE - 1) ? Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-14tracing: clean up tracepoints kconfig structureIngo Molnar
do not expose users to CONFIG_TRACEPOINTS - tracers can select it just fine. update ftrace to select CONFIG_TRACEPOINTS. Signed-off-by: Ingo Molnar <mingo@elte.hu>