aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/builtin-report.c
AgeCommit message (Collapse)Author
2009-10-17perf tools: Move dereference after NULL testJulia Lawall
In each case, if the NULL test on thread is needed, then the dereference should be after the NULL test. A simplified version of the semantic match that detects this problem is as follows (http://coccinelle.lip6.fr/): // <smpl> @match exists@ expression x, E; identifier fld; @@ * x->fld ... when != \(x = E\|&x\) * x == NULL // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> LKML-Reference: <Pine.LNX.4.64.0910170842500.9213@ask.diku.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13perf tools: Move threads & last_match to threads.cArnaldo Carvalho de Melo
This was just being copy'n'pasted all over. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091013141629.GD21809@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Up the verbose level for some really verbose stuffArnaldo Carvalho de Melo
Like printing every symbol created. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08perf tools: Unify perf.data mapping and events handlingFrederic Weisbecker
This librarizes the perf.data file mapping and handling in various perf tools, roughly reducing the amount of code and fixing the places that mmap from beginning of the file whereas we want to mmap from the beginning of the data, leading to page fault because the mmap window is too small since the trace info are written in the file too. TODO: - convert perf timechart too Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091007104729.GD5043@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05perf report: Use kernel_maps__find_symbol as fallback to find vdsos, etcArnaldo Carvalho de Melo
In resolve_symbol, as we're moving to breaking the kernel symbols list per address ranges, i.e. kernel linking sections, so that we don't have a big kernel_map that in its range covers what is in the modules. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04perf tools: Remove show_mask bitmaskArnaldo Carvalho de Melo
As it was not being exposed via any command line and with --dsos/--comms we can do this and even more, like asking for just kernel + some module: [root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\] --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15 # Samples: 619669 # # Overhead Command Shared Object Symbol # ........ ............... ............. ...... # 7.12% swapper [kernel] [k] read_hpet 6.86% init [kernel] [k] read_hpet 6.22% init [kernel] [k] mwait_idle_with_hints 5.34% swapper [kernel] [k] mwait_idle_with_hints 3.01% firefox [kernel] [.] vread_hpet 2.14% Xorg [drm] [k] drm_clflush_pages 2.09% pidgin [kernel] [.] vread_hpet 1.58% npviewer.bin [kernel] [.] vread_hpet 1.37% swapper [kernel] [k] hpet_next_event 1.23% Xorg [kernel] [k] read_hpet [root@doppio linux-2.6-tip]# Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091003233048.GA30535@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-03perf tools: Move hist_entry__add common code to hist.cArnaldo Carvalho de Melo
Now perf report and annotate do the callgraph/hit processing in their specialized hist_entry__add functions. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02perf tools: Rewrite and improve support for kernel modulesArnaldo Carvalho de Melo
Representing modules as struct map entries, backed by a DSO, etc, using /proc/modules to find where the module is loaded. DSOs now can have a short and long name, so that in verbose mode we can show exactly which .ko or vmlinux image was used. As kernel modules now are a DSO separate from the kernel, we can ask for just the hits for a particular set of kernel modules, just like we can do with shared libraries: [root@doppio linux-2.6-tip]# perf report -n --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15 84.58% 13266 Xorg [k] drm_clflush_pages 4.02% 630 Xorg [k] trace_kmalloc.clone.0 3.95% 619 Xorg [k] drm_ioctl 2.07% 324 Xorg [k] drm_addbufs 1.68% 263 Xorg [k] drm_gem_close_ioctl 0.77% 120 Xorg [k] drm_setmaster_ioctl 0.70% 110 Xorg [k] drm_lastclose 0.68% 106 Xorg [k] drm_open 0.54% 85 Xorg [k] drm_mm_search_free [root@doppio linux-2.6-tip]# Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko would have the same effect. Allowing specifying just 'drm.ko' is left for another patch. Processing kallsyms so that per kernel module struct map are instantiated was also left for another patch. That will allow removing the module name from each of its symbols. struct symbol was reduced by removing the ->module backpointer and moving it (well now the map) to struct symbol_entry in perf top, that is its only user right now. The total linecount went down by ~500 lines. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30perf tools: Put common histogram functions in their own fileJohn Kacur
Move histogram related functions into their own files (hist.c and hist.h) and make use of them in builtin-annotate.c and builtin-report.c. Signed-off-by: John Kacur <jkacur@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24perf tools: Create util/sort.and use itJohn Kacur
Create util/sort.[ch] and move common functionality for builtin-report.c and builtin-annotate.c there, and make use of it. Signed-off-by: John Kacur <jkacur@redhat.com> LKML-Reference: <alpine.LFD.2.00.0909241758390.11383@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21perf: Do the big rename: Performance Counters -> Performance EventsIngo Molnar
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31perf tools: Librarize idle thread registrationFrederic Weisbecker
Librarize register_idle_thread() used by annotate and report. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1251693921-6579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31Merge branch 'perfcounters/tracing' into perfcounters/coreIngo Molnar
Merge reason: this topic is ready now to merge into the main development branch for .32, with functional perf trace output. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28perf tools: do not complain if root is owning perf.dataPierre Habouzit
This improves patch fa6963b24 so that perf.data stuff that has been dumped as root can be read (annotate/report) by a user without the use of the --force. Rationale is that root has plenty of ways to screw us (usually) that do not require twisted schemes involving specially crafting a perf.data. Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> LKML-Reference: <20090827075902.GF19653@laphroaig.corp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-24Merge branch 'perfcounters/urgent' into perfcounters/coreIngo Molnar
Conflicts: tools/perf/builtin-annotate.c tools/perf/builtin-report.c Merge reason: resolve these conflicts. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-19perf tools: Check perf.data ownerPeter Zijlstra
Add an owner check to opening perf.data files and a switch to silence it. Because perf-report/perf-annotate are binary parsers reading another users' perf.data file could be a security risk if the file were explicitly engineered to trigger bugs in the parser (we hope of course there are non such bugs, but you never know). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090819092023.896648538@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-18perf tools: Fix comm column adjustingFrederic Weisbecker
The librarization of the thread helpers between annotate and report lost some perf report specifics. This patch fixes the thread comm column adjusting that has been omitted during this export. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250604226-6852-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-18perf tools: Fix spelling mistake in callchain errorFrederic Weisbecker
While running perf report -g in a perf.data file that hasn't been recorded in callchain mode, the error reported has a spelling issue: ./perf report -g selected -c but no callchain data. Did you call perf record without -g? Fix it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1250543271-8383-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16perf tools: Librarize trace_event() helperFrederic Weisbecker
Librarize trace_event() helper so that perf trace can use it too. Also clean up the debug.h includes a bit. It's not good to have it included in perf.h because it doesn't make it flexible against other headers it may need (headers that can also depend on perf.h and then create a recursive header dependency). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250453149-664-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16perf tools: Librarize sample type and attr finding from headersFrederic Weisbecker
Librarize the sample type and attr fetching from perf data file headers so that we can also use it from perf trace. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250448997-30715-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16perf tools: Put the show mode into the event headers filesFrederic Weisbecker
Annotate and report share the same flags to filter events considering their context (kernel, user, hypervisor). Both tools have their own definitions of these flags. Factorize them out into the event headers file. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250445414-29237-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16perf tools: Factorize the dprintf definitionFrederic Weisbecker
We have two users of dprintf: report and annotate. Another one is coming with perf trace. Then factorize it into the debug file. While at it, rename dprintf() to dump_printf() so that it doesn't conflicts with its libc homograph. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250443461-28130-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16perf: Enable more compiler warningsIngo Molnar
Related to a shadowed variable bug fix Valdis Kletnieks noticed that perf does not get built with -Wshadow, which could have helped us avoid the bug. So enable -Wshadow and also enable the following warnings on perf builds, in addition to the already enabled -Wall -Wextra -std=gnu99 warnings: -Wcast-align -Wformat=2 -Wshadow -Winit-self -Wpacked -Wredundant-decls -Wstack-protector -Wstrict-aliasing=3 -Wswitch-default -Wswitch-enum -Wno-system-headers -Wundef -Wvolatile-register-var -Wwrite-strings -Wbad-function-cast -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wstrict-prototypes -Wdeclaration-after-statement And change/fix the perf code to build cleanly under GCC 4.3.2. The list of warnings enablement is rather arbitrary: it's based on my (quick) reading of the GCC manpages and trying them on perf. I categorized the warnings based on individually enabling them and looking whether they trigger something in the perf build. If i liked those warnings (i.e. if they trigger for something that arguably could be improved) i enabled the warning. If the warnings seemed to come from language laywers spamming the build with tons of nuisance warnings i generally kept them off. Most of the sign conversion related warnings were in this category. (A second patch enabling some of the sign warnings might be welcome - sign bugs can be nasty.) I also kept warnings that seem to make sense from their manpage description and which produced no actual warnings on our code base. These warnings might still be turned off if they end up being a nuisance. I also left out a few warnings that are not supported in older compilers. [ Note that these changes might break the build on older compilers i did not test, or on non-x86 architectures that produce different warnings, so more testing would be welcome. ] Reported-by: Valdis.Kletnieks@vt.edu Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15perf tools: Factorize the thread code in a dedicated fileFrederic Weisbecker
Factorize the thread management code used by perf-annotate and perf-report in dedicated source and header files. v2: pass last_match by address so that it can actually be modified. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250245313-6995-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15Merge branch 'perfcounters/urgent' into perfcounters/coreIngo Molnar
Conflicts: kernel/perf_counter.c Merge reason: update to latest upstream (-rc6) and resolve the conflict with urgent fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-13perf report: Don't show unresolved DSOs and symbols when -S/-d is usedArnaldo Carvalho de Melo
We're interested in just those symbols/DSOs, so filter out the unresolved ones. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090812211957.GE3495@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12perf report: Show the tid too in -DArnaldo Carvalho de Melo
This made it easier to find the firefox threading related bug. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090811192138.GE18061@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12perf tools: Factorize the map helpersFrederic Weisbecker
Factorize the dso mapping helpers into a single purpose common file "util/map.c" Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12perf tools: Factorize the event structure definitions in a single fileFrederic Weisbecker
Factorize the multiple definition of the events structures into a single util/event.h file. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12perf tools: Factorize high level dso helpersFrederic Weisbecker
Factorize multiple definitions of high level dso helpers into the symbol source file. The side effect is a general export of the verbose and eprintf debugging helpers into a new file dedicated to debugging purposes. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-10perf report: Add raw displaying of per-thread countersBrice Goglin
If --pretty=raw is given to perf report -T, it now displays one line per-thread per-counter with the raw event id added. We get: # PID TID Name Raw Count 18608 18609 cache-misses 28e 416744 18608 18609 cache-references 28f 6456792 18608 18608 cache-misses 28e 448219 18608 18608 cache-references 28f 7270244 instead of: # PID TID cache-misses cache-references 18608 18609 416744 6456792 18608 18608 448219 7270244 Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Acked-by: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4A802008.5050409@inria.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf report: Fix and improve the displaying of per-thread event countersBrice Goglin
Improve and fix the handling of per-thread counter stats recorded via perf record -s. Previously we only displayed it in debug printouts (-D) and even that output was hard to disambiguate. I moved everything to utils/values.[ch] so that we may reuse it in perf stat. We get something like this now: # PID TID cache-misses cache-references 4658 4659 495581 3238779 4658 4662 498246 3236823 4658 4663 499531 3243162 Then it'll be easy to add --pretty=raw to display a single line per thread/event. By the way, -S was also used for --symbol... So I used -T/--thread here. perf report: Add -T/--threads to display per-thread counter values We get something like this now: # PID TID cache-misses cache-references 4658 4659 495581 3238779 4658 4662 498246 3236823 4658 4663 499531 3243162 Per-thread arrays of counter values are managed in utils/values.[ch] Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: callchain: Fix sum of percentages to be 100% by displaying ↵Frederic Weisbecker
amount of ignored chains in fractal mode When we filter the callchains below a given percentage, we ignore them and the end result only shows entries that have an upper percentage than the filter threshold. It seems to users then that we have an imbalance in the percentage, as if the sum inside a profiled branch doesn't reach 100%. Since in the past there have been real perf report bugs that showed the same sypmtom, it would be nice to assure the user that the data is perfect and trustable and it all sums up to 100.00%. So fix this by displaying the remaining hits that have been filtered but without more detail than their amount in each branches. Example while filtering below 50%: 7.73% [k] delay_tsc | |--98.22%-- __const_udelay | | | |--86.37%-- ath5k_hw_register_timeout | | ath5k_hw_noise_floor_calibration | | ath5k_hw_reset | | ath5k_reset | | ath5k_config | | ieee80211_hw_config | | | | | |--88.53%-- ieee80211_scan_work | | | worker_thread | | | kthread | | | child_rip | | --11.47%-- [...] | --13.63%-- [...] --1.78%-- [...] Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: callchain: Fix 'perf report' display to be callchain by defaultFrederic Weisbecker
If we recorded with -g option to record the callchain, right now we require a -g option to perf report as well - and people reported this as unnecessary complication: the user already specified -g once, no need to require it a second time. So if the recording includes call-chains, display the callchain by default from perf report. ( The user can override this default using "-g none" option from perf report. ) Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf report: Add debug help for the finding of symbol bugs - show the symtab ↵Arnaldo Carvalho de Melo
origin (DSO, build-id, kernel, etc) Used with perf report --verbose: [acme@doppio linux-2.6-tip]$ perf report -v | head -16 5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&) 2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal 1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable 1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet 1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock 1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret 1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer 1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet 1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light 1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject 0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc [acme@doppio linux-2.6-tip]$ The new field is just after the symbol address. To help in figuring out symbol resolution bugs. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf report: Fix per task mult-counter stat reportingPeter Zijlstra
Brice Goglin reported: > I can easily sort them by thread id, but I don't know how to match > my 4 events with each group of 4 lines. Also report the counter id and the time running/enabled stats (in case the counter got time-shared). Reported-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09perf tools: Fix call-chain cumul hit based sub-total (fractal mode)Frederic Weisbecker
The callchain fractal mode builds each new total hits in a new branch of profiling by using the parent's hits of the current branch plus the hits of the children. This is wrong, the total hits of a branch should be made of the sum of every children hits, we must ignore the parent hits in this scope. This patch also fixes another mistake with the hit counting. Now the rates are correct. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05perf report: Make --sort comm,dso,symbol the defaultPekka Enberg
If you're doing performance testing, you're interested in the symbols anyway so lets make "--sort comm,dso,symbol" the default sort option. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: acme@redhat.com LKML-Reference: <1249467921-10450-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04perf: Fix read buffer overflowRoel Kluin
Check whether index is within bounds before testing the element. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: a.p.zijlstra@chello.nl Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A757BCF.40101@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02perf report: Update for the new FORK/EXIT eventsPeter Zijlstra
Since FORK is now also issued for threads, detect those by comparing the parent and child PID. Teach it about EXIT events and ignore them. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-22perf_counter: PERF_SAMPLE_ID and inherited countersPeter Zijlstra
Anton noted that for inherited counters the counter-id as provided by PERF_SAMPLE_ID isn't mappable to the id found through PERF_RECORD_ID because each inherited counter gets its own id. His suggestion was to always return the parent counter id, since that is the primary counter id as exposed. However, these inherited counters have a unique identifier so that events like PERF_EVENT_PERIOD and PERF_EVENT_THROTTLE can be specific about which counter gets modified, which is important when trying to normalize the sample streams. This patch removes PERF_EVENT_PERIOD in favour of PERF_SAMPLE_PERIOD, which is more useful anyway, since changing periods became a lot more common than initially thought -- rendering PERF_EVENT_PERIOD the less useful solution (also, PERF_SAMPLE_PERIOD reports the more accurate value, since it reports the value used to trigger the overflow, whereas PERF_EVENT_PERIOD simply reports the requested period changed, which might only take effect on the next cycle). This still leaves us PERF_EVENT_THROTTLE to consider, but since that _should_ be a rare occurrence, and linking it to a primary id is the most useful bit to diagnose the problem, we introduce a PERF_SAMPLE_STREAM_ID, for those few cases where the full reconstruction is important. [Does change the ABI a little, but I see no other way out] Suggested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1248095846.15751.8781.camel@twins>
2009-07-22Merge commit 'tip/perfcounters/core' into perf-counters-for-linusPeter Zijlstra
2009-07-18perf_counter: Make call graph option consistentAnton Blanchard
perf record uses -g for logging call graph data but perf report uses -c to print call graph data. Be consistent and use -g everywhere for call graph data. Also update the help text to reflect the current default - fractal,0.5 Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090716104817.803604373@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11perf report: Introduce -n/--show-nr-samplesArnaldo Carvalho de Melo
[acme@doppio pahole]$ perf report -ns comm,dso,symbol -d /lib64/libc-2.10.1.so -C pahole | head -17 21.94% 32101 [.] _int_malloc 20.10% 29402 [.] __GI_strcmp 16.77% 24533 [.] __tsearch 12.61% 18450 [.] malloc_consolidate 6.42% 9394 [.] _int_free 6.28% 9191 [.] __tfind 4.56% 6678 [.] __GI___libc_free 4.46% 6520 [.] _IO_vfprintf_internal 2.59% 3786 [.] __malloc 1.17% 1716 [.] __GI_memcpy [acme@doppio pahole]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-5-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11perf report: Make the output more compactArnaldo Carvalho de Melo
When we filter by column content we may end up with a column that has the same value for all the lines. So remove that column and tell its unique value on the top, as a comment. Example: [acme@doppio pahole]$ perf report --sort comm,dso,symbol -d ./build/libdwarves.so.1.0.0 -C pahole | head -15 # dso: ./build/libdwarves.so.1.0.0 # comm: pahole # Samples: 58409 # # Overhead Symbol # ........ ...... # 20.93% [.] tag__recode_dwarf_type 14.94% [.] namespace__recode_dwarf_types 10.38% [.] cu__table_add_tag 6.69% [.] __die__process_tag 5.05% [.] die__process_function 4.70% [.] list__for_all_tags 3.68% [.] tag__init 3.48% [.] die__create_new_parameter [acme@doppio pahole]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-3-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11perf report: Tidy up reporting of symbols not foundArnaldo Carvalho de Melo
Always printing the level info about if it is in the kernel, hypervisor or userspace as that is in the hist_entry. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1247325517-12272-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11perf report: Adjust column width to the values sampledArnaldo Carvalho de Melo
Auto-adjust column width of perf report output to the longest occuring string length. Example: [acme@doppio pahole]$ perf report --sort comm,dso,symbol | head -13 12.79% pahole /usr/lib64/libdw-0.141.so [.] __libdw_find_attr 8.90% pahole /lib64/libc-2.10.1.so [.] _int_malloc 8.68% pahole /usr/lib64/libdw-0.141.so [.] __libdw_form_val_len 8.15% pahole /lib64/libc-2.10.1.so [.] __GI_strcmp 6.80% pahole /lib64/libc-2.10.1.so [.] __tsearch 5.54% pahole ./build/libdwarves.so.1.0.0 [.] tag__recode_dwarf_type [acme@doppio pahole]$ [acme@doppio pahole]$ perf report --sort comm,dso,symbol -d /lib64/libc-2.10.1.so | head -10 21.92% pahole /lib64/libc-2.10.1.so [.] _int_malloc 20.08% pahole /lib64/libc-2.10.1.so [.] __GI_strcmp 16.75% pahole /lib64/libc-2.10.1.so [.] __tsearch [acme@doppio pahole]$ Also add these extra options to control the new behaviour: -w, --field-width Force each column width to the provided list, for large terminal readability. -t, --field-separator: Use a special separator character and don't pad with spaces, replacing all occurances of this separator in symbol names (and other output) with a '.' character, that thus it's the only non valid separator. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090711014728.GH3452@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-05perf report: Add "Fractal" mode output - support callchains with relative ↵Frederic Weisbecker
overhead rate The current callchain displays the overhead rates as absolute: relative to the total overhead. This patch provides relative overhead percentage, in which each branch of the callchain tree is a independant instrumentated object. This provides a 'fractal' view of the call-chain profile: each sub-graph looks like a profile in itself - relative to its parent. You can produce such output by using the "fractal" mode that you can abbreviate via f, fr, fra, frac, etc... ./perf report -s sym -c fractal Example: 8.46% [k] copy_user_generic_string | |--52.01%-- generic_file_aio_read | do_sync_read | vfs_read | | | |--97.20%-- sys_pread64 | | system_call_fastpath | | pread64 | | | --2.81%-- sys_read | system_call_fastpath | __read | |--39.85%-- generic_file_buffered_write | __generic_file_aio_write_nolock | generic_file_aio_write | do_sync_write | reiserfs_file_write | vfs_write | | | |--97.05%-- sys_pwrite64 | | system_call_fastpath | | __pwrite64 | | | --2.95%-- sys_write | system_call_fastpath | __write_nocancel [...] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-05perf report: Change default callchain parametersFrederic Weisbecker
The default callchain parameters are set to use the flat mode and never filter any overhead threshold of backtrace. But flat mode is boring compared to graph mode. Also the number of callchains may be very high if none is filtered. Let's change this to set the graph view and a minimum overhead of 0.5% as default parameters. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-05perf report: Use a modifiable string for default callchain optionsFrederic Weisbecker
If the user doesn't provide options to tune his callchain output (ie: if he uses -c without arguments) then the default value passed in the OPT_CALLBACK_DEFAULT() macro is used. But it's parsed later by strtok() which will replace comma separators to a zero. This may segfault as we are using a read-only string. Use a modifiable one instead, and also fix the "100%" default minimum threshold value by turning it into a 0 (output every callchains) as it was intended in the origin. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246772361-9960-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>