265 Commits

Author SHA1 Message Date
Linus Torvalds
802f0d58d5 perf tools changes for v6.15
perf record
 -----------
 * Introduce latency profiling using scheduler information.  The latency
   profiling is to show impacts on wall-time rather than cpu-time.  By
   tracking context switches, it can weight samples and find which part
   of the code contributed more to the execution latency.
 
   The value (period) of the sample is weighted by dividing it by the
   number of parallel execution at the moment.  The parallelism is
   tracked in perf report with sched-switch records.  This will reduce
   the portion that are run in parallel and in turn increase the portion
   of serial executions.
 
   For now, it's limited to profile processes, IOW system-wide profiling
   is not supported.  You can add --latency option to enable this.
 
     $ perf record --latency -- make -C tools/perf
 
   I've run the above command for perf build which adds -j option to
   make with the number of CPUs in the system internally.  Normally
   it'd show something like below:
 
     $ perf report -F overhead,comm
     ...
     #
     # Overhead  Command
     # ........  ...............
     #
         78.97%  cc1
          6.54%  python3
          4.21%  shellcheck
          3.28%  ld
          1.80%  as
          1.37%  cc1plus
          0.80%  sh
          0.62%  clang
          0.56%  gcc
          0.44%  perl
          0.39%  make
 	 ...
 
   The cc1 takes around 80% of the overhead as it's the actual compiler.
   However it runs in parallel so its contribution to latency may be less
   than that.  Now, perf report will show both overhead and latency (if
   --latency was given at record time) like below:
 
     $ perf report -s comm
     ...
     #
     # Overhead   Latency  Command
     # ........  ........  ...............
     #
         78.97%    48.66%  cc1
          6.54%    25.68%  python3
          4.21%     0.39%  shellcheck
          3.28%    13.70%  ld
          1.80%     2.56%  as
          1.37%     3.08%  cc1plus
          0.80%     0.98%  sh
          0.62%     0.61%  clang
          0.56%     0.33%  gcc
          0.44%     1.71%  perl
          0.39%     0.83%  make
 	 ...
 
   You can see latency of cc1 goes down to around 50% and python3 and ld
   contribute a lot more than their overhead.  You can use --latency
   option in perf report to get the same result but ordered by latency.
 
     $ perf report --latency -s comm
 
 perf report
 -----------
 * As a side effect of the latency profiling work, it adds a new output
   field 'latency' and a sort key 'parallelism'.  The below is a result
   from my system with 64 CPUs.  The build was well-parallelized but
   contained some serial portions.
 
     $ perf report -s parallelism
     ...
     #
     # Overhead   Latency  Parallelism
     # ........  ........  ...........
     #
         16.95%     1.54%           62
         13.38%     1.24%           61
         12.50%    70.47%            1
         11.81%     1.06%           63
          7.59%     0.71%           60
          4.33%    12.20%            2
          3.41%     0.33%           59
          2.05%     0.18%           64
          1.75%     1.09%            9
          1.64%     1.85%            5
          ...
 
 * Support Feodra mini-debuginfo which is a LZMA compressed symbol table
   inside ".gnu_debugdata" ELF section.
 
 perf annotate
 -------------
 * Add --code-with-type option to enable data-type profiling with the
   usual annotate output.  Instead of focusing on data structure, it
   shows code annotation together with data type it accesses in case the
   instruction refers to a memory location (and it was able to resolve
   the target data type).  Currently it only works with --stdio.
 
     $ perf annotate --stdio --code-with-type
     ...
      Percent |      Source code & Disassembly of vmlinux for cpu/mem-loads,ldlat=30/pp (18 samples, percent: local period)
     ----------------------------------------------------------------------------------------------------------------------
              : 0                0xffffffff81050610 <__fdget>:
         0.00 :   ffffffff81050610:        callq   0xffffffff81c01b80 <__fentry__>           # data-type: (stack operation)
         0.00 :   ffffffff81050615:        pushq   %rbp              # data-type: (stack operation)
         0.00 :   ffffffff81050616:        movq    %rsp, %rbp
         0.00 :   ffffffff81050619:        pushq   %r15              # data-type: (stack operation)
         0.00 :   ffffffff8105061b:        pushq   %r14              # data-type: (stack operation)
         0.00 :   ffffffff8105061d:        pushq   %rbx              # data-type: (stack operation)
         0.00 :   ffffffff8105061e:        subq    $0x10, %rsp
         0.00 :   ffffffff81050622:        movl    %edi, %ebx
         0.00 :   ffffffff81050624:        movq    %gs:0x7efc4814(%rip), %rax  # 0x14e40 <current_task>              # data-type: struct task_struct* +0
         0.00 :   ffffffff8105062c:        movq    0x8d0(%rax), %r14         # data-type: struct task_struct +0x8d0 (files)
         0.00 :   ffffffff81050633:        movl    (%r14), %eax              # data-type: struct files_struct +0 (count.counter)
         0.00 :   ffffffff81050636:        cmpl    $0x1, %eax
         0.00 :   ffffffff81050639:        je      0xffffffff810506a9 <__fdget+0x99>
         0.00 :   ffffffff8105063b:        movq    0x20(%r14), %rcx          # data-type: struct files_struct +0x20 (fdt)
         0.00 :   ffffffff8105063f:        movl    (%rcx), %eax              # data-type: struct fdtable +0 (max_fds)
         0.00 :   ffffffff81050641:        cmpl    %ebx, %eax
         0.00 :   ffffffff81050643:        jbe     0xffffffff810506ef <__fdget+0xdf>
         0.00 :   ffffffff81050649:        movl    %ebx, %r15d
         5.56 :   ffffffff8105064c:        movq    0x8(%rcx), %rdx           # data-type: struct fdtable +0x8 (fd)
 	...
 
   The "# data-type:" part was added with this change.  The first few
   entries are not very interesting.  But later you can it accesses
   a couple of fields in the task_struct, files_struct and fdtable.
 
 perf trace
 ----------
 * Support syscall tracing for different ABI.  For example it can trace
   system calls for 32-bit applications on 64-bit kernel transparently.
 
 * Add --summary-mode=total option to show global syscall summary.  The
   default is 'thread' to show per-thread syscall summary.
 
 Python support
 --------------
 * Add more interfaces to 'perf' module to parse events, and config,
   enable or disable the event list properly so that it can implement
   basic functionalities purely in Python.  There is an example code
   for these new interfaces in python/tracepoint.py.
 
 * Add mypy and pylint support to enable build time checking.  Fix
   some code based on the findings from these tools.
 
 Internals
 ---------
 * Introduce io_dir__readdir() API to make directory traveral (usually
   for proc or sysfs) efficient with less memory footprint.
 
 JSON vendor events
 ------------------
 * Add events and metrics for ARM Neoverse N3 and V3
 * Update events and metrics on various Intel CPUs
 * Add/update events for a number of SiFive processors
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZ+Y/YQAKCRCMstVUGiXM
 g64CAP9sdUgKCe66JilZRAEy9d3Cp835v7KjHSlCXLXkRSoy6wD+KH96Y0OqVtGw
 GYON4s9ZoXAoH+J0lKIFSffVL9MhYgY=
 =7L7B
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
 "perf record:

   - Introduce latency profiling using scheduler information.

     The latency profiling is to show impacts on wall-time rather than
     cpu-time. By tracking context switches, it can weight samples and
     find which part of the code contributed more to the execution
     latency.

     The value (period) of the sample is weighted by dividing it by the
     number of parallel execution at the moment. The parallelism is
     tracked in perf report with sched-switch records. This will reduce
     the portion that are run in parallel and in turn increase the
     portion of serial executions.

     For now, it's limited to profile processes, IOW system-wide
     profiling is not supported. You can add --latency option to enable
     this.

       $ perf record --latency -- make -C tools/perf

     I've run the above command for perf build which adds -j option to
     make with the number of CPUs in the system internally. Normally
     it'd show something like below:

       $ perf report -F overhead,comm
       ...
       #
       # Overhead  Command
       # ........  ...............
       #
           78.97%  cc1
            6.54%  python3
            4.21%  shellcheck
            3.28%  ld
            1.80%  as
            1.37%  cc1plus
            0.80%  sh
            0.62%  clang
            0.56%  gcc
            0.44%  perl
            0.39%  make
  	 ...

     The cc1 takes around 80% of the overhead as it's the actual
     compiler. However it runs in parallel so its contribution to
     latency may be less than that. Now, perf report will show both
     overhead and latency (if --latency was given at record time) like
     below:

       $ perf report -s comm
       ...
       #
       # Overhead   Latency  Command
       # ........  ........  ...............
       #
           78.97%    48.66%  cc1
            6.54%    25.68%  python3
            4.21%     0.39%  shellcheck
            3.28%    13.70%  ld
            1.80%     2.56%  as
            1.37%     3.08%  cc1plus
            0.80%     0.98%  sh
            0.62%     0.61%  clang
            0.56%     0.33%  gcc
            0.44%     1.71%  perl
            0.39%     0.83%  make
  	 ...

     You can see latency of cc1 goes down to around 50% and python3 and
     ld contribute a lot more than their overhead. You can use --latency
     option in perf report to get the same result but ordered by
     latency.

       $ perf report --latency -s comm

  perf report:

   - As a side effect of the latency profiling work, it adds a new
     output field 'latency' and a sort key 'parallelism'. The below is a
     result from my system with 64 CPUs. The build was well-parallelized
     but contained some serial portions.

       $ perf report -s parallelism
       ...
       #
       # Overhead   Latency  Parallelism
       # ........  ........  ...........
       #
           16.95%     1.54%           62
           13.38%     1.24%           61
           12.50%    70.47%            1
           11.81%     1.06%           63
            7.59%     0.71%           60
            4.33%    12.20%            2
            3.41%     0.33%           59
            2.05%     0.18%           64
            1.75%     1.09%            9
            1.64%     1.85%            5
            ...

   - Support Feodra mini-debuginfo which is a LZMA compressed symbol
     table inside ".gnu_debugdata" ELF section.

  perf annotate:

   - Add --code-with-type option to enable data-type profiling with the
     usual annotate output.

     Instead of focusing on data structure, it shows code annotation
     together with data type it accesses in case the instruction refers
     to a memory location (and it was able to resolve the target data
     type). Currently it only works with --stdio.

       $ perf annotate --stdio --code-with-type
       ...
        Percent |      Source code & Disassembly of vmlinux for cpu/mem-loads,ldlat=30/pp (18 samples, percent: local period)
       ----------------------------------------------------------------------------------------------------------------------
                : 0                0xffffffff81050610 <__fdget>:
           0.00 :   ffffffff81050610:        callq   0xffffffff81c01b80 <__fentry__>           # data-type: (stack operation)
           0.00 :   ffffffff81050615:        pushq   %rbp              # data-type: (stack operation)
           0.00 :   ffffffff81050616:        movq    %rsp, %rbp
           0.00 :   ffffffff81050619:        pushq   %r15              # data-type: (stack operation)
           0.00 :   ffffffff8105061b:        pushq   %r14              # data-type: (stack operation)
           0.00 :   ffffffff8105061d:        pushq   %rbx              # data-type: (stack operation)
           0.00 :   ffffffff8105061e:        subq    $0x10, %rsp
           0.00 :   ffffffff81050622:        movl    %edi, %ebx
           0.00 :   ffffffff81050624:        movq    %gs:0x7efc4814(%rip), %rax  # 0x14e40 <current_task>              # data-type: struct task_struct* +0
           0.00 :   ffffffff8105062c:        movq    0x8d0(%rax), %r14         # data-type: struct task_struct +0x8d0 (files)
           0.00 :   ffffffff81050633:        movl    (%r14), %eax              # data-type: struct files_struct +0 (count.counter)
           0.00 :   ffffffff81050636:        cmpl    $0x1, %eax
           0.00 :   ffffffff81050639:        je      0xffffffff810506a9 <__fdget+0x99>
           0.00 :   ffffffff8105063b:        movq    0x20(%r14), %rcx          # data-type: struct files_struct +0x20 (fdt)
           0.00 :   ffffffff8105063f:        movl    (%rcx), %eax              # data-type: struct fdtable +0 (max_fds)
           0.00 :   ffffffff81050641:        cmpl    %ebx, %eax
           0.00 :   ffffffff81050643:        jbe     0xffffffff810506ef <__fdget+0xdf>
           0.00 :   ffffffff81050649:        movl    %ebx, %r15d
           5.56 :   ffffffff8105064c:        movq    0x8(%rcx), %rdx           # data-type: struct fdtable +0x8 (fd)
  	...

     The "# data-type:" part was added with this change. The first few
     entries are not very interesting. But later you can it accesses a
     couple of fields in the task_struct, files_struct and fdtable.

  perf trace:

   - Support syscall tracing for different ABI. For example it can trace
     system calls for 32-bit applications on 64-bit kernel
     transparently.

   - Add --summary-mode=total option to show global syscall summary. The
     default is 'thread' to show per-thread syscall summary.

  Python support:

   - Add more interfaces to 'perf' module to parse events, and config,
     enable or disable the event list properly so that it can implement
     basic functionalities purely in Python. There is an example code
     for these new interfaces in python/tracepoint.py.

   - Add mypy and pylint support to enable build time checking. Fix some
     code based on the findings from these tools.

  Internals:

   - Introduce io_dir__readdir() API to make directory traveral (usually
     for proc or sysfs) efficient with less memory footprint.

  JSON vendor events:

   - Add events and metrics for ARM Neoverse N3 and V3

   - Update events and metrics on various Intel CPUs

   - Add/update events for a number of SiFive processors"

* tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (229 commits)
  perf bpf-filter: Fix a parsing error with comma
  perf report: Fix a memory leak for perf_env on AMD
  perf trace: Fix wrong size to bpf_map__update_elem call
  perf tools: annotate asm_pure_loop.S
  perf python: Fix setup.py mypy errors
  perf test: Address attr.py mypy error
  perf build: Add pylint build tests
  perf build: Add mypy build tests
  perf build: Rename TEST_LOGS to SHELL_TEST_LOGS
  tools/build: Don't pass test log files to linker
  perf bench sched pipe: fix enforced blocking reads in worker_thread
  perf tools: Fix is_compat_mode build break in ppc64
  perf build: filter all combinations of -flto for libperl
  perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
  perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
  perf trace: Fix evlist memory leak
  perf trace: Fix BTF memory leak
  perf trace: Make syscall table stable
  perf syscalltbl: Mask off ABI type for MIPS system calls
  perf build: Remove Makefile.syscalls
  ...
2025-03-31 08:52:33 -07:00
Linus Torvalds
4fa118e5b7 tracing tooling updates for 6.15:
- Allow RTLA to collect data via BPF
 
   The current implementation of rtla uses libtracefs and libtraceevent to
   pull sample events generated by the timerlat tracer from the trace
   buffer. rtla then processes the sample by updating the histogram and
   summary (current, maximum, minimum, and sum values) as well as checks
   if tracing has been stopped due to threshold overflow.
 
   In use cases where a large number of samples is being generated, that
   is, with measurements running on many CPUs and with a low interval,
   this sample processing design causes a significant CPU load on the rtla
   side. Furthermore, with >100 CPUs and 100us interval, rtla was reported
   as not being able to keep up with the samples and dropping most of them,
   leading to it being unusable.
 
   Change the way the timerlat trace processes samples by attaching
   a BPF program to the trace event using the BPF skeleton feature of bpftool.
   Unlike the current implementation, the BPF implementation does not check
   whether tracing is stopped (in BPF mode, tracing is always off to improve
   performance), but waits for a write to a BPF ringbuffer instead. This allows
   rtla to exit immediately when a threshold is violated, without waiting
   for the next iteration of the while loop.
 
   If the requirements for the BPF implementation are not met, either at
   build time or at run time, the current implementation is used as
   fallback. Which implementation is being used can be seen when running
   rtla timerlat with "-D" option. rtla can be forced to run in non-BPF
   mode by setting the RTLA_NO_BPF option to 1, for debugging purposes.
 
 - Fix LD_FLAGS from being dropped in build
 
 - Refactor code to remove duplication of save_trace_to_file
 
 - Always set options and do not rely on default settings
 
   Do not rely on the default kernel settings of the tracers when
   starting. They could have been changed by the user which gives
   inconsistent results. Always set the options that rtla expects.
 
 - Add creation of ctags and TAGS for traversing code
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ+WBgRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qg54AQDCOChaSSBiUkD0VoPKIeDMlPfvO5Qz
 Xvrst5gtopfKFgEA12/9Lll/sh1eoc4saeGBooNY48HBUMjmX3KNFB14PQg=
 =aHFu
 -----END PGP SIGNATURE-----

Merge tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tooling updates from Steven Rostedt:

 - Allow RTLA to collect data via BPF

   The current implementation of rtla uses libtracefs and libtraceevent
   to pull sample events generated by the timerlat tracer from the trace
   buffer. rtla then processes the sample by updating the histogram and
   summary (current, maximum, minimum, and sum values) as well as checks
   if tracing has been stopped due to threshold overflow.

   In use cases where a large number of samples is being generated, that
   is, with measurements running on many CPUs and with a low interval,
   this sample processing design causes a significant CPU load on the
   rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was
   reported as not being able to keep up with the samples and dropping
   most of them, leading to it being unusable.

   Change the way the timerlat trace processes samples by attaching a
   BPF program to the trace event using the BPF skeleton feature of
   bpftool. Unlike the current implementation, the BPF implementation
   does not check whether tracing is stopped (in BPF mode, tracing is
   always off to improve performance), but waits for a write to a BPF
   ringbuffer instead. This allows rtla to exit immediately when a
   threshold is violated, without waiting for the next iteration of the
   while loop.

   If the requirements for the BPF implementation are not met, either at
   build time or at run time, the current implementation is used as
   fallback. Which implementation is being used can be seen when running
   rtla timerlat with "-D" option. rtla can be forced to run in non-BPF
   mode by setting the RTLA_NO_BPF option to 1, for debugging purposes.

 - Fix LD_FLAGS from being dropped in build

 - Refactor code to remove duplication of save_trace_to_file

 - Always set options and do not rely on default settings

   Do not rely on the default kernel settings of the tracers when
   starting. They could have been changed by the user which gives
   inconsistent results. Always set the options that rtla expects.

 - Add creation of ctags and TAGS for traversing code

* tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rtla: Add the ability to create ctags and etags
  rtla/tests: Test setting default options
  rtla/tests: Reset osnoise options before check
  rtla: Always set all tracer options
  rtla/osnoise: Set OSNOISE_WORKLOAD to true
  rtla: Unify apply_config between top and hist
  rtla/osnoise: Unify params struct
  rtla: Fix segfault in save_trace_to_file call
  tools/build: Use SYSTEM_BPFTOOL for system bpftool
  rtla: Refactor save_trace_to_file
  tools/rv: Keep user LDFLAGS in build
  rtla/timerlat: Test BPF mode
  rtla/timerlat_top: Use BPF to collect samples
  rtla/timerlat_top: Move divisor to update
  rtla/timerlat_hist: Use BPF to collect samples
  rtla/timerlat: Add BPF skeleton to collect samples
  rtla: Add optional dependency on BPF tooling
  tools/build: Add bpftool-skeletons feature test
  rtla/timerlat: Unify params struct
2025-03-27 17:03:01 -07:00
Tomas Glozar
814d051ebe tools/build: Use SYSTEM_BPFTOOL for system bpftool
The feature test for system bpftool uses BPFTOOL as the variable to set
its path, defaulting to just "bpftool" if not set by the user.

This conflicts with selftests and a few other utilities, which expect
BPFTOOL to be set to the in-tree bpftool path by default. For example,
bpftool selftests fail to build:

$ make -C tools/testing/selftests/bpf/
make: Entering directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf'

make: *** No rule to make target 'bpftool', needed by '/home/tglozar/dev/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h'.  Stop.
make: Leaving directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf'

Fix the problem by renaming the variable used for system bpftool from
BPFTOOL to SYSTEM_BPFTOOL, so that the new usage does not conflict with
the existing one of BPFTOOL.

Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250326004018.248357-1-tglozar@redhat.com
Fixes: 8a635c3856dd ("tools/build: Add bpftool-skeletons feature test")
Closes: https://lore.kernel.org/linux-kernel/5df6968a-2e5f-468e-b457-fc201535dd4c@linux.ibm.com/
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Suggested-by: Quentin Monnet <qmo@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-26 10:31:31 -04:00
Ian Rogers
935e7cb5bb tools/build: Don't pass test log files to linker
Separate test log files from object files. Depend on test log output
but don't pass to the linker.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:38:20 -07:00
Tomas Glozar
8a635c3856 tools/build: Add bpftool-skeletons feature test
Add bpftool-skeletons feature test, testing the presence of a bpftool
capable of generating skeletons.

This is to be used for tools that do not require building their own
bootstrap bpftool from the kernel source tree.

Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Link: https://lore.kernel.org/20250218145859.27762-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-04 12:35:17 -05:00
Charlie Jenkins
293f324ce9 tools: Unify top-level quiet infrastructure
Commit f2868b1a66d4f40f ("perf tools: Expose quiet/verbose variables in
Makefile.perf") moved the quiet infrastructure out of
tools/build/Makefile.build and into the top-level Makefile.perf file so
that the quiet infrastructure could be used throughout perf and not just
in Makefile.build.

Extract out the quiet infrastructure into Makefile.include so that it
can be leveraged outside of perf.

Fixes: f2868b1a66d4f40f ("perf tools: Expose quiet/verbose variables in Makefile.perf")
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Benjamin Tissoires <bentiss@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Mykola Lysenko <mykolal@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20250213-quiet_tools-v3-1-07de4482a581@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-02-18 15:31:45 -03:00
Ian Rogers
7c1e94f5dc tools build: Fix a number of Wconversion warnings
There's some expressed interest in having the compiler flag
-Wconversion detect at build time certain kinds of potential problems:
https://lore.kernel.org/lkml/20250103182532.GB781381@e132581.arm.com/

As feature detection passes -Wconversion from CFLAGS when set, the
feature detection compile tests need to not fail because of
-Wconversion as the failure will be interpretted as a missing
feature. Switch various types to avoid the -Wconversion issue, the
exact meaning of the code is unimportant as it is typically looking
for header file definitions.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250106215443.198633-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 20:23:05 -08:00
Charlie Jenkins
f2868b1a66 perf tools: Expose quiet/verbose variables in Makefile.perf
The variables to make builds silent/verbose live inside
tools/build/Makefile.build. Move those variables to the top-level
Makefile.perf to be generally available.

Committer testing:

See the SYSCALL lines, now they are consistent with the other
operations in other lines:
  SYSTBL  /tmp/build/perf-tools-next/arch/x86/include/generated/asm/syscalls_32.h
  SYSTBL  /tmp/build/perf-tools-next/arch/x86/include/generated/asm/syscalls_64.h
  GEN     /tmp/build/perf-tools-next/common-cmds.h
  GEN     /tmp/build/perf-tools-next/arch/arm64/include/generated/asm/sysreg-defs.h
  PERF_VERSION = 6.13.rc2.g3d94bb6ed1d0
  GEN     perf-archive
  MKDIR   /tmp/build/perf-tools-next/jvmti/
  MKDIR   /tmp/build/perf-tools-next/jvmti/
  MKDIR   /tmp/build/perf-tools-next/jvmti/
  MKDIR   /tmp/build/perf-tools-next/jvmti/
  GEN     perf-iostat
  CC      /tmp/build/perf-tools-next/jvmti/libjvmti.o

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20250114-perf_make_test-v1-1-decc1c517b11@rivosinc.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-16 10:59:20 -08:00
Charlie Jenkins
3cc550f5bb perf tools: Remove dependency on libaudit
All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is
no longer needed. With the removal of the flag, the related
GENERIC_SYSCALL_TABLE can also be removed.

libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT
was not defined, so libaudit is also no longer needed for any
architecture.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Charlie Jenkins
4a73aff8c5 perf tools: Create generic syscall table support
Currently each architecture in perf independently generates syscall
headers.

Adapt the work that has gone into unifying syscall header
implementations in the kernel to work with perf tools.

Introduce this framework with riscv at first. riscv previously relied on
libaudit, but with this change, perf tools for riscv no longer needs
this external dependency.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-1-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:49:49 -03:00
Leo Yan
d557814cdf tools build: Add feature test for libelf with ZSTD
The macro ELFCOMPRESS_ZSTD defines the compress algorithm, which was
introduced in the commit ("libelf: Document and make ELFCOMPRESS_ZSTD
usable with old system elf.h") of the repository elfutils-0.188-67.
Therefore, libelf 0.189 and later versions require to link the libzstd
library.

Add a test for checking if libelf supports ZSTD algorithm.  Pass the
macro ELFCOMPRESS_ZSTD as an argument to the elf_compress() function.
If the build succeeds, it means the feature is supported.

Reviewed-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Quentin Monnet <qmo@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20241215221223.293205-2-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:32 -03:00
Arnaldo Carvalho de Melo
055f0ce7d8 tools build: Test for presence of libtraceevent and libtracefs in test-all.c
Since these are so far considered part of the basic set of libraries to
be present when building perf, have then in
tools/build/features/test-all.c.

They were already in the FEATURE_TESTS_BASIC variable of
tools/build/Makefile.feature, meaning if test-all.c builds, those
features would be set as present, but then we were calling "again"
(well, they were not in test-all.c, so were not really being tested) for
it to be detected, fix this all up by not calling feature_check for
those features but instead have them in test-all.c to be tested together
with the the set of basic expected libraries.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20241213195052.914914-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:28 -03:00
Arnaldo Carvalho de Melo
701b27403c tools build feature: Don't set feature-libslang-include-subdir=1 if test-all.c builds
As it is not really included in tools/build/feature/test-all.c, so any
questioning about this feature should really try to build
tools/build/feature/test-libslang-include-subdir.c and not set it as
detected when test-all.c builds.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20241213195052.914914-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-13 18:30:57 -03:00
Arnaldo Carvalho de Melo
b1ef2559d5 tools build feature: Don't set feature-libcap=1 if libcap-devel isn't available
libcap isn't tested in the tools/build/feature/test-all.c fast path
feature detection process, so don't set it as available if test-all
manages to build.

There are other users of this feature detection mechanism, and they
explicitely ask for libcap to be tested, so are not affected by this
patch, for instance, with this patch in place:

  $ make -C tools/bpf/bpftool/ clean
  <SNIP>
  make: Leaving directory '/home/acme/git/perf-tools-next/tools/bpf/bpftool'
  ⬢ [acme@toolbox perf-tools-next]$ make -C tools/bpf/bpftool/
  make: Entering directory '/home/acme/git/perf-tools-next/tools/bpf/bpftool'

  Auto-detecting system features:
  ...                         clang-bpf-co-re: [ on  ]
  ...                                    llvm: [ on  ]
  ...                                  libcap: [ on  ]
  ...                                  libbfd: [ on  ]
  ...                             libelf-zstd: [ on  ]
  <SNIP>
    LINK    bpftool
  make: Leaving directory '/home/acme/git/perf-tools-next/tools/bpf/bpftool'
  $
  $ sudo rpm -e libcap-devel
  $ make -C tools/bpf/bpftool/
  <SNIP>
  make: Entering directory '/home/acme/git/perf-tools-next/tools/bpf/bpftool'

  Auto-detecting system features:
  ...                         clang-bpf-co-re: [ on  ]
  ...                                    llvm: [ on  ]
  ...                                  libcap: [ OFF ]
  ...                                  libbfd: [ on  ]
  ...                             libelf-zstd: [ on  ]

  $

Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Quentin Monnet <qmo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20241211224509.797827-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-12 15:34:14 -03:00
Arnaldo Carvalho de Melo
20ed532554 tools build feature: Add some comments to explain the FEATURE_TESTS logic
The tools/build/feature/test-all.c works in conjunction with the
tools/build/Makefile.feature FEATURE_TESTS_BASIC and FEATURE_TESTS_EXTRA
contents, so that if test-all.c manages to be built, we go on and
iterate all entries in FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA setting
them to 1.

To test this:

  $ rm -rf /tmp/b ; mkdir /tmp/b ; make -C tools/perf O=/tmp/b feature-dump
  $ cat /tmp/b/feature/test-all.make.output
  $ ldd /tmp/b/feature/test-all.bin
	linux-vdso.so.1 (0x00007f2a47a67000)
	libdw.so.1 => /lib64/libdw.so.1 (0x00007f2a477cf000)
	libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007f2a471fe000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f2a4711a000)
	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f2a470f2000)
	libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007f2a470cb000)
	libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007f2a46c1b000)
	libz.so.1 => /lib64/libz.so.1 (0x00007f2a46bf8000)
	libbabeltrace-ctf.so.1 => /lib64/libbabeltrace-ctf.so.1 (0x00007f2a46bad000)
	libcapstone.so.5 => /lib64/libcapstone.so.5 (0x00007f2a464b8000)
	libopencsd_c_api.so.1 => /lib64/libopencsd_c_api.so.1 (0x00007f2a464a8000)
	libopencsd.so.1 => /lib64/libopencsd.so.1 (0x00007f2a46422000)
	libelf.so.1 => /lib64/libelf.so.1 (0x00007f2a46406000)
	libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f2a463f6000)
	libslang.so.2 => /lib64/libslang.so.2 (0x00007f2a46113000)
	libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007f2a45d74000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f2a45b83000)
	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f2a45b50000)
	libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f2a45a91000)
	libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f2a45a7b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f2a47a69000)
	libbabeltrace.so.1 => /lib64/libbabeltrace.so.1 (0x00007f2a45a6b000)
	libpopt.so.0 => /lib64/libpopt.so.0 (0x00007f2a45a5b000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f2a45a51000)
	libgmodule-2.0.so.0 => /lib64/libgmodule-2.0.so.0 (0x00007f2a45a4a000)
	libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f2a458fa000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f2a45696000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f2a45668000)
	libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f2a45630000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f2a45590000)
  $ head /tmp/b/FEATURE-DUMP
  feature-backtrace=1
  feature-libdw=1
  feature-eventfd=1
  feature-fortify-source=1
  feature-get_current_dir_name=1
  feature-gettid=1
  feature-glibc=1
  feature-libbfd=1
  feature-libbfd-buildid=1
  feature-libcap=1
  $

There are inconsistencies that are being audited, as can be seen above
with the libcap case, that is not linked with test-all.bin nor is
present in test-all.c, so shouldn't be set as present. Further patches
are going to address those inconsistencies, but lets document this a bit
more to reduce the chances of this happening again.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20241211224509.797827-2-acme@kernel.org
[ Fixed typo pointed out by Ian Rogers ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-12 15:33:24 -03:00
Arnaldo Carvalho de Melo
b40fbeb0b1 tools build: Remove the libunwind feature tests from the ones detected when test-all.o builds
We have a tools/build/feature/test-all.c that has the most common set of
features that perf uses and are expected to have its development files
available when building perf.

When we made libwunwind opt-in we forgot to remove them from the list of
features that are assumed to be available when test-all.c builds, remove
them.

Before this patch:

  $ rm -rf /tmp/b ; mkdir /tmp/b ; make -C tools/perf O=/tmp/b feature-dump ; grep feature-libunwind-aarch64= /tmp/b/FEATURE-DUMP
  feature-libunwind-aarch64=1
  $

Even tho this not being test built and those header files being
available:

  $ head -5 tools/build/feature/test-libunwind-aarch64.c
  // SPDX-License-Identifier: GPL-2.0
  #include <libunwind-aarch64.h>
  #include <stdlib.h>

  extern int UNW_OBJ(dwarf_search_unwind_table) (unw_addr_space_t as,
  $

After this patch:

  $ grep feature-libunwind- /tmp/b/FEATURE-DUMP
  $

Now an audit on what is being enabled when test-all.c builds will be
performed.

Fixes: 176c9d1e6a06f2fa ("tools features: Don't check for libunwind devel files by default")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-11 19:42:05 -03:00
Arnaldo Carvalho de Melo
176c9d1e6a tools features: Don't check for libunwind devel files by default
Since 13e17c9ff49119aa ("perf build: Make libunwind opt-in rather than
opt-out"), so we shouldn't by default be testing for its availability at
build time in tools/build/features/test-all.c.

That test was designed to test the features we expect to be the most
common ones in most builds, so if we test build just that file, then we
assume the features there are present and will not test one by one.

Removing it from test-all.c gets rid of the first impediment for
test-all.c to build successfully:

  $ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  In file included from test-all.c:62:
  test-libunwind.c:2:10: fatal error: libunwind.h: No such file or directory
      2 | #include <libunwind.h>
        |          ^~~~~~~~~~~~~
  compilation terminated.
  $

We then get to:

  $ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  /usr/bin/ld: cannot find -lunwind-x86_64: No such file or directory
  /usr/bin/ld: cannot find -lunwind: No such file or directory
  collect2: error: ld returned 1 exit status
  $

So make all the logic related to setting CFLAGS, LDFLAGS, etc for
libunwind to be conditional on NO_LIBWUNWIND=1, which is now the
default, now we get a faster build:

  $ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  $ ldd /tmp/build/perf-tools-next/feature/test-all.bin
  	linux-vdso.so.1 (0x00007fef04cde000)
  	libdw.so.1 => /lib64/libdw.so.1 (0x00007fef04a49000)
  	libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007fef04478000)
  	libm.so.6 => /lib64/libm.so.6 (0x00007fef04394000)
  	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007fef0436c000)
  	libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007fef04345000)
  	libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007fef03e95000)
  	libz.so.1 => /lib64/libz.so.1 (0x00007fef03e72000)
  	libelf.so.1 => /lib64/libelf.so.1 (0x00007fef03e56000)
  	libnuma.so.1 => /lib64/libnuma.so.1 (0x00007fef03e48000)
  	libslang.so.2 => /lib64/libslang.so.2 (0x00007fef03b65000)
  	libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007fef037c6000)
  	libc.so.6 => /lib64/libc.so.6 (0x00007fef035d5000)
  	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fef035a0000)
  	libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fef034e1000)
  	libbz2.so.1 => /lib64/libbz2.so.1 (0x00007fef034cd000)
  	/lib64/ld-linux-x86-64.so.2 (0x00007fef04ce0000)
  	libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007fef03495000)
  $

Fixes: 13e17c9ff49119aa ("perf build: Make libunwind opt-in rather than opt-out")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Z09zTztD8X8qIWCX@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09 17:51:53 -03:00
Linus Torvalds
b50ecc5aca perf tools changes for v6.13
perf record
 -----------
 * Enable leader sampling for inherited task events.  It was supported
   only for system-wide events but the kernel started to support such a
   setup since v6.12.
 
   This is to reduce the number of PMU interrupts.  The samples of the
   leader event will contain counts of other events and no samples will
   be generated for the other member events.
 
     $ perf record -e '{cycles,instructions}:S'  ${MYPROG}
 
 perf report
 -----------
 * Fix --branch-history option to display more branch-related information
   like prediction, abort and cycles which is available on Intel machines.
 
     $ perf record -bg -- perf test -w brstack
 
     $ perf report --branch-history
     ...
     #
     # Overhead  Source:Line               Symbol          Shared Object         Predicted  Abort  Cycles  IPC   [IPC Coverage]
     # ........  ........................  ..............  ....................  .........  .....  ......  ....................
     #
          8.17%  copy_page_64.S:19         [k] copy_page   [kernel.kallsyms]     50.0%      0      5       -      -
                 |
                 ---xas_load xarray.h:171
                    |
                    |--5.68%--xas_load xarray.c:245 (cycles:1)
                    |          xas_load xarray.c:242
                    |          xas_load xarray.h:1260 (cycles:1)
                    |          xas_descend xarray.c:146
                    |          xas_load xarray.c:244 (cycles:2)
                    |          xas_load xarray.c:245
                    |          xas_descend xarray.c:218 (cycles:10)
     ...
 
 perf stat
 ---------
 * Add HWMON PMU support.  The HWMON provides various system information
   like CPU/GPU temperature, fan speed and so on.  Expose them as PMU
   events so that users can see the values using perf stat commands.
 
     $ perf stat -e temp_cpu,fan1 true
 
      Performance counter stats for 'true':
 
                  60.00 'C   temp_cpu
                      0 rpm  fan1
 
            0.000745382 seconds time elapsed
 
            0.000883000 seconds user
            0.000000000 seconds sys
 
 * Display metric threshold in JSON output.  Some metrics define
   thresholds to classify value ranges.  It used to be in a different
   color but it won't work for JSON.  Add "metric-threshold" field to
   the JSON that can be one of "good", "less good", "nearly bad" and
   "bad".
 
     # perf stat -a -M TopdownL1 -j true
     {"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "%  tma_backend_bound", "metric-threshold" : "bad"}
     {"metric-value" : "29.212267", "metric-unit" : "%  tma_frontend_bound", "metric-threshold" : "bad"}
     {"metric-value" : "7.138972", "metric-unit" : "%  tma_bad_speculation", "metric-threshold" : "good"}
     {"metric-value" : "20.422759", "metric-unit" : "%  tma_retiring", "metric-threshold" : "good"}
     {"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
     {"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
     {"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
     {"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
     ...
 
 perf sched
 ----------
 * Add -P/--pre-migrations option for 'timehist' sub-command to track
   time a task waited on a run-queue before migrating to a different CPU.
 
     $ perf sched timehist -P
                time    cpu  task name                       wait time  sch delay   run time  pre-mig time
                             [tid/pid]                          (msec)     (msec)     (msec)     (msec)
     --------------- ------  ------------------------------  ---------  ---------  ---------  ---------
       585940.535527 [0000]  perf[584885]                        0.000      0.000      0.000      0.000
       585940.535535 [0000]  migration/0[20]                     0.000      0.002      0.008      0.000
       585940.535559 [0001]  perf[584885]                        0.000      0.000      0.000      0.000
       585940.535563 [0001]  migration/1[25]                     0.000      0.001      0.004      0.000
       585940.535678 [0002]  perf[584885]                        0.000      0.000      0.000      0.000
       585940.535686 [0002]  migration/2[31]                     0.000      0.002      0.008      0.000
       585940.535905 [0001]  <idle>                              0.000      0.000      0.342      0.000
       585940.535938 [0003]  perf[584885]                        0.000      0.000      0.000      0.000
       585940.537048 [0001]  sleep[584886]                       0.000      0.019      1.142      0.001
       585940.537749 [0002]  <idle>                              0.000      0.000      2.062      0.000
     ...
 
 Build
 -----
 * Make libunwind opt-in (LIBUNWIND=1) rather than opt-out.  The perf
   tools are generally built with libelf and libdw which has unwinder
   functionality.  The libunwind support predates it and no need to
   have duplicate unwinders by default.
 
 * Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify it's
   using libdw for handling DWARF information.
 
 Internals
 ---------
 * Do not set exclude_guest bit in the perf_event_attr by default.  This
   was causing a trouble in AMD IBS PMU as it doesn't support the bit.
   The bit will be set when it's needed later by the fallback logic.
   Also update the missing feature detection logic to make sure not clear
   supported bits unnecessarily.
 
 * Run perf test in parallel by default and mark flaky tests "exclusive"
   to run them serially at the end.  Some test numbers are changed but
   the test can complete in less than half the time.
 
 JSON vendor events
 ------------------
 * Add AMD Zen 5 events and metrics.
 
 * Add i.MX91 and i.MX95 DDR metrics
 
 * Fix HiSilicon HIP08 Topdown metric name.
 
 * Support compat events on PowerPC.
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZ0Qi3gAKCRCMstVUGiXM
 g6NIAP49eoSmQF40u55sJN0J7RpYd+bTgXZkahv0IUCBX98TLwEA2NrK0oUcB84C
 xeanq28/3JxNM/oBpsEvvB8mb/0lGwI=
 =FAVF
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
 "perf record:

   - Enable leader sampling for inherited task events. It was supported
     only for system-wide events but the kernel started to support such
     a setup since v6.12.

     This is to reduce the number of PMU interrupts. The samples of the
     leader event will contain counts of other events and no samples
     will be generated for the other member events.

       $ perf record -e '{cycles,instructions}:S'  ${MYPROG}

  perf report:

   - Fix --branch-history option to display more branch-related
     information like prediction, abort and cycles which is available
     on Intel machines.

       $ perf record -bg -- perf test -w brstack

       $ perf report --branch-history
       ...
       #
       # Overhead  Source:Line               Symbol          Shared Object         Predicted  Abort  Cycles  IPC   [IPC Coverage]
       # ........  ........................  ..............  ....................  .........  .....  ......  ....................
       #
            8.17%  copy_page_64.S:19         [k] copy_page   [kernel.kallsyms]     50.0%      0      5       -      -
                   |
                   ---xas_load xarray.h:171
                      |
                      |--5.68%--xas_load xarray.c:245 (cycles:1)
                      |          xas_load xarray.c:242
                      |          xas_load xarray.h:1260 (cycles:1)
                      |          xas_descend xarray.c:146
                      |          xas_load xarray.c:244 (cycles:2)
                      |          xas_load xarray.c:245
                      |          xas_descend xarray.c:218 (cycles:10)
       ...

  perf stat:

   - Add HWMON PMU support.

     The HWMON provides various system information like CPU/GPU
     temperature, fan speed and so on. Expose them as PMU events so that
     users can see the values using perf stat commands.

       $ perf stat -e temp_cpu,fan1 true

        Performance counter stats for 'true':

                    60.00 'C   temp_cpu
                        0 rpm  fan1

              0.000745382 seconds time elapsed

              0.000883000 seconds user
              0.000000000 seconds sys

   - Display metric threshold in JSON output.

     Some metrics define thresholds to classify value ranges. It used to
     be in a different color but it won't work for JSON.

     Add "metric-threshold" field to the JSON that can be one of "good",
     "less good", "nearly bad" and "bad".

       # perf stat -a -M TopdownL1 -j true
       {"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "%  tma_backend_bound", "metric-threshold" : "bad"}
       {"metric-value" : "29.212267", "metric-unit" : "%  tma_frontend_bound", "metric-threshold" : "bad"}
       {"metric-value" : "7.138972", "metric-unit" : "%  tma_bad_speculation", "metric-threshold" : "good"}
       {"metric-value" : "20.422759", "metric-unit" : "%  tma_retiring", "metric-threshold" : "good"}
       {"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
       {"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
       {"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
       {"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, }
       ...

  perf sched:

   - Add -P/--pre-migrations option for 'timehist' sub-command to track
     time a task waited on a run-queue before migrating to a different
     CPU.

       $ perf sched timehist -P
                  time    cpu  task name                       wait time  sch delay   run time  pre-mig time
                               [tid/pid]                          (msec)     (msec)     (msec)     (msec)
       --------------- ------  ------------------------------  ---------  ---------  ---------  ---------
         585940.535527 [0000]  perf[584885]                        0.000      0.000      0.000      0.000
         585940.535535 [0000]  migration/0[20]                     0.000      0.002      0.008      0.000
         585940.535559 [0001]  perf[584885]                        0.000      0.000      0.000      0.000
         585940.535563 [0001]  migration/1[25]                     0.000      0.001      0.004      0.000
         585940.535678 [0002]  perf[584885]                        0.000      0.000      0.000      0.000
         585940.535686 [0002]  migration/2[31]                     0.000      0.002      0.008      0.000
         585940.535905 [0001]  <idle>                              0.000      0.000      0.342      0.000
         585940.535938 [0003]  perf[584885]                        0.000      0.000      0.000      0.000
         585940.537048 [0001]  sleep[584886]                       0.000      0.019      1.142      0.001
         585940.537749 [0002]  <idle>                              0.000      0.000      2.062      0.000
       ...

  Build:

   - Make libunwind opt-in (LIBUNWIND=1) rather than opt-out.

     The perf tools are generally built with libelf and libdw which has
     unwinder functionality. The libunwind support predates it and no
     need to have duplicate unwinders by default.

   - Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify
     it's using libdw for handling DWARF information.

  Internals:

   - Do not set exclude_guest bit in the perf_event_attr by default.

     This was causing a trouble in AMD IBS PMU as it doesn't support the
     bit. The bit will be set when it's needed later by the fallback
     logic. Also update the missing feature detection logic to make sure
     not clear supported bits unnecessarily.

   - Run perf test in parallel by default and mark flaky tests
     "exclusive" to run them serially at the end. Some test numbers are
     changed but the test can complete in less than half the time.

  JSON vendor events:

   - Add AMD Zen 5 events and metrics.

   - Add i.MX91 and i.MX95 DDR metrics

   - Fix HiSilicon HIP08 Topdown metric name.

   - Support compat events on PowerPC"

* tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (232 commits)
  perf tests: Fix hwmon parsing with PMU name test
  perf hwmon_pmu: Ensure hwmon key union is zeroed before use
  perf tests hwmon_pmu: Remove double evlist__delete()
  perf/test: fix perf ftrace test on s390
  perf bpf-filter: Return -ENOMEM directly when pfi allocation fails
  perf test: Correct hwmon test PMU detection
  perf: Remove unused del_perf_probe_events()
  perf pmu: Move pmu_metrics_table__find and remove ARM override
  perf jevents: Add map_for_cpu()
  perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
  perf header: Avoid transitive PMU includes
  perf arm64 header: Use cpu argument in get_cpuid
  perf header: Refactor get_cpuid to take a CPU for ARM
  perf header: Move is_cpu_online to numa bench
  perf jevents: fix breakage when do perf stat on system metric
  perf test: Add missing __exit calls in tool/hwmon tests
  perf tests: Make leader sampling test work without branch event
  perf util: Remove kernel version deadcode
  perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM isn't resolved
  perf test shell trace_exit_race: Show what went wrong in verbose mode
  ...
2024-11-26 14:54:00 -08:00
Linus Torvalds
4b01712311 tracing/tools: Updates for 6.13
- Add ':' to getopt option 'trace-buffer-size' in timerlat_hist for
   consistency
 
 - Remove unused sched_getattr define
 
 - Rename sched_setattr() helper to syscall_sched_setattr() to avoid
   conflicts
 
 - Update counters to long from int to avoid overflow
 
 - Add libcpupower dependency detection
 
 - Add --deepest-idle-state to timerlat to limit deep idle sleeps
 
 - Other minor clean ups and documentation changes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZz5O/hQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qkLlAQDAJ0MASrdbJRDrLrfmKX6sja582MLe
 3MvevdSkOeXRdQEA0tzm46KOb5/aYNotzpntQVkTjuZiPBHSgn1JzASiaAI=
 =OZ1w
 -----END PGP SIGNATURE-----

Merge tag 'trace-tools-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tools updates from Steven Rostedt:

 - Add ':' to getopt option 'trace-buffer-size' in timerlat_hist for
   consistency

 - Remove unused sched_getattr define

 - Rename sched_setattr() helper to syscall_sched_setattr() to avoid
   conflicts

 - Update counters to long from int to avoid overflow

 - Add libcpupower dependency detection

 - Add --deepest-idle-state to timerlat to limit deep idle sleeps

 - Other minor clean ups and documentation changes

* tag 'trace-tools-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  verification/dot2: Improve dot parser robustness
  tools/rtla: Improve exception handling in timerlat_load.py
  tools/rtla: Enhance argument parsing in timerlat_load.py
  tools/rtla: Improve code readability in timerlat_load.py
  rtla/timerlat: Do not set params->user_workload with -U
  rtla: Documentation: Mention --deepest-idle-state
  rtla/timerlat: Add --deepest-idle-state for hist
  rtla/timerlat: Add --deepest-idle-state for top
  rtla/utils: Add idle state disabling via libcpupower
  rtla: Add optional dependency on libcpupower
  tools/build: Add libcpupower dependency detection
  rtla/timerlat: Make timerlat_hist_cpu->*_count unsigned long long
  rtla/timerlat: Make timerlat_top_cpu->*_count unsigned long long
  tools/rtla: fix collision with glibc sched_attr/sched_set_attr
  tools/rtla: drop __NR_sched_getattr
  rtla: Fix consistency in getopt_long for timerlat_hist
  rv: Fix a typo
  tools/rv: Correct the grammatical errors in the comments
  tools/rv: Correct the grammatical errors in the comments
  rtla: use the definition for stdout fd when calling isatty()
2024-11-22 13:24:22 -08:00
Yicong Yang
35de42cdfb perf build: Include libtraceevent headers directly indicated by pkg-config
Currently the libtraceevent's found by pkg-config, which give the
include path as:

  [root@localhost tmp]# pkg-config --cflags libtraceevent
  -I/usr/local/include/traceevent

So we should include the libtraceevent headers directly without
"traceevent/" prefix. Update all the users.

Fixes: 0f0e1f445690 ("perf build: Use pkg-config for feature check for libtrace{event,fs}")
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/linux-perf-users/ZyF5_Hf1iL01kldE@google.com/
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: leo.yan@arm.com
Cc: amadio@gentoo.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20241105105649.45399-1-yangyicong@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-08 22:42:57 -08:00
Ian Rogers
91e81e988f perf probe: Move elfutils support check to libdw check
The test _ELFUTILS_PREREQ(0, 142) is false for elfutils before
2009-06-13, but that is 15 years ago and very unlikely. Add a test to
test-libdw.c and assume the libdw version is at least 0.142 to
simplify the build logic.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
26385fd237 perf build: Combine test-dwarf-getcfi into test-libdw
dwarf_getcfi support in libdw is 15 years old. Make libdw imply
dwarf_getcfi support and simplify build logic.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
23580d7bb1 perf build: Combine test-dwarf-getlocations into test-libdw
dwarf_getlocations support in libdw is more than 10 years old. Make
libdw imply dwarf_getlocations support and simplify build logic.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
3034b48a4b perf build: Combine libdw-dwarf-unwind into libdw feature tests
Support in libdw has been present for 10 years so let's simplify the
build logic with a single feature test.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
7c943261a1 perf build: Rename test-dwarf to test-libdw
Be more intention revealing that the dwarf test is actually testing
for libdw support.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Tomas Glozar
0f59a6c9c4 tools/build: Add libcpupower dependency detection
Add the ability to detect the presence of libcpupower on a system to
the Makefiles in tools/build.

Link: https://lore.kernel.org/20241017140914.3200454-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-17 17:13:15 -04:00
Yang Jihong
a530337ba9 perf build: Fix build feature-dwarf_getlocations fail for old libdw
For libdw versions below 0.177, need to link libdl.a in addition to
libbebl.a during static compilation, otherwise
feature-dwarf_getlocations compilation will fail.

Before:

  $ make LDFLAGS=-static
    BUILD:   Doing 'make -j20' parallel build
  <SNIP>
  Makefile.config:483: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157
  <SNIP>

  $ cat ../build/feature/test-dwarf_getlocations.make.output
  /usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libebl.a(eblclosebackend.o): in function `ebl_closebackend':
  (.text+0x20): undefined reference to `dlclose'
  collect2: error: ld returned 1 exit status

After:

  $ make LDFLAGS=-static
  <SNIP>
    Auto-detecting system features:
  ...                                   dwarf: [ on  ]
  <SNIP>

    $ ./perf probe
   Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...]
      or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...]
      or: perf probe [<options>] --del '[GROUP:]EVENT' ...
      or: perf probe --list [GROUP:]EVENT ...
  <SNIP>

Fixes: 536661da6ea18fe6 ("perf: build: Only link libebl.a for old libdw")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240919013513.118527-3-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-02 18:21:49 -03:00
Yang Jihong
43f6564f18 perf build: Fix static compilation error when libdw is not installed
If libdw is not installed in build environment, the output of
'pkg-config --modversion libdw' is empty, causing LIBDW_VERSION_2 to be
empty and the shell test will have the following error:

  /bin/sh: 1: test: -lt: unexpected operator

Before:

  $ pkg-config --modversion libdw
  Package libdw was not found in the pkg-config search path.
  Perhaps you should add the directory containing `libdw.pc'
  to the PKG_CONFIG_PATH environment variable
  No package 'libdw' found
  $ make LDFLAGS=-static -j16
    BUILD:   Doing 'make -j20' parallel build
  <SNIP>
  Package libdw was not found in the pkg-config search path.
  Perhaps you should add the directory containing `libdw.pc'
  to the PKG_CONFIG_PATH environment variable
  No package 'libdw' found
  /bin/sh: 1: test: -lt: unexpected operator

After:

  1. libdw is not installed:

  $ pkg-config --modversion libdw
  Package libdw was not found in the pkg-config search path.
  Perhaps you should add the directory containing `libdw.pc'
  to the PKG_CONFIG_PATH environment variable
  No package 'libdw' found
  $ make LDFLAGS=-static -j16
    BUILD:   Doing 'make -j20' parallel build
  <SNIP>
  Package libdw was not found in the pkg-config search path.
  Perhaps you should add the directory containing `libdw.pc'
  to the PKG_CONFIG_PATH environment variable
  No package 'libdw' found
  Makefile.config:473: No libdw DWARF unwind found, Please install elfutils-devel/libdw-dev >= 0.158 and/or set LIBDW_DIR

  2. libdw version is lower than 0.177

  $ pkg-config --modversion libdw
  0.176
  $ make LDFLAGS=-static -j16
    BUILD:   Doing 'make -j20' parallel build
  <SNIP>

  Auto-detecting system features:
  ...                                   dwarf: [ on  ]
  <SNIP>
    INSTALL libsubcmd_headers
    INSTALL libapi_headers
    INSTALL libperf_headers
    INSTALL libsymbol_headers
    INSTALL libbpf_headers
    LINK    perf

  3. libdw version is higher than 0.177

  $ pkg-config --modversion libdw
  0.186
  $ make LDFLAGS=-static -j16
    BUILD:   Doing 'make -j20' parallel build
  <SNIP>

  Auto-detecting system features:
  ...                                   dwarf: [ on  ]
  <SNIP>
    CC      util/bpf-utils.o
    CC      util/pfm.o
    LD      util/perf-util-in.o
    LD      perf-util-in.o
    AR      libperf-util.a
    LINK    perf

Fixes: 536661da6ea18fe6 ("perf: build: Only link libebl.a for old libdw")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240919013513.118527-2-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-02 18:21:49 -03:00
James Clark
332f60ac05 perf build: Remove unused feature test target
llvm-version was removed in commit 56b11a2126bf ("perf bpf: Remove
support for embedding clang for compiling BPF events (-e foo.c)") but
some parts were left in the Makefile so finish removing them.

Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Manu Bretelle <chantr4@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Link: https://lore.kernel.org/r/20240910140405.568791-2-james.clark@linaro.org
[ Removed one leftover, 'llvm-version' from FEATURE_TESTS_EXTRA ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10 17:32:47 -03:00
James Clark
206dcfca1f perf build: Autodetect minimum required llvm-dev version
The new LLVM addr2line feature requires a minimum version of 13 to
compile. Add a feature check for the version so that NO_LLVM=1 doesn't
need to be explicitly added. Leave the existing llvm feature check
intact because it's used by tools other than Perf.

This fixes the following compilation error when the llvm-dev version
doesn't match:

  util/llvm-c-helpers.cpp: In function 'char* llvm_name_for_code(dso*, const char*, u64)':
  util/llvm-c-helpers.cpp:178:21: error: 'std::remove_reference_t<llvm::DILineInfo>' {aka 'struct llvm::DILineInfo'} has no member named 'StartAddress'
    178 |   addr, res_or_err->StartAddress ? *res_or_err->StartAddress : 0);

Fixes: c3f8644c21df9b7d ("perf report: Support LLVM for addr2line()")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Manu Bretelle <chantr4@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Link: https://lore.kernel.org/r/20240910140405.568791-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-10 17:32:46 -03:00
Steinar H. Gunderson
c3f8644c21 perf report: Support LLVM for addr2line()
In addition to the existing support for libbfd and calling out to
an external addr2line command, add support for using libllvm directly.

This is both faster than libbfd, and can be enabled in distro builds
(the LLVM license has an explicit provision for GPLv2 compatibility).

Thus, it is set as the primary choice if available.

As an example, running 'perf report' on a medium-size profile with
DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
libbfd, 153 seconds with external llvm-addr2line, and I got tired and
aborted the test after waiting for 55 minutes with external bfd
addr2line (which is the default for perf as compiled by distributions
today).

Evidently, for this case, the bfd addr2line process needs 18 seconds (on
a 5.2 GHz Zen 3) to load the .debug ELF in question, hits the 1-second
timeout and gets killed during initialization, getting restarted anew
every time. Having an in-process addr2line makes this much more robust.

As future extensions, libllvm can be used in many other places where
we currently use libbfd or other libraries:

 - Symbol enumeration (in particular, for PE binaries).
 - Demangling (including non-Itanium demangling, e.g. Microsoft
   or Rust).
 - Disassembling (perf annotate).

However, these are much less pressing; most people don't profile PE
binaries, and perf has non-bfd paths for ELF. The same with demangling;
the default _cxa_demangle path works fine for most users, and while bfd
objdump can be slow on large binaries, it is possible to use
--objdump=llvm-objdump to get the speed benefits.  (It appears
LLVM-based demangling is very simple, should we want that.)

Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
correctly detected using feature_check, and thus was not tested.

Committer notes:

 Added the name and a __maybe_unused to address:

   1    13.50 almalinux:8                   : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-22) (GCC)
    util/srcline.c: In function 'dso__free_a2l':
    util/srcline.c:184:20: error: parameter name omitted
     void dso__free_a2l(struct dso *)
                        ^~~~~~~~~~~~
    make[3]: *** [/git/perf-6.11.0-rc3/tools/build/Makefile.build:158: util] Error 2

Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240803152008.2818485-1-sesse@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-03 10:15:16 -03:00
Arnaldo Carvalho de Melo
0fd77ae4a3 Revert "tools build: Remove leftover libcap tests that prevents fast path feature detection from working"
Ian pointed out that the libcap feature test is also used by bpftool, so
we can't remove it just because perf stopped using it, revert the
removal of the feature test.

Since both perf and libcap uses the fast path feature detection
(tools/build/feature/test-all.c), probably the best thing is to keep
libcap-devel when building perf even it not being used there.

This reverts commit 47b3b6435e4bfb61ae8ffc63a11bd3c310f69acf.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29 11:51:43 -03:00
Arnaldo Carvalho de Melo
47b3b6435e tools build: Remove leftover libcap tests that prevents fast path feature detection from working
I noticed that the fast path feature detection was failing:

  $ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  /usr/bin/ld: cannot find -lcap: No such file or directory
  collect2: error: ld returned 1 exit status
  $

The patch removing the dependency (Fixes tag below) didn't remove the
detection of libcap, and as the fast path feature detection (test-all.c)
had -lcap in its Makefile link list of libraries to link, it was failing
when libcap-devel is not available, fix it by removing those leftover
files.

Fixes: e25ebda78e230283 ("perf cap: Tidy up and improve capability testing")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zs-gjOGFWtAvIZit@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-28 19:11:36 -03:00
Alexander Gordeev
b53f20b323 tools build: Provide consistent build options for fixdep
The fixdep binary is being compiled and linked in one step. While the
host linker flags are passed to the compiler the host compiler flags are
missed.

That leads to build errors at least on x86_64, arm64 and s390 as result
of the compiler vs linker flags inconsistency. For example, during RPM
package build redhat-hardened-ld script is provided to gcc, while
redhat-hardened-cc1 script is missed.

Provide both KBUILD_HOSTCFLAGS and KBUILD_HOSTLDFLAGS to avoid that.

Fixes: ea974028a049f2ce ("tools build: Avoid circular .fixdep-in.o.cmd issues")
Closes: https://lore.kernel.org/lkml/99ae0d34-ed76-4ca0-a9fd-c337da33c9f9@leemhuis.info/
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240815072046.1002837-1-agordeev@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-15 18:08:17 -03:00
Arnaldo Carvalho de Melo
4c55560f23 perf build: Fix up broken capstone feature detection fast path
The capstone devel headers define 'struct bpf_insn' in a way that clashes with
what is in the libbpf devel headers, so we so far need to avoid including both.

This is happening on the tools/build/feature/test-all.c file, where we try
building all the expected set of libraries to be normally available on a
system:

  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  In file included from test-bpf.c:3,
                   from test-all.c:150:
  /home/acme/git/perf-tools-next/tools/include/uapi/linux/bpf.h:77:8: error: ‘bpf_insn’ defined as wrong kind of tag
     77 | struct bpf_insn {
        |        ^~~~~~~~
  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output

When doing so there is a trick where we define main to be
main_test_libcapstone, then include the individual
tools/build/feture/test-libcapstone.c capability query test, and then we undef
'main' because we'll do it all over again with the next expected library to
be tested (at this time 'lzma').

To complete this mechanism we need to, in test-all.c 'main' routine, to
call main_test_libcapstone(), which isn't being done, so the effect of
adding references to capstone in test-all.c are not achieved.

The only thing that is happening is that test-all.c is failing to build and thus
all the tests will have to be done individually, which nullifies the test-all.c
single build speedup.

So lets remove references to capstone from test-all.c to see if this makes it
build again so that we get faster builds or go on fixing up whatever is
preventing us to get that benefit.

Nothing: after this fix we get a clean test-all.c build and get the build speedup back:

  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.
  test-all.bin          test-all.d            test-all.make.output
  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  ⬢[acme@toolbox perf-tools-next]$ ldd /tmp/build/perf-tools-next/feature/test-all.bin
  	linux-vdso.so.1 (0x00007f13277a1000)
  	libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007f1326e00000)
  	libm.so.6 => /lib64/libm.so.6 (0x00007f13274be000)
  	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1327496000)
  	libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007f132746f000)
  	libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007f1326800000)
  	libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f1327452000)
  	libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f1327436000)
  	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f1327403000)
  	libdw.so.1 => /lib64/libdw.so.1 (0x00007f1326d6f000)
  	libz.so.1 => /lib64/libz.so.1 (0x00007f13273e2000)
  	libelf.so.1 => /lib64/libelf.so.1 (0x00007f1326d53000)
  	libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f13273d4000)
  	libslang.so.2 => /lib64/libslang.so.2 (0x00007f1326400000)
  	libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007f1326000000)
  	libc.so.6 => /lib64/libc.so.6 (0x00007f1325e0f000)
  	libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f1326741000)
  	/lib64/ld-linux-x86-64.so.2 (0x00007f13277a3000)
  	libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f1326d3f000)
  	libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f1326d07000)
  ⬢[acme@toolbox perf-tools-next]$

And when having capstone-devel installed we get it detected and linked with
perf, allowing us to benefit from the features that it enables:

  ⬢[acme@toolbox perf-tools-next]$ rpm -q capstone-devel
  capstone-devel-5.0.1-3.fc40.x86_64
  ⬢[acme@toolbox perf-tools-next]$ ldd /tmp/build/perf-tools-next/perf | grep capstone
  	libcapstone.so.5 => /lib64/libcapstone.so.5 (0x00007fe6a5c00000)
  ⬢[acme@toolbox perf-tools-next]$ /tmp/build/perf-tools-next/perf -vv | grep cap
             libcapstone: [ on  ]  # HAVE_LIBCAPSTONE_SUPPORT
  ⬢[acme@toolbox perf-tools-next]$

Fixes: 8b767db3309595a2 ("perf: build: introduce the libcapstone")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zry0sepD5Ppa5YKP@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14 10:44:26 -03:00
Brian Norris
dbb2a7a986 tools build: Correct bpf fixdep dependencies
The dependencies in tools/lib/bpf/Makefile are incorrect. Before we
recurse to build $(BPF_IN_STATIC), we need to build its 'fixdep'
executable.

I can't use the usual shortcut from Makefile.include:

  <target>: <sources> fixdep

because its 'fixdep' target relies on $(OUTPUT), and $(OUTPUT) differs
in the parent 'make' versus the child 'make' -- so I imitate it via
open-coding.

I tweak a few $(MAKE) invocations while I'm at it, because
1. I'm adding a new recursive make; and
2. these recursive 'make's print spurious lines about files that are "up
   to date" (which isn't normally a feature in Kbuild subtargets) or
   "jobserver not available" (see [1])

I also need to tweak the assignment of the OUTPUT variable, so that
relative path builds work. For example, for 'make tools/lib/bpf', OUTPUT
is unset, and is usually treated as "cwd" -- but recursive make will
change cwd and so OUTPUT has a new meaning. For consistency, I ensure
OUTPUT is always an absolute path.

And $(Q) gets a backup definition in tools/build/Makefile.include,
because Makefile.include is sometimes included without
tools/build/Makefile, so the "quiet command" stuff doesn't actually work
consistently without it.

After this change, top-level builds result in an empty grep result from:

  $ grep 'cannot find fixdep' $(find tools/ -name '*.cmd')

[1] https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html
If we're not using $(MAKE) directly, then we need to use more '+'.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240715203325.3832977-4-briannorris@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-05 12:19:48 -03:00
Brian Norris
ea974028a0 tools build: Avoid circular .fixdep-in.o.cmd issues
The 'fixdep' tool is used to post-process dependency files for various
reasons, and it runs after every object file generation command. This
even includes 'fixdep' itself.

In Kbuild, this isn't actually a problem, because it uses a single
command to generate fixdep (a compile-and-link command on fixdep.c), and
afterward runs the fixdep command on the accompanying .fixdep.cmd file.

In tools/ builds (which notably is maintained separately from Kbuild),
fixdep is generated in several phases:

 1. fixdep.c -> fixdep-in.o
 2. fixdep-in.o -> fixdep

Thus, fixdep is not available in the post-processing for step 1, and
instead, we generate .cmd files that look like:

  ## from tools/objtool/libsubcmd/.fixdep.o.cmd
  # cannot find fixdep (/path/to/linux/tools/objtool/libsubcmd//fixdep)
  [...]

These invalid .cmd files are benign in some respects, but cause problems
in others (such as the linked reports).

Because the tools/ build system is rather complicated in its own right
(and pointedly different than Kbuild), I choose to simply open-code the
rule for building fixdep, and avoid the recursive-make indirection that
produces the problem in the first place.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/all/Zk-C5Eg84yt6_nml@google.com/
Link: https://lore.kernel.org/r/20240715203325.3832977-3-briannorris@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-05 12:09:14 -03:00
Linus Torvalds
e254e0c5ba Another perf tools fixes for v6.11
Some more fixes about the build and a random crash:
 
 * Fix cross-build by setting pkg-config env according to the arch
 * Fix static build for missing library dependencies
 * Fix Segfault when callchain has no symbols
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZqls2wAKCRCMstVUGiXM
 g12aAQCovAOO6jC5GrzCS8KlBoHXplyGL1rscI2uZfOttotBnQEA875Ov6FNL9Ux
 u9oxHZOpN/U4Xnyeoj9c43ibae/urAQ=
 =LTu1
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.11-2024-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
 "Some more build fixes and a random crash fix:

   - Fix cross-build by setting pkg-config env according to the arch

   - Fix static build for missing library dependencies

   - Fix Segfault when callchain has no symbols"

* tag 'perf-tools-fixes-for-v6.11-2024-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf docs: Document cross compilation
  perf: build: Link lib 'zstd' for static build
  perf: build: Link lib 'lzma' for static build
  perf: build: Only link libebl.a for old libdw
  perf: build: Set Python configuration for cross compilation
  perf: build: Setup PKG_CONFIG_LIBDIR for cross compilation
  perf tool: fix dereferencing NULL al->maps
2024-07-30 19:22:41 -07:00
Leo Yan
f42596c738 perf: build: Link lib 'zstd' for static build
When build static perf, Makefile reports the error:

  Makefile.config:480: No libdw DWARF unwind found, Please install
  elfutils-devel/libdw-dev >= 0.158 and/or set LIBDW_DIR

The libdw has been installed on the system, but the build system fails
to build the feature detecting binary 'test-libdw-dwarf-unwind'. The
failure is caused by missing to link the lib 'zstd'.

Link lib 'zstd' for the static build, in the end, the dwarf feature can
be enabled in the static perf.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-6-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-26 11:15:47 -07:00
Leo Yan
91b6a536b4 perf: build: Link lib 'lzma' for static build
The libunwind feature test failed with the static linkage. This is due
to the 'lzma' lib is missed, so link it to dismiss building failure.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-5-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-26 11:15:40 -07:00
Leo Yan
536661da6e perf: build: Only link libebl.a for old libdw
Since libdw version 0.177, elfutils has merged libebl.a into libdw (see
the commit "libebl: Don't install libebl.a, libebl.h and remove backends
from spec." in the elfutils repository).

As a result, libebl.a does not exist on Debian Bullseye and newer
releases, causing static perf builds to fail on these distributions.

This commit checks the libdw version and only links libebl.a if it
detects that the libdw version is older than 0.177.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-4-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-26 11:15:25 -07:00
Leo Yan
440cf77625 perf: build: Setup PKG_CONFIG_LIBDIR for cross compilation
On recent Linux distros like Ubuntu Noble and Debian Bookworm, the
'pkg-config-aarch64-linux-gnu' package is missing. As a result, the
aarch64-linux-gnu-pkg-config command is not available, which causes
build failures.

When a build passes the environment variables PKG_CONFIG_LIBDIR or
PKG_CONFIG_PATH, like a user uses make command or a build system
(like Yocto, Buildroot, etc) prepares the variables and passes to the
Perf's Makefile, the commit keeps these variables for package
configuration. Otherwise, this commit sets the PKG_CONFIG_LIBDIR
variable to use the Multiarch libs for the cross compilation.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-26 11:14:56 -07:00
Linus Torvalds
786c8248db perf tools fixes for v6.11
Two fixes about building perf and other tools:
 
 * Fix breakage in tracing tools due to pkg-config for libtrace{event,fs}
 
 * Fix build of perf when libunwind is used
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZqA1SAAKCRCMstVUGiXM
 g3TfAQDXLi+XcSDE/u5JcDN3H6+bXvavDn2k8Gsd6vWZQc5LEQD3X1E+GbtWTQsE
 ruk5ZT3voy8qBPgmrUg72NJwmRxYAQ==
 =0RKR
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
 "Two fixes for building perf and other tools:

   - Fix breakage in tracing tools due to pkg-config for
     libtrace{event,fs}

   - Fix build of perf when libunwind is used"

* tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf dso: Fix build when libunwind is enabled
  tools/latency: Use pkg-config in lib_setup of Makefile.config
  tools/rtla: Use pkg-config in lib_setup of Makefile.config
  tools/verification: Use pkg-config in lib_setup of Makefile.config
  tools: Make pkg-config dependency checks usable by other tools
  perf build: Warn if libtracefs is not found
2024-07-23 18:15:51 -07:00
Guilherme Amadio
8f61e98ad5 tools: Make pkg-config dependency checks usable by other tools
Other tools, in tools/verification and tools/tracing, make use of
libtraceevent and libtracefs as dependencies. This allows setting
up the feature check flags for them as well.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20240717174739.186988-3-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-17 13:14:35 -07:00
Guilherme Amadio
0f0e1f4456 perf build: Use pkg-config for feature check for libtrace{event,fs}
Needed to add required include directories for the feature detection
to succeed. The header tracefs.h is installed either into the include
directory /usr/include/tracefs/tracefs.h when using the Makefile, or
into /usr/include/libtracefs/tracefs.h when using meson to build
libtracefs. The header tracefs.h uses #include <event-parse.h> from
libtraceevent, so pkg-config needs to pick the correct include directory
for libtracefs and add the one for libtraceevent to succeed.

Note that in baa2ca59ec1e31ccbe3f24ff0368152b36f68720 the variable
LIBTRACEEVENT_DIR was introduced, and now the method to compile against
non-standard locations requires PKG_CONFIG_PATH to be set instead, which
works for both libtraceevent and libtracefs.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606153625.2255470-2-amadio@gentoo.org
2024-06-21 16:19:03 -07:00
Daniel Wagner
28beb730ee tools: build: use correct lib name for libtracefs feature detection
Use libtracefs as package name to lookup the CFLAGS for libtracefs. This
makes it possible to use the distro specific path as include path for
the header file.

Link: https://lkml.kernel.org/r/20240617-rtla-build-v1-1-6882c34678e8@suse.de

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
2024-06-21 10:28:06 +02:00
Changbin Du
8b767db330 perf: build: introduce the libcapstone
Later we will use libcapstone to disassemble instructions of samples.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: changbin.du@gmail.com
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240217074046.4100789-2-changbin.du@huawei.com
2024-02-20 18:06:25 -08:00
James Clark
2dbba30fd6 perf cs-etm: Bump minimum OpenCSD version to ensure a bugfix is present
Since commit d927ef5004ef ("perf cs-etm: Add exception level consistency
check"), the exception that was added to Perf will be triggered unless
the following bugfix from OpenCSD is present:

 - _Version 1.2.1_:
  - __Bugfix__:
    ETM4x / ETE - output of context elements to client can in some
    circumstances be delayed until after subsequent atoms have been
    processed leading to incorrect memory decode access via the client
    callbacks. Fixed to flush context elements immediately they are
    committed.

Rather than remove the assert and silently fail, just increase the
minimum version requirement to avoid hard to debug issues and
regressions.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230901133716.677499-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27 10:21:27 -03:00
Namhyung Kim
f67f2fda7d perf build: Add feature check for dwarf_getcfi()
The dwarf_getcfi() is available on libdw 0.142+.  Instead of just
checking the version number, it'd be nice to have a config item to check
the feature at build time.

Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10 09:04:16 -03:00
Namhyung Kim
fed3a1be64 perf tools fixes for v6.6: 2nd batch
- Fix regression in reading scale and unit files from sysfs for PMU
   events, so that we can use that info to pretty print instead of
   printing raw numbers:
 
   # perf stat -e power/energy-ram/,power/energy-gpu/ sleep 2
 
    Performance counter stats for 'system wide':
 
               1.64 Joules power/energy-ram/
               0.20 Joules power/energy-gpu/
 
        2.001228914 seconds time elapsed
   #
   # grep -m1 "model name" /proc/cpuinfo
   model name	: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
   #
 
 - The small llvm.cpp file used to check if the llvm devel files are present was
   incorrectly deleted when removing the BPF event in 'perf trace', put it back
   as it is also used by tools/bpf/bpftool, that uses llvm routines to do
   disassembly of BPF object files.
 
 - Fix use of addr_location__exit() in dlfilter__object_code(), making sure that
   it is only used to pair a previous addr_location__init() call.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZTKh5AAKCRCyPKLppCJ+
 J/g/AP0f6SNyHJz21JzDTzyjXAeSdMzKwic0LXv+kATQy31HJAD+Kf7UKQieUeZB
 fxvp60aKyFN8IVIgpYiAjZMS3k9XPAY=
 =N7Gv
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-next

To get the latest fixes in the perf tools including perf stat output,
dlfilter and LLVM feature detection.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-30 13:46:27 -07:00