- Use dev_err_probe() helpers to simplify the init code in the Qoriq
thermal driver (Frank Li).
- Power down the Qoriq's TMU at suspend time (Alice Guo).
- Add ipq5332, ipq5424 compatible to the QCom's tsens thermal driver
and TSENS enable / calibration support for V2 (Praveenkumar I).
- Add missing rk3328 mapping entry (Trevor Woerner).
- Remove duplicate struct declaration from the thermal core header
file (Xueqin Luo).
- Disable the monitoring mode during suspend in the LVTS Mediatek
driver to prevent temperature acquisition glitches (Nícolas F. R. A.
Prado).
- Disable Stage 3 thermal threshold in the LVTS Mediatek driver
because it disables the suspend ability and does not have an
interrupt handler (Nícolas F. R. A. Prado).
- Fix low temperature offset interrupt in the LVTS Mediatek driver
to prevent multiple interrupts from triggering when the system is at
its normal functionning temperature (Nícolas F. R. A. Prado).
- Enable interrupts in the LVTS Mediatek driver only on sensors that
are in use (Nícolas F. R. A. Prado).
- Add the BCM74110 compatible DT binding and the corresponding code
to support a chip based on a different process node than previous
chips (Florian Fainelli).
- Correct indentation and style in DTS example (Krzysztof Kozlowski).
- Unify hexadecimal annotatation in the rcar_gen3 driver (Niklas
Söderlund).
- Factor out the code logic to read fuses on Gen3 and Gen4 in the
rcar_gen3 thermal driver (Niklas Söderlund).
- Drop unused driver data from the QCom's spmi temperature alarm
driver (Johan Hovold).
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmfq7sQSHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO15hAH/imWw7djjvVup1ZDxTHICUHbAjS+QwmT
EZr5zy6SSHlnQ/h5eWwCziXBEYY04Ra3qXn/H8cqVKcgeSVdYJQ49adKTK7V0/5O
eszOGQQDtog0xeIYavLDcXFmN3jVa7BF6Zq6XQrezw6HELyWH14EYVVZ4qvgqFvt
jXiMdMzkzfQ/TYkso69lMK+asX0M9V/aRtf4Fzw4WH5qTmkqNkoSLOY7yPidwrYy
t4zXhTWpDmzB0mRaljaZ+Z9rDbyTMe8cRL+K9+PLGLVhPhU2DpVTvAEGs5w4IaoB
Nja+QixI8lOi/m1ErqqeduMtT4Rk4FC6T4uC58tMzLYLYY96snEg/EY=
=5IMD
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These are mostly assorted updates of thermal drivers used on ARM
platforms:
- Use dev_err_probe() helpers to simplify the init code in the Qoriq
thermal driver (Frank Li)
- Power down the Qoriq's TMU at suspend time (Alice Guo)
- Add ipq5332, ipq5424 compatible to the QCom's tsens thermal driver
and TSENS enable / calibration support for V2 (Praveenkumar I)
- Add missing rk3328 mapping entry (Trevor Woerner)
- Remove duplicate struct declaration from the thermal core header
file (Xueqin Luo)
- Disable the monitoring mode during suspend in the LVTS Mediatek
driver to prevent temperature acquisition glitches (Nícolas F. R.
A. Prado)
- Disable Stage 3 thermal threshold in the LVTS Mediatek driver
because it disables the suspend ability and does not have an
interrupt handler (Nícolas F. R. A. Prado)
- Fix low temperature offset interrupt in the LVTS Mediatek driver to
prevent multiple interrupts from triggering when the system is at
its normal functionning temperature (Nícolas F. R. A. Prado)
- Enable interrupts in the LVTS Mediatek driver only on sensors that
are in use (Nícolas F. R. A. Prado)
- Add the BCM74110 compatible DT binding and the corresponding code
to support a chip based on a different process node than previous
chips (Florian Fainelli)
- Correct indentation and style in DTS example (Krzysztof Kozlowski)
- Unify hexadecimal annotatation in the rcar_gen3 driver (Niklas
Söderlund)
- Factor out the code logic to read fuses on Gen3 and Gen4 in the
rcar_gen3 thermal driver (Niklas Söderlund)
- Drop unused driver data from the QCom's spmi temperature alarm
driver (Johan Hovold)"
* tag 'thermal-6.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/drivers/qcom-spmi-temp-alarm: Drop unused driver data
thermal: rcar_gen3: Reuse logic to read fuses on Gen3 and Gen4
thermal: rcar_gen3: Use lowercase hex constants
dt-bindings: thermal: Correct indentation and style in DTS example
thermal/drivers/brcmstb_thermal: Add support for BCM74110
dt-bindings: thermal: Update for BCM74110
thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors
thermal/drivers/mediatek/lvts: Start sensor interrupts disabled
thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold
thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold
thermal/drivers/mediatek/lvts: Disable monitor mode during suspend
thermal: core: Remove duplicate struct declaration
thermal/drivers/rockchip: Add missing rk3328 mapping entry
thermal/drivers/tsens: Add TSENS enable and calibration support for V2
dt-bindings: thermal: tsens: Add ipq5332, ipq5424 compatible
thermal/drivers/qoriq: Power down TMU on system suspend
thermal/drivers/qoriq: Use dev_err_probe() simplify the code
reservation" from Sourabh Jain changes powerpc's kexec code to use more
of the generic layers.
- The 2 patch series "get_maintainer: report subsystem status
separately" from Vlastimil Babka makes some long-requested improvements
to the get_maintainer output.
- The 4 patch series "ucount: Simplify refcounting with rcuref_t" from
Sebastian Siewior cleans up and optimizing the refcounting in the ucount
code.
- The 12 patch series "reboot: support runtime configuration of
emergency hw_protection action" from Ahmad Fatoum improves the ability
for a driver to perform an emergency system shutdown or reboot.
- The 16 patch series "Converge on using secs_to_jiffies() part two"
from Easwar Hariharan performs further migrations from
msecs_to_jiffies() to secs_to_jiffies().
- The 7 patch series "lib/interval_tree: add some test cases and
cleanup" from Wei Yang permits more userspace testing of kernel library
code, adds some more tests and performs some cleanups.
- The 2 patch series "hung_task: Dump the blocking task stacktrace" from
Masami Hiramatsu arranges for the hung_task detector to dump the stack
of the blocking task and not just that of the blocked task.
- The 4 patch series "resource: Split and use DEFINE_RES*() macros" from
Andy Shevchenko provides some cleanups to the resource definition
macros.
- Plus the usual shower of singleton patches - please see the individual
changelogs for details.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+nuqwAKCRDdBJ7gKXxA
jtNqAQDxqJpjWkzn4yN9CNSs1ivVx3fr6SqazlYCrt3u89WQvwEA1oRrGpETzUGq
r6khQUIcQImPPcjFqEFpuiSOU0MBZA0=
=Kii8
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- The series "powerpc/crash: use generic crashkernel reservation" from
Sourabh Jain changes powerpc's kexec code to use more of the generic
layers.
- The series "get_maintainer: report subsystem status separately" from
Vlastimil Babka makes some long-requested improvements to the
get_maintainer output.
- The series "ucount: Simplify refcounting with rcuref_t" from
Sebastian Siewior cleans up and optimizing the refcounting in the
ucount code.
- The series "reboot: support runtime configuration of emergency
hw_protection action" from Ahmad Fatoum improves the ability for a
driver to perform an emergency system shutdown or reboot.
- The series "Converge on using secs_to_jiffies() part two" from Easwar
Hariharan performs further migrations from msecs_to_jiffies() to
secs_to_jiffies().
- The series "lib/interval_tree: add some test cases and cleanup" from
Wei Yang permits more userspace testing of kernel library code, adds
some more tests and performs some cleanups.
- The series "hung_task: Dump the blocking task stacktrace" from Masami
Hiramatsu arranges for the hung_task detector to dump the stack of
the blocking task and not just that of the blocked task.
- The series "resource: Split and use DEFINE_RES*() macros" from Andy
Shevchenko provides some cleanups to the resource definition macros.
- Plus the usual shower of singleton patches - please see the
individual changelogs for details.
* tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
mailmap: consolidate email addresses of Alexander Sverdlin
fs/procfs: fix the comment above proc_pid_wchan()
relay: use kasprintf() instead of fixed buffer formatting
resource: replace open coded variant of DEFINE_RES()
resource: replace open coded variants of DEFINE_RES_*_NAMED()
resource: replace open coded variant of DEFINE_RES_NAMED_DESC()
resource: split DEFINE_RES_NAMED_DESC() out of DEFINE_RES_NAMED()
samples: add hung_task detector mutex blocking sample
hung_task: show the blocker task if the task is hung on mutex
kexec_core: accept unaccepted kexec segments' destination addresses
watchdog/perf: optimize bytes copied and remove manual NUL-termination
lib/interval_tree: fix the comment of interval_tree_span_iter_next_gap()
lib/interval_tree: skip the check before go to the right subtree
lib/interval_tree: add test case for span iteration
lib/interval_tree: add test case for interval_tree_iter_xxx() helpers
lib/rbtree: add random seed
lib/rbtree: split tests
lib/rbtree: enable userland test suite for rbtree related data structure
checkpatch: describe --min-conf-desc-length
scripts/gdb/symbols: determine KASLR offset on s390
...
- Delay exposing thermal zone sysfs interface to prevent user space
from accessing thermal zones that have not been completely
initialized yet (Lucas De Marchi).
- Check a pointer against NULL early in int3402_thermal_probe() to
avoid a potential NULL pointer dereference (Chenyuan Yang).
- Use kcalloc() instead of kzalloc() in some places in the thermal
control subsystem (Lukasz Luba, Ethan Carter Edwards).
- Fix a spelling mistake in a comment in the thermal core (Colin Ian
King).
- Clean up variable initialization in int340x_thermal_zone_add()
(Christophe JAILLET).
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmfhhfYSHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1FYIH/14phhTpJGvK1M8NxAVO2fvzXBvGeB0z
SUUgJHkctopg1qW5cIvPph21vc2ZFcbqYMBb23zy6SzC4ytqQVKVJ6gkGnwypeOu
yqKGpBEQlZTeL2KmI4QdAikNc8AGLDBC8M4aohrkRoQSpjEIVAs9Qbm8tu9yexTN
R/wNih1fg0sLg22aJ5jPa1oCItohtG7z2Zxt9j5m5xKGrv4XdP3c51uetpCyBNcw
J2Cs/7xwP0CFiTH2hOQtyqY+GMLyebe7OuAJtKGx47KEgcr61imcGXezcTSmumlC
l5oHWhSad/FTayx/OCvV8ajf1qPPcb0P88EUF0cuaawj2zKdlGBZzco=
=mhAK
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These include one thermal core fix for an issue leading to a NULL
pointer dereference, a similar fix for the int340x thermal driver
(even though the issue may not actually occur in practice in this
particular case), and a bunch of cleanups, mostly related to replacing
kzalloc() with kcalloc() where applicable.
Summary:
- Delay exposing thermal zone sysfs interface to prevent user space
from accessing thermal zones that have not been completely
initialized yet (Lucas De Marchi)
- Check a pointer against NULL early in int3402_thermal_probe() to
avoid a potential NULL pointer dereference (Chenyuan Yang)
- Use kcalloc() instead of kzalloc() in some places in the thermal
control subsystem (Lukasz Luba, Ethan Carter Edwards)
- Fix a spelling mistake in a comment in the thermal core (Colin Ian
King)
- Clean up variable initialization in int340x_thermal_zone_add()
(Christophe JAILLET)"
* tag 'thermal-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Add NULL check for adev
thermal: core: Delay exposing sysfs interface
thermal: core: Fix spelling mistake "Occurences" -> "Occurrences"
thermal: intel: Clean up zone_trips[] initialization in int340x_thermal_zone_add()
thermal: hisi: Use kcalloc() instead of kzalloc() with multiplication
thermal: int340x: Use kcalloc() instead of kzalloc() with multiplication
thermal: k3_j72xx_bandgap: Use kcalloc() instead of kzalloc()
thermal/of: Use kcalloc() instead of kzalloc() with multiplication
thermal/debugfs: replace kzalloc() with kcalloc() in thermal_debug_tz_add()
The platform device driver data has not been used since commit
7a4ca51b7040 ("thermal/drivers/qcom-spmi: Use devm_iio_channel_get") so
drop the unnecessary assignment.
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250228082936.5694-1-johan+linaro@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The hardware calibration is fused on some, but not all, Gen3 and Gen4
boards. The calibrations values are the same on both generations but
located at different register offsets.
Instead of having duplicated logic to read the and store the values
create structure to hold the register parameters and have a common
function do the reading.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20250305174631.4119374-3-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The style of the driver is to use lowercase hex constants, correct the
few outlines.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20250305174631.4119374-2-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
BCM74110 uses a different process node compared to previous chips that
requires a different equation, account for that.
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250116193842.758788-3-florian.fainelli@broadcom.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Only sensors that are valid need to have their interrupts enable status
updated based on their thresholds. Use the lvts_for_each_valid_sensor()
helper in lvts_update_irq_mask() to ignore invalid sensors.
Currently, since the invalid sensors will always contain zeroed out
thresholds (from kzalloc), they will always get their interrupts
disabled on this loop. So this commit doesn't change the resulting
interrupts configuration, but it slightly optimizes the loop by skipping
the invalid sensors, avoids potential future surprises if at some point
memory is no longer allocated for invalid sensors, as well as makes the
code more obvious.
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-5-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Interrupts are enabled per sensor in lvts_update_irq_mask() as needed,
there's no point in enabling all of them during initialization. Change
the MONINT register initial value so all sensor interrupts start
disabled.
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-4-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In order to get working interrupts, a low offset value needs to be
configured. The minimum value for it is 20 Celsius, which is what is
configured when there's no lower thermal trip (ie the thermal core
passes -INT_MAX as low trip temperature). However, when the temperature
gets that low and fluctuates around that value it causes an interrupt
storm.
Prevent that interrupt storm by not enabling the low offset interrupt if
the low threshold is the minimum one.
Cc: stable@vger.kernel.org
Fixes: 77354eaef821 ("thermal/drivers/mediatek/lvts_thermal: Don't leave threshold zeroed")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-3-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The Stage 3 thermal threshold is currently configured during
the controller initialization to 105 Celsius. From the kernel
perspective, this configuration is harmful because:
* The stage 3 interrupt that gets triggered when the threshold is
crossed is not handled in any way by the IRQ handler, it just gets
cleared. Besides, the temperature used for stage 3 comes from the
sensors, and the critical thermal trip points described in the
Devicetree will already cause a shutdown when crossed (at a lower
temperature, of 100 Celsius, for all SoCs currently using this
driver).
* The only effect of crossing the stage 3 threshold that has been
observed is that it causes the machine to no longer be able to enter
suspend. Even if that was a result of a momentary glitch in the
temperature reading of a sensor (as has been observed on the
MT8192-based Chromebooks).
For those reasons, disable the Stage 3 thermal threshold configuration.
Cc: stable@vger.kernel.org
Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-2-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
When configured in filtered mode, the LVTS thermal controller will
monitor the temperature from the sensors and trigger an interrupt once a
thermal threshold is crossed.
Currently this is true even during suspend and resume. The problem with
that is that when enabling the internal clock of the LVTS controller in
lvts_ctrl_set_enable() during resume, the temperature reading can glitch
and appear much higher than the real one, resulting in a spurious
interrupt getting generated.
Disable the temperature monitoring and give some time for the signals to
stabilize during suspend in order to prevent such spurious interrupts.
Cc: stable@vger.kernel.org
Reported-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Closes: https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/
Fixes: 8137bb90600d ("thermal/drivers/mediatek/lvts_thermal: Add suspend and resume")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20250113-mt8192-lvts-filtered-suspend-fix-v2-1-07a25200c7c6@collabora.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The mapping table for the rk3328 is missing the entry for -25C which is
found in the TRM section 9.5.2 "Temperature-to-code mapping".
NOTE: the kernel uses the tsadc_q_sel=1'b1 mode which is defined as:
4096-<code in table>. Whereas the table in the TRM gives the code
"3774" for -25C, the kernel uses 4096-3774=322.
[Dragan Simic] : "After going through the RK3308 and RK3328 TRMs, as
well as through the downstream kernel code, it seems we may have
some troubles at our hands. Let me explain, please.
To sum it up, part 1 of the RK3308 TRM v1.1 says on page 538 that
the equation for the output when tsadc_q_sel equals 1 is (4096 -
tsadc_q), while part 1 of the RK3328 TRM v1.2 says that the output
equation is (1024 - tsadc_q) in that case.
The downstream kernel code, however, treats the RK3308 and RK3328
tables and their values as being the same. It even mentions 1024 as
the "offset" value in a comment block for the rk_tsadcv3_control()
function, just like the upstream code does, which is obviously wrong
"offset" value when correlated with the table on page 544 of part 1
of the RK3308 TRM v1.1.
With all this in mind, it's obvious that more work is needed to make
it clear where's the actual mistake (it could be that the TRM is
wrong), which I'll volunteer for as part of the SoC binning project.
In the meantime, this patch looks fine as-is to me, by offering
what's a clear improvement to the current state of the upstream
code"
Link: https://opensource.rock-chips.com/images/9/97/Rockchip_RK3328TRM_V1.1-Part1-20170321.pdf
Cc: stable@vger.kernel.org
Fixes: eda519d5f73e ("thermal: rockchip: Support the RK3328 SOC in thermal driver")
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/20250207175048.35959-1-twoerner@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
SoCs without RPM need to enable sensors and calibrate them from the kernel.
The IPQ5332 and IPQ5424 use the tsens v2.3.3 IP and do not have RPM.
Therefore, add a new calibration function for V2, as the tsens.c calib
function only supports V1. Also add new feature_config, ops and data for
IPQ5332, IPQ5424.
Although the TSENS IP supports 16 sensors, not all are used. The hw_id
is used to enable the relevant sensors.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Praveenkumar I <quic_ipkumar@quicinc.com>
Signed-off-by: Manikanta Mylavarapu <quic_mmanikan@quicinc.com>
Link: https://lore.kernel.org/r/20250210120436.821684-3-quic_mmanikan@quicinc.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Enable power-down of TMU (Thermal Management Unit) for TMU version 2 during
system suspend to save power. Save approximately 4.3mW on VDD_ANA_1P8 on
i.MX93 platforms.
Signed-off-by: Alice Guo <alice.guo@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20241209164859.3758906-2-Frank.Li@nxp.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Merge thermal core updates and miscellaneous updates of the thermal
control subsystem for 6.15-rc1:
- Delay exposing thermal zone sysfs interface to prevent user space
from accessing thermal zones that have not been completely
initialized yet (Lucas De Marchi).
- Fix a spelling mistake in a comment in the thermal core (Colin Ian
King).
- Use kcalloc() instead of kzalloc() in some places in the thermal
control subsystem (Lukasz Luba, Ethan Carter Edwards).
- Clean up variable initialization in int340x_thermal_zone_add()
(Christophe JAILLET).
* thermal-core:
thermal: core: Delay exposing sysfs interface
thermal: core: Fix spelling mistake "Occurences" -> "Occurrences"
* thermal-misc:
thermal: intel: Clean up zone_trips[] initialization in int340x_thermal_zone_add()
thermal: hisi: Use kcalloc() instead of kzalloc() with multiplication
thermal: int340x: Use kcalloc() instead of kzalloc() with multiplication
thermal: k3_j72xx_bandgap: Use kcalloc() instead of kzalloc()
thermal/of: Use kcalloc() instead of kzalloc() with multiplication
thermal/debugfs: replace kzalloc() with kcalloc() in thermal_debug_tz_add()
In the general case, we don't know which of system shutdown or reboot is
the better action to take to protect hardware in an emergency situation.
We thus allow the policy to come from the device-tree in the form of an
optional critical-action OF property, but so far there was no way for the
end user to configure this.
With recent addition of the hw_protection parameter, the user can now
choose a default action for the case, where the driver isn't fully sure
what's the better course of action.
Let's make use of this by passing HWPROT_ACT_DEFAULT in absence of the
critical-action OF property.
As HWPROT_ACT_DEFAULT is shutdown by default, this introduces no
functional change for users, unless they start using the new parameter.
Link: https://lkml.kernel.org/r/20250217-hw_protection-reboot-v3-11-e1c09b090c0c@pengutronix.de
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org>
Cc: Benson Leung <bleung@chromium.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Fabio Estevam <festevam@denx.de>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Matteo Croce <teknoraver@meta.com>
Cc: Matti Vaittinen <mazziesaccount@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rob Herring (Arm) <robh@kernel.org>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Not all devices have an ACPI companion fwnode, so adev might be NULL.
This is similar to the commit cd2fd6eab480
("platform/x86: int3472: Check for adev == NULL").
Add a check for adev not being set and return -ENODEV in that case to
avoid a possible NULL pointer deref in int3402_thermal_probe().
Note, under the same directory, int3400_thermal_probe() has such a
check.
Fixes: 77e337c6e23e ("Thermal: introduce INT3402 thermal driver")
Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20250313043611.1212116-1-chenyuan0y@gmail.com
[ rjw: Subject edit, added Fixes: ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There's a race between initializing the governor and userspace accessing
the sysfs interface. From time to time the Intel graphics CI shows this
signature:
<1>[] #PF: error_code(0x0000) - not-present page
<6>[] PGD 0 P4D 0
<4>[] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[] CPU: 3 UID: 0 PID: 562 Comm: thermald Not tainted 6.14.0-rc4-CI_DRM_16208-g7e37396f86d8+ #1
<4>[] Hardware name: Intel Corporation Twin Lake Client Platform/AlderLake-N LP5 RVP, BIOS TWLNFWI1.R00.5222.A01.2405290634 05/29/2024
<4>[] RIP: 0010:policy_show+0x1a/0x40
thermald tries to read the policy file between the sysfs files being
created and the governor set by thermal_set_governor(), which causes the
NULL pointer dereference.
Similarly to the hwmon interface, delay exposing the sysfs files to when
the governor is already set.
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13655
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/20250307-thermal-sysfs-race-v1-1-8a3d4d4ac9c4@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We are going to apply a new series that conflicts with pending
work in x86/mm, so merge in x86/mm to avoid it, and also to
refresh the x86/cpu branch with fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
According to the latest recommendations, kcalloc() should be used
instead of kzalloc() with multiplication (which might overflow).
Switch to this new scheme and use more safe kcalloc().
No functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20250224173432.1946070-5-lukasz.luba@arm.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to the latest recommendations, kcalloc() should be used
instead of kzalloc() with multiplication (which might overflow).
Switch to this new scheme and use more safe kcalloc().
No functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20250224173432.1946070-4-lukasz.luba@arm.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to the latest recommendations, kcalloc() should be used
instead of kzalloc() with multiplication (which might overflow).
Switch to this new scheme and use more safe kcalloc().
No functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20250224173432.1946070-3-lukasz.luba@arm.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to the latest recommendations, kcalloc() should be used
instead of kzalloc() with multiplication (which might overflow).
Switch to this new scheme and use more safe kcalloc().
No functional impact.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20250224173432.1946070-2-lukasz.luba@arm.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Work is under way to get rid of all multiplications from allocation
functions to prevent integer overflows [1].
Here the multiplication is obviously safe, but using kcalloc() is more
appropriate and improves readability.
This change has no effect on runtime behavior.
Link: https://github.com/KSPP/linux/issues/162 [1]
Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com>
Link: https://patch.msgid.link/20250222-thermal_kcalloc-v1-1-9f7a747fbed7@ethancedwards.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
params->total_weight is not initialized during bind and not updated when
the bound cdev changes. The cooling device weight will not be used due
to the uninitialized total_weight, until an update via sysfs is
triggered.
The bound cdevs are updated during thermal zone registration, where each
cooling device will be bound to the thermal zone one by one, but
power_allocator_bind() can be called without an additional cdev update
when manually changing the policy of a thermal zone via sysfs.
Add a new function to handle weight update logic, including updating
total_weight, and call it when bind, weight changes, and cdev updates to
ensure total_weight is always correct.
Fixes: a3cd6db4cc2e ("thermal: gov_power_allocator: Support new update callback of weights")
Signed-off-by: Yu-Che Cheng <giver@chromium.org>
Link: https://patch.msgid.link/20250222-fix-power-allocator-weight-v2-1-a94de86b685a@chromium.org
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since thermal_of_should_bind() terminates the loop after processing
the first child found in cooling-maps, it will never match more than
one cdev to a given trip point which is incorrect, as there may be
cooling-maps associating one trip point with multiple cooling devices.
Address this by letting the loop continue until either all
children have been processed or a matching one has been found.
To avoid adding conditionals or goto statements, put the loop in
question into a separate function and make that function return
right away after finding a matching cooling-maps entry.
Fixes: 94c6110b0b13 ("thermal/of: Use the .should_bind() thermal zone callback")
Link: https://lore.kernel.org/linux-pm/20250219-fix-thermal-of-v1-1-de36e7a590c4@chromium.org/
Reported-by: Yu-Che Cheng <giver@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Yu-Che Cheng <giver@chromium.org>
Tested-by: Yu-Che Cheng <giver@chromium.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/2788228.mvXUDI8C0e@rjwysocki.net
divvy_up_power() should use weighted_req_power instead of req_power to
calculate granted_power. Otherwise, granted_power may be unexpected as
the denominator total_req_power is a weighted sum.
This is a mistake made during the previous refactor.
Replace req_power with weighted_req_power in divvy_up_power()
calculation.
Fixes: 912e97c67cc3 ("thermal: gov_power_allocator: Move memory allocation out of throttle()")
Signed-off-by: Yu-Che Cheng <giver@chromium.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20250219-fix-power-allocator-calc-v1-1-48b860291919@chromium.org
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This CPU was mistakenly given the name INTEL_ATOM_AIRMONT_MID. But it
uses a Silvermont core, not Airmont.
Change #define name to INTEL_ATOM_SILVERMONT_MID2
Reported-by: Christian Ludloff <ludloff@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20241007165701.19693-1-tony.luck%40intel.com
Merge updates of Intel thermal drivers for 6.14:
- Add support for Panther Lake processors in multiple places (Zhang
Rui, Srinivas Pandruvada).
- Remove explicit user_space governor selection from Intel thermal
drivers (Srinivas Pandruvada).
* thermal-intel:
thermal: intel: Fix compile issue when CONFIG_NET is not defined
thermal: intel: int340x: Panther Lake power floor and workload hint support
thermal: intel: int340x: Panther Lake DLVR support
thermal: intel: Remove explicit user_space governor selection
ACPI: DPTF: Support Panther Lake
thermal: intel: int340x: processor: Enable MMIO RAPL for Panther Lake
powercap: intel_rapl: Add support for Panther Lake platform
Rename the 'crossed_up' function argument to 'upward', which is more
proper English and a better match for representing temperature change
direction, everywhere in the code.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/2360961.ElGaqSPkdT@rjwysocki.net
[ rjw: Rebased ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Move the regulation logic description from the bang_bang_trip_crossed()
kerneldoc to the preamble.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/4987649.31r3eYUQgx@rjwysocki.net
[ rjw: Removed a trailing whitespace ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The names of :trip_crossed() callback functions in the Bang-bang and
User-space thermal governors don't match their current purpose any
more after previous changes, so rename them.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/5859084.DvuYhMxLoT@rjwysocki.net
of_thermal_zone_find() calls of_parse_phandle_with_args(), but does not
release the OF node reference obtained by it.
Add a of_node_put() call when the call is successful.
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241224031809.950461-1-joe@pf.is.s.u-tokyo.ac.jp
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If CONFIG_NET is not defined then THERMAL_NETLINK can't be selected.
Hence add dependency on CONFIG_NET. Othewise it will generate compile
errors while compiling thermal_netlink.c.
Fixes: 4596cbea0ed2 ("thermal: intel: Remove explicit user_space governor selection")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20241218214444.1904650-1-srinivas.pandruvada@linux.intel.com
[ rjw: Merge the "depends on" lines ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Panther Lake follows same register set as Lunar Lake. Enable feature
flags to support workload hints and power floor status.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20241216211810.1207028-2-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Panther Lake follows same register set as Lunar Lake for DLVR. Enable
feature flag to support DLVR.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/thermal: intel: int340x: Panther Lake DLVR support
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently some user space programs like Linux thermald needs to register
to get notifications from both thermal user space governor and also
Thermal netlink. This is required as some messages like HFI (Hardware
Feedback Notifications) requires Thermal netlink.
This results in additional processing in kernel and user space to process
both notifications. The cost of using user space governor using
kobject_uevent is much higher as this is also used by other user space
daemons like udev daemon.
Do not select user_space thermal governor by default. If it is present
user space programs can always use this governor by writing to
"policy" attribute.
Instead from the kernel select THERMAL_NETLINK. Trip temperature
violation can be received by user space programs via thermal netlink
events:
THERMAL_GENL_EVENT_TZ_TRIP_UP
THERMAL_GENL_EVENT_TZ_TRIP_DOWN
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20241216190821.1137162-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The current implementation does not work if the thermal zone is
interrupt driven only.
The boundaries are not correctly checked and computed as it happens
only when the temperature is increasing or decreasing.
The problem arises because the routine to detect when we cross a
threshold is correlated with the computation of the boundaries. We
assume we have to recompute the boundaries when a threshold is crossed
but actually we should do that even if the it is not the case.
Mixing the boundaries computation and the threshold detection for the
sake of optimizing the routine is much more complex as it appears
intuitively and prone to errors.
This fix separates the boundaries computation and the threshold
crossing detection into different routines. The result is a code much
more simple to understand, thus easier to maintain.
The drawback is we browse the thresholds list several time but we can
consider that as neglictible because that happens when the temperature
is updated. There are certainly some aeras to improve in the
temperature update routine but it would be not adequate as this change
aims to fix the thresholds for v6.13.
Fixes: 445936f9e258 ("thermal: core: Add user thresholds support")
Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> # rock5b, Lenovo x13s
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/20241216212644.1145122-1-daniel.lezcano@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Add a NULL pointer check that was missed by recent modifications of
the Power Allocator thermal governor (Rafael Wysocki).
- Remove the data_vault attribute_group from int3400 because it is only
used for exposing one binary file that can be exposed directly (Thomas
Weißschuh).
- Prevent the current_uuid sysfs attribute in int3400 from mistakenly
treating valid UUID values as invalid on some older systems (Srinivas
Pandruvada).
- Use the cleanup.h mechanics to simplify DT data parsing in the thermal
core and some drivers (Krzysztof Kozlowski).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmdHY1ASHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxg6gP/2Yrv1tBY2rjGgjd7zm4SYMsC/z72aXi
hiNfvA4d735t1HZ33DifoZa4ukQhlxYvHmEhv3QTx/nNULTTKyBlw/B7zgDwieY6
4zOYf944Yl7D3ckUbbQcHIbrwJnDpv7HKJEwlKMbBCeZuzoJc/6S256QlUrQn6VT
LXWtUHsjntsH2g3wkz2Ju4wTd4QdekIhNnfIxGri6WHsnPHJEEVY7gojvkZQdoRt
gmuZlAAg09VTjzRrBktyRgFBj3Uh+4GBlP5MU+Ao2h6L2c6W38u9CInT2bE0nYRD
//zJnAVVPxv0VHcY3BSlZt1y/rNr9f6+ndunoTnmp1yWzVrVB0XRpKDPIbw2E2wq
7WJr2J3ESDsr3VVGGHo08zeU33yoxMUd21Y4aABW09FA5zxGgOW17eCCmT1zbSQa
JANpc8dFD2NSiwJMcMrKI5SOynlhGbuhnIin3vx78irJysZy9AvPbh2JJp73fqfD
rWXmgYiXo2h+Hcmu87gyu8yDIfwuVwdBpZJek6P6h0Z3eQlOkD0B2OrRYI6X0Si9
ukIarNvp5eAbjaSRB7d4u33puFhASQ2RF5YwX1Ll0Dx87f16nRmrYPunRCTEKp97
UjhAPTUIoArVq3Y+I6qmieJDFzPw6YoOUnD0vQQjRC/UsbjWH1wYhUV+BtKWJnxM
Fie/qwzNs+6B
=9Wp9
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.13-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These fix a Power Allocator thermal governor issue reported recently,
update the Intel int3400 thermal driver and simplify DT data parsing
in the thermal control subsystem:
- Add a NULL pointer check that was missed by recent modifications of
the Power Allocator thermal governor (Rafael Wysocki)
- Remove the data_vault attribute_group from int3400 because it is
only used for exposing one binary file that can be exposed directly
(Thomas Weißschuh)
- Prevent the current_uuid sysfs attribute in int3400 from mistakenly
treating valid UUID values as invalid on some older systems
(Srinivas Pandruvada)
- Use the cleanup.h mechanics to simplify DT data parsing in the
thermal core and some drivers (Krzysztof Kozlowski)"
* tag 'thermal-6.13-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: sun8i: Use scoped device node handling to simplify error paths
thermal: tegra: Simplify with scoped for each OF child loop
thermal: qcom-spmi-adc-tm5: Simplify with scoped for each OF child loop
thermal: of: Use scoped device node handling to simplify of_thermal_zone_find()
thermal: of: Use scoped memory and OF handling to simplify thermal_of_trips_init()
thermal: of: Simplify thermal_of_should_bind with scoped for each OF child
thermal: gov_power_allocator: Add missing NULL pointer check
thermal: int3400: Remove unneeded data_vault attribute_group
thermal: int3400: Fix reading of current_uuid for active policy
Merge updates of Intel int3400 thermal driver for 6.13-rc1:
- Remove the data_vault attribute_group from int3400 because it is only
used for exposing one binary file that can be exposed directly (Thomas
Weißschuh).
- Prevent the current_uuid sysfs attribute in int3400 from mistakenly
treating valid UUID values as invalid on some older systems (Srinivas
Pandruvada).
* thermal-intel:
thermal: int3400: Remove unneeded data_vault attribute_group
thermal: int3400: Fix reading of current_uuid for active policy
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmdE14wUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxMPRAAslaEhHZ06cU/I+BA0UrMJBbzOw+/
XM2XUojxWaNMYSBPVXbtSBrfFMnox4G3hFBPK0T0HiWoc7wGx/TUVJk65ioqM8ug
gS/U3NjSlqlnH8NHxKrb/2t0tlMvSll9WwumOD9pMFeMGFOS3fAgUk+fBqXFYsI/
RsVRMavW9BucZ0yMHpgr0KGLPSt3HK/E1h0NLO+TN6dpFcoIq3XimKFyk1QQQgiR
V3W21JMwjw+lDnUAsijU+RBYi5Fj6Rpqig/biRnzagVE6PJOci3ZJEBE7dGqm4LM
UlgG6Ql/eK+bb3fPhcXxVmscj5XlEfbesX5PUzTmuj79Wq5l9hpy+0c654G79y8b
rGiEVGM0NxmRdbuhWQUM2EsffqFlkFu7MN3gH0tP0Z0t3VTXfBcGrQJfqCcSCZG3
5IwGdEE2kmGb5c3RApZrm+HCXdxhb3Nwc3P8c27eXDT4eqHWDJag4hzLETNBdIrn
Rsbgry6zzAVA6lLT0uasUlWerq/I6OrueJvnEKRGKDtbw/JL6PLveR1Rvsc//cQD
Tu4FcG81bldQTUOdHEgFyJgmSu77Gvfs5RZBV0cEtcCBc33uGJne08kOdGD4BwWJ
dqN3wJFh5yX4jlMGmBDw0KmFIwKstfUCIoDE4Kjtal02CURhz5ZCDVGNPnSUKN0C
hflVX0//cRkHc5g=
=2Otz
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Make pci_stop_dev() and pci_destroy_dev() safe so concurrent
callers can't stop a device multiple times, even as we migrate from
the global pci_rescan_remove_lock to finer-grained locking (Keith
Busch)
- Improve pci_walk_bus() implementation by making it recursive and
moving locking up to avoid need for a 'locked' parameter (Keith
Busch)
- Unexport pci_walk_bus_locked(), which is only used internally by
the PCI core (Keith Busch)
- Detect some Thunderbolt chips that are built-in and hence
'trustworthy' by a heuristic since the 'ExternalFacingPort' and
'usb4-host-interface' ACPI properties are not quite enough (Esther
Shimanovich)
Resource management:
- Use PCI bus addresses (not CPU addresses) in 'ranges' properties
when building dynamic DT nodes so systems where PCI and CPU
addresses differ work correctly (Andrea della Porta)
- Tidy resource sizing and assignment with helpers to reduce
redundancy (Ilpo Järvinen)
- Improve pdev_sort_resources() 'bogus alignment' warning to be more
specific (Ilpo Järvinen)
Driver binding:
- Convert driver .remove_new() callbacks to .remove() again to finish
the conversion from returning 'int' to being 'void' (Sergio
Paracuellos)
- Export pcim_request_all_regions(), a managed interface to request
all BARs (Philipp Stanner)
- Replace pcim_iomap_regions_request_all() with
pcim_request_all_regions(), and pcim_iomap_table()[n] with
pcim_iomap(n), in the following drivers: ahci, crypto qat, crypto
octeontx2, intel_th, iwlwifi, ntb idt, serial rp2, ALSA korg1212
(Philipp Stanner)
- Remove the now unused pcim_iomap_regions_request_all() (Philipp
Stanner)
- Export pcim_iounmap_region(), a managed interface to unmap and
release a PCI BAR (Philipp Stanner)
- Replace pcim_iomap_regions(mask) with pcim_iomap_region(n), and
pcim_iounmap_regions(mask) with pcim_iounmap_region(n), in the
following drivers: fpga dfl-pci, block mtip32xx, gpio-merrifield,
cavium (Philipp Stanner)
Error handling:
- Add sysfs 'reset_subordinate' to reset the entire hierarchy below a
bridge; previously Secondary Bus Reset could only be used when
there was a single device below a bridge (Keith Busch)
- Warn if we reset a running device where the driver didn't register
pci_error_handlers notification callbacks (Keith Busch)
ASPM:
- Disable ASPM L1 before touching L1 PM Substates to follow the spec
closer and avoid a CPU load timeout on some platforms (Ajay
Agarwal)
- Set devices below Intel VMD to D0 before enabling ASPM L1 Substates
as required per spec for all L1 Substates changes (Jian-Hong Pan)
Power management:
- Enable starfive controller runtime PM before probing host bridge
(Mayank Rana)
- Enable runtime power management for host bridges (Krishna chaitanya
chundru)
Power control:
- Use of_platform_device_create() instead of of_platform_populate()
to create pwrctl platform devices so we can control it based on the
child nodes (Manivannan Sadhasivam)
- Create pwrctrl platform devices only if there's a relevant power
supply property (Manivannan Sadhasivam)
- Add device link from the pwrctl supplier to the PCI dev to ensure
pwrctl drivers are probed before the PCI dev driver; this avoids a
race where pwrctl could change device power state while the PCI
driver was active (Manivannan Sadhasivam)
- Find pwrctl device for removal with of_find_device_by_node()
instead of searching all children of the parent (Manivannan
Sadhasivam)
- Rename 'pwrctl' to 'pwrctrl' to match new bandwidth controller
('bwctrl') and hotplug files (Bjorn Helgaas)
Bandwidth control:
- Add read/modify/write locking for Link Control 2, which is used to
manage Link speed (Ilpo Järvinen)
- Extract Link Bandwidth Management Status check into
pcie_lbms_seen(), where it can be shared between the bandwidth
controller and quirks that use it to help retrain failed links
(Ilpo Järvinen)
- Re-add Link Bandwidth notification support with updates to address
the reasons it was previously reverted (Alexandru Gagniuc, Ilpo
Järvinen)
- Add pcie_set_target_speed() and related functionality so drivers
can manage PCIe Link speed based on thermal or other constraints
(Ilpo Järvinen)
- Add a thermal cooling driver to throttle PCIe Links via the
existing thermal management framework (Ilpo Järvinen)
- Add a userspace selftest for the PCIe bandwidth controller (Ilpo
Järvinen)
PCI device hotplug:
- Add hotplug controller driver for Marvell OCTEON multi-function
device where function 0 has a management console interface to
enable/disable and provision various personalities for the other
functions (Shijith Thotton)
- Retain a reference to the pci_bus for the lifetime of a pci_slot to
avoid a use-after-free when the thunderbolt driver resets USB4 host
routers on boot, causing hotplug remove/add of downstream docks or
other devices (Lukas Wunner)
- Remove unused cpcihp struct cpci_hp_controller_ops.hardware_test
(Guilherme Giacomo Simoes)
- Remove unused cpqphp struct ctrl_dbg.ctrl (Christophe JAILLET)
- Use pci_bus_read_dev_vendor_id() instead of hand-coded presence
detection in cpqphp (Ilpo Järvinen)
- Simplify cpqphp enumeration, which is already simple-minded and
doesn't handle devices below hot-added bridges (Ilpo Järvinen)
Virtualization:
- Add ACS quirk for Wangxun FF5xxx NICs, which don't advertise an ACS
capability but do isolate functions as though PCI_ACS_RR and
PCI_ACS_CR were set, so the functions can be in independent IOMMU
groups (Mengyuan Lou)
TLP Processing Hints (TPH):
- Add and document TLP Processing Hints (TPH) support so drivers can
enable and disable TPH and the kernel can save/restore TPH
configuration (Wei Huang)
- Add TPH Steering Tag support so drivers can retrieve Steering Tag
values associated with specific CPUs via an ACPI _DSM to improve
performance by directing DMA writes closer to their consumers (Wei
Huang)
Data Object Exchange (DOE):
- Wait up to 1 second for DOE Busy bit to clear before writing a
request to the mailbox to avoid failures if the mailbox is still
busy from a previous transfer (Gregory Price)
Endpoint framework:
- Skip attempts to allocate from endpoint controller memory window if
the requested size is larger than the window (Damien Le Moal)
- Add and document pci_epc_mem_map() and pci_epc_mem_unmap() to
handle controller-specific size and alignment constraints, and add
test cases to the endpoint test driver (Damien Le Moal)
- Implement dwc pci_epc_ops.align_addr() so pci_epc_mem_map() can
observe DWC-specific alignment requirements (Damien Le Moal)
- Synchronously cancel command handler work in endpoint test before
cleaning up DMA and BARs (Damien Le Moal)
- Respect endpoint page size in dw_pcie_ep_align_addr() (Niklas
Cassel)
- Use dw_pcie_ep_align_addr() in dw_pcie_ep_raise_msi_irq() and
dw_pcie_ep_raise_msix_irq() instead of open coding the equivalent
(Niklas Cassel)
- Avoid NULL dereference if Modem Host Interface Endpoint lacks
'mmio' DT property (Zhongqiu Han)
- Release PCI domain ID of Endpoint controller parent (not controller
itself) and before unregistering the controller, to avoid
use-after-free (Zijun Hu)
- Clear secondary (not primary) EPC in pci_epc_remove_epf() when
removing the secondary controller associated with an NTB (Zijun Hu)
Cadence PCIe controller driver:
- Lower severity of 'phy-names' message (Bartosz Wawrzyniak)
Freescale i.MX6 PCIe controller driver:
- Fix suspend/resume support on i.MX6QDL, which has a hardware
erratum that prevents use of L2 (Stefan Eichenberger)
Intel VMD host bridge driver:
- Add 0xb60b and 0xb06f Device IDs for client SKUs (Nirmal Patel)
MediaTek PCIe Gen3 controller driver:
- Update mediatek-gen3 DT binding to require the exact number of
clocks for each SoC (Fei Shao)
- Add support for DT 'max-link-speed' and 'num-lanes' properties to
restrict the link speed and width (AngeloGioacchino Del Regno)
Microchip PolarFlare PCIe controller driver:
- Add DT and driver support for using either of the two PolarFire
Root Ports (Conor Dooley)
NVIDIA Tegra194 PCIe controller driver:
- Move endpoint controller cleanups that depend on refclk from the
host to the notifier that tells us the host has deasserted PERST#,
when refclk should be valid (Manivannan Sadhasivam)
Qualcomm PCIe controller driver:
- Add qcom SAR2130P DT binding with an additional clock (Dmitry
Baryshkov)
- Enable MSI interrupts if 'global' IRQ is supported, since a
previous commit unintentionally masked them (Manivannan Sadhasivam)
- Move endpoint controller cleanups that depend on refclk from the
host to the notifier that tells us the host has deasserted PERST#,
when refclk should be valid (Manivannan Sadhasivam)
- Add DT binding and driver support for IPQ9574, with Synopsys IP
v5.80a and Qcom IP 1.27.0 (devi priya)
- Move the OPP "operating-points-v2" table from the
qcom,pcie-sm8450.yaml DT binding to qcom,pcie-common.yaml, where it
can be used by other Qcom platforms (Qiang Yu)
- Add 'global' SPI interrupt for events like link-up, link-down to
qcom,pcie-x1e80100 DT binding so we can start enumeration when the
link comes up (Qiang Yu)
- Disable ASPM L0s for qcom,pcie-x1e80100 since the PHY is not tuned
to support this (Qiang Yu)
- Add ops_1_21_0 for SC8280X family SoC, which doesn't use the
'iommu-map' DT property and doesn't need BDF-to-SID translation
(Qiang Yu)
Rockchip PCIe controller driver:
- Define ROCKCHIP_PCIE_AT_SIZE_ALIGN to replace magic 256 endpoint
.align value (Damien Le Moal)
- When unmapping an endpoint window, compute the region index instead
of searching for it, and verify that the address was mapped (Damien
Le Moal)
- When mapping an endpoint window, verify that the address hasn't
been mapped already (Damien Le Moal)
- Implement pci_epc_ops.align_addr() for rockchip-ep (Damien Le Moal)
- Fix MSI IRQ data mapping to observe the alignment constraint, which
fixes intermittent page faults in memcpy_toio() and memcpy_fromio()
(Damien Le Moal)
- Rename rockchip_pcie_parse_ep_dt() to
rockchip_pcie_ep_get_resources() for consistency with similar DT
interfaces (Damien Le Moal)
- Skip the unnecessary link train in rockchip_pcie_ep_probe() and do
it only in the endpoint start operation (Damien Le Moal)
- Implement pci_epc_ops.stop_link() to disable link training and
controller configuration (Damien Le Moal)
- Attempt link training at 5 GT/s when both partners support it
(Damien Le Moal)
- Add a handler for PERST# signal so we can detect host-initiated
resets and start link training after PERST# is deasserted (Damien
Le Moal)
Synopsys DesignWare PCIe controller driver:
- Clear outbound address on unmap so dw_pcie_find_index() won't match
an ATU index that was already unmapped (Damien Le Moal)
- Use of_property_present() instead of of_property_read_bool() when
testing for presence of non-boolean DT properties (Rob Herring)
- Advertise 1MB size if endpoint supports Resizable BARs, which was
inadvertently lost in v6.11 (Niklas Cassel)
TI J721E PCIe driver:
- Add PCIe support for J722S SoC (Siddharth Vadapalli)
- Delay PCIE_T_PVPERL_MS (100 ms), not just PCIE_T_PERST_CLK_US (100
us), before deasserting PERST# to ensure power and refclk are
stable (Siddharth Vadapalli)
TI Keystone PCIe controller driver:
- Set the 'ti,keystone-pcie' mode so v3.65a devices work in Root
Complex mode (Kishon Vijay Abraham I)
- Try to avoid unrecoverable SError for attempts to issue config
transactions when the link is down; this is racy but the best we
can do (Kishon Vijay Abraham I)
Miscellaneous:
- Reorganize kerneldoc parameter names to match order in function
signature (Julia Lawall)
- Fix sysfs reset_method_store() memory leak (Todd Kjos)
- Simplify pci_create_slot() (Ilpo Järvinen)
- Fix incorrect printf format specifiers in pcitest (Luo Yifan)"
* tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (127 commits)
PCI: rockchip-ep: Handle PERST# signal in EP mode
PCI: rockchip-ep: Improve link training
PCI: rockship-ep: Implement the pci_epc_ops::stop_link() operation
PCI: rockchip-ep: Refactor endpoint link training enable
PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() MSI-X hiding
PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations
PCI: rockchip-ep: Rename rockchip_pcie_parse_ep_dt()
PCI: rockchip-ep: Fix MSI IRQ data mapping
PCI: rockchip-ep: Implement the pci_epc_ops::align_addr() operation
PCI: rockchip-ep: Improve rockchip_pcie_ep_map_addr()
PCI: rockchip-ep: Improve rockchip_pcie_ep_unmap_addr()
PCI: rockchip-ep: Use a macro to define EP controller .align feature
PCI: rockchip-ep: Fix address translation unit programming
PCI/pwrctrl: Rename pwrctrl functions and structures
PCI/pwrctrl: Rename pwrctl files to pwrctrl
PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent
PCI/pwrctl: Ensure that pwrctl drivers are probed before PCI client drivers
PCI/pwrctl: Create pwrctl device only if at least one power supply is present
PCI/pwrctl: Use of_platform_device_create() to create pwrctl devices
tools: PCI: Fix incorrect printf format specifiers
...