Appendix · reference

RK3399 TSADC

On-die thermal sensor: driver in tree, hardware TSHUT disarmed as the safe default.

TSADC is the RK3399’s on-die thermal sensor. Two channels (CPU and GPU), each driving a temperature register and a hardware shutdown (“TSHUT”) output that can asynchronously gate SoC power if a trip point is exceeded. Powerful, dangerous, and now driven by rk_tsadc(4).

Identity

Property                  Value
Block                     RK3399 TSADC
MMIO base                 0xff260000
Channels                  2 (CPU, GPU)
Hardware shutdown output  TSHUT, polarity programmable
DTS compatible            rockchip,rk3399-tsadc
Linux driver              drivers/thermal/rockchip_thermal.c

Driver

Status: ◐ partial. The driver lives at src/sys/arm64/rockchip/rk_tsadc.c. The DTS node is now status = "okay" with rockchip,hw-tshut-polarity = <0> (LOW_ACTIVE) and rockchip,hw-tshut-temp = <95000>. The driver and the DTS hints agree, but by default the driver leaves steady-state comparator IRQs and the asynchronous hardware shutdown GPIO source disarmed. Temperature sysctls are trustworthy enough for debug/status UI, and the soft thermal policy is enabled by default. The policy samples TSADC from a low-rate callout, runs the work on taskqueue_thread, caps the GPU through Panfrost’s dev.panfrost.0.gpu_max_auto_mhz, and caps the CPU through dev.cpu.0.freq / dev.cpu.4.freq. The phone image starts phone_thermal_guard, which yields powerd while the policy is warm/hot/critical so adaptive frequency scaling cannot immediately race those soft caps away.

Validation state as of 2026-05-07: sensor readout, the soft governor, and guarded comparator IRQs are fixed; hardware TSHUT remains open. The live readout falsifier was a hot phone with MPU-6500 die temperature near 70 °C while TSADC reported single-digit °C. Comparing against Linux showed the missing transform: RK3399 Linux enables TSADCV3_AUTO_Q_SEL_EN, so the reported code is 1024 - tsadc_q. Commit ede7969 ("rk_tsadc: derive RK3399 q-select code in software") first kept the known-working native-q hardware mode and applied that transform in software. Later comparator bench work proved the hardware mode was the right final direction: commit e4017e7 ("thermal: enable tsadc hardware qsel") enables hardware Q-select by default, so last_raw_ch* and last_code_ch* now both reflect the data register’s Linux-style code. Final receipt on kernel #195:

dev.rk_tsadc.0.cpu_temp: 56666
dev.rk_tsadc.0.gpu_temp: 60625
dev.rk_tsadc.0.last_raw_ch0: 567
dev.rk_tsadc.0.last_raw_ch1: 574
dev.rk_tsadc.0.last_code_ch0: 567
dev.rk_tsadc.0.last_code_ch1: 574
dev.rk_tsadc.0.q_sel_enabled: 1
dev.rk_tsadc.0.auto_con: 51
dev.rk_tsadc.0.tshut_enabled: 0
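
The receipt can be checked by hand. The following is a minimal Python sketch, not the driver's code. It assumes the table below matches Linux's rk3399_code_table (reproduced from memory here, so verify it against drivers/thermal/rockchip_thermal.c before relying on it) and that conversion is linear interpolation between neighbouring entries with integer math. The function names are illustrative.

```python
# Code-to-temperature pairs (code, milli-Celsius).
# Assumption: recalled from Linux's rk3399_code_table; verify before reuse.
RK3399_CODE_TABLE = [
    (402, -40000), (410, -35000), (419, -30000), (427, -25000),
    (436, -20000), (444, -15000), (453, -10000), (461, -5000),
    (470, 0), (478, 5000), (487, 10000), (496, 15000),
    (504, 20000), (513, 25000), (521, 30000), (530, 35000),
    (538, 40000), (547, 45000), (555, 50000), (564, 55000),
    (573, 60000), (581, 65000), (590, 70000), (599, 75000),
    (607, 80000), (616, 85000), (624, 90000), (633, 95000),
    (642, 100000), (650, 105000), (659, 110000), (668, 115000),
    (677, 120000), (685, 125000),
]


def q_to_code(raw_q):
    """Software equivalent of TSADCV3_AUTO_Q_SEL_EN: code = 1024 - q."""
    return 1024 - raw_q


def code_to_millicelsius(code):
    """Interpolate linearly between table entries, clamping at both ends."""
    lo_code, lo_mc = RK3399_CODE_TABLE[0]
    if code <= lo_code:
        return lo_mc
    for hi_code, hi_mc in RK3399_CODE_TABLE[1:]:
        if code <= hi_code:
            span = hi_code - lo_code
            return lo_mc + (hi_mc - lo_mc) * (code - lo_code) // span
        lo_code, lo_mc = hi_code, hi_mc
    return lo_mc  # above the last entry: clamp to the hottest value
```

Under those assumptions, codes 567 and 574 map to 56666 and 60625 mC, which matches cpu_temp and gpu_temp in the receipt. A native q sample of 466 read directly against the table lands below 0 °C, while its transformed code 558 lands near 51.7 °C.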

During the software q-select phase, commit e1f73c5 ("rk_tsadc: disarm IRQs in software q-select mode") disarmed comparator IRQs. Commit 4b97aca ("rk_tsadc: add guarded comparator IRQ bench knobs") added guarded comparator bench tunables and sysctls so the threshold registers can be inspected without making IRQs part of the shipping default.

The soft-governor path landed after the comparator work. It has three policy levels above normal: warm caps GPU max-auto to 400 MHz and both CPU clusters to 1200 MHz; hot caps GPU to 297 MHz and CPU to 816 MHz; critical caps GPU to 200 MHz and CPU to 408 MHz. Kernel #175 validated the full guarded path under glmark2: warm caps engaged near GPU 72.777 °C, phone_thermal_guard stopped powerd, and cooldown restored powerd plus the previous GPU max-auto posture.
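
The three policy levels can be summarised as a lookup. This is a hedged Python sketch of the policy shape only. The cap values come from the paragraph above and the level numbers 1 (normal) and 2 (warm) come from the soft-test receipt; numbering hot and critical as 3 and 4, the threshold values, and the choice to act on the hotter of the two channels are all assumptions made for illustration. It has no hysteresis, which a real governor would likely need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caps:
    gpu_max_mhz: int
    cpu_max_mhz: int


# 1 and 2 follow the soft-test receipt; 3 and 4 are assumed to continue it.
NORMAL, WARM, HOT, CRITICAL = 1, 2, 3, 4

# Cap values as documented for each level.
CAPS = {
    WARM: Caps(gpu_max_mhz=400, cpu_max_mhz=1200),
    HOT: Caps(gpu_max_mhz=297, cpu_max_mhz=816),
    CRITICAL: Caps(gpu_max_mhz=200, cpu_max_mhz=408),
}

# Hypothetical thresholds in milli-Celsius; the real defaults are not stated.
THRESHOLDS_MC = {WARM: 72000, HOT: 80000, CRITICAL: 90000}


def policy_level(cpu_mc, gpu_mc, thresholds=THRESHOLDS_MC):
    """Pick the highest level whose threshold the hotter channel has reached."""
    hottest = max(cpu_mc, gpu_mc)
    level = NORMAL
    for candidate in (WARM, HOT, CRITICAL):
        if hottest >= thresholds[candidate]:
            level = candidate
    return level


def caps_for(level):
    """Return the caps for a level, or None when no soft cap applies."""
    return CAPS.get(level)
```

Passing a lowered warm threshold to policy_level reproduces the idea behind the soft-test: the level rises to warm without the phone actually heating up.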

Kernel #191 tried to move CPU throttling out of userland sysctl writes and into CPUFREQ_SET(..., CPUFREQ_PRIO_KERN). That boot reached root mount, then the eMMC controller started returning sdhci_rockchip0-slot0: Controller timeout and g_vfs_done():ufs/FreeBSD_Install[...] error = 5. Restoring kernel.prev (#190) booted cleanly with the same eMMC, so the cpufreq-priority attempt is backed out until it can be isolated behind a recovery tunable.

The non-destructive regression receipt is now mise run thermal:phone -- soft-test. On 2026-05-06 it forced the warm threshold below the current temperature and observed policy_level=2 (warm), policy_gpu_cap_mhz=400, policy_cpu_cap_mhz=1200, CPU frequencies at or below the cap, and powerd=stopped. It then restored the default thresholds and observed policy_level=1 (normal), GPU max-auto 600, both CPU clusters back at their adaptive ceiling, and policy_error_count=0.

The comparator path closed in two steps. Kernel #193 added mise run thermal:phone -- irq-bench, a runtime arm/disarm harness that programs one channel, clears pending state, enables only that comparator bit, and auto-disarms on the first matching IRQ. In the older software q-select mode, code-domain thresholds still did not fire and native-q thresholds had the wrong high-temperature sense. Kernel #195 then enabled TSADCV3_AUTO_Q_SEL_EN and the same bench passed the real predicate: code-domain nofire/fire cases worked on both channels, native-q cases stayed as the negative control, and the final state was INT_EN=0, irq_enabled=0, tshut_enabled=0, irq_bench_last_error=0. That proves guarded comparator IRQ semantics. It does not prove the asynchronous TSHUT output, which remains a separate recovery-attached boot test.
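
The difference between the two q-select modes can be illustrated with a toy model. This Python sketch assumes the comparator trips when the data register is greater than or equal to the programmed threshold; that model, the sample codes, and the function names are illustrative assumptions, not taken from the driver or the TRM.

```python
def data_register(code, hw_qsel):
    """Value the data register holds for a given table code.

    With hardware Q-select the register holds the code itself; without it
    the register holds the native q sample, q = 1024 - code.
    """
    return code if hw_qsel else 1024 - code


def comparator_fires(data, threshold):
    """Assumed comparator model: trip when data >= threshold."""
    return data >= threshold


def bench(code_now, trip_code, hw_qsel, native_q_threshold=False):
    """Return True if the comparator would fire for this bench case."""
    threshold = 1024 - trip_code if native_q_threshold else trip_code
    return comparator_fires(data_register(code_now, hw_qsel), threshold)


# Hypothetical codes: higher code means hotter.
COOL_CODE, HOT_CODE, TRIP_CODE = 560, 620, 600
```

In this model, hardware Q-select gives the expected nofire/fire pair for a code-domain threshold. Without it, a code-domain threshold never fires because the register holds a much smaller q value, and a native-q threshold fires when the die is cool rather than hot, which matches the inverted sense described above.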

Why the node was disabled: enabling the device with the original polarity caused immediate false shutdowns. The TSHUT GPIO asserts at boot, the SoC reads that as “die is on fire,” and gates power. Same failure mode as the PineTab2 TSADC reboot loop documented in the Hardware reference under PineTab2 SoC and memory.

What the driver does (mirrors rk_tsadcv3_initialize / rk_tsadcv3_control from drivers/thermal/rockchip_thermal.c):

  1. Hybrid RK3399 register block — Linux’s rk3399_tsadc_data uses rk_tsadcv3_initialize for setup, but v2 callbacks for data, alarm, and shutdown registers: TSADCV2_DATA(chn) = 0x20 + chn*4, TSADCV2_COMP_INT(chn) = 0x30 + chn*4, and TSADCV2_COMP_SHUT(chn) = 0x40 + chn*4. AUTO_PERIOD addresses also stay at the v2 location (0x68 / 0x6c).
  2. TSADCV3_AUTO_PERIOD = TSADCV3_AUTO_PERIOD_HT = 1875 (≈2.5 ms cadence).
  3. Two channels (chn=0 CPU, chn=1 GPU). The driver calculates INT and SHUT trip registers from rk3399_code_table, but leaves INT_EN and TSHUT sources disarmed unless loader tunables explicitly opt in.
  4. TSHUT_MODE_GPIO, TSHUT_LOW_ACTIVE (the TSADCV2_AUTO_TSHUT_POLARITY_HIGH bit is left clear). The shutdown source itself is opt-in through hw.rk_tsadc.enable_tshut=1.
  5. Sysctls dev.rk_tsadc.0.cpu_temp and dev.rk_tsadc.0.gpu_temp read the v2 data register for each channel. With the default hw.rk_tsadc.enable_qsel=1, AUTO_CON includes TSADCV3_AUTO_Q_SEL_EN and the data register already contains Linux-style table codes, so last_raw_ch* and last_code_ch* match. With QSEL disabled for diagnosis, the driver exposes the native q sample as last_raw_ch* and computes last_code_ch* = 1024 - raw. The final code back-translates via rk3399_code_table to milli-Celsius.
  6. dev.rk_tsadc.0.tshut_threshold_mc reports the configured shutdown trip, trim_offset_mc remains writable but is 0 on the phone, and tshut_enabled reports whether the hardware shutdown GPIO source is armed.
  7. IRQ handler can log threshold crossings and ack TSADCV2_INT_PD, but threshold sources are not enabled in the default kernel. The bench path exposes irq_enabled, irq_count, last_int_pd, int_en, auto_con, comp_int_ch*, and comp_shut_ch*; a storm guard disarms COMP_INT after repeated IRQs.
  8. The software policy throttles without IRQs: GPU caps are reversible Panfrost max-auto writes, CPU caps are reversible dev.cpu.N.freq writes, and phone_thermal_guard yields powerd while caps are active.
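
The register layout and the AUTO_CON receipt value can be cross-checked with a short Python sketch. The offsets are the ones quoted in item 1. The AUTO_CON bit positions are recalled from Linux's rockchip_thermal.c and the 750 kHz TSADC clock from the upstream RK3399 device tree; treat both as assumptions to verify. The helper names are illustrative.

```python
# Register offsets from item 1 of the list above.
def tsadcv2_data(chn):
    return 0x20 + chn * 4


def tsadcv2_comp_int(chn):
    return 0x30 + chn * 4


def tsadcv2_comp_shut(chn):
    return 0x40 + chn * 4


AUTO_PERIOD = 0x68
AUTO_PERIOD_HT = 0x6C

# AUTO_CON bits (assumption: positions as in Linux's rockchip_thermal.c).
AUTO_EN = 1 << 0
AUTO_Q_SEL_EN = 1 << 1
AUTO_TSHUT_POLARITY_HIGH = 1 << 8


def auto_src_en(chn):
    return 1 << (4 + chn)


def decode_auto_con(value, channels=2):
    """Split an AUTO_CON value into the fields this page talks about."""
    return {
        "auto_en": bool(value & AUTO_EN),
        "q_sel_en": bool(value & AUTO_Q_SEL_EN),
        "src_en": [bool(value & auto_src_en(c)) for c in range(channels)],
        "tshut_polarity_high": bool(value & AUTO_TSHUT_POLARITY_HIGH),
    }


AUTO_PERIOD_CYCLES = 1875
TSADC_CLK_HZ = 750_000  # assumption: upstream RK3399 TSADC clock rate


def auto_period_ms(cycles=AUTO_PERIOD_CYCLES, clk_hz=TSADC_CLK_HZ):
    """Convert the auto-period count into milliseconds."""
    return cycles * 1000 / clk_hz
```

With these assumptions, the receipt's auto_con value of 51 (0x33) decodes to auto mode on, Q-select on, both channel sources enabled, and the polarity-high bit clear, and 1875 cycles works out to 2.5 ms.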

eFuse trim caveat

Linux’s generic Rockchip thermal driver supports trim cells on SoCs that define a trim callback. RK3399 does not use that path in mainline: rk3399_tsadc_data has no get_trim_code hook. The standard RK3399 efuse binding exposes CPU ID, leakage cells, and wafer info, but no temperature calibration cell that Linux consumes for this SoC.

The FreeBSD driver therefore uses the upstream RK3399 code table directly. trim_offset_mc remains as a manual bench knob, but the 2026-05-05 Q-select fix made the previous -30000 loader workaround wrong; the phone now runs with trim 0.

Parity verification

Bench predicate now that the DTS node is okay:

  1. Reboot the phone with serial console + recovery USB attached (the falsifier path requires reverting the DTS over the wire if the polarity is wrong on this PCB revision).

  2. Run mise run debug:sensors:phone for the read-only combined receipt. The TSADC section should classify cpu_temp, gpu_temp, raw channel codes, and tshut_enabled before any opt-in shutdown polarity test.

  3. Expect these attach lines in dmesg:

    rk_tsadc0: <Rockchip RK3399 TSADC> mem 0xff260000-... irq ... on ofwbus0
    rk_tsadc0: auto-poll enabled, 2 channels, trip=95000 shut=105000 (mC), tshut=disarmed

  4. Read both channels:

    sysctl dev.rk_tsadc.0.cpu_temp dev.rk_tsadc.0.gpu_temp \
          dev.rk_tsadc.0.last_raw_ch0 dev.rk_tsadc.0.last_raw_ch1 \
          dev.rk_tsadc.0.last_code_ch0 dev.rk_tsadc.0.last_code_ch1 \
          dev.rk_tsadc.0.tshut_enabled

    cpu_temp should report 35000–50000 mC idle on most boards (idle die at 25 °C ambient is typically ~35–45 °C). With the default hardware QSEL, last_raw_ch* and last_code_ch* should match; with QSEL disabled, last_raw_ch* is the native q sample and should be non-zero and below 1024. In both modes last_code_ch* should sit in the rk3399 code-table window 530–650 at room temperature.

  5. The phone must not spontaneously reboot or shut down within the first few seconds of attach. That was the original false-shutdown symptom. With the safe default, tshut_enabled should be 0; if the phone still reboots, TSADC attach itself is unsafe and not just the shutdown GPIO source.

  6. After ~1 minute of glxgears (or four pinned dd if=/dev/zero of=/dev/null), cpu_temp should rise by 5000–15000 mC. If it does not change, channel autoscan is not running.

  7. Comparator IRQ bench result as of kernel #195: code-domain thresholds fire and stay quiet correctly on both channels when hardware QSEL is enabled. Use mise run thermal:phone -- irq-bench; it is a runtime test and should leave dev.rk_tsadc.0.int_en=0, irq_enabled=0, and tshut_enabled=0 at exit. Native-q cases are diagnostic only.

  8. Only after the comparator bench passes, boot once with hw.rk_tsadc.enable_tshut=1 and confirm tshut=armed in dmesg and dev.rk_tsadc.0.tshut_enabled=1. Do this with serial and recovery USB attached; this is the polarity proof, not the default path.

  9. Cross-check against a Linux PPP image (postmarketOS):

    cat /sys/class/thermal/thermal_zone0/temp     # CPU
    cat /sys/class/thermal/thermal_zone1/temp     # GPU

    The two zones should report values within a few degrees of our sysctls under matched ambient + workload. A divergence of more than ~5 °C points at a stale trim_offset_mc or a q-select mismatch, since RK3399 has no eFuse trim cell to apply (see eFuse trim caveat).
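
The read-only parts of this predicate (steps 4, 5, and 9) can be scripted against captured sysctl output. The Python sketch below is one possible shape, not the project's mise task: the function names are hypothetical, the raw-equals-code check assumes the default hardware QSEL, and the sample text is the kernel #195 receipt from earlier on this page.

```python
RECEIPT = """\
dev.rk_tsadc.0.cpu_temp: 56666
dev.rk_tsadc.0.gpu_temp: 60625
dev.rk_tsadc.0.last_raw_ch0: 567
dev.rk_tsadc.0.last_raw_ch1: 574
dev.rk_tsadc.0.last_code_ch0: 567
dev.rk_tsadc.0.last_code_ch1: 574
dev.rk_tsadc.0.tshut_enabled: 0
"""

PREFIX = "dev.rk_tsadc.0."


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict of ints, skipping anything else."""
    values = {}
    for line in text.splitlines():
        name, _, value = line.partition(": ")
        try:
            values[name.strip()] = int(value)
        except ValueError:
            continue
    return values


def check_readout(values, hw_qsel=True):
    """Return a list of problems; an empty list means the readout looks sane."""
    problems = []
    for ch in (0, 1):
        raw = values[PREFIX + f"last_raw_ch{ch}"]
        code = values[PREFIX + f"last_code_ch{ch}"]
        if not 0 < raw < 1024:
            problems.append(f"ch{ch}: raw {raw} outside 1..1023")
        if not 530 <= code <= 650:
            problems.append(f"ch{ch}: code {code} outside 530..650")
        if hw_qsel and raw != code:
            problems.append(f"ch{ch}: raw {raw} != code {code} with QSEL on")
    if values[PREFIX + "tshut_enabled"] != 0:
        problems.append("tshut armed in the default path")
    return problems


def agrees_with_linux(ours_mc, linux_mc, tolerance_mc=5000):
    """Step 9: both readings in milli-Celsius, within roughly 5 degrees C."""
    return abs(ours_mc - linux_mc) <= tolerance_mc
```

Calling check_readout(parse_sysctl(RECEIPT)) returns an empty list for the receipt above. The Linux thermal_zone files already report milli-Celsius, so their values can be passed to agrees_with_linux unchanged.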

Falsifiers

Status

Question          Answer
Probes?           Yes (rk_tsadc0: <Rockchip RK3399 TSADC> mem 0xff260000-0xff2600ff irq 37 on ofwbus0); auto-poll runs both channels
Used by anything? Yes: software thermal governor plus phone_thermal_guard; guarded comparator IRQs are bench-proven but not part of steady-state policy; hardware TSHUT is still disarmed
User-visible?     Temperature, native-q, transformed-code, policy, trim, IRQ, and TSHUT sysctls under dev.rk_tsadc.0.*; guard state in /var/run/phone-thermal-guard/state
Blocking?         Partial: soft CPU/GPU throttling and guarded interrupt trips work, but kernel-priority cpufreq and hardware shutdown are not bench-proven

Calibration

The previous calibration story was wrong. Native q values around 466 looked cold only because they were being interpreted directly against Linux’s RK3399 table. Linux sets TSADCV3_AUTO_Q_SEL_EN, whose documented effect is to report 1024 - tsadc_q; with hardware QSEL enabled, the data register reports code-domain values directly. A warmed phone now reports data/code values from the high 560s into the 570s (567 and 574 in the kernel #195 receipt), mapping to plausible mid-50s to low-60s °C readings.

Commit 7613729 ("rk_tsadc: make trim_offset_mc writable + tunable") still exposes trim_offset_mc as a writable sysctl plus a hw.rk_tsadc.trim_offset_mc loader tunable, but the correct default is now 0. Do not carry forward the old hw.rk_tsadc.trim_offset_mc=-30000 workaround.

Open work

This unblocks the practical part of the moderate HARDWARE.md item — soft CPU/GPU throttling plus guarded comparator IRQs via TSADC. The remaining upstream-shaped version is hardware TSHUT policy and a safer kernel-priority cpufreq integration.