11 · bluetooth

SSP, Secure Connections, and the Murata firmware swap

Four netgraph patches, a userland SSP daemon, and the Bluetooth blob that turned out to be the bug.

● working

The chip from essay 9 talks HCI. The wakeup plumbing from essay 10 keeps it talking under load. Neither buys us pairing. To pair with a 2026-vintage Bluetooth speaker — anything that requires Secure Simple Pairing (BT 2.1+) and Secure Connections (BT 4.1+) — three things have to happen together:

  1. The kernel HCI stack handles SSP events. FreeBSD’s ng_hci only knew about the legacy pre-2.1 events. SSP introduces event codes 0x31 through 0x35 (IO_Capability_Request, IO_Capability_Response, User_Confirmation_Request, User_Passkey_Request, Remote_OOB_Data_Request) plus 0x36 (Simple_Pairing_Complete), all of which have to be dispatched up to userland.
  2. Userland answers those events. hcsecd — the BSD security daemon — knew how to handle PIN code requests but had no concept of SSP. We added a Just Works mode.
  3. The chip negotiates Secure Connections cleanly. Encryption mode 0x02 (AES-CCM) versus 0x01 (legacy E0) is the gate that modern profiles refuse to cross. The chip’s firmware build determines whether SC is even available.

The first two were patches. The third was the surprise: the bug was inside the binary blob, not our code.
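
As a rough illustration of the first requirement, here is a minimal userland sketch that recognizes the SSP event codes in an H4-framed buffer. The event code values come from the Bluetooth HCI specification; the enum and function names are hypothetical and are not the identifiers ng_hci uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* HCI event codes added by Secure Simple Pairing (names are hypothetical). */
enum ssp_event {
    SSP_EVT_IO_CAPABILITY_REQUEST     = 0x31,
    SSP_EVT_IO_CAPABILITY_RESPONSE    = 0x32,
    SSP_EVT_USER_CONFIRMATION_REQUEST = 0x33,
    SSP_EVT_USER_PASSKEY_REQUEST      = 0x34,
    SSP_EVT_REMOTE_OOB_DATA_REQUEST   = 0x35,
    SSP_EVT_SIMPLE_PAIRING_COMPLETE   = 0x36
};

/* Return a printable name for an SSP event code, or NULL if it is not one. */
const char *
ssp_event_name(uint8_t code)
{
    switch (code) {
    case SSP_EVT_IO_CAPABILITY_REQUEST:     return "IO_Capability_Request";
    case SSP_EVT_IO_CAPABILITY_RESPONSE:    return "IO_Capability_Response";
    case SSP_EVT_USER_CONFIRMATION_REQUEST: return "User_Confirmation_Request";
    case SSP_EVT_USER_PASSKEY_REQUEST:      return "User_Passkey_Request";
    case SSP_EVT_REMOTE_OOB_DATA_REQUEST:   return "Remote_OOB_Data_Request";
    case SSP_EVT_SIMPLE_PAIRING_COMPLETE:   return "Simple_Pairing_Complete";
    default:                                return NULL;
    }
}

/*
 * An H4-framed HCI event is: packet indicator 0x04, event code,
 * one-byte parameter length, then the parameters.
 * Returns the event code if buf is a well-formed SSP event, else -1.
 */
int
ssp_event_from_h4(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 3 || buf[0] != 0x04)
        return -1;
    if ((size_t)buf[2] + 3 != len)
        return -1;
    return ssp_event_name(buf[1]) != NULL ? buf[1] : -1;
}
```

A caller that reads frames off a raw HCI socket could pass each buffer to ssp_event_from_h4() and hand anything that does not return -1 to the pairing logic.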

ng_hci: making SSP events visible

ng_hci_evnt.c had a switch in its event-dispatch path that listed events the stack would forward to upper hooks (hcsecd listens via the raw HCI socket). The list included IO_CAPABILITY_REQUEST and SIMPLE_PAIRING_COMPLETE but nothing in between. The result: pairing peers would send IO_Capability_Response, User_Confirmation_Request, etc., the chip would hand the events to ng_hci, and ng_hci would just NG_FREE_M(event) without notifying anyone. Pairing stalled at the user-confirmation step forever.

The fix was small but had to land in the right order:

sys/netgraph/bluetooth/hci/ng_hci_evnt.c

@@ -121,8 +121,12 @@
     case NG_HCI_EVENT_REMOTE_NAME_REQ_COMPL:
     case NG_HCI_EVENT_READ_REMOTE_VER_INFO_COMPL:
     case NG_HCI_EVENT_IO_CAPABILITY_REQUEST:
+    case NG_HCI_EVENT_IO_CAPABILITY_RESPONSE:
+    case NG_HCI_EVENT_USER_CONFIRMATION_REQUEST:
+    case NG_HCI_EVENT_USER_PASSKEY_REQUEST:
+    case NG_HCI_EVENT_REMOTE_OOB_DATA_REQUEST:
     case NG_HCI_EVENT_SIMPLE_PAIRING_COMPLETE:
-        /* These do not need post processing */
+        /* SSP events passed to upper hooks (hcsecd) */
         NG_FREE_M(event);
         break;
     case NG_HCI_EVENT_LE:
@@ -727,6 +731,14 @@
         }
 
         bcopy(&ep->bdaddr, &con->bdaddr, sizeof(con->bdaddr));
+        /*
+         * Raw-HCI-initiated connection: notify upstream so
+         * l2cap/sco can track this handle.
+         */
+        if (ep->link_type == NG_HCI_LINK_ACL)
+            con->flags |= NG_HCI_CON_NOTIFY_ACL;
+        else
+            con->flags |= NG_HCI_CON_NOTIFY_SCO;
     } else if ((error = ng_hci_con_untimeout(con)) != 0)
         goto out;
 

That second hunk, which adds the NG_HCI_CON_NOTIFY_ACL / NG_HCI_CON_NOTIFY_SCO flags to raw-HCI-initiated connection completions, is the seam that lets a userland-driven create_connection (via hccontrol or our bt_attach) get its connection handle propagated to L2CAP. Without it, the kernel knows the ACL exists but never tells L2CAP, and any bt_a2dp.sh script that uses hccontrol create_connection gets a dead handle. Commit 242c4ef (“ng_hci: handle SSP events 0x32/0x33/0x34/0x35”) is the SSP event dispatch; the connection-notify flag came in the same batch.

hcsecd: Just Works pairing in userland

hcsecd is the BSD Bluetooth security daemon. It listens for HCI events on a raw socket, makes pairing decisions per the policy in /etc/bluetooth/hcsecd.conf, and writes HCI commands back to answer link-key and PIN requests. It had no SSP code. We added it in commit 91261a8 (“hcsecd: add SSP (Just Works) pairing support”), carried as patches/usr.sbin/bluetooth/hcsecd/hcsecd.c.patch.

The IO capability bytes are from BT Vol 2 Part E §7.7.40. The relevant constants are commented in the patch:

#define HCSECD_IO_CAP_NO_IO        0x03    /* "Just Works" */
#define HCSECD_AUTH_GENERAL_BONDING 0x04   /* preferred */

This is the minimum viable pairing path — no UI, no numeric comparison, no MITM protection. Sufficient for a speaker on a phone you control. Insufficient for production, by design; an essay later in the arc covers what a real pairing UI would need.
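
To make the Just Works answers concrete, here is a minimal sketch of the two command packets a daemon would write back: IO_Capability_Request_Reply and User_Confirmation_Request_Reply. It is not the hcsecd patch itself. The opcodes and parameter layout follow the HCI specification, the H4 framing byte is an assumption about the transport, and every function and macro name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define H4_COMMAND                 0x01   /* UART packet indicator (assumed transport) */
#define OP_IO_CAPABILITY_REPLY     0x042b /* OGF 0x01, OCF 0x002b */
#define OP_USER_CONFIRMATION_REPLY 0x042c /* OGF 0x01, OCF 0x002c */
#define IO_CAP_NO_INPUT_NO_OUTPUT  0x03   /* selects "Just Works" */
#define OOB_DATA_NOT_PRESENT       0x00
#define AUTH_GENERAL_BONDING       0x04   /* general bonding, no MITM protection */

/* Write the 4-byte command header (indicator, opcode LE, length). */
static size_t
put_header(uint8_t *out, uint16_t opcode, uint8_t plen)
{
    assert(out != NULL);
    out[0] = H4_COMMAND;
    out[1] = (uint8_t)(opcode & 0xff);
    out[2] = (uint8_t)(opcode >> 8);
    out[3] = plen;
    return 4;
}

/*
 * Answer IO_Capability_Request: bdaddr, IO capability, OOB flag,
 * authentication requirements. Returns the packet length, or 0 if
 * the buffer is too small.
 */
size_t
build_io_capability_reply(uint8_t *out, size_t cap, const uint8_t bdaddr[6])
{
    size_t n;

    if (cap < 13)
        return 0;
    n = put_header(out, OP_IO_CAPABILITY_REPLY, 9);
    memcpy(out + n, bdaddr, 6);
    n += 6;
    out[n++] = IO_CAP_NO_INPUT_NO_OUTPUT;
    out[n++] = OOB_DATA_NOT_PRESENT;
    out[n++] = AUTH_GENERAL_BONDING;
    return n;
}

/* Answer User_Confirmation_Request by accepting: the only parameter is bdaddr. */
size_t
build_user_confirmation_reply(uint8_t *out, size_t cap, const uint8_t bdaddr[6])
{
    size_t n;

    if (cap < 10)
        return 0;
    n = put_header(out, OP_USER_CONFIRMATION_REPLY, 6);
    memcpy(out + n, bdaddr, 6);
    return n + 6;
}
```

The bdaddr argument is in wire order (little-endian), exactly as it appears in the triggering event, so a daemon can copy it straight from the request into the reply.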

ng_hci: stop alerting on spurious Command_Complete

The first time we got partway through a pair flow with the kernel-side SSP fix, the next failure was new. ng_hci_cmds.c would log ALERT and refuse to drain the command queue when the chip sent a Command_Complete for an opcode that wasn’t currently pending. That happens — Linux’s HCI core silently drops these — most often when the BCM firmware emits an unsolicited Reset Command_Complete during its internal SSP state-machine resets. Worse: the upstream ng_hci_cmds.c would also clear NG_HCI_UNIT_INITED when processing a Reset completion. After an unsolicited Reset in the middle of an SSP flow, the unit was no longer “INITED,” and the next Connection_Request from the peer hit lp_con_rsp: unit is not ready and the pair flow fell apart.

sys/netgraph/bluetooth/hci/ng_hci_cmds.c

@@ -335,34 +335,36 @@
 {
     struct mbuf *m = NULL;
 
-    /* Check unit state */
+    /*
+     * BCM (and other) firmware occasionally emits unsolicited
+     * Command_Complete events (most commonly for Reset 0x0c03)
+     * during internal state resets. Linux silently drops these
+     * rather than wedging the queue. Mirror that behavior.
+     */
     if (!(unit->state & NG_HCI_UNIT_COMMAND_PENDING)) {
-        NG_HCI_ALERT(
-            "%s: %s - no pending command, state=%#x\n",
-            __func__, NG_NODE_NAME(unit->node), unit->state);
-
-        return (EINVAL);
+        NG_HCI_INFO(
+            "%s: %s - spurious Command_Complete opcode=0x%04x, no pending\n",
+            __func__, NG_NODE_NAME(unit->node), opcode);
+        *cp = NULL;
+        return (EPROTO);
     }
 
     /* Get first command in the queue */
     m = NG_BT_MBUFQ_FIRST(&unit->cmdq);
     if (m == NULL) {
-        NG_HCI_ALERT(
-            "%s: %s - empty command queue?!\n", __func__, NG_NODE_NAME(unit->node));
-
-        return (EINVAL);
+        NG_HCI_INFO(
+            "%s: %s - spurious Command_Complete opcode=0x%04x, empty queue\n",
+            __func__, NG_NODE_NAME(unit->node), opcode);
+        *cp = NULL;
+        return (EPROTO);
     }
 
-    /*
-     * Match command opcode, if does not match - do nothing and
-     * let timeout handle that.
-     */
-
     if (mtod(m, ng_hci_cmd_pkt_t *)->opcode != opcode) {
-        NG_HCI_ALERT(
-            "%s: %s - command queue is out of sync\n", __func__, NG_NODE_NAME(unit->node));
-
-        return (EINVAL);
+        NG_HCI_INFO(
+            "%s: %s - spurious Command_Complete opcode=0x%04x, queue head=0x%04x\n",
+            __func__, NG_NODE_NAME(unit->node), opcode,
+            le16toh(mtod(m, ng_hci_cmd_pkt_t *)->opcode));
+        *cp = NULL;
+        return (EPROTO);
     }
 
     /*
@@ -654,7 +656,12 @@
         NG_HCI_BUFF_SCO_TOTAL(unit->buffer, size);
         NG_HCI_BUFF_SCO_FREE(unit->buffer, size);
 
-        unit->state &= ~NG_HCI_UNIT_INITED;
+        /*
+         * Do not clear NG_HCI_UNIT_INITED on Reset. BCM firmware
+         * triggers unsolicited Reset Command_Complete during SSP
+         * flows; clearing INITED breaks incoming Connection_Request
+         * handling (lp_con_rsp: unit is not ready).
+         */
     } break;
 
     default:

The patch does two things. First, it converts NG_HCI_ALERT to NG_HCI_INFO and returns EPROTO (instead of EINVAL) when a spurious or out-of-order Command_Complete arrives, so the kernel logs stay quiet and the queue continues to drain. Second, it removes the unit->state &= ~NG_HCI_UNIT_INITED clear on Reset completions, with a comment explaining why. Both changes landed in commit 22d91e3 (“bcm firmware + ng_hci: fix post-patchram init + preserve INITED”).

ng_hci_ulpi.c got one related tweak: when paging a peer with no cached info, default page_scan_rep_mode to R1 instead of R0. Most BT 2.1+ peers run R1 page-scan windows; using R0 caused Page Timeout 0x04 because the page train and scan window never overlapped.
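
A minimal sketch of that default, assuming a hypothetical peer_cache struct in place of the kernel's neighbor cache. The Create_Connection parameter layout (opcode 0x0405, 13 parameter bytes) is from the HCI specification; the helper names and the H4 framing byte are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum page_scan_rep_mode { PSRM_R0 = 0x00, PSRM_R1 = 0x01, PSRM_R2 = 0x02 };

/* Hypothetical stand-in for cached inquiry data about a peer. */
struct peer_cache {
    int      valid;
    uint8_t  page_scan_rep_mode;
    uint16_t clock_offset;
};

/* Use the cached mode when there is one; otherwise assume R1, not R0. */
uint8_t
pick_page_scan_rep_mode(const struct peer_cache *c)
{
    if (c != NULL && c->valid)
        return c->page_scan_rep_mode;
    return PSRM_R1;
}

/*
 * Build an H4-framed Create_Connection command. Returns the packet
 * length (17), or 0 if the buffer is too small.
 */
size_t
build_create_connection(uint8_t *out, size_t cap, const uint8_t bdaddr[6],
    uint16_t packet_type, const struct peer_cache *c)
{
    uint16_t clock = 0;

    assert(out != NULL && bdaddr != NULL);
    if (cap < 17)
        return 0;
    if (c != NULL && c->valid)
        clock = (uint16_t)(c->clock_offset | 0x8000); /* bit 15: offset valid */

    out[0] = 0x01;                         /* H4 command indicator */
    out[1] = 0x05;                         /* opcode 0x0405, low byte */
    out[2] = 0x04;                         /* opcode 0x0405, high byte */
    out[3] = 13;                           /* parameter length */
    memcpy(out + 4, bdaddr, 6);
    out[10] = (uint8_t)(packet_type & 0xff);
    out[11] = (uint8_t)(packet_type >> 8);
    out[12] = pick_page_scan_rep_mode(c);
    out[13] = 0x00;                        /* reserved */
    out[14] = (uint8_t)(clock & 0xff);
    out[15] = (uint8_t)(clock >> 8);
    out[16] = 0x01;                        /* allow role switch */
    return 17;
}
```

Passing NULL for the cache models the "no cached info" case from the paragraph above: byte 12 of the packet comes out as R1 and the clock offset is marked invalid.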

The firmware swap

With the kernel and userland both handling SSP, pairing got further. IO_Capability_Request arrived. hcsecd answered. User_Confirmation_Request arrived. hcsecd answered. Simple_Pairing_Complete arrived with status 0x00. Auth_Complete arrived with status 0x00. Then we asked for encryption — Set_Connection_Encryption with enable=1 — and Encryption_Change came back with enabled=0x01. Which is legacy E0, not AES-CCM. And bluez on the speaker side refused to bring up A2DP over a non-SC link.

We chased this for two days on the wrong ladder.

[WAR STORY]

The firmware was the bug

BCM4345C0 / Secure Connections

▸ symptom

Encryption_Change returns enabled=0x01 (E0) consistently. Speaker refuses A2DP profile. Write_Secure_Connections_Host_Support (0x0c7a) with parameter 01 returns Command Disallowed (0x0c). Read_Local_Extended_Features page 1 reports bit 3 (Secure Connections Host Support) as zero.

▸ hypothesis 1

ng_hci’s pairing flow is sending the SSP events too late, or in the wrong order, and the chip is internally falling back to legacy. Audited the sequence against Linux: IO_Capability_Reply before any user-level confirmation, User_Confirmation_Reply after the chip’s request, both before Set_Connection_Encryption. Order was correct. No change.

▸ hypothesis 2

hcsecd’s IO_Capability_Reply is claiming the wrong authentication-requirements byte. Tried MITM_GENERAL_BONDING (0x05) instead of GENERAL_BONDING (0x04). Same E0 result. Tried DEDICATED_BONDING (0x01). Same. The auth-requirements byte controls which SSP method the peer picks, not which cipher gets negotiated.

▸ hypothesis 3

The chip’s stock firmware doesn’t support SC. Confirmed by reading LMP page 1 bit 3 — definitively zero. The BCM4345C0.hcd we had been using was the linux-firmware BCM4345C0.hcd, which is the Raspberry Pi 3+ build, version 0190. We checked Read_Local_Name — it returned "BCM4345C0", no version suffix. The RPi build has SC stripped.

▸ breakthrough

Murata maintains a public cyw-bt-patch repo of HCD blobs for Cypress/Broadcom radios in their modules. The “Type-1MW” variant is the same BCM4345C0 silicon at the same crystal frequency (37.4 MHz) — but build 0187.0366.1MW, not 0190. After dropping the Murata blob into overlay/usr/share/firmware/brcm/BCM4345C0.hcd, Read_Local_Name returned:

BCM4345C0 Murata Type-1MW UART 37.4 MHz BT 5.0-0187

Write_Secure_Connections_Host_Support=1 returned 0x00 (success). LMP page 1 bit 3 read 1. The very next pair attempt produced Encryption_Change with enabled=0x02 (AES-CCM) on the first try. The swap is commit 41f22e4 (“BCM4345C0.hcd: swap to Murata 1MW build 0187.0366”).

▸ fix

Swap the .hcd file. That is the entirety of the fix. No code change in our tree. No tweak to bcm_firmware_load.pl (5e479d4 later added an opt-in Write_Secure_Connections_Host_Support=1 step in the loader for cleanliness, but it was the firmware swap that made the chip capable of accepting that command). The Murata blob is committed alongside the loader.

▸ lesson

The bug is rarely where you’re looking. We had spent two days inside ng_hci and hcsecd, instrumenting the SSP event flow, double-checking auth-requirements byte values, comparing HCI sniff traces against Linux. The bug was in a binary that we had taken on faith because “everyone uses linux-firmware.” Always check the chip’s local features against what your code is asking for. If the chip says it can’t do something, your code can’t make it do that thing — no matter how correct your code is.
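
The feature check from the lesson is cheap to automate. Here is a minimal sketch, assuming the caller already holds the return parameters of Read_Local_Extended_Features (status, page number, maximum page number, then the 8-byte feature page, per the HCI specification). The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEAT_PAGE1_SC_HOST_SUPPORT 3 /* page 1, bit 3 */

/* Test bit `bit` (0..63) of an 8-byte LMP feature page, LSB of byte 0 first. */
int
lmp_feature_bit(const uint8_t features[8], unsigned bit)
{
    assert(features != NULL);
    if (bit >= 64)
        return 0;
    return (features[bit / 8] >> (bit % 8)) & 1;
}

/*
 * ret: return parameters of Read_Local_Extended_Features.
 * Returns 1 if Secure Connections Host Support is set, 0 if clear,
 * and -1 if the reply failed, is short, or is not for page 1.
 */
int
sc_host_support(const uint8_t *ret, size_t len)
{
    assert(ret != NULL);
    if (len < 11 || ret[0] != 0x00 || ret[1] != 1)
        return -1;
    return lmp_feature_bit(ret + 3, FEAT_PAGE1_SC_HOST_SUPPORT);
}
```

Running a check like this before the first pairing attempt would have pointed at the blob immediately: a 0 here means the bit is clear regardless of how correct the SSP event flow is.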

The trace that proved it

After the firmware swap, with all four patches in place:

HCI trace (5 packets)
1776832627.488071 EVT len=10 04 36 07 00 eb 70 c5 64 d4 e0
1776832627.488149 EVT len=26 04 18 17 eb 70 c5 64 d4 e0 26 b8 d0 54 da 0d ab fb 14 15 f0 51 9e 81 98 70 07
1776832627.488192 EVT len=6 04 06 03 00 0c 00
1776832628.412288 CMD len=7 01 13 04 03 0c 00 01
1776832628.745240 EVT len=7 04 08 04 00 0c 00 02

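Decoded:

  1. EVT 0x36 Simple_Pairing_Complete: status=0x00, peer e0:d4:64:c5:70:eb.
  2. EVT 0x18 Link_Key_Notification: same peer, 16-byte link key, key_type=0x07 (unauthenticated combination key generated from P-256).
  3. EVT 0x06 Authentication_Complete: status=0x00, handle=0x000c.
  4. CMD 0x0413 Set_Connection_Encryption: handle=0x000c, enable=0x01.
  5. EVT 0x08 Encryption_Change: status=0x00, handle=0x000c, enabled=0x02 (AES-CCM).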

That enabled=0x02 is the gate. It was the first time we ever saw it. Everything downstream (ACL data, L2CAP signaling, AVDTP profile negotiation, A2DP) assumes the link is encrypted under a modern cipher, and bluez 5.86 on the receiving side specifically refuses to bring up A2DP if the link reports enabled=0x01. One-byte field, one value apart, weeks of work to flip it.
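
For anyone reproducing the trace, a small parser for the last packet is enough to assert on the result. This is a minimal sketch assuming H4 framing as shown in the trace above; the struct and function names are hypothetical, and the cipher names apply to BR/EDR links only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum enc_mode { ENC_OFF = 0x00, ENC_E0 = 0x01, ENC_AES_CCM = 0x02 };

struct enc_change {
    uint8_t  status;
    uint16_t handle;
    uint8_t  enabled;
};

/*
 * Parse an H4-framed Encryption_Change event:
 *   04 08 04 <status> <handle lo> <handle hi> <enabled>
 * Returns 0 on success, -1 if the buffer is not that event.
 */
int
parse_encryption_change(const uint8_t *buf, size_t len, struct enc_change *out)
{
    assert(buf != NULL && out != NULL);
    if (len != 7 || buf[0] != 0x04 || buf[1] != 0x08 || buf[2] != 4)
        return -1;
    out->status = buf[3];
    out->handle = (uint16_t)((buf[4] | (buf[5] << 8)) & 0x0fff);
    out->enabled = buf[6];
    return 0;
}

/* Name the BR/EDR cipher reported in the Encryption_Enabled byte. */
const char *
enc_mode_name(uint8_t enabled)
{
    switch (enabled) {
    case ENC_OFF:     return "off";
    case ENC_E0:      return "E0";
    case ENC_AES_CCM: return "AES-CCM";
    default:          return "reserved";
    }
}
```

Feeding it the final packet of the trace (04 08 04 00 0c 00 02) yields status 0x00, handle 0x000c, and a mode that enc_mode_name() reports as "AES-CCM".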

What also had to land

Three smaller fixes shipped in the same window and are part of why the trace above is reproducible:

The link is encrypted under AES-CCM, the peer trusts us, the kernel routes ACL data into L2CAP, and userland believes a pair happened. Now we need audio to actually traverse that link. That’s essay 12 — and the breakthrough commit, and the soundbite.