This was a long one–I spent a fair bit of time with the Orange Pi 6 Plus over the past few months, and what I expected to be a quick look at another fast ARM board turned into one of those test runs where the hardware looks promising on paper, the software is wonky in exactly the wrong places, and you end up diving far more into boot chains, vendor GPU blobs and inference runtimes than you ever intended.
The Orange Pi 6+ on a corner of my desk
Unlike most of the ARM boards I’ve reviewed until now, this one is not an RK3588 board: The Orange Pi 6 Plus uses the CIX P1 (CD8180/CD8160), with 12 CPU cores, a Mali G720 GPU, a dedicated NPU and a wild set of specs for the form factor. Boards like this promise everything at once–homelab, edge AI, dual 5GbE, low power–but they only matter if the software gets out of the way.
Disclaimer: Orange Pi supplied me with a 6 Plus free of charge, and, as usual, this article follows my review policy.
And, for a change, I decided to make sure the software did exactly that, and made it my concern from the start–i.e., I built my own OS images for it (a fork of orangepi-build) and went in a bit deeper than usual, spending around two months taking notes, benchmark logs and even Graphite telemetry as I went along.
One of the reasons I wanted to test this board is that the SoC is the CIX P1, which Orange Pi bills as a 12-core part with a combined 45 TOPS across CPU, GPU and NPU. The machine I tested came with:
CIX P1 (CD8180/CD8160), 4×Cortex-A520 plus 8×Cortex-A720 cores
16GiB of RAM (roughly 14GiB visible to Linux)
dual Realtek RTL8126 5GbE
Realtek RTL8852BE Wi-Fi and Bluetooth card
Mali G720 / Immortalis-class GPU
A three-core Zhouyi NPU
And if you’ve been paying attention to all my homelab testing, those two 5GbE ports alone make this more interesting than most hobbyist SBCs. But, of course, there is a lot more to expandability than that:
The CPU is interesting in itself–the fastest A720 cluster reaches about 2.6GHz, the A520s top out around 1.8GHz, so like many other big.LITTLE ARM architectures you get asymmetric clusters rather than a uniform twelve-core machine:
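You can see the cluster split directly in sysfs, since each cluster shows up as its own cpufreq policy–a quick sketch using the standard cpufreq layout (guarded so it degrades quietly on systems that don't expose it):

```shell
# Each cpufreq policy maps to one cluster on asymmetric SoCs like the P1;
# print its CPUs and maximum clock.
khz_to_mhz() { echo $(( $1 / 1000 )); }

for p in /sys/devices/system/cpu/cpufreq/policy*; do
  [ -d "$p" ] || continue   # skip if cpufreq isn't exposed here
  printf '%s: cpus=%s max=%sMHz\n' "${p##*/}" \
    "$(cat "$p/affected_cpus")" \
    "$(khz_to_mhz "$(cat "$p/scaling_max_freq")")"
done
```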
lspci is a bit more revealing, especially because you get to see where the dual 5GbE setup and Wi-Fi controller are placed–each seems to get its own PCI bridge:
Nothing exotic, which I rather like. And, by the way, the board ships with Cix Technology Group UEFI, version 1.3, so setting up boot devices and managing (very) basic settings was trivial.
This is where I took a very large detour from my usual approach: I decided early on that I wasn’t going to use a vendor image for this board.
Vendor images for SBCs like this always tend to be good enough to boot, occasionally good enough to do basic benchmarks, and almost never something I want to build on–especially if I’m doing local AI work, host-native services, or anything that requires me to trust package sources, first-boot behaviour and upgrade paths.
I wanted a server-first layout, reproducible fixes and a place to bake in GPU/NPU prerequisites, so I forked orangepi-build and started from there, with a fairly high bar:
I wanted a fully reproducible Debian 13 / Trixie build with features like /dev/kvm present, not a vendor image with stale software and missing features I wanted.
The build needed to stop treating Ubuntu as the only real target–add-apt-repository, PPA logic and software-properties-common had to be cleaned out.
Boot fixes had to be baked in from the start, not applied as post-flash rituals.
First boot had to be deterministic. If the root filesystem resize requires me nearby with serial and patience, the image isn’t finished.
I needed a clean place to stage GPU firmware, vendor userspace and NPU packages.
The Orange Pi repository included kernel 6.6.89-cix, so a lot of the above was already “there”–I just needed to hack at it, but instead of doing it entirely by hand I got piclaw to set things up on an Ubuntu 22.04 VM.
Over a few weeks (this took a while), the above list translated into a fairly concrete set of changes in the build tree:
added Trixie configs under external/config/{cli,desktop,distributions}/trixie
patched scripts/distributions.sh for Debian 13 support
fixed the board config to allow trixie under DISTRIB_TYPE_NEXT
removed Ubuntu-only dependencies from the package lists
forced standard Debian mirrors
made the kernel build non-interactive
started baking in GPU/NPU prerequisites and development tooling for later testing
The package side needed archaeological work too. I patched orangepi-config to stop behaving as though it were on Ubuntu, removed software-properties-common from the Trixie dependency chain, forced regeneration of cached packages, and went hunting through component_cix-next for whatever vendor bits still existed and matched my kernel, taking notes throughout.
My first boot-related note on this board was short: I flashed my custom Trixie image, got as far as GRUB, and it fell over because the EFI stub was wrong. The image did contain the right DTBs (sky1-orangepi-6-plus.dtb and friends), but the build scripts somehow commented out useful menu entries and the default pointed at the ACPI path.
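A sketch of the kind of menu entry that forces device-tree boot instead of the ACPI fallback (the UUIDs and file paths here are placeholders, not the board's actual values–`devicetree` is a standard GRUB command):

```
menuentry 'Debian 13 (DTB boot)' {
    search --no-floppy --fs-uuid --set=root <boot-partition-uuid>
    devicetree /dtb/cix/sky1-orangepi-6-plus.dtb
    linux /vmlinuz-6.6.89-cix root=PARTUUID=<root-partuuid> rw
    initrd /initrd.img-6.6.89-cix
}
```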
But getting past GRUB was only half the battle. The first real boot surfaced another annoying issue: the partition resize worked, the root filesystem resize didn’t, and the machine failed to reboot cleanly at the handoff. I had piclaw trace the resize helper, found it was disabling itself before the second stage could run, and patched that too.
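The shape of that fix is worth sketching (the function and helper names here are hypothetical, not the actual helper's): the two stages have to be split on a stamp file, and the unit must only disable itself after the second stage has actually run:

```shell
# Hypothetical two-stage first-boot resize. Stage 1 grows the partition
# and reboots; stage 2 grows the filesystem. The original bug was the
# helper disabling itself after stage 1, so stage 2 never ran.
firstboot_resize() {
  local stamp="$1/partition-grown"
  if [ ! -f "$stamp" ]; then
    grow_partition          # stage 1: grow the partition (e.g. growpart)
    touch "$stamp"
    echo "stage1: reboot required"
  else
    grow_filesystem         # stage 2: grow the fs (e.g. resize2fs)
    disable_self            # only safe to disable the unit now
    echo "stage2: done"
  fi
}
```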
The whole thing made for a pretty intensive couple of weeks:
Build and fix timeline
In parallel, I made sure to include GPU/NPU support:
firmware symlink so panthor could find mali_csffw.bin
baked in cix-noe-umd and cix-npu-onnxruntime
and a big pile of dev tooling so the board could bootstrap AI experiments without turning into a scavenger hunt
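The firmware shim reduces to a staged symlink–a sketch, with the archX.Y directory and the blob's source path as assumptions (panthor logs the exact firmware path it wants in dmesg, so check there first):

```shell
# FW_ROOT defaults to a local staging dir so this can run inside an
# image-build chroot; unset it to operate on a live system.
FW_ROOT="${FW_ROOT:-./staging}"

# arch10.8 is illustrative; panthor requests arm/mali/arch<X.Y>/ based
# on the GPU generation.
mkdir -p "$FW_ROOT/lib/firmware/arm/mali/arch10.8"
ln -sf /usr/lib/firmware/mali_csffw.bin \
  "$FW_ROOT/lib/firmware/arm/mali/arch10.8/mali_csffw.bin"
```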
Once the image was booting reliably, I wanted the board off SD entirely. I had a 512GB NVMe drive sitting about, so I had piclaw handle the migration–even though it had just finished patching orangepi-config, the actual cutover was done manually: partition the NVMe into EFI, root and swap, rsync everything across, patch grub.cfg to point at the new PARTUUID, reboot, verify, remove the SD card.
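Most of the cutover isn't worth scripting, but the grub.cfg edit is–a hypothetical helper (name and regex mine) that swaps the root PARTUUID in place:

```shell
# Rewrite every root=PARTUUID=... occurrence in a grub.cfg to point at
# the new root partition. Usage: patch_grub_root <grub.cfg> <new-partuuid>
patch_grub_root() {
  sed -i "s/root=PARTUUID=[0-9a-fA-F-]*/root=PARTUUID=$2/g" "$1"
}
```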
So, to recap, I had to fix these things for my custom image:
Boot chain: initially broken because GRUB defaulted to the wrong path; stable once DTB boot was forced
GPU / Vulkan: initially llvmpipe fallback or panvk failure; working with vendor Vulkan ICD on mali_kbase
OpenCL: not useful at first, functional once the vendor userspace was in place
NPU kernel side: visible from the beginning, probe messages reporting three cores
NPU userspace: present only in fragments, inconsistent package references, a lot of manual validation needed
But after the first few steps were done, I had zero issues installing or building software on this–GCC 14.2 from Trixie, Bun as the primary scripting runtime, and the usual complement of build-essential, cmake, clang and ninja for C/C++ projects.
Python 3 and pip are present for the inevitable bits that still need them, and Docker runs cleanly, plus I made sure I had /dev/kvm available for virtualised workloads–and with the CIX patches for the P1 SoC, everything went swimmingly. The kernel is PREEMPT-enabled, which is pleasant for interactive work and inference latency, though I haven’t tested RT workloads.
I even got Proxmox to run reliably on this with zero issues (including creating ARM VMs on it) before wiping the NVMe to do some AI testing.
The one area where the software story gets awkward is the vendor-specific GPU and NPU userspace–covered in the next two sections. Everything else about running Debian on this board is unremarkable, which is a compliment.
Out of the box, the Linux graphics story was absent. The kernel side was in a half-state that looked superficially encouraging–/dev/dri/* present, both panthor and mali_kbase around, the system clearly aware of a Mali GPU, etc.
But Vulkan fell back to llvmpipe, and forcing the Mesa Panfrost ICD produced Unknown gpu_id (0xc870) errors. So I had piclaw go through the Orange Pi and component_cix-next package sources and find the missing pieces: vendor userspace for the CIX stack–cix-gpu-umd, cix-libglvnd, cix-libdrm, cix-mesa and a Vulkan ICD pointing at libmali.so.
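For context, a Vulkan ICD manifest is just a small JSON file (under /usr/share/vulkan/icd.d/ or /etc/vulkan/icd.d/) that points the loader at a driver library–something along these lines, with the library path and API version here being assumptions rather than the vendor package's actual contents:

```json
{
    "file_format_version": "1.0.0",
    "ICD": {
        "library_path": "/usr/lib/aarch64-linux-gnu/libmali.so",
        "api_version": "1.3.0"
    }
}
```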
Installing those got me partway–the userspace reported No mali devices found, because the board was still on the wrong kernel path. Once I rebound the GPU from panthor to the vendor mali/mali_kbase stack, /dev/mali0 appeared and Vulkan reported actual hardware:
deviceName = Mali-G720-Immortalis
driverID = DRIVER_ID_ARM_PROPRIETARY
OpenCL also came up correctly afterwards, again via the vendor path.
This was pretty good news as far as typical SBC testing goes, since it means you can get decent (if vendor-specific) GPU support working–but getting there involved driver rebinding, vendor package archaeology and a persistent module policy to keep the machine on the right stack across reboots.
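The persistent half of that module policy is a one-line modprobe rule–a sketch, with the file name my own:

```
# /etc/modprobe.d/mali-vendor.conf: keep the mainline driver from
# grabbing the GPU at boot so the vendor mali_kbase stack binds instead
blacklist panthor
```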
The NPU story was, if anything, even more typical of this class of hardware.
Linux clearly knew there was an NPU–dmesg reported three cores during probe–but the userspace was absent or incomplete and the package references inconsistent enough that I had to validate URLs by hand. One package version was simply gone, another worked, and I only reached a coherent install because component_cix-next still had enough usable artifacts lying about.
Not to say the NPU is fake or useless–it isn’t. But the tooling has that familiar feeling of being assembled by several teams who weren’t speaking to each other as often as they ought–and if your interest in a board like this is local AI, that matters more than any TOPS figure on a product page.
This is where the board started being interesting.
Since I have been getting more and more involved in low level AI work, I spent most of my time testing local inference–the Orange Pi 6 Plus is not a universally good AI box, but it is surprisingly usable within a narrow envelope of models and runtimes.
And to make it usable for a few use cases, I needed a model-and-runtime combination that felt like an actual working stack rather than a demo. I ended up trying four inference runtimes–PowerInfer, ik_llama (a CPU-optimized fork of llama.cpp), vanilla llama.cpp, and my own Vulkan-patched build of llama.cpp tuned for the Orange Pi 6 Plus's GPU (the NPU, alas, like many other ARM SoC NPUs, is designed more for vision workloads than LLM work, and I spent a few evenings confirming that).
I ended up running well over a dozen different combinations of models and runtimes, and these five were the ones I invested the most time in, since I wanted a model that was powerful enough for “production” use even if it was a little slow in practice:
Inference performance by model and runtime
The dark bars are generation speed, the lighter bars are prompt processing. The verdicts on the right reflect what happened when I pushed each model through a real agent pipeline with tool calls, not just a short benchmark prompt–and that is where the gap between “fast on paper” and “actually works” showed up.
The Liquid models posted impressive raw tok/s figures but broke down in practice with blank responses and formatting failures. The 35B sparse model was surprisingly fast under ik_llama.cpp but ate all available RAM and failed roughly 40% of the time.
Only the Qwen 4B on Vulkan held up as something I would actually leave running; the best all-round result was Qwen3.5 4B Q4_K_M:
Runtime: llama.cpp Vulkan
Prompt t/s: 8.4
Generation t/s: 9.7
Typical response time: 6-25s
RSS: ~5.3GB
Stability: 10/10 pass at -ub 8
Not desktop-GPU territory, but enough to move the board from “cute” to “useful”. More importantly, it was stable–it followed my coding assistant’s AGENTS.md prompt correctly, handled tool calls, and didn’t chew through all available memory.
The production configuration I eventually settled on was:
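Something along these lines–the model filename, context size and port here are my own placeholders, but -ngl and -ub are the load-bearing flags:

```shell
# -ngl 99 offloads all layers to the Vulkan backend; -ub 8 is the
# micro-batch value from the tuning sweep.
llama-server -m qwen3.5-4b-q4_k_m.gguf \
  -ngl 99 -ub 8 -c 8192 \
  --host 0.0.0.0 --port 8080
```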
Every flag has a story–especially -ub, the micro-batch size, which controls how many tokens llama.cpp tries to process per Vulkan dispatch.
It turns out that the Mali Vulkan backend had a descriptor-set exhaustion issue that needed patching upstream before it stopped crashing (yes, I spent a while debugging Vulkan…), and I ran a set of benchmarks specifically for that:
Vulkan micro-batch tuning sweep
Bigger batches should mean better GPU utilisation and faster prompt ingestion, but the Mali G720’s Vulkan driver has a hard limit on descriptor sets–exceed it and the backend either crashes or silently degrades.
The green bars are stable configurations, the orange ones are not–and the dashed box marks where I landed for production. At -ub 16, prompt speed collapsed because the driver was already struggling; at 64+ it fell over entirely.
The tuning sweep showed where the practical ceiling was rather than the theoretical one:
At -ub 2, the setup was stable but underwhelming: about 4.3 prompt tok/s and 9.7 generation tok/s.
At -ub 4, prompt speed improved to 5.9 tok/s with the same 9.7 generation rate.
At -ub 8, which is where I eventually landed, prompt speed climbed to 8.4 tok/s and generation stayed at 9.7 tok/s.
At -ub 16, the whole thing became temperamental and prompt throughput actually collapsed to around 2.0 tok/s.
At -ub 32, it could survive a test run, but not in a way that inspired confidence.
At 64+, it was simply crashy.
So the practical production setting was not some elegant theoretical optimum–it was simply the highest value that stopped the Vulkan backend from crashing. That, in a sentence, sums up a fair bit of the experience of using this board.
llama.cpp on Vulkan was the best all-round practical setup, but only after patching and tuning.
llama.cpp on CPU was useful as a baseline and for sanity checks, but too slow once model size started to climb.
ik_llama.cpp on CPU turned out to be dramatically better for some 2-bit and sparse-ish workloads than I had expected, to the point where it occasionally made GPU offload look silly.
PowerInfer remained interesting mostly in theory; in practice it was too awkward and too far behind the other options to matter.
GPU offload was not always the right answer. A lot of the marketing gravity around boards like this points you toward the GPU or NPU as the only interesting path, but once you start timing things, the answer is much more conditional.
Qwen3.5 35B-A3B IQ2_XXS was instructive. Under stock llama.cpp, far too slow. Under ik_llama.cpp, dramatically faster on CPU–to the point where it occasionally behaved like a real system rather than a cry for help. But it had a roughly 40% empty-response rate, consumed nearly all RAM and swap, and was slow enough end-to-end that I would only call it “working” in the same tone one might describe a vintage British car that has just completed a short journey without shedding visible parts.
For that model, the runtime comparison was actually rather stark:
Upstream llama.cpp on pure CPU (-ngl 0) managed about 0.63 prompt tok/s, 1.07 generation tok/s and took 76.67s end to end.
Upstream llama.cpp with a token amount of offload (-ngl 8) was, if anything, slightly worse at 80.03s total.
ik_llama.cpp on CPU was the surprise winner by a ridiculous margin: 16.24 prompt tok/s, 5.24 generation tok/s and 12.75s total.
ik_llama.cpp with -ngl 8 promptly ruined that advantage and fell back to a miserable 71.33s total.
That is one of the more useful things I learned here: for some quantized models on this machine, CPU inference with the right runtime was not just competitive with GPU offload, it was much better.
The Liquid models were interesting for a different reason. LFM2 8B-A1B Q4_K_M managed roughly 46.7 tok/s prompt and ~32 tok/s generation on Vulkan–objectively impressive for the active parameter count–and LFM2.5 1.2B pushed generation to around 45 tok/s. On paper, these look like the hidden sweet spot. In practice both failed when pushed through the full agent pipeline: blank output, formatting failures, over-eager obedience to internal conventions. Useful to know, but not deployable.
For reference, the ranking I ended up with:
Qwen3.5 4B Q4_K_M on llama.cpp Vulkan at 9.7 generation tok/s was the only setup that felt production-usable.
Qwen3.5 35B-A3B IQ2_XXS on ik_llama.cpp CPU at roughly 5.3 generation tok/s was the most surprising result–impressive, but too flaky and memory-hungry to trust.
LFM2 8B-A1B Q4_K_M on Vulkan at roughly 32 tok/s generation posted a great benchmark number but broke down in real agent use.
LFM2.5 1.2B Q4_K_M on Vulkan at roughly 45 tok/s generation was quick but not dependable enough to matter.
Qwen3.5 0.8B Q4_K_M on CPU at about 46 tok/s sounds good until you ask it to cope with a full agent prompt.
So yes, the board can run local models. It cannot run all of them well, and a distressing amount of the work lies in sorting out which bits of the stack are broken on any given day, but it was a much better experience than with Rockchip boards, and I intend to try out Gemma 4 and more recent models soon.
While the above was going on, I kept tabs on both thermals and memory, since I expected sustained GPU or inference workloads to need active airflow. But I had to deal with the fan first, since the Orange Pi 6 Plus ships with a pretty beefy cooling solution that is, sadly, very much on the loud side.
And there’s no fan curve–all you get with the CIX kernel is a sysfs interface via cix-ec-fan with three modes:
mute
normal
performance
The first leads to the CPU reaching fairly high temperatures under even moderate load, the last is unbearably loud, and the normal setting ranges from moderately quiet to annoying, so for most of the testing I moved the board to my server closet.
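Since there's no fan curve, the best you can do is pin a mode at boot–e.g. with a small oneshot unit. The sysfs node below is a guess based on the driver name; check /sys for the real path before using this:

```
# /etc/systemd/system/fan-mode.service (sketch)
[Unit]
Description=Pin the EC fan to normal mode at boot

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo normal > /sys/devices/platform/cix-ec-fan/mode'

[Install]
WantedBy=multi-user.target
```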
Again, the CIX P1 has 12 cores, but they are not equal–four low-power Cortex-A520 cores clocked at 1.8GHz and eight faster Cortex-A720 cores spread across four clusters at different peak speeds (2.2 to 2.6GHz). The kernel’s cpufreq subsystem treats each cluster independently, which means that it takes a bit of effort to max out all the cores:
sbc-bench reported no throttling during its run, which was encouraging.
The aggregate 7-Zip score landed around 33k, with the best single A720 core around 3874 and the A520 cluster way behind at about 1617–a nice reminder that workload placement matters on this SoC.
Memory bandwidth on the A720 cores was respectable: libc memcpy in the 15-17 GB/s range, memset often 35-47 GB/s.
The A520 results were dramatically lower across the board.
Memory Bandwidth
An interesting twist I lost some time exploring is that you can actually see some differences per CPU cluster, which is new for me in ARM machines:
Memory bandwidth by CPU cluster
Blue bars are memcpy (read-then-write), red bars are memset (pure write). The A520 cluster is roughly half the bandwidth of the A720s across both. This matters for inference because memory access patterns land on whichever cores the scheduler picks, and a hot path pinned to the efficiency cluster is immediately noticeable.
Thermals
On a quiescent system, sensor readings were good–most blocks hovered in the high twenties to low thirties Celsius:
GPU_AVE: 29°C
NPU: 30°C
CPU_M1: 30°C
CPU_B0: 32°C
PCB_HOT: 33°C
The thermal logs during the benchmarks were more reassuring than I expected:
idle and light-load readings sat mostly around 29-33°C across GPU, NPU and CPU blocks
under the longer benchmark runs, board and package sensors generally rose into the mid-30s to about 40°C range, which is very good (but, as you’d expect, audibly noticeable from outside the closet)
frequency traces showed the active cluster spending long stretches pinned at its target clocks before later dropping back, which looked much more like workload phase changes than panicked throttling
One benchmark artifact I largely ignored was the iozone run, because it was aimed at /tmp and therefore mostly measuring the memory-backed path rather than telling me anything meaningful about persistent storage.
Here’s a new chart that tries to capture thermals and frequency a little better than my old ones:
Thermal and frequency trace during sbc-bench run
The above covers the full sbc-bench session–roughly 40 minutes of mixed workloads.
The three shaded phases correspond to what was running at the time: a short iozone burst (memory-backed, not interesting), the main sbc-bench battery (OpenSSL, 7-Zip single and multi-threaded, tinymembench across all clusters), and the trailing cooldown.
The key thing to notice is that frequency stayed pinned at target clocks throughout the heavy phases and only dropped back during transitions–there was no thermal throttling, which is pretty amazing.
Temperature peaked around 43°C during the sustained multi-threaded 7-Zip run, which is well within spec for a board with active cooling. The idle baseline was around 29°C, and it settled back there fairly quickly once the load came off.
One thing I could not track was fan speed, since the cix-ec-fan interface does not expose current RPM or duty cycle, and I had no way to correlate the thermal curve with what the fan was actually doing at each point. I could hear it spin up and settle, but I have no real data to overlay, and even though I considered setting up a dB meter, I never got around to it.
All of the above covers the first week or so. But I’ve been running this board as an always-on machine since March 8, and by now have a month’s data on what it’s like to live with.
The board now hosts a piclaw instance (my personal assistant) that I’ve been using for development and model testing, since I realized that LFM2-8B-A1B made for a faster model to experiment with (31 t/s generation, 47 t/s prompt on Vulkan) even if it’s effectively not that “smart”.
Alongside the assistant work, I’ve been using the board for a real development project: porting the BasiliskII classic Mac emulator’s JIT to AArch64.
Over the past month that has meant a good deal of compilation, linking, automated experiment runs and testing. The JIT now executes real 68k ROM code with basic optimisations–interrupt delivery and display rendering are the active frontier, but it boots to a Mac OS desktop every now and then. The constant rebuilds around AArch64 JIT bugs I hit (broken optflag inline asm bindings, various register allocation and flag bugs in codegen_arm64.cpp, VM_MAP_32BIT allocation failures, repeated runs at fixing emulated 68k interrupt delivery) were genuine low-level issues that exercised the board’s toolchain and memory subsystem in ways no synthetic benchmark would, and it’s been working great.
One thing that came up in every review of the CIX P1 I read–Jeff Geerling’s Orion O6 writeup being the most prominent–is power draw, and I have a month’s worth of data confirming that it’s higher than average–averaging 15.5W, rather than the 13W I see quoted elsewhere:
Orange Pi 6 Plus wall power over 30 days
The flat zeros on the left are the setup period when I was reflashing and debugging offline. Once it came up as an always-on machine the power draw settled into a consistent daily pattern.
Orange Pi 6 Plus wall power over 7 days
Zooming into the last week at 15-minute resolution, the daily idle/load cycle is clearly visible–overnight the board drops to about 15-16W, and during the day it hovers around 20-27W depending on what I am doing. Compilation and inference bursts push it briefly toward 30W; the rest of the time it sits comfortably in the low twenties.
That said, the idle floor of 15-16W is noticeably higher than what I am used to from other SBCs. A Raspberry Pi 5 idles around 3-4W, an RK3588 board typically settles around 5-8W, and even a Mini PC with an N100 can idle below 10W.
The Orange Pi 6 Plus never really gets below 15W even with nothing running, and that appears to be a common trait of the CIX P1 reference design rather than anything specific to this board–the Radxa Orion O6 (same SoC) shows a very similar baseline in the reports I have seen.
Whether that is down to the memory controller, the 5GbE PHYs, the always-on fan or some combination of all three, I cannot say for certain. But it does mean the board is less attractive as a low-traffic always-on appliance than the raw compute-per-watt numbers might suggest. At 15W idle you are paying about 130 kWh/year just to keep it breathing, which is not terrible but is not nothing either.
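The arithmetic behind that figure is trivial, but worth writing down:

```shell
# Continuous draw in watts to kWh per year (8760 hours in a year).
watts_to_kwh_year() {
  awk -v w="$1" 'BEGIN { printf "%.1f", w * 8760 / 1000 }'
}

watts_to_kwh_year 15    # the board's idle floor → 131.4
```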
Orange Pi 6 Plus current draw over 7 days
I checked, and current draw mirrors the power profile and stays well under 0.2A on the 230V circuit. The board’s power supply is not doing anything exotic.
Mains voltage on the office circuit over 7 days
The voltage trace is mostly here for completeness–Lisbon mains hovering around 230-232V with the usual overnight sag and daytime recovery. Nothing that would stress any reasonable power supply, and useful as a sanity check that the power readings are not being skewed by wild grid swings.
Reboots over the month: essentially none that weren’t my doing. The board has been stable in a way I did not expect from the early boot-chain experience.
After all of this, the Orange Pi 6 Plus fits a fairly specific set of roles:
local inference experiments with carefully chosen models
edge-side telemetry or monitoring
compact Linux services that benefit from dual 5GbE
infrastructure roles where you want something denser and lower-power than x86 but more capable than the usual toy SBC
I wouldn’t use it as a general-purpose desktop, and I wouldn’t trust the NPU story for anything LLM-related without more soak time. But I would keep it around for the sort of edge-AI and systems work I usually get drawn into–enough real capability to justify the effort, even if that effort is, right now, unreasonably high.
Even considering that I cut a lot of corners on the software side to get to a usable state, the hardware is still very much ahead of the software.
The GPU works, the NPU stack exists in some recognisable form, and local AI is not only possible but occasionally good. I like what this board can do, even if the power consumption and fan noise are higher than I would like for a board in this class. And compared to Rockchip’s offerings it’s a much more polished experience–the fact that I can get it to do useful work at all by myself, with my own OS image, is a testament to the progress ARM boards have made in the last couple of years.
The Wii is, indeed, a PowerPC machine, but getting Mac OS X to boot on it still requires a fair amount of kernel hacking—never mind the real life altitude it was actually written at, although it does confirm that flight time can, indeed, be used productively.
This was a shorter work week partly due to the Easter weekend and partly because I book-ended it with a couple of days off in an attempt to restore personal sanity–only to catch a cold and remain stuck at home.
In practice that meant I spent most of it hacking on things I’ve been putting off for weeks, and after we finally made it to the countryside, the weather was nice enough to spend most of my time lounging about with nothing but my phone and the Xteink X4–which is about as close to a holiday as I’m going to get this quarter.
I’m also partway through Patrick McGee’s Apple in China, which is one of those books that makes you want to check the news every few pages to see if anything has changed since the last chapter. The thesis–that Apple’s dependence on Chinese manufacturing is a strategic vulnerability that could be exploited overnight–has already become obvious thanks to last year’s geopolitical upheaval, but McGee’s sourcing and detail make it feel much more concrete than the usual hand-wringing.
I wrote this up separately, but the short version: it’s an absurdly small e-ink reader that costs almost nothing, feels better made than it has any right to, and reminds me of reading e-books on a Palm PDA. No front light, which limits it to daylight and decently lit rooms, but for the price and the form factor I’m not complaining. I’ve been carrying it around all weekend and it’s a genuinely pleasant little device.
I had a slight whoopsie in my Proxmox cluster (lite LVM doesn’t fully isolate volumes, so one of my new machines corrupted another two…), and Piclaw helped me fix everything and restore from backups with nothing but raw API calls to the cluster. That was enough motivation to make things official and add SSH, Proxmox and Portainer tooling, all lazy-loaded to save LLM context.
I now have a fourth instance running on my network that can discover, inspect and manage VMs and containers across the cluster without helper scripts, which is pretty neat. I’ll be rebuilding my GPU sandbox with it in a couple of days, and have it update my Obsidian notes directly.
I spent a good chunk of this weekend working on finishing that review I’ve been going on about for weeks. Since I have a Piclaw instance on it, I asked it to update benchmark data, charts and diagrams remotely, and revise my notes on the entire thing (GRUB, resize-filesystem bugs, three rounds of reboot-fallback patches).
But last week I was using it to work on an AArch64 JIT for BasiliskII, and it’s tantalisingly close to done. The bringup consumed most of my evenings, ranging from byte-order bugs–graphics corruption, palette rendering, JIT dispatch–to a steady stream of fixes to the original JIT source, with every emulated instruction block exposing a new flag-handling bug.
This is the sort of work I would never have been able to do without AI. I do have a background in 68k assembly (and a shelf of ancient books), but things like register mapping and memory alignment fixes would have gone right past me. And yet, neither Codex nor Claude was able to see the big picture–like realising that IRQs were messing with JIT state and needed proper handling:
Piclaw spotting that C function calls from compiled code were clobbering native registers the JIT expected to be preserved
The X flag (extend/carry) was the recurring villain–the models found bugs in how it was copied, masked and converted between the interpreter’s format and the JIT’s native representation. A wrong bit position in DUPLICATE_CARRY, COPY_CARRY not masking to the carry bit, LSL missing from legacy_copy_carry_to_flagx, and a countdown variable type mismatch that broke tick synchronisation. All individually plausible, collectively maddening until you realise it was polluted by both IRQ handling and state tracking whenever you jumped out of the JIT.
Another great thing about AI is effortless documentation: a comprehensive bringup document now lives in the repo tracking the bugs found so far, and it’s easy to trim out the fluff and keep things factual.
Right now the JIT boots to a Mac OS desktop with basic optimisations, but full JIT is still a work in progress. However, even the partial speedups are already a bit beyond what Classic benchmarking tools can handle, which is rather fun:
Even partial JIT speedups are already beyond what Classic benchmarking tools can handle
With any luck, I should get this working reproducibly in another week or so.
I got an Xteink X4 this week, and my first reaction was somewhere between amusement and nostalgia–it is absurdly small, feels a lot better made than I expected for the price, and the form factor harks back to the times when I was reading e-books on Palm PDAs and the original iPod Touch.
The X4 during a weekend outing
Disclaimer: Even though I paid for the device with my own money, this can be considered a review, so I’m linking to my review policy for consistency.
I had been tracking the hype around the X4 for quite a few months, and part of the appeal here is obvious: it is cheap, tiny, and simple in a way that most e-readers stopped being years ago. But one of the interesting parts for me is that it uses an ESP32-C3 and has already attracted a small but very active firmware community, which means that unlike most budget reading devices, this one has a decent chance of getting better after you buy it.
And yeah, I’m a sucker for a new gadget, and this was both cheap and moddable enough to be a no-brainer purchase.
The hardware is exactly the sort of compromise I expected, but with a few pleasant surprises. The body is light, pocketable and thin enough that it feels more like a phone accessory than a “real” reader, and the physical buttons are, at least initially, better than I feared.
The screen is small enough that I don’t think of it as a Kindle replacement–not in the way I thought of the Supernote Nomad as an iPad substitute for a very narrow set of tasks–but more as a sort of digital paperback fragment, something meant for short stretches of reading or carrying around just in case.
There is already an even smaller device, but Xteink decided to remove the USB-C port in favor of a pogo pin connector, and that was more than enough reason for me to pass on it.
That said, the lack of a front light is immediately noticeable, and not in an abstract “spec sheet” way. It changes where and when I can use it, and means I am already mentally classifying it as a daylight or well-lit-room device. That is fine for a toy, or for commuting, but it can be a meaningful constraint if you read in bed or on planes (so far I haven’t really had any issues with it and my bedside table lamp, but this isn’t a deep night reading gadget).
The other immediate hardware tell is that everything about it has been budgeted very carefully–screen size, battery, controls, CPU, software assumptions–and that is both the problem and the charm.
The absence of a touchscreen is, I think, the defining choice here.
On the one hand, I can see the appeal: fewer layers of UI indirection, fewer opportunities for sluggish touch handling, and a somewhat more deliberate feel when all you want to do is page forwards and backwards. On the other hand, every task that is not pure reading becomes a little awkward, and that awkwardness adds up very quickly once you get to Wi-Fi setup, library navigation or anything involving text entry.
This feels a bit like old Palm and iPod territory–perfectly usable once muscle memory kicks in, mildly exasperating until you get used to it.
I have not had it long enough to make sweeping claims, but one thing is already clear: the X4’s real value is not the stock firmware, and the very first thing I did after getting it was flashing CrossPoint Reader on it (and I just updated it to 1.2.0 before posting this).
And yes, the reason this device exists in my head at all is the CrossPoint ecosystem, and the fact that there are already multiple forks with visibly different goals–plain upstream CrossPoint, CJK-focused builds, reading-centric mods, and at least one gloriously odd fork that adds a virtual chicken to the whole thing.
That changes the equation quite a bit. Without that community, the X4 would just be an interestingly cheap, slightly awkward e-reader. With it, the hardware becomes a small platform–limited, yes, but still a platform, and something that I can fool around with myself.
And that matters, because the ESP32-C3 underpinnings imply a level of hackability that most mainstream readers don’t even pretend to offer.
This is the part I still need more time with, but which has been a resounding success over the past three days (although that is certainly due to my long history with tiny screens). With CrossPoint, page turns are snappy, chapter navigation a matter of 2-3 clicks, and the default Bookerly font is pretty much perfect.
Although it feels a bit weird to have gotten another, pocketable screen when I effectively work from home and thus have little need for a “snackable” reading device that is always on my person, I find it more appealing (and purposeful) than digging out the Kindle app on my phone.
My instinct is that the X4 will be best for the sort of fragmented reading I typically do when traveling or in short breaks throughout the day–in the Instapaper days, I would take my iPod and read articles converted and pushed over in batches–rather than long, immersive reading sessions. The size almost guarantees that. It is not trying to disappear the way a larger reader does; it is trying to always be there.
Whether that is enough depends entirely on friction, and with fast page turns, Wi-Fi support and an OPDS client in CrossPoint, that seems quite well in hand.
The missing piece of the puzzle was getting books on it, and even though CrossPoint provides a nice on-device web server to manually upload files to it (which is pretty amazing for an ESP32), I decided to flip the issue around and hack together a very quick Bun OPDS server that works beautifully with my Calibre setup and the X4, letting me browse all my libraries and download books to it without any manual file management at all.
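OPDS is just Atom XML, which is part of why even a tiny server can drive an e-reader client. As a rough illustration of the feed shape (my actual server is written for Bun, and the titles and paths here are invented), a minimal acquisition feed can be built with nothing but the standard library:

```python
# A minimal sketch of an OPDS acquisition feed (Atom XML). The book data
# here is made up; a real server would pull it from Calibre's library.
from xml.etree.ElementTree import Element, SubElement, tostring

def opds_feed(books):
    feed = Element("feed", xmlns="http://www.w3.org/2005/Atom")
    SubElement(feed, "title").text = "Library"
    for book in books:
        entry = SubElement(feed, "entry")
        SubElement(entry, "title").text = book["title"]
        # The acquisition link is what tells the client it can download the file
        SubElement(entry, "link", {
            "rel": "http://opds-spec.org/acquisition",
            "type": "application/epub+zip",
            "href": book["href"],
        })
    return tostring(feed, encoding="utf-8")
```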
That was a fun little project, and it is already making the X4 feel much more like a “real” reader, to the point where I’ve already started modernizing my ancient Instapaper pipeline in favor of something “better”.
In fact, this might be the push I needed to move away completely from the Kindle ecosystem–I have long preferred to get DRM-free EPUBs, and I can get my Nomad to use OPDS via KOReader…
I keep thinking back to the Nomad, partly because both devices are trying to sell focus through constraints, but they go about it in very different ways.
The Nomad tries to be a deliberate, paper-adjacent environment for writing and reading–and often succeeds, even if syncing and workflow integration kept getting in the way (at least until I found the right combination of SyncThing and Obsidian plugins). The X4 feels less ambitious and, paradoxically, more interesting because of that. It is not trying to be a notebook, or a paper surrogate, or a productivity system. It is just a tiny e-reader with enough open firmware momentum to become something slightly stranger.
I think that honesty may end up working in its favour.
Since this is meant to be pocketable and I always end up stuffing my jacket pockets, I printed a hard case for it to protect the screen and buttons, and am quite happy with the results.
What I want to do next is fairly straightforward:
Spend a few more days on CrossPoint 1.2.0 and hack away at a moderately sane content pipeline that forces me to read interesting articles on it rather than bookmarking them into a black hole of oblivion.
Compare that against either the Enhanced Reading Mod or CrossPet–the former because it sounds sensible, the latter because it sounds gloriously unserious.
Hack away a bit more at that OPDS server to see what I can do about syncing reading progress (I’m the kind of person who never used bookmarks because I had zero issues memorizing page numbers, but I do like the convenience of Kindle’s Whispersync).
Figure out over a month or so whether this thing fits my actual reading habits, or merely appeals to my taste in tiny hackable hardware.
And, if all else fails, turn it into a mini TRMNL–the software for that already exists…
Right now, I think the Xteink X4 is more interesting gadget than a fully fleshed out product–but that is not necessarily a criticism. Some of the best gadgets I’ve owned started out exactly that way.
Mar 31st 2026 · 1 min read
·
#arena
#concert
#meo
#music
#photo
#photography
Work ate the week again. I’m exhausted, running on fumes, and daylight saving time stole an hour of sleep I could not afford–the biannual clock shuffle is one of those vestigial absurdities that nobody can be bothered to abolish, and I’m starting to take it personally.
I did manage to get my AI minions to do a proper refactor of the piclaw codebase (which was desperately needed), spent a bit of time cursing at the SK1 (and concluding that I need a new 3D printer), and that’s about it. Meetings, deadlines, the usual corporate grind.
This was the week of the great piclaw codebase reckoning–299 commits, all of them aimed at breaking apart the monolithic mess it was becoming before it got completely out of hand.
Agentic development works, but, again, you need taste and a forced feedback loop to get good results–still, you can automate away most of the boring parts.
The whole thing was driven by an autoresearch-style loop–I basically adapted Karpathy’s approach of having an LLM do research on a codebase, generate a plan, execute it, and then verify the results, except in my case the “research” phase also involves running the test suite and feeding failures back as context. It works very well for mechanical refactoring like this, where the risk of hallucination is low and the reward for not doing it by hand is immense:
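The loop itself is almost embarrassingly simple to sketch. None of this is piclaw’s (or Karpathy’s) actual code–the function names are mine and the LLM and test runner are stubbed out as plain callables–but it captures the shape of the feedback loop:

```python
# Sketch of the research -> plan -> execute -> verify loop described above.
# The key detail is that test failures are appended to the context, so the
# next iteration can correct course instead of repeating the same mistake.
def refactor_loop(llm, run_tests, codebase, goal, max_iters=5):
    context = f"Goal: {goal}\n"
    for _ in range(max_iters):
        plan = llm(f"{context}Research the codebase and produce a plan.", codebase)
        codebase = llm(f"Execute this plan:\n{plan}", codebase)
        ok, failures = run_tests(codebase)
        if ok:
            return codebase
        context += f"Previous attempt failed tests:\n{failures}\n"  # feedback
    raise RuntimeError("refactor did not converge")
```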
I still haven’t written up those SBC benchmarks I keep promising, but I have been using the SBC extensively–in fact, I dusted off my macemu fork (BasiliskII/SheepShaver with Raspberry Pi-optimized SDL2 builds, Docker images, and .deb packages) and got that running on it.
It’s been sitting at v1.0.5-patch-3 since February, but Marchintosh guilt finally got me to fire it up and poke at it. No new Mac Classic replica yet, but at least the emulator is working, and I am back trying to get an ARM64 JIT working in it, which is a fun challenge:
Orange Pi 6 Plus JIT status report from piclaw
I expect to have something to show on that front… this year?
Update: An hour later… Works, but still needs a fair amount of cleanup, and it’s actually slower than the interpreter for now, which is a bit embarrassing:
JIT booting to desktop
Mar 27th 2026 · 1 min read
·
#apple
#design
#hacks
#macos
#ui
This is absolutely hilarious. The infuriating window corner roundness in Tahoe has been bugging me too–and this is a brilliant take on the problem.
Instead of disabling SIP and patching system apps to remove the rounded corners (which is the usual approach), this simply forces a consistent corner radius across all third-party apps via a DYLD-injected dynamic library.
It’s a small thing, but inconsistency in UI chrome is the kind of detail that, once you notice it, you can never un-notice. The fact that Safari has different corner radii from other apps is inexcusable–and that’s before the Liquid Glass disaster made everything look like a Fisher-Price toy dipped in vaseline. I appreciate the “if you can’t beat them, at least make them all equally ugly” philosophy here.
The implementation is old-timey, straightforward Objective-C method swizzling on NSThemeFrame–nothing exotic, but the approach of skipping com.apple.* bundles and only touching third-party apps means you don’t need to mess with SIP at all. That alone makes it worth bookmarking.
Mar 25th 2026 · 1 min read
·
#ai
#arm
#chip
#cpu
#hardware
#inference
The fact that ARM, whose entire business model has revolved around licensing CPU designs, has decided to actually go and build its own chips is remarkable in itself, but the design specs (and power envelope) are very interesting.
I have been keeping tabs on the dedicated inference hardware space ever since I got wind of Cerebras, and I like the idea of special purpose/optimized CPU designs that would remove (or at least lessen) our dependency on NVIDIA (and GPUs in general) to run AI models, because that is the way to make it cheaper, less power hungry and, eventually, desktop-sized.
I do find it stupid to refer to this as an AGI CPU, though.
Mar 22nd 2026 · 2 min read
·
#agents
#ai
#balance
#bun
#dev
#life
#notes
#piclaw
#typescript
#weekly
#windows
#work
This week’s update is going to be short, largely because work was hell and I ended up spending my Saturday evening poring over my meeting notes backlog until 2AM today, and I have a splitting headache to show for it.
There will be a smattering of Wiki updates on the RSS feed as well, since I realized I had a bunch of updated pages sitting uncommitted in the queue that I hadn’t gotten around to pushing yet, but the only relevant updates I have right now are that piclaw got a major package layout overhaul and a new live widget system, while vibes got some shared UI improvements and a new agent avatar caching system.
And, most importantly, I have shifted from building the tooling to using the tooling, which is long overdue.
Here’s a good example of that:
Smith and Gates, happily coexisting in perfect harmony
Yes, that is a Linux agent and a Windows agent setting up Bluetooth passthrough on a VM. And yes, “Smith” manages my Proxmox cluster, as well as many other things now.
And yes, piclaw sort of runs on Windows now (in bare/non-sandboxed/YOLO mode), thanks to Bun’s incredibly good cross-platform support. I gave it a VM to break and a PowerShell extension to play with, and although I’m still deeply sorry I did it, it works well enough that I don’t need to care about the OS.
I now have four or five semi-permanent installs running on various machines (and a couple of Azure and Oracle VMs), and having them all accessible from my phone has completely changed how I deal with remote management–I can check on agents, trigger actions, and do light debugging from anywhere, which beats the old “SSH in and hope for the best” approach by a wide margin.
And soon, they will talk to each other. And who knows, I might do a desktop app at some point, but for now the web interface is doing just fine.
I never did get around to writing up those SBC benchmarks I mentioned last week. The hardware is sitting on my desk, the notes are in my Obsidian vault and I have a piclaw instance that is happy as a clam trying to get various local models running on it, but, again, work was just too much this week and I plan to spend the afternoon watching a movie and doing some light reading instead.
Mar 15th 2026 · 3 min read
·
#agents
#ai
#dev
#mcp
#notes
#typescript
#weekly
Well, there went another work week. Slightly better (to a degree, although I got some discouraging news regarding a potential change), and another week where piclaw ate most of my evenings–it went from v1.3.0 to v1.3.16 in seven days, which is frankly absurd even by my standards.
But there was a lot of ground to cover, and it’s turned into a sort of “agentic IDE” thing at this point, terminal and all:
piclaw
Yes, it looks like VS Code. But I suspect everything does at this point
Most of the week went into reliability work. I spent a day or so doing a full refactor, and then got three different LLMs to do a comprehensive full-stack audit of the turn/queue/steering system–which turned up a bunch of race conditions in message submission that I’d been chasing for weeks (plus proper queue deduplication and retry caps, which I should have added from the start). The deferred followup system I was using for steering was also broken in subtle ways–turns were inheriting stale thread roots, which caused all sorts of weirdness. The fun bits were the visual polish and theme support, but those came after the plumbing was solid.
On the UX side, I added Adaptive Cards rendering to the chat timeline (with validation and lifecycle flows)–the idea being that agents can now push structured forms, tables, and interactive elements into the conversation instead of just Markdown. The workspace editor got markdown attachment previews and a Ghostty-based terminal lifted directly from webterm, plus a bunch of pipework for future multi-chat support (which is going to be a whole thing).
All of it involved, as you would expect, a lot of agent herding, and I had plenty of time to come up with stuff like this:
I couldn't help myself
And I cleaned up all the JS vendoring, since like pi itself, piclaw is now an extension host, so most of the new features are actually implemented as extensions.
python-office-mcp-server
I dusted off a Python-based Office document MCP server I’d been building alongside go-ooxml and carved it into its own repository. It’s a pretty comprehensive set of tools for reading, editing, and generating Word/Excel/PowerPoint documents via MCP–unified interface, template support, auditing, the works. It’s stable enough that I decided to publish it as-is, with zero ongoing support commitment. Sometimes the best thing you can do for a side project is to ship it and walk away.
As to this site, I did another big batch of old page conversions–around 200 wiki pages from the hardware and apps sections got modernized from Textile to Markdown with proper frontmatter, descriptive image filenames, and cleaned-up links. Some of these pages date back to 2002, which is always a weird trip.
What I should be doing, though, is celebrating Marchintosh and building a new Mac Classic replica–but all I’ve done hardware-wise has been benchmarking SBCs for a potential new project. I hope to clean up my notes and post something about that next week.
Mar 15th 2026 · 1 min read
·
#a18
#apple
#hardware
#mac
I went to a local mall yesterday and chanced upon a couple of MacBook Neos on display at our local (monopolistic) Apple retailer1, and spent a half hour playing with them.
The display is… fine? With their strategic placement away from the other laptops on display, side-by-side comparison was impossible, but I didn’t find it bad or dim. The keyboard also felt OK, although I am wary of white keyboards, as they tend to get dirty and look bad over time. The trackpad was… also fine. I don’t press to click, so comparing the physical click with the taptic engine doesn’t make sense to me.
However, the bezels were a bit thicker than I am used to (hence noticeable) and I do think that the overall size could be smaller (even if that would be clearly against mainstream taste). It’s not something I would choose for a cheap travel laptop (especially considering I have the Minibook X), but volume-wise, and in “backpack scale”, it’s not that far off from the 13” MacBook Air if you squint.
To my surprise, it ran this site’s 3D visualization quite snappily, which is no small feat (most PC laptops struggle with it, regardless of what browser and GPU they have). And although I didn’t have the chance to run any benchmarks, RAM usage in Activity Monitor was pretty much OK after launching a gaggle of Apple apps, which wasn’t surprising (their software load doesn’t include any of the Electron bloat everyone has to deal with to some degree).
In general, I think it makes a killer Chromebook/PC laptop replacement for school, and although I expect the A18 to not be a powerhouse, it felt quite snappy, even with multiple apps open. And I’m sure Apple went to some pains to position it where it wouldn’t threaten either the Air or the iPad Pro.
I’d buy one in a flash if it was smaller (well, if I had money to spare), but at least the A18 performance makes me optimistic for whatever they decide to put into the next iPad mini, since my 5th generation one is in desperate need of an upgrade.
No, Portugal still does not have official Apple Stores. I’ve given up on it and just order online. ↩︎
Mar 8th 2026 · 6 min read
·
#agents
#ai
#coding
#dev
#editors
#security
We’re three months into 2026, and coding agents have been a big part of my time since last year–things have definitely intensified, and one of my predictions has already panned out: agents are everywhere.
This was a frankly absurd week work-wise, with some pretty long days and a lot of late-night hacking on my projects (which is not exactly a new thing, but at least now I am asking piclaw to do it during the day time, which is a small improvement).
This is just lovely. If, like me, you grew up with the LEGO Space collection and loved the artwork on those pieces, and do 3D printing, this 10:1 scale recreation with a Mac mini and a 7 inch display will make your day.
I’m just a bit sad that the cabling is still very visible, but you can grab the files from Makerworld and give them your own spin.
Mar 4th 2026 · 1 min read
·
#a18
#apple
#hardware
#mac
I know a bunch of people will disagree, but this is the most relevant Mac announcement in years for two reasons:
It’s the first new Mac model in a while that isn’t just a spec bump, but rather a new product line with a clear target audience and a pretty aggressive price point (at Apple standards, that is).
It’s not running on an M-series chip, which is a bold move that could have significant implications for Apple’s product strategy and the broader Mac ecosystem.
The fact that it has “only” 8GB of RAM and 256GB of storage (which is OK if you think of it as a school machine) is going to be widely maligned, whereas I would focus on the missed opportunity to make it even more portable by shipping a 12” display instead of 13” (probably some sort of golden ratio thing) and the unbelievable stinginess of shipping with a USB-C 2.0 port.
What? You couldn’t afford a USB-C 3.0 port? Really? I mean, I get that this is an entry-level machine, but come on, Apple.
Update: this seems to be a limitation of the A18 chipset’s I/O setup, from what I’m reading. There’s a lot of chip information out there now, including breakdowns of the new M5 lineup that are worth perusing as well.
That said, I would swap my iPad Pro for it in a flash (if it had a 12” display, that is). And that is probably exactly why it is that big.
Mar 1st 2026 · 5 min read
·
#agents
#ai
#dev
#golang
#notes
#security
#weekly
This is a great round-up, and it isn’t hard to spot the main themes–great hardware, and absolutely damning feedback on software quality in so many respects (from the Liquid Glass Tsunami to people outright avoiding installing Tahoe) that I cannot help but agree (especially considering my current travails).
The best possible outcome from this is that Apple backtracks on the mess they created last year.
The most likely one is that they will simply carry on without acknowledging any of it publicly and discreetly patch the most critical issues, because they are still making tons of cash on hardware and services and software quality really hasn’t been a priority in half a decade.
At this point, I am even starting to question if they still have the talent (or the ability to retain it), especially considering that the people from most startups they’ve acquired over the years keep leaving. And I know for a fact that they stopped recruiting remotely a few years ago, which definitely hasn’t helped.
Feb 21st 2026 · 2 min read
·
#agents
#ai
#automation
#home
#notes
#siri
#weekly
#wellness
This week I did something different: I took a wellness break from work and generally tried to tune out all the noise and messiness I have been experiencing there. It ate a chunk out of my PTO, but was mostly worth it.
I have no idea what is happening, since I can’t even find any decent logs in Console.app, but it seems the latest update to macOS Tahoe (26.3) has a serious bug.
So, I finally got around to opening the Maclock I’ve had sitting around for a while, and I was looking at the STL files for the screen bracket that are over on Printables.
Since I want to fit a lot more electronics into the case, one of the things I need to do is modify that bracket a bit so I can actually use the brightness knob to drive a potentiometer instead of just a dummy knob.
But you can’t really edit STL files, and I don’t have the original CAD files, so I thought “well, maybe I can just ask an LLM to reverse-engineer the STL into an OpenSCAD file, and then I can modify that”.
So I created a SPEC.md file with the following content, spun up a new agentbox container to handle it, and gave it to pi:
# STL to OpenSCAD reconstruction
You are running inside a Debian system with sudo, brew, python3 and whatever you need to accomplish this task.
Your job is to take the STL file in this workspace and reconstruct a copy of it using OpenSCAD.
If the STL is too large or complex, you can try decimating it/using octrees/segmenting it/using marching cubes as needed.
## Requirements

- The final STL file must be within 0.1mm accuracy of the original
- The OpenSCAD code must be modular, readable, and have relevant constants surfaced at the beginning for a customizer
- You should focus on building reusable tools, not one-off scripts
- You should maintain a plan.md file with your progress checklist and tick off items (including failed approaches) as you make progress
Much to my surprise, Claude Opus 4.6 was able to handle this task pretty well, and it produced a stupefyingly accurate OpenSCAD file:
// WonderMac Screen Bracket - OpenSCAD Reconstruction
// Reconstructed from STL analysis
// All dimensions in mm

/* [Main Dimensions] */
// Total width of the bracket (X direction)
bracket_width = 68;        // X: -50 to 18
// X offset (left edge position)
bracket_x_offset = -50;

/* [Back Plate] */
// Height of the tall vertical back plate
back_plate_height = 21.5;  // Z: 0 to 21.5
// Thickness of the back plate (Y direction)
back_plate_thickness = 2;  // Y: -16.5 to -14.5
// Y position of back plate inner edge
back_plate_y = -16.5;

/* [Top Shelf] */
// Bridge depth (connects back plate to top lip)
bridge_depth = 5.5;        // Y: -14.5 to -9
// Bridge height
bridge_height = 2;         // Z: 0 to 2
// Top lip depth
top_lip_depth = 2;         // Y: -9 to -7
// Top lip height
top_lip_height = 5;        // Z: 0 to 5

/* [Frame] */
// Frame outer Y bounds
frame_outer_y_min = -66.5;
frame_outer_y_max = -14.5; // matches back plate
// Frame inner bounds (lower section Z=0-7)
frame_inner_x_min = -47;
frame_inner_x_max_lower = 18;    // open right side
frame_inner_y_min_lower = -64.5;
frame_inner_y_max_lower = -16.5;
// Frame inner bounds (upper section Z=7-9)
frame_inner_x_max_upper = 13;
frame_inner_y_min_upper = -62.5;
frame_inner_y_max_upper = -17.5;
// Frame heights
frame_lower_height = 7;    // Z: 0 to 7
frame_upper_height = 2;    // Z: 7 to 9

/* [Bottom Lip] */
// Lip extension below frame
lip_y_min = -70.5;
lip_y_max = -66.5;
lip_z_min = 3;
lip_z_max = 5;
lip_height = 2;            // lip_z_max - lip_z_min

/* [Extended Plate] */
// L-shaped plate at Z=5-7
plate_z_min = 5;
plate_z_max = 7;
plate_height = 2;
// Bottom part of L (full width)
plate_bottom_y_min = -9;
plate_bottom_y_max = -3;
// Left part of L (extends upward)
plate_left_y_max = 15;
plate_left_x_max = -30;

/* [Screw Post] */
// Cylindrical post with through-hole
post_center_x = -41;
post_center_y = 10.5;
post_outer_radius = 2;
post_inner_radius = 1;
post_z_min = 0.5;
post_z_max = 5;            // meets plate bottom

/* [Screw Holes] */
// Hole radius for mounting holes
mount_hole_radius = 0.75;
// Lip mounting holes
lip_hole_left_x = -47.75;
lip_hole_right_x = 15.75;
lip_hole_y = -69;
// Plate mounting holes
plate_hole_left_x = -46;
plate_hole_right_x = 15;
plate_hole_y = -5;

/* [Resolution] */
$fn = 64;

// ============================================
// Modules
// ============================================

module back_plate() {
    translate([bracket_x_offset, back_plate_y, 0])
        cube([bracket_width, back_plate_thickness, back_plate_height]);
}

module bridge() {
    translate([bracket_x_offset, back_plate_y + back_plate_thickness, 0])
        cube([bracket_width, bridge_depth, bridge_height]);
}

module top_lip() {
    translate([bracket_x_offset, -9, 0])
        cube([bracket_width, top_lip_depth, top_lip_height]);
}

module frame_lower() {
    difference() {
        // Outer rectangle
        translate([bracket_x_offset, frame_outer_y_min, 0])
            cube([bracket_width, frame_outer_y_max - frame_outer_y_min, frame_lower_height]);
        // Inner cutout
        translate([frame_inner_x_min, frame_inner_y_min_lower, -1])
            cube([frame_inner_x_max_lower - frame_inner_x_min,
                  frame_inner_y_max_lower - frame_inner_y_min_lower,
                  frame_lower_height + 2]);
    }
}

module frame_upper() {
    difference() {
        // Outer rectangle (same as lower)
        translate([bracket_x_offset, frame_outer_y_min, frame_lower_height])
            cube([bracket_width, frame_outer_y_max - frame_outer_y_min, frame_upper_height]);
        // Inner cutout (smaller = thicker walls)
        translate([frame_inner_x_min, frame_inner_y_min_upper, frame_lower_height - 1])
            cube([frame_inner_x_max_upper - frame_inner_x_min,
                  frame_inner_y_max_upper - frame_inner_y_min_upper,
                  frame_upper_height + 2]);
    }
}

module bottom_lip() {
    translate([bracket_x_offset, lip_y_min, lip_z_min])
        cube([bracket_width, lip_y_max - lip_y_min, lip_height]);
}

module extended_plate() {
    // Bottom part of L (full width, Y=-9 to -3)
    translate([bracket_x_offset, plate_bottom_y_min, plate_z_min])
        cube([bracket_width, plate_bottom_y_max - plate_bottom_y_min, plate_height]);
    // Left part of L (Y=-3 to 15, X=-50 to -30)
    translate([bracket_x_offset, plate_bottom_y_max, plate_z_min])
        cube([plate_left_x_max - bracket_x_offset,
              plate_left_y_max - plate_bottom_y_max,
              plate_height]);
}

module screw_post() {
    translate([post_center_x, post_center_y, post_z_min])
        cylinder(r=post_outer_radius, h=post_z_max - post_z_min);
}

module screw_post_hole() {
    translate([post_center_x, post_center_y, -1])
        cylinder(r=post_inner_radius, h=back_plate_height + 2);
}

module lip_holes() {
    // Left lip hole
    translate([lip_hole_left_x, lip_hole_y, lip_z_min - 1])
        cylinder(r=mount_hole_radius, h=lip_height + 2);
    // Right lip hole
    translate([lip_hole_right_x, lip_hole_y, lip_z_min - 1])
        cylinder(r=mount_hole_radius, h=lip_height + 2);
}

module plate_holes() {
    // Left plate hole
    translate([plate_hole_left_x, plate_hole_y, plate_z_min - 1])
        cylinder(r=mount_hole_radius, h=plate_height + 2);
    // Right plate hole
    translate([plate_hole_right_x, plate_hole_y, plate_z_min - 1])
        cylinder(r=mount_hole_radius, h=plate_height + 2);
}

// ============================================
// Assembly
// ============================================

module bracket_assembly() {
    difference() {
        union() {
            back_plate();
            bridge();
            top_lip();
            frame_lower();
            frame_upper();
            bottom_lip();
            extended_plate();
            screw_post();
        }
        // Subtract all holes
        screw_post_hole();
        lip_holes();
        plate_holes();
    }
}

bracket_assembly();
This is what the output looks like
But what is more important is that I was able to capture the entire process in a SKILL.md file, and it makes for an amazing read:
# SKILL: STL to Parametric OpenSCAD Reconstruction

## Goal

Reverse-engineer a binary/ASCII STL mesh file into a clean, parametric OpenSCAD source file that reproduces the original geometry within a specified tolerance (e.g. 0.1mm Hausdorff distance).

## When to Use

- You have an STL file of a mechanical part and need an editable parametric source
- The part is primarily composed of prismatic (box-like) and cylindrical features — not organic/sculpted shapes
- You need the output to be human-readable and customizable, not just a mesh re-export

## Prerequisites

- **Python packages**: `numpy`, `trimesh`, `scipy`, `shapely`, `networkx`, `rtree`, `numpy-stl`
- **System packages**: `openscad`
- Install with: `pip3 install numpy trimesh scipy shapely networkx rtree numpy-stl` and `sudo apt-get install openscad`

## High-Level Approach

### Phase 1: Mesh Triage
Load the STL with `trimesh` and gather key statistics to understand the scope:
- **Vertex/face count**: Determines complexity. Under ~5k faces is likely a machined/printed part with clean geometry.
- **Bounding box and extents**: Gives the overall dimensions.
- **Volume and watertightness**: Confirms the mesh is valid and closed.
- **Euler number**: Computes genus (number of through-holes). Formula: `genus = (2 - euler_number) / 2`. This tells you how many holes to find.
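The topology bookkeeping behind that last check is simple enough to sketch in pure Python (trimesh exposes this as `euler_number`; this only shows where the genus formula comes from, and the function name is mine):

```python
# For a closed triangle mesh, Euler number = V - E + F, and
# genus (number of through-holes) = (2 - euler) / 2.
def euler_and_genus(faces):
    verts = {v for face in faces for v in face}
    edges = {frozenset(pair) for face in faces
             for pair in ((face[0], face[1]), (face[1], face[2]), (face[2], face[0]))}
    euler = len(verts) - len(edges) + len(faces)
    return euler, (2 - euler) // 2

# A tetrahedron is the smallest closed mesh: V=4, E=6, F=4 -> euler 2, genus 0.
```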
### Phase 2: Identify Z-Level Structure
For prismatic parts (brackets, enclosures, mounts), the geometry is almost always built from features extruded along one principal axis. Identify which axis that is by examining the unique coordinate values of vertices.
1. **Find unique vertex coordinates** along each axis (rounded to ~3 decimal places). The axis with the fewest unique values is the extrusion/stacking axis.
2. **List the discrete levels** on that axis. For this bracket, Z had only 8 unique values: `[0, 0.5, 2, 3, 5, 7, 9, 21.5]`. Each pair of adjacent levels defines a "layer" of constant cross-section.
3. **Count up-facing and down-facing face areas** at each level. Up-facing faces at a Z-level mark the *top* of a feature; down-facing faces mark the *bottom* of a feature starting at that height. The area values serve as checksums for your reconstruction.
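The axis-picking heuristic in the first step can be sketched in a few lines (the function name is mine, not from the actual tooling):

```python
# Round each vertex coordinate, count unique values per axis, and pick the
# axis with the fewest -- that is the extrusion/stacking axis.
def stacking_axis(vertices, decimals=3):
    counts = [len({round(v[axis], decimals) for v in vertices})
              for axis in range(3)]
    return counts.index(min(counts)), counts
```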
### Phase 3: Cross-Section Analysis
Take planar cross-sections at the midpoint of each layer using `trimesh.section()`:
1. **Slice the mesh** at each intermediate Z value (e.g. Z=0.25, Z=1, Z=2.5, etc.).
2. **Convert to 2D polygons** via `section.to_planar()` and examine the `polygons_full` property.
3. **Simplify polygons** with Shapely's `.simplify()` to reduce curved arcs to key vertices while preserving corners.
4. **Transform back to world coordinates** using the planar transform matrix to get actual XY positions.
5. **Record each polygon's exterior and interior (hole) boundaries**. Note how many vertices remain after simplification — a 5-point polygon is a rectangle, a 9-point polygon is an L-shape, a 17-point polygon is a circle approximation, etc.
Track how the cross-section *changes* between layers — this reveals where features start, end, merge, or split.
### Phase 4: Identify Geometric Primitives
From the cross-section data, decompose the shape into CSG primitives:
- **Rectangles** (5 simplified vertices = box cross-section): Record corner coordinates, extrusion height range.
- **L-shapes / U-shapes** (9+ vertices): Decompose into a union of rectangles, or model as rectangle-minus-rectangle.
- **Circles / arcs** (17+ vertices after simplification): Compute the center as the midpoint of the extremes, the radius as half the span. Verify by checking vertex distances from the computed center — they should all equal the radius.
- **Rings/annuli** (polygon with circular hole): Outer and inner radius from the exterior and interior boundaries.
For each primitive, determine:
- XY bounds or center+radius
- Z range (which layers it spans)
- Whether it's additive (part of the union) or subtractive (a hole to difference out)
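The circle check described above is plain `numpy`. A minimal sketch, with a synthetic 16-gon standing in for a simplified cross-section polygon (the center and radius are made-up example values):

```python
import numpy as np

def fit_circle(points, tol=1e-3):
    """Estimate center/radius as midpoint-of-extremes / half-span,
    then verify every vertex sits at that radius.
    Returns (center, radius) or None if the points are not circular."""
    center = (points.min(axis=0) + points.max(axis=0)) / 2
    radius = (points.max(axis=0) - points.min(axis=0)).max() / 2
    dists = np.linalg.norm(points - center, axis=1)
    if np.abs(dists - radius).max() > tol:
        return None   # not a circle (e.g. an L-shape outline)
    return center, radius

# A regular 16-gon approximating a circle of radius 2 centered at (10, 5).
ang = np.linspace(0, 2 * np.pi, 16, endpoint=False)
pts = np.column_stack([10 + 2 * np.cos(ang), 5 + 2 * np.sin(ang)])
print(fit_circle(pts, tol=1e-6))
```

Note that midpoint-of-extremes is only exact when the polygon samples the axis-aligned extreme points; for arbitrary simplified arcs a slightly looser tolerance is needed.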
### Phase 5: Cross-Validate with Vertex Grouping
For extra confidence, directly examine the raw vertices at each Z-level:
- Group vertices by their Z coordinate.
- For levels with few vertices (≤20), print them all — these directly reveal rectangle corners.
- For levels with many vertices, look for clusters. Compute distances from suspected circle centers and verify constant radius.
- Check that circle parameters (center, radius) are consistent across multiple Z-levels.
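The grouping itself is a one-liner per level in `numpy`. A sketch, using a made-up eight-corner plate as the vertex set (real meshes would also include curved-feature vertices at each level):

```python
import numpy as np

def corners_by_level(vertices, decimals=3, max_print=20):
    """Group vertices by rounded Z; return sparse levels verbatim,
    since small vertex groups directly reveal rectangle corners."""
    z = np.round(vertices[:, 2], decimals)
    out = {}
    for level in np.unique(z):
        group = vertices[z == level]
        if len(group) <= max_print:
            out[float(level)] = group[:, :2]   # XY corners at this level
    return out

# Eight corners of a hypothetical 40×25 plate between Z=0 and Z=2.
plate = np.array([[x, y, z] for z in (0.0, 2.0)
                  for x in (0.0, 40.0) for y in (0.0, 25.0)])
for level, xy in corners_by_level(plate).items():
    print(level, xy.tolist())
```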
### Phase 6: Build the OpenSCAD Model
Structure the `.scad` file for readability and customization:
1. **Constants at the top** in OpenSCAD Customizer sections (`/* [Section Name] */`). Every dimension gets a named variable with a comment showing its physical meaning and original coordinate range.
2. **One module per feature**: `back_plate()`, `frame_lower()`, `screw_post()`, `lip_holes()`, etc. Each module is self-contained and uses only the global constants.
3. **Assembly module**: A single `bracket_assembly()` module that `union()`s all additive features, then `difference()`s all holes. This keeps the boolean logic clean and makes it easy to toggle features.
4. **Resolution control**: A single `$fn` parameter controls circle smoothness globally.
Modeling patterns:
- **Rectangular frame**: `difference()` of outer `cube()` minus inner `cube()`.
- **L-shaped plate**: `union()` of two overlapping `cube()` calls.
- **Through-hole**: `cylinder()` with height extending past the material (add 1mm on each side with `-1` offset and `+2` height to ensure clean boolean cuts).
- **Ring/post**: `cylinder()` for the outer, with a through `cylinder()` subtracted.
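Put together, the structure and patterns above look roughly like this in `.scad` form. All dimensions here are hypothetical placeholders, not the bracket's real values:

```openscad
/* [Plate] */
plate_w = 40;   // hypothetical dimensions, for illustration only
plate_d = 25;
plate_t = 2;
hole_r  = 2;
$fn = 64;       // global circle resolution

module back_plate() {
    cube([plate_w, plate_d, plate_t]);
}

module mount_hole(x, y) {
    // Extend 1mm past each face so the difference() cuts cleanly.
    translate([x, y, -1]) cylinder(r = hole_r, h = plate_t + 2);
}

module bracket_assembly() {
    difference() {
        union() { back_plate(); }
        mount_hole(5, 5);
        mount_hole(plate_w - 5, plate_d - 5);
    }
}

bracket_assembly();
```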
### Phase 7: Render and Compare

1. **Render** with `openscad -o output.stl model.scad`.
2. **Compare** using a reusable Python comparison tool that computes:
   - **Bidirectional surface distance**: Sample 50k points on each surface, find the nearest point on the other surface using `trimesh.nearest.on_surface()`. Report mean, max, 95th/99th percentile.
   - **Volume difference**: Compare `mesh.volume` values.
   - **Bounds match**: Check bounding boxes agree within tolerance.
   - **Topology match**: Compare Euler numbers.
3. **Iterate** if the Hausdorff distance exceeds the tolerance. Common fixes:
   - Wrong dimension by a small amount → re-examine vertex coordinates at that Z-level
   - Missing feature → look at the worst-mismatch sample points to locate the problem area
   - Circle approximation error → increase `$fn`

### Phase 8: Verify the Accuracy Target
The final gate is the bidirectional Hausdorff distance. For this task the target was 0.1mm; the achieved result was 0.004mm (25× better than required). The residual error comes entirely from polygon approximation of circles (`$fn=64` gives a theoretical max error of `r × (1 - cos(π/64))` ≈ 0.0024mm for r=2mm).
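The quoted residual-error formula is easy to sanity-check numerically; a quick sketch of that arithmetic for the stated r=2mm, `$fn=64` case:

```python
import math

def chord_error(radius, fn):
    """Max radial deviation of an inscribed regular fn-gon from the
    true circle: r * (1 - cos(pi / fn))."""
    return radius * (1 - math.cos(math.pi / fn))

# r = 2mm hole rendered with $fn = 64, as in the text.
print(round(chord_error(2.0, 64), 4))   # → 0.0024
```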
## Key Lessons

1. **Z-level analysis is the critical insight for prismatic parts.** If the mesh has only a handful of unique Z values, the part is a stack of extruded profiles and can be exactly decomposed.
2. **Cross-sections + simplification finds the primitives fast.** Shapely's `simplify()` with a small tolerance (0.05–0.1mm) collapses arcs to their key points while preserving sharp corners.
3. **Euler number tells you how many holes to find.** Don't stop looking for features until you can account for all `(2 - χ) / 2` topological handles.
4. **Face normal grouping separates flat vs. curved surfaces.** Axis-aligned normals (±X, ±Y, ±Z) are planar faces; all others are cylinder walls. The Z-component of non-axis normals reveals whether cylinders are vertical (Z=0) or angled.
5. **Up/down face area sums serve as checksums.** Compute the expected area of each horizontal surface from your model parameters and verify it matches the STL. This catches dimension errors before rendering.
6. **Model in original coordinates, not relocated.** Keeping the STL's native coordinate system avoids translation errors and makes comparison trivial.
7. **Build the comparison tool first.** A reusable `compare_stl.py` with surface sampling and Hausdorff distance makes iteration fast and objective.
## Reusable Tools
All tools live in `tools/` with CLI interfaces, `--help`, and `--json` output.
See `tools/README.md` for full usage.
| Tool | Phase | Purpose |
|------|-------|---------|
| `tools/stl_info.py` | 1 | Mesh triage: stats, topology, genus, components |
| `tools/stl_zlevel.py` | 2 | Find discrete height levels, face areas, vertex coords |
| `tools/stl_cross_section.py` | 3–4 | Slice mesh, extract & classify 2D polygons |
| `tools/stl_normals.py` | 4 | Face normal grouping, cylinder feature detection |
| `tools/compare_stl.py` | 7 | Bidirectional Hausdorff distance, volume, topology |
### Quick-start workflow
```shell
python3 tools/stl_info.py part.stl                          # What am I dealing with?
python3 tools/stl_zlevel.py part.stl --vertices             # Layer structure + corners
python3 tools/stl_cross_section.py part.stl                 # Auto-slice cross-sections
python3 tools/stl_normals.py part.stl                       # Find cylinders and holes
python3 tools/stl_cross_section.py part.stl --axis x --at 0 # Hidden internal features
# ... write OpenSCAD model ...
openscad -o output.stl model.scad
python3 tools/compare_stl.py part.stl output.stl            # Verify accuracy
```
## Deliverables
| File | Purpose |
|------|---------|
| `tools/` | Reusable analysis toolkit (see `tools/README.md`) |
| `bracket.scad` | Parametric OpenSCAD source with customizer sections |
| `bracket_output.stl` | Rendered STL for comparison |
| `plan.md` | Progress checklist with identified components and results |
…and yes, it also created tools for its own use. It’s not a chimpanzee using a stick to get at termites, but it is pretty close: it’s an LLM creating its own toolchain to get at the underlying geometry of a mesh.
This is far more sophisticated than I expected, and it shows that LLMs can be used for scoped reverse-engineering tasks with the right prompting and constraints–but it also shows that you need to be able to understand the problem domain and guide the model with the right structure and checks to get a usable result.
The caveat is that this is a very specific use case (STL to OpenSCAD for prismatic parts), and I wouldn’t expect the same approach to work for more complex shapes or different file formats without significant adjustments. But it’s very much above and beyond what we could do a year ago.
Now excuse me while I go and give it a development board’s dimensions and ask it to design a custom case for it…
Update: Pretty impressive results on the custom case as well–here’s a first draft of it, which is already pretty close to what I need:
An Orange Pi 6 Custom Case, generated after a few web searches and image references
Feb 15th 2026 · 3 min read · #ai #go #notes #porting #testing
I’ve been sleeping on doing mine, which has been sitting next to the larger 3D printed version for a couple of months now.
Even though I spent quite some time trying to improve Basilisk II performance, I kind of lost the plot with all the other AI hackery that has taken place since. But I have all the right parts, and everyone seems to have sorted out how to open the case without breaking anything, so this is very high up on my list of things to do over Mardi Gras.
Feb 9th 2026 · 1 min read · #french #games #rpg #screenshots #xbox
We’ve had one of these in the house for years, and it’s insane how good and genuinely smooth it’s been throughout.
Ours has survived two Apple TV generations and is still our go-to for watching most things, to the point where I’m starting to dread replacing it because there just isn’t anything else out there that is at least half as good in the Android space.
Feb 7th 2026 · 3 min read · #agents #ai #docker #golang #notes #productivity #python #swift #weekly
Xcode 26.3 getting official Claude and Codex integration without the usual guardrails is interesting enough, but having MCP in the mix is… unusually open for Apple.
Given Claude’s propensity to botch up Swift semantics, though, I’m happy they announced both–and although I’m not surprised they didn’t add any Apple models, I would have expected Gemini to be in the mix, even this early.
But at least they seem to have done their homework as regards the in-editor agent harness–not sure how deep they went into IDE primitives (file graph, docs search, project settings), though, and the apparent lack of skills and other creature comforts that all the VS Code–based tools have is a bit of a bummer.
The fact that they put the chat interface on the left in total defiance of everyone else’s design is, of course, because they just had to mess with our heads.
Design is how it works, after all.
Feb 1st 2026 · 2 min read · #agents #ai #go #notes #python #rdp #specs
I’ve had some feedback that my last few weekly notes (especially over the holidays) have been a bit too long and that I should try to keep them shorter and more focused on a single topic.