Notes for March 30 – April 5

This was a shorter work week partly due to the Easter weekend and partly because I book-ended it with a couple of days off in an attempt to restore personal sanity–only to catch a cold and remain stuck at home.

In practice that meant I spent most of it hacking on things I’ve been putting off for weeks, and after we finally made it to the countryside, the weather was nice enough to spend most of my time lounging about with nothing but my phone and the Xteink X4–which is about as close to a holiday as I’m going to get this quarter.

I’m also partway through Patrick McGee’s Apple in China, which is one of those books that makes you want to check the news every few pages to see if anything has changed since the last chapter. The thesis–that Apple’s dependence on Chinese manufacturing is a strategic vulnerability that could be exploited overnight–has already become obvious thanks to last year’s geopolitical upheaval, but McGee’s sourcing and detail make it feel much more concrete than the usual hand-wringing.

Very nice reading, especially the phone logistics.

The Xteink X4

The short version: it’s an absurdly small e-ink reader that costs almost nothing, feels better made than it has any right to, and reminds me of reading e-books on a Palm PDA. No front light, which limits it to daylight and decently lit rooms, but for the price and the form factor I’m not complaining. I’ve been carrying it around all weekend and it’s a genuinely pleasant little device.

Piclaw

I had a slight whoopsie in my cluster (lite LVM doesn’t fully isolate volumes, so one of my new machines corrupted another two…), and Piclaw helped me fix everything and restore from backups with nothing but raw API calls to the cluster. That was enough motivation to make things official and add SSH and tooling, all lazy-loaded to save LLM context.

I now have a fourth instance running on my network that can discover, inspect and manage VMs and containers across the cluster without helper scripts, which is pretty neat. I’ll be rebuilding my GPU sandbox with it in a couple of days, and have it update my notes directly.

That ARM SBC, and the BasiliskII JIT

I spent a good chunk of this weekend working on finishing that review I’ve been going on about for weeks. Since I have a Piclaw instance on it, I asked it to update benchmark data, charts and diagrams remotely, and revise my notes on the entire thing (GRUB, resize-filesystem bugs, three rounds of reboot-fallback patches).

But I was using it to work on an AArch64 JIT for BasiliskII, and it’s tantalisingly close to done. The bringup consumed most of my evenings, ranging from byte-order bugs–graphics corruption, palette rendering, JIT dispatch–to a steady stream of fixes to the original JIT source, with every emulated instruction block exposing a new flag-handling bug.

This is the sort of work I would never have been able to do without AI. I do have a background in 68k assembly (and a shelf of ancient books), but things like register mapping and memory alignment fixes would have gone right past me. And yet, neither Codex nor Claude was able to see the big picture–like realising that IRQs were messing with JIT state and needed proper handling:

Piclaw analysing JIT register clobbering from C function calls
Piclaw spotting that C function calls from compiled code were clobbering native registers the JIT expected to be preserved

The X flag (extend/carry) was the recurring villain–the models found bugs in how it was copied, masked and converted between the interpreter’s format and the JIT’s native representation. A wrong bit position in DUPLICATE_CARRY, COPY_CARRY not masking to the carry bit, LSL missing from legacy_copy_carry_to_flagx, and a countdown variable type mismatch that broke tick synchronisation. All individually plausible, collectively maddening until you realise the flag state was being polluted by both IRQ handling and state tracking whenever execution jumped out of the JIT.
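The shape of these bugs is easy to illustrate. On the 68k, C (carry) and X (extend) live at different bit positions of the condition-code register, so any copy between them has to mask down to a single bit first. A minimal sketch, with made-up bit positions (not BasiliskII’s actual layout):

```typescript
// Hypothetical 68k-style condition-code layout (illustrative only;
// these are not BasiliskII's actual bit assignments).
const C_BIT = 0; // carry
const X_BIT = 4; // extend

// The recurring bug class: copying C into X without first masking down to
// a single bit drags neighbouring flags along. The fix always has this shape:
function copyCarryToX(ccr: number): number {
  const carry = (ccr >> C_BIT) & 1; // isolate just the carry bit
  return (ccr & ~(1 << X_BIT)) | (carry << X_BIT); // place it at X's position
}
```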

Another great thing about AI is effortless documentation: a comprehensive bringup document now lives in the repo tracking the bugs found so far, and it’s easy to trim out the fluff and keep things factual.

Right now the JIT boots to a Mac OS desktop with basic optimisations, but full JIT is still a work in progress. However, even the partial speedups are already a bit beyond what Classic benchmarking tools can handle, which is rather fun:

Classic Mac OS benchmark results: partial JIT already saturating the test tools
Even partial JIT speedups are already beyond what Classic benchmarking tools can handle

With any luck, I should get this working reproducibly in another week or so.

The Xteink X4

I got an Xteink X4 this week, and my first reaction was somewhere between amusement and nostalgia–it is absurdly small, feels a lot better made than I expected for the price, and the form factor harks back to the times when I was reading e-books on Palm PDAs and the original iPod Touch.

The X4 during a weekend outing

Disclaimer: Even though I paid for the device with my own money, this can be considered a review, so I’m linking to for consistency.

I had been tracking the hype around the X4 for quite a few months, and part of the appeal here is obvious: it is cheap, tiny, and simple in a way that most e-readers stopped being years ago. But one of the interesting parts for me is that it uses an ESP32-C3 and has already attracted a small but very active firmware community, which means that unlike most budget reading devices, this one has a decent chance of getting better after you buy it.

And yeah, I’m a sucker for a new gadget, and this was both cheap and moddable enough to be a no-brainer purchase.

Hardware

The hardware is exactly the sort of compromise I expected, but with a few pleasant surprises. The body is light, pocketable and thin enough that it feels more like a phone accessory than a “real” reader, and the physical buttons are, at least initially, better than I feared.

The screen is small enough that I don’t think of it as a replacement–not in the way I thought of the as an iPad substitute for a very narrow set of tasks–but more as a sort of digital paperback fragment, something meant for short stretches of reading or carrying around just in case.

There is already an even smaller device, but Xteink decided to remove the USB-C port in favor of a pogo pin connector, and that was more than enough reason for me to pass on it.

That said, the lack of a front light is immediately noticeable, and not in an abstract “spec sheet” way. It changes where and when I can use it, and means I am already mentally classifying it as a daylight or well-lit-room device. That is fine for a toy, or for commuting, but it can be a meaningful constraint if you read in bed or on planes (so far I haven’t really had any issues with it and my bedside table lamp, but this isn’t a deep night reading gadget).

The other immediate hardware tell is that everything about it has been budgeted very carefully–screen size, battery, controls, CPU, software assumptions–and that is both the problem and the charm.

The Button “Problem”

The absence of a touchscreen is, I think, the defining choice here.

On the one hand, I can see the appeal: fewer layers of UI indirection, fewer opportunities for sluggish touch handling, and a somewhat more deliberate feel when all you want to do is page forwards and backwards. On the other hand, every task that is not pure reading becomes a little awkward, and that awkwardness adds up very quickly once you get to Wi-Fi setup, library navigation or anything involving text entry.

This feels a bit like old Palm and iPod territory–perfectly usable once muscle memory kicks in, mildly exasperating until you get used to it.

Stock Firmware, CrossPoint, and the Real Value Proposition

I have not had it long enough to make sweeping claims, but one thing is already clear: the X4’s real value is not the stock firmware, and the very first thing I did after getting it was flashing CrossPoint Reader on it (and I just updated it to 1.2.0 before posting this).

And yes, the reason this device exists in my head at all is the CrossPoint ecosystem, and the fact that there are already multiple forks with visibly different goals–plain upstream CrossPoint, CJK-focused builds, reading-centric mods, and at least one gloriously odd fork that adds a virtual chicken to the whole thing.

That changes the equation quite a bit. Without that community, the X4 would just be an interestingly cheap, slightly awkward e-reader. With it, the hardware becomes a small platform–limited, yes, but still a platform, and something that I can fool around with myself.

And that matters, because the ESP32-C3 underpinnings imply a level of hackability that most mainstream readers don’t even pretend to offer.

Reading On It

This is the part I still need more time with, but which has been a resounding success over the past three days (although that is certainly due to my long history with tiny screens). With CrossPoint, page turns are snappy, chapter navigation a matter of 2-3 clicks, and the default Bookerly font is pretty much perfect.

Although it feels a bit weird to have gotten another pocketable screen when I effectively work from home and thus have little need for a “snackable” reading device that is always on my person, I find it more appealing (and purposeful) than digging out the Kindle app on my phone.

My instinct is that the X4 will be best for the sort of fragmented reading I typically do when traveling or in short breaks throughout the day–in the Instapaper days, I would take my iPod and read articles converted and pushed over in batches–rather than long, immersive reading sessions. The size almost guarantees that. It is not trying to disappear the way a larger reader does; it is trying to always be there.

Removing Friction

Whether that is enough depends entirely on friction, and with fast page turns, Wi-Fi support and an OPDS client in CrossPoint, that seems quite well in hand.

The missing piece of the puzzle was getting books on it, and even though CrossPoint provides a nice on-device web server to manually upload files to it (which is pretty amazing for an ESP32), I decided to flip the issue around and hack together a very quick Bun OPDS server that works beautifully with my setup and the X4, letting me browse all my libraries and download books to it without any manual file management at all.
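OPDS itself is refreshingly simple, which is why this was a quick hack: a catalog is just an Atom feed with acquisition links. A minimal sketch of the idea (the names and paths here are hypothetical, not my actual server’s layout):

```typescript
// An OPDS acquisition feed is plain Atom XML: one <entry> per book, each
// carrying an acquisition <link> the reader can download. (Real code should
// XML-escape these fields; omitted here for brevity.)
interface Book {
  id: string;
  title: string;
  author: string;
  href: string; // relative URL of the EPUB
}

function opdsFeed(title: string, books: Book[]): string {
  const entries = books
    .map(
      (b) => `  <entry>
    <id>${b.id}</id>
    <title>${b.title}</title>
    <author><name>${b.author}</name></author>
    <link rel="http://opds-spec.org/acquisition"
          href="${b.href}" type="application/epub+zip"/>
  </entry>`
    )
    .join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>${title}</title>
${entries}
</feed>`;
}
```

Serve that string with a `Content-Type` of `application/atom+xml` (trivial with `Bun.serve`) and a client like CrossPoint can browse and download from it.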

That was a fun little project, and it is already making the X4 feel much more like a “real” reader, to the point where I’ve already started modernizing my ancient Instapaper pipeline in favor of something “better”.

In fact, this might be the push I needed to move away completely from the Kindle ecosystem–I have long preferred to get DRM-free EPUBs, and I can get my to use OPDS via KOReader.

The Comparison I Keep Making

I keep thinking back to the , partly because both devices are trying to sell focus through constraints, but they go about it in very different ways.

The tries to be a deliberate, paper-adjacent environment for writing and reading–and often succeeds, even if syncing and workflow integration kept getting in the way (at least until I found the right combination of and plugins). The X4 feels less ambitious and, paradoxically, more interesting because of that. It is not trying to be a notebook, or a paper surrogate, or a productivity system. It is just a tiny e-reader with enough open firmware momentum to become something slightly stranger.

I think that honesty may end up working in its favour.

Next Steps

Since this is meant to be pocketable and I always end up stuffing my jacket pockets, I printed a hard case for it to protect the screen and buttons, and am quite happy with the results.

What I want to do next is fairly straightforward:

  • Spend a few more days on CrossPoint 1.2.0 and hack away at a moderately sane content pipeline that forces me to read interesting articles on it rather than bookmarking them into a black hole of oblivion.
  • Compare that against either the Enhanced Reading Mod or CrossPet–the former because it sounds sensible, the latter because it sounds gloriously unserious.
  • Hack away a bit more at that OPDS server to see what I can do about syncing reading progress (I’m the kind of person who never used bookmarks because I had zero issues memorizing page numbers, but I do like the convenience of Kindle’s Whispersync).
  • Figure out over a month or so whether this thing fits my actual reading habits, or merely appeals to my taste in tiny hackable hardware.
  • And, if all else fails, turn it into a mini –the software for that already exists…

Right now, I think the Xteink X4 is more an interesting gadget than a fully fleshed-out product–but that is not necessarily a criticism. Some of the best gadgets I’ve owned started out exactly that way.

Hans Zimmer

At least they aren’t from Behringer
Modular synths on stage. Who would have thought?

Notes for March 23–29

Work ate the week again. I’m exhausted, running on fumes, and daylight saving time stole an hour of sleep I could not afford–the biannual clock shuffle is one of those vestigial absurdities that nobody can be bothered to abolish, and I’m starting to take it personally.

I did manage to get to do a proper refactor of the piclaw codebase (which was desperately needed), spent a bit of time cursing at the (and concluding that I need a new 3D printer), and that’s about it. Meetings, deadlines, the usual corporate grind.

Refactoring Grind

This was the week of the great piclaw codebase reckoning–299 commits, all of them aimed at breaking apart the monolithic mess it was becoming before it got completely out of hand.

It works, but, again, you need to have taste and force a feedback loop to get good results. You can automate away the boring parts, mostly.

The whole thing was driven by an autoresearch-style loop–I basically adapted Karpathy’s approach of having an LLM do research on a codebase, generate a plan, execute it, and then verify the results, except in my case the “research” phase also involves running the test suite and feeding failures back as context. It works very well for mechanical refactoring like this, where the risk of hallucination is low and the reward for not doing it by hand is immense:

Autoresearch grinding away
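The research–plan–execute–verify loop described above is simple enough to sketch. This is a hedged outline, not piclaw’s actual code: `llm` and `runTests` stand in for an agent CLI invocation and the project’s test suite.

```typescript
// Skeleton of the autoresearch loop: research the codebase, plan, execute,
// then verify by running the tests and feeding failures back as context.
type TestResult = { passed: boolean; failures: string[] };

async function autoresearch(
  goal: string,
  llm: (prompt: string) => Promise<string>, // stand-in for an agent call
  runTests: () => Promise<TestResult>,      // stand-in for the test suite
  maxRounds = 5
): Promise<boolean> {
  let context = `Goal: ${goal}`;
  for (let round = 0; round < maxRounds; round++) {
    const plan = await llm(`Research the codebase and produce a plan:\n${context}`);
    await llm(`Execute this plan:\n${plan}`); // the agent edits files here
    const result = await runTests();
    if (result.passed) return true; // verified: done
    // Failures become the research context for the next round.
    context = `Goal: ${goal}\nTest failures:\n${result.failures.join("\n")}`;
  }
  return false; // gave up after maxRounds
}
```

The key property is that the verification step is mechanical (the test suite), which is exactly why this works so well for refactoring: the reward signal is real, not hallucinated.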

Everything Else

I still haven’t written up those SBC benchmarks I keep promising, but I have been using the SBC extensively–in fact, I dusted off my macemu fork (BasiliskII/SheepShaver with Raspberry Pi-optimized SDL2 builds, Docker images, and .deb packages) and got that running on it.

It’s been sitting at v1.0.5-patch-3 since February, but Marchintosh guilt finally got me to fire it up and poke at it. No new Mac Classic replica yet, but at least the emulator is working, and I am back trying to get an ARM64 JIT working in it, which is a fun challenge:

Orange Pi 6 Plus JIT status report from piclaw

I expect to have something to show on that front… this year?

Update: An hour later… Works, but still needs a fair amount of clean up, and it’s actually slower than the interpreter for now, which is a bit embarrassing:

JIT booting to desktop

Notes for March 16–22

This week’s update is going to be short, largely because work was hell and I ended up spending my Saturday evening poring over my meeting notes backlog until 2AM today, and I have a splitting headache to show for it.

There will be a smattering of Wiki updates on the RSS feed as well, since I realized I had a bunch of updated pages sitting uncommitted in the queue that I hadn’t gotten around to pushing yet, but the only relevant updates I have right now are that piclaw got a major package layout overhaul and a new live widget system, while vibes got some shared UI improvements and a new agent avatar caching system.

The Lobster Chase

And, most importantly, I have shifted from building the tooling to using the tooling, which is long overdue.

Here’s a good example of that:

Smith and Gates, happily coexisting in perfect harmony

Yes, that is a agent and a agent setting up Bluetooth passthrough on a VM. And yes, “Smith” manages my cluster, as well as many other things now.

And yes, piclaw sort of runs on Windows now (in bare/non-sandboxed/YOLO mode), thanks to ’s incredibly good cross-platform support. I gave it a VM to break and a extension to play with, and although I’m still deeply sorry I did it, it works well enough that I don’t need to care about the OS.

I now have four or five semi-permanent installs running on various machines (and a couple of and Oracle VMs), and having them all accessible from my phone has completely changed how I deal with remote management–I can check on agents, trigger actions, and do light debugging from anywhere, which beats the old “SSH in and hope for the best” approach by a wide margin.

And soon, they will talk to each other. And who knows, I might do a desktop app at some point, but for now the web interface is doing just fine.

Everything Else

I never did get around to writing up those SBC benchmarks I mentioned last week. The hardware is sitting on my desk, the notes are in my vault and I have a piclaw instance that is happy as a clam trying to get various local models running on it, but, again, work was just too much this week and I plan to spend the afternoon watching a movie and doing some light reading instead.

Notes for March 9–15

Well, there went another work week. Slightly better (to a degree, although I got some discouraging news regarding a potential change), and another week where piclaw ate most of my evenings–it went from v1.3.0 to v1.3.16 in seven days, which is frankly absurd even by my standards.

But there was a lot of ground to cover, and it’s turned into a sort of “agentic IDE” thing at this point, terminal and all:

piclaw

Yes, it looks like VS Code. But I suspect everything does at this point

Most of the week went into reliability work. I spent a day or so doing a full refactor, and then got three different LLMs to do a comprehensive full-stack audit of the turn/queue/steering system–which turned up a bunch of race conditions in message submission that I’d been chasing for weeks (plus proper queue deduplication and retry caps, which I should have added from the start). The deferred followup system I was using for steering was also broken in subtle ways–turns were inheriting stale thread roots, which caused all sorts of weirdness. The fun bits were the visual polish and theme support, but those came after the plumbing was solid.

On the UX side, I added Adaptive Cards rendering to the chat timeline (with validation and lifecycle flows)–the idea being that agents can now push structured forms, tables, and interactive elements into the conversation instead of just Markdown. The workspace editor got markdown attachment previews and a -based terminal lifted directly from webterm, plus a bunch of pipework for future multi-chat support (which is going to be a whole thing).

All of it involved, as you would expect, a lot of agent herding, and I had plenty of time to come up with stuff like this:

I couldn't help myself

And I cleaned up all the JS vendoring, since like itself, piclaw is now an extension host, so most of the new features are actually implemented as extensions.

python-office-mcp-server

I dusted off a -based Office document server I’d been building alongside go-ooxml and carved it into its own repository. It’s a pretty comprehensive set of tools for reading, editing, and generating Word/Excel/PowerPoint documents via MCP–unified interface, template support, auditing, the works. It’s stable enough that I decided to publish it as-is, with zero ongoing support commitment. Sometimes the best thing you can do for a side project is to ship it and walk away.

Wiki Archaeology

As to this site, I did another big batch of old page conversions–around 200 wiki pages from the hardware and apps sections got modernized from to with proper frontmatter, descriptive image filenames, and cleaned-up links. Some of these pages date back to 2002, which is always a weird trip.

What I should be doing, though, is celebrating Marchintosh and building a new Mac Classic replica–but all I’ve done hardware-wise has been benchmarking SBCs for a potential new project. I hope to clean up my notes and post something about that next week.

MacBook Neo Impressions

I went to a local mall yesterday and happened to chance upon a couple of MacBook Neos on display at our local (monopolistic) retailer1, and spent a half hour playing with them.

Hardware

The display is… fine? With their strategic placement away from the other laptops on display, side-by-side comparison was impossible, but I didn’t find it bad or dim. The keyboard also felt OK, although I am wary of white keyboards, as they tend to get dirty and look bad over time. The trackpad was… also fine. I don’t press to click, so comparing the physical click with the taptic engine doesn’t make sense to me.

However, the bezels were a bit thicker than I am used to (hence noticeable) and I do think that the overall size could be smaller (even if that would be clearly against mainstream taste). It’s not something I would choose for a cheap travel laptop (especially considering I have the ), but volume-wise, and in “backpack scale”, it’s not that far off from the 13” MacBook Air if you squint.

Software

Of course it was Tahoe, with . The less said about that horrid mess the better.

Performance

To my surprise, it ran this site’s 3D visualization quite snappily, which is no small feat (most PC laptops struggle with it, regardless of what browser and GPU they have). And although I didn’t have the chance to run any benchmarks, RAM usage in Activity Monitor was pretty much OK after launching a gaggle of Apple apps, which wasn’t surprising (their software load doesn’t include any of the Electron bloat everyone has to deal with to some degree).

In general, I think it makes a killer Chromebook/PC laptop replacement for school, and although I expect the A18 to not be a powerhouse, it felt quite snappy, even with multiple apps open. And I’m sure Apple went to some pains to position it where it wouldn’t threaten either the Air or the iPad Pro.

I’d buy one in a flash if it was smaller (well, if I had money to spare), but at least the A18 performance makes me optimistic for whatever they decide to put into the next iPad mini, since my is in desperate need of an upgrade.


  1. No, does have official Apple Stores. I’ve given up on it and just order online. ↩︎

So You Want To Do Agentic Development

We’re three months into 2026, and coding agents have been a big part of my time since –things have definitely intensified, and has already panned out: agents are everywhere.

Yes, I love this picture, and I won't apologize for it

My advice for people getting into this remains the same:

Choose Mature Tooling

In the music hobby, there’s a thing called GAS–Gear Acquisition Syndrome–where people get obsessed with buying the latest gear even if they don’t know how to use it. I see a lot of the same in this space right now, so I’d rather recommend starting with mature, well-supported tools:

  • with GitHub Copilot is still the best entry point–you can compare Claude, GPT and Gemini side by side, and it affords real control over the agent’s environment (plus it’s designed for enterprise use).
  • Mistral Vibe and Gemini CLI both have daily free tiers with enough fail-safes to experiment safely. (I still recommend sandboxing, but it’s less critical than it was a few months ago.)
  • OpenCode is the fully free route, but the models have fewer guardrails–definitely sandbox this one.

I can’t in good conscience recommend spending hundreds on Anthropic or OpenAI subscriptions right now–the market is saturated, and both are shipping desktop tools (Claude Code, Codex) that will likely come with cheaper tiers. The “use boring technology” adage applies here too.

Of course by now and keep , but that’s a long-term project, and I don’t think it’s the best starting point for most people.

Sandboxing

I never run agent tools on a machine with personal data–that’s why I built agentbox in the first place. You may not need to be as as I am, but supports dev containers on any platform, and both Anthropic and OpenAI are shipping sandboxes with their tools, so there’s really no excuse.

Privacy and Security

There are rather a lot of misconceptions about data privacy, and things like OpenClaw don’t help matters (I’m still gobsmacked people give it direct access to their e-mail). Even with enterprise-hosted models that don’t train on your data, “don’t run untrusted code on a machine with personal data” is a principle worth keeping.

Like I was quipping the other day, AI is the new digital advertising–and yet people are giving it more access to their data than they give ad networks, which is just baffling.

The Local Fallacy

And speaking of OpenClaw–the “local AI” fallacy needs addressing. None of these things are really “local” in any meaningful sense–the gap between local and cloud models is huge, and even tens of thousands of dollars in hardware won’t get you close to frontier capabilities.

The agentic loop is inescapable

“Fast, Good, Cheap: pick two” still applies, and it’s easy to get bitten by technology advances: I and it’s already obsolete.

And there is an almost weekly hype cycle around local models that I think is totally unwarranted in practice. For instance, Qwen is promising, but the local quantizations match last year’s cloud models at best, and the gap keeps widening.

Workflow

I keep coming across people who say AI generates rubbish code, and I think it’s usually one of two things: wrong tools ( does much more than provide a UI–it curates context and feeds models your code structure), or wrong approach (they’re quite literally “holding it wrong”). I wrote about two years ago and the fundamentals haven’t changed.

Part of it is inflated expectations, of course. Frontier models like Opus 4.6 and GPT-5.4 are very capable, but they need skill to use effectively, and they never produce perfect code on the first try. You have to know how to use them, and that takes practice.

I’ve been refining how I work with them since , and although things have evolved quite a bit, the core principles remain the same.

SPEC.md

Every project starts with a SPEC.md that I , 20-questions style, until it covers the essentials–goals, functional specs, non-functional specs, technical specs, and acceptance criteria.

I prefer SPEC.md over PRD.md because it emphasises specification over requirements–I want the agent to follow it, not interpret it freely.

This isn’t a prompt–it’s a living document that evolves with the project. And –feeding one an actual ECMA-376 spec document got me to 60% compliance in days, with no hallucinated APIs.
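For what it’s worth, the skeleton usually ends up looking something like this (headings as described above; the contents vary wildly per project):

```markdown
# SPEC: <project name>

## Goals
- What the project must achieve, stated in one or two bullets each.

## Functional Specs
- Observable behaviours: inputs, outputs, commands, UI.

## Non-Functional Specs
- Performance, portability, security, resource constraints.

## Technical Specs
- Languages, libraries, architecture, file layout.

## Acceptance Criteria
- Verifiable statements the agent can test its work against.
```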

SKILL.md

I complement specs with SKILL.md files–guidelines for coding, tooling, or domain-specific tasks. I have a growing collection in agentbox, and every new project starts with a make init that .

You can also fold these into .github/copilot-instructions.md (which picks up automatically), but standalone skills are tool-agnostic.

The properly interesting bit is that agents can now do this themselves–piclaw built its own hot-reload, backup, and web scraping skills after I’d guided it through the process a few times. Early days, but that’s where this is headed.

–models struggle to chain skills together, whereas MCP narrows context and presents clear next steps. But for teaching how rather than what, skills are still invaluable.

The PLAN.md Loop

After doing the prep work, I go into a loop:

The workflow

From the SPEC.md, I create a PLAN.md–not a flat TODO checklist, but a structured breakdown the agent can reason about (what’s done, what’s blocked, why). It updates the plan as it goes, which also refreshes model context. No reliance on built-in planning tools (which are patchy across models), and the plan is always in the repo for me to review.
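Concretely, a PLAN.md ends up looking more like this than like a checklist (the entries below are hypothetical):

```markdown
# PLAN: OPDS server

## Done
- [x] Project scaffolding (Bun, TypeScript, Makefile)

## In Progress
- [ ] Feed pagination — blocked: need to pick a page size e-ink clients tolerate

## Next
- [ ] Reading-progress sync endpoint
  - Why: Whispersync-style convenience; depends on client support

## Notes
- Failed approach: serving raw directory listings (readers choke on them)
```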

The loop itself is pretty simple:

  • I break down work into focused chunks–scaffolding, data model, API, etc.
  • The agent writes code, lints, tests, documents, and updates the PLAN.md.
  • I review and steer–correcting, feeding it more context, or pointing at examples.

There is also an “advanced mode” where I break up the PLAN.md into multiple Kanban-like tickets and we play “20 questions” again for each ticket alone, with the LLM asking me questions to refine its scope. I realise most people just don’t have the patience for that, even though it massively improves what gets done and how.

Steering

The most important bit of the loop–and the one most people get wrong at first. Effective steering isn’t about reprompting and hoping for the best; it’s about funnelling the agent’s attention to the right context:

  • –describe expected behaviour, ask the agent to make them pass. My most reliable workflow, especially when porting across languages.
  • Linting and static analysis via Makefile or CI–the agent self-corrects. I aim for 80% coverage as a quality bar.
  • Steering by example–pointing at existing code that demonstrates the right approach.
  • Claude still writes dodgy tests (ok, fine, it forgets all sorts of corner cases), so I use Codex models for test and security audits.

Yes, that’s proper work. But it’s no different from managing humans, and it gets easier with practice.

Language Matters

Some languages are inherently better for agents–not in terms of popularity (any agent can do decent ), but because strong types and annotations help models understand intent and self-correct.

In my experience, , and work much better than (too many revisions, too-opinionated frameworks). in particular through explicit function and interface references, and has excellent tooling for enforcing good practices–which is why I’ve been using both a lot more recently.

What’s Next

My is working well, and I don’t see myself changing it anytime soon. But I’ve been slowly extending my agents’ reach–piclaw, which started as a weekend hack, is now , and I’ve been giving it more autonomy as I learn to trust the guardrails.

The next frontier (boy, is this a pompous term, but I guess is sneaking in this late in the evening) is getting agents to collaborate–sharing context and skills to work as a group. I have , but that’s a matter for another post.

Notes for March 2–8

This was a frankly absurd week work-wise, with some pretty long days and a lot of late-night hacking on my projects (which is not exactly a new thing, but at least now I am asking piclaw to do it during the daytime, which is a small improvement).

Read More...

Notes for February 23–March 1

Well, going back to work after a week off was rough.

Read More...

Notes for February 16-21

This week I did something different: I took a wellness break from work and generally tried to tune out all the noise and messiness I have been experiencing there. It ate a chunk out of my PTO, but was mostly worth it.

Read More...

macOS Tahoe 26.3 is Broken

I have no idea what is happening, since I can’t even find any decent logs in Console.app, but it seems that the latest update to macOS Tahoe (26.3) has a serious bug.

Read More...

TIL: Claude Opus 4.6 Can Reverse Engineer STL Files

So, I finally got around to opening the I’ve had sitting around for a while, and I was looking at the STL files for the screen bracket that are over on Printables.

Since I want to fit a lot more electronics into the case, one of the things I need to do is modify that bracket a bit so I can actually use the brightness knob to drive a potentiometer instead of just a dummy knob.

But you can’t really edit STL files, and I don’t have the original CAD files, so I thought “well, maybe I can just ask an LLM to reverse-engineer the STL into an OpenSCAD file, and then I can modify that”.

So I created a SPEC.md file with the following content, spun up a new agentbox container to handle it, and gave it to pi:

# STL to OpenSCAD reconstruction

You are running inside a Debian system with sudo, brew, python3 and whatever you need to accomplish this task.

Your job is to take the STL file in this workspace and reconstruct a copy of it using OpenSCAD.

If the STL is too large or complex, you can try decimating it/using octrees/segmenting it/using marching cubes as needed.

## Requirements

- The final STL file must be within 0.1mm accuracy of the original
- The OpenSCAD code must be modular, readable, and have relevant constants surfaced at the beginning for a customizer
- You should focus on building reusable tools, not one-off scripts
- You should maintain a plan.md file with your progress checklist and tick off items (including failed approaches) as you make progress

Much to my surprise, Claude Opus 4.6 was able to handle this task pretty well, and it produced a stupefyingly accurate file:

// WonderMac Screen Bracket - OpenSCAD Reconstruction
// Reconstructed from STL analysis
// All dimensions in mm

/* [Main Dimensions] */
// Total width of the bracket (X direction)
bracket_width = 68;         // X: -50 to 18
// X offset (left edge position)
bracket_x_offset = -50;

/* [Back Plate] */
// Height of the tall vertical back plate
back_plate_height = 21.5;   // Z: 0 to 21.5
// Thickness of the back plate (Y direction)
back_plate_thickness = 2;   // Y: -16.5 to -14.5
// Y position of back plate inner edge
back_plate_y = -16.5;

/* [Top Shelf] */
// Bridge depth (connects back plate to top lip)
bridge_depth = 5.5;         // Y: -14.5 to -9
// Bridge height
bridge_height = 2;          // Z: 0 to 2
// Top lip depth
top_lip_depth = 2;          // Y: -9 to -7
// Top lip height
top_lip_height = 5;         // Z: 0 to 5

/* [Frame] */
// Frame outer Y bounds
frame_outer_y_min = -66.5;
frame_outer_y_max = -14.5;  // matches back plate
// Frame inner bounds (lower section Z=0-7)
frame_inner_x_min = -47;
frame_inner_x_max_lower = 18;  // open right side
frame_inner_y_min_lower = -64.5;
frame_inner_y_max_lower = -16.5;
// Frame inner bounds (upper section Z=7-9)
frame_inner_x_max_upper = 13;
frame_inner_y_min_upper = -62.5;
frame_inner_y_max_upper = -17.5;
// Frame heights
frame_lower_height = 7;     // Z: 0 to 7
frame_upper_height = 2;     // Z: 7 to 9

/* [Bottom Lip] */
// Lip extension below frame
lip_y_min = -70.5;
lip_y_max = -66.5;
lip_z_min = 3;
lip_z_max = 5;
lip_height = 2;             // lip_z_max - lip_z_min

/* [Extended Plate] */
// L-shaped plate at Z=5-7
plate_z_min = 5;
plate_z_max = 7;
plate_height = 2;
// Bottom part of L (full width)
plate_bottom_y_min = -9;
plate_bottom_y_max = -3;
// Left part of L (extends upward)
plate_left_y_max = 15;
plate_left_x_max = -30;

/* [Screw Post] */
// Cylindrical post with through-hole
post_center_x = -41;
post_center_y = 10.5;
post_outer_radius = 2;
post_inner_radius = 1;
post_z_min = 0.5;
post_z_max = 5;             // meets plate bottom

/* [Screw Holes] */
// Hole radius for mounting holes
mount_hole_radius = 0.75;
// Lip mounting holes
lip_hole_left_x = -47.75;
lip_hole_right_x = 15.75;
lip_hole_y = -69;
// Plate mounting holes
plate_hole_left_x = -46;
plate_hole_right_x = 15;
plate_hole_y = -5;

/* [Resolution] */
$fn = 64;

// ============================================
// Modules
// ============================================

module back_plate() {
    translate([bracket_x_offset, back_plate_y, 0])
        cube([bracket_width, back_plate_thickness, back_plate_height]);
}

module bridge() {
    translate([bracket_x_offset, back_plate_y + back_plate_thickness, 0])
        cube([bracket_width, bridge_depth, bridge_height]);
}

module top_lip() {
    translate([bracket_x_offset, -9, 0])
        cube([bracket_width, top_lip_depth, top_lip_height]);
}

module frame_lower() {
    difference() {
        // Outer rectangle
        translate([bracket_x_offset, frame_outer_y_min, 0])
            cube([bracket_width,
                  frame_outer_y_max - frame_outer_y_min,
                  frame_lower_height]);
        // Inner cutout
        translate([frame_inner_x_min, frame_inner_y_min_lower, -1])
            cube([frame_inner_x_max_lower - frame_inner_x_min,
                  frame_inner_y_max_lower - frame_inner_y_min_lower,
                  frame_lower_height + 2]);
    }
}

module frame_upper() {
    difference() {
        // Outer rectangle (same as lower)
        translate([bracket_x_offset, frame_outer_y_min, frame_lower_height])
            cube([bracket_width,
                  frame_outer_y_max - frame_outer_y_min,
                  frame_upper_height]);
        // Inner cutout (smaller = thicker walls)
        translate([frame_inner_x_min, frame_inner_y_min_upper, frame_lower_height - 1])
            cube([frame_inner_x_max_upper - frame_inner_x_min,
                  frame_inner_y_max_upper - frame_inner_y_min_upper,
                  frame_upper_height + 2]);
    }
}

module bottom_lip() {
    translate([bracket_x_offset, lip_y_min, lip_z_min])
        cube([bracket_width, lip_y_max - lip_y_min, lip_height]);
}

module extended_plate() {
    // Bottom part of L (full width, Y=-9 to -3)
    translate([bracket_x_offset, plate_bottom_y_min, plate_z_min])
        cube([bracket_width,
              plate_bottom_y_max - plate_bottom_y_min,
              plate_height]);
    // Left part of L (Y=-3 to 15, X=-50 to -30)
    translate([bracket_x_offset, plate_bottom_y_max, plate_z_min])
        cube([plate_left_x_max - bracket_x_offset,
              plate_left_y_max - plate_bottom_y_max,
              plate_height]);
}

module screw_post() {
    translate([post_center_x, post_center_y, post_z_min])
        cylinder(r=post_outer_radius, h=post_z_max - post_z_min);
}

module screw_post_hole() {
    translate([post_center_x, post_center_y, -1])
        cylinder(r=post_inner_radius, h=back_plate_height + 2);
}

module lip_holes() {
    // Left lip hole
    translate([lip_hole_left_x, lip_hole_y, lip_z_min - 1])
        cylinder(r=mount_hole_radius, h=lip_height + 2);
    // Right lip hole
    translate([lip_hole_right_x, lip_hole_y, lip_z_min - 1])
        cylinder(r=mount_hole_radius, h=lip_height + 2);
}

module plate_holes() {
    // Left plate hole
    translate([plate_hole_left_x, plate_hole_y, plate_z_min - 1])
        cylinder(r=mount_hole_radius, h=plate_height + 2);
    // Right plate hole
    translate([plate_hole_right_x, plate_hole_y, plate_z_min - 1])
        cylinder(r=mount_hole_radius, h=plate_height + 2);
}

// ============================================
// Assembly
// ============================================

module bracket_assembly() {
    difference() {
        union() {
            back_plate();
            bridge();
            top_lip();
            frame_lower();
            frame_upper();
            bottom_lip();
            extended_plate();
            screw_post();
        }
        // Subtract all holes
        screw_post_hole();
        lip_holes();
        plate_holes();
    }
}

bracket_assembly();

This is what the output looks like

But what is more important is that I was able to capture the entire process in a SKILL.md file, and it makes for an amazing read:

# SKILL: STL to Parametric OpenSCAD Reconstruction

## Goal

Reverse-engineer a binary/ASCII STL mesh file into a clean, parametric OpenSCAD source file that reproduces the original geometry within a specified tolerance (e.g. 0.1mm Hausdorff distance).

## When to Use

- You have an STL file of a mechanical part and need an editable parametric source
- The part is primarily composed of prismatic (box-like) and cylindrical features — not organic/sculpted shapes
- You need the output to be human-readable and customizable, not just a mesh re-export

## Prerequisites

- **Python packages**: `numpy`, `trimesh`, `scipy`, `shapely`, `networkx`, `rtree`, `numpy-stl`
- **System packages**: `openscad`
- Install with: `pip3 install numpy trimesh scipy shapely networkx rtree numpy-stl` and `sudo apt-get install openscad`

## High-Level Approach

### Phase 1: Mesh Triage

Load the STL with `trimesh` and gather key statistics to understand the scope:

- **Vertex/face count**: Determines complexity. Under ~5k faces is likely a machined/printed part with clean geometry.
- **Bounding box and extents**: Gives the overall dimensions.
- **Volume and watertightness**: Confirms the mesh is valid and closed.
- **Euler number**: Computes genus (number of through-holes). Formula: `genus = (2 - euler_number) / 2`. This tells you how many holes to find.
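The genus arithmetic in that last bullet is easy to sanity-check without any mesh library (trimesh exposes `mesh.euler_number` directly; this is just the formula applied to hand-counted values):

```python
# Euler characteristic of a closed triangulated surface: chi = V - E + F.
# Genus (number of through-holes): genus = (2 - chi) / 2.

def euler_characteristic(v, e, f):
    """Euler characteristic from vertex, edge, and face counts."""
    return v - e + f

def genus(euler_number):
    """Number of topological handles (through-holes) of a closed surface."""
    return (2 - euler_number) // 2

# A triangulated box: 8 vertices, 18 edges, 12 faces.
chi = euler_characteristic(8, 18, 12)
print(chi, genus(chi))   # 2 0 (no holes)
print(genus(0))          # 1 (a torus: one through-hole)
```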

### Phase 2: Identify Z-Level Structure

For prismatic parts (brackets, enclosures, mounts), the geometry is almost always built from features extruded along one principal axis. Identify which axis that is by examining the unique coordinate values of vertices.

1. **Find unique vertex coordinates** along each axis (rounded to ~3 decimal places). The axis with the fewest unique values is the extrusion/stacking axis.
2. **List the discrete levels** on that axis. For this bracket, Z had only 8 unique values: `[0, 0.5, 2, 3, 5, 7, 9, 21.5]`. Each pair of adjacent levels defines a "layer" of constant cross-section.
3. **Count up-facing and down-facing face areas** at each level. Up-facing faces at a Z-level mark the *top* of a feature; down-facing faces mark the *bottom* of a feature starting at that height. The area values serve as checksums for your reconstruction.
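A standalone sketch of steps 1–2, with a synthetic vertex list standing in for the `mesh.vertices` array trimesh would give you:

```python
# Find the extrusion/stacking axis: the axis with the fewest unique
# (rounded) vertex coordinates.

def unique_levels(vertices, axis, decimals=3):
    """Sorted unique coordinate values along one axis."""
    return sorted({round(v[axis], decimals) for v in vertices})

# A synthetic prismatic part: six distinct X and Y positions,
# but only four discrete Z levels.
vertices = [
    (x, y, z)
    for z in (0.0, 2.0, 7.0, 21.5)
    for x, y in zip((-50, -47, -41, 13, 15, 18),
                    (-70.5, -66.5, -16.5, -9, -3, 15))
]

levels = {axis: unique_levels(vertices, axis) for axis in range(3)}
stack_axis = min(levels, key=lambda a: len(levels[a]))
print("stacking axis:", "XYZ"[stack_axis])   # Z
print("levels:", levels[stack_axis])         # [0.0, 2.0, 7.0, 21.5]
```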

### Phase 3: Cross-Section Analysis

Take planar cross-sections at the midpoint of each layer using `trimesh.section()`:

1. **Slice the mesh** at each intermediate Z value (e.g. Z=0.25, Z=1, Z=2.5, etc.).
2. **Convert to 2D polygons** via `section.to_planar()` and examine the `polygons_full` property.
3. **Simplify polygons** with Shapely's `.simplify()` to reduce curved arcs to key vertices while preserving corners.
4. **Transform back to world coordinates** using the planar transform matrix to get actual XY positions.
5. **Record each polygon's exterior and interior (hole) boundaries**. Note how many vertices remain after simplification — a 5-point polygon is a rectangle, a 9-point polygon is an L-shape, a 17-point polygon is a circle approximation, etc.

Track how the cross-section *changes* between layers — this reveals where features start, end, merge, or split.
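Shapely’s `.simplify()` uses Ramer–Douglas–Peucker under the hood; here is a minimal pure-Python version of the same idea (a hypothetical `rdp` helper, not the shapely API), showing a densely sampled L-shaped outline collapsing to its three key vertices while the sharp corner survives:

```python
import math

def rdp(points, eps):
    """Ramer-Douglas-Peucker polyline simplification (recursive)."""
    if len(points) < 3:
        return list(points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy) or 1.0
    # Perpendicular distance of each interior point to the end-to-end chord.
    dists = [abs(dy * (x - x0) - dx * (y - y0)) / norm for x, y in points[1:-1]]
    i_max = max(range(len(dists)), key=dists.__getitem__)
    if dists[i_max] <= eps:
        return [points[0], points[-1]]       # everything within tolerance
    split = i_max + 1
    left = rdp(points[:split + 1], eps)
    right = rdp(points[split:], eps)
    return left[:-1] + right                 # drop the duplicated split point

# An L-shaped polyline sampled densely: along the bottom, then up the side.
dense = ([(x / 10, 0.0) for x in range(101)] +
         [(10.0, y / 10) for y in range(1, 101)])
print(rdp(dense, 0.05))   # [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
```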

### Phase 4: Identify Geometric Primitives

From the cross-section data, decompose the shape into CSG primitives:

- **Rectangles** (5 simplified vertices = box cross-section): Record corner coordinates, extrusion height range.
- **L-shapes / U-shapes** (9+ vertices): Decompose into union of rectangles, or model as rectangle-minus-rectangle.
- **Circles / arcs** (17+ vertices after simplification): Compute center as midpoint of extremes, radius as half the span. Verify by checking vertex distances from the computed center — they should all equal the radius.
- **Rings/annuli** (polygon with circular hole): Outer and inner radius from the exterior and interior boundaries.

For each primitive, determine:
- XY bounds or center+radius
- Z range (which layers it spans)
- Whether it's additive (part of the union) or subtractive (a hole to difference out)
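The circle test from that list is easy to sketch in isolation (synthetic polygon data; `fit_circle` is a hypothetical helper, not part of the toolkit above):

```python
import math

def fit_circle(points, tol=1e-3):
    """Detect a circle: center = midpoint of the extremes, radius = half-span.
    Returns (cx, cy, r) if every point lies within tol of that radius, else None."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    r = (max(xs) - min(xs)) / 2
    if all(abs(math.hypot(x - cx, y - cy) - r) < tol for x, y in points):
        return cx, cy, r
    return None

# A 64-gon approximating the screw post: center (-41, 10.5), radius 2.
poly = [(-41 + 2 * math.cos(2 * math.pi * k / 64),
         10.5 + 2 * math.sin(2 * math.pi * k / 64)) for k in range(64)]
print(fit_circle(poly))   # approximately (-41.0, 10.5, 2.0)

# A rectangle is rejected: corner distances differ from edge-midpoint distances.
print(fit_circle([(0, 0), (4, 0), (4, 2), (0, 2)]))   # None
```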

### Phase 5: Cross-Validate with Vertex Grouping

For extra confidence, directly examine the raw vertices at each Z-level:

- Group vertices by their Z coordinate.
- For levels with few vertices (≤20), print them all — these directly reveal rectangle corners.
- For levels with many vertices, look for clusters. Compute distances from suspected circle centers and verify constant radius.
- Check that circle parameters (center, radius) are consistent across multiple Z-levels.
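A sketch of this grouping pass on synthetic vertices (plain Python; a real run would read `mesh.vertices`):

```python
import math
from collections import defaultdict

def group_by_z(vertices, decimals=3):
    """Bucket vertices by rounded Z; small groups directly reveal corners."""
    groups = defaultdict(list)
    for x, y, z in vertices:
        groups[round(z, decimals)].append((x, y))
    return dict(groups)

# Synthetic data: four rectangle corners up top, a 50-point ring lower down.
vertices = [(-50, -16.5, 21.5), (18, -16.5, 21.5),
            (-50, -14.5, 21.5), (18, -14.5, 21.5)]
vertices += [(-41 + 2 * math.cos(a), 10.5 + 2 * math.sin(a), 5.0)
             for a in (2 * math.pi * k / 50 for k in range(50))]

for z, pts in sorted(group_by_z(vertices).items()):
    if len(pts) <= 20:
        print(f"Z={z}: corners {pts}")   # few enough to read directly
    else:
        print(f"Z={z}: {len(pts)} vertices, check for circle clusters")
```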

### Phase 6: Build the OpenSCAD Model

Structure the `.scad` file for readability and customization:

1. **Constants at the top** in OpenSCAD Customizer sections (`/* [Section Name] */`). Every dimension gets a named variable with a comment showing its physical meaning and original coordinate range.
2. **One module per feature**: `back_plate()`, `frame_lower()`, `screw_post()`, `lip_holes()`, etc. Each module is self-contained and uses only the global constants.
3. **Assembly module**: A single `bracket_assembly()` module that `union()`s all additive features, then `difference()`s all holes. This keeps the boolean logic clean and makes it easy to toggle features.
4. **Resolution control**: A single `$fn` parameter controls circle smoothness globally.

Modeling patterns:
- **Rectangular frame**: `difference()` of outer `cube()` minus inner `cube()`.
- **L-shaped plate**: `union()` of two overlapping `cube()` calls.
- **Through-hole**: `cylinder()` with height extending past the material (add 1mm on each side with `-1` offset and `+2` height to ensure clean boolean cuts).
- **Ring/post**: `cylinder()` for the outer, with a through `cylinder()` subtracted.

### Phase 7: Render and Compare

1. **Render** with `openscad -o output.stl model.scad`.
2. **Compare** using a reusable Python comparison tool that computes:
   - **Bidirectional surface distance**: Sample 50k points on each surface, find nearest point on the other surface using `trimesh.nearest.on_surface()`. Report mean, max, 95th/99th percentile.
   - **Volume difference**: Compare `mesh.volume` values.
   - **Bounds match**: Check bounding boxes agree within tolerance.
   - **Topology match**: Compare Euler numbers.
3. **Iterate** if the Hausdorff distance exceeds the tolerance. Common fixes:
   - Wrong dimension by a small amount → re-examine vertex coordinates at that Z-level
   - Missing feature → look at the worst-mismatch sample points to locate the problem area
   - Circle approximation error → increase `$fn`
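The comparison metric itself is simple to sketch on plain point sets (brute-force nearest neighbor; trimesh’s `nearest.on_surface()` replaces this with a proper spatial query against the actual mesh surface):

```python
import math

def directed_distances(a, b):
    """For each point in a, distance to its nearest neighbor in b (brute force)."""
    return [min(math.dist(p, q) for q in b) for p in a]

def compare_point_sets(a, b):
    """Bidirectional surface-distance summary: (mean, max); max = Hausdorff."""
    d = directed_distances(a, b) + directed_distances(b, a)
    return sum(d) / len(d), max(d)

def perimeter(n):
    """Sample n points around the unit-square perimeter."""
    pts = []
    for i in range(n):
        t = 4 * i / n
        side, f = int(t), t - int(t)
        pts.append([(f, 0), (1, f), (1 - f, 1), (0, 1 - f)][side])
    return pts

# Two different samplings of the same outline stand in for
# 'original STL' vs 'rendered STL' surface samples.
mean_d, max_d = compare_point_sets(perimeter(200), perimeter(317))
print(f"mean={mean_d:.4f}  hausdorff={max_d:.4f}")   # both near zero
```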

### Phase 8: Verify the Accuracy Target

The final gate is the bidirectional Hausdorff distance. For this task the target was 0.1mm; the achieved result was 0.004mm (25× better than required). The residual error comes entirely from polygon approximation of circles (`$fn=64` gives a theoretical max error of `r × (1 - cos(π/64))` ≈ 0.0024mm for r=2mm).
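That residual figure is easy to verify; the sagitta (maximum deviation of a chord from its arc) for an n-gon approximation of a circle of radius r is r·(1 − cos(π/n)):

```python
import math

def polygon_sagitta(r, n):
    """Max radial error when a circle of radius r is rendered as an n-gon."""
    return r * (1 - math.cos(math.pi / n))

err = polygon_sagitta(2.0, 64)   # the r=2mm screw post at $fn = 64
print(f"{err:.4f} mm")           # ~0.0024 mm, well under the 0.1 mm target
```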

## Key Lessons

1. **Z-level analysis is the critical insight for prismatic parts.** If the mesh has only a handful of unique Z values, the part is a stack of extruded profiles and can be exactly decomposed.
2. **Cross-sections + simplification finds the primitives fast.** Shapely's `simplify()` with a small tolerance (0.05–0.1mm) collapses arcs to their key points while preserving sharp corners.
3. **Euler number tells you how many holes to find.** Don't stop looking for features until you can account for all `(2 - χ) / 2` topological handles.
4. **Face normal grouping separates flat vs. curved surfaces.** Axis-aligned normals (±X, ±Y, ±Z) are planar faces; all others are cylinder walls. The Z-component of non-axis normals reveals whether cylinders are vertical (Z=0) or angled.
5. **Up/down face area sums serve as checksums.** Compute the expected area of each horizontal surface from your model parameters and verify it matches the STL. This catches dimension errors before rendering.
6. **Model in original coordinates, not relocated.** Keeping the STL's native coordinate system avoids translation errors and makes comparison trivial.
7. **Build the comparison tool first.** A reusable `compare_stl.py` with surface sampling and Hausdorff distance makes iteration fast and objective.

## Reusable Tools

All tools live in `tools/` with CLI interfaces, `--help`, and `--json` output.
See `tools/README.md` for full usage.

| Tool | Phase | Purpose |
|------|-------|---------|
| `tools/stl_info.py` | 1 | Mesh triage: stats, topology, genus, components |
| `tools/stl_zlevel.py` | 2 | Find discrete height levels, face areas, vertex coords |
| `tools/stl_cross_section.py` | 3–4 | Slice mesh, extract & classify 2D polygons |
| `tools/stl_normals.py` | 4 | Face normal grouping, cylinder feature detection |
| `tools/compare_stl.py` | 7 | Bidirectional Hausdorff distance, volume, topology |

### Quick-start workflow

python3 tools/stl_info.py part.stl                          # What am I dealing with?
python3 tools/stl_zlevel.py part.stl --vertices              # Layer structure + corners
python3 tools/stl_cross_section.py part.stl                  # Auto-slice cross-sections
python3 tools/stl_normals.py part.stl                        # Find cylinders and holes
python3 tools/stl_cross_section.py part.stl --axis x --at 0  # Hidden internal features
# ... write OpenSCAD model ...
openscad -o output.stl model.scad
python3 tools/compare_stl.py part.stl output.stl             # Verify accuracy

## Deliverables

| File | Purpose |
|------|---------|
| `tools/` | Reusable analysis toolkit (see `tools/README.md`) |
| `bracket.scad` | Parametric OpenSCAD source with customizer sections |
| `bracket_output.stl` | Rendered STL for comparison |
| `plan.md` | Progress checklist with identified components and results |

…and yes, it also created tools for its own use. It’s not a chimpanzee using a stick to get at termites, but it is pretty close: it’s an LLM creating its own toolchain to get at the underlying geometry of a mesh.

This is far more sophisticated than I expected, and it shows that LLMs can be used for scoped reverse-engineering tasks with the right prompting and constraints–but it also shows that you need to be able to understand the problem domain and guide the model with the right structure and checks to get a usable result.

The caveat is that this is a very specific use case (STL to OpenSCAD for prismatic parts), and I wouldn’t expect the same approach to work for more complex shapes or different file formats without significant adjustments. But it’s very much above and beyond what we could do a year ago.

Now excuse me while I go and give it a development board’s dimensions and ask it to design a custom case for it…

Update: Pretty impressive results on the custom case as well–here’s a first draft of it, which is already pretty close to what I need:

An Orange Pi 6 Custom Case, generated after a few web searches and image references

Notes for February 8-15

This was a grueling week at work, but I managed to keep polishing my tools, even if I decided to reinvent a couple of wheels in the process.

The Haves and The Have Nots

One thing I’ve noticed over the past few weeks is the huge gap in both perception and availability of .

Clair Obscur

Le cul de sac
I've actually never played RPGs before, but Expedition 33 has a… certain style to it. Maybe it's the French dialogue.

Notes for February 2-7

Half my working week was spent at an internal company thing, so I decided to make the best of my weekend and start cutting down on coding a bit.

Accelerando, But Janky

The past couple of weeks have been sheer madness in the AI hype space, enough that I think it’s worthwhile capturing the moment for posterity.

Notes for January 26 - February 1

I’ve had some feedback that my last few weekly notes (especially ) have been a bit too long and that I should try to keep them shorter and more focused on a single topic.

Thoughts on AI-Assisted Software Development in 2026

A few things I jotted down during –i.e., while building out my agentbox and webterm setups and other things.

Vibing with the Agent Control Protocol

Although most of my actual work , I have been wanting an easy way to talk to the newfangled crop of agents from my iPhone.

Seizing The Means Of Production (Again)

Since , I’ve been hardening my agentbox and webterm setup through sheer friction. The pattern is still the same:
