This was a shorter work week partly due to the Easter weekend and partly because I book-ended it with a couple of days off in an attempt to restore personal sanity–only to catch a cold and remain stuck at home.
In practice that meant I spent most of it hacking on things I’ve been putting off for weeks, and after we finally made it to the countryside, the weather was nice enough to spend most of my time lounging about with nothing but my phone and the Xteink X4–which is about as close to a holiday as I’m going to get this quarter.
I’m also partway through Patrick McGee’s Apple in China, which is one of those books that makes you want to check the news every few pages to see if anything has changed since the last chapter. The thesis–that Apple’s dependence on Chinese manufacturing is a strategic vulnerability that could be exploited overnight–has already become obvious thanks to last year’s geopolitical upheaval, but McGee’s sourcing and detail make it feel much more concrete than the usual hand-wringing.
I wrote this up separately, but the short version: it’s an absurdly small e-ink reader that costs almost nothing, feels better made than it has any right to, and reminds me of reading e-books on a Palm PDA. No front light, which limits it to daylight and decently lit rooms, but for the price and the form factor I’m not complaining. I’ve been carrying it around all weekend and it’s a genuinely pleasant little device.
I had a slight whoopsie in my Proxmox cluster (LVM-Thin doesn’t fully isolate volumes, so one of my new machines corrupted another two…), and Piclaw helped me fix everything and restore from backups with nothing but raw API calls to the cluster. That was enough motivation to make things official and add SSH, Proxmox and Portainer tooling, all lazy-loaded to save LLM context.
I now have a fourth instance running on my network that can discover, inspect and manage VMs and containers across the cluster without helper scripts, which is pretty neat. I’ll be rebuilding my GPU sandbox with it in a couple of days, and have it update my Obsidian notes directly.
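For reference, the Proxmox REST API really is simple enough to drive directly. A rough TypeScript sketch of listing the VMs on a node (the hostname, node name and API token below are placeholders, not my actual setup):

```typescript
// Sketch: talking to the Proxmox VE REST API directly, the way an agent can.
// Proxmox authenticates API tokens via a PVEAPIToken authorization header.

function pveRequest(host: string, token: string, path: string): Request {
  return new Request(`https://${host}:8006/api2/json/${path}`, {
    headers: { Authorization: `PVEAPIToken=${token}` },
  });
}

// List QEMU VMs on a node; the response wraps results in a `data` field.
async function listVMs(host: string, token: string, node: string) {
  const res = await fetch(pveRequest(host, token, `nodes/${node}/qemu`));
  if (!res.ok) throw new Error(`PVE API error: ${res.status}`);
  const { data } = await res.json();
  return data.map((vm: any) => ({ vmid: vm.vmid, name: vm.name, status: vm.status }));
}
```

Everything else (snapshots, backups, restores) is the same pattern against different paths, which is why a tool-using agent can manage a cluster with no helper scripts at all.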
I spent a good chunk of this weekend working on finishing that review I’ve been going on about for weeks. Since I have a Piclaw instance on it, I asked it to update benchmark data, charts and diagrams remotely, and revise my notes on the entire thing (GRUB, resize-filesystem bugs, three rounds of reboot-fallback patches).
But last week I was using it to work on an AArch64 JIT for BasiliskII, and it’s tantalisingly close to done. The bringup consumed most of my evenings, ranging from byte-order bugs–graphics corruption, palette rendering, JIT dispatch–to a steady stream of fixes to the original JIT source, with every emulated instruction block exposing a new flag-handling bug.
This is the sort of work I would never have been able to do without AI. I do have a background in 68k assembly (and a shelf of ancient books), but things like register mapping and memory alignment fixes would have gone right past me. And yet, neither Codex nor Claude was able to see the big picture–like realising that IRQs were messing with JIT state and needed proper handling:
Piclaw spotting that C function calls from compiled code were clobbering native registers the JIT expected to be preserved
The X flag (extend/carry) was the recurring villain–the models found bugs in how it was copied, masked and converted between the interpreter’s format and the JIT’s native representation. A wrong bit position in DUPLICATE_CARRY, COPY_CARRY not masking to the carry bit, LSL missing from legacy_copy_carry_to_flagx, and a countdown variable type mismatch that broke tick synchronisation. All individually plausible, collectively maddening until you realise the flag state was being polluted by both IRQ handling and state tracking whenever you jumped out of the JIT.
Another great thing about AI is effortless documentation: a comprehensive bringup document now lives in the repo tracking the bugs found so far, and it’s easy to trim out the fluff and keep things factual.
Right now the JIT boots to a Mac OS desktop with basic optimisations, but full JIT is still a work in progress. However, even the partial speedups are already a bit beyond what Classic benchmarking tools can handle, which is rather fun:
Even partial JIT speedups are already beyond what Classic benchmarking tools can handle
With any luck, I should get this working reproducibly in another week or so.
I got an Xteink X4 this week, and my first reaction was somewhere between amusement and nostalgia–it is absurdly small, feels a lot better made than I expected for the price, and the form factor harks back to the times when I was reading e-books on Palm PDAs and the original iPod Touch.
The X4 during a weekend outing
Disclaimer: Even though I paid for the device with my own money, this can be considered a review, so I’m linking to my review policy for consistency.
I had been tracking the hype around the X4 for quite a few months, and part of the appeal here is obvious: it is cheap, tiny, and simple in a way that most e-readers stopped being years ago. But one of the interesting parts for me is that it uses an ESP32-C3 and has already attracted a small but very active firmware community, which means that unlike most budget reading devices, this one has a decent chance of getting better after you buy it.
And yeah, I’m a sucker for a new gadget, and this was both cheap and moddable enough to be a no-brainer purchase.
The hardware is exactly the sort of compromise I expected, but with a few pleasant surprises. The body is light, pocketable and thin enough that it feels more like a phone accessory than a “real” reader, and the physical buttons are, at least initially, better than I feared.
The screen is small enough that I don’t think of it as a Kindle replacement–not in the way I thought of the Supernote Nomad as an iPad substitute for a very narrow set of tasks–but more as a sort of digital paperback fragment, something meant for short stretches of reading or carrying around just in case.
There is already an even smaller device, but Xteink decided to remove the USB-C port in favor of a pogo pin connector, and that was more than enough reason for me to pass on it.
That said, the lack of a front light is immediately noticeable, and not in an abstract “spec sheet” way. It changes where and when I can use it, and means I am already mentally classifying it as a daylight or well-lit-room device. That is fine for a toy, or for commuting, but it can be a meaningful constraint if you read in bed or on planes (so far I haven’t really had any issues with it and my bedside table lamp, but this isn’t a deep night reading gadget).
The other immediate hardware tell is that everything about it has been budgeted very carefully–screen size, battery, controls, CPU, software assumptions–and that is both the problem and the charm.
The absence of a touchscreen is, I think, the defining choice here.
On the one hand, I can see the appeal: fewer layers of UI indirection, fewer opportunities for sluggish touch handling, and a somewhat more deliberate feel when all you want to do is page forwards and backwards. On the other hand, every task that is not pure reading becomes a little awkward, and that awkwardness adds up very quickly once you get to Wi-Fi setup, library navigation or anything involving text entry.
This feels a bit like old Palm and iPod territory–perfectly usable once muscle memory kicks in, mildly exasperating until you get used to it.
I have not had it long enough to make sweeping claims, but one thing is already clear: the X4’s real value is not the stock firmware, and the very first thing I did after getting it was to flash CrossPoint Reader onto it (and I just updated it to 1.2.0 before posting this).
And yes, the reason this device exists in my head at all is the CrossPoint ecosystem, and the fact that there are already multiple forks with visibly different goals–plain upstream CrossPoint, CJK-focused builds, reading-centric mods, and at least one gloriously odd fork that adds a virtual chicken to the whole thing.
That changes the equation quite a bit. Without that community, the X4 would just be an interestingly cheap, slightly awkward e-reader. With it, the hardware becomes a small platform–limited, yes, but still a platform, and something that I can fool around with myself.
And that matters, because the ESP32-C3 underpinnings imply a level of hackability that most mainstream readers don’t even pretend to offer.
This is the part I still need more time with, but which has been a resounding success over the past three days (although that is certainly due to my long history with tiny screens). With CrossPoint, page turns are snappy, chapter navigation a matter of 2-3 clicks, and the default Bookerly font is pretty much perfect.
Although it feels a bit weird to have gotten another, pocketable screen when I effectively work from home and thus have little need for a “snackable” reading device that is always on my person, I find it more appealing (and purposeful) than digging out the Kindle app on my phone.
My instinct is that the X4 will be best for the sort of fragmented reading I typically do when traveling or in short breaks throughout the day–in the Instapaper days, I would take my iPod and read articles converted and pushed over in batches–rather than long, immersive reading sessions. The size almost guarantees that. It is not trying to disappear the way a larger reader does; it is trying to always be there.
Whether that is enough depends entirely on friction, and with fast page turns, Wi-Fi support and an OPDS client in CrossPoint, that seems quite well in hand.
The missing piece of the puzzle was getting books on it, and even though CrossPoint provides a nice on-device web server to manually upload files to it (which is pretty amazing for an ESP32), I decided to flip the issue around and hack together a very quick Bun OPDS server that works beautifully with my Calibre setup and the X4, letting me browse all my libraries and download books to it without any manual file management at all.
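For the curious, an OPDS catalogue is just an Atom feed with acquisition links, which is why an ESP32 can consume it and a few dozen lines of Bun/TypeScript can serve it. A minimal sketch of the feed generation (the book data and URLs are made up for illustration; a real server would pull them from Calibre’s metadata.db):

```typescript
// Sketch: generating a minimal OPDS acquisition feed (an Atom XML document).

interface Book {
  title: string;
  author: string;
  epubUrl: string;
}

const XML_ESCAPES: Record<string, string> = {
  "<": "&lt;", ">": "&gt;", "&": "&amp;", "'": "&apos;", '"': "&quot;",
};
const escapeXml = (s: string) => s.replace(/[<>&'"]/g, (c) => XML_ESCAPES[c]);

function opdsFeed(title: string, books: Book[]): string {
  const entries = books.map((b) => `
  <entry>
    <title>${escapeXml(b.title)}</title>
    <author><name>${escapeXml(b.author)}</name></author>
    <link rel="http://opds-spec.org/acquisition"
          href="${escapeXml(b.epubUrl)}" type="application/epub+zip"/>
  </entry>`).join("");
  return `<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>${escapeXml(title)}</title>${entries}
</feed>`;
}
```

Serving it from Bun is then just a matter of returning the string with a `Content-Type` of `application/atom+xml;profile=opds-catalog`, and the reader takes care of the rest.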
That was a fun little project, and it is already making the X4 feel much more like a “real” reader, to the point where I’ve already started modernizing my ancient Instapaper pipeline in favor of something “better”.
In fact, this might be the push I needed to move away completely from the Kindle ecosystem–I have long preferred to get DRM-free EPUBs, and I can get my Nomad to use OPDS via KOReader…
I keep thinking back to the Nomad, partly because both devices are trying to sell focus through constraints, but they go about it in very different ways.
The Nomad tries to be a deliberate, paper-adjacent environment for writing and reading–and often succeeds, even if syncing and workflow integration kept getting in the way (at least until I found the right combination of SyncThing and Obsidian plugins). The X4 feels less ambitious and, paradoxically, more interesting because of that. It is not trying to be a notebook, or a paper surrogate, or a productivity system. It is just a tiny e-reader with enough open firmware momentum to become something slightly stranger.
I think that honesty may end up working in its favour.
Since this is meant to be pocketable and I always end up stuffing my jacket pockets, I printed a hard case for it to protect the screen and buttons, and am quite happy with the results.
What I want to do next is fairly straightforward:
Spend a few more days on CrossPoint 1.2.0 and hack away at a moderately sane content pipeline that forces me to read interesting articles on it rather than bookmarking them into a black hole of oblivion.
Compare that against either the Enhanced Reading Mod or CrossPet–the former because it sounds sensible, the latter because it sounds gloriously unserious.
Hack away a bit more at that OPDS server to see what I can do about syncing reading progress (I’m the kind of person who never used bookmarks because I had zero issues memorizing page numbers, but I do like the convenience of Kindle’s Whispersync).
Figure out over a month or so whether this thing fits my actual reading habits, or merely appeals to my taste in tiny hackable hardware.
And, if all else fails, turn it into a mini TRMNL–the software for that already exists…
Right now, I think the Xteink X4 is more an interesting gadget than a fully fleshed-out product–but that is not necessarily a criticism. Some of the best gadgets I’ve owned started out exactly that way.
Mar 31st 2026 · 1 min read
·
#arena
#concert
#meo
#music
#photo
#photography
Work ate the week again. I’m exhausted, running on fumes, and daylight saving time stole an hour of sleep I could not afford–the biannual clock shuffle is one of those vestigial absurdities that nobody can be bothered to abolish, and I’m starting to take it personally.
I did manage to get my AI minions to do a proper refactor of the piclaw codebase (which was desperately needed), spent a bit of time cursing at the SK1 (and concluding that I need a new 3D printer), and that’s about it. Meetings, deadlines, the usual corporate grind.
This was the week of the great piclaw codebase reckoning–299 commits, all of them aimed at breaking apart the monolithic mess it was becoming before it got completely out of hand.
Agentic development works, but, again, you need taste and an enforced feedback loop to get good results. Still, you can automate away the boring parts, mostly.
The whole thing was driven by an autoresearch-style loop–I basically adapted Karpathy’s approach of having an LLM do research on a codebase, generate a plan, execute it, and then verify the results, except in my case the “research” phase also involves running the test suite and feeding failures back as context. It works very well for mechanical refactoring like this, where the risk of hallucination is low and the reward for not doing it by hand is immense:
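A sketch of that loop’s control flow, with the test suite as the feedback signal (`runAgent` and `runTests` are stand-ins for the actual LLM and tooling calls, which are obviously the hard part):

```typescript
// Sketch of a research → plan → execute → verify loop, feeding test failures
// back into the agent's context each round. Only the control flow is the point.

type StepResult = { ok: boolean; failures: string[] };

async function refactorLoop(
  runAgent: (prompt: string) => Promise<void>,
  runTests: () => Promise<StepResult>,
  maxRounds = 5,
): Promise<boolean> {
  await runAgent("Research the codebase and produce PLAN.md for the refactor.");
  for (let round = 0; round < maxRounds; round++) {
    await runAgent("Execute the next step of PLAN.md and update it.");
    const result = await runTests();
    if (result.ok) return true; // verified: the suite is green
    // feed the failures back as context for the next round
    await runAgent(`These tests failed, fix them before continuing:\n${result.failures.join("\n")}`);
  }
  return false; // gave up after maxRounds
}
```

The retry cap matters: without it, an agent that cannot fix a failing test will happily burn tokens forever.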
I still haven’t written up those SBC benchmarks I keep promising, but I have been using the SBC extensively–in fact, I dusted off my macemu fork (BasiliskII/SheepShaver with Raspberry Pi-optimized SDL2 builds, Docker images, and .deb packages) and got that running on it.
It’s been sitting at v1.0.5-patch-3 since February, but Marchintosh guilt finally got me to fire it up and poke at it. No new Mac Classic replica yet, but at least the emulator is working, and I am back trying to get an ARM64 JIT working in it, which is a fun challenge:
Orange Pi 6 Plus JIT status report from piclaw
I expect to have something to show on that front… this year?
Update: An hour later… Works, but still needs a fair amount of clean up, and it’s actually slower than the interpreter for now, which is a bit embarrassing:
JIT booting to desktop
Mar 27th 2026 · 1 min read
·
#apple
#design
#hacks
#macos
#ui
This is absolutely hilarious. The infuriating window corner roundness in Tahoe has been bugging me too–and this is a brilliant take on the problem.
Instead of disabling SIP and patching system apps to remove the rounded corners (which is the usual approach), this simply forces a consistent corner radius across all third-party apps via a DYLD-injected dynamic library.
It’s a small thing, but inconsistency in UI chrome is the kind of detail that, once you notice it, you can never un-notice. The fact that Safari has different corner radii from other apps is inexcusable–and that’s before the Liquid Glass disaster made everything look like a Fisher-Price toy dipped in vaseline. I appreciate the “if you can’t beat them, at least make them all equally ugly” philosophy here.
The implementation is old-timey, straightforward Objective-C method swizzling on NSThemeFrame–nothing exotic, but the approach of skipping com.apple.* bundles and only touching third-party apps means you don’t need to mess with SIP at all. That alone makes it worth bookmarking.
Mar 25th 2026 · 1 min read
·
#ai
#arm
#chip
#cpu
#hardware
#inference
The fact that ARM, whose entire business model revolved around licensing CPU designs, has decided to actually go and build their own chips is remarkable by itself, but the design specs (and power envelope) are very interesting.
I have been keeping tabs on the dedicated inference hardware space ever since I got wind of Cerebras, and I like the idea of special purpose/optimized CPU designs that would remove (or at least lessen) our dependency on NVIDIA (and GPUs in general) to run AI models, because that is the way to make it cheaper, less power hungry and, eventually, desktop-sized.
I do find it stupid to refer to this as an AGI CPU, though.
Mar 22nd 2026 · 2 min read
·
#agents
#ai
#balance
#bun
#dev
#life
#notes
#piclaw
#typescript
#weekly
#windows
#work
This week’s update is going to be short, largely because work was hell and I ended up spending my Saturday evening poring over my meeting notes backlog until 2AM today, and I have a splitting headache to show for it.
There will be a smattering of Wiki updates on the RSS feed as well, since I realized I had a bunch of updates pages sitting uncommitted in the queue that I hadn’t gotten around to pushing yet, but the only relevant updates I have right now are that piclaw got a major package layout overhaul and a new live widget system, while vibes got some shared UI improvements and a new agent avatar caching system.
And, most importantly, I have shifted from building the tooling to using the tooling, which is long overdue.
Here’s a good example of that:
Smith and Gates, happily coexisting in perfect harmony
Yes, that is a Linux agent and a Windows agent setting up Bluetooth passthrough on a VM. And yes, “Smith” manages my Proxmox cluster, as well as many other things now.
And yes, piclaw sort of runs on Windows now (in bare/non-sandboxed/YOLO mode), thanks to Bun’s incredibly good cross-platform support. I gave it a VM to break and a PowerShell extension to play with, and although I’m still deeply sorry I did it, it works well enough that I don’t need to care about the OS.
I now have four or five semi-permanent installs running on various machines (and a couple of Azure and Oracle VMs), and having them all accessible from my phone has completely changed how I deal with remote management–I can check on agents, trigger actions, and do light debugging from anywhere, which beats the old “SSH in and hope for the best” approach by a wide margin.
And soon, they will talk to each other. And who knows, I might do a desktop app at some point, but for now the web interface is doing just fine.
I never did get around to writing up those SBC benchmarks I mentioned last week. The hardware is sitting on my desk, the notes are in my Obsidian vault and I have a piclaw instance that is happy as a clam trying to get various local models running on it, but, again, work was just too much this week and I plan to spend the afternoon watching a movie and doing some light reading instead.
Mar 15th 2026 · 3 min read
·
#agents
#ai
#dev
#mcp
#notes
#typescript
#weekly
Well, there went another work week. Slightly better (to a degree, although I got some discouraging news regarding a potential change), and another week where piclaw ate most of my evenings–it went from v1.3.0 to v1.3.16 in seven days, which is frankly absurd even by my standards.
But there was a lot of ground to cover, and it’s turned into a sort of “agentic IDE” thing at this point, terminal and all:
piclaw
Yes, it looks like VS Code. But I suspect everything does at this point
Most of the week went into reliability work. I spent a day or so doing a full refactor, and then got three different LLMs to do a comprehensive full-stack audit of the turn/queue/steering system–which turned up a bunch of race conditions in message submission that I’d been chasing for weeks (plus proper queue deduplication and retry caps, which I should have added from the start). The deferred followup system I was using for steering was also broken in subtle ways–turns were inheriting stale thread roots, which caused all sorts of weirdness. The fun bits were the visual polish and theme support, but those came after the plumbing was solid.
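The dedup/retry-cap part is conceptually simple–the sketch below is hypothetical (piclaw’s actual implementation differs), but it captures the idea of keying submissions and refusing both duplicates and endless retries:

```typescript
// Hypothetical sketch: a submission queue with deduplication by key and a
// hard retry cap, so a stuck message can't be submitted twice or retried forever.
// Names (SubmitQueue, MAX_RETRIES) are illustrative, not piclaw's actual API.

const MAX_RETRIES = 3;

class SubmitQueue<T> {
  private pending = new Map<string, T>();       // key → payload (dedup)
  private attempts = new Map<string, number>(); // key → retry count

  enqueue(key: string, payload: T): boolean {
    if (this.pending.has(key)) return false; // duplicate submission: ignore
    this.pending.set(key, payload);
    return true;
  }

  // Returns false once the retry cap is exceeded, so callers surface the failure
  retry(key: string): boolean {
    const n = (this.attempts.get(key) ?? 0) + 1;
    this.attempts.set(key, n);
    if (n > MAX_RETRIES) {
      this.pending.delete(key); // give up: drop from the queue
      return false;
    }
    return true;
  }
}
```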
On the UX side, I added Adaptive Cards rendering to the chat timeline (with validation and lifecycle flows)–the idea being that agents can now push structured forms, tables, and interactive elements into the conversation instead of just Markdown. The workspace editor got markdown attachment previews and a Ghostty-based terminal lifted directly from webterm, plus a bunch of pipework for future multi-chat support (which is going to be a whole thing).
All of it involved, as you would expect, a lot of agent herding, and I had plenty of time to come up with stuff like this:
I couldn't help myself
And I cleaned up all the JS vendoring, since, like pi itself, piclaw is now an extension host, so most of the new features are actually implemented as extensions.
python-office-mcp-server
I dusted off a Python-based Office document MCP server I’d been building alongside go-ooxml and carved it into its own repository. It’s a pretty comprehensive set of tools for reading, editing, and generating Word/Excel/PowerPoint documents via MCP–unified interface, template support, auditing, the works. It’s stable enough that I decided to publish it as-is, with zero ongoing support commitment. Sometimes the best thing you can do for a side project is to ship it and walk away.
As to this site, I did another big batch of old page conversions–around 200 wiki pages from the hardware and apps sections got modernized from Textile to Markdown with proper frontmatter, descriptive image filenames, and cleaned-up links. Some of these pages date back to 2002, which is always a weird trip.
What I should be doing, though, is celebrating Marchintosh and building a new Mac Classic replica–but all I’ve done hardware-wise has been benchmarking SBCs for a potential new project. I hope to clean up my notes and post something about that next week.
Mar 15th 2026 · 1 min read
·
#a18
#apple
#hardware
#mac
I went to a local mall yesterday and chanced upon a couple of MacBook Neos on display at our local (monopolistic) Apple retailer1, and spent a half hour playing with them.
The display is… fine? With their strategic placement away from the other laptops on display, side-by-side comparison was impossible, but I didn’t find it bad or dim. The keyboard also felt OK, although I am wary of white keyboards, as they tend to get dirty and look bad over time. The trackpad was… also fine. I don’t press to click, so comparing the physical click with the taptic engine doesn’t make sense to me.
However, the bezels were a bit thicker than I am used to (hence noticeable) and I do think that the overall size could be smaller (even if that would be clearly against mainstream taste). It’s not something I would choose for a cheap travel laptop (especially considering I have the Minibook X), but volume-wise, and in “backpack scale”, it’s not that far off from the 13” MacBook Air if you squint.
To my surprise, it ran this site’s 3D visualization quite snappily, which is no small feat (most PC laptops struggle with it, regardless of what browser and GPU they have). And although I didn’t have the chance to run any benchmarks, RAM usage in Activity Monitor was pretty much OK after launching a gaggle of Apple apps, which wasn’t surprising (their software load doesn’t include any of the Electron bloat everyone has to deal with to some degree).
In general, I think it makes a killer Chromebook/PC laptop replacement for school, and although I expect the A18 to not be a powerhouse, it felt quite snappy, even with multiple apps open. And I’m sure Apple went to some pains to position it where it wouldn’t threaten either the Air or the iPad Pro.
I’d buy one in a flash if it was smaller (well, if I had money to spare), but at least the A18 performance makes me optimistic for whatever they decide to put into the next iPad mini, since my 5th generation one is in desperate need of an upgrade.
No, Portugal still does not have official Apple Stores. I’ve given up on it and just order online. ↩︎
Mar 8th 2026 · 6 min read
·
#agents
#ai
#coding
#dev
#editors
#security
We’re three months into 2026, and coding agents have been a big part of my time since last year–things have definitely intensified, and one of my predictions has already panned out: agents are everywhere.
Yes, I love this picture, and I won't apologize for it
My advice for people getting into this remains the same:
In the music hobby, there’s a thing called GAS–Gear Acquisition Syndrome–where people get obsessed with buying the latest gear even if they don’t know how to use it. I see quite a lot of that in the AI space right now, so I’d rather recommend starting with mature, well-supported tools:
VS Code with GitHub Copilot is still the best entry point–you can compare Claude, GPT and Gemini side by side, and it affords real control over the agent’s environment (plus it’s designed for enterprise use).
Mistral Vibe and Gemini CLI both have daily free tiers with enough fail-safes to experiment safely. (I still recommend sandboxing, but it’s less critical than it was a few months ago.)
OpenCode is the fully free route, but the models have fewer guardrails and can take weird turns–definitely sandbox this one.
I can’t in good conscience recommend spending hundreds on Anthropic or OpenAI subscriptions right now–the market is saturated, and both are shipping desktop tools (Claude Code, Codex) that will likely come with cheaper tiers. The “use boring technology” adage applies here too.
I never run agent tools on a machine with personal data–that’s why I built agentbox in the first place. You may not need to be as strict about it as I am, but VS Code supports dev containers on any platform, and both Anthropic and OpenAI are shipping sandboxes with their tools, so there’s really no excuse.
There are rather a lot of misconceptions about data privacy, and things like OpenClaw don’t help matters (I’m still gobsmacked people give it direct access to their e-mail). Even with enterprise-hosted models that don’t train on your data, “don’t run untrusted code on a machine with personal data” is a principle worth keeping.
Like I was quipping the other day, AI is the new digital advertising–and yet people are giving it more access to their data than they give ad networks, which is just baffling.
And speaking of OpenClaw–the “local AI” fallacy needs addressing. None of these things are really “local” in any meaningful sense–the gap between local and cloud models is huge, and even tens of thousands of dollars in hardware won’t get you close to frontier capabilities.
And there is an almost weekly hype cycle around local models that I think is totally unwarranted in practice. For instance, Qwen is promising, but the local quantizations match last year’s cloud models at best, and the gap keeps widening.
I keep coming across people who say AI generates rubbish code, and I think it’s usually one of two things: wrong tools (VS Code does much more than provide a UI–it curates context and feeds models your code structure), or wrong approach (they’re quite literally “holding it wrong”). I wrote about the expectation gap two years ago and the fundamentals haven’t changed.
Part of it is inflated expectations, of course. Frontier models like Opus 4.6 and GPT-5.4 are very capable, but they need skill to use effectively, and they never produce perfect code on the first try. You have to know how to use them, and that takes practice.
I’ve been refining how I work with them since last year, and although things have evolved quite a bit, the core principles remain the same.
Every project starts with a SPEC.md that I iterate on with the LLM, 20-questions style, until it covers the essentials–goals, functional specs, non-functional specs, technical specs, and acceptance criteria.
I prefer SPEC.md over PRD.md because it emphasises specification over requirements–I want the agent to follow it, not interpret it freely.
This isn’t a prompt–it’s a living document that evolves with the project. And agents love specs–feeding one an actual ECMA-376 spec document got me to 60% compliance in days, with no hallucinated APIs.
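For illustration, a SPEC.md skeleton along those lines (headings only–the answers to the 20 questions are what fills it in; the exact structure varies per project):

```markdown
# SPEC: <project name>

## Goals
- What the project should achieve, in one or two bullet points.

## Functional specs
- Observable behaviour: inputs, outputs, commands, UI.

## Non-functional specs
- Performance, reliability and portability constraints.

## Technical specs
- Languages, frameworks, data formats, external APIs.

## Acceptance criteria
- Concrete, testable statements the agent can verify against.
```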
I complement specs with SKILL.md files–guidelines for coding, tooling, or domain-specific tasks. I have a growing collection in agentbox, and every new project starts with a make init that adapts them to the local spec.
You can also fold these into .github/copilot-instructions.md (which VS Code picks up automatically), but standalone skills are tool-agnostic.
The properly interesting bit is that agents now write their own skills–piclaw built its own hot-reload, backup, and web scraping skills after I’d guided it through the process a few times. Early days, but that’s where this is headed.
MCP tools work better than skills for chaining–models struggle to chain skills together, whereas MCP narrows context and presents clear next steps. But for teaching how rather than what, skills are still invaluable.
From the SPEC.md, I create a PLAN.md–not a flat TODO checklist, but a structured breakdown the agent can reason about (what’s done, what’s blocked, why). It updates the plan as it goes, which also refreshes model context. No reliance on built-in planning tools (which are patchy across models), and the plan is always in the repo for me to review.
The loop itself is pretty simple:
I break down work into focused chunks–scaffolding, data model, API, etc.
The agent writes code, lints, tests, documents, and updates the PLAN.md.
I review and steer–correcting, feeding it more context, or pointing at examples.
There is also an “advanced mode” where I break the PLAN.md up into multiple Kanban-like tickets, and we then play “20 questions” again for each ticket alone, with the LLM asking me questions to refine its scope. I realize most people just don’t have the patience for that, even though it massively improves what gets done and how.
The most important bit of the loop–and the one most people get wrong at first. Effective steering isn’t about reprompting and hoping for the best; it’s about funnelling the agent’s attention to the right context:
TDD-like tests–describe expected behaviour, ask the agent to make them pass. My most reliable workflow, especially when porting across languages.
Linting and static analysis via Makefile or CI–the agent self-corrects. I aim for 80% coverage as a quality bar.
Steering by example–pointing at existing code that demonstrates the right approach.
Switching models–Claude still writes dodgy tests (ok, fine, it forgets all sorts of corner cases), so I use Codex models for test and security audits.
Yes, that’s proper work. But it’s no different from managing humans, and it gets easier with practice.
Some languages are inherently better for agents–not in terms of popularity (any agent can do decent Python), but because strong types and annotations help models understand intent and self-correct.
In my experience, Go, Rust and TypeScript work much better than Swift (too many revisions, too-opinionated frameworks). Go in particular reinforces context through explicit function and interface references, and TypeScript has excellent tooling for enforcing good practices–which is why I’ve been using both a lot more recently.
My sandboxing approach is working well, and I don’t see myself changing it anytime soon. But I’ve been slowly extending my agents’ reach–piclaw, which started as a weekend hack, is now a fully-fledged agentic assistant on my phone, and I’ve been giving it more autonomy as I learn to trust the guardrails.
The next frontier (boy, is this a pompous term, but I guess corporate speak is sneaking in this late in the evening) is getting agents to collaborate–sharing context and skills to work as a group. I have some ideas about how to do it, but that’s a matter for another post.
Mar 8th 2026 · 2 min read
·
#agents
#ai
#dev
#notes
#security
#typescript
#weekly
This was a frankly absurd week work-wise, with some pretty long days and a lot of late-night hacking on my projects (which is not exactly a new thing, but at least now I am asking piclaw to do it during the daytime, which is a small improvement).
This is just lovely. If, like me, you grew up with the LEGO Space collection and loved the artwork on those pieces, and you do 3D printing, this 10:1 scale recreation with a Mac mini and a 7-inch display will make your day.
I’m just a bit sad that the cabling is still very visible, but you can grab the files from Makerworld and give them your own spin.
Mar 4th 2026 · 1 min read
·
#a18
#apple
#hardware
#mac
I know a bunch of people will disagree, but this is the most relevant Mac announcement in years for two reasons:
It’s the first new Mac model in a while that isn’t just a spec bump, but rather a new product line with a clear target audience and a pretty aggressive price point (by Apple standards, that is).
It’s not running on an M-series chip, which is a bold move that could have significant implications for Apple’s product strategy and the broader Mac ecosystem.
The fact that it has “only” 8GB of RAM and 256GB of storage (which is OK if you think of it as a school machine) is going to be widely maligned. I would focus instead on the missed opportunity to make it even more portable by shipping a 12” display instead of 13” (probably some sort of golden ratio thing), and on the unbelievable stinginess of shipping with a USB-C 2.0 port.
What? You couldn’t afford a USB-C 3.0 port? Really? I mean, I get that this is an entry-level machine, but come on, Apple.
Update: this seems to be a limitation of the A18 chipset’s I/O setup, from what I’m reading. There’s a lot of chip information out there now, including breakdowns of the new M5 lineup that are worth perusing as well.
That said, I would swap my iPad Pro for it in a flash (if it had a 12” display, that is). And that is probably exactly why it is that big.
Mar 1st 2026 · 5 min read
·
#agents
#ai
#dev
#golang
#notes
#security
#weekly
This is a great round-up, and it isn’t hard to spot the main themes: great hardware, and absolutely damning feedback on so many aspects of software quality (from the Liquid Glass Tsunami to people outright avoiding installing Tahoe) that I cannot help but agree with it (especially considering my current travails).
The best possible outcome from this is that Apple backtracks on the mess they created last year.
The most likely one is that they will simply carry on without acknowledging any of it publicly and discreetly patch the most critical issues, because they are still making tons of cash on hardware and services and software quality really hasn’t been a priority in half a decade.
At this point, I am even starting to question if they still have the talent (or the ability to retain it), especially considering that the people from most of the startups they’ve acquired over the years keep leaving. And I know for a fact that they stopped recruiting remotely a few years ago, which definitely hasn’t helped.
Feb 21st 2026 · 2 min read
·
#agents
#ai
#automation
#home
#notes
#siri
#weekly
#wellness
This week I did something different: I took a wellness break from work and generally tried to tune out all the noise and messiness I have been experiencing there. It ate a chunk out of my PTO, but was mostly worth it.
I have no idea what is happening since I can’t even find any decent logs in Console.app, but it seems that the latest update to macOS Tahoe (26.3) has a serious bug.
So, I finally got around to opening the Maclock I’ve had sitting around for a while, and I was looking at the STL files for the screen bracket that are over on Printables.
Since I want to fit a lot more electronics into the case, one of the things I need to do is modify that bracket a bit so I can actually use the brightness knob to drive a potentiometer instead of just a dummy knob.
But you can’t really edit STL files, and I don’t have the original CAD files, so I thought “well, maybe I can just ask an LLM to reverse-engineer the STL into an OpenSCAD file, and then I can modify that”.
So I created a SPEC.md file with the following content, spun up a new agentbox container to handle it, and gave it to pi:
# STL to OpenSCAD reconstruction
You are running inside a Debian system with sudo, brew, python3 and whatever you need to accomplish this task.
Your job is to take the STL file in this workspace and reconstruct a copy of it using OpenSCAD.
If the STL is too large or complex, you can try decimating it/using octrees/segmenting it/using marching cubes as needed.
## Requirements

- The final STL file must be within 0.1mm accuracy of the original
- The OpenSCAD code must be modular, readable, and have relevant constants surfaced at the beginning for a customizer
- You should focus on building reusable tools, not one-off scripts
- You should maintain a plan.md file with your progress checklist and tick off items (including failed approaches) as you make progress
Much to my surprise, Claude Opus 4.6 was able to handle this task pretty well, and it produced a stupefyingly accurate OpenSCAD file:
// WonderMac Screen Bracket - OpenSCAD Reconstruction
// Reconstructed from STL analysis
// All dimensions in mm

/* [Main Dimensions] */
// Total width of the bracket (X direction)
bracket_width = 68; // X: -50 to 18
// X offset (left edge position)
bracket_x_offset = -50;

/* [Back Plate] */
// Height of the tall vertical back plate
back_plate_height = 21.5; // Z: 0 to 21.5
// Thickness of the back plate (Y direction)
back_plate_thickness = 2; // Y: -16.5 to -14.5
// Y position of back plate inner edge
back_plate_y = -16.5;

/* [Top Shelf] */
// Bridge depth (connects back plate to top lip)
bridge_depth = 5.5; // Y: -14.5 to -9
// Bridge height
bridge_height = 2; // Z: 0 to 2
// Top lip depth
top_lip_depth = 2; // Y: -9 to -7
// Top lip height
top_lip_height = 5; // Z: 0 to 5

/* [Frame] */
// Frame outer Y bounds
frame_outer_y_min = -66.5;
frame_outer_y_max = -14.5; // matches back plate
// Frame inner bounds (lower section Z=0-7)
frame_inner_x_min = -47;
frame_inner_x_max_lower = 18; // open right side
frame_inner_y_min_lower = -64.5;
frame_inner_y_max_lower = -16.5;
// Frame inner bounds (upper section Z=7-9)
frame_inner_x_max_upper = 13;
frame_inner_y_min_upper = -62.5;
frame_inner_y_max_upper = -17.5;
// Frame heights
frame_lower_height = 7; // Z: 0 to 7
frame_upper_height = 2; // Z: 7 to 9

/* [Bottom Lip] */
// Lip extension below frame
lip_y_min = -70.5;
lip_y_max = -66.5;
lip_z_min = 3;
lip_z_max = 5;
lip_height = 2; // lip_z_max - lip_z_min

/* [Extended Plate] */
// L-shaped plate at Z=5-7
plate_z_min = 5;
plate_z_max = 7;
plate_height = 2;
// Bottom part of L (full width)
plate_bottom_y_min = -9;
plate_bottom_y_max = -3;
// Left part of L (extends upward)
plate_left_y_max = 15;
plate_left_x_max = -30;

/* [Screw Post] */
// Cylindrical post with through-hole
post_center_x = -41;
post_center_y = 10.5;
post_outer_radius = 2;
post_inner_radius = 1;
post_z_min = 0.5;
post_z_max = 5; // meets plate bottom

/* [Screw Holes] */
// Hole radius for mounting holes
mount_hole_radius = 0.75;
// Lip mounting holes
lip_hole_left_x = -47.75;
lip_hole_right_x = 15.75;
lip_hole_y = -69;
// Plate mounting holes
plate_hole_left_x = -46;
plate_hole_right_x = 15;
plate_hole_y = -5;

/* [Resolution] */
$fn = 64;

// ============================================
// Modules
// ============================================

module back_plate() {
    translate([bracket_x_offset, back_plate_y, 0])
        cube([bracket_width, back_plate_thickness, back_plate_height]);
}

module bridge() {
    translate([bracket_x_offset, back_plate_y + back_plate_thickness, 0])
        cube([bracket_width, bridge_depth, bridge_height]);
}

module top_lip() {
    translate([bracket_x_offset, -9, 0])
        cube([bracket_width, top_lip_depth, top_lip_height]);
}

module frame_lower() {
    difference() {
        // Outer rectangle
        translate([bracket_x_offset, frame_outer_y_min, 0])
            cube([bracket_width, frame_outer_y_max - frame_outer_y_min, frame_lower_height]);
        // Inner cutout
        translate([frame_inner_x_min, frame_inner_y_min_lower, -1])
            cube([frame_inner_x_max_lower - frame_inner_x_min,
                  frame_inner_y_max_lower - frame_inner_y_min_lower,
                  frame_lower_height + 2]);
    }
}

module frame_upper() {
    difference() {
        // Outer rectangle (same as lower)
        translate([bracket_x_offset, frame_outer_y_min, frame_lower_height])
            cube([bracket_width, frame_outer_y_max - frame_outer_y_min, frame_upper_height]);
        // Inner cutout (smaller = thicker walls)
        translate([frame_inner_x_min, frame_inner_y_min_upper, frame_lower_height - 1])
            cube([frame_inner_x_max_upper - frame_inner_x_min,
                  frame_inner_y_max_upper - frame_inner_y_min_upper,
                  frame_upper_height + 2]);
    }
}

module bottom_lip() {
    translate([bracket_x_offset, lip_y_min, lip_z_min])
        cube([bracket_width, lip_y_max - lip_y_min, lip_height]);
}

module extended_plate() {
    // Bottom part of L (full width, Y=-9 to -3)
    translate([bracket_x_offset, plate_bottom_y_min, plate_z_min])
        cube([bracket_width, plate_bottom_y_max - plate_bottom_y_min, plate_height]);
    // Left part of L (Y=-3 to 15, X=-50 to -30)
    translate([bracket_x_offset, plate_bottom_y_max, plate_z_min])
        cube([plate_left_x_max - bracket_x_offset,
              plate_left_y_max - plate_bottom_y_max,
              plate_height]);
}

module screw_post() {
    translate([post_center_x, post_center_y, post_z_min])
        cylinder(r = post_outer_radius, h = post_z_max - post_z_min);
}

module screw_post_hole() {
    translate([post_center_x, post_center_y, -1])
        cylinder(r = post_inner_radius, h = back_plate_height + 2);
}

module lip_holes() {
    // Left lip hole
    translate([lip_hole_left_x, lip_hole_y, lip_z_min - 1])
        cylinder(r = mount_hole_radius, h = lip_height + 2);
    // Right lip hole
    translate([lip_hole_right_x, lip_hole_y, lip_z_min - 1])
        cylinder(r = mount_hole_radius, h = lip_height + 2);
}

module plate_holes() {
    // Left plate hole
    translate([plate_hole_left_x, plate_hole_y, plate_z_min - 1])
        cylinder(r = mount_hole_radius, h = plate_height + 2);
    // Right plate hole
    translate([plate_hole_right_x, plate_hole_y, plate_z_min - 1])
        cylinder(r = mount_hole_radius, h = plate_height + 2);
}

// ============================================
// Assembly
// ============================================

module bracket_assembly() {
    difference() {
        union() {
            back_plate();
            bridge();
            top_lip();
            frame_lower();
            frame_upper();
            bottom_lip();
            extended_plate();
            screw_post();
        }
        // Subtract all holes
        screw_post_hole();
        lip_holes();
        plate_holes();
    }
}

bracket_assembly();
This is what the output looks like
But what is more important is that I was able to capture the entire process in a SKILL.md file, and it makes for an amazing read:
# SKILL: STL to Parametric OpenSCAD Reconstruction

## Goal

Reverse-engineer a binary/ASCII STL mesh file into a clean, parametric OpenSCAD source file that reproduces the original geometry within a specified tolerance (e.g. 0.1mm Hausdorff distance).
## When to Use

- You have an STL file of a mechanical part and need an editable parametric source
- The part is primarily composed of prismatic (box-like) and cylindrical features — not organic/sculpted shapes
- You need the output to be human-readable and customizable, not just a mesh re-export
## Prerequisites

- **Python packages**: `numpy`, `trimesh`, `scipy`, `shapely`, `networkx`, `rtree`, `numpy-stl`
- **System packages**: `openscad`
- Install with: `pip3 install numpy trimesh scipy shapely networkx rtree numpy-stl` and `sudo apt-get install openscad`

## High-Level Approach

### Phase 1: Mesh Triage
Load the STL with `trimesh` and gather key statistics to understand the scope:
- **Vertex/face count**: Determines complexity. Under ~5k faces is likely a machined/printed part with clean geometry.
- **Bounding box and extents**: Gives the overall dimensions.
- **Volume and watertightness**: Confirms the mesh is valid and closed.
- **Euler number**: Computes genus (number of through-holes). Formula: `genus = (2 - euler_number) / 2`. This tells you how many holes to find.
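The Euler-number check needs nothing but the vertex and face lists. A stdlib-only sketch on a hand-built triangulated cube (a stand-in for what trimesh would load from a real STL):

```python
# Genus from the Euler characteristic: V - E + F = 2 - 2*genus for a
# closed orientable mesh. The mesh below is a hand-written triangulated
# unit cube, purely for illustration.

faces = [
    (0, 1, 2), (0, 2, 3),  # bottom
    (4, 6, 5), (4, 7, 6),  # top
    (0, 5, 1), (0, 4, 5),  # sides
    (1, 6, 2), (1, 5, 6),
    (2, 7, 3), (2, 6, 7),
    (3, 4, 0), (3, 7, 4),
]

vertices = {v for f in faces for v in f}
# Undirected edges, deduplicated across the two triangles sharing each one.
edges = {tuple(sorted((f[i], f[(i + 1) % 3]))) for f in faces for i in range(3)}

euler = len(vertices) - len(edges) + len(faces)
genus = (2 - euler) // 2
print(euler, genus)  # cube: euler = 2, genus = 0 (no through-holes)
```

A part with one through-hole (a torus-like topology) would come out with `euler = 0`, `genus = 1`, telling you there is exactly one hole to hunt down.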
### Phase 2: Identify Z-Level Structure
For prismatic parts (brackets, enclosures, mounts), the geometry is almost always built from features extruded along one principal axis. Identify which axis that is by examining the unique coordinate values of vertices.
1. **Find unique vertex coordinates** along each axis (rounded to ~3 decimal places). The axis with the fewest unique values is the extrusion/stacking axis.
2. **List the discrete levels** on that axis. For this bracket, Z had only 8 unique values: `[0, 0.5, 2, 3, 5, 7, 9, 21.5]`. Each pair of adjacent levels defines a “layer” of constant cross-section.
3. **Count up-facing and down-facing face areas** at each level. Up-facing faces at a Z-level mark the *top* of a feature; down-facing faces mark the *bottom* of a feature starting at that height. The area values serve as checksums for your reconstruction.
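Step 1 of that analysis fits in a few lines. A stdlib sketch with a synthetic vertex array standing in for a real mesh's vertices:

```python
# Find the extrusion/stacking axis: the axis with the fewest unique
# (rounded) coordinate values. The vertex list is synthetic, for
# illustration only.

vertices = [
    (0.0, 0.0, 0.0), (68.0, 0.0, 0.0), (68.0, 52.0, 0.0),
    (0.0, 52.0, 2.0), (12.5, 3.0, 2.0), (41.0, 17.25, 7.0),
    (0.0, 52.0, 7.0), (68.0, 0.0, 21.5), (30.0, 26.0, 21.5),
]

def unique_levels(verts, axis, decimals=3):
    """Distinct coordinate values along one axis, rounded to kill float noise."""
    return sorted({round(v[axis], decimals) for v in verts})

levels = [unique_levels(vertices, a) for a in range(3)]
stack_axis = min(range(3), key=lambda a: len(levels[a]))
print("XYZ"[stack_axis], levels[stack_axis])  # Z [0.0, 2.0, 7.0, 21.5]
```

Here Z has only four distinct values versus five each on X and Y, so Z is the stacking axis and each adjacent pair of levels bounds one constant-cross-section layer.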
### Phase 3: Cross-Section Analysis
Take planar cross-sections at the midpoint of each layer using `trimesh.section()`:
1. **Slice the mesh** at each intermediate Z value (e.g. Z=0.25, Z=1, Z=2.5, etc.).
2. **Convert to 2D polygons** via `section.to_planar()` and examine the `polygons_full` property.
3. **Simplify polygons** with Shapely’s `.simplify()` to reduce curved arcs to key vertices while preserving corners.
4. **Transform back to world coordinates** using the planar transform matrix to get actual XY positions.
5. **Record each polygon’s exterior and interior (hole) boundaries**. Note how many vertices remain after simplification — a 5-point polygon is a rectangle, a 9-point polygon is an L-shape, a 17-point polygon is a circle approximation, etc.
Track how the cross-section *changes* between layers — this reveals where features start, end, merge, or split.
### Phase 4: Identify Geometric Primitives
From the cross-section data, decompose the shape into CSG primitives:
- **Rectangles** (5 simplified vertices = box cross-section): Record corner coordinates, extrusion height range.
- **L-shapes / U-shapes** (9+ vertices): Decompose into union of rectangles, or model as rectangle-minus-rectangle.
- **Circles / arcs** (17+ vertices after simplification): Compute center as midpoint of extremes, radius as half the span. Verify by checking vertex distances from the computed center — they should all equal the radius.
- **Rings/annuli** (polygon with circular hole): Outer and inner radius from the exterior and interior boundaries.
For each primitive, determine:
- XY bounds or center+radius
- Z range (which layers it spans)
- Whether it’s additive (part of the union) or subtractive (a hole to difference out)
### Phase 5: Cross-Validate with Vertex Grouping
For extra confidence, directly examine the raw vertices at each Z-level:
- Group vertices by their Z coordinate.
- For levels with few vertices (≤20), print them all — these directly reveal rectangle corners.
- For levels with many vertices, look for clusters. Compute distances from suspected circle centers and verify constant radius.
- Check that circle parameters (center, radius) are consistent across multiple Z-levels.
### Phase 6: Build the OpenSCAD Model
Structure the `.scad` file for readability and customization:
1. **Constants at the top** in OpenSCAD Customizer sections (`/* [Section Name] */`). Every dimension gets a named variable with a comment showing its physical meaning and original coordinate range.
2. **One module per feature**: `back_plate()`, `frame_lower()`, `screw_post()`, `lip_holes()`, etc. Each module is self-contained and uses only the global constants.
3. **Assembly module**: A single `bracket_assembly()` module that `union()`s all additive features, then `difference()`s all holes. This keeps the boolean logic clean and makes it easy to toggle features.
4. **Resolution control**: A single `$fn` parameter controls circle smoothness globally.
Modeling patterns:
- **Rectangular frame**: `difference()` of outer `cube()` minus inner `cube()`.
- **L-shaped plate**: `union()` of two overlapping `cube()` calls.
- **Through-hole**: `cylinder()` with height extending past the material (add 1mm on each side with `-1` offset and `+2` height to ensure clean boolean cuts).
- **Ring/post**: `cylinder()` for the outer, with a through `cylinder()` subtracted.
### Phase 7: Render and Compare

1. **Render** with `openscad -o output.stl model.scad`.
2. **Compare** using a reusable Python comparison tool that computes:
- **Bidirectional surface distance**: Sample 50k points on each surface, find nearest point on the other surface using `trimesh.nearest.on_surface()`. Report mean, max, 95th/99th percentile.
- **Volume difference**: Compare `mesh.volume` values.
- **Bounds match**: Check bounding boxes agree within tolerance.
- **Topology match**: Compare Euler numbers.
3. **Iterate** if the Hausdorff distance exceeds the tolerance. Common fixes:
- Wrong dimension by a small amount → re-examine vertex coordinates at that Z-level
- Missing feature → look at the worst-mismatch sample points to locate the problem area
- Circle approximation error → increase `$fn`

### Phase 8: Verify the Accuracy Target
The final gate is the bidirectional Hausdorff distance. For this task the target was 0.1mm; the achieved result was 0.004mm (25× better than required). The residual error comes entirely from polygon approximation of circles (`$fn=64` gives a theoretical max error of `r × (1 - cos(π/64))` ≈ 0.0024mm for r=2mm).
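That accuracy gate boils down to a bidirectional Hausdorff distance over sampled surface points. A brute-force stdlib sketch on two tiny synthetic point clouds (the real tool would sample tens of thousands of points from the meshes):

```python
# Bidirectional Hausdorff distance between two point samples: the worst
# nearest-neighbour distance in either direction. Brute force is fine
# at demo scale; real meshes need spatial indexing (e.g. trimesh/scipy).

import math

def hausdorff(a, b):
    def directed(p, q):
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
# A "reconstruction" shifted 0.05mm in X: this would pass a 0.1mm gate.
recon = [(x + 0.05, y, z) for x, y, z in original]

d = hausdorff(original, recon)
print(f"{d:.3f} mm")  # 0.050 mm
```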
## Key Lessons

1. **Z-level analysis is the critical insight for prismatic parts.** If the mesh has only a handful of unique Z values, the part is a stack of extruded profiles and can be exactly decomposed.
2. **Cross-sections + simplification finds the primitives fast.** Shapely’s `simplify()` with a small tolerance (0.05–0.1mm) collapses arcs to their key points while preserving sharp corners.
3. **Euler number tells you how many holes to find.** Don’t stop looking for features until you can account for all `(2 - χ) / 2` topological handles.
4. **Face normal grouping separates flat vs. curved surfaces.** Axis-aligned normals (±X, ±Y, ±Z) are planar faces; all others are cylinder walls. The Z-component of non-axis normals reveals whether cylinders are vertical (Z=0) or angled.
5. **Up/down face area sums serve as checksums.** Compute the expected area of each horizontal surface from your model parameters and verify it matches the STL. This catches dimension errors before rendering.
6. **Model in original coordinates, not relocated.** Keeping the STL’s native coordinate system avoids translation errors and makes comparison trivial.
7. **Build the comparison tool first.** A reusable `compare_stl.py` with surface sampling and Hausdorff distance makes iteration fast and objective.
## Reusable Tools
All tools live in `tools/` with CLI interfaces, `--help`, and `--json` output.
See `tools/README.md` for full usage.
| Tool | Phase | Purpose |
|------|-------|---------|
| `tools/stl_info.py` | 1 | Mesh triage: stats, topology, genus, components |
| `tools/stl_zlevel.py` | 2 | Find discrete height levels, face areas, vertex coords |
| `tools/stl_cross_section.py` | 3–4 | Slice mesh, extract & classify 2D polygons |
| `tools/stl_normals.py` | 4 | Face normal grouping, cylinder feature detection |
| `tools/compare_stl.py` | 7 | Bidirectional Hausdorff distance, volume, topology |
### Quick-start workflow
python3 tools/stl_info.py part.stl # What am I dealing with?
python3 tools/stl_zlevel.py part.stl --vertices # Layer structure + corners
python3 tools/stl_cross_section.py part.stl # Auto-slice cross-sections
python3 tools/stl_normals.py part.stl # Find cylinders and holes
python3 tools/stl_cross_section.py part.stl --axis x --at 0 # Hidden internal features
# ... write OpenSCAD model ...
openscad -o output.stl model.scad
python3 tools/compare_stl.py part.stl output.stl # Verify accuracy
## Deliverables
| File | Purpose |
|------|---------|
| `tools/` | Reusable analysis toolkit (see `tools/README.md`) |
| `bracket.scad` | Parametric OpenSCAD source with customizer sections |
| `bracket_output.stl` | Rendered STL for comparison |
| `plan.md` | Progress checklist with identified components and results |
…and yes, it also created tools for its own use. It’s not a chimpanzee using a stick to get at termites, but it is pretty close: it’s an LLM creating its own toolchain to get at the underlying geometry of a mesh.
This is far more sophisticated than I expected, and it shows that LLMs can be used for scoped reverse-engineering tasks with the right prompting and constraints–but it also shows that you need to be able to understand the problem domain and guide the model with the right structure and checks to get a usable result.
The caveat is that this is a very specific use case (STL to OpenSCAD for prismatic parts), and I wouldn’t expect the same approach to work for more complex shapes or different file formats without significant adjustments. But it’s very much above and beyond what we could do a year ago.
Now excuse me while I go and give it a development board’s dimensions and ask it to design a custom case for it…
Update: Pretty impressive results on the custom case as well–here’s a first draft of it, which is already pretty close to what I need:
An Orange Pi 6 Custom Case, generated after a few web searches and image references
Feb 15th 2026 · 3 min read
·
#ai
#go
#notes
#porting
#testing
I’ve been sleeping on doing mine, which has been sitting next to the larger 3D printed version for a couple of months now.
Even though I spent quite some time trying to improve Basilisk II performance, I kind of lost the plot with all the other AI hackery that has taken place since. But I have all the right parts, and everyone seems to have sorted out how to open the case without breaking anything, so this is very high up on my list of things to do over Mardi Gras.
Feb 9th 2026 · 1 min read
·
#french
#games
#rpg
#screenshots
#xbox
We’ve had one of these in the house for years, and it’s insane how good and genuinely smooth it’s been throughout.
Ours has survived two Apple TV generations and is still our go-to for watching most things, to the point where I’m starting to dread replacing it because there just isn’t anything else out there that is at least half as good in the Android space.
Feb 7th 2026 · 3 min read
·
#agents
#ai
#docker
#golang
#notes
#productivity
#python
#swift
#weekly
Xcode 26.3 getting official Claude and Codex integration without the usual guardrails is interesting enough, but having MCP in the mix is… unusually open for Apple.
Given Claude’s propensity to botch up Swift semantics, though, I’m happy they announced both–and although I’m not surprised they didn’t add any Apple models, I would have expected Gemini to be in the mix, even this early.
But at least they seem to have done their homework as regards the in-editor agent harness–not sure how deep they went into IDE primitives (file graph, docs search, project settings), though, and the apparent lack of skills and other creature comforts that all the VS Code–based tools have is a bit of a bummer.
The fact that they put the chat interface on the left in total defiance of everyone else’s design is, of course, because they just had to mess with our heads.
Design is how it works, after all.
Feb 1st 2026 · 2 min read
·
#agents
#ai
#go
#notes
#python
#rdp
#specs
I’ve had some feedback that my last few weekly notes (especially over the holidays) have been a bit too long and that I should try to keep them shorter and more focused on a single topic.