A key insight I’ve had this Thursday is that I think I’m regretting going back to Twitter/X, because everything is so loud there. I went back to follow Salvatore, Armin and Mario, so my “For You” feed is now all AI, all the time, and although I can tune out the FOMO it induces, the pace is exhausting, particularly around OpenClaw.
OpenClaw might be a blip, but right now it has spawned another Cambrian explosion of DIY agents, with everyone and their lobster creating minimal/more secure/tailored versions of it. If sandboxing was already high on everyone’s mind, the fact that people are actually giving API keys and “money” to a set of convoluted digital noodles running in a JavaScript runtime with administrative privileges drove everyone over the edge in various ways.
In short, it’s the Wild West, and there are no stable, clear patterns emerging yet.
Which is why I’ve decided that my current sandboxing solution is going to stay the way it is for the foreseeable future. I’m still developing a sandboxable/WASM-ready busybox clone, but I’m going to wait it out until proper reusable patterns emerge (spoiler: I don’t think we’ll get to a consensus this year, unless sticking to containers is a consensus).
Both Anthropic and OpenAI launched their incremental updates this week, pretty much on schedule and confirming my years-long assertion that the decreasing returns phase of LLMs would be compensated by relentless optimization—but no amazing breakthroughs.
Still, both are pretty decent upgrades. Since I have pretty big SPEC-driven projects built with their previous iterations, I’ve let both loose on those codebases (I have taken to asking for a code smell/best practices/logic and security audit, including writing fuzzing tests) and both Opus 4.6 and Codex 5.3 spotted a few things.
What they both lack, sadly, is taste. Claude models tend to be much better at creating UIs than Codex but write absolutely shit tests, and Codex will often design API surfaces that make sense but are cumbersome to use—and neither of those traits (nor the annoying “personality” twists some stupid product managers insist on instilling on the models) was fixed by this week’s updates.
The demos are, of course, amazing, but what should matter at this point is accuracy (against specs), correctness (of code), and speed (which is what Codex 5.3 improved for me). I don’t particularly care about other people’s use cases, and neither should you.
I’ve been sticking to the new GitHub Copilot CLI because I can get any frontier model on it, so I’ve been isolated from the OpenCode/minimal agent hype. I do likePi and it’s minimal, shell-like workflow and approach, but am still convinced that we need higher-level tooling (because, again, my ultimate goal is not to build coding agents).
Regardless of tools, people have finally cottoned on to skills—and although I think there is a relapse to the voodoo/cargo-culting of early prompt engineering approaches, there are a few useful nuggets out there worth collecting and adapting.
My take on it has been to fold them into a skel folder in agentbox, and asking Copilot to "take the skills, workflows and instructions from this URL and adapt them to the scope of our SPEC.md" for every new project - both Codex and Claude are smart enough to not just duplicate the structure, but also to rewrite the skills to better suit the project, which is delightful.
Then during the project I will often ask Copilot to capture a specific workflow into a new skill (for instance, this week I got some feedback on Swift development, and that is now codified in this file).
And after things stabilize a bit, I can take any truly new skills or useful updates and put them back into my little archive, rather than collecting random stuff off the Internet that might never really suit my workflow and tooling.
One of the few worthwhile outcomes of dipping into Twitter/X is that I can gauge the mass market impact of image and video generation with a broader view than just hanging around r/StableDiffusion, and this week was quite interesting in that regard.
Kling has been seeding social media with pretty amazing (but still detectable) AI shorts. They claim Hollywood is dead, but realistically I’d say video advertising (where impactful short content rules and there is maximal return) is going to be revolutionized, because good creatives will certainly know what to do with it.
That is worrying on several fronts, especially considering that even official sources seem to be using AI-generated media these days.
But I am cautiously optimistic that once visual inconsistencies are sorted out (or at least minimized and papered over by human SFX editors) we might see some actually good content coming out of it.
Feb 3rd 2026 · 1 min read
·
#ai
#anthropic
#apple
#development
#mcp
#openai
#xcode
Xcode 26.3 getting official Claude and Codex integration without the usual guardrails is interesting enough, but having MCP in the mix is… unusually open for Apple.
Given Claude’s propensity to botch up Swift semantics, though, I’m happy they announced both–and although I’m not surprised they didn’t add any Apple models, I would have expected Gemini to be in the mix, even this early.
But at least they seem to have done their homework where it regards the in-editor agent harness–not sure how deep they went into IDE primitives (file graph, docs search, project settings), though, and the apparent lack of skills and other creature comforts that all the VS Code–based tools have is a bit of a bummer.
The fact that they put the chat interface on the left in total defiance of everyone else’s design is, of course, because they just had to mess with our heads.
Design is how it works, after all.
Feb 1st 2026 · 2 min read
·
#agents
#ai
#go
#notes
#python
#rdp
#specs
I’ve had some feedback that my last few weekly notes (especially over the holidays) have been a bit too long and that I should try to keep them shorter and more focused on a single topic.
Well, this week I wrote a lot thanks to the inclement weather and some insomnia, so I broke most of it out into separate posts:
This is the age of TikTok, after all, and attention spans are shorter than ever, so I might as well try to adapt.
But there was a lot more going on this week, so here’s a quick roundup of two other things I worked on:
go-rdp Improvements
My web-based RDP client got UDP transport support—experimental, gated behind a --udp flag, but apparently functional. So far it’s going well, even if MP3 audio support is not all there yet (and yes, I know it’s a bit much for a web client).
This is one of my “things that should exist” projects that I’ve been using to experiment how to spoon-feed AI agents with highly complex protocol specs, and it’s been working out great so far, largely because RDP is so well documented.
The test suite now includes Microsoft Protocol Test Suite validation tests for RDPEUDP, RDPEMT, RDPEDISP, and RDPRFX. It’s the kind of spec compliance work that’s tedious but essential, and that I am increasingly convinced can be massively sped up with AI (after all, it took me less than a working week of continuous effort to do the whole thing, instead of the months it would have taken me otherwise).
go-ooxml from OO to… Hero?
As another data point, I decided to build my own Office Open XML library in Go this week, called go-ooxml. It too is meant to be a clean-room implementation, and I intend it to be a comprehensive replacement for the various format-specific Python libraries out there–I’ve already been using Go as an accelerator for Python in things like pysdfCAD, so I know it works.
And it also fits into the “things that I think should exist” category, because existing Go libraries for this are either abandoned, incomplete, or have APIs that make me want to cry.
It is a real thing people used for other implementations, and progress has been ridiculous–agents are genuinely useful for this kind of spec-driven implementation work.
In about two days (less than four hours of “actual work”), I went from initial project setup to enough to handle basic document creation (mostly Word, which is the most complex)
Is it production-ready? Of course not.
But it’s already more complete than many alternatives, and having the foundation right means I can add features as I need them without fighting the architecture.
Agents make the breadth possible, but they don’t remove the need for taste: deciding what the public API should look like, what’s testable, and what’s going to be maintainable six months from now is, of course, the hard part still to come…
At least I’m aware of it, I guess.
Feb 1st 2026 · 1 min read
·
#agentbox
#agents
#ai
#cicd
#docker
#go
#ios
#notes
#specs
#webterm
Agents love specs. My go-ooxml project went from nothing to 60% spec compliance in days because I fed the agents the actual ECMA-376 spec documents and told them to implement against those. No hallucination about what XML elements should be called, no invented APIs—just spec-compliant code.
Mobile is an afterthought until it isn’t. Half the webterm fixes this week were iOS/iPad edge cases. If you’re building tools you’ll use on multiple devices, test on all of them early, because agents can only help you with things that they can test for autonomously.
The unglamorous work matters. I did a lot of CI/CD cleanup jobs, release automation, Docker pipelines, and invested quite a few hours in creating solid SKILL.md scaffolding–none of this is exciting, but it’s what separates a tool you can rely on from a tool that occasionally bites you, and right now, for me, at least, it’s what makes AI agents genuinely useful.
There’s going to be more software. With such a low barrier to entry into new languages, tools or frameworks, any decent programmer is soon going to realize that their skills are transferable to the point where they can take on work in any technology.
There’s going to be more shitty software because, well, there are a lot of overconfident people out there, for starters, and the law of averages is inevitably going to kick in at some point. I am acutely aware that I am treading a fine line between “productive developer leveraging AI” and “architecture astronaut”, but my focus is always on shipping self-contained, small tools that solve real problems (for me, at least), so I hope I can avoid that pitfall.
The number of truly gifted developers is going to stay roughly the same, because programming, like any form of engineering, is a mindset much more than it is a skill.
Some of these are debatable, of course, but they are my current take on things. Let’s see how they hold up over time.
Feb 1st 2026 · 4 min read
·
#acp
#agents
#ai
#iphone
#mcp
#preact
#sqlite
#sse
Although most of my actual work happens inside a terminal, I have been wanting an easy way to talk to the newfangled crop of AI agents from my iPhone.
So I spent a good chunk of this week building out a Slack-like web interface for chatting with agents via the Agent Client Protocol (ACP) called vibes.
I blame OpenClaw for planting the seed of a chat-like interface for agents in my mind–this is not replacing my terminal-based workflow, but it’s a nice complement, especially for quick check-ins or when I want to give an agent a task and review the results later from my phone.
The web layer is “just” Preact and SSE, with enough CSS for it to work nicely in small screens and touch input, and the main timeline view shows messages from me and the agent, with support for rich content like code blocks, KaTeX formulae, images, and resource links.
But the key thing is the tool permission flow: when the agent wants to call a tool, it shows a modal with an explanation of what the tool does (fetched from the ACP server), and I can approve or deny it with a tap–that is the key part of ACP that I wanted to leverage, and that so far I’ve only seen in CLI/TUI clients.
One thing that isn’t on the diagram above is the database. I love SQLite for small projects like this, and all the more so now that I learned the tricks around using JSON columns for flexible data storage. And, of course, you get full text search support out of the box, which is perfect for searching what I intend to be an infinite timeline.
Like MCP, I am not crazy about it. ACP has many of the same flaws, except that now you also have to deal with the ambiguity of how to surface all of the interactivity you’d have in a TUI in a chat timeline.
Content parsing went through several iterations to handle all the edge cases: tool calls, thinking panels, resource links, embedded resources with annotations, live updates from the agent, etc.
And I had to test it with multiple ACP servers, since each implementation has its own quirks. Right now, vibes works reasonably well with my python-steward, Mistral’s vibe and GitHub Copilot CLI, but all of them have small differences in how they implement the spec.
If I had to do it again, I would have probably built a proper acp client library in Go or Python first, but since I was building both the client and server sides at the same time, I just kept iterating on the wire format until everything worked.
It’s not just the convenience of having a cute web app on my phone–having a low-friction review loop is essential when working with agents (which is why I was keen on leveraging ACP in the first place), but I also wanted persistent history and richer rendering than what a terminal can provide, because I want to give my agents more complex tasks that involve multiple steps and outputs.
Everyone and their dog seems to be thinking that agents only have to have bash (Armin Ronacher makes some excellent points), but I am trying to strike a balance when designing steward: Give it all the tools it should need for most use cases, a little scripting engine (QuickJS) for extensibility, and extensive SKILL.md support so I can teach it to do new things.
I am pretty sure that my endgame will eventually involve WASM (maybe tinygo in a WASM sandbox or a Cloudflare-like V8 isolate) and I’m actually hedging my bets by looking at porting a subset of busybox to WASM, but for the moment I want to keep things simple and give agents access to higher-level tools that can do complex things without needing to script them from scratch.
Because, well, I don’t want to write coding agents. There’s a special kind of myopia around their incredible success, but I think there should be some balance in the Force.
Crustaceans are cool and all, but sometimes you just want to vibe with your agent about something as prosaic as scheduling a meeting or searching your Obsidian vault.
Feb 1st 2026 · 5 min read
·
#agents
#ai
#ansi
#docker
#llm
#notes
#python
#terminals
#wasm
#webgl
#webterm
Since last week, I’ve been hardening my agentbox and webterm setup through sheer friction. The pattern is still the same:
Small, UNIX-like, loosely coupled pieces which I then glue together with labels, WebSockets and enough duct tape to survive iOS, and the goal is still the same: to reduce friction and cognitive load when managing multiple AI agents across what is now almost a dozen development sandboxes.
My take on tooling is, again, that a good personal stack is boring. You should be able to understand it end-to-end, swap parts in and out of, and keep it running even when upstreams change direction. The constraint is always the same: my time is finite, so any complexity I add needs to pay rent.
I started out with the excellent Textual scaffolding (i.e., xterm.js + a thin server), but I kept having weird glitches (mis-aligned double-width characters, non-working mobile input, theme handling, etc.).
So, being myself, I decided to reinvent that particular wheel, and serendipitously I stumbled onto a WASM build of Ghostty that is pretty amazing–it can render using WebGL, fixed all of my performance issues with xterm.js, and… well, it was a bit of a challenge to deal with, but only because of a few incomplete features.
In the pre-AI days, I would have stopped there, but this week it took me under an hour to create a patched fork of ghostty-web that filled in the gaps I wanted and that I could just drop into webterm.
Then came the boring part–ensuring the font stack worked properly across platforms, fixing a few rendering glitches, replacing the entire screenshot capture stack (which is what I loved about Textual) with pyte, and… a lot of mobile testing.
Still, the end result is totally worth it:
The prettiest thing I did all week
There were a lot of little quality-of-life improvements that came out of this rewrite:
The dashboard got typeahead search, so I can quickly find the right sandbox among many.
And the most satisfying cosmetic fix: dashboard screenshots now use each session’s actual theme palette.
PWA support landed, so the iPad can treat it like a proper “app”.
The WebSocket plumbing got a proper send queue so slow clients couldn’t freeze other sessions.
I would have rewritten this in Go, but as it happens the Go equivalent of pyte didn’t seem to be good enough yet, and running half a dozen sessions at a time for a single person isn’t a load-sensitive setup anyway.
Again, this is all about reducing friction: Color helps me recognize the project, and typehead find makes it trivial to, well… find. The less mental overhead I have to deal with when switching contexts, the more likely I am to actually use the tools I’ve built.
But getting it to work properly on mobile was a pain:
Mobile keyboard handling was a mess. You can’t customize the iOS onscreen keyboard in the browser, and modifier keys were especially problematic.
To make mobile usable for real work (not just htop screenshots), webterm now pops up a draggable keybar with Esc/Ctrl/Shift/Tab and arrow keys, which are “sticky” so you can tap out proper Ctrl/Shift arrow sequences–and Ctrl+C, which is kind of essential.
Focus was a big problem. iOS is incredibly finicky about input browser input events–and if you test on an iPad with a keyboard attached, you miss half the problems. The “solution” was to monkeypatch input via a hidden textarea that captures all input events and forwards them to the terminal renderer–and that still breaks in weird, unpredictable ways.
I might have gone a bit overboard with testing–I don’t have an Android tablet, so I decided to test on my Oculus Quest 2 headset browser, which is almost Android with a head strap:
Testing `webterm` on the Oculus Quest 2 browser--it works surprisingly well!
Then came even weirder rendering bugs, since, well, terminals are terminals. And for such a simple concept, the stack is surprisingly complex:
Each and every one of those arrows gave me a headache
For instance, you’ll notice in the diagram above that there is a PTY layer andtmux in the mix. That means there are two layers of terminal emulation happening, and both need to be configured properly to avoid glitches.
For instance, I kept getting 1;10;0c when I connected, which led me down the weirdness of ANSI escape codes and nested terminal emulators (something I hadn’t done since running emacs to wrap VAX sessions…). tmux sends DA2 queries, but my wrapper ended up having to filter more than DA1 responses and not messing up UTF-8 sequences.
Then I realized that the Copilot CLI sends a bunch of semi-broken escape sequences that pyte couldn’t handle properly, which led to all sorts of rendering glitches in the screenshots, and another round of patches, and another…
I also spent a good chunk of time this week improving the agentbox Docker setup, adding better release automation, cleaning up old artifacts, and generally making it easier to spin up new sandboxes with the right tools and my secret weapon:
A set of starter SKILL.md files that teach the bundled agents how to manage the environment, use how I prefer to develop, and generally be useful and run through proper code/lint/test/fix cycles without me having to babysit them.
Right now I’m at a point where I can just go into any of my git repositories, run make init (or, if it’s an old project, point Copilot at the skel files and tell it to read and adapt them according to the local SPEC.md), and have a fully functional AI agent sandbox ready to go.
That I can do that and the infra for it in under a minute, with proper workspace mappings, RDP/web terminal access, and SyncThing to get the results back out, is just… chef’s kiss.
Ah well. At least now I have a pretty solid UX that even works from on my ageing iPad Mini 5 snappily enough (as long as I don’t try to open too many tabs), and I can finally start focusing on other stuff.
Which I sort of did, all at once…
Feb 1st 2026 · 3 min read
·
#docker
#proxmox
#smb
#synology
#til
#timemachine
#zfs
Today, after a minor disaster with my Obsidian vault, I decided to restore from Time Machine, and… I realized that it had silently broken across both my Tahoe machines. I use a Synology NAS as Time Machine target, exporting the share over SMB and that has worked flawlessly for years, but this came as a surprise because I could have sworn it was working fine a couple of months ago–but no, it wasn’t.
For clarity: It just stopped doing backups, silently. No error messages, no notifications, nothing. Just no backups for around two months. On my laptop, I only noticed because I was trying to restore a file and the latest backup was from December. On my desktop, I had a Thunderbolt external drive as a secondary backup.
After some research, I found out that the issue is with Apple’s unilateral decision to change their SMB defaults (without apparently notifying anyone), and came across a few possible fixes.
I found this gist, which I am reproducing here for posterity, that seems to be working for me, but which entails editing the nsmb.conf file on the Mac itself–which is not exactly ideal, since I’m pretty sure Apple will break this again in the future.
sudonano/etc/nsmb.conf# I used vim, of course
…and adding the following lines (the file should be empty):
The explanation here is that macOS Tahoe changed the default from signing_required=no to stricter control, and NAS devices with relaxed SMB settings cannot handle this without explicit configuration.
Another common pitfall is name encoding issues in machine names, so you should remove Non-ASCII Characters from the .sparsebundle name (that wasn’t an issue for me, but YMMV).
On the Synology side, the recommendation was to go to Control Panel > File Services > SMB > Advanced and set:
Maximum SMB protocol: SMB3
Enable Opportunistic Locking: Yes
Enable SMB2 Lease: Yes
Enable SMB Durable Handles: Yes
Server signing: No (or “Auto”)
Transport encryption: Disabled
That doesn’t quite match my DSM UI, but it’s close enough, and my settings now look like this:
Since I’m tired of Apple breaking Time Machine every few years and the lack of transparency around this (it’s not Synology’s fault), I have decided to implement a more robust solution that doesn’t depend on Synology’s SMB implementation.
I already have a Proxmox server with ZFS as the backend storage that has an LXC container running Samba for general file sharing, so I decided to look into that as a possible Time Machine target.
As it happens, mbentley/timemachine is a Docker image specifically designed for this purpose, and it seems to be well-maintained, so I’m testing it like this:
services:timemachine:image:mbentley/timemachine:smbcontainer_name:timemachinerestart:alwaysnetwork_mode:hostenvironment:-TM_USERNAME=timemachine-TM_GROUPNAME=timemachine-PASSWORD=timemachine-TM_UID=65534# 'nobody' user-TM_GID=65534# 'nobody' group-SET_PERMISSIONS=false-VOLUME_SIZE_LIMIT=0volumes:# this is a pass-though mountpoint to the ZFS volume in Proxmox-/mnt/shares/timemachine:/opt/timemachinetmpfs:-/run/samba
Right now the first option seems to be working, but I will probably switch to the Docker solution in the near future, since it gives me more control over the SMB implementation and avoids relying on Synology’s software.
But if anyone from Apple is reading this: please, stop breaking Time Machine every few years. It’s a critical piece of infrastructure for many users, and the lack of communication around these changes is frustrating.
I have been using Borg for some time now on Fedora, and I am considering using it for my Macs as well. Vorta seems decent, I just haven’t tried it yet.
Plus I’m annoyed enough that earlier this morning I tried to set up a new iOS device and the infamous Restore in Progress: An estimated 100 MB will be downloaded… bug (which has bitten me repeatedly over the last six years) is still there.
The usual fix was hitting Reset Network Settings and a full hardware reboot, plus reconnecting to Wi-Fi… But this time it took three attempts.
Come on, Apple, get your act together. Hire people who care about the OS experience, not just Liquid Glass.
As someone who (until a few days ago) owned the lobsters.social domain and has long been into the gag of digitized lobsters evolving into self-aware AI (yes, Accelerando is one of my favorite books of all time), I have been following the madness around this for a bit, although i did not commit to the insanity of devoting hardware to it and was not (still am not) impressed by the code quality or utter lack or any substantial guardrails (security or otherwise).
It’s bad enough to use insecure public IM to talk to one of these things, but giving it root on a machine is madness (never mind the influencer gags around buying a Mac mini for it), so after running it on a VM for a couple of days and seeing how it was “designed”, I punted and kept doing my own (sandboxed) thing.
But what sets it apart from most other personal agents for me is the huge (if uneven and creaky) skill library it’s generating–people are using it for completely off the wall stuff, and going through the list feels like Reddit on crack.
My own agent is much more contained (I made it a point of not letting anything permanently loose or with access to any of my personal data), so I’m just making popcorn until one of these things causes real damage, financial ruin, or both.
Still, the idea is worth a look because it’s aiming at the right interface: chat is where people already are, and turning that into a command line for personal admin is a sensible direction (when it’s done with guardrails).
And it makes you wonder if Apple will ever do this properly with Siri/Gemini in any useful way that matters–but I’m not holding my breath here.
Since last week, I’ve been heads-down building a coding agent setup that works for me and using it to build a bunch of projects, and I think I’ve finally nailed it. A lot more stuff has happened since then, but I wanted to jot down some notes before I forget everything, and my next weekly post will probably be about the other projects I’ve been working on.
I have now achieved coding agent nirvana–I am running several instances of my agentbox code agent container in a couple of VMs (one trusted, another untrusted), and am using my textual-webterm front-end to check in on them with zero friction:
My trusted set of agents
This is all browser-based, so one click on those screenshots (which update automatically based on terminal activity) opens the respective terminal in a new tab, ready for me to review the work, pop into vim for fixes, etc. Since the agents themselves expend very little CPU or RAM and I’ve capped each container to half a CPU core, a 6-core VM can run literally dozens of agents in parallel, although the real limitation is my ability to review the code.
But it’s turned out to be a spectacularly productive setup – a very real benefit for me is having the segregated workspaces constantly active, which saves me hours of switching between them in VS Code, and another is being able to just “drop in” from my laptop, desktop, iPad, etc.
As someone who is constantly juggling dozens of projects and has to deal with hundreds of context switches a day, the less friction I have when coming back to a project the better, and this completely fixes that. Although I had this mostly working last week, getting the pty screen capture to work “right” was quite the pain, and I had to guide the LLM through various ANSI and double-width character scenarios–that would be worth a dedicated post on its own if I had the time, but anyone who’s worked with terminal emulators will know what I’m talking about.
Another benefit of this approach is that none of the agents are running locally and can’t possibly harm any of my personal data.
The whole thing (minus Tailscale, which is how I connect everything securely) looks like this:
I had to explain this to a few people already, so here's the detailed diagram
I have several levels of sandboxing in place:
Each container is an agentbox instance with its own /workspace folder
Containers are capped in both CPU and RAM (although that only impacts their ability to run builds and tests–but even Playwright testing works fine)
The containers are running in a full VM inside Proxmox (capped at six cores and 16GB) and one of my ARM boards (more cores, but just 8GB of physical RAM)
The “untrusted” agents use LiteLLM to access Azure OpenAI, so they never have production keys and can be capped in various ways
Each setup runs a SyncThing instance that syncs the workspace contents back to my Mac so I can do final reviews, testing and commits–that’s the only way any of the code reaches my own machine.
As to the actual agent TUI inside the agent containers, I’m using the new GitHub Copilot CLI (which gives me access to both Anthropic’s Claude Opus 4.5 and OpenAI’s GPT-5.2-Codex models), Gemini (for kicks) and Mistral Vibe (which has been surprisingly capable).
After last week’s shenanigans I relegated OpenCode to the “untrusted” tier, and I also have my own toy coding assistant (based on python-steward, and focused on testing custom MCP tooling) there.
A good part of the initial effort was bootstrapping this, of course, but since I did it the UNIX way (simple tools that work well together), I’ve avoided the pitfall of doing what most agent harnesses/sandboxing tools are trying to do, which is to do full-blown, heavily integrated environments that take forever to set up and are a pain to maintain.
I don’t care about that, and prefer to keep things nice and modular. Here’s an example of my docker compose file:
---x-env:&envDISPLAY:":10"TERM:xterm-256colorPUID:"${PUID:-1000}"PGID:"${PGID:-1000}"TZ:Europe/Lisbonx-agent:&agentimage:ghcr.io/rcarmo/agentbox:latestenvironment:<<:*envrestart:unless-stoppeddeploy:resources:limits:cpus:"${CPU_LIMITS:-2}"memory:"${MEMORY_LIMITS:-4G}"privileged:true# Required for Docker-in-Dockernetworks:-the_matrixservices:syncthing:image:syncthing/syncthing:latestcontainer_name:agent-syncthinghostname:sandboxenvironment:<<:*envHOME:/var/syncthing/configSTGUIADDRESS:0.0.0.0:8384GOMAXPROCS:"2"volumes:-./workspaces:/workspaces-./config:/var/syncthing/confignetwork_mode:hostrestart:unless-stoppedcpuset:"0"cpu_shares:2healthcheck:test:curl -fkLsS -m 2 127.0.0.1:8384/rest/noauth/health | grep -o --color=never OK || exit 1interval:1mtimeout:10sretries:3# ... various agent containers ...guerite:<<:*agentcontainer_name:agent-gueritehostname:gueriteenvironment:<<:*envENABLE_DOCKER:"true"# this one needs nested Dockerlabels:webterm-command:docker exec -u agent -it agent-guerite tmux new -As0 \; attach -dvolumes:-config:/config-local:/home/agent/.local-./workspaces/guerite:/workspacego-rdp:<<:*agentcontainer_name:agent-go-rdphostname:go-rdpports:-"4000:3000"# RDP service proxylabels:webterm-command:docker exec -u agent -it agent-go-rdp tmux new -As0 \; attach -dvolumes:-config:/config-local:/home/agent/.local-./workspaces/go-rdp:/workspace# ... more agent containers ...volumes:config:driver:localdriver_opts:type:noneo:binddevice:./homelocal:driver:localdriver_opts:type:noneo:binddevice:./home/.localnetworks:the_matrix:driver:bridge
You’ll notice the labels, which are what textual-webterm uses to figure out what containers to talk to.
It’s been insane. Since this setup lets me drop back into each project at the click of a link and I can guide the agents for a couple of minutes at a time, or take notes and write specs in a separate tmux window. Either of which fits well with my workflow and doesn’t require me to fire up a bloated IDE and loading a project folder (which can take quite a long time on its own).
So I now have the ability to create a bunch of things that I think should exist:
I now have my web-based RDP client working with a Go back-end that uses tinygo and WASM to do high-performance decoding in the browser (which is something I’ve always wanted), and I decided to push it to the limit against the public test suites because I think a Go-based RDP client is something that should exist.
I took the existing pysdfCAD implementation of signed distance functions and replaced the slow marching cubes implementation it was using to render STL meshes with a Go-based backend that renders meshes much faster and with better quality (when it works–I need to sort out some bugs).
I built two (for now) Visual Studio Code extensions for mind-mapping and Kanban that match what I currently need from Obsidian (and will be looking at enhancing Foam to match the Obsidian editor soon)
I’m taking a couple of years of hacky scripts and building a writing agent that is going to help me do the automated conversion and re-tagging of the 4000+ legacy pages of this site that are still in Textile format (the name editor is taken for building a WYSIWYG Markdown editor to replace Obsidian with VS Code)
I’ve read about the Ralph Wiggum Loop, and it’s not for me (I find it to be the kind of thing you’d do if you’re an irresponsible adolescent rich kid with an inexhaustible supply of money/tokens and don’t really care about the quality of the results, and that’s all I’m going to say about it for now).
My key workflow (write a SPEC.md, instruct the agents to run full lint/test cycles and aim for 80% test coverage, then go back, review and write TODO.md files for them to base their internal planning tools out of and work in batches) still works the best as far as final code quality is concerned. I still have to ask for significant refactorings now and then, but since my specs are usually very detailed (what libraries to use, which should be vendored, what code organization I want, and what specific test scenarios I believe to be the quality bar) things mostly work out fine.
Switching between models for coding and auditing/testing is still key. Claude (even Opus) has a tendency to be overly creative in tests, so I typically ask for test and security audits with GPT-5.2 that catch dozens of obviously stupid things that the Anthropic models did. Gemini is still a grey area, since I’m just using the free tier for it (although it seems unsurprisingly good at architecting Go packages).
Switching between frontier and small(ish) models for coding and testing also works great. gpt-5-mini, sonnet, haiku, mistral and gemini flash do a very adequate job of running and fixing most test cases, as well as front-end coding.
SyncThingreally doesn’t like when agents create virtualenvs or install npm packages, so I routinely have to tell the agents that they are in a containerized environment and that it’s fine to install pip and npm packages globally (i.e., outside the workspace mount point).
Like I wrote a little while back, MCP is still the way to go for deterministic results with tools. Support for MCP (and SKILL.md) is very uneven across all the current agentic TUIs, but with a few strategically placed symlinks I can have a workspace setup that works well across VS Code and the remote agents.
Having a shared set of MCP tooling and skills across as many of your agents as possible really cuts down on the amount of prompting and scaffolding agents need to create per project. In that regard, umcp has probably been the best bang for the buck (or line of code) that I wrote in 2025, because I use it all the time.
Claude Code and Gemini have a bunch of teething issues with tmux. Fortunately both Mistral Vibe and the new Copilot CLI work pretty well, and clipboard support is flawless even when using them inside both tmux and textual-webterm.
And, finally, coding agents are like crack. My current setup is so addictive I find myself reviewing work and crafting TODOs for the agents from my iPad before I go to bed instead of easing myself into sleep with a nice book, which is something I really need to put some work into.
But I have a huge, decades-long list of ideas and projects I think should exist, and after three years of hallucinations and false starts, we’re finally at an inflection point where for someone with my particular set of skills and experience LLMs are a tremendous force multiplier for getting (some) stuff done, provided you have the right setup and workflow.
They’re still very far from perfect, still very unreliable without the right guardrails and guidance, and still unable to replace a skilled programmer (let alone an accountant, a program manager or even your average call center agent), but in the right hands, they’re not a bicycle for the mind–they’re a motorcycle.
Or a wingsuit. Just mind the jagged cliffs zipping past you at horrendous speeds, and make sure you carry a parachute.
Jan 25th 2026 · 13 min read
·
#hardware
#minipc
#n150
#nas
#openmediavault
#review
#ssd
This one took me a while (for all the reasons you’ll be able to read elsewhere in recent posts), but the NestDisk has been quietly running beside my desktop for a month now, and it’s about time I do it justice.
The NestDisk mini NAS
This is a tiny Intel machine whose entire reason for existing is to let me cram four M.2 2280 SSDs behind dual 2.5GbE and end up with a small, fast, “boring” NAS that, like most mini PCs these days, can also double as a router and even an “AI box”, but the key point is that it’s designed around storage density and decent networking in a very small form factor.
Disclaimer: YouYeeToo sent me the NestDisk free of charge (for which I thank them), and as usual this article follows my review policy.
Like most of the devices I reviewed during 2025, the NestDisk is built around Intel’s N150, which is pretty much perfect for this category—low power (around 12W), modest clock speeds (1.6GHz up to 3.6GHz), and usually more than enough for file serving and a few services. The catch, as always with this kind of machine, is… thermals.
The short version is that if you already want an SSD-only NAS and have (or plan to have) 2.5GbE, this can make a lot of sense—provided it stays cool and stable once you actually populate all four slots.
Even for an N150 machine, this is a pretty small box: 146×97.5×35mm, which is roughly the size of the external HDD enclosures I used to get a few years ago.
I have to say that I very much like the color. I’m not usually a fan of bright colors on tech gear, but since most of the stuff on my desk is black, white or various shades of brown, the NestDisk stands out in a good way:
The NestDisk's bright orange case is much nicer in person.
Besides the NestDisk itself, I also got an unusual accessory: a dual-fan USB cooler with quite nice-looking 120mm fans, which is meant to sit underneath the NestDisk and blow air over the M.2 area. I found this especially amusing for two reasons: first, because it’s much bigger than the NestDisk itself; and second, because I’ve actually been building a similar DIY cooling solution for my own use with a cheap fan controller and two 90mm fans:
The cooler accessory and my homebrew dual-fan setup
So I can see the rationale here, although it did make me wonder how cool the machine ran. The case does have two rubber feet on the bottom that ensure it has an airgap on a desk, but removing the heatsink was actually quite revealing:
This is a surprisingly beefy heatsink, and the four thermal pads show they aren't skimping on expectations.
First of all the thing is thick. It’s almost 7mm of solid metal, and probably the most substantial part of the entire enclosure. Second, YouYeeToo clearly didn’t skimp on expectations: they added all four thermal pads for the M.2 slot (unlike other manufacturers that only add one or two).
The extra SSD fans
And, looking inside the SSD cavity (which, by the way, already came with a 1TB SSD on the second slot), I noticed there are two very small (40mm) fans that seem to blow air into the M.2 cavity.
This is interesting given that they seem to bring in air from the same side as the USB-A ports, although I have to question how effective they might be to cool all the SSDs.
Incidentally, the CPU seems to have its own cooling path and I believe there is a third fan that takes advantage of the case grilles to exhaust warm air out the other side and top.
But other than the (tragically ill-fated) iKOOLCore R2Max, this is the most substantial heatsink I’ve seen in a mini PC of this size.
Side Note: refreshingly, you can apparently disassemble the whole thing without removing any of the rubber feet, but I didn’t test that for two reasons: first, because the plastic enclosure is very tightly fitted, with the ports flush with the outside (and thus holding the enclosure in place); and second, because I didn’t want to risk damaging the device before I even got to the “fun” parts.
Four M.2 2280 slots (PCIe 3.0 x2 per slot for NVMe; one slot also supports M.2 SATA)
Dual HDMI outputs + a 3.5mm audio jack
Three USB-A ports + one USB-C data port + one USB-C power port (not PD, apparently 19V/3.42A only)
The USB-C data port should support DisplayPort alt-mode, which means you can theoretically run three displays at once (2× HDMI + 1× USB-C DP), but I didn’t test that given that this is supposed to be a NAS.
The NestDisk boots off the internal 64GB eMMC, which is plenty for OpenMediaVault and some plugins, and after setting it up (more on that later), here’s what the storage layout looks like:
The thing about building NAS devices with an N150 is that even though you get four M.2 2280 slots, they’re not necessarily full‑fat x4 per slot—from the specs and a little close inspection, the machine actually uses PCIe 2.0 switches to multiplex the available lanes, so this isn’t exactly a straightforward x4/x4/x4/x4 layout.
Either way, the important part is that the bandwidth seems to be plenty for most consumer SSDs, even when used four at a time. I wanted to test this, but since most of my test SSDs died inside a similar ARM box, I couldn’t really do any significant performance testing.
The USB-C power input is an interesting choice. The YouYeeToo Wiki lists it 19V/3.42A, which is refreshingly specific, and it should support Power Delivery negotiation, but my policy with USB-C power inputs is still to always use the power brick that came with the device, so I didn’t try any alternatives.
What I can say is that the power envelope is pretty much N150 standard: 11W idle, up to around 25W under load, can be pushed a little higher if you really tax the CPU. That’s pretty good for a personal NAS.
The NestDisk is not fanless, but I never heard the fan until I load tested it, and I would classify it as “quietly insistent”. I don’t know the size of the CPU fan, but I know for a fact that the kind of 40mm internal fans I spotted near the NVMe cavity can definitely be audible in a quiet room, but in this case I seldom heard them.
But I did do one thing that most people probably wouldn’t: I set the machine up vertically, with the USB ports down, but held by a 3D-printed bracket similar to this one:
A 3D-printed vertical stand for the NestDisk
I broke it by accident the day before I was finalizing this draft (I dropped the bracket on the floor when re-arranging my desk), but while it lasted it held the NestDisk firmly in place, and more importantly it oriented the case grilles in a way that allowed better airflow.
Which leads me to some of my testing. I got a fancy new IR thermometer for Christmas, so I was able to keep tabs on the NestDisk throughout, and the M.2 heatsink averaged 45°C, with everything else between 27°C under normal use. But it’s been quite a cold winter here (10°C outside average) and I don’t warm up the office much (it’s 20°C now), so these numbers might be a bit optimistic.
The BIOS itself is fairly standard for an N150 machine, with the usual assortment of power management, boot order, and device configuration options. There are a few interesting bits, though:
Note the fan control options and power settings, which are important for a NAS device.
The NestDisk comes preinstalled with OpenMediaVault, although you’ll have to be a little patient with it if you’re new to OMV. First off, there’s no initial setup wizard (or anything in the console, really), so you’ll have to boot the machine, note the IP address, and then log in with the default admin:openmediavault credentials.
The machine boots quickly into OpenMediaVault, and there isn't much to see on the console
Even though I am not a fan of OpenMediaVault’s quirky Amiga-era inside jokes (like fake error messages and Amiga cursors, which many newcomers find confusing), I can’t argue that it works perfectly for the NestDisk—it’s small enough to run off the internal eMMC and have plenty of room to spare for plugins, and it makes it trivial to do exactly what I want for this kind of device: get it on the network quickly, create shares, and move on with my life:
OMV's dashboard is clean and functional, giving me quick access to system status and storage overview.
What makes OMV a nicer consumer choice than a plain Debian install is that it gives me the things most people actually need—SMB/CIFS sharing, NFS if I want it, users/groups, permissions, monitoring, scheduled jobs and notifications—without requiring anyone to remember which config file does what.
OMV’s idea of a NAS is pleasantly conservative, and that’s great, and if I want to go beyond the core experience, OMV-Extras is the usual next step: it adds a much wider selection of third‑party plugins that can turn this from a simple NAS into a small server that happens to have a nice storage UI, which is exactly what I did with it over the past month: It sat on my desk running a small docker compose stack with the services I’ve been developing, and it did great.
A few caveats apply, though, besides the quirky interface: the OMV version that comes preinstalled tries to upgrade upon first boot and mine got a little confused, so it took a bit to get going. And you’ll probably want to set up ZFS if you use more NVMEs, since OMV still seems to default to EXT4 for new filesystems.
I saw no real difference from any of my other N150 reviews, which is to say that it performed exactly as expected for this class of device: plenty fast for SMB/CIFS file sharing, more than enough network throughput to saturate one 2.5GbE link, and more than enough CPU power to run a few containers without breaking a sweat.
What I did notice is that the thermals were better than I expected—I had some trouble getting the CPU to get past 60°C even under load, which is impressive for such a small box, and the idle temperatures were very reasonable:
It took me half an hour of stress-ng to get the CPU to hit 65°C, which is very good for a mini PC of this size. Of course, the elephant in the room is how well it handles thermals when all four M.2 slots are populated, and I honestly don’t know, and can’t know until I get my hands on more SSDs.
This is, of course, where Proxmox comes in in most of my reviews of mini PCs these days, and the NestDisk is no exception—I haven’t done it yet, but given its decent CPU, dual 2.5GbE, and four M.2 slots, it makes a lot of sense as a tiny hypervisor host that can run multiple VMs or containers, and I see no reason why it wouldn’t work well for that.
At this point, I think the NestDisk makes sense for three very specific use cases:
If you want an SSD-only NAS (and I’ve accepted that the drives cost more than the box, especially in this day and age).
If you have (or are moving to) 2.5GbE, and you care about improving the sustained transfer rates you’d get from an existing machine (say an HDD-based gigabit NAS).
If you value small and low power more than infinite expandability.
The NestDisk is appealing to me because despite coming from a standard N150 reference design, it is opinionated hardware: it prioritizes storage density and decent networking in a form factor that’s closer to a portable drive enclosure than a mini PC.
If the NVMe thermals perform well (which I haven’t been able to confirm), it can be a genuinely good always-on NAS or home media server for people who have already moved to SSDs and want something smaller and faster than a traditional SATA box.
Jan 24th 2026 · 1 min read
·
#apple
#bugs
#engineering
#ios
#mac
#metrics
#productivity
“Total time wasted by humanity because Apple won’t fix these” is a wonderfully blunt premise, and the math is… lovely: Base Impact is Users Affected × Frequency × Time Per Incident was enough of a zinger, but the Power User Tax (Σ (Workaround Time × Participation Rate)) and the Shame Multiplier (Years Unfixed × Pressure Factor) just pile it on.
Clearly nobody at Apple is measured by these OKRs, or we wouldn’t be constantly finding new UX disasters like breaking the Finder column view (which now stupidly lets the scroll bar overlap the resizing handles).
Hey Craig, why don’t you adopt the Shame Multiplier for your directs, or at the very least start having some sort of AI-driven Apple Feedback pipeline that actually tries to fix the code instead of acting as a black hole?
Jan 18th 2026 · 5 min read
·
#agentic
#ai
#dev
#grafana
#homelab
#notes
#observability
#sandboxing
#syncthing
#synology
#telemetry
Return to work happened mostly as expected–my personal productivity instantly tanked, but I still managed to finish a few things I’d started during the holiday break–and started entirely new ones, which certainly didn’t help my ever-growing backlog.
But I don’t think I could have done a better job than Jasmin did in this writeup, and the agent itself is pretty neat. If I wasn’t behind schedule on pretty much everything right now, I’d be hacking on it myself, and it’s just depressing to ponder when I’ll be on vacation again…
Jan 17th 2026 · 1 min read
·
#comics
#corporateculture
#dilbert
#eulogy
#humor
#scottadams
Scott Adams passed away this week, and regardless of the controversies that surrounded him of late, anyone who’s been reading this site since the beginning knows I have a soft spot for Dilbert.
As a telco person, it resonated with me so much (and had so many touch points with my daily life) that I was even accused of tipping Adams off about some of the absurdities I witnessed in the field.
And it stayed eerly accurate for decades, which is why I’m linking to this post, which I think pretty much captures the essence of Dilbert and its creator, with all its ups and downs, controversies, and humor.
And, of course, I’d like to share one of my favorite Dilbert strips, which I think completely nailed our future:
I've been using this one since I worked on NetMeeting over ISDN, and it's aged pretty well
Jan 14th 2026 · 4 min read
·
#ai
#llm
#mcp
#skill
#skills
Like everyone else, I’ve been looking at SKILL.md files and tried converting some of my MCP tooling into that format. While it’s an interesting approach, I’ve found that it doesn’t quite work for me as well as MCP does, which is… intriguing.
Apple is bundling Final Cut Pro, Logic Pro, Pixelmator Pro, Motion, Compressor, and MainStage into a new subscription suite (with the obligatory vague promise of AI features) and revamping Pages, Numbers, and Keynote with more content, which I find interesting in many respects…
First off, I’ll state the obvious: this is a jab at Adobe’s Creative Cloud, which has dominated the professional creative software market for years, with the side effect that it pretty much dried up the low-end prosumer space.
By offering a comprehensive suite of “professional-grade” applications under one subscription, Apple is positioning itself as a viable alternative not just for creatives who may have been hesitant to invest in Adobe’s ecosystem, but also for those looking for more affordable options.
But… how viable is it, really? I have my doubts, especially given that I recently tried Final Cut Pro and found it lacking in several areas compared to freemium competitors like DaVinci Resolve. And I have been using Logic Pro for years. It’s a solid DAW, but it faces stiff competition from Ableton Live and an increasing number of free or low-cost alternatives. But that’s my personal experience; I wonder how this will play out for the broader market, where there’s stuff like Affinity Suite, which has recently surfaced after the Canva acquisition as a free alternative to Pixelmator Pro (with paid add-ons).
Apple’s move is interesting because of its breadth and ambition–it’s great value for money for consumers, and does offer a lot of tools that can be used professionally. However, I think it also confirms one of my hunches, i.e., that (statistically) nobody was paying for the iPad app subscriptions.
And I do have to wonder how many people they have developing and maintaining all these apps–there have been many glitches and issues with Final Cut Pro and Logic Pro over the years that make me think they are not exactly top priority for Apple.
Update:Gruber’s round-up of reactions to the hideous new icons is well worth reading. I just took the whole thing in stride since it’s pretty obvious thwt Apple just doesn’t care about good UI design anymore, but this amazing zinger from Heliographe Studio made me laugh out loud:
"If you put the Apple icons in reverse it looks like the portfolio of someone getting really really good at icon design"
Jan 12th 2026 · 3 min read
·
#agents
#ai
#china
#coding
#llms
#opencode
#proxy
#security
There’s a special kind of UI regression that doesn’t look like a regression until you try to use it, and the Liquid Glass Tsunami just keeps on giving.
The huge rounded corners are turning a basic motor skill into a hand-eye coordination puzzle, and this piece goes into exquisite detail on that, and demonstrates how Apple completely botched window resizing.
Spoiler: the resize hotspot is still a 19×19px target, but with the new corner radius “about 75%” of that target ends up outside the visible window. So you do what any reasonable person does and aim inside the corner (where the window actually is), and nothing happens.
The best part (in a dry, slightly tragic way) is the conclusion: the “most reliable way” to resize is to grab outside the corner, which is exactly the kind of invisible affordance that makes people feel like they’re going slightly mad.
It’s also a neat reminder that Fitts’s Law doesn’t care about your design language, and that hit-testing should follow what users perceive as the object boundary, not whatever geometry is easiest to keep from the previous release.
Oh, and that whoever did Tahoe’s UX design is borderline incompetent.
Worth reading because it’s not just a rant, it’s a small, well-illustrated teardown of how a cosmetic change can quietly break muscle memory and cause endless attrition and frustration.
The annotated images (green “expected” area, blue dot, and the “accepted target area” sitting in empty space) make the point better than any amount of hand-waving, and we need more of this to make it obvious that Apple needs to reverse course on the whole thing.
Jan 10th 2026 · 1 min read
·
#arena
#film
#lisbon
#meo
#orchestra
#photo
#photography
#starwars
Anil Dash’s essay on how Markdown conquered everything is a good reminder that most “standards” don’t win because they’re perfect — they win because they’re good enough, easy to use, and easy to remember.
What I liked most is the through-line from early blogging (by which time I was also doing a PHP site generator that handled raw HTML posts) to today’s world, where LLMs are running stupefyingly complex workflows atop a pile of Markdown files.
It’s both empowering and a little unsettling that the same affordances that made writing on the web easier also make it easy to formalize instructions for systems that can do real damage.
I took a long detour through Textile before accepting Markdown had won (and am still paying the price for that, with thousands of Textile posts still to be converted), but the one thing I’ve found amusing over the years is that John Gruber created Markdown as a way to make writing HTML easier for humans.
One person can make a difference, and I’m willing to bet that when (ok, if) we ever become a Kardashev type 3 civilization we’ll still be using Markdown to write our interstellar missives.
Jan 9th 2026 · 5 min read
·
#azure
#cloud
#cloudflare
#compose
#docker
#homelab
#infra
#proxmox
#sqlite
#tailscale
As regular readers would know, I’ve been on the homelab bandwagon for a while now. The motivation for that was manifold, starting with the pandemic and a need to have a bit more stuff literally under my thumb.
Intricuit is pitching the Magic Screen as a snap-on touchscreen for MacBook Air and Pro that “comes ready with its own stylus that supports pressure sensitivity and stylus hover”, and as “the affordable alternative to bulky drawing tablets and iPads.” Which is a neat way of saying “we’re adding touch to macOS from the outside” (because Apple still won’t).
And yet, the first thing that came to my mind as I watched their videos was… How long is it going to take for Apple to “Sherlock” them?
As someone who regularly uses touchsceen laptops and very seldom use the touchscreen (the trackpad’s speed and precision is very much to blame, as well as the fiddliness of taking my hands off the keyboard to reach “up”), I have mixed feelings about this being something that I would use extensively–but I do miss the ability to dismiss dialogs and move windows like I do on Linux and Windows, and wish Apple just got on with it.
And I have a feeling they will, “soon”. In their own timescale.
Jan 5th 2026 · 1 min read
·
#apple
#design
#icons
#macos
#menus
#tahoe
#ui
#ux
This is a very well written, exquisitely detailed teardown of Apple’s decision to sprinkle tiny SF Symbols through menus in macOS Tahoe, and why it makes everything harder to scan, less consistent, and (sometimes) actively misleading.
Bruce Tognazzini (whose book I still pull out once in a while to remind myself of good UI design principles) would be proud.
Somehow I sort of tuned them out during last year’s rant about Liquid Glass, but they really do make a difference. I started actively noticing them in menus when my own little apps started sprouting some of them (apparently automatically) and thinking “why is that there?” or “what does that even mean?” and 90% of the time the icon is just not helping me at all, even in Apple’s own apps.
Jan 3rd 2026 · 1 min read
·
#communication
#hardware
#keyboard
#mobile
#phone
As a very heavy former Blackberry user, I have to say that this is pushing all the right buttons (pun intended), and I might just have to get one.
And of course there’s the obligatory high-standards Michael Fisher video announcing it in style, almost as an afterthought to their new (also pretty clever) Clicks Power Keyboard feature phone.
Their pitch is pretty direct–use the Clicks Communicator as a purpose-built device for taking action and communicating in a noisy world, etc., but what I think they are really getting right is the compact design, the audio recorder and the fingerprint reader. Oh, and it has a headphone jack.
This would make a killer corporate device for anyone who needs to be in touch but doesn’t want the distractions of a full smartphone, and I hope someone cottons on to that fact and places a huge bulk order so they get the funding to keep going.
Jan 1st 2026 · 1 min read
·
#carriers
#esim
#madness
#mobile
#security
#usability
Ryan Whitwam’s piece on switching to eSIM is a good reminder that a lot of “modernization” is really just moving failure modes around, and seems like a great way to kick off 2026 since it mirrors my own past experiences: my Truphone/iPad “side quest”, and my broader worries about eSIM-only iPhones.
A physical SIM is dumb and boring, but that’s exactly why it works: when you need to swap devices, it’s a 30‑second hardware shuffle that doesn’t depend on carrier apps, backend state, or a support queue.
With eSIM, the swap becomes a workflow—and if it fails at the wrong time you can end up locked out of your own number (and everything that still treats SMS as a magic identity token), with the delightful resolution of “go to a store and wait”.
If eSIM-only phones are the future, carriers really need an account recovery story that does not default to “we’ll text the number you currently can’t receive texts on”…
Dec 31st 2025 · 13 min read
·
#emulation
#hacking
#homelab
#music
#node-red
#notes
#observability
#photography
#projects
OK, this was an intense few days, for sure. I ended up going down around a dozen different rabbit holes and staying up until 3AM doing all sorts of debatably fun things, but here’s the most notable successes and failures.
I kept finding avahi-daemon pegging the CPU in some of my LXC containers, and I wanted a service policy that behaves like a human would: limit it to 10%, restart immediately if pegged, and restart if it won’t calm down above 5%.
Well, turns out systemd already gives us 90% of this, but the documentation for that is squirrely, and after poking around a bit I found that the remaining 10% is just a tiny watchdog script and a timer.
Then create a generic watchdog script at /usr/local/sbin/cpu-watch.sh:
#!/bin/bashset-euopipefail
UNIT="$1"INTERVAL=30# Policy thresholdsPEGGED_NS=$((INTERVAL*1000000000*9/10))# ~90% of quota windowSUSTAINED_NS=$((INTERVAL*1000000000*5/100))# 5% CPUSTATE="/run/cpu-watch-${UNIT}.state"current=$(systemctlshow"$UNIT"-pCPUUsageNSec--value)previous=0[[-f"$STATE"]]&&previous=$(cat"$STATE")echo"$current">"$STATE"delta=$((current-previous))# Restart if pegged (hitting CPUQuota)if((delta>=PEGGED_NS));thenlogger-tcpu-watch"CPU pegged for $UNIT (${delta}ns), restarting"systemctlrestart"$UNIT"exit0fi# Restart if consistently above 5%if((delta>=SUSTAINED_NS));thenlogger-tcpu-watch"Sustained CPU abuse for $UNIT (${delta}ns), restarting"systemctlrestart"$UNIT"fi
…and mark it executable: sudo chmod +x /usr/local/sbin/cpu-watch.sh
It’s not ideal to have hard-coded thresholds or to hit storage frequently, but in most modern systems /run is a tmpfs or similar, so for a simple watchdog this is acceptable.
The next step is to make it executable and figure out how to use it via systemd templates:
sudochmod+x/usr/local/sbin/cpu-watch.sh
# cat /etc/systemd/system/[email protected][Unit]Description=CPU watchdog for %iAfter=%i.service[Service]Type=oneshotExecStart=/usr/local/sbin/cpu-watch.sh %i.service
# cat /etc/systemd/system/[email protected][Unit]Description=Periodic CPU watchdog for %i[Timer]OnBootSec=2minOnUnitActiveSec=30sAccuracySec=5s[Install]WantedBy=timers.target
The trick I learned today was how to enable it with the target service name:
The magic, according to Internet lore and a bit of LLM spelunking, is in using CPUUsageNSec deltas over a timer interval, which has a few nice properties:
Short CPU spikes are ignored, since the timer provides natural hysteresis
Sustained abuse (>5%) triggers restart
Pegged at quota (90% of 10%) triggers immediate restart
Runaway loops are contained by CPUQuota
Everything is systemd-native and auditable via journalctl
It’s not perfect, but at least I got a reusable pattern/template out of this experiment, and I can adapt this to other services as needed.
Dec 27th 2025 · 1 min read
·
#circus
#cirque
#entertainment
#ovo
#photo
#photography
#soleil
It’s not their legendary boot floppy, but a QEMU image with a desktop environment for QNX 8.0, with support for self-hosted compilation, using XFCE on Wayland (sadly, their Photon UI is long gone) and providing a number of development stacks and editors. The installation process seems overly contrived (and I am not doing it this holiday season, as I have far too much to play with already), so I would have loved to see an ARM bootable image for a Pi or similar.
I’m sure some enterprising souls will hack their way through this to get it to run on bare metal somehow, since QNX has always been a fascinating RTOS (even if they lost a lot of audience to embedded Linux due to their commercial approach).
If they keep tightening the bootstrapping story, this could become a surprisingly effective bridge between “I have a QNX target” and “I can ship software on it without ritual sacrifices.”