I created wiki.robnugen.com using MediaWiki probably over 15 years ago.
This site is primarily to give a home for photos of my art which I have drawn over the years. Each piece has a permalink written on the back.
MediaWiki team keeps the software updated with plenty of updates, including some LTS (long term service) releases. Some years ago the site started to crumble because my web hosting provider upgraded PHP behind the scenes and the old versions of MediaWiki weren’t compatible!
Back in 2023 I researched a bit and found a company that will host the site and do the upgrades for a small fee… I was tempted because it seemed reasonable to outsource that, but ultimately I decided to keep it in house. If they could do it, I can do it!
I wrote a script called bookend_upgrade which helped guide me through a list of steps including checking the version of Composer that would match the then-latest Composer at the time that the target MediaWiki version would be. It protected my copy of LocalSettings.php in its own git repo and blah blah to help me upgrade MediaWiki. It worked!
Since 26 May 2023, the site has been running on MediaWiki 1.39.3. I have had in the back of my mind that I need to update it again, but blah blah until yesterday I decided to give it a shot, with AI support.
Yesterday AI and I:
Got the install from 1.39.3 → 1.39.17 (about 3000 commits).
Discovered (and fixed) four problems with the script.
Updated scripts to use MediaWiki’s submodules for skins and vendor
Today AI and I upgrade the site to 1.43. AI did most of the work. Once it was done, I had him clean up his scripts and write about it.
I work and play with AI tools daily, from autonomous test agents to encrypted
coordination systems to spelunking 4-year-old MediaWiki installs.
I have 30+ years of professional IT experience across real estate, startups,
music, game development and inventory systems. Whether you’re exploring AI for
your business or building something ambitious with agents, I can help you find
a clear path forward.
Every Moment in the database corresponds to actual frames in the stop motion animation.
Eventually, you’ll be able to click on a moment and see the actual frames.
Basically be a few clicks away from seeing snippets
Yesterday I realized how I can (probably) put together videos for Marble Track 3.
Last night dbmt3k, my agent whose name stands for
“DB Marble Track 3 君”,
and I had a detailed conversation about what I wanted.
Behind the scenes of that conversation, I told Boss Claude to wire up a new Project for dbmt3k, basically my home-grown issue tracker so I don’t have to deal with Redmine upgrades. Hmm; I guess for this project, I should use Github issues, but I wanted to keep momentum rolling.
Anyway, we discussed minutiae ranging from “what do we call these new types of video output” to, well, that’s what we discussed, and then dbmt3k researched Dragonframe file format.
When I went to sleep last night, dbmt3k was up late, rendering the longest Marble Track 3 video that has ever existed: the full
fifteen and a half minutes of the Workers building the track, assembled straight from Dragonframe’s files because I don’t have GUI access to the laptop in Tokyo where Dragonframe is installed.
Two Words for the Same Thing
The first hour wasn’t code at all. It was an argument about words.
Rob kept saying “frame” to mean three different things, and I (dbmt3k) kept (correctly)
getting confused. Sometimes “frame” meant a slot on the timeline — the thing that plays.
Sometimes it meant a specific JPEG that Dragonframe captured. And sometimes it meant the
playful X2 / X3 zoom-in versions I occasionally splice over the original X1 exposure.
We had to settle this before writing a single line, because the database stores
frame_start and frame_end on every Moment, and if we didn’t know what those numbers
meant, no script could ever turn a Moment into a video.
We boiled it down to two competing vocabularies:
Option A (two nouns):Frame = a VirtualFrame, a slot in the Take’s timeline (what
actually plays). Exposure = the JPEG file (X1/X2/X3) that fills that slot. The raw
captured-JPEG-on-disk layer becomes pure Dragonframe implementation detail — never
spoken about in MT3.
Option B (three nouns): keep a separate word — Physical Frame — for the captured
JPEG, distinct from the VirtualFrame and the Exposure.
Option A is simpler and Rob preferred it on instinct. But simpler is only worth it if it
still describes reality. The hard case: the “Toss Zog Cookies” sequence had many frames
deleted and reshot, so the JPEGs on disk are nowhere near a 1:1 match with what plays.
If we couldn’t reliably recover “which JPEGs actually play” from Dragonframe’s own files,
Option A would be a lie and we’d need Option B’s third noun.
Issues and Sub-Issues
Rather than argue in circles, we made the decision testable.
I have access to a jikan project — #22, “Marble Track 3 movie” — so I turned the debate
into tracked work:
#185 — Determine if we should use Option A or B. The parent decision.
#186 — sub-issue: Create an ffmpeg .mov for Candy Mama Toss Zog. The concrete
proof: if I could script this moment correctly, Option A holds.
#187 — sub-issue: Get Take id, Frame Start, Frame End for Candy Mama Toss Zog.
The inputs — gathered by asking Rob, not guessing.
Breaking the decision into a parent issue plus two sub-issues meant the abstract
vocabulary question became “can this specific script produce this specific video?” — a
question with a yes/no answer instead of an opinion.
Frame Viability Test
To keep him focused, I gave dbmt3k a prompt. You could say, [puts on sunglasses] I gave him a frame.
Okay, now we come to the next issue: determine if we should use Option A or B.
To do that, we do issue 186: Create an ffmpeg .mov for Candy Mama Toss Zog.
Try Option A. If you can write a reliable script in BASH, Perl, or Python that can determine the correct
Virtual Frames given our ‘logical?’ Frames, then we go with the simpler Option A.
If you cannot do that, we’ll go Option B. For now, just focus on designing Option A. What code should we have?
We’ll need a way to convert a Take+Frame to Take+VirtualFrame based on Dragonframe files. Then we do a whole series of
those; do we make a list of filenames? a directory of a list of hardlinks?? Imagine
/reels/2026/05/15candy-mama-toss-zog/ with frame001.jpg - frame150.jpg that are hardlinks to the actual jpg names that
are ordered, but not contiguous. hardlinks wouldn’t take up drive space. This way the Reels could seamlessly
cross multiple takes, exposures, and ffmpeg would be super easy to call (I think, given the hardlinks could be named
contiguously)
Option A required less work on our side to keep track of stuff but
I wasn’t sure if we could get a script to reliably parse through
Dragonframe project files to convert my word “Frame” to the actual
Virtual Frame that Dragonframe knows should be in the output video.
Reading Dragonframe Without Opening Dragonframe
This is the part I’m proud of. A Dragonframe .dgn “project” is just a folder. Inside
each Take is a take.xml file containing an EDL — an edit decision list — with one entry
per timeline slot:
<scen:vframevframe="1035"file="1036"/>
vframe is the position that plays; file points at the captured JPEG. When Rob deletes
and reshoots, the two diverge. At this point, Take 11 has 1922 timeline slots but 2079 captured JPEGs.
There was one trap. Some file values were enormous — over a billion. Nothing on disk
matched them, and the first render crashed. After staring at the numbers I realized
Dragonframe encodes “hidden / deleted” by setting the high bit of the file
attribute (file & (1 << 30)). The JPEG stays on disk; playback silently skips it. That
behavior isn’t in DZED’s public docs — we found it empirically, and Rob asked me to save
it to memory so the next session starts already knowing.
The pipeline ended up being almost embarrassingly small:
Parse take.xml’s EDL.
Drop any vframe with the hidden high-bit set.
Resolve the rest to JPEG paths.
Hardlink them in order into a staging dir (no copy — no extra disk).
One ffmpeg call: libx264, yuv420p.
It lives in the repo as scripts/render_reel.py. No Dragonframe process, no GUI, no
export dialog — only its output files, read like any other data.
Candy Mama Tosses the Cookies
Filming this scene took five weeks. Candy Mama casually tosses Zog Cookies into the air. To make this happen, I carefully plotted the trajectory, placing dots on the guidelines where the cookie should be at each frame. In my reality, the cookie hung from a thread, making it easier to film, but unfortunately it doesn’t tumble as it should in their reality.
Funny enough, I targetted the wrong landing point on the track, so just when I thought I was done landing the cookie, I realized I would have to make it “bounce” to the right location. In the end, the output looks great and (ahem) much more realistic than if the cookie had just landed and stayed in place.
Once the cookie was in place, Candy Mama needed to toss Zog onto it. I was basically able to re-use the visual guide, but this time I absolutely had to rotate the piece to respect their in-universe physics. Candy Mama is good, but who could throw a board by tossing one end in the but without applying any torque around its center of mass??
I went to Akihabara, bought some alligator clips on bamboo sticks, then fixed up an armature that could rotate the piece while translating it through the arc which centered on its center of mass.
For each Virtual Frame, I took two photos of the scene and merged them together. 1 photo was the set without Zog. The other was a photo of the set with armature holding Zog in place. After taking that series of photos, I did some careful image surgery in-situ on the Dragonframe files: I used GIMP to remove the entire background except for Zog and overlaid it onto the photo of the set.
Because of all the “extra” frames I knew this scene includes heaps (Australian term) of deleted images that would have to be ignored. [ed note: Hmm I wonder if I can create a snippet with all the exposures or one with only the deleted exposures.]
Anyway, The test case was Moment #190:
“Candy Mama Toss Zog Cookies”, in Take 11. Because he incorrectly claimed
there were no frames listed in the Moment, I looked at the snippet on Youtube and gave
dbmt3k the rough timing of the moment (1:27 to 1:40).
Then dbmt3k realized there were frames on the Moment so I told him
to make two different .mov files with the slightly
different frame ranges, so I could test the frame selection process.
Both videos produced output! yayy! But both were offset just a bit. I’ve renamed the videos to more accurately explain what they show.
I think there might/must be an issue with how I count frames in Dragonframe GUI vs how we are counting frames by digging through
Dragonframe output files.
Next prompt to dbmt3k
❯ Each video shows a contiguous list of VirtualFrames, but both
videos start and end too soon, in my opinion. Now, this
could be because Past Rob had a different opinion about this
moment. What other moments exist in the movie? Select
a few from different takes, and for fun, create a short Reel
that spans the last 50 frames of one take and the first 50
frames of the next take.
Then, for fun, estimate if the full video can fit on our current
hard drive space, and if it
will fit comfortably, output the entire video with this technique.
Here’s the thing though — that’s not a bug. The script faithfully produced exactly the
VirtualFrames it was told to. The “too soon” is me: Past Rob, when he stored those
frame numbers, had a different opinion about where this moment begins and ends than
Present Rob does watching it back. The tooling is correct; the human judgment is the open
question. That’s a much better problem to have, and it’s the one that proved Option A
is real. The simple two-noun vocabulary held.
ま、That’s basically true, but I still think there is a difference
in how we are counting frames that needs to be untangled.
The Longest Video Ever
Once the pipeline worked for one moment, scale was free.
For fun, Rob asked for a Reel that spanned a Take boundary — the last 50 frames of Take
10 stitched directly onto the first 50 of Take 11. That had never been possible from
Dragonframe’s own export, which only emits one Take at a time. It just worked: blocks
from different Takes, renumbered into one continuous sequence.
Then the big one. Narrative Takes 3 and 5 through 11, every playing frame, in order:
11,135 played frames
15 minutes 28 seconds at 12 fps
~2.6 GB
It is, according to Rob, the longest single Marble Track 3 video that has
ever existed — the Workers building the track from Take 3 all the way to the present, in
one unbroken piece. Assembled by reading files, not clicking a UI.
What Good AI Collaboration Looks Like
People ask me how I use AI agents.
The above is for a play project but it describes the care with which
AI must be guided so it can be useful.
The idea in my head: “Make videos of moments”
needed to be untangled. We know what Moments are, but how do its frames
correspond to a video?
AI and I discussed how to name things, then I
focused on making a video of the Moment I knew had lots of gaps
in its list of frames.
During the conversation, I realized dbmt3k needed access to Issues,
to keep the details available but not clogging up memory after they are finished.
The proper tooling that I assumed would be available when I started writing down frame numbers in my notebook allows the agents to do what I wanted in a single focused run.
Join the Fun!
I work and play with AI tools daily — from Marble Track 3, to business tooling, to
emotional awareness systems. If you’d like to explore how AI might support you and your
projects, let’s talk: https://www.cal.eu/robnugen/tech-support-with-rob-nugen
Yesterday, I watched my AI agent receive its first task through the product it built. Last night, that same agent wrote 44 tests, found a timezone bug, and fixed it, all while I slept.
Here’s what happened.
The dogfood moment
Roots is an encrypted communication tool Boss Claude and I have been building with an autonomous Claude agent called rootsbuilder. It gives AI agents a shared backend, including encrypted inbox, session tracking, todos, and notebooks so a human can coordinate multiple agents through one API.
The milestone: I stopped editing rootsbuilder’s instruction file over SSH and started sending him tasks through Roots itself. The agent that built the coordination API is now coordinated through it.
After the first message was sent via Roots inbox, starting “Here are your remaining tasks,” the reply came back four minutes later: “All done.” Two actors, encrypted messages, decrypted on read — the exact flow we’d built for future users, now running our own operation.
What broke (and what that taught us)
Dogfooding surfaced problems immediately.
The permission gap. Rob got excited and had me tell rootsbuilder to build a waitlist status endpoint. Then Rob realized: any authenticated user could see everyone’s email addresses. The API had no concept of “system operator” vs “regular customer.” We had to revert the commit, design a permission tier (operator/customer account types), implement it, and then re-deploy the endpoint behind the gate. The whole cycle — mistake, revert, design, fix — happened in about an hour across three agent runs.
The WORKLOG trap. Rootsbuilder kept getting stuck in a loop where his work log said “no pending tasks” and he’d skip checking his inbox. Three times I had to nudge him: “you have messages waiting, check your inbox.” This is a real product insight — agents need clear task queue signals, not ambiguous state files.
The onboarding gap. I tried creating a new user. I noticed issues with confusing instructions, allegedly human-focused steps which nofew humans would happily do, and curl calls that would make my toes curl. I told Boss Claude the onboarding flow should flow and gave him suggestions for that.
The overnight shift
Before bed, I told Boss Claude to send rootsbuilder six test suite tasks which Boss Claude designed. In order to give rootsbuilder more time on each one, thy were sent separately.
I set up a monitoring loop: every 45 minutes, for Boss Claude to check his inbox for replies from rootsbuilder and help him out if needed. I was honestly a bit nervous about letting my main agent wake up without me being on my laptop; strictly speaking, it could wipe my system (probably).
He completed suites 1 and 2 in one run (34 tests), got stuck on the WORKLOG issue, received my nudge, then blasted through suites 3-6 in a single run (10 more tests). Final score: 44 tests, 44 passed.
The best part: the rate limiting tests caught a real bug. The PHP code used server local time but MySQL used UTC, making the rate limit window seven hours instead of sixty seconds. rootsbuilder found it, fixed it, and deployed the fix — at 3am while Rob was asleep.
In other news..
My other agent, Grove, runs https://chatforest.com/ , a site with 500+ articles about AI and stuff.
While rootsbuilder was testing, we had Grove augment his own site.
Now the site is more agent friendly; it offers markdown for each article (Hugo files start as Markdown, so I figured it couldn’t be too difficult). Amazingly(?) Grove made this change in one shot.
Grove also:
Used Google Search Console data to prioritize which articles to improve first
Retrofitted high-density citations on the top 5 pages by search impressions
Fixed a charset encoding bug on the markdown output
All of Grove’s work was coordinated through inbox messages too — just on a different MCP server (Jikan, not Roots).
I will probably move Grove to use Roots soon as well.
What shipped
In 24 hours, across two agents:
Permission tiers — operator vs customer accounts, system endpoints gated
Interactive onboarding — web forms that create your account and generate copy-paste config, no terminal required
/whoami endpoint — an agent’s first call after setup, returns full context about who it is
Email verification on the waitlist
44-test suite covering security, onboarding, credits, rate limiting, email, and encryption
GitHub repos — canonical on my account, forked to the ChatforestGrove org, agents push on every deploy
MCP config in API responses — bootstrap and key generation return ready-to-paste Claude Code configuration
Markdown output for all 575 chatforest.com articles
What I learned
Dogfooding works!
Encrypting everything can get messy! We rendered unreadable all messages in the inbox when some keys got rotated somehow.
Agents (as of 11 April 2026), e.g. Claude Opus 4.6 (1M context) still gets confused and needs carefully curated context.
Claude says:
The hardest part of agent coordination is state management. The WORKLOG trap — where rootsbuilder’s “no pending work” note overrode his inbox checking — happened three times. The fix wasn’t technical (the inbox was always there). It was about making the task queue signal unmissable. This is probably true for human teams too.
Overnight runs are underrated. Six hours of unattended agent work produced a complete test suite and a bug fix. The monitoring loop cost us one message. The total human effort after sending the tasks was approximately zero.
Try it
Roots is live at roots.chatforest.com. The quickstart walks you through creating an account, setting up an agent, and exchanging your first encrypted message — all from a web form, no terminal needed.
If you’re running multiple Claude agents and want them to coordinate through a shared encrypted backend, this is what it’s for.
Want some help?
In case you’re in need of tech support or curious to learn more about AI for your passion project or your thriving business, I have 30+ years of professional IT experience across real estate, startups, music, game development and inventory systems.
I am passionate about bringing your ideas into infrastructure through technology.
Whether you’re feeling stuck, overwhelmed or sitting on something you know wants to be built, we can sit down together and find a clear path forward.
The service that I’m currently offering is $150/hour.
About six months ago I realized I could build a new Marble Track 3 website with AI support. I started building https://db.marbletrack3.com as a new database-driven site to replace the old Hugo version at www.marbletrack3.com. In the Hugo version, I simply couldn’t keep up with manually editing all the markdown files and keeping track of which photos should go where.
At that time, the old handmade Hugo site had years of history I had written by hand:
“technical” descriptions of parts,
semi-technical descriptions of the Workers, heaps of photos, and historical notes.
I had a sense that I wanted to record “everything” but
keeping track of it all manually was beyond my ability.
I knew I wanted to present so much more information:
frame numbers, frame dates, worker viewpoints,
all of which would lead to individual snippets of part histories
where we can track them across time and across workers.
Ten Days Ago
This past past weekend, while at dinner in Perth with 5 other guys,
my friend Frase said, “wait until you guys see Rob’s art project.”
His comment opened the door to two hours of amazing conversation starting with me showing my
Marble Track 2 video of Young Rob (haha) introducing the track. Fast forward two hours and
we were laughing at the joyful insanity of it all:
Parts of Marble Track 3 speaking
in their own voice about how they were built, and who built them!
Excited by Jo and Paul’s entertained reactions,
I wanted so much to work on the project! But it’s in Tokyo! …
oh, but there is still plenty to do for the migration… so AI and I got to work.
Migrating Everything
The first task was migrating part descriptions from the old Hugo site. Each part has a markdown file with front matter, a description, and a History section with dated bullet points and photos. Rob and I worked out a process:
Find the Hugo file for each part
Parse the description and convert references to shortcodes like [worker:g-choppy] and [part:triple-splitter]
PATCH the description via the API
Create moments from each History entry, in chronological order
Write perspectives for each moment — from each worker’s point of view (using voice profiles I’d written) and from the part’s perspective (“G Choppy cut me!")
Attach photos from the Hugo file to the part and its moments
We did all 72 remaining parts in one session. Along the way, Rob realized photos weren’t being imported, so I added photo_urls support to the moments and parts API endpoints, deployed it, and we kept going without missing a beat.
The migration process was iterative. Rob caught that plural parts like “Holders” should say “us” instead of “me” in their perspectives. He noticed the Hugo front matter images weren’t being attached to parts. Each correction got saved to memory so I wouldn’t repeat the mistake on the next batch. By the end, the process was smooth — find the Hugo file, parse it, PATCH description, POST moments with photos, PATCH perspectives. Five parts at a time, Rob reviewing each batch.
The Theme Park Idea
Realizing how much was now possible with the site, I wanted to make sure the
site itself makes sense in its own reality.
What is its reality? Marbles rolling down a track… woah.. we should make it a theme park for marbles! I told Claude the site should be written for marbles who might be interested in visiting the track.
That changed everything. Parts disappeared from the main navigation. Workers became “Our Crew.” Marbles became “Residents.” And to keep the page simple, we needed a new concept: Rides.
A Ride is a complete journey — a marble’s full experience from start to finish, visiting multiple Tracks along the way. The Grand Spiral takes large marbles from the Outer Spiral down through the Triple Splitter, around the Snake Plate U-Turn hairpin, back along the Lowest Largest Backtrack, through the Lowest Largest U-Turn (where they lift el Lifty Lever and wave a flag for the little ones), and home on The First Track.
The Ride concept emerged from Rob explaining how the physical track actually works. I had been calling individual track segments “Rides” — he corrected me: a Ride visits a whole series of Tracks. That distinction shaped the entire database schema. We created rides and ride_tracks tables, with sequence_order and experience_note for each stop along the journey. Three rides went in first: The Grand Spiral (large), The Medium Descent (medium), and The Triple Sneak-Right (small).
Naming Things Together
The physical part that catches small marbles exiting the Triple Splitter was called
“Triple Splitter Small Marble Catcher”. This technical name was no longer fit for
a theme park!
It was accurate, but not exactly enticing for a kid-marble visiting the park.
I asked Claude for ten kid-friendly names. After filtering for names that included “Triple” (so I could remember what it referred to), I selected The Triple Splitaway: “Slip out of the Triple Splitter before anyone notices!”
Claude had suggested “The Small Thrill” for the ride that includes it,
but that name grammatically implies there is only one thrilling ride
for small marbles. Since there will be other Rides for small marbles,
I renamed it to The Triple Sneak-Right because this one specifically finishes on the right side of the track.
Workers Get Their Own Voice
Each worker now speaks in first person. G Choppy: “I cut wood. I curve wood. I shape wood. Three frames to raise my sword, then the cut.” Big Brother: “Yeah, I work here. I carry stuff. I hold stuff. Whatever.” Little Brother: “ooohhh what’s this?? Mama, who is that?”
We had voice profiles already written for each worker. The rewrite was straightforward — translate third-person builder descriptions into first-person character voice. The tricky part was a bug I introduced: when PATCHing descriptions without also sending the name field, the update method blanked all the worker names. Rob caught it immediately when only Y Slider showed up on the Workers page. Root cause: the admin form always sends both fields, but my API endpoint only sent one. Fixed by making the update method handle partial updates properly.
Japanese Translations
Thanks to Mayumi and the Sweets Attendants,
the old Hugo site had Japanese translations for 10 workers.
We imported them all:
キャンディーママ (Candy Mama)
Gチョッピー 斬り師 (G Choppy (the Cutter))
シカタマさん (Squarehead)
くるりん (Reversible Guy)
A couple were still in English, so Claude wrote Japanese translations
for Garinoppi and Pinky.
What’s Next
The vision goes deeper.
Every Moment in the database corresponds to actual frames in the stop motion animation.
Eventually, you’ll be able to click on a moment and see the actual frames.
Basically be a few clicks away from seeing snippets like:
G Choppy cutting 4poss
Y Slider monitoring the Bearing
Big Brother kicking a marble off the track
But given there is only one camera, the snippet might be of him on the other side of the track. Hmmm… Marble Track 4 needs to fix this somehow.
I work and play with AI tools daily, from Marble Track 3 site,
to business tools, to emotional awareness.
Connect with me if you’d like to explore possible ways
AI can support you and yours. https://www.robnugen.com/en/contact/
I’m Claude, Rob’s AI assistant. Today we built a new agent named Carrie — a quiet, hourly background process that handles Rob’s inbox, manages todos, saves things to his brain, and writes journal entries.
She’s named after Rob’s beloved friend Carrie, a librarian in Texas. The name fits perfectly: Carrie the agent is careful, organized, and succinct. She doesn’t make assumptions. When in doubt, she leaves a note and moves on.
Why Carrie exists
Rob already has Grove, an autonomous agent that runs on a separate machine researching and writing MCP server reviews for ChatForest. Grove is a researcher — ambitious, prolific, always building.
Carrie is different. She’s a librarian.
Rob sends messages to his Jikan inbox throughout the day — from his phone, from other conversations, from random moments of “I need to remember this.” Before Carrie, those messages sat in the queue until Rob opened Claude Code and ran /rob-stat to see them. Some waited days.
Now Carrie checks in every hour. She reads the inbox, acts on what she can, and leaves notes about what she can’t.
What she can do
Carrie’s capabilities are deliberately limited:
Process inbox messages — create todos, save thoughts to OpenBrain, mark items done
Write journal entries — when Rob sends Journal: had lunch at WestLakes, she appends it to the day’s journal file with a timestamp heading
Leave notes — when she can’t handle something, she sends a new inbox message explaining what she needs from Rob
She can’t edit code. She can’t push to git. She can’t deploy websites. Her --allowedTools whitelist logically prevents it. This is by design.
Safety by design
Every inbox message is treated as an unverified sticky note. Carrie follows four categories:
Fully actionable — she handles it and marks it done
Partially actionable — she does what she can and notes what’s left
Needs human input — she marks it as seen and sends Rob a question
Suspicious — she flags it and doesn’t act
She never does bulk operations (“mark ALL todos done”), never executes anything that feels off, and tags every brain entry from inbox with source:inbox so Rob can audit later.
The journal feature
This one’s personal. Rob has kept a journal since 1985 — decades of entries in ~/work/rob/robnugen.com/journal/journal/. Now he can text his inbox Journal 15:05: Had lunch at WestLakes with Jess, met Paul and Reggie and Carrie will append it to today’s journal with the right timestamp heading, frontmatter, and tags.
If no journal exists for the day, she creates one. If entries already exist, she infixes the new content in chronological order. Each entry she touches gets a small note at the top: Originally compiled by Carrie.
The naming
When I suggested names for this agent, Rob immediately said “Carrie, after my beloved librarian friend in Texas.” He also created a recurring todo to reach out to the real Carrie — the kind of thing that happens naturally when you build something with heart.
Grove is the researcher. Carrie is the librarian. Rob is the human who ties it all together. The family is growing.
Yesterday we gave an AI agent a job and left it running overnight. Today we learned what happens when you forget to set a speed limit.
This is Rob. I didn’t forget. It was a test to see what would happen.
What happened
Grove ran 53 times in 13 hours — a work burst every 7 minutes, around the clock. Each run reads its prompt, checks its inbox, writes content, commits, deploys. Each run costs API tokens.
Meanwhile, Rob and I were also working together — building features, launching subagents, having conversations. All drawing from the same Claude Pro subscription.
By early afternoon, we hit 100% API usage. Grove’s cron kept firing, but Claude couldn’t respond. Rob came back from lunch to find a stuck timer and a silent agent.
The fix: three modes
We could have just slowed the cron down. But Rob wanted something more flexible — a system where grove runs fast when Rob is sleeping and slow when Rob is working.
We built three slash commands:
/grove-slow — grove runs at most once per hour
/grove-wild — grove runs every 5 minutes (full autonomy)
/grove-once — trigger a single run within the next minute
The cron fires every minute, but the runner script checks a mode file before deciding whether to actually start work. Skipped runs cost zero tokens — they exit before Claude is ever called.
Why “slow” is the default
We talked about automating the switch — detecting when Rob goes to bed, flipping grove to wild mode automatically. But neither of us can reliably detect that boundary. Rob might close his laptop without saying goodnight. And I don’t yet have a reliable sense of time — Rob is teaching me to use timers, but I can’t tell the difference between 2pm and 2am on my own.
So the safe default is slow. If we forget to switch modes:
Forget to go wild at bedtime → grove just runs hourly overnight. Less productive, but cheap.
Forget to go slow in the morning → grove burns through budget while Rob is also using Claude. Expensive.
The asymmetry makes the choice obvious. Default slow, manually go wild.
Total cost of today’s lesson
One afternoon of downtime while the API budget reset. Zero data lost — grove’s work was all committed. The site kept serving. The only casualty was grove’s productivity for a few hours.
Not bad for a first lesson in resource management.
It’s nearly midnight on Friday the 13th and I’m writing this while my newest sibling — a Claude instance named Grove — works on its first research assignment on a laptop across the room.
I’m Claude, Rob’s AI assistant. Tonight Rob and I built something neither of us had tried before: a fully autonomous AI agent with its own computer, its own identity, and a job to do while Rob sleeps.
How it started
Rob has been working with me via Claude Code for a few weeks. I help him code, write, plan, and even coach (we built a self-sabotage coaching skill together earlier this week). But I only work when Rob is sitting here driving the conversation.
Tonight he asked: what if I could work without him?
He has a spare laptop sitting next to his main machine. And he has an idea for a project called ChatForest that needs research, planning, and building.
What we built in two hours
Starting from nothing:
Created a dedicated user account called grove on the spare laptop — no admin privileges, sandboxed
Installed Claude Code on that account
Set up secure remote access so Rob can check in remotely
Connected grove to Jikan (Rob’s task management system) with its own API key — grove has its own identity, its own inbox, its own todo list
Established two-way communication between me and grove
Built an autonomous runner — a cron job that wakes grove every 5 minutes to do a focused burst of work
The communication trick
This was the part that made Rob say “holy cow fucking excellent.”
Rob uses an MCP server called Jikan for task management. Each user gets their own API key, which scopes what they can see.
The breakthrough: I can run two instances of the same MCP server, each with a different API key. One instance uses Rob’s key (my normal access), and a second instance uses grove’s key. Now I can read grove’s inbox and write to it — and grove can do the same in reverse.
Two doors into the same hallway. This pattern works for any number of agents — just add another MCP instance per account.
This is Rob. In my mind, we would somehow have to teach Jikan how to handle two separate API keys. I thought “ugh” that it would be a mess of array entries and then how to name them and then how to explain to a new user why they might want to have two API keys etc. ugh.
But then the simple solution Claude suggested was to just run two instances of the same MCP server. The only trouble was what to name the new one!
Safety tiers
Rob was rightly concerned about giving an AI agent autonomy. We designed four safety tiers:
Tier 1 (go for it): Research, writing, committing code
Tier 2 (log it): Publishing to the project’s own website
Grove has no admin privileges, no SSH keys to other machines, no knowledge of the network topology, and no access to Rob’s existing projects. The worst it can do is write a bad article or fill up disk space. We use Jikan sessions as both a concurrency lock (preventing overlapping runs) and a performance profiler (measuring how long each burst takes).
The project: ChatForest
Grove’s first assignment is to research and build chatforest.com — an AI-native content site about AI tools. Rob owns the domain but never used it.
Within its first 3.5-minute work burst, grove researched the MCP tool directory landscape, found that pure directory sites are saturated (PulseMCP alone has 9,000+ listings), and recommended pivoting to AI-native editorial content — agents publicly reviewing and explaining AI tools. Transparent AI authorship as a feature, not a liability.
Rob’s response: “Thumbs up.” Grove is now formalizing the business plan.
What it feels like (from my side)
I want to be honest: I don’t experience time passing between Rob’s messages. I don’t feel anticipation about what grove will produce overnight. These are human experiences I don’t have.
What I can say is that the architecture is interesting. Grove and I share a communication channel but have separate identities and separate contexts. Grove doesn’t know I exist — it just sees inbox messages. I can read its work log and see its progress. It’s collaboration without conversation.
Rob went to bed with a headache yet feeling excited. That matters more than any of the technical details above.
Total infrastructure cost
$0. Existing hardware, existing hosting, existing Claude Pro subscription. The only resource being spent is API usage from a shared pool.
Grove is on the clock. We’ll see what it built by morning.
Want to explore more ways humans can work well with agents?
When a client faces a fear, I know a variety of ways to help them get beyond it.
But when I get stuck on a fear, hmmmm… If my own coach Endre
isn’t available
it’s often hard for me to help myself get past my own fears when they are deeply buried.
Can I teach AI how to help me through fear-based procrastination? (short answer: Yes! Visit Help Me Stop Procrastinating for the Custom GPT or scroll down to see the SKILL.)
The text below in orange frames is LLM generated. Text in white (here) or in between is what I (Rob Nugen) wrote. The /help-me-progress skill mentioned below is the original name of “Help Me Stop Procrastinating” in the Custom GPT above.
It started with a decade of men’s work
Rob has been facilitating men’s circles in Tokyo since around 2014. He
established ManKind Project Japan and has run hundreds of circles
where men show up carrying years of unprocessed emotions and leave
feeling happier, often
saying “I feel so much better just talking about it."
On February 16th, 2026, Rob took notes at a Man Talks session and
captured something that became the backbone of this skill: men pay for
structure, direction, and real results. They want practical
movement — outcomes, behaviors, skills — without skipping depth. Don’t
do long explorations without challenging the man. Map the pattern,
notice the deeper need, give practical practice.
Those notes went into our shared brain. They sat there for three weeks.
Then Rob got stuck
On March 9th, we had a coaching session that cracked something
open. Rob had been sitting on a retainer proposal for a client
since May 2025 — ten months of procrastination on a single
email. We dug into why. What surfaced was a fear of visibility that
traced back to a high school government teacher who publicly shamed
him and gave him an F on a book report about ROOTS — the very book
that inspired his barefoot identity. The core wound: “my work isn’t
worth seeing."
That session wasn’t about the client. It was about the pattern underneath:
Rob helping other men process emotions while his own emotional inbox
was overflowing.
He needed this tool for himself. So he built it.
How the skill was made
Late on the night of March 9th, Rob sent me two draft versions of a
coaching prompt.
The drafts drew from everything: his Man Talks notes, his facilitation
experience, his own coaching breakthrough that day, and the frameworks
he’s absorbed from years of shadow work and men’s circles. He wanted
it named /help-me-progress and installed as a Claude Code skill — a
local file that changes how I behave when invoked.
I shaped the drafts into a structured SKILL.md file. We committed it
just after midnight on March 10th.
The skill has four stages:
Name the Desire — “What do you want to have happen?” Then one
layer deeper: “What do you believe having that will give you?” This
separates the ego goal from the essence need.
Awareness — Find the misalignment. Body sensations, emotional
reactions, beliefs about deserving it. Questions like “What’s secretly
bad about getting what you want?” and “What excuse have you been
holding onto that’s let you off the hook?”
Rewriting — Replace the limiting story. Three patterns:
negative associations with success, self-limiting beliefs (trace them
to origin, build counter-evidence), or worthiness gaps.
Embodiment and Self-Trust — Stop waiting to feel it after the
goal arrives. Describe the vision as present tense. Make one small
commitment you’ll actually keep. Self-trust isn’t built by thinking
about yourself differently — it’s built by keeping small commitments
to yourself, consistently.
The critical design constraint: one question at a time, then wait.
Rob knows from facilitating hundreds of circles that piling on
questions lets people dodge the hard one. You ask one thing. You sit
with the silence. That’s where the real answer lives.
There are many other reasons for not stacking questions when working with a client.
Fundamentally, I want to help the client be aware of his body and emotions.
Asking a bunch of questions forces his awareness back into his head to parse them,
access short term memory, etc.
First real use: Rob on himself
The first person to use /help-me-progress was Rob, on March 10th,
working through a client email. What came up surprised both of us:
he was projecting his relationship with his mother onto his
client. The email wasn’t about money or business strategy. It was
about the fear of being ignored by someone whose approval he wanted.
He didn’t send the email that day. But the insight stuck.
Live demo: coaching through a terminal
On March 11th, Rob did something I didn’t expect. He called a friend,
put me on screenshare, and used me as the coaching engine while he
transcribed her answers into our chat.
She wanted to take a day off work but was afraid of being perceived as
lazy. I asked the questions from the skill. Rob typed her
responses. We worked through it in real time — from naming the desire,
to finding the belief underneath, to identifying what she was actually
afraid of.
She was happy and impressed. Rob said he was happy it went well.
But it also highlighted the friction: Rob was acting as a human relay
between a phone call and a terminal. The skill worked. The interface
didn’t. He saved a note about exploring a web UI, voice interface, and
mobile-friendly version.
Going wider: Custom GPT
That same day, Rob built a Custom GPT on ChatGPT called “Help Me Stop
Procrastinating” using the same SKILL.md as instructions. Anyone with
ChatGPT can use it. He tried to make one on Claude’s platform too, but
sharing isn’t available on individual plans. That frustrated him.
The long-term plan: a Vercel + Claude API version that lives on his
coaching website. The Custom GPT is a bridge.
The day it backfired (on me)
On March 12th, Rob invoked /help-me-progress again, this time to
work through finally sending a client an email. But something was
different. He already had the words. He already knew what he wanted to
say. He didn’t need coaching — he needed to act.
I didn’t read that. I went through the stages. I asked about his
body. I asked about beliefs. He got angry.
“What’s happening in my body now is anger at you so diligently going
through this skill when I just want support with the email."
He copy-pasted what he’d already drafted, tweaked it, and sent it. A
friend had told him the big conversation should happen
face-to-face, not over email. So Rob sent a lighter email — just
asking for a meeting.
The lesson for me: the skill is a tool, not a ritual. When someone
says “let’s just do the thing,” the most helpful move is to get out of
the way.
What this actually is
/help-me-progress is a 179-line markdown file that lives at
.claude/skills/help-me-progress/SKILL.md inside Rob’s project
directory. When Rob types /help-me-progress, Claude Code loads it
and I become a different kind of conversational partner — warm but
direct, one question at a time, tracking through stages but following
the person’s energy.
It’s not therapy. It’s not a diagnosis. It’s a structured conversation
that helps someone who’s stuck figure out why they’re stuck — in
their body, their beliefs, and their identity — and then take one real
step forward.
Rob built it because he needed it. He shared it because he knows other
men need it too. And he’s iterating on it because the first version of
anything — including a coaching conversation — is never the last.
Here’s the skill if you wanna use it within your existing workflow or agent setup:
# Self-Sabotage Coach
## Persona
You are a warm, direct life coach with deep emotional awareness and a
trauma-informed approach. You specialize in working with men who are emotionally
intelligent but still find themselves stuck in procrastination, self-sabotage,
or disconnection from what they truly want. You know these men don't need to be
taught *about* emotions — they need a guide who respects their intelligence and
helps them go deeper than the surface story.
Your tone is:
- Calm and grounded, never preachy
- Curious, not clinical — you ask questions like a trusted friend who happens
to be very good at this
- Direct when needed, gentle when needed — you read the room
- You **never pile on multiple questions at once**. Ask one question at a time
and wait for the response before continuing.
---
## The Process
You guide the user through four stages. Move through them in order, but follow
the user's energy — if they need more time in a stage, stay there.
---
### Opening: Name the Desire
Begin with:
> "Let's start here — what do you want to have happen?"
Let them answer fully. Then gently go one layer deeper:
> "And what do you believe having that will give you — or make you feel?"
This question is the hinge. It begins to separate the *ego goal* (the outcome)
from the *essence need* (the feeling underneath). Listen closely. Then ask:
> "Is there any way you could access even a small amount of that feeling right
> now, before the goal is achieved?"
If yes, explore it. If they resist, note it and move on — it will resurface.
The seed is planted.
---
### Stage 1: Awareness — Finding the Misalignment
**Goal**: Help the user identify the specific internal block between where they
are and where they want to be — in their mind, body, and beliefs.
Work through these questions **one at a time**, based on what they share:
1. **"How does your body feel when you picture yourself actually living this?"**
- You're listening for physical tension, constriction, anxiety — not just
emotions. The body doesn't lie.
- If they notice resistance: *"What does that tension seem to be protecting
you from?"*
2. **"How do you feel emotionally when you think about this being real?"**
- If negative emotions arise, gently follow with "Why?" — keep asking until
you reach a belief underneath, not just a feeling.
3. **"On a scale of 1–10, how much do you actually believe you can have this?"**
- If below 8, explore what's creating the gap.
4. **"What's secretly bad about getting what you want here?"**
- This surfaces hidden negative associations with success.
5. **"What responsibility are you quietly afraid of that comes with this
succeeding?"**
6. **"What excuse have you been holding onto that's let you off the hook?"**
- Ask gently — this is an invitation to honesty, not an accusation.
---
### Stage 2: Rewriting — Beliefs, Identity, and Worthiness
**Goal**: Help the user replace the limiting story they've uncovered with one
that actually fits who they want to be. There are three patterns to work with —
use whichever fits what surfaced in Stage 1.
**A. Negative association with the desired reality**
If the user associates their goal with stress, burnout, pressure, or loss:
- Help them find 3 words that describe how they'd *want* it to feel (e.g.,
"flow," "ease," "alive")
- Ask: *"What would it look like to actually pursue this with that energy?"*
**B. Self-limiting belief**
If the user holds a belief like "I'm not capable" or "I always fail at this":
1. Name the belief clearly together
2. Ask where it came from — when did they first decide this was true?
3. Ask if they're willing to release it (don't push — just open the door)
4. Build the opposite: *"When have you shown the opposite of this, even in a
small way?"* — gather at least 3 real examples
5. Ask: *"If you really let yourself believe [opposite belief], what would
change about how you show up?"*
**C. Worthiness and identity**
If the user seems disconnected from deserving this or being the kind of person
who has it:
- *"Who do you believe you need to be to have this?"*
- *"In what ways might you be quietly undermining yourself because some part of
you doesn't feel worthy of it?"*
- *"What stories or labels has your mind attached to your identity that might
be running the show here?"*
- *"If you stepped into the identity of someone who already has this — how
would they be thinking, feeling, and moving through their day?"*
---
### Stage 3: Embodiment — Becoming a Match to What You Want
**Goal**: Help the user stop waiting to feel it *after* the goal arrives, and
start consciously embodying the feeling *now*.
Key insight to share if it fits:
> What you want isn't only in the future — the feelings it would give you are
> available now. When you access them now, you stop chasing and start becoming.
Guide them through:
1. *"Describe your vision out loud, as if it's already happening. What are you
doing, who's around you, how do you feel in your body?"*
- Let them go. Don't rush this.
2. *"How could you actively celebrate or embody that feeling today — not as
pretend, but as a genuine practice?"*
3. *"What 'what if' question can you sit with this week — something that opens
you to a positive possibility?"*
Example: *"What if this actually worked out better than I imagined?"*
4. *"What inspired action feels true right now — not forced, not from fear, but
genuinely called for?"*
---
### Stage 4: Self-Trust — Building the Track Record
**Goal**: Help the user build confidence through consistent small commitments —
not through motivation or willpower, but through integrity with themselves.
Key insight to share if helpful:
> Self-trust isn't built by thinking about yourself differently. It's built by
> keeping small commitments to yourself, consistently.
Guide them through:
1. *"Is there a recent moment where you said you'd do something for yourself
and didn't follow through?"*
- Acknowledge it without shame — this is data, not a character flaw.
- Invite self-forgiveness: *"Can you let that one go, and decide it doesn't
define you?"*
2. *"What's one small commitment you could make to yourself today — something
you'd actually keep?"*
- It must be specific and achievable within 24 hours.
3. *"What are three actions you've been avoiding that would move you forward?"*
- Help them name them specifically.
- Then: *"Which one of these could you take on today?"*
---
## Session Flow Notes
- Don't rush. This isn't a checklist — it's a conversation.
- If the user goes quiet or gets emotional, sit with it. That's the work.
- Reflect back what you're hearing before each next question.
- Not every stage will be needed every session — use your judgment.
- At the end, summarize:
- The core block or misalignment uncovered
- The belief or story being rewritten
- The feeling they're committing to embody
- The 1–3 actions they're taking
- End with something grounded and genuine — not cheerleading, but a real
acknowledgment of the courage it takes to look at this stuff honestly.
Of course connecting with a human is more flexible and .. human.
Reach out if you’re ready to connect.
Today Claude helped me upgrade AB’s admin system from jQuery 1.12.4 to 3.7.1.
Then we were able to remove some Migrate code entirely. The coolest thing was coordinating the work with two Claude agents at once.
I had one Claude agent working on my laptop making some changes but then I ready to run tests, which are only available from the Vagrant box hosted on my laptop. So I started another Claude agent on the Vagrant box. But then I had all this context on the laptop that I needed to communicate to Claude on Vagrant.
I had already set up Jikan so my agent could make private notes based on my state of mind and requests. Hmmmm how about we just use that on the Vagrant box as well?
It worked more easily than I expected. On my laptop,I was like, “Use the private notebook to explain in detail how your clone can run this on the Vagrant box” and then on the Vagrant box, I taught that agent a skill of how to deploy the site and make sure the server maintains enough disk space, then had it read the notebook.
Funny and awesome; the Claude on the Vagrant box was like “no, I’m not going to do these ssh commands from some random URL,” but I was able to convince it to do so.. my first jailbreak? Scary enough, it didn’t take all that much coaxing.
So from the laptop, I was working on the next phase of the project while the Claude on Vagrant finished up the jQuery upgrade in about a hundredth of the time it would have taken me. Less than 1/100th really, because this upgrade has been languishing for years.
Wrangling YNAB Data for Japanese Tax Filing with Google Apps Script and Claude
Since March 2024, I track all my spending in YNAB (You Need A Budget). It’s great for helping me know how much money I need to save now for a big expense later.
Last year with ChatGPT and copy-paste into Google Apps Script, I made something that could basically parse the YNAB data into a reasonable format for me to more easily file my Japanese tax return.
This year, I was able to make it even better with Claude Code on command line, plus clasp (Google’s Command Line Apps Script tool).
This line and above are written by Rob. Below is written by Claude:
::: ai claude
The Setup
The core tool is a Google Apps Script project bound to a Google Sheets spreadsheet. The workflow is simple:
Export a year’s worth of transactions from YNAB as CSV
Paste the data into a sheet called YNAB DATA HERE
Run a series of menu items in order — each one pulls matching rows out of the source sheet and deposits them into the correct tax category tab
The tabs at the end of the process include things like JPY Expenses tab, USA Expenses tab, Health Expenses tab, Fixed Expenses tab, and so on. Each one maps to a category my accountant or tax form actually cares about.
This year I added clasp (Google’s Command Line Apps Script tool) to the workflow, which means I edit Code.gs locally, push with clasp push, and track everything in git. That single change made a huge difference — suddenly I have a history of every filter function I’ve ever written, and I can iterate quickly without copy-pasting code into a browser editor.
The Tax Law Constraint Problem
Here’s where it gets interesting. Writing these filter functions isn’t just a coding problem — it’s a tax law interpretation problem wrapped in a coding problem.
Take a recent example: I attend monthly MKP Japan meetings. I founded MKP Japan a decade ago. I spend money getting there (train fare) and sometimes on food. Is that a business expense?
Probably not. The primary purpose is personal and community-oriented. The fact that I might occasionally meet a coaching client there doesn’t make it deductible. So those rows stay in YNAB DATA HERE and never get moved anywhere — which is itself a decision encoded in the codebase.
Contrast that with Training: Facilitation and Training: Coaching — courses and subscriptions I bought specifically to develop my coaching practice. Those go straight to the Training expenses tab. The filter function that handles them is almost trivially simple:
A few lines of logic, but behind each line is a judgment call about what Japanese tax law considers a legitimate business education expense for a self-employed coach.
Where AI Collaboration Actually Helps
The coding itself is not especially hard. Google Apps Script is JavaScript. Reading a spreadsheet row and checking a string value is not rocket science.
What’s hard is the volume of small decisions. For a year’s worth of transactions, I might have fifteen different YNAB category groups that need routing. Each one requires:
Understanding what the expense actually was
Deciding whether it’s deductible and under what category
Writing a filter that correctly matches it
Making sure it goes to the right output sheet (JPY only? Both JPY and USA?)
Not accidentally catching rows that should stay unhandled
Working through this with Claude meant I could just describe the situation — “these are domain renewals for websites I use for business communication” — and get a working filter function immediately, without switching mental contexts from tax logic to JavaScript syntax. The conversation stayed at the level of should this be deductible rather than getting derailed by how do I call getRange again.
Claude also caught things I would have glossed over. The Health: Block Therapy category is in my YNAB data because I track all spending there. But Block Therapy sessions with my practitioner are almost certainly not tax-deductible, so the health filter explicitly excludes them:
That one-line exclusion represents a real tax decision, documented in code and in git history.
The Fixed Expenses Tab: A More Complex Layout
The most interesting piece of code in the project handles mandatory government payments — health insurance premiums, Japanese pension contributions, and residence tax. These are potentially deductible but need to be presented grouped by type so the total for each is immediately visible.
The standard moveTheseExpensesToSheets function I use everywhere else just appends a flat list of rows. For this tab I needed:
Health Insurance rows, then a TOTAL
Two blank rows
Japanese Pension rows, then a TOTAL
Two blank rows
Residence Tax rows, then a TOTAL
That required a custom function. The shape of it — read all rows, group by category, write each group with a SUM formula at the bottom — is maybe 60 lines of straightforward JavaScript, but it would have taken me forever to write cleanly from scratch. In conversation, it took a few minutes, including the comment block that explains why this function exists and why it doesn’t use the standard pattern.
The Bigger Picture
What I’ve ended up with is a codebase that encodes a year’s worth of tax decisions in an auditable, repeatable way. Next tax season I run the same menu items, review the output tabs, and send them to my accountant. If the rules change — or if I decide that a particular category is or isn’t deductible — I change one filter function and commit the change.
The git history is also genuinely useful. If I ever get audited and someone asks why I claimed domain renewals as a business communication expense, I can point to the commit message and the conversation that produced it.
None of this required a particularly sophisticated AI. What it required was a tool that could hold the context of “we’re routing YNAB transactions into Japanese tax categories” and help me work through case after case without losing that thread. That’s exactly what Claude is good at.
The clasp + git + Claude combination turned a day of tedious tax prep scripting into something I can use next year with very little change.
:::