
We are seeing a shift from AI that chats to AI that acts. This week, we look at five open-source projects redefining how we build, from autonomous coding agents to infinite video generation.
It’s becoming a bit of a full-time job just keeping up with open source right now. You blink, and the industry has moved on from “chatbots” to full-blown autonomous agents.
There’s a lot of noise — but there are also genuinely impressive tools being built that change how we think about software. I’ve been digging through repos with a simple filter: things that feel shippable, and meaningfully push the “AI that acts” frontier forward.
The shift is pretty clear: we’re moving away from AI that just talks, towards AI that does. Here are five projects that caught my eye this week.
If you’re building voice agents, you’ve probably noticed that a static text box feels a bit dead. As we move towards multimodal interaction — think Gemini Live or OpenAI’s advanced voice mode — the interface needs to catch up.
ElevenLabs UI is a neat library of React components that brings voice experiences to life. It gives you polished orbs and waveforms that visualise listening, thinking, and speaking states. It’s built on top of shadcn/ui, so it drops cleanly into a Next.js app.
It’s a small detail, but it’s the difference between an app feeling like a robot… and feeling like an interface.
Repo: https://github.com/elevenlabs/ui
Anthropic made a splash recently when they gave Claude the ability to control a mouse and keyboard. Naturally, the open-source community took that as a challenge.
Open Computer Use is a platform that effectively gives your AI agents hands. It orchestrates two roles: a Planner that breaks down what you want to do, and an Executor that actually clicks buttons and types text.
This is the bridge between generative AI and useful AI. It opens the door to automating workflows across legacy software and web interfaces — not just APIs — so the agent can actually get work done rather than narrate what it would do.
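The two-role loop described above is roughly this shape. A minimal Python sketch, and to be clear: every class and method name here is my own illustration, not Open Computer Use's actual API.

```python
# Toy Planner/Executor loop. All names are illustrative --
# this is NOT the real Open Computer Use interface.

class Planner:
    """Breaks a high-level goal into concrete UI actions."""

    def plan(self, goal: str) -> list[dict]:
        # A real planner would call an LLM here; we hard-code
        # a plan purely for illustration.
        return [
            {"action": "click", "target": "search box"},
            {"action": "type", "text": goal},
            {"action": "press", "key": "Enter"},
        ]


class Executor:
    """Carries out one action at a time against the screen."""

    def execute(self, step: dict) -> str:
        # A real executor would drive the mouse and keyboard
        # (screenshot, locate element, click); we just log it.
        return f"executed {step['action']}"


def run(goal: str) -> list[str]:
    planner, executor = Planner(), Executor()
    return [executor.execute(step) for step in planner.plan(goal)]


print(run("quarterly sales report"))
```

The useful design point is the separation: the Planner never touches the screen, and the Executor never reasons about the goal, which makes each half easier to test and swap out.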
Repo: https://github.com/LLmHub-dev/open-computer-use
This one is equally impressive and slightly terrifying (for those of us who write code). Full Stack Agent is a “text-to-app” platform.
You give it a prompt, and it handles the end-to-end lifecycle: database setup, backend logic, frontend build, and deployment to a live URL. It also runs in a sandboxed environment (which matters a lot when you’re letting an agent generate and execute code).
For rapid prototyping and internal tooling, this is huge. It compresses the “first few hours of a project” into minutes, so you can spend your time on the logic and the user value instead of scaffolding.
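That end-to-end lifecycle amounts to a staged pipeline. Here's a toy sketch of the shape of it, with stage names of my own invention; none of this is Full Stack Agent's real code.

```python
# Illustrative staged pipeline for a "text-to-app" flow.
# Stage names mirror the lifecycle described above; this is
# NOT Full Stack Agent's actual implementation.

from dataclasses import dataclass, field


@dataclass
class BuildState:
    prompt: str
    artifacts: dict = field(default_factory=dict)


def setup_database(state: BuildState) -> BuildState:
    state.artifacts["db"] = "schema generated from prompt"
    return state


def build_backend(state: BuildState) -> BuildState:
    state.artifacts["backend"] = "API routes generated"
    return state


def build_frontend(state: BuildState) -> BuildState:
    state.artifacts["frontend"] = "UI generated"
    return state


def deploy(state: BuildState) -> BuildState:
    # Generated code should execute in a sandbox before
    # anything is exposed at a live URL.
    state.artifacts["url"] = "https://example.invalid/app"
    return state


STAGES = [setup_database, build_backend, build_frontend, deploy]


def run_pipeline(prompt: str) -> BuildState:
    state = BuildState(prompt)
    for stage in STAGES:
        state = stage(state)
    return state
```

Each stage takes and returns the same state object, which is what lets a platform like this retry or regenerate a single stage without redoing the whole build.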
Repo: https://github.com/FullAgent/fulling
Most AI agents try to be good at everything and end up being average at best. Dexter takes the opposite approach: it’s designed to do one thing really well — financial analysis.
Think of it as an autonomous junior analyst. It pulls in market data, reads financial documents, sanity-checks its own work (crucial in finance), and produces a structured report you can actually use.
It’s a great example of vertical AI: constrain the scope, increase the reliability, and suddenly you’ve got something that feels far closer to enterprise-grade. If you love Claude Code, you’ll feel right at home with Dexter.
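The "sanity-checks its own work" step is the interesting part, and it's worth seeing the shape of a generate-then-verify loop. A toy Python sketch under my own assumptions; the function names and checks are illustrative, not Dexter's actual implementation.

```python
# Toy generate/verify loop in the spirit of an agent that
# checks its own output before reporting. Illustrative only --
# NOT Dexter's real code.

def analyze(ticker: str, revenue: float, costs: float) -> dict:
    """Stand-in for the 'generate' step: produce a structured report."""
    return {
        "ticker": ticker,
        "revenue": revenue,
        "costs": costs,
        "margin": (revenue - costs) / revenue,
    }


def verify(report: dict) -> list[str]:
    """Return a list of problems; an empty list means the report passes."""
    issues = []
    if report["revenue"] <= 0:
        issues.append("revenue must be positive")
    if not -1.0 <= report["margin"] <= 1.0:
        issues.append("margin outside plausible range")
    return issues


def run_analysis(ticker: str, revenue: float, costs: float) -> dict:
    report = analyze(ticker, revenue, costs)
    issues = verify(report)
    report["verified"] = not issues
    report["issues"] = issues
    return report
```

In finance, the verify step is doing the heavy lifting: a structured report with an explicit list of failed checks is far more useful than free-text prose you have to audit by hand.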
Repo: https://github.com/virattt/dexter
Generative video is having a moment, but most tools still struggle with duration — they tap out after 5 or 10 seconds, or drift into visual nonsense.
Stable Video Infinity tackles this with a “prompt stream” technique: chaining scenes and transitions to produce longer-form video while keeping the narrative coherent. You can script a sequence that moves from a city street, to a drone shot, to a close-up, and the system is built to maintain continuity as the video evolves.
If you’re interested in storytelling rather than just short clips, this one is worth watching.
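The prompt-stream idea can be sketched in a few lines: each scene's prompt is conditioned on a carry-over from the previous scene, which is what keeps continuity across cuts. A hedged Python sketch; the function names and the carry-over format are my own stand-ins, not Stable Video Infinity's code.

```python
# Sketch of a "prompt stream": condition each scene on a
# carry-over from the previous one. Illustrative only --
# NOT Stable Video Infinity's actual API.

def generate_clip(prompt: str, context: str) -> dict:
    # Stand-in for a video-model call. It returns the clip plus
    # a summary of its final state to condition the next scene.
    return {
        "prompt": prompt,
        "context": context,
        "carry_over": f"ending state of: {prompt}",
    }


def prompt_stream(scenes: list[str]) -> list[dict]:
    clips, context = [], ""
    for scene in scenes:
        clip = generate_clip(scene, context)
        clips.append(clip)
        context = clip["carry_over"]  # continuity across scene cuts
    return clips


clips = prompt_stream([
    "a busy city street at dusk",
    "a drone shot rising above the street",
    "a close-up of a neon sign",
])
```

The key property is that scene N+1 never starts from a blank slate, which is what lets a chained sequence run far past the 5–10 second ceiling of a single generation.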
Repo: https://github.com/vita-epfl/Stable-Video-Infinity
The common thread here is agency.
We aren’t just prompting models for answers anymore — we’re giving them interfaces, tools, and execution environments so they can operate inside real systems and complete real tasks.
If you try any of these, I’d love to hear what you end up building.
I lead data & AI for New Zealand's largest insurer. Before that, 10+ years building enterprise software. I write about AI for people who need to finish things, not just play with tools.
