How we automated routine operational work at XPR with AI agents
An overview of the routine operational work we handed to AI agents at XPR, safeguards added and lessons learned
Most tech startups have routine workflows across functions such as:
- Marketing - publishing blog posts, posting on LinkedIn or other social channels.
- Support - debugging, responding to tickets
- Creative Work - creating and editing graphics, videos etc.
- Sales - prospect research, building target lists, drafting proposals.
- Management - performance reporting for Support, Product Development & Sales.
At XPR we have moved a lot of routine work to AI agents over the last few months: debugging production issues, maintaining troubleshooting docs, creating Jira tickets, evaluating support performance, editing images and videos, drafting blog posts, and so on. This is work that someone has to do manually, that takes real time, and that mostly follows a pattern.
I'm writing this series to show what is possible with today's AI agents, and how to implement the automations safely, so you can carry that into your own work.
This is an overview of what we automated and how it fits together. Each area gets its own post later in the series, including the parts that did not work the first time and what we changed.
How it is set up
The front end is simply a Slack bot. We call ours Shikau (शिकाऊ means "trainee" in Marathi). You tag it in a channel or drop a file, and it does the work in that thread. The agent runs as one service on an EC2 instance in our own AWS account. It has a fixed set of tools it can call, a separate credential for each outside service, and read-only access to anything it does not need to change. On the backend, there are several agents, each with its own prompt. The channel a message arrives in decides which agent answers and what that agent is allowed to do.
(We already use Slack and AWS, so those were the default choices for us. You could just as well build yours on MS Teams, Azure, and so on. The template stays the same; only the tools change.)
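To make the channel-based routing concrete, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not the production setup: the channel names, tool names, prompts, and the `AgentProfile` type are all made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """One agent: its own prompt plus the fixed set of tools it may call."""
    name: str
    system_prompt: str
    allowed_tools: frozenset


# Hypothetical channel-to-agent table. Every name below is illustrative.
ROUTES = {
    "#prod-debug": AgentProfile(
        name="debugger",
        system_prompt="You investigate production errors.",
        allowed_tools=frozenset({"read_logs", "read_source"}),
    ),
    "#marketing": AgentProfile(
        name="blogger",
        system_prompt="You research and draft blog posts.",
        allowed_tools=frozenset({"web_search", "create_preview"}),
    ),
}


def route(channel: str) -> AgentProfile | None:
    """Pick the agent for a channel; unknown channels get no agent at all."""
    return ROUTES.get(channel)


def can_call(profile: AgentProfile, tool: str) -> bool:
    """A tool is usable only if it is on the agent's allow list."""
    return tool in profile.allowed_tools
```

The point of keeping this as a lookup table is that an agent's reach is decided by where the message arrived, not by anything the message says.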
What it's built on
The engine behind all of this is the Claude Agent SDK. If you have used Claude Code, Anthropic's coding agent that runs in your terminal, this is the same engine and harness underneath, but run as a shared service for a team instead of a tool on one person's laptop.
Running it via the Agent SDK as a shared service for a team is far more valuable:
- Shared skills and a common knowledge base. Skills and reference docs are written once and used by every agent, and the agents keep that knowledge base updated as they work. The debugging agent's troubleshooting doc, further down, is exactly this.
- Guarded access. The support team can use an agent that reads our source code without getting access to the repository themselves (the agent, of course, gets scoped, read-only access). The support agent becomes far more useful because it can look at the actual source when needed instead of just troubleshooting articles, while the source stays inside the boundary we set.
- Central safeguards. Credentials, read-only boundaries, and approval checks are set once, in one place, instead of trusting ten separate laptops to be configured right.
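As a rough sketch of what "set once, in one place" can mean, the Python below keeps a single policy table that every tool call passes through. The tool names, the `Access` levels, and the `authorize` helper are assumptions for illustration; they are not part of the Claude Agent SDK or of any real deployment.

```python
from __future__ import annotations

from enum import Enum


class Access(Enum):
    READ_ONLY = "read_only"            # safe to run without asking anyone
    NEEDS_APPROVAL = "needs_approval"  # changes something, so a human signs off


# Hypothetical central policy: one table shared by every agent.
TOOL_POLICY = {
    "read_source": Access.READ_ONLY,
    "read_logs": Access.READ_ONLY,
    "publish_post": Access.NEEDS_APPROVAL,
    "restart_service": Access.NEEDS_APPROVAL,
}


def authorize(tool: str, approved_by: str | None = None) -> bool:
    """Allow a tool call only if the central policy permits it."""
    policy = TOOL_POLICY.get(tool)
    if policy is None:
        return False  # tools missing from the table are denied by default
    if policy is Access.READ_ONLY:
        return True
    return approved_by is not None  # write actions need a named approver
```

Denying anything that is not listed is the important design choice here: adding a new tool then requires a deliberate policy entry rather than a forgotten default.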
There is a cost angle too. Instead of a Claude Code license per team member, this runs on one machine and one plan. For a team of ten that is a real saving. (Note that starting June 15, 2026, Anthropic will stop the Agent SDK from using subscription plan usage, and all usage will be billed at API pricing. Even so, this should stay cost-effective compared with Claude licenses for the entire support, sales, and marketing teams.)
None of this is locked to Claude. I used the Claude Agent SDK, but OpenAI's Agents SDK is very similar and can be used in the same way. As with Slack and AWS earlier, treat the vendor as swappable.
What we automated
Here is the current list.
| Area | What the agent does |
|---|---|
| Marketing and blog | Sends a weekly reminder to publish, researches and suggests ideas, drafts a post, makes a hero image, opens a preview for review, publishes after approval |
| Creative and images | Removes backgrounds, resizes, generates and edits images, builds kiosk theme variants and kiosk home screen videos |
| Sales | Researches prospects from the brands and locations we already serve, groups them into target cohorts (food-service operators at airports, point of sale resellers, and so on), and drafts a pitch with the right context. Also turns our pricing sheet into an SOW for a location |
| Production debugging | Parses a log, then checks known error patterns, troubleshooting notes, and source code. Also updates the troubleshooting docs |
| Support monitoring | Pulls support tickets and reports the objective numbers and subjective quality metrics |
| Ticket filing | Turns a description into a Jira ticket, attaches the log, and assigns the likely owner |
| Service operations | Restarts services, pulls logs from cloud servers, etc. |
A few of these are worth a bit more detail on why they exist.
The marketing reminder matters a lot: we kept failing to publish on a regular schedule. The agent now tracks that and reminds us before a slot is missed. It also researches current news, suggests blog topics, and writes drafts that we can edit and approve.
The debugging agent also maintains the troubleshooting documentation. (I have given up on expecting people to keep docs updated.) The agent keeps an error patterns file and a troubleshooting doc. For any debugging request, it first identifies the error from the logs, then checks the error patterns, then the troubleshooting doc, and finally the source code (read-only). When it hits a new error, it goes back and updates the patterns file and the troubleshooting doc.
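The lookup order described above can be sketched in Python roughly like this. Everything here is a simplified assumption: the data shapes, the `diagnose` and `first_error_line` names, and the `search_source` callback are hypothetical stand-ins, and a real agent would reason over the files rather than do plain string matching.

```python
import re
from typing import Callable, Dict, Optional, Tuple


def first_error_line(log_text: str) -> Optional[str]:
    """Return the first log line that contains ERROR, or None."""
    for line in log_text.splitlines():
        if "ERROR" in line:
            return line.strip()
    return None


def diagnose(
    log_text: str,
    patterns: Dict[str, str],             # regex -> note (the patterns file)
    doc: Dict[str, str],                  # keyword -> advice (the troubleshooting doc)
    search_source: Callable[[str], str],  # read-only source lookup, assumed to exist
) -> Tuple[str, str]:
    """Check patterns, then the doc, then source; record anything new."""
    error = first_error_line(log_text)
    if error is None:
        return ("none", "No error line found in the log.")

    # Step 1: known error patterns.
    for regex, note in patterns.items():
        if re.search(regex, error):
            return ("patterns", note)

    # Step 2: troubleshooting doc, matched by keyword.
    finding = None
    for keyword, advice in doc.items():
        if keyword.lower() in error.lower():
            finding = ("doc", advice)
            break

    # Step 3: fall back to the (read-only) source code.
    if finding is None:
        finding = ("source", search_source(error))

    # New error: store it so the next lookup stops at step 1.
    # The literal line is used as the pattern here purely for simplicity.
    patterns[re.escape(error)] = finding[1]
    return finding
```

Running the same new error through twice shows the effect: the first call falls through to the source lookup, and the second is answered straight from the patterns.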
For support ops, because the agent reads the full support conversation, it can assess and report on the quality of that conversation, e.g. "Agent X is not communicating properly, see examples..." or "Client Y is upset, see this." This gives a much better picture of support operations than numbers like SLA and resolution times alone.
Why it runs on our own AWS account
We are comfortable giving an agent access to production logs, our marketing site, and our support portal because the machine it runs on limits what it can reach. On our own instance, we control the IAM permissions and the security group, and we can audit the logs. A hosted platform, or OpenClaw on your Mac mini, does not give you any of that. The architecture post goes through how this is set up.
What else I plan to write about
These are the areas I want to cover, in no particular order:
- Automating marketing and blog publishing with AI agents
- Automating creative and image work with AI agents
- Automating sales prospecting and proposals with AI agents
- Debugging production issues with AI agents
- The architecture: front end, hosting, Agent SDK, security
- Running AI agents on AWS instead of a hosted platform or Mac mini
- Monitoring customer support operations with AI agents
- Filing and routing Jira tickets with AI agents
- Limiting what an AI agent can access
- What broke, ideas we tried and dropped
None of these automations are large. I won't be sharing our own code or configurations; you can ask your AI agents to produce those. I'll publish the posts as I write them.