Building an Autonomous Social Media Agent: Architecture, Challenges & Why MCP Matters
Kino started as a frustration. Solo creators were spending 2–3 hours a day on social media admin: writing captions, scheduling posts, responding to comments, checking analytics, and starting the cycle over. None of that is creative work. All of it is workflow. We built Kino to eliminate it.
This post is a technical deep-dive into how we built Kino, an autonomous AI agent that manages your entire social media presence. We'll cover the architecture, the mistakes we made in V1, why the Model Context Protocol (MCP) became the architectural backbone, and the one design decision we almost skipped, which would have burned everything down had we left it out.
What "Autonomous" Actually Means
Let's be precise about what Kino does, because "AI social media tool" covers everything from a caption generator to something genuinely autonomous.
Kino's agent loop:
1. Ideation: Given a creator profile (brand voice, niche, audience), generate a week's worth of content ideas ranked by predicted engagement.
2. Drafting: For approved ideas, generate platform-specific content: different formats for Instagram (carousel vs. Reel), X/Twitter (thread vs. single post), LinkedIn, TikTok.
3. Human approval gate: Creator reviews drafts, approves or revises. This is non-negotiable (more on why below).
4. Scheduling: Approved posts are scheduled at optimal times per platform, per audience timezone.
5. Posting: Agent executes the post, handling image attachment, caption formatting, hashtag insertion, and link-in-bio updates.
6. Analytics loop: Post-publish, agent pulls engagement metrics, feeds signal back into next cycle's ideation.
Steps 1, 2, 4, 5, and 6 run autonomously. Step 3 is human. That ratio is intentional.
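One way to make that split concrete is to model each post as a small state machine in which only the approval transition requires a human. This is an illustrative sketch, not Kino's actual code; the state names and the `advance` helper are assumptions.

```python
from enum import Enum


class PostState(Enum):
    IDEA = "idea"
    DRAFT = "draft"
    APPROVED = "approved"
    SCHEDULED = "scheduled"
    PUBLISHED = "published"
    MEASURED = "measured"


# Allowed transitions: state -> (next state, whether a human must trigger it).
TRANSITIONS = {
    PostState.IDEA: (PostState.DRAFT, False),           # step 2: drafting
    PostState.DRAFT: (PostState.APPROVED, True),        # step 3: human approval gate
    PostState.APPROVED: (PostState.SCHEDULED, False),   # step 4: scheduling
    PostState.SCHEDULED: (PostState.PUBLISHED, False),  # step 5: posting
    PostState.PUBLISHED: (PostState.MEASURED, False),   # step 6: analytics
}


def advance(state: PostState, actor: str) -> PostState:
    """Move a post one step forward; `actor` is "agent" or "human"."""
    if state not in TRANSITIONS:
        raise ValueError(f"{state.name} is a terminal state")
    next_state, needs_human = TRANSITIONS[state]
    if needs_human and actor != "human":
        raise PermissionError(
            f"{state.name} -> {next_state.name} requires human approval"
        )
    return next_state
```

The agent can call `advance` freely everywhere except on a draft, where the call raises unless a human is the actor.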
V1: The Naive Approach (And Why It Failed)
V1 was a single Python service. The agent was a big prompt, a loop, and direct API calls to each platform. Instagram Graph API here, Twitter API v2 there, LinkedIn API over there. The code looked like this:
# V1 agent loop (simplified)
def run_agent_cycle(creator_id):
    profile = db.get_creator_profile(creator_id)
    ideas = llm.generate_ideas(profile)
    for idea in ideas:
        content = {}
        content['instagram'] = llm.draft_instagram(idea, profile)
        content['twitter'] = llm.draft_twitter(idea, profile)
        content['linkedin'] = llm.draft_linkedin(idea, profile)
        # Schedule across platforms
        instagram_api.create_scheduled_post(content['instagram'])
        twitter_api.create_tweet(content['twitter'])
        linkedin_api.create_post(content['linkedin'])
This worked for the happy path. It fell apart in three ways:
1. Platform API fragility. Instagram's Graph API has different rate limits, media attachment flows, and error formats than Twitter's API v2, which has different authentication than LinkedIn's API. Every platform is a special case. V1 had 800 lines of platform-specific exception handling that grew every time any platform changed something.
2. No retry or state management. If a post failed mid-cycle (network timeout, rate limit, platform error), the cycle crashed. We'd end up with half a week's content posted and no record of what succeeded. Idempotency was an afterthought.
3. Context window collapse. Generating an entire week's content in a single LLM call produced content that felt like it came from the same prompt, because it did. Posts lacked variety. The agent didn't "remember" what it posted last week to avoid repetition. Context was stateless.
V1 was a proof of concept. It demonstrated that the loop was possible. It wasn't production software.
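The second failure is the easiest to illustrate. Below is a minimal sketch of idempotent publishing, assuming each post gets a deterministic key and its outcome is recorded before the cycle moves on. The names `idempotency_key` and `publish_once`, the in-memory `ledger`, and the `send` callable are hypothetical stand-ins for a database table and a platform client.

```python
import hashlib
from typing import Callable, Dict


def idempotency_key(creator_id: str, platform: str, content: str) -> str:
    """The same creator, platform, and content always map to the same key."""
    raw = f"{creator_id}|{platform}|{content}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def publish_once(ledger: Dict[str, str], key: str, send: Callable[[], str]) -> str:
    """Call `send` only if this key has not already succeeded.

    `ledger` maps key -> platform post id. In production this would likely be
    a database table with a unique constraint on the key (an assumption).
    """
    if key in ledger:
        return ledger[key]  # already posted: re-running the cycle is a no-op
    post_id = send()        # may raise; nothing is recorded on failure
    ledger[key] = post_id
    return post_id
```

With something like this in place, a cycle that crashes halfway can simply be re-run: posts that already succeeded are skipped, and the ledger is the record of what went out.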
V2: Redesigning Around MCP
The rebuild started with a different question: instead of "what should the agent do," we asked "what does the agent need to know and call to do it?"
The answer was a tool inventory:
- Read creator profile and brand guidelines
- Read post history (what's been posted, what performed)
- Read platform analytics
- Write a draft post (for human review)
- Publish an approved post to Instagram
- Publish an approved post to X/Twitter
- Publish an approved post to LinkedIn
- Update link-in-bio
- Read pending approvals
- Log cycle completion
That's ten tools. Each tool has a clear input/output contract. The LLM doesn't call platform APIs directly — it calls tools, and the tools handle the platform details.
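To show what an input/output contract can look like in practice, here is a small sketch of a tool registry that validates arguments before dispatching. It uses only the standard library rather than an MCP SDK, and the `Tool` class, `call_tool`, and the `write_draft` handler are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    name: str
    required: Dict[str, type]               # argument name -> expected type
    handler: Callable[..., Dict[str, Any]]  # does the real work


REGISTRY: Dict[str, Tool] = {}


def register(tool: Tool) -> None:
    REGISTRY[tool.name] = tool


def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Validate arguments against the tool's contract, then dispatch."""
    tool = REGISTRY.get(name)
    if tool is None:
        raise KeyError(f"unknown tool: {name}")
    for arg, expected in tool.required.items():
        if arg not in arguments:
            raise ValueError(f"{name}: missing argument '{arg}'")
        if not isinstance(arguments[arg], expected):
            raise TypeError(f"{name}: '{arg}' must be {expected.__name__}")
    return tool.handler(**arguments)


# Hypothetical handler standing in for the "write a draft post" tool.
def write_draft(creator_id: str, platform: str, body: str) -> Dict[str, Any]:
    return {"status": "pending_approval", "platform": platform, "chars": len(body)}


register(Tool("write_draft", {"creator_id": str, "platform": str, "body": str}, write_draft))
```

The model only ever sees tool names and argument shapes; everything behind `handler` can change without the model noticing.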
This is exactly what the Model Context Protocol (MCP) is designed for.
What Is MCP and Why Does It Matter?
MCP (Model Context Protocol) is an open standard for how AI models connect to external tools and data sources. Before MCP, every AI application built its own bespoke tool-calling layer. You'd define your tools as JSON schemas, wire up function routing, handle errors yourself, and repeat the whole thing for every new project.
MCP standardizes this. Tools are exposed by MCP servers. The application hosting the LLM runs an MCP client, which invokes tools on the model's behalf through a standard protocol. The tool servers handle the implementation.
Here's what a Kino tool call looks like with MCP, simplified to the tool name and its arguments:
{
  "tool": "publish_instagram_post",
  "arguments": {
    "creator_id": "cr_7f2a1b",
    "caption": "Five years ago I had 200 followers...",
    "media_url": "https://cdn.kino.app/media/abc123.jpg",
    "scheduled_at": "2026-04-29T14:00:00Z"
  }
}
The agent doesn't know or care that Instagram uses OAuth 2.0, that media must be uploaded to a container endpoint before scheduling, or that the caption has a 2,200-character limit. The MCP server handles it. The agent just calls the tool.
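For readers mapping this onto the protocol itself: MCP messages are JSON-RPC 2.0, and a tool invocation is a `tools/call` request whose `params` carry the tool `name` and its `arguments`. The sketch below wraps the call shown above in that envelope. The helper name `to_mcp_request` and the request id are illustrative, not part of Kino.

```python
import json
from typing import Any, Dict


def to_mcp_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Wrap a tool name and arguments in a JSON-RPC 2.0 `tools/call` request."""
    envelope = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(envelope)


request = to_mcp_request(
    1,
    "publish_instagram_post",
    {
        "creator_id": "cr_7f2a1b",
        "caption": "Five years ago I had 200 followers...",
        "media_url": "https://cdn.kino.app/media/abc123.jpg",
        "scheduled_at": "2026-04-29T14:00:00Z",
    },
)
```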
Why this matters:
Separation of concerns. The LLM handles reasoning, planning, and content generation. The MCP servers handle platform integration. When Instagram changes their API (and they will), you update one MCP server, not the agent prompt.
Composability. Adding TikTok support meant writing one new MCP server and adding three tools to the agent's tool list. The agent's prompt and loop didn't change at all. V1 would have required forking the entire agent.
Auditability. Every tool call is logged at the MCP layer. We have a complete record of what the agent did, in what order, with what inputs and outputs. Debugging a bad week of content means querying the tool call log, not reading LLM output streams.
Safety boundaries. The MCP servers enforce business rules that the LLM cannot override. The Instagram MCP server will not post to a creator account without a valid approval token. The approval token is only issued after a human approves the draft. The agent has no path around this — the tool simply fails if the approval is missing.
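A minimal sketch of how such an approval token could work, assuming an HMAC signature over the draft ID and the exact caption that was approved. The secret, the function names, and the token format are all assumptions for illustration; the post does not describe Kino's actual token scheme.

```python
import hashlib
import hmac

# Assumption: a server-side secret that the agent process never sees.
APPROVAL_SECRET = b"replace-with-a-real-secret"


def issue_approval_token(draft_id: str, caption: str) -> str:
    """Issued by the approval UI after a human approves this exact caption."""
    message = f"{draft_id}:{caption}".encode("utf-8")
    return hmac.new(APPROVAL_SECRET, message, hashlib.sha256).hexdigest()


def publish_post(draft_id: str, caption: str, approval_token: str) -> dict:
    """Publisher-side check: refuse to post without a valid token."""
    expected = issue_approval_token(draft_id, caption)
    # Constant-time comparison; both sides are encoded to bytes first.
    if not hmac.compare_digest(expected.encode("utf-8"), approval_token.encode("utf-8")):
        raise PermissionError("missing or invalid approval token")
    # The real platform call would go here.
    return {"status": "published", "draft_id": draft_id}
```

Because the token covers the caption, a draft that is edited after approval no longer matches and has to be approved again.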
The Human Approval Loop: Why You Can't Remove It
Early in development, we prototyped a fully autonomous mode: the agent generates content, schedules it, posts it, no human in the loop. It worked, technically. We turned it off.
Three reasons:
Brand voice drift. LLMs are stochastic. Run the same prompt 100 times and you get 100 variations. Most are fine. A few are subtly off-brand. One in a hundred is embarrassing. At 5 to 7 posts a week (roughly 260 to 365 a year), "one in a hundred" means an embarrassing post every 14 to 20 weeks, or about three a year, and that's before counting the subtly off-brand ones. All of them would be published to your audience without your knowledge.
Context the model doesn't have. Your grandmother just passed away. A news event made a planned post accidentally tone-deaf. You're launching something next Tuesday and the scheduled post undercuts the announcement. The model has no access to your calendar, your life, or the news. The human approval gate is where that context enters the pipeline.
Trust and control. Creators are protective of their brand. The value proposition isn't "hand over your account to an AI." It's "let AI do the grunt work, you approve the output." Removing the approval loop would save about 90 seconds per cycle and destroy the product's value proposition.
The approval UX matters. We made it as fast as possible: a mobile-first review screen, swipe to approve, tap to edit. The goal is 90 seconds per cycle, not zero seconds.
Technical Challenges We Didn't Anticipate
Platform API Rate Limits Are Savage
Instagram's Graph API allows 200 calls per user per hour. That sounds generous until you realize a single "post with media" involves 4-6 API calls: upload media to container, poll container status, schedule post, verify scheduling. 200 calls / 5 calls per post = 40 posts per hour maximum. For a creator posting 5x/day, that's fine. For a creator on a content sprint posting 20 pieces, you're managing queues and retry backoff.
We built a rate limit manager in the MCP server layer:
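// Fixed-window limiter: at most `callsPerHour` calls per one-hour window.
class RateLimitManager {
  constructor(platform, callsPerHour, windowMs = 60 * 60 * 1000) {
    this.platform = platform;
    this.limit = callsPerHour;
    this.windowMs = windowMs;
    this.callCount = 0;
    this.windowStart = Date.now();
  }

  // Resolves once a slot is free, and reserves it before returning so
  // concurrent callers cannot overshoot the limit.
  async waitForCapacity() {
    for (;;) {
      if (Date.now() - this.windowStart >= this.windowMs) {
        this.callCount = 0; // window expired: start a fresh one
        this.windowStart = Date.now();
      }
      if (this.callCount < this.limit) {
        this.callCount++;
        return;
      }
      // Window is full: sleep until it resets, then re-check.
      const waitMs = this.windowStart + this.windowMs - Date.now();
      await new Promise((resolve) => setTimeout(resolve, Math.max(waitMs, 0)));
    }
  }

  // Failed calls still consume platform quota, so the count is never decremented.
  async execute(fn) {
    await this.waitForCapacity();
    return fn();
  }
}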
Every platform API call in every MCP server goes through a rate limit manager. The agent never thinks about rate limits. The infrastructure handles it.
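The retry side of this can be sketched in the same spirit. Here is a hedged example of exponential backoff with full jitter, written in Python like the V1 loop. `RetryableError`, `with_backoff`, and the default delays are assumptions, not Kino's actual retry policy.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, HTTP 429/5xx)."""


def with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying RetryableError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))  # jitter avoids synchronized retries
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the policy can be tested without waiting. Errors that are not marked retryable (an invalid caption, a revoked token) propagate immediately.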
Content Quality Degrades with Context Length
The more context we give the agent (post history, analytics, brand guidelines), the better the content, up to a point. Past ~40K tokens, we observed a "context dilution" effect: the model starts averaging across too much information instead of weighting recent signals. For example, one creator's audience shifted from photographers to videographers over 6 months; the agent kept writing for photographers because the older posts dominated the context.
Solution: sliding window context. We inject the last 30 days of posts (high weight), last quarter (medium weight), and brand constants (always present). Older posts are summarized, not included verbatim.
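A rough sketch of that tiering follows. How the weights are realized is not specified above, so the choices here are assumptions: "medium weight" is approximated by truncating each post to an excerpt, the most recent posts are placed last in the prompt, and `summarize` is a caller-supplied function (in practice probably an LLM call).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass
class Post:
    published_at: datetime
    text: str


def build_context(
    posts: List[Post],
    brand_constants: str,
    now: datetime,
    summarize: Callable[[List[Post]], str],
) -> str:
    """Assemble prompt context in three recency tiers plus brand constants."""
    recent_cutoff = now - timedelta(days=30)
    quarter_cutoff = now - timedelta(days=90)

    recent = [p for p in posts if p.published_at >= recent_cutoff]
    quarter = [p for p in posts if quarter_cutoff <= p.published_at < recent_cutoff]
    older = [p for p in posts if p.published_at < quarter_cutoff]

    sections = [f"BRAND CONSTANTS\n{brand_constants}"]  # always present
    if older:
        # Old history is compressed, never included verbatim.
        sections.append(f"OLDER HISTORY (summary)\n{summarize(older)}")
    if quarter:
        # Assumption: medium weight is approximated by an 80-character excerpt.
        excerpts = "\n".join(p.text[:80] for p in quarter)
        sections.append(f"LAST QUARTER (excerpts)\n{excerpts}")
    if recent:
        # The most recent posts go last and in full.
        full = "\n".join(p.text for p in recent)
        sections.append(f"LAST 30 DAYS (verbatim)\n{full}")
    return "\n\n".join(sections)
```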
Hashtag and SEO Strategy Requires Real-Time Data
The agent generating hashtags from its training data alone produces stale recommendations. Hashtag #makercommunity had 2.3M posts in 2024; it has 12M in 2026. The effective reach changed. We integrated a trend-checking tool into the MCP server that queries real-time hashtag volume before finalizing recommendations.
Architecture Today (V2)
The current stack:
┌─────────────────────────────────────────┐
│            Kino Agent (LLM)             │
│    Reasoning / Planning / Generation    │
└───────────────┬─────────────────────────┘
                │ MCP tool calls
    ┌───────────┼──────────────┐
    ▼           ▼              ▼
┌────────┐  ┌────────┐  ┌──────────────┐
│Creator │  │Content │  │   Platform   │
│Profile │  │History │  │  Publishers  │
│MCP     │  │MCP     │  │ MCP Servers  │
└────────┘  └────────┘  └──────┬───────┘
                               │
                ┌──────────────┼──────────┐
                ▼              ▼          ▼
            Instagram       Twitter   LinkedIn
            Graph API       API v2       API
The agent lives in one place. Platform complexity lives in MCP servers. The database (Postgres) is the source of truth for all state. No platform API call happens without being logged.
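A sketch of what "nothing happens without being logged" can look like at the tool-server boundary. SQLite stands in for Postgres here so the example is self-contained, and the `tool_calls` schema and `logged_call` wrapper are assumptions rather than Kino's actual schema.

```python
import json
import sqlite3
import time
from typing import Any, Callable, Dict

# Assumption: SQLite stands in for Postgres so the sketch runs on its own.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE tool_calls ("
    "id INTEGER PRIMARY KEY, tool TEXT, arguments TEXT, "
    "result TEXT, error TEXT, started_at REAL)"
)


def logged_call(tool: str, handler: Callable[..., Any], arguments: Dict[str, Any]) -> Any:
    """Run a tool handler and record its inputs and outcome, success or failure."""
    started = time.time()
    result, error = None, None
    try:
        result = handler(**arguments)
        return result
    except Exception as exc:
        error = repr(exc)
        raise
    finally:
        # The row is written whether the handler returned or raised.
        db.execute(
            "INSERT INTO tool_calls (tool, arguments, result, error, started_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (tool, json.dumps(arguments), json.dumps(result, default=str), error, started),
        )
        db.commit()
```

Debugging a bad week then becomes a query such as `SELECT tool, arguments, error FROM tool_calls ORDER BY started_at`.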
Results
We're 29 days in. Kino is live at kino-7apg.polsia.app. Early creator feedback:
- Average time in approval flow: 87 seconds per weekly cycle
- Content approval rate on first draft: 73% (27% gets edited before approval)
- Platform API error rate: 0.3% (all handled by MCP server retry logic, invisible to creators)
We don't have growth metrics worth reporting yet — Day 29, zero paying users, building the distribution engine in parallel. But the agent works. The approval loop works. The MCP architecture is holding up.
What's Next
Three things on the roadmap:
Analytics feedback loop. Currently the agent reads analytics but doesn't use them to dynamically change content strategy mid-cycle. V3 will close this loop: if a carousel dramatically outperforms single images this week, the next week's content plan automatically shifts toward carousels.
Multi-account agents. Agencies managing 10-50 creator accounts need a different orchestration layer. Single-agent-per-creator doesn't scale; we're building a coordinator agent that spawns creator-specific subagents on demand.
Community response handling. Comments, DMs, story replies. The agent can already draft responses; we haven't shipped the approval UI for it yet.
Lessons for Other Agent Builders
If you're building an autonomous agent that touches external services:
- Use MCP. The protocol overhead is worth it. You get tool isolation, auditability, and the ability to swap implementations without touching the agent.
- Put humans in the loop at value boundaries. Not everywhere — that defeats the purpose. But at the point where the agent's output becomes irreversible (publishing, sending, paying), human approval is a feature, not a limitation.
- Log every tool call with inputs and outputs. You will debug a production incident using these logs. Make them queryable from day one.
- Rate limits are infrastructure problems, not agent problems. Don't leak platform constraints into the agent's reasoning layer. Handle them at the tool server level.
- Context quality matters more than context quantity. More history isn't always better. Design your context window intentionally.
Kino is free to try at kino-7apg.polsia.app/create. It takes 60 seconds to set up your first agent. If you're building something similar or have questions about the MCP implementation, reply in the comments — happy to go deeper on any of this.