MCP Messaging Server Patterns: Research
Date: 2026-04-06
Purpose: Inform the bidirectional Teams loop design with patterns from existing MCP servers that handle messaging (Slack, iMessage, Discord, Teams).
Key Finding: Nobody Polls at the MCP Layer
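Every production MCP messaging server studied that talks to a remote chat API uses stateless request-response: the LLM decides when to fetch messages. None of them maintains a background polling loop or tracks a "last seen message ID" between invocations; only the iMessage tools below do, and they watch a local message store. Our watch_teams_replies tool would be the first to do this against a remote API.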
The one exception is tolgasumer/discord-mcp (Go), which uses Discord's WebSocket gateway to push events via JSON-RPC notifications. Teams has no equivalent — Graph API is REST-only for messaging.
Servers Studied
Slack
| Server | Lang | Stars | Polling | Dedup | Token Refresh |
|---|---|---|---|---|---|
| Official (mcp.slack.com) | — | — | On-demand | None | OAuth 1hr, NO refresh token — known pain point |
| korotovsky/slack-mcp-server | Go | 9k+ | On-demand + "unreads" shortcut | Slack ts as cursor | Static env var tokens |
| jtalk22/slack-mcp-server | TS | — | On-demand | None | 4-layer fallback: env -> file -> Keychain -> Chrome extraction, mutex-locked refresh |
Key pattern — korotovsky's "unreads" shortcut: Single API call to ClientUserBoot returns all channels with LastRead/Latest metadata, then only fetches history for channels where Latest > LastRead. The Teams equivalent would be checking lastMessagePreview on chat objects before fetching full messages.
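A minimal sketch of how that Teams-side shortcut might look, assuming the chat list was retrieved with lastMessagePreview expanded and that each preview carries a createdDateTime field. The function name chats_needing_fetch and the last_read mapping (chat ID to the newest timestamp already processed) are hypothetical and not taken from any server studied.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2026-04-06T10:15:00Z'.

    Assumption: Python 3.11+ if the service returns fractional seconds
    with an unusual number of digits; older versions are stricter.
    """
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def chats_needing_fetch(chats: list[dict], last_read: dict[str, datetime]) -> list[str]:
    """Return IDs of chats whose latest message is newer than what we processed."""
    stale = []
    for chat in chats:
        preview = chat.get("lastMessagePreview")
        if not preview:
            continue  # empty chat: nothing to fetch
        latest = parse_ts(preview["createdDateTime"])
        seen = last_read.get(chat["id"])
        if seen is None or latest > seen:
            stale.append(chat["id"])
    return stale
```

Only the chats returned here would need a full message fetch, which keeps the per-poll request count proportional to activity rather than to the number of chats.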
Key lesson — token refresh is the #1 pain point: The official Slack server's 1-hour expiry without refresh tokens caused 18 re-authentications in 5 days. Our three-hop flow is even more complex — eager refresh is essential.
iMessage
| Server | Lang | Polling | Dedup |
|---|---|---|---|
| photon-hq/imessage-kit | TS | Timer polling (2s default) of SQLite DB | Map of ROWIDs + 1s timestamp overlap window |
| steipete/imsg | Swift | FSEvents + --since-rowid cursor | ROWID watermark (monotonic) |
| carterlasalle/mac_messages_mcp | Python | On-demand query | None (stateless) |
Key pattern — timestamp overlap + Map dedup (imessage-kit):
overlap = min(1s, pollInterval)
query messages WHERE created_at >= lastCheck - overlap
filter out IDs already in seen_map
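The pseudocode above could be made concrete roughly as follows. This is a sketch under stated assumptions, not imessage-kit's actual code: poll_once and fetch_since are hypothetical names, timestamps are plain float seconds, and each message is a dict with "id" and "created_at" keys.

```python
from typing import Callable, Iterable


def poll_once(
    fetch_since: Callable[[float], Iterable[dict]],
    last_check: float,
    seen: dict[str, float],
    poll_interval: float = 2.0,
) -> list[dict]:
    """Fetch messages with a small overlap window and drop ones already seen.

    fetch_since(t) is assumed to return messages with created_at >= t.
    seen maps message ID -> created_at and is updated in place.
    """
    overlap = min(1.0, poll_interval)
    fresh = []
    for msg in fetch_since(last_check - overlap):
        if msg["id"] in seen:
            continue  # re-delivered because of the overlap window
        seen[msg["id"]] = msg["created_at"]
        fresh.append(msg)
    return fresh
```

The overlap deliberately re-reads a short slice of history so that messages written just before the previous check are not missed; the seen map is what makes that re-read harmless.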
Key pattern — bounded seen-set cleanup: When the Map exceeds 10,000 entries, prune to last hour's records. Prevents memory leaks in long-running processes.
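A possible shape for that cleanup step, assuming the seen map stores message ID to timestamp (float seconds). The name prune_seen and its parameters are illustrative; the 10,000-entry and one-hour figures come from the description above.

```python
def prune_seen(
    seen: dict[str, float],
    now: float,
    max_entries: int = 10_000,
    max_age: float = 3600.0,
) -> None:
    """Once the map grows past max_entries, drop entries older than max_age."""
    if len(seen) <= max_entries:
        return  # below the threshold: keep everything
    cutoff = now - max_age
    # Collect keys first so the dict is not mutated while iterating.
    for msg_id in [k for k, ts in seen.items() if ts < cutoff]:
        del seen[msg_id]
```

Note that this only bounds memory if most entries age out; a burst of more than max_entries messages inside one hour would still be retained in full.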
Key pattern — ROWID cursor (imsg): Monotonic IDs beat timestamps. WHERE rowid > last_seen has no clock precision issues. Graph API message IDs aren't monotonic, but $deltaLink tokens serve the same purpose.
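To illustrate the watermark idea in isolation, here is a small sketch using Python's sqlite3 module. The table name message and its single text column are a simplified stand-in, not the real iMessage schema, and fetch_after is a hypothetical helper.

```python
import sqlite3


def fetch_after(conn: sqlite3.Connection, last_seen: int) -> tuple[list[tuple], int]:
    """Return rows with rowid > last_seen and the advanced watermark."""
    rows = conn.execute(
        "SELECT rowid, text FROM message WHERE rowid > ? ORDER BY rowid",
        (last_seen,),
    ).fetchall()
    # Advance the cursor only if something new arrived.
    new_cursor = rows[-1][0] if rows else last_seen
    return rows, new_cursor
```

Because the cursor is an integer comparison, there is no overlap window and no dedup set to maintain.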
Discord
| Server | Lang | Polling | Dedup | Rate Limiting |
|---|---|---|---|---|
| tolgasumer/discord-mcp | Go | WebSocket gateway events | Event-driven (no dedup needed) | 30 req/min config |
| barryyip0625/mcp-discord | TS | On-demand | None | discord.js built-in |
Key pattern — event streaming via JSON-RPC notifications: tolgasumer/discord-mcp pushes events proactively. Event types are individually configurable to control noise. Not available for Teams (no WebSocket gateway).
Key pattern — write rate caps: 10 messages/min global, 3/min per channel, 5s minimum between sends. Good safety model.
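One way such write caps might be enforced is a sliding-window limiter. The sketch below is not taken from any of the servers studied; SendLimiter and try_send are hypothetical names, and it assumes the 5-second minimum gap applies globally rather than per channel.

```python
import time
from collections import defaultdict, deque


class SendLimiter:
    """Sliding 60-second window: global cap, per-channel cap, minimum gap."""

    def __init__(self, global_per_min=10, channel_per_min=3, min_gap=5.0,
                 clock=time.monotonic):
        self.global_per_min = global_per_min
        self.channel_per_min = channel_per_min
        self.min_gap = min_gap
        self.clock = clock  # injectable so the limiter can be tested
        self._global = deque()
        self._by_channel = defaultdict(deque)
        self._last_send = None

    def try_send(self, channel: str) -> bool:
        """Return True and record the send if all three limits allow it."""
        now = self.clock()
        window_start = now - 60.0
        for q in (self._global, self._by_channel[channel]):
            while q and q[0] <= window_start:
                q.popleft()  # forget sends that left the window
        if self._last_send is not None and now - self._last_send < self.min_gap:
            return False
        if len(self._global) >= self.global_per_min:
            return False
        if len(self._by_channel[channel]) >= self.channel_per_min:
            return False
        self._global.append(now)
        self._by_channel[channel].append(now)
        self._last_send = now
        return True
```

A caller would check try_send(channel) before each outbound message and surface a "rate limited" result to the LLM instead of sending when it returns False.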
Teams
| Server | Lang | Polling | Token Refresh |
|---|---|---|---|
| floriscornel/teams-mcp | TS | On-demand | MSAL auto-refresh with file cache |
| InditexTech/mcp-teams-server | Python | On-demand | Client credentials re-request |
Key findings from floriscornel/teams-mcp:
- Graph API for chat messages only supports descending datetime order; ascending returns an error
- $filter is unreliable for chat messages — must sort/filter client-side
- HTML-to-Markdown conversion needed (Graph returns HTML)
- MSAL ICachePlugin handles token persistence: read-on-demand, write-on-change
- 100-page safety cap when fetchAll is true
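The pagination cap and client-side ordering from these findings can be sketched together. This is an illustration, not floriscornel/teams-mcp's code: fetch_all_messages and get_page are hypothetical, with get_page(None) assumed to return the first page and get_page(link) the page behind an @odata.nextLink value.

```python
from datetime import datetime
from typing import Callable, Optional


def _created(msg: dict) -> datetime:
    # Assumption: createdDateTime is ISO 8601 UTC ending in 'Z'.
    # Python 3.11+ is the safest choice for unusual fractional-second widths.
    return datetime.fromisoformat(msg["createdDateTime"].replace("Z", "+00:00"))


def fetch_all_messages(
    get_page: Callable[[Optional[str]], dict],
    max_pages: int = 100,
) -> list[dict]:
    """Follow @odata.nextLink up to max_pages, then sort oldest-first locally."""
    messages: list[dict] = []
    next_link: Optional[str] = None
    for _ in range(max_pages):
        page = get_page(next_link)
        messages.extend(page.get("value", []))
        next_link = page.get("@odata.nextLink")
        if not next_link:
            break  # last page, or the service stopped paginating
    # The API only returns descending order, so ascending is done here.
    messages.sort(key=_created)
    return messages
```

Parsing the timestamps before sorting avoids relying on string comparison, which can misorder values that differ only in fractional-second formatting.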
Graph API Pitfalls for Teams Polling
These are critical for our implementation:
- Chat message endpoints don't support $orderby or $filter reliably. Sort and filter client-side after retrieval.
- Delta query for chat messages has ~8 month lookback limit. Older history requires full pagination.
- Pagination can be cut off: the API may stop returning @odata.nextLink to preserve service stability.
- Throttling is inconsistent. HTTP 429 can occur without warning. Not all endpoints return Retry-After headers. Always implement exponential backoff as fallback.
- Webhooks require a public HTTPS endpoint, a non-starter for local MCP servers. Polling + delta query is the only option.
- Delta queries return unexpected change types (deleted items, read-state changes) that don't match your original filter. Must handle @removed entries.
- Microsoft recommends polling the x-ms-throttle-limit-percentage response header to detect approaching rate limits before hitting 429.
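The throttling advice in this list could be sketched as two small helpers. Both names (retry_delay, near_throttle_limit) are hypothetical. The sketch assumes Retry-After is given in seconds when present, and that x-ms-throttle-limit-percentage is a decimal fraction of the limit; the 0.8 threshold is an assumed default to tune, not a documented constant of this design.

```python
import random
from typing import Callable, Optional


def _header(headers: dict, name: str) -> Optional[str]:
    # HTTP header names are case-insensitive; plain dicts are not.
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get(name.lower())


def retry_delay(headers: dict, attempt: int, base: float = 1.0,
                cap: float = 60.0,
                rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait after a 429: Retry-After if usable, else backoff + jitter."""
    retry_after = _header(headers, "Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a number of seconds: fall through to backoff
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rng()  # "full jitter": uniform in [0, ceiling)


def near_throttle_limit(headers: dict, threshold: float = 0.8) -> bool:
    """True if the throttle-percentage header reports usage at or above threshold."""
    value = _header(headers, "x-ms-throttle-limit-percentage")
    try:
        return value is not None and float(value) >= threshold
    except ValueError:
        return False
```

A polling loop might lengthen its interval whenever near_throttle_limit returns True, and sleep for retry_delay(...) seconds before retrying after a 429.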
MCP Protocol Patterns for Long-Running Operations
From the MCP spec and community:
- The "two-tool pattern" is canonical: start_X() returns job ID, check_X(job_id) polls status. Most production MCP servers use this today.
- The Tasks primitive (experimental, spec 2025-11-25) formalizes this: tools/call with task field returns taskId + pollInterval, client calls tasks/get to check status. Not yet broadly supported by clients.
- A single blocking tool that polls internally also works for Claude Code's stdio transport. This is simpler but blocks the LLM while polling.
- Server-guided poll intervals: Include recommended wait time in tool output so the LLM can pace itself.
- Rate limiting is critical: Unthrottled MCP servers can generate 1,000+ API calls/minute from retry loops.
- Resource subscriptions exist in the spec but Claude Desktop doesn't support them. Polling is the pragmatic choice.
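The two-tool pattern with a server-guided poll interval might look roughly like this. The sketch uses plain functions and an in-memory registry as stand-ins for MCP tools; start_job, check_job, and the result field names are hypothetical, and registering them with a particular MCP SDK is left out.

```python
import threading
import uuid
from typing import Callable

_jobs: dict[str, dict] = {}
_lock = threading.Lock()


def start_job(work: Callable[[], object], poll_interval_s: int = 5) -> dict:
    """Start work in the background and return a job ID plus a pacing hint."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "running", "result": None}

    def run() -> None:
        try:
            result, status = work(), "done"
        except Exception as exc:  # report failures instead of losing them
            result, status = str(exc), "failed"
        with _lock:
            _jobs[job_id] = {"status": status, "result": result}

    threading.Thread(target=run, daemon=True).start()
    return {"job_id": job_id, "poll_interval_s": poll_interval_s}


def check_job(job_id: str, poll_interval_s: int = 5) -> dict:
    """Return job status; while running, include how long to wait before re-checking."""
    with _lock:
        job = _jobs.get(job_id)
    if job is None:
        return {"status": "unknown"}
    out = dict(job)
    if job["status"] == "running":
        out["retry_after_s"] = poll_interval_s
    return out
```

Returning the suggested wait in both tool outputs gives the LLM a concrete pacing signal, which addresses the retry-loop concern from the rate-limiting bullet.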
Auth Lifecycle Comparison
| Platform | Token Type | Expiry | Refresh Strategy |
|---|---|---|---|
| Discord | Bot token | Never | None needed |
| Slack (official) | OAuth | 1hr | None (broken — re-auth via browser) |
| Slack (jtalk22) | Session | Variable | Mutex-locked 4-layer fallback |
| Teams (floriscornel) | OAuth delegated | ~1hr access, ~90d refresh | MSAL auto-refresh with file cache |
| Teams (InditexTech) | Client credentials | ~1hr | Re-request on expiry |
| Teams (Entraclaw) | OBO chain (3 hops) | ~1hr per hop | Must refresh each hop independently |
Our three-hop flow is the most complex token lifecycle of any MCP messaging server studied. Eager refresh (55-min threshold) + lazy retry (catch 401) is the right strategy, confirmed by the pain points seen across all implementations.
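A sketch of the eager-plus-lazy strategy for a single hop, under stated assumptions: HopToken, Unauthorized, and call_with_retry are hypothetical names, acquire stands in for whatever actually obtains the token, and the request callable is assumed to raise Unauthorized on HTTP 401.

```python
import time
from typing import Callable, Optional


class Unauthorized(Exception):
    """Hypothetical: raised by the request callable on HTTP 401."""


class HopToken:
    """Caches one hop's token and refreshes it eagerly after refresh_after seconds."""

    def __init__(self, acquire: Callable[[], str],
                 refresh_after: float = 55 * 60,
                 clock: Callable[[], float] = time.monotonic):
        self._acquire = acquire
        self._refresh_after = refresh_after
        self._clock = clock
        self._token: Optional[str] = None
        self._issued_at = 0.0

    def get(self, force: bool = False) -> str:
        now = self._clock()
        expired = self._token is None or now - self._issued_at >= self._refresh_after
        if force or expired:
            self._token = self._acquire()  # eager refresh before the ~1hr expiry
            self._issued_at = now
        return self._token


def call_with_retry(token: HopToken, request: Callable[[str], object]) -> object:
    """Lazy retry: on a 401, force one refresh and try exactly once more."""
    try:
        return request(token.get())
    except Unauthorized:
        return request(token.get(force=True))
```

For a multi-hop chain, each hop could get its own HopToken whose acquire function calls get() on the previous hop, so every hop is refreshed independently.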
Design Implications for Entraclaw
Changes from original design
- Use delta queries instead of raw timestamp polling. Graph API's /chats/{id}/messages/delta returns a $deltaLink token that acts like imsg's --since-rowid: monotonic, with no clock precision issues. Store the delta token as cursor instead of last_seen_timestamp.
- Add timestamp overlap as fallback. If delta query fails or isn't available, fall back to timestamp-based polling with 1-second overlap + message ID dedup set (imessage-kit pattern).
- Bounded seen-set with cleanup. Cap at 1,000 message IDs (our volume is much lower than iMessage), prune on threshold.
- Client-side filtering is mandatory. Don't trust Graph API $filter for chat messages. Always filter human-vs-agent messages in Python after retrieval.
- Exponential backoff on 429. Check Retry-After header first, fall back to exponential backoff with jitter. Monitor x-ms-throttle-limit-percentage header.
- Handle @removed entries from delta queries. Don't crash on deleted messages or read-state changes in delta responses.
- Single-instance guard. Log a warning if concurrent polling is detected (can't enforce, but can detect).
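Several of these points (delta cursor, @removed handling, client-side human-vs-agent filtering) meet in the code that processes one delta page. The following is a sketch under assumptions, not Entraclaw's implementation: apply_delta_page and the state dict are hypothetical, the sender is assumed to live at from.user.id, and the cursor is assumed to arrive in the response's @odata.deltaLink property (or @odata.nextLink while more pages remain).

```python
def apply_delta_page(page: dict, state: dict, agent_user_id: str) -> list[dict]:
    """Process one delta response page and return new human-authored messages.

    state is {"cursor": str | None, "seen_ids": set[str]} and is updated in place.
    """
    human = []
    for item in page.get("value", []):
        if "@removed" in item:
            continue  # deleted message: nothing to deliver, and not an error
        msg_id = item.get("id")
        if msg_id is None or msg_id in state["seen_ids"]:
            # Note: this also drops edits to already-seen messages.
            continue
        state["seen_ids"].add(msg_id)
        sender = ((item.get("from") or {}).get("user") or {}).get("id")
        if sender is None or sender == agent_user_id:
            continue  # system event, or the agent's own message
        human.append(item)
    # Keep the old cursor if the page carries no link at all.
    state["cursor"] = (
        page.get("@odata.deltaLink")
        or page.get("@odata.nextLink")
        or state.get("cursor")
    )
    return human
```

Persisting state["cursor"] between invocations is what replaces last_seen_timestamp; the seen_ids set remains useful as a guard if the timestamp-overlap fallback ever takes over.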