PLAN: Multi-Tenant Progressive Identity for EntraClaw
HISTORICAL: shipped (commit c8ec521). Kept for design rationale. See Engineering Status for current state.
Generated by /plan-ceo-review on 2026-04-09
Branch: feature/multi-tenant-lightweight-chat | Mode: HOLD SCOPE
Design doc: ~/.gstack/projects/entraclaw-identity-research/2026-04-09-feature-multi-tenant-lightweight-chat-design-progressive-identity.md
Spec: docs/architecture/NEXT-WhatsApp-lightweight-teams-chat.md
Problem Statement
EntraClaw currently requires per-tenant setup (certificates, Blueprint, Agent Identity, Agent User) before an agent can message in Teams. This makes onboarding slow and limits usage to pre-configured tenants.
Goal: Any user from any Entra ID tenant can start messaging through EntraClaw immediately using their own identity (delegated token), with automatic upgrade to a dedicated Agent User identity when provisioning infrastructure exists.
Architecture: Identity State Machine (Approach C)
UNAUTHENTICATED ──► DELEGATED ──► PROVISIONING ──► AGENT_USER
│ │ │ │
│ │ ▼ │
│ │ ERROR ◄────────────┘
│ │ │
│ ◄───────────────┘
│ (recovery)
▼
[auth required]
States:
- UNAUTHENTICATED — no token, auth required
- DELEGATED — human's token via MSAL, messages prefixed [EntraClaw]
- PROVISIONING — background Agent User creation in progress
- AGENT_USER — agent's own identity, full attribution
- ERROR — recovers to DELEGATED (never silent downgrade from AGENT_USER)
Key invariant: The asyncio.Lock protects STATE MUTATIONS ONLY (microsecond hold time). Auth and provisioning operations run OUTSIDE the lock. The 30s timeout is a deadlock safety net, not an operation timeout.
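The invariant above can be sketched roughly as follows. This is a minimal illustration, not the shipped `state_machine.py`: the class and exception names come from this plan, but the `ALLOWED` edge table is one reading of the diagram and the method signatures are assumptions. The lock wraps only the check-and-assign; any auth or provisioning work would run before `transition()` is called.

```python
import asyncio
from enum import Enum


class IdentityState(Enum):
    UNAUTHENTICATED = "unauthenticated"
    DELEGATED = "delegated"
    PROVISIONING = "provisioning"
    AGENT_USER = "agent_user"
    ERROR = "error"


class InvalidTransitionError(Exception):
    """Requested edge is not in the transition table."""


class TransitionTimeoutError(Exception):
    """Lock could not be acquired within the safety-net timeout."""


# Assumed edge table, read off the state diagram (not the shipped table).
ALLOWED = {
    IdentityState.UNAUTHENTICATED: {IdentityState.DELEGATED},
    IdentityState.DELEGATED: {IdentityState.PROVISIONING},
    IdentityState.PROVISIONING: {IdentityState.AGENT_USER, IdentityState.ERROR},
    IdentityState.AGENT_USER: {IdentityState.ERROR},
    IdentityState.ERROR: {IdentityState.DELEGATED},
}


class IdentityStateMachine:
    def __init__(self, lock_timeout: float = 30.0) -> None:
        self._state = IdentityState.UNAUTHENTICATED
        self._lock = asyncio.Lock()
        self._lock_timeout = lock_timeout  # deadlock safety net only

    @property
    def state(self) -> IdentityState:
        return self._state

    async def transition(self, target: IdentityState) -> None:
        """Mutate state under the lock; no I/O happens while it is held."""
        try:
            await asyncio.wait_for(self._lock.acquire(), timeout=self._lock_timeout)
        except asyncio.TimeoutError as exc:
            raise TransitionTimeoutError("state lock not acquired") from exc
        try:
            if target not in ALLOWED[self._state]:
                raise InvalidTransitionError(f"{self._state.name} -> {target.name}")
            self._state = target
        finally:
            self._lock.release()
```

Because an invalid request raises before the assignment, a rejected transition leaves the previous state intact.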
Implementation: Two PRs
PR #1: State Machine + Delegated Mode
Delivers: multi-tenant MSAL auth, identity state machine, delegated-mode messaging, setup.sh updates.
New files:
src/entraclaw/
auth/
delegated.py — MSAL localhost redirect + device code fallback
token_cache.py — keyring-backed MSAL SerializableTokenCache
identity/
__init__.py
state_machine.py — IdentityStateMachine (replaces _state dict)
Modified files:
src/entraclaw/
config.py — multi-tenant app fields (CLIENT_ID, SKIP_PROVISIONING)
errors.py — new exception classes (AuthTimeoutError, etc.)
models.py — IdentityState enum, attribution_type field on AuditEvent
mcp_server.py — uses state machine, MSAL init, identity-aware poll
tools/teams.py — [EntraClaw] prefix, sent-message dedup, empty validation
tools/audit.py — delegated-human attribution type
scripts/
setup.sh — multi-tenant app registration (signInAudience=AzureADMultipleOrgs)
AGENTS.md — delegated-human attribution exception documented
New test files:
tests/
auth/test_delegated.py — MSAL auth flows (localhost, device code, fallback)
auth/test_token_cache.py — keyring cache (load, write-after-acquire, corruption)
identity/test_state_machine.py — all transitions, invalid transitions, lock timeout
tools/test_teams_delegated.py — delegated-mode operations, prefix, dedup
test_mcp_server_integration.py — initialization, tool routing, background poll
PR #2: Background Provisioner + Agent User Upgrade
Delivers: async provisioner, three-hop flow integration, chat membership migration, token swap.
New files:
Modified files:
src/entraclaw/
identity/state_machine.py — PROVISIONING + AGENT_USER transitions
mcp_server.py — provisioner lifecycle, token swap grace period
tools/teams.py — identity-aware sender filtering during transition
New test files:
Decisions Made (19 total)
| # | Decision | Rationale |
|---|---|---|
| 1 | Progressive identity required | Management directive, non-negotiable |
| 2 | Localhost redirect primary, device code fallback after 10s | Security architect's flag; try localhost, fall back gracefully |
| 3 | Three-hop flow preserved as Agent User path | Existing working infrastructure |
| 4 | werner.ac for research, tenant-agnostic design | Config not code |
| 5 | MSAL for delegated, raw httpx for three-hop | Best tool for each job |
| 6 | Token swap self-validates on next tool call | Validate before discarding old token |
| 7 | [EntraClaw] prefix for research | Sufficient for research attribution |
| 8 | Provisioner client_credentials from dedicated app | Lazy load, clear after use |
| 9 | Single app for research, permission split for production | Document split |
| 10 | Sent-message tracking (bounded set, max 1000 FIFO) | Prevents echo reprocessing |
| 11 | State Machine First (Approach C) | Best testability, TDD-first |
| 12 | HOLD SCOPE mode | Maximum rigor |
| 13 | MSAL token cache backed by keyring | OS-level encryption sufficient |
| 14 | asyncio.Lock 30s timeout (state mutation only) | Deadlock safety, microsecond holds |
| 15 | GraphApiError + ChatNotFoundError rescue handlers | Section 2 gaps |
| 16 | Single progressive mode, no config flag | State machine handles all paths |
| 17 | ENTRACLAW_SKIP_PROVISIONING override | For testing Phase 1 in isolation |
| 18 | Two PRs (delegated first, provisioner second) | Smaller, independently valuable |
| 19 | Same Teams thread across identity swap | add_teams_member converts 1:1 to group |
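Decisions 7 and 10 (the [EntraClaw] prefix and the bounded sent-message set) could look something like the sketch below. The names `SentMessageTracker` and `with_prefix` are hypothetical; only the prefix text, the 1000-entry bound, and the FIFO eviction come from the table.

```python
from collections import OrderedDict

PREFIX = "[EntraClaw] "


class SentMessageTracker:
    """Bounded FIFO set of message IDs this process has sent."""

    def __init__(self, max_size: int = 1000) -> None:
        self._ids: "OrderedDict[str, None]" = OrderedDict()
        self._max_size = max_size

    def add(self, message_id: str) -> None:
        # Re-adding an existing ID keeps its original position (strict FIFO).
        self._ids[message_id] = None
        while len(self._ids) > self._max_size:
            self._ids.popitem(last=False)  # evict the oldest entry

    def __contains__(self, message_id: object) -> bool:
        return message_id in self._ids

    def __len__(self) -> int:
        return len(self._ids)


def with_prefix(body: str) -> str:
    """Reject empty messages and add the prefix exactly once."""
    if not body.strip():
        raise ValueError("message body must not be empty")
    return body if body.startswith(PREFIX) else PREFIX + body
```

The poll loop would then skip any incoming message whose ID is in the tracker, which is what prevents the agent from reprocessing its own echo.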
Security Requirements
Auth Flow
- Localhost redirect: Bind to 127.0.0.1 only (not 0.0.0.0)
- PKCE with S256: MSAL default, explicitly required
- Fallback detection: Try localhost first; if no browser is detected within 10s, fall back to device code
- Remote scenarios: SSH sessions, Codespaces, containers automatically get device code
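The fallback rule can be sketched independently of MSAL by injecting the two flows as coroutines. `acquire_token` and its parameters are hypothetical names for illustration. Note that MSAL Python's interactive call is synchronous, so a real wrapper would need to run it off the event loop; that wiring is left out here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

TokenResult = Dict[str, Any]


class AuthTimeoutError(Exception):
    """Raised by a flow that gave up waiting for the user."""


async def acquire_token(
    interactive: Callable[[], Awaitable[TokenResult]],
    device_code: Callable[[], Awaitable[TokenResult]],
    browser_timeout: float = 10.0,
) -> TokenResult:
    """Try the localhost redirect flow first, then fall back to device code."""
    try:
        return await asyncio.wait_for(interactive(), timeout=browser_timeout)
    except (asyncio.TimeoutError, AuthTimeoutError):
        # No browser callback arrived in time: SSH, Codespaces, containers, etc.
        return await device_code()
```

A failure in the device code flow is deliberately not caught here, so the caller can stay UNAUTHENTICATED and report the error.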
Delegated Mode Scoping
- Only allow operations on explicitly watched chats (no directory enumeration)
- Every action logged with attribution_type: "delegated-human"
- No privilege escalation: delegated token not cached beyond MSAL token cache
Provisioner Credential Isolation
- Lazy load: only when provisioning starts, not at MCP server startup
- Clear from memory after provisioning completes
- Research: embedded provisioner acceptable (single developer machine)
- Production: extract to standalone service (see TODOS.md)
Token Swap Validation
- Make one Graph API call with new token before committing transition
- Validate idtyp claim is user (not app)
- No silent downgrade: AGENT_USER failure → ERROR state, not DELEGATED
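The idtyp check might be implemented along these lines. This sketch only inspects the claim and does not verify the token signature, so it complements the Graph API test call rather than replacing it. The function names are hypothetical; `TokenValidationError` is the class named in the registry.

```python
import base64
import json
from typing import Any, Dict


class TokenValidationError(Exception):
    """New token failed pre-swap validation."""


def decode_claims(jwt: str) -> Dict[str, Any]:
    """Decode the JWT payload segment WITHOUT verifying the signature."""
    parts = jwt.split(".")
    if len(parts) != 3:
        raise TokenValidationError("token is not a three-part JWT")
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError as exc:  # covers bad base64, bad UTF-8, bad JSON
        raise TokenValidationError("token payload is not valid JSON") from exc
    if not isinstance(claims, dict):
        raise TokenValidationError("token payload is not a JSON object")
    return claims


def require_user_token(jwt: str) -> Dict[str, Any]:
    """Reject the swap unless the token carries idtyp == "user"."""
    claims = decode_claims(jwt)
    if claims.get("idtyp") != "user":
        raise TokenValidationError(f"unexpected idtyp: {claims.get('idtyp')!r}")
    return claims
```

On failure the caller keeps the current token and state, matching the "Reject swap, stay current state" rescue action.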
Token Cache
- Keyring provides OS-level encryption (Keychain/DPAPI/Secret Service)
- Load cache once at startup, keep in-memory
- Write back after every token acquisition
- Handle keyring entry size limits (split per-account if needed)
- Silent re-auth on corruption (graceful degradation)
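A rough sketch of the load-once and write-back rules, written against an injected store so it does not depend on a particular keyring backend. The `cache` argument is duck-typed to the `deserialize`, `serialize`, and `has_state_changed` members of MSAL's `SerializableTokenCache`; `load_cache`, `persist_if_changed`, and the `read`/`write` callables are hypothetical names. Per-account splitting for size limits is not shown.

```python
import json
from typing import Any, Callable, Optional


def load_cache(cache: Any, read: Callable[[], Optional[str]]) -> bool:
    """Load the persisted cache once at startup.

    Returns False when the entry is missing or corrupt, in which case the
    cache stays empty and the caller falls through to a fresh auth.
    """
    raw = read()
    if not raw:
        return False
    try:
        json.loads(raw)  # cheap corruption check before handing to the cache
    except ValueError:
        return False
    cache.deserialize(raw)
    return True


def persist_if_changed(cache: Any, write: Callable[[str], None]) -> bool:
    """Write back after a token acquisition, only if the cache changed."""
    if cache.has_state_changed:
        write(cache.serialize())
        return True
    return False
```

With the `keyring` package, `read` and `write` could be thin lambdas over `keyring.get_password` and `keyring.set_password`; the service and account names would be a project choice.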
Dependencies
- Pin msal version in pyproject.toml
- Verify no version conflicts with existing PyJWT/cryptography pins
Error & Rescue Registry
METHOD/CODEPATH | WHAT CAN GO WRONG | EXCEPTION CLASS
-----------------------------|--------------------------------|---------------------------
MsalAuth.acquire_interactive | Browser not opened in 10s | AuthTimeoutError
| User cancels consent | AuthCancelledError
| Port 8400 in use | PortConflictError
| MSAL returns error response | MsalAuthError
MsalAuth.acquire_device_code | User doesn't complete in time | AuthTimeoutError
| User cancels/denies | AuthCancelledError
KeyringTokenCache.load | Corrupted keyring entry | CredentialStoreError
| Keyring unavailable | CredentialStoreError
StateMachine.transition | Invalid state transition | InvalidTransitionError
| Lock timeout (30s deadlock) | TransitionTimeoutError
| Exception during transition | TransitionRollbackError
Provisioner.provision | Three-hop flow fails at any hop| ProvisioningError
| Agent User has no Teams license| LicenseUnavailableError
| Provisioning exceeds timeout | ProvisioningTimeoutError
Graph API (delegated) | 403 Forbidden | GraphApiError
| 429 Rate Limited | RateLimitError
| Chat not found (404) | ChatNotFoundError
Certificate auth (three-hop) | Certificate expired/invalid | CertificateError
Token exchange (three-hop) | FIC exchange fails | TokenExchangeError
Token validation | Wrong idtyp claim | TokenValidationError
Chat membership | Add member fails | ChatMembershipError
EXCEPTION CLASS | RESCUED? | RESCUE ACTION | USER SEES
-----------------------------|----------|----------------------------------|------------------
AuthTimeoutError | Y | Fall back to device code | "Opening device code flow..."
AuthCancelledError | Y | Stay UNAUTHENTICATED, log | "Auth cancelled"
MsalAuthError | Y | Log, stay UNAUTHENTICATED | "Auth failed: {detail}"
PortConflictError | Y | Try ports 8401-8410 | Nothing (transparent)
CredentialStoreError | Y | Clear cache, re-auth silently | Nothing (re-prompts if needed)
InvalidTransitionError | Y | Log, no state change | Tool returns error message
TransitionTimeoutError | Y | Log, transition → ERROR | "Identity system busy, retrying"
TransitionRollbackError | Y | Revert to previous state, log | Nothing (transparent)
ProvisioningError | Y | Stay DELEGATED, log, retry later | Nothing (stays in delegated)
LicenseUnavailableError | Y | Stay DELEGATED permanently | "Agent User requires Teams license"
ProvisioningTimeoutError | Y | Stay DELEGATED, log | Nothing (stays in delegated)
GraphApiError (403) | Y | Retry once, then user message | "Permission denied for this chat"
RateLimitError | Y | Backoff + retry (existing) | Nothing (transparent)
ChatNotFoundError | Y | Remove from watched_chats, log | "Chat no longer available"
CertificateError | Y | Stay DELEGATED, log | Nothing (stays in delegated)
TokenExchangeError | Y | Stay DELEGATED, log | Nothing (stays in delegated)
TokenValidationError | Y | Reject swap, stay current state | Nothing (stays in current mode)
ChatMembershipError | Y | Log, skip chat migration | "Could not add agent to chat"
CRITICAL GAPS: 0 (all rescued)
Failure Modes Registry
CODEPATH | FAILURE MODE | RESCUED | TEST | USER SEES | LOGGED
----------------------------|---------------------------|---------|------|----------------|-------
MSAL localhost auth | Port 8400 in use | Y | Y | Transparent | Y
MSAL localhost auth | No browser detected 10s | Y | Y | Device code | Y
MSAL device code | User cancels | Y | Y | "Auth cancelled"| Y
Keyring token cache | Corrupted entry | Y | Y | Silent re-auth | Y
Keyring token cache | Keyring unavailable | Y | Y | Auth re-prompt | Y
State transition | Concurrent transitions | Y | Y | Serialized | Y
State transition | Lock deadlock | Y | Y | Timeout + error| Y
Background provisioner | Hop 1 cert auth fails | Y | Y | Stays delegated| Y
Background provisioner | Hop 2 FIC exchange fails | Y | Y | Stays delegated| Y
Background provisioner | Hop 3 user_fic fails | Y | Y | Stays delegated| Y
Background provisioner | No Teams license | Y | Y | Stays delegated| Y
Token swap | New token invalid idtyp | Y | Y | Stays current | Y
Token swap | Mid-operation swap | Y | Y | Grace period | Y
Chat migration | add_member fails | Y | Y | Chat skipped | Y
Background poll | Identity transition mid-poll| Y | Y | Exclude both IDs| Y
Graph API (delegated) | 403 forbidden | Y | Y | Error message | Y
Graph API (delegated) | Chat deleted | Y | Y | Removed from watch| Y
Send message (delegated) | Empty string params | Y | Y | Validation error| Y
CRITICAL GAPS: 0
Data Flow Diagrams
Auth Flow (Phase 1)
User starts MCP server
│
▼
State: UNAUTHENTICATED
│
├─► Try localhost redirect (port 8400)
│ │
│ ├─ Success ──► MSAL callback ──► Token acquired ──► State: DELEGATED
│ │
│ └─ Fail (port busy / no browser / 10s timeout)
│ │
│ ▼
│ Device code flow
│ │
│ ├─ Success ──► Token acquired ──► State: DELEGATED
│ │
│ └─ Fail (cancelled / timeout) ──► State: UNAUTHENTICATED
│
▼
State: DELEGATED
│
├─► All tools available (human's permissions)
├─► Messages prefixed [EntraClaw]
├─► Audit: attribution_type = "delegated-human"
│
├─► If ENTRACLAW_PROVISIONER_CLIENT_ID set AND SKIP_PROVISIONING != true:
│ │
│ ▼
│ State: PROVISIONING (background)
│ │
│ ├─ Success ──► Validate token ──► Add to chats ──► State: AGENT_USER
│ │
│ └─ Fail ──► State: ERROR ──► Recovery ──► State: DELEGATED
│
└─► If no provisioner credentials: stays DELEGATED permanently
Token Swap Flow (Phase 2)
Provisioner completes
│
▼
Validate new Agent User token
│
├─► Graph API test call with new token
│ │
│ ├─ Success + idtyp=user ──► Proceed to swap
│ │
│ └─ Fail ──► Reject swap, stay DELEGATED, log error
│
▼
Add Agent User to watched_chats (chat membership migration)
│
├─► For each chat in watched_chats:
│ add_teams_member(chat_id, agent_user_email)
│ │
│ ├─ Success ──► Chat migrated
│ └─ Fail ──► Log, skip this chat
│
▼
Acquire asyncio.Lock (microsecond hold)
│
├─► Write new state: DELEGATED → AGENT_USER
├─► Update current token reference
├─► Update sender filter (exclude agent_user_id)
│
▼
Release lock
│
Grace period: in-flight operations complete with old token
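The chat membership migration step in the flow above might be written as a loop that skips failures instead of aborting. `migrate_chats` is a hypothetical helper, and `add_member` stands in for the plan's `add_teams_member`, whose real signature is not shown in this document.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Iterable, List, Tuple

log = logging.getLogger("identity.provisioner")


class ChatMembershipError(Exception):
    """Adding the Agent User to a chat failed."""


async def migrate_chats(
    watched_chats: Iterable[str],
    add_member: Callable[[str, str], Awaitable[None]],
    agent_user_email: str,
) -> Tuple[List[str], List[str]]:
    """Add the Agent User to every watched chat; log and skip failures."""
    migrated: List[str] = []
    skipped: List[str] = []
    for chat_id in watched_chats:
        try:
            await add_member(chat_id, agent_user_email)
        except ChatMembershipError as exc:
            log.warning("chat_migration_skipped chat_id=%s error=%s", chat_id, exc)
            skipped.append(chat_id)
        else:
            migrated.append(chat_id)
    return migrated, skipped
```

All of this runs before the lock is taken, so a slow Graph call for one chat never extends the lock hold time.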
Background Poll (Identity-Aware)
Every 5 seconds:
│
▼
Read current identity mode
│
├─ DELEGATED ──► Use human's delegated token
│ Filter: exclude human_user_id + sent_message_ids
│
├─ AGENT_USER ──► Use Agent User token
│ Filter: exclude agent_user_id + sent_message_ids
│
├─ TRANSITIONING ──► Exclude BOTH IDs for one cycle
│
└─ UNAUTHENTICATED / ERROR ──► Skip poll
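The branch table above reduces to a small pure function, which keeps the filtering logic testable without a running poll loop. The function names and the string-valued `mode` are assumptions for illustration; the mode labels are the ones used in the diagram.

```python
from typing import Collection, FrozenSet, Optional


def excluded_senders(
    mode: str, human_user_id: str, agent_user_id: Optional[str]
) -> Optional[FrozenSet[str]]:
    """Return the sender IDs to ignore this cycle, or None to skip the poll."""
    if mode == "DELEGATED":
        return frozenset({human_user_id})
    if mode == "AGENT_USER" and agent_user_id:
        return frozenset({agent_user_id})
    if mode == "TRANSITIONING" and agent_user_id:
        # During the swap, ignore both identities for one cycle.
        return frozenset({human_user_id, agent_user_id})
    return None  # UNAUTHENTICATED, ERROR, or missing agent ID: skip poll


def should_process(
    sender_id: str,
    message_id: str,
    excluded: Collection[str],
    sent_message_ids: Collection[str],
) -> bool:
    """A message is new work only if we did not send it ourselves."""
    return sender_id not in excluded and message_id not in sent_message_ids
```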
Module Dependency Graph
config.py ◄─────────────────────────────────────────────┐
│ │
▼ │
auth/ │
├── certificate.py (existing) │
├── delegated.py (NEW — MSAL) │
└── token_cache.py (NEW — keyring wrapper) │
│ │
▼ │
identity/ │
├── state_machine.py (NEW — core) ◄── errors.py │
└── provisioner.py (NEW — async) ──────────────────────┘
│ │
▼ ▼
tools/ mcp_server.py
├── teams.py (modified) (orchestrator)
├── audit.py (modified)
└── rate_limit.py (existing)
Observability
Structured Logging
# State transitions
INFO identity.state_machine: transition from=DELEGATED to=PROVISIONING trigger=provisioner_start
ERROR identity.state_machine: transition_failed from=X to=Y error=InvalidTransitionError
# Tool calls (every call)
INFO tools.teams: send_teams_message chat_id=19:xxx identity_mode=DELEGATED attribution=delegated-human
# Provisioner progress
INFO identity.provisioner: started
INFO identity.provisioner: hop1_complete
INFO identity.provisioner: hop3_complete idtyp=user duration_s=1.2
INFO identity.provisioner: chat_migration chats_migrated=3
# Auth events
INFO auth.delegated: localhost_redirect_started port=8400
WARN auth.delegated: localhost_failed falling_back=device_code reason=timeout
INFO auth.delegated: auth_complete method=localhost scopes=Chat.ReadWrite,User.Read
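Lines in this key=value shape can be produced with the standard `logging` module and a small formatting helper. `kv` is a hypothetical helper name; the logger names and field names are taken from the examples above. One difference to expect: Python's default level name is WARNING rather than WARN.

```python
import logging


def kv(event: str, **fields: object) -> str:
    """Render an event name followed by key=value pairs, in call order."""
    return " ".join([event, *(f"{key}={value}" for key, value in fields.items())])


logging.basicConfig(format="%(levelname)s %(name)s: %(message)s", level=logging.INFO)
log = logging.getLogger("identity.state_machine")

# "from" is a Python keyword, so it is passed through a dict.
log.info(kv("transition", **{"from": "DELEGATED", "to": "PROVISIONING"},
            trigger="provisioner_start"))
```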
What Already Exists
| Sub-problem | Existing code | Reused? |
|---|---|---|
| Three-hop token flow | tools/teams.py:acquire_agent_user_token() | Yes (PR #2) |
| Graph API send/read/create_chat | tools/teams.py (640 lines) | Yes, modified |
| Background poll | mcp_server.py:_background_poll() | Yes, modified |
| Token refresh | mcp_server.py:_ensure_valid_token()/_with_token_retry() | Yes, adapted |
| Certificate JWT assertion | auth/certificate.py | Yes (PR #2) |
| Keyring credential storage | platform/ modules | Yes (token cache backend) |
| Error hierarchy | errors.py | Yes, extended |
| Pydantic models | models.py | Yes, extended |
| Rate limit handling | tools/rate_limit.py | Yes, unchanged |
| Audit logging | tools/audit.py | Yes, modified |
| Config from env | config.py | Yes, extended |
NOT in Scope
| Item | Rationale |
|---|---|
| UI/frontend | CLI/MCP server only |
| Multi-tenant runtime (shared process) | Per-process model is correct; each session gets its own MCP server |
| Production provisioner service | Research uses embedded provisioner; service extraction is a TODO |
| IC3 federation | 12-month target, state machine enables it but not built yet |
| AppContainer sandbox | Separate workstream (existing TODO) |
| Admin consent automation for external tenants | Users handle consent; provisioning gracefully stays DELEGATED |
| HTTP stack unification (requests → httpx) | TODO, not blocking |
| Windows Certificate Store integration | Existing platform/ modules handle this; no changes needed |
Dream State Delta
CURRENT STATE THIS PLAN 12-MONTH IDEAL
──────────────────── ──────────────────── ────────────────────
Single-tenant only → Multi-tenant delegated → IC3 federation
Certificate required → MSAL auth (zero setup) → Passkey/FIDO2
Manual setup.sh → Scripted multi-tenant → Self-service portal
Agent User or nothing → Progressive identity → Always Agent User
One process model → One process (correct) → Shared service option
This plan moves ~60% toward the 12-month ideal. The state machine architecture is the key enabler for the remaining 40%.
Setup Prerequisites
setup.sh Updates (PR #1)
- Create multi-tenant app registration: az ad app create with signInAudience=AzureADMultipleOrgs
- Redirect URI: http://localhost:8400/callback
- API permissions: Chat.ReadWrite, User.Read (delegated)
- Write ENTRACLAW_CLIENT_ID to .env
Provisioner App (PR #2)
- Create provisioner app registration (application permissions for Agent Identity APIs)
- Generate client secret
- Write ENTRACLAW_PROVISIONER_CLIENT_ID and ENTRACLAW_PROVISIONER_CLIENT_SECRET to .env
.env Changes
# PR #1: Multi-tenant delegated auth
ENTRACLAW_CLIENT_ID=<multi-tenant app ID>
ENTRACLAW_TENANT_ID=<"common" for multi-tenant discovery>
ENTRACLAW_SKIP_PROVISIONING=false # set true to force delegated-only
# PR #2: Provisioner (optional)
ENTRACLAW_PROVISIONER_CLIENT_ID=<provisioner app ID>
ENTRACLAW_PROVISIONER_CLIENT_SECRET=<provisioner secret>
# Existing (unchanged)
ENTRACLAW_BLUEPRINT_APP_ID=...
ENTRACLAW_BLUEPRINT_CERT_THUMBPRINT=...
# ... all existing vars still work
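A minimal sketch of how these variables could be read and how the provisioning decision from the auth flow could be derived from them. `ProgressiveConfig` and its fields are hypothetical and are not the shipped `config.py`; only the environment variable names come from this plan. The provisioner secret is intentionally left out, in line with the lazy-load rule.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProgressiveConfig:
    client_id: str
    tenant_id: str = "common"
    skip_provisioning: bool = False
    provisioner_client_id: Optional[str] = None

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "ProgressiveConfig":
        """Build config from a mapping such as os.environ."""
        skip = env.get("ENTRACLAW_SKIP_PROVISIONING", "false").strip().lower()
        return cls(
            client_id=env["ENTRACLAW_CLIENT_ID"],  # required; KeyError if missing
            tenant_id=env.get("ENTRACLAW_TENANT_ID", "common"),
            skip_provisioning=(skip == "true"),
            provisioner_client_id=env.get("ENTRACLAW_PROVISIONER_CLIENT_ID") or None,
        )

    @property
    def should_provision(self) -> bool:
        """Provision only if a provisioner app is set and not overridden."""
        return self.provisioner_client_id is not None and not self.skip_provisioning
```

When `should_provision` is false, the session simply stays in DELEGATED, which is the "stays DELEGATED permanently" branch of the auth flow.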
Technical Debt (Accepted)
- Dual HTTP stacks (MSAL/requests + httpx) — acceptable for research, TODO to unify
- Embedded provisioner — acceptable for research, TODO to extract to service
- In-memory state only — state doesn't survive restart, MSAL cache in keyring enables silent re-auth
Codex Cross-Model Review Summary
First review (during /office-hours): 9 findings, 5 resolved via user decisions. Second review (post-sections): 8 findings, 6 tension points resolved:
| # | Finding | Resolution |
|---|---|---|
| 1 | Chat continuity hard problem | Same thread via add_teams_member (converts to group) |
| 2 | Provisioner credential trust boundary | Embedded for research, service for production (TODO) |
| 3 | Ship delegated-only first | Two PRs: PR #1 delegated, PR #2 provisioner |
| 4 | Attribution model violation | Already resolved: delegated-human audit type |
| 5 | Runtime not multi-tenant | Correct: per-process model, documented |
| 6 | 30s lock fake rigor | Clarified: lock covers state mutation only (microseconds) |
| 7 | Sponsor mode changes semantics | Already resolved: identity-aware filtering |
| 8 | mcp_server.py under-tested | Accepted: integration tests added to PR #1 |
Completion Summary
+====================================================================+
| MEGA PLAN REVIEW — COMPLETION SUMMARY |
+====================================================================+
| Mode selected | HOLD SCOPE |
| System Audit | 20 source files, 12 test files, ~1845 LOC |
| Step 0 | Approach C (State Machine First), HOLD SCOPE |
| Section 1 (Arch) | 1 issue found (lock timeout) |
| Section 2 (Errors) | 18 error paths mapped, 0 GAPS |
| Section 3 (Security)| 6 issues found, 0 High unresolved |
| Section 4 (Data/UX) | 4 edge cases mapped, 0 unhandled |
| Section 5 (Quality) | 2 issues found (module structure, _state) |
| Section 6 (Tests) | Diagram produced, 0 gaps |
| Section 7 (Perf) | 1 issue found (cache strategy) |
| Section 8 (Observ) | 1 gap found (state transition logging) |
| Section 9 (Deploy) | 2 issues found (mode selection, setup.sh) |
| Section 10 (Future) | Reversibility: 4/5, debt items: 3 |
| Section 11 (Design) | SKIPPED (no UI scope) |
+--------------------------------------------------------------------+
| NOT in scope | written (8 items) |
| What already exists | written (11 items mapped) |
| Dream state delta | written (~60% toward 12-month ideal) |
| Error/rescue registry| 18 methods, 0 CRITICAL GAPS |
| Failure modes | 18 total, 0 CRITICAL GAPS |
| TODOS.md updates | 3 items proposed, 3 accepted |
| Scope proposals | 0 proposed (HOLD SCOPE mode) |
| CEO plan | skipped (HOLD SCOPE) |
| Outside voice | ran (codex GPT-5.4), 8 findings, 6 resolved |
| Lake Score | 19/19 recommendations chose complete option |
| Diagrams produced | 4 (auth flow, token swap, bg poll, modules) |
| Stale diagrams found | 0 |
| Unresolved decisions | 0 |
+====================================================================+
GSTACK REVIEW REPORT
| Review | Trigger | Why | Runs | Status | Findings |
|---|---|---|---|---|---|
| CEO Review | /plan-ceo-review | Scope & strategy | 1 | CLEAR | mode: HOLD_SCOPE, 0 critical gaps |
| Codex Review | /codex review | Independent 2nd opinion | 2 | issues_found | 17 findings total (9 + 8), all resolved |
| Eng Review | /plan-eng-review | Architecture & tests (required) | 1 | CLEAR | 18 issues, 0 critical gaps, mode: SCOPE_REDUCED |
| Outside Voice | /codex review (eng) | Independent plan challenge | 3 | issues_found | 6 findings, 6 tension points, all resolved |
| Design Review | /plan-design-review | UI/UX gaps | 0 | - | - |
| DX Review | /plan-devex-review | Developer experience gaps | 0 | - | - |
CROSS-MODEL: Three Codex reviews (GPT-5.4) ran. First during /office-hours (9 findings), second post-CEO-sections (8 findings), third post-eng-review (6 findings). All tension points resolved with user. Key cross-model catches: audit attribution gap, _initialize() entanglement, restart dedup limitation.
ENG REVIEW AMENDMENTS: Step 0 scope reductions (drop token_cache.py + PortConflictError), 4 architecture decisions (1A-4A), 3 code quality decisions (5A-7A), 5 test gaps (8A-12A), 6 Codex tension resolutions (audit attribution, init split, prefix dedup, hard exit removal, live swap kept, IdentitySession added). PR #1 commit order: split init → remove hard exits → state machine + MSAL. 3 new TODOs added.
UNRESOLVED: 0 decisions pending.
VERDICT: CEO + ENG CLEARED — ready to implement.