PLAN: Multi-Tenant Progressive Identity for EntraClaw

HISTORICAL — shipped (commit c8ec521). Kept for design rationale. See Engineering Status for current state.

Generated by /plan-ceo-review on 2026-04-09
Branch: feature/multi-tenant-lightweight-chat | Mode: HOLD SCOPE
Design doc: ~/.gstack/projects/entraclaw-identity-research/2026-04-09-feature-multi-tenant-lightweight-chat-design-progressive-identity.md
Spec: docs/architecture/NEXT-WhatsApp-lightweight-teams-chat.md

Problem Statement

EntraClaw currently requires per-tenant setup (certificates, Blueprint, Agent Identity, Agent User) before an agent can message in Teams. This makes onboarding slow and limits usage to pre-configured tenants.

Goal: Any user from any Entra ID tenant can start messaging through EntraClaw immediately using their own identity (delegated token), with automatic upgrade to a dedicated Agent User identity when provisioning infrastructure exists.

Architecture: Identity State Machine (Approach C)

  UNAUTHENTICATED ──► DELEGATED ──► PROVISIONING ──► AGENT_USER
        │                │                │               │
        │                │                ▼               │
        │                │             ERROR ◄────────────┘
        │                │               │
        │                ◄───────────────┘
        │                    (recovery)
        ▼
      [auth required]

States:
- UNAUTHENTICATED: no token, auth required
- DELEGATED: human's token via MSAL, messages prefixed [EntraClaw]
- PROVISIONING: background Agent User creation in progress
- AGENT_USER: agent's own identity, full attribution
- ERROR: recovers to DELEGATED (never a silent downgrade from AGENT_USER)

Key invariant: The asyncio.Lock protects STATE MUTATIONS ONLY (microsecond hold time). Auth and provisioning operations run OUTSIDE the lock. The 30s timeout is a deadlock safety net, not an operation timeout.
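The invariant above can be sketched as follows. This is a minimal illustration, not the shipped code: the class and exception names follow the file list and error registry in this plan, the edge table is read off the diagram above, and the method shape is an assumption.

```python
import asyncio
from enum import Enum


class IdentityState(Enum):
    UNAUTHENTICATED = "unauthenticated"
    DELEGATED = "delegated"
    PROVISIONING = "provisioning"
    AGENT_USER = "agent_user"
    ERROR = "error"


# Allowed edges as drawn in the state diagram (assumed to be exhaustive).
ALLOWED = {
    IdentityState.UNAUTHENTICATED: {IdentityState.DELEGATED},
    IdentityState.DELEGATED: {IdentityState.PROVISIONING},
    IdentityState.PROVISIONING: {IdentityState.AGENT_USER, IdentityState.ERROR},
    IdentityState.AGENT_USER: {IdentityState.ERROR},
    IdentityState.ERROR: {IdentityState.DELEGATED},
}


class InvalidTransitionError(Exception):
    pass


class TransitionTimeoutError(Exception):
    pass


class IdentityStateMachine:
    def __init__(self, lock_timeout_s: float = 30.0) -> None:
        self._state = IdentityState.UNAUTHENTICATED
        self._lock = asyncio.Lock()
        self._lock_timeout_s = lock_timeout_s

    @property
    def state(self) -> IdentityState:
        return self._state

    async def transition(self, target: IdentityState) -> None:
        # The timeout only bounds waiting for the lock (deadlock safety net).
        try:
            await asyncio.wait_for(self._lock.acquire(), timeout=self._lock_timeout_s)
        except asyncio.TimeoutError as exc:
            raise TransitionTimeoutError(f"lock not acquired in {self._lock_timeout_s}s") from exc
        try:
            # Only the state mutation happens under the lock; no I/O here.
            if target not in ALLOWED[self._state]:
                raise InvalidTransitionError(f"{self._state.name} -> {target.name}")
            self._state = target
        finally:
            self._lock.release()
```

Callers would run auth or provisioning first, outside the lock, and call `transition()` only once the result is known, so the lock is held just long enough to check and write the state.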

Implementation: Two PRs

PR #1: State Machine + Delegated Mode

Delivers: multi-tenant MSAL auth, identity state machine, delegated-mode messaging, setup.sh updates.

New files:

src/entraclaw/
  auth/
    delegated.py          — MSAL localhost redirect + device code fallback
    token_cache.py        — keyring-backed MSAL SerializableTokenCache
  identity/
    __init__.py
    state_machine.py      — IdentityStateMachine (replaces _state dict)

Modified files:

src/entraclaw/
  config.py               — multi-tenant app fields (CLIENT_ID, SKIP_PROVISIONING)
  errors.py               — new exception classes (AuthTimeoutError, etc.)
  models.py               — IdentityState enum, attribution_type field on AuditEvent
  mcp_server.py           — uses state machine, MSAL init, identity-aware poll
  tools/teams.py          — [EntraClaw] prefix, sent-message dedup, empty validation
  tools/audit.py          — delegated-human attribution type
scripts/
  setup.sh                — multi-tenant app registration (signInAudience=AzureADMultipleOrgs)
AGENTS.md                 — delegated-human attribution exception documented

New test files:

tests/
  auth/test_delegated.py       — MSAL auth flows (localhost, device code, fallback)
  auth/test_token_cache.py     — keyring cache (load, write-after-acquire, corruption)
  identity/test_state_machine.py — all transitions, invalid transitions, lock timeout
  tools/test_teams_delegated.py  — delegated-mode operations, prefix, dedup
  test_mcp_server_integration.py — initialization, tool routing, background poll

PR #2: Background Provisioner + Agent User Upgrade

Delivers: async provisioner, three-hop flow integration, chat membership migration, token swap.

New files:

src/entraclaw/
  identity/
    provisioner.py        — async Agent User provisioner (rewrite of setup scripts)

Modified files:

src/entraclaw/
  identity/state_machine.py — PROVISIONING + AGENT_USER transitions
  mcp_server.py             — provisioner lifecycle, token swap grace period
  tools/teams.py            — identity-aware sender filtering during transition

New test files:

tests/
  identity/test_provisioner.py — async provisioning, failure recovery, chat migration

Decisions Made (19 total)

#  | Decision                                                   | Rationale
---|------------------------------------------------------------|--------------------------------------------------------------
1  | Progressive identity required                              | Management directive, non-negotiable
2  | Localhost redirect primary, device code fallback after 10s | Security architect's flag; try localhost, fall back gracefully
3  | Three-hop flow preserved as Agent User path                | Existing working infrastructure
4  | werner.ac for research, tenant-agnostic design             | Config not code
5  | MSAL for delegated, raw httpx for three-hop                | Best tool for each job
6  | Token swap self-validates on next tool call                | Validate before discarding old token
7  | [EntraClaw] prefix for research                            | Sufficient for research attribution
8  | Provisioner client_credentials from dedicated app          | Lazy load, clear after use
9  | Single app for research, permission split for production   | Document split
10 | Sent-message tracking (bounded set, max 1000 FIFO)         | Prevents echo reprocessing
11 | State Machine First (Approach C)                           | Best testability, TDD-first
12 | HOLD SCOPE mode                                            | Maximum rigor
13 | MSAL token cache backed by keyring                         | OS-level encryption sufficient
14 | asyncio.Lock 30s timeout (state mutation only)             | Deadlock safety, microsecond holds
15 | GraphApiError + ChatNotFoundError rescue handlers          | Section 2 gaps
16 | Single progressive mode, no config flag                    | State machine handles all paths
17 | ENTRACLAW_SKIP_PROVISIONING override                       | For testing Phase 1 in isolation
18 | Two PRs (delegated first, provisioner second)              | Smaller, independently valuable
19 | Same Teams thread across identity swap                     | add_teams_member converts 1:1 to group
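Decision 10 (bounded sent-message set, max 1000, FIFO eviction) could look roughly like the sketch below. The class name `SentMessageTracker` and its methods are hypothetical; only the bound and the eviction order come from the table.

```python
from collections import OrderedDict


class SentMessageTracker:
    """Remembers the most recent message IDs we sent, evicting the oldest first."""

    def __init__(self, max_size: int = 1000) -> None:
        self._ids: "OrderedDict[str, None]" = OrderedDict()
        self._max_size = max_size

    def add(self, message_id: str) -> None:
        if message_id in self._ids:
            return  # already tracked; keep its original position
        self._ids[message_id] = None
        if len(self._ids) > self._max_size:
            self._ids.popitem(last=False)  # drop the oldest entry (FIFO)

    def __contains__(self, message_id: object) -> bool:
        return message_id in self._ids

    def __len__(self) -> int:
        return len(self._ids)
```

The poll loop would skip any incoming message whose ID is in the tracker, which is what prevents the agent from reprocessing its own sends while it still shares an identity with the human.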

Security Requirements

Auth Flow

Localhost redirect: Bind to 127.0.0.1 only (not 0.0.0.0)
PKCE with S256: MSAL default, explicitly required
Fallback detection: Try localhost first; if no browser is detected within 10s, fall back to device code
Remote scenarios: SSH sessions, Codespaces, containers automatically get device code
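The fallback rule can be sketched independently of MSAL. In the sketch below, `interactive` and `device_code` are hypothetical async callables standing in for the two MSAL flows; the function name and return shape are assumptions for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple

Token = Dict[str, Any]


async def acquire_with_fallback(
    interactive: Callable[[], Awaitable[Token]],
    device_code: Callable[[], Awaitable[Token]],
    timeout_s: float = 10.0,
) -> Tuple[str, Token]:
    """Try the localhost redirect first; fall back to device code if it stalls or cannot bind."""
    try:
        token = await asyncio.wait_for(interactive(), timeout=timeout_s)
        return "localhost", token
    except (asyncio.TimeoutError, OSError):
        # Timeout: nobody completed the browser redirect in time.
        # OSError: the loopback listener could not be started (for example, port busy).
        token = await device_code()
        return "device_code", token
```

Returning the method name alongside the token makes it easy to emit the `auth_complete method=...` log line without the caller re-deriving which path succeeded.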

Delegated Mode Scoping

Only allow operations on explicitly watched chats (no directory enumeration)
Every action logged with attribution_type: "delegated-human"
No privilege escalation: delegated token not cached beyond MSAL token cache

Provisioner Credential Isolation

Lazy load: only when provisioning starts, not at MCP server startup
Clear from memory after provisioning completes
Research: embedded provisioner acceptable (single developer machine)
Production: extract to standalone service (see TODOS.md)

Token Swap Validation

Make one Graph API call with new token before committing transition
Validate idtyp claim is user (not app)
No silent downgrade: AGENT_USER failure → ERROR state, not DELEGATED
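A minimal sketch of the idtyp check, assuming the new token is a JWT access token. It only reads the claims without verifying the signature, so it complements rather than replaces the Graph API test call; the helper names are hypothetical.

```python
import base64
import json
from typing import Any, Dict


class TokenValidationError(Exception):
    pass


def read_unverified_claims(jwt_token: str) -> Dict[str, Any]:
    """Decode the JWT payload segment. No signature check is performed here."""
    parts = jwt_token.split(".")
    if len(parts) != 3:
        raise TokenValidationError("token is not a three-part JWT")
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore stripped base64 padding
    try:
        return json.loads(base64.urlsafe_b64decode(payload))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        raise TokenValidationError("token payload is unreadable") from exc


def require_user_token(jwt_token: str) -> Dict[str, Any]:
    """Reject the swap unless the token identifies a user rather than an app."""
    claims = read_unverified_claims(jwt_token)
    if claims.get("idtyp") != "user":
        raise TokenValidationError(f"unexpected idtyp: {claims.get('idtyp')!r}")
    return claims
```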

Token Cache

Keyring provides OS-level encryption (Keychain/DPAPI/Secret Service)
Load cache once at startup, keep in-memory
Write back after every token acquisition
Handle keyring entry size limits (split per-account if needed)
Silent re-auth on corruption (graceful degradation)
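The load-once, write-back-after-acquire pattern might be sketched as below. The two protocols describe the shapes this sketch relies on: `msal.SerializableTokenCache` exposes `serialize()`, `deserialize()` and `has_state_changed`, and the `keyring` module exposes `get_password()` and `set_password()`. The wrapper class, service name, and entry name are assumptions, and per-account splitting for size limits is left out.

```python
from typing import Optional, Protocol


class SecretStore(Protocol):
    def get_password(self, service: str, username: str) -> Optional[str]: ...
    def set_password(self, service: str, username: str, password: str) -> None: ...


class SerializableCache(Protocol):
    has_state_changed: bool
    def serialize(self) -> str: ...
    def deserialize(self, state: Optional[str]) -> None: ...


class KeyringTokenCache:
    def __init__(self, cache: SerializableCache, store: SecretStore,
                 service: str = "entraclaw", entry: str = "msal-token-cache") -> None:
        self._cache, self._store = cache, store
        self._service, self._entry = service, entry

    def load(self) -> bool:
        """Load once at startup. False means: start empty and let the caller re-auth."""
        try:
            blob = self._store.get_password(self._service, self._entry)
            if blob:
                self._cache.deserialize(blob)
            return True
        except Exception:  # corrupted entry or unavailable keyring: degrade gracefully
            return False

    def save_if_changed(self) -> None:
        """Call after every token acquisition; writes only when the cache changed."""
        if self._cache.has_state_changed:
            self._store.set_password(self._service, self._entry, self._cache.serialize())
```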

Dependencies

Pin msal version in pyproject.toml
Verify no version conflicts with existing PyJWT/cryptography pins

Error & Rescue Registry

METHOD/CODEPATH              | WHAT CAN GO WRONG              | EXCEPTION CLASS
-----------------------------|--------------------------------|---------------------------
MsalAuth.acquire_interactive | Browser not opened in 10s      | AuthTimeoutError
                             | User cancels consent           | AuthCancelledError
                             | Port 8400 in use               | PortConflictError
                             | MSAL returns error response    | MsalAuthError
MsalAuth.acquire_device_code | User doesn't complete in time  | AuthTimeoutError
                             | User cancels/denies            | AuthCancelledError
KeyringTokenCache.load       | Corrupted keyring entry        | CredentialStoreError
                             | Keyring unavailable            | CredentialStoreError
StateMachine.transition      | Invalid state transition       | InvalidTransitionError
                             | Lock timeout (30s deadlock)    | TransitionTimeoutError
                             | Exception during transition    | TransitionRollbackError
Provisioner.provision        | Three-hop flow fails at any hop| ProvisioningError
                             | Agent User has no Teams license| LicenseUnavailableError
                             | Provisioning exceeds timeout   | ProvisioningTimeoutError
Graph API (delegated)        | 403 Forbidden                  | GraphApiError
                             | 429 Rate Limited               | RateLimitError
                             | Chat not found (404)           | ChatNotFoundError
Certificate auth (three-hop) | Certificate expired/invalid    | CertificateError
Token exchange (three-hop)   | FIC exchange fails             | TokenExchangeError
Token validation             | Wrong idtyp claim              | TokenValidationError
Chat membership              | Add member fails               | ChatMembershipError

EXCEPTION CLASS              | RESCUED? | RESCUE ACTION                    | USER SEES
-----------------------------|----------|----------------------------------|------------------
AuthTimeoutError             | Y        | Fall back to device code         | "Opening device code flow..."
AuthCancelledError           | Y        | Stay UNAUTHENTICATED, log        | "Auth cancelled"
MsalAuthError                | Y        | Log, stay UNAUTHENTICATED        | "Auth failed: {detail}"
PortConflictError            | Y        | Try ports 8401-8410              | Nothing (transparent)
CredentialStoreError         | Y        | Clear cache, re-auth silently    | Nothing (re-prompts if needed)
InvalidTransitionError       | Y        | Log, no state change             | Tool returns error message
TransitionTimeoutError       | Y        | Log, transition → ERROR          | "Identity system busy, retrying"
TransitionRollbackError      | Y        | Revert to previous state, log    | Nothing (transparent)
ProvisioningError            | Y        | Stay DELEGATED, log, retry later | Nothing (stays in delegated)
LicenseUnavailableError      | Y        | Stay DELEGATED permanently       | "Agent User requires Teams license"
ProvisioningTimeoutError     | Y        | Stay DELEGATED, log              | Nothing (stays in delegated)
GraphApiError (403)          | Y        | Retry once, then user message    | "Permission denied for this chat"
RateLimitError               | Y        | Backoff + retry (existing)       | Nothing (transparent)
ChatNotFoundError            | Y        | Remove from watched_chats, log   | "Chat no longer available"
CertificateError             | Y        | Stay DELEGATED, log              | Nothing (stays in delegated)
TokenExchangeError           | Y        | Stay DELEGATED, log              | Nothing (stays in delegated)
TokenValidationError         | Y        | Reject swap, stay current state  | Nothing (stays in current mode)
ChatMembershipError          | Y        | Log, skip chat migration         | "Could not add agent to chat"

CRITICAL GAPS: 0 (all rescued)
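As one concrete rescue from the registry, the PortConflictError action (try ports 8401-8410 after 8400) could be sketched like this. The function name is hypothetical, and handing the bound socket to the auth listener is an assumption about how the callback server is wired.

```python
import socket
from typing import Iterable


class PortConflictError(Exception):
    pass


def bind_first_free(ports: Iterable[int], host: str = "127.0.0.1") -> socket.socket:
    """Return a socket bound to the first free port; loopback only, never 0.0.0.0."""
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            sock.close()  # port busy: try the next candidate
            continue
        return sock
    raise PortConflictError("no free port in the candidate range")


# Usage under the plan's port range (8400 primary, 8401-8410 fallback):
#   sock = bind_first_free(range(8400, 8411))
#   port = sock.getsockname()[1]
```

Only when the whole range is exhausted does the error surface, at which point the auth flow would fall back to device code.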

Failure Modes Registry

CODEPATH                    | FAILURE MODE                 | RESCUED | TEST | USER SEES          | LOGGED
----------------------------|------------------------------|---------|------|--------------------|-------
MSAL localhost auth         | Port 8400 in use             | Y       | Y    | Transparent        | Y
MSAL localhost auth         | No browser detected 10s      | Y       | Y    | Device code        | Y
MSAL device code            | User cancels                 | Y       | Y    | "Auth cancelled"   | Y
Keyring token cache         | Corrupted entry              | Y       | Y    | Silent re-auth     | Y
Keyring token cache         | Keyring unavailable          | Y       | Y    | Auth re-prompt     | Y
State transition            | Concurrent transitions       | Y       | Y    | Serialized         | Y
State transition            | Lock deadlock                | Y       | Y    | Timeout + error    | Y
Background provisioner      | Hop 1 cert auth fails        | Y       | Y    | Stays delegated    | Y
Background provisioner      | Hop 2 FIC exchange fails     | Y       | Y    | Stays delegated    | Y
Background provisioner      | Hop 3 user_fic fails         | Y       | Y    | Stays delegated    | Y
Background provisioner      | No Teams license             | Y       | Y    | Stays delegated    | Y
Token swap                  | New token invalid idtyp      | Y       | Y    | Stays current      | Y
Token swap                  | Mid-operation swap           | Y       | Y    | Grace period       | Y
Chat migration              | add_member fails             | Y       | Y    | Chat skipped       | Y
Background poll             | Identity transition mid-poll | Y       | Y    | Exclude both IDs   | Y
Graph API (delegated)       | 403 forbidden                | Y       | Y    | Error message      | Y
Graph API (delegated)       | Chat deleted                 | Y       | Y    | Removed from watch | Y
Send message (delegated)    | Empty string params          | Y       | Y    | Validation error   | Y

CRITICAL GAPS: 0

Data Flow Diagrams

Auth Flow (Phase 1)

  User starts MCP server
    │
    ▼
  State: UNAUTHENTICATED
    │
    ├─► Try localhost redirect (port 8400)
    │     │
    │     ├─ Success ──► MSAL callback ──► Token acquired ──► State: DELEGATED
    │     │
    │     └─ Fail (port busy / no browser / 10s timeout)
    │           │
    │           ▼
    │     Device code flow
    │           │
    │           ├─ Success ──► Token acquired ──► State: DELEGATED
    │           │
    │           └─ Fail (cancelled / timeout) ──► State: UNAUTHENTICATED
    │
    ▼
  State: DELEGATED
    │
    ├─► All tools available (human's permissions)
    ├─► Messages prefixed [EntraClaw]
    ├─► Audit: attribution_type = "delegated-human"
    │
    ├─► If ENTRACLAW_PROVISIONER_CLIENT_ID set AND SKIP_PROVISIONING != true:
    │     │
    │     ▼
    │   State: PROVISIONING (background)
    │     │
    │     ├─ Success ──► Validate token ──► Add to chats ──► State: AGENT_USER
    │     │
    │     └─ Fail ──► State: ERROR ──► Recovery ──► State: DELEGATED
    │
    └─► If no provisioner credentials: stays DELEGATED permanently

Token Swap Flow (Phase 2)

  Provisioner completes
    │
    ▼
  Validate new Agent User token
    │
    ├─► Graph API test call with new token
    │     │
    │     ├─ Success + idtyp=user ──► Proceed to swap
    │     │
    │     └─ Fail ──► Reject swap, stay DELEGATED, log error
    │
    ▼
  Add Agent User to watched_chats (chat membership migration)
    │
    ├─► For each chat in watched_chats:
    │     add_teams_member(chat_id, agent_user_email)
    │     │
    │     ├─ Success ──► Chat migrated
    │     └─ Fail ──► Log, skip this chat
    │
    ▼
  Acquire asyncio.Lock (microsecond hold)
    │
    ├─► Write new state: DELEGATED → AGENT_USER
    ├─► Update current token reference
    ├─► Update sender filter (exclude agent_user_id)
    │
    ▼
  Release lock
    │
  Grace period: in-flight operations complete with old token
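The chat-membership step of the flow above (log and skip on failure, never abort the swap) might be sketched as follows. `add_member` is a hypothetical async callable standing in for add_teams_member, and the logger name and log fields are assumptions.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Iterable, List

log = logging.getLogger("identity.provisioner")


async def migrate_chats(
    chat_ids: Iterable[str],
    add_member: Callable[[str, str], Awaitable[None]],
    agent_user_email: str,
) -> List[str]:
    """Add the Agent User to each watched chat; return the chats that migrated."""
    migrated: List[str] = []
    for chat_id in chat_ids:
        try:
            await add_member(chat_id, agent_user_email)
        except Exception as exc:  # one failed chat must not block the others
            log.warning("chat_migration_failed chat_id=%s error=%s", chat_id, type(exc).__name__)
            continue
        migrated.append(chat_id)
    log.info("chat_migration chats_migrated=%d", len(migrated))
    return migrated
```

Because this runs before the lock is taken, a slow or failing Graph call here never extends the time the state lock is held.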

Background Poll (Identity-Aware)

  Every 5 seconds:
    │
    ▼
  Read current identity mode
    │
    ├─ DELEGATED ──► Use human's delegated token
    │                 Filter: exclude human_user_id + sent_message_ids
    │
    ├─ AGENT_USER ──► Use Agent User token
    │                  Filter: exclude agent_user_id + sent_message_ids
    │
    ├─ TRANSITIONING ──► Exclude BOTH IDs for one cycle
    │
    └─ UNAUTHENTICATED / ERROR ──► Skip poll
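The filter selection in the diagram can be sketched as a pure function. Assumptions in this sketch: TRANSITIONING is modelled as a flag rather than a state (it is not one of the five listed states), PROVISIONING polls with the delegated token like DELEGATED, and messages are plain dicts with hypothetical `id` and `sender_id` keys.

```python
from typing import Dict, Iterable, List, Optional, Set


def poll_exclusions(
    mode: str,
    human_user_id: str,
    agent_user_id: Optional[str] = None,
    transitioning: bool = False,
) -> Optional[Set[str]]:
    """Return sender IDs to ignore this cycle, or None when the poll should be skipped."""
    if mode in ("UNAUTHENTICATED", "ERROR"):
        return None
    if transitioning:
        # Mid-swap: either identity may have sent recent messages, so exclude both.
        return {i for i in (human_user_id, agent_user_id) if i}
    if mode == "AGENT_USER":
        return {agent_user_id} if agent_user_id else set()
    return {human_user_id}  # DELEGATED, and PROVISIONING by assumption


def filter_incoming(
    messages: Iterable[Dict[str, str]],
    excluded_senders: Set[str],
    sent_message_ids: Set[str],
) -> List[Dict[str, str]]:
    """Drop messages from excluded senders and messages we sent ourselves."""
    return [
        m for m in messages
        if m["sender_id"] not in excluded_senders and m["id"] not in sent_message_ids
    ]
```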

Module Dependency Graph

  config.py ◄─────────────────────────────────────────────┐
    │                                                      │
    ▼                                                      │
  auth/                                                    │
    ├── certificate.py (existing)                          │
    ├── delegated.py (NEW — MSAL)                         │
    └── token_cache.py (NEW — keyring wrapper)            │
          │                                                │
          ▼                                                │
  identity/                                                │
    ├── state_machine.py (NEW — core) ◄── errors.py       │
    └── provisioner.py (NEW — async) ──────────────────────┘
          │                                    │
          ▼                                    ▼
  tools/                                  mcp_server.py
    ├── teams.py (modified)                (orchestrator)
    ├── audit.py (modified)
    └── rate_limit.py (existing)

Observability

Structured Logging

  # State transitions
  INFO  identity.state_machine: transition from=DELEGATED to=PROVISIONING trigger=provisioner_start
  ERROR identity.state_machine: transition_failed from=X to=Y error=InvalidTransitionError

  # Tool calls (every call)
  INFO  tools.teams: send_teams_message chat_id=19:xxx identity_mode=DELEGATED attribution=delegated-human

  # Provisioner progress
  INFO  identity.provisioner: started
  INFO  identity.provisioner: hop1_complete
  INFO  identity.provisioner: hop3_complete idtyp=user duration_s=1.2
  INFO  identity.provisioner: chat_migration chats_migrated=3

  # Auth events
  INFO  auth.delegated: localhost_redirect_started port=8400
  WARN  auth.delegated: localhost_failed falling_back=device_code reason=timeout
  INFO  auth.delegated: auth_complete method=localhost scopes=Chat.ReadWrite,User.Read
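Log lines in the `event key=value` shape shown above could be produced with a small helper on top of the standard `logging` module. The helper name `kv_line` is hypothetical; the logger names mirror the module paths in the samples.

```python
import logging
from typing import Any, Mapping


def kv_line(event: str, fields: Mapping[str, Any]) -> str:
    """Render 'event k1=v1 k2=v2' in the order the fields were given."""
    return " ".join([event] + [f"{key}={value}" for key, value in fields.items()])


log = logging.getLogger("identity.state_machine")

# A mapping is used instead of keyword arguments because 'from' is a reserved word.
log.info(kv_line("transition", {
    "from": "DELEGATED",
    "to": "PROVISIONING",
    "trigger": "provisioner_start",
}))
```

Keeping the rendering in one function makes it straightforward to assert on log output in tests and to swap in a JSON formatter later without touching call sites.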

What Already Exists

Sub-problem                     | Existing code                                             | Reused?
--------------------------------|-----------------------------------------------------------|--------------------------
Three-hop token flow            | `tools/teams.py:acquire_agent_user_token()`               | Yes (PR #2)
Graph API send/read/create_chat | `tools/teams.py` (640 lines)                              | Yes, modified
Background poll                 | `mcp_server.py:_background_poll()`                        | Yes, modified
Token refresh                   | `mcp_server.py:_ensure_valid_token()/_with_token_retry()` | Yes, adapted
Certificate JWT assertion       | `auth/certificate.py`                                     | Yes (PR #2)
Keyring credential storage      | `platform/` modules                                       | Yes (token cache backend)
Error hierarchy                 | `errors.py`                                               | Yes, extended
Pydantic models                 | `models.py`                                               | Yes, extended
Rate limit handling             | `tools/rate_limit.py`                                     | Yes, unchanged
Audit logging                   | `tools/audit.py`                                          | Yes, modified
Config from env                 | `config.py`                                               | Yes, extended

NOT in Scope

Item                                          | Rationale
----------------------------------------------|-------------------------------------------------------------------
UI/frontend                                   | CLI/MCP server only
Multi-tenant runtime (shared process)         | Per-process model is correct; each session gets its own MCP server
Production provisioner service                | Research uses embedded provisioner; service extraction is a TODO
IC3 federation                                | 12-month target, state machine enables it but not built yet
AppContainer sandbox                          | Separate workstream (existing TODO)
Admin consent automation for external tenants | Users handle consent; provisioning gracefully stays DELEGATED
HTTP stack unification (requests → httpx)     | TODO, not blocking
Windows Certificate Store integration         | Existing platform/ modules handle this; no changes needed

Dream State Delta

  CURRENT STATE               THIS PLAN                    12-MONTH IDEAL
  ────────────────────        ────────────────────          ────────────────────
  Single-tenant only    →     Multi-tenant delegated   →   IC3 federation
  Certificate required  →     MSAL auth (zero setup)   →   Passkey/FIDO2
  Manual setup.sh       →     Scripted multi-tenant    →   Self-service portal
  Agent User or nothing →     Progressive identity     →   Always Agent User
  One process model     →     One process (correct)    →   Shared service option

This plan moves ~60% toward the 12-month ideal. The state machine architecture is the key enabler for the remaining 40%.

Setup Prerequisites

setup.sh Updates (PR #1)

Create multi-tenant app registration:
az ad app create with signInAudience=AzureADMultipleOrgs
Redirect URI: http://localhost:8400/callback
API permissions: Chat.ReadWrite, User.Read (delegated)
Write ENTRACLAW_CLIENT_ID to .env

Provisioner App (PR #2)

Create provisioner app registration (application permissions for Agent Identity APIs)
Generate client secret
Write ENTRACLAW_PROVISIONER_CLIENT_ID and ENTRACLAW_PROVISIONER_CLIENT_SECRET to .env

.env Changes

# PR #1: Multi-tenant delegated auth
ENTRACLAW_CLIENT_ID=<multi-tenant app ID>
ENTRACLAW_TENANT_ID=<"common" for multi-tenant discovery>
ENTRACLAW_SKIP_PROVISIONING=false  # set true to force delegated-only

# PR #2: Provisioner (optional)
ENTRACLAW_PROVISIONER_CLIENT_ID=<provisioner app ID>
ENTRACLAW_PROVISIONER_CLIENT_SECRET=<provisioner secret>

# Existing (unchanged)
ENTRACLAW_BLUEPRINT_APP_ID=...
ENTRACLAW_BLUEPRINT_CERT_THUMBPRINT=...
# ... all existing vars still work
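A minimal sketch of how config.py might read these variables, assuming a frozen dataclass and a `provisioning_enabled` property that mirrors the condition in the auth flow diagram (provisioner client ID set and SKIP_PROVISIONING not true). The class and function names are hypothetical. The provisioner secret is deliberately not read here, in line with the lazy-load requirement.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EntraClawConfig:
    client_id: str
    tenant_id: str
    skip_provisioning: bool
    provisioner_client_id: Optional[str]

    @property
    def provisioning_enabled(self) -> bool:
        # Upgrade past DELEGATED only when a provisioner app is configured and not overridden.
        return bool(self.provisioner_client_id) and not self.skip_provisioning


def load_config(env: Mapping[str, str] = os.environ) -> EntraClawConfig:
    """Build the config from environment variables; raises KeyError if the client ID is missing."""
    return EntraClawConfig(
        client_id=env["ENTRACLAW_CLIENT_ID"],
        tenant_id=env.get("ENTRACLAW_TENANT_ID", "common"),
        skip_provisioning=env.get("ENTRACLAW_SKIP_PROVISIONING", "false").strip().lower() == "true",
        provisioner_client_id=env.get("ENTRACLAW_PROVISIONER_CLIENT_ID") or None,
    )
```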

Technical Debt (Accepted)

Dual HTTP stacks (MSAL/requests + httpx) — acceptable for research, TODO to unify
Embedded provisioner — acceptable for research, TODO to extract to service
In-memory state only: state does not survive a restart, but the MSAL cache in keyring enables silent re-auth

Codex Cross-Model Review Summary

First review (during /office-hours): 9 findings, 5 resolved via user decisions. Second review (post-sections): 8 findings, 6 tension points resolved:

#  | Finding                               | Resolution
---|---------------------------------------|----------------------------------------------------------
1  | Chat continuity hard problem          | Same thread via add_teams_member (converts to group)
2  | Provisioner credential trust boundary | Embedded for research, service for production (TODO)
3  | Ship delegated-only first             | Two PRs: PR #1 delegated, PR #2 provisioner
4  | Attribution model violation           | Already resolved: delegated-human audit type
5  | Runtime not multi-tenant              | Correct: per-process model, documented
6  | 30s lock fake rigor                   | Clarified: lock covers state mutation only (microseconds)
7  | Sponsor mode changes semantics        | Already resolved: identity-aware filtering
8  | mcp_server.py under-tested            | Accepted: integration tests added to PR #1

Completion Summary

+====================================================================+
|            MEGA PLAN REVIEW — COMPLETION SUMMARY                   |
+====================================================================+
| Mode selected        | HOLD SCOPE                                  |
| System Audit         | 20 source files, 12 test files, ~1845 LOC   |
| Step 0               | Approach C (State Machine First), HOLD SCOPE|
| Section 1  (Arch)    | 1 issue found (lock timeout)                |
| Section 2  (Errors)  | 18 error paths mapped, 0 GAPS               |
| Section 3  (Security)| 6 issues found, 0 High unresolved           |
| Section 4  (Data/UX) | 4 edge cases mapped, 0 unhandled            |
| Section 5  (Quality) | 2 issues found (module structure, _state)   |
| Section 6  (Tests)   | Diagram produced, 0 gaps                    |
| Section 7  (Perf)    | 1 issue found (cache strategy)              |
| Section 8  (Observ)  | 1 gap found (state transition logging)      |
| Section 9  (Deploy)  | 2 issues found (mode selection, setup.sh)   |
| Section 10 (Future)  | Reversibility: 4/5, debt items: 3           |
| Section 11 (Design)  | SKIPPED (no UI scope)                       |
+--------------------------------------------------------------------+
| NOT in scope         | written (8 items)                           |
| What already exists  | written (11 items mapped)                   |
| Dream state delta    | written (~60% toward 12-month ideal)        |
| Error/rescue registry| 18 exception classes, 0 CRITICAL GAPS       |
| Failure modes        | 18 total, 0 CRITICAL GAPS                   |
| TODOS.md updates     | 3 items proposed, 3 accepted                |
| Scope proposals      | 0 proposed (HOLD SCOPE mode)                |
| CEO plan             | skipped (HOLD SCOPE)                        |
| Outside voice        | ran (codex GPT-5.4), 8 findings, 6 resolved |
| Lake Score           | 19/19 recommendations chose complete option |
| Diagrams produced    | 4 (auth flow, token swap, bg poll, modules) |
| Stale diagrams found | 0                                           |
| Unresolved decisions | 0                                           |
+====================================================================+

GSTACK REVIEW REPORT

Review        | Trigger               | Why                             | Runs | Status       | Findings
--------------|-----------------------|---------------------------------|------|--------------|-----------------------------------------------
CEO Review    | `/plan-ceo-review`    | Scope & strategy                | 1    | CLEAR        | mode: HOLD_SCOPE, 0 critical gaps
Codex Review  | `/codex review`       | Independent 2nd opinion         | 2    | issues_found | 17 findings total (9 + 8), all resolved
Eng Review    | `/plan-eng-review`    | Architecture & tests (required) | 1    | CLEAR        | 18 issues, 0 critical gaps, mode: SCOPE_REDUCED
Outside Voice | `/codex review` (eng) | Independent plan challenge      | 3    | issues_found | 6 findings, 6 tension points, all resolved
Design Review | `/plan-design-review` | UI/UX gaps                      | 0    | -            | -
DX Review     | `/plan-devex-review`  | Developer experience gaps       | 0    | -            | -

CROSS-MODEL: Three Codex reviews (GPT-5.4) ran. First during /office-hours (9 findings), second post-CEO-sections (8 findings), third post-eng-review (6 findings). All tension points resolved with user. Key cross-model catches: audit attribution gap, _initialize() entanglement, restart dedup limitation.

ENG REVIEW AMENDMENTS: Step 0 scope reductions (drop token_cache.py + PortConflictError), 4 architecture decisions (1A-4A), 3 code quality decisions (5A-7A), 5 test gaps (8A-12A), 6 Codex tension resolutions (audit attribution, init split, prefix dedup, hard exit removal, live swap kept, IdentitySession added). PR #1 commit order: split init → remove hard exits → state machine + MSAL. 3 new TODOs added.

UNRESOLVED: 0 decisions pending.

VERDICT: CEO + ENG CLEARED, ready to implement.