# Architecture

## Design Philosophy

CopilotReportForge is built on four principles that directly address the problems identified in enterprise AI adoption:
| Principle | What It Means |
|---|---|
| Composability | Every building block (LLM query, AI agent, storage, notification) is independent. Swap a system prompt to change domains; add a tool to add capabilities. |
| Zero-Infrastructure AI | Consume hosted LLMs through the Copilot SDK. No GPU provisioning, no model management, no inference servers. |
| Security by Default | OIDC federation, scoped RBAC, time-limited sharing URLs, and ephemeral execution environments. No long-lived secrets anywhere. |
| Auditable Execution | Every run is recorded with full input/output context, execution metadata, and provenance — by default, not by configuration. |
## Execution Model: Why GitHub Actions?

A key architectural decision is that all AI agent execution runs within GitHub Actions runners rather than on developers' local machines. This choice solves three problems simultaneously:

### Ephemeral Isolation
Each workflow execution spins up a fresh, sandboxed environment and discards it upon completion. Secrets, intermediate files, and model outputs exist only for the duration of the run. This is fundamentally more secure than running AI agents on developer workstations, where credentials can leak through shell history, file system artifacts, or malware.
### Environment Consistency
Every team member, every workflow, and every domain uses the same runner configuration. This eliminates "works on my machine" issues, prevents environment drift across teams, and ensures that AI evaluation results are reproducible regardless of who triggers them.
### Built-in Governance
GitHub Actions natively records who executed what, when, for how long, and with what inputs/outputs. Full execution logs are retained and searchable. For organizations, all workflow activity is captured in the enterprise audit log — providing compliance-ready audit trails without any additional tooling.
## System Architecture

```mermaid
%%{init: {'theme': 'dark'}}%%
flowchart LR
    USER(["👤 User / Scheduler"])
    subgraph GHA["⚙️ GitHub Actions Runner"]
        direction TB
        SDK["🤖 Copilot SDK"]
        TOOLS["🔧 AI Tools"]
        SDK <--> TOOLS
    end
    subgraph AI["🧠 AI Models"]
        direction TB
        LLM["💬 GPT-5 / Claude"]
        FOUNDRY["📚 Foundry Agents"]
    end
    subgraph AZ["☁️ Azure"]
        direction TB
        AUTH["🔐 Entra ID"]
        STORAGE[("📦 Blob Storage")]
    end
    USER -- "① Submit" --> SDK
    SDK -- "② Parallel queries" --> LLM
    TOOLS -- "③ Delegate" --> FOUNDRY
    FOUNDRY -. "④ Reference data" .-> STORAGE
    SDK -- "⑤ Upload report" --> STORAGE
    STORAGE -- "⑥ Secure URL" --> USER
    USER -. "OIDC" .-> AUTH
    AUTH -. "Token" .-> SDK
    style GHA fill:#1a365d,stroke:#63b3ed,stroke-width:2px,color:#e2e8f0
    style AI fill:#744210,stroke:#f6ad55,stroke-width:2px,color:#e2e8f0
    style AZ fill:#22543d,stroke:#68d391,stroke-width:2px,color:#e2e8f0
```
### Component Responsibilities
| Component | Role |
|---|---|
| Copilot SDK Client | Manages LLM sessions, sends queries in parallel, handles tool calls, aggregates results |
| Hosted LLMs | Provide text generation capabilities (GPT-5-mini, GPT-5, Claude Sonnet/Opus 4.6) — no self-hosting required |
| AI Foundry Agents | Domain-specific AI personas with access to reference data (documents, images, specifications) |
| Entra ID | Issues short-lived access tokens via OIDC federation — no stored credentials |
| Blob Storage | Stores reports and reference data; generates time-limited sharing URLs |
| GitHub Actions | Provides ephemeral execution, OIDC authentication, and audit logging |
## Core Data Flow: Report Generation

```mermaid
%%{init: {'theme': 'dark'}}%%
sequenceDiagram
    actor User as 👤 User
    participant RT as ⚙️ Runtime
    participant LLM as 🧠 LLMs
    participant ST as 📦 Storage
    User->>RT: ① Submit persona + queries
    activate RT
    rect rgba(26, 54, 93, 0.5)
        note over RT,LLM: Parallel Execution
        par Query 1
            RT->>LLM: Prompt
            LLM-->>RT: Response
        and Query 2
            RT->>LLM: Prompt
            LLM-->>RT: Response
        and Query N
            RT->>LLM: Prompt
            LLM-->>RT: Response
        end
    end
    RT->>RT: ② Aggregate results
    RT->>ST: ③ Upload report (JSON)
    ST-->>RT: Secure URL
    RT-->>User: ④ Return secure URL
    deactivate RT
```
Key properties of this flow:

- Each query executes in an independent session — no conversational cross-contamination between personas.
- Results are typed and validated — the output schema tracks total queries, successes, and failures.
- The report is immutable once uploaded — providing a point-in-time record of the evaluation.
## Agentic Data Flow: Domain-Specific Evaluation

For evaluations that require access to reference data (floor plans, product specs, clinical guidelines), the platform integrates AI Foundry Agents:

```mermaid
%%{init: {'theme': 'dark'}}%%
sequenceDiagram
    actor User as 👤 User
    participant SDK as 🤖 Copilot Session
    participant Agent as 📚 Domain Agent
    participant Data as 📦 Reference Data
    User->>SDK: ① Domain-specific query
    activate SDK
    rect rgba(116, 66, 16, 0.4)
        note over SDK,Data: Agent Orchestration
        SDK->>SDK: ② Route to specialist
        SDK->>Agent: ③ Invoke agent
        activate Agent
        Agent->>Data: ④ Fetch documents
        Data-->>Agent: Content
        Agent-->>SDK: ⑤ Structured evaluation
        deactivate Agent
    end
    SDK-->>User: ⑥ Agent-enriched report
    deactivate SDK
```
The Copilot session autonomously decides when to delegate to a Foundry Agent based on the query context. This enables multi-agent orchestration within a single session — the user submits a high-level query, and the system routes to the appropriate domain specialist.
## Authentication Model

```mermaid
%%{init: {'theme': 'dark'}}%%
sequenceDiagram
    participant GHA as ⚙️ GitHub Actions
    participant OIDC as 🔑 OIDC Provider
    participant Entra as 🛡️ Entra ID
    participant AZ as ☁️ Azure
    rect rgba(26, 54, 93, 0.5)
        note over GHA,Entra: Passwordless Authentication
        GHA->>OIDC: ① Request JWT
        OIDC-->>GHA: ② Signed JWT (short-lived)
        GHA->>Entra: ③ Exchange JWT
        Entra-->>GHA: ④ Scoped access token
    end
    GHA->>AZ: ⑤ Access resources (least privilege)
    note right of AZ: Token expires in minutes
```
### Why OIDC Federation?
Traditional CI/CD authentication stores long-lived API keys as repository secrets. These keys are difficult to rotate, easy to leak, and grant broad access. OIDC federation eliminates this pattern entirely:
- No stored secrets for Azure access — Tokens are issued per workflow run and expire within minutes.
- Least-privilege scoping — Each token is scoped to specific RBAC roles (see below).
- Zero rotation overhead — There are no credentials to rotate.
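The federated credential on the Entra ID side matches claims inside the GitHub-issued JWT: the issuer, the subject (which repo/environment the run came from), and the audience. A minimal sketch of that payload, using a hand-built example token (the repo name is hypothetical; real tokens are signed and verified, which this sketch skips):

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Hand-built example payload with the claims a federated credential matches on.
claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:my-org/my-repo:environment:production",  # hypothetical repo
    "aud": "api://AzureADTokenExchange",
}
segment = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"header.{segment}.signature"
assert decode_jwt_payload(token)["sub"].startswith("repo:my-org/my-repo")
```

Because Entra ID only accepts tokens whose `sub` claim matches the configured repository and environment, a leaked workflow file in another repo cannot mint usable credentials.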
### RBAC Roles
| Role | Purpose |
|---|---|
| Contributor | Manage Azure resources via Terraform |
| Storage Blob Data Contributor | Read/write report and reference data |
| Storage Blob Delegator | Generate user delegation keys for secure sharing URLs |
| Cognitive Services OpenAI User | Access hosted model endpoints |
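The Storage Blob Delegator role exists specifically so the platform can mint time-limited sharing URLs without any account key. A hedged sketch of that wiring, assuming the `azure-identity` and `azure-storage-blob` packages (names like `make_sharing_url` and the one-hour window are illustrative choices, not the platform's actual code):

```python
from datetime import datetime, timedelta, timezone

def sas_validity_window(minutes: int = 60) -> tuple[datetime, datetime]:
    """Short validity window for a time-limited sharing URL."""
    start = datetime.now(timezone.utc)
    return start, start + timedelta(minutes=minutes)

def make_sharing_url(account_url: str, container: str, blob_name: str) -> str:
    # Wiring sketch only; requires Azure access and is not executed here.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import (
        BlobSasPermissions, BlobServiceClient, generate_blob_sas,
    )
    start, expiry = sas_validity_window()
    client = BlobServiceClient(account_url, credential=DefaultAzureCredential())
    # Requesting a user delegation key is what needs the Blob Delegator role.
    key = client.get_user_delegation_key(start, expiry)
    sas = generate_blob_sas(
        account_name=client.account_name,
        container_name=container,
        blob_name=blob_name,
        user_delegation_key=key,
        permission=BlobSasPermissions(read=True),
        start=start,
        expiry=expiry,
    )
    return f"{account_url}/{container}/{blob_name}?{sas}"

start, expiry = sas_validity_window(15)
assert expiry - start == timedelta(minutes=15)
```

A user-delegation SAS is signed with the short-lived delegation key rather than a storage account key, so the no-long-lived-secrets property holds end to end.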
## Infrastructure Architecture

All infrastructure is managed as code via Terraform, organized into reusable modules and deployment scenarios.

```mermaid
%%{init: {'theme': 'dark'}}%%
flowchart LR
    subgraph Step1["① OIDC Setup"]
        S1["🔐 Identity & Trust"]
        P1["Entra ID App\nService Principal\nFederated Credential\nRBAC Roles"]
    end
    subgraph Step2["② GitHub Secrets"]
        S2["🔒 Environment Config"]
        P2["GitHub Environment\nEncrypted Secrets"]
    end
    subgraph Step3["③ AI Foundry"]
        S3["🧠 AI Infrastructure"]
        P3["AI Hub + Models\nStorage Account\nAI Search"]
    end
    subgraph Step4["④ Container Apps"]
        S4["🐳 Standalone Deploy"]
        P4["Container Apps Env\nMonolith Container"]
    end
    S1 --> P1
    S2 --> P2
    S3 --> P3
    S4 --> P4
    Step1 -- "credentials" --> Step2
    Step2 -- "enables workflows" --> Step3
    style Step1 fill:#1a365d,stroke:#63b3ed,stroke-width:2px,color:#e2e8f0
    style Step2 fill:#702459,stroke:#f687b3,stroke-width:2px,color:#e2e8f0
    style Step3 fill:#744210,stroke:#f6ad55,stroke-width:2px,color:#e2e8f0
    style Step4 fill:#22543d,stroke:#68d391,stroke-width:2px,color:#e2e8f0
```
| Scenario | Purpose | Key Resources |
|---|---|---|
| `azure_github_oidc` | Establish passwordless trust between GitHub and Azure | Entra ID app, service principal, federated credential, RBAC roles |
| `github_secrets` | Automate GitHub environment configuration | GitHub environment, encrypted secrets |
| `azure_microsoft_foundry` | Deploy AI capabilities and storage | AI Hub, model deployments, Storage Account, optional AI Search |
| `azure_container_apps` | Deploy monolith service to Azure | Resource group, Container Apps Environment, monolith container (Copilot CLI + API) |
The first three scenarios must be deployed in order: OIDC → Secrets → Foundry. The Container Apps scenario is standalone. See Deployment for step-by-step instructions.
## Application Architecture
The platform provides three interfaces for interacting with the AI execution pipeline:
### CLI Tools
Command-line interfaces for chat, report generation, agent management, storage operations, and notifications. All CLIs follow the same pattern: configure via environment variables, execute via typed commands.
### Web Application
A browser-based interface with GitHub OAuth login, interactive chat, and a parallel report generation panel. Users authenticate with their GitHub identity, and the application makes Copilot requests on their behalf.
### GitHub Actions Workflows
Automated workflows triggered by schedule, manual dispatch, or API call. These are the primary production execution path, providing ephemeral environments with full audit trails.
| Interface | Best For |
|---|---|
| CLI | Local development, scripting, automation |
| Web UI | Interactive exploration, ad-hoc evaluations |
| GitHub Actions | Production execution, scheduled reports, governed workflows |
## LLM Provider Model
The platform supports multiple LLM backend configurations through a unified provider interface:
| Mode | Authentication | Use Case |
|---|---|---|
| Copilot (default) | GitHub token via Copilot CLI | Standard usage — access hosted models without API key management |
| API Key | Static API key | Direct model API access when Copilot is not available |
| Entra ID | Azure Entra ID bearer token | Enterprise deployments with private endpoints and managed identity |
Switching between modes requires changing a configuration parameter, not code. This enables deployment across environments with different security requirements — from open-internet development to air-gapped corporate networks.
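The mode switch can be sketched as an environment-driven setting. This is a hypothetical illustration using a stdlib enum; the variable name `LLM_AUTH_METHOD` is an assumption, and the real platform reads its configuration via pydantic-settings.

```python
import os
from enum import Enum

class AuthMethod(Enum):
    GITHUB_COPILOT = "github_copilot"
    API_KEY = "api_key"
    FOUNDRY_ENTRA_ID = "foundry_entra_id"

def auth_method_from_env(default: AuthMethod = AuthMethod.GITHUB_COPILOT) -> AuthMethod:
    """Pick the auth mode from configuration; no code change needed to switch."""
    raw = os.environ.get("LLM_AUTH_METHOD")  # hypothetical variable name
    return AuthMethod(raw) if raw else default

os.environ["LLM_AUTH_METHOD"] = "api_key"
assert auth_method_from_env() is AuthMethod.API_KEY
```

The same binary can then run against an air-gapped Entra ID endpoint or the default Copilot backend purely by changing the deployment environment.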
### Provider Extensibility

The provider system is implemented in `template_github_copilot/providers.py` with three components:

- `AuthMethod` enum — Defines available authentication methods: `GITHUB_COPILOT`, `API_KEY`, `FOUNDRY_ENTRA_ID`.
- `create_provider()` factory — Returns a `ProviderResult` containing the `ProviderConfig` (for BYOK modes, or `None` for the default Copilot backend) and the target model name, based on the selected `AuthMethod`.
- `register_provider()` hook — Allows adding custom provider builders without modifying the core code. Register a callable that accepts the same arguments and returns a `ProviderResult`.
```python
from template_github_copilot.providers import AuthMethod, ProviderResult, register_provider

def my_custom_provider(**kwargs) -> ProviderResult:
    # Build your custom ProviderConfig / model here
    ...

# Note: register_provider requires an AuthMethod enum value, not a string
register_provider(AuthMethod.API_KEY, my_custom_provider)
```
## Custom Copilot Tools

The platform supports custom tools that extend the Copilot SDK session with additional capabilities. Tools are defined in `template_github_copilot/tools/` and automatically registered when a Copilot session is created.
### Built-in Tools

| Tool | Description | Input |
|---|---|---|
| `list_foundry_agents` | List all available agents on Microsoft Foundry | `endpoint` (optional; defaults to env) |
| `call_foundry_agent` | Call a named Foundry agent with a user message | `agent_name`, `user_message`, `conversation_id` (optional), `endpoint` (optional) |
### Adding a Custom Tool

1. Create a new file in `template_github_copilot/tools/` (e.g., `my_tool.py`).
2. Define a Pydantic input model and implement the tool function using the `@define_tool` decorator from `copilot.tools`.
3. Export the tool in `template_github_copilot/tools/__init__.py` via `get_custom_tools()`.
```python
from copilot.tools import define_tool
from pydantic import BaseModel, Field

class MyToolInput(BaseModel):
    query: str = Field(description="The query to process")

@define_tool(
    description="Description visible to the LLM",
)
def my_tool(params: MyToolInput) -> str:
    return f"Processed: {params.query}"
The Copilot SDK session will automatically discover the tool and invoke it when the LLM determines it is relevant to the user's query.
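The export step can be sketched with a self-contained stand-in. The real decorator is `copilot.tools.define_tool`; the simplified version below only illustrates how a module-level registry lets `get_custom_tools()` expose every decorated tool without hand-maintained lists.

```python
# Simplified stand-in for the tools/__init__.py export pattern.
_REGISTRY: list = []

def define_tool(description: str):
    """Toy decorator mimicking the registration side effect of the real one."""
    def wrap(fn):
        fn.description = description
        _REGISTRY.append(fn)
        return fn
    return wrap

@define_tool(description="Echo the query back")
def echo_tool(query: str) -> str:
    return f"Processed: {query}"

def get_custom_tools() -> list:
    """What tools/__init__.py exposes to the session factory."""
    return list(_REGISTRY)

assert [t.__name__ for t in get_custom_tools()] == ["echo_tool"]
```

With this shape, adding a tool is a single new file plus one import in `__init__.py`.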
## Extensibility
The architecture is designed for extension at five levels:
| Extension Point | How to Extend | Example |
|---|---|---|
| New domain | Change system prompt and queries | Adapt from product evaluation to clinical guideline review |
| New AI capability | Add a Copilot tool in `tools/` | Web scraper, database lookup, calculation engine |
| New AI agent | Create a Foundry Agent with domain instructions | Specialized real estate appraiser with access to floor plan data |
| New LLM provider | Register via `register_provider()` | Custom model endpoint with proprietary auth |
| New output channel | Post-process the report JSON | Send to Slack, email, dashboard, or PowerBI |
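A minimal sketch of the output-channel extension point: transform the report JSON into a Slack webhook payload. The report field names are the same hypothetical ones used elsewhere in this document, and the webhook wiring assumes `httpx` (part of the stack below) is available.

```python
def report_to_slack_payload(report: dict) -> dict:
    """Turn a report JSON (field names hypothetical) into a Slack message payload."""
    summary = (
        f"Report: {report.get('successes', 0)}/{report.get('total_queries', 0)} "
        f"queries succeeded"
    )
    return {"text": summary}

def post_to_slack(report: dict, webhook_url: str) -> None:
    # Not executed here; assumes a Slack incoming-webhook URL.
    import httpx  # lazy import so the pure transform stays dependency-free
    httpx.post(webhook_url, json=report_to_slack_payload(report))

payload = report_to_slack_payload({"total_queries": 3, "successes": 2})
assert payload == {"text": "Report: 2/3 queries succeeded"}
```

Keeping the transform a pure function makes it trivial to unit-test and to reuse for email or dashboard channels.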
## Technology Stack
| Layer | Technology |
|---|---|
| Language | Python 3.13+ |
| AI SDK | GitHub Copilot SDK, Azure AI Projects SDK |
| LLM Client | OpenAI Python SDK |
| Web Framework | FastAPI |
| HTTP Client | httpx |
| Data Validation | Pydantic, pydantic-settings |
| CLI Framework | Typer |
| Environment | python-dotenv |
| Cloud Storage | Azure Blob Storage |
| Authentication | Azure Identity (OIDC, DefaultAzureCredential) |
| Infrastructure | Terraform |
| CI/CD | GitHub Actions |
| Containerization | Docker, Docker Compose |
| Testing | pytest, pytest-cov |