In FY26 I led the design and approval of Agent Feedback Engine, an internal AI solution that converts Customer Engagement Manager (CEM) call transcripts into structured feedback. My responsibility was to deliver a production-ready system that passed Microsoft security and privacy review, respected strict data constraints, and was practical for front-line CEMs, all while the underlying Copilot platform was still evolving.
Security, Compliance, and Navigating Ambiguity
To move Agent Feedback Engine into production, I owned the core artifacts for security and privacy review. An FTE security lead later summarized my work in a kudos email to our director, stating I was a "huge driver" in getting the project ready and that my "combination of precision, thorough research, and collaboration materially de-risked our security and privacy reviews."
Navigating Policy Confusion and Managing Up
One of the less visible challenges was that internal reviewers were not fully aligned on how current Copilot guidance applied to our scenario. Policies and documentation were changing under our feet. My approach was to replace ambiguity with facts.
The same FTE summed this up in their note: "I am not confident we would have been able to navigate the reviews successfully without his support."
Executive Summary
This article details the security, compliance, and architectural work that made Agent Feedback Engine possible. I designed a defensible and extensible architecture that keeps all data within Microsoft-owned services, simplifying compliance and auditing. I also identified and designed mitigations for platform limitations around file search, content size, and metadata retrieval. To improve trust and reduce hallucinations, I engineered the agent to provide verbatim quotes and structured outputs, allowing human auditors to validate claims with minimal friction. The result is an AI-assisted pipeline that survived a rigorous review process during a period of active platform evolution and is now ready for future extension.
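The structured outputs described above can be sketched as a small schema. This is a hypothetical shape, not the actual production format: the field names (`engagement_id`, `category`, `finding`, `verbatim_quote`) are assumptions chosen to illustrate how pairing each finding with a verbatim quote lets an auditor trace it back to the transcript.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class FeedbackItem:
    """One auditable feedback finding; field names are illustrative."""
    engagement_id: str   # links the finding back to a specific engagement
    category: str        # e.g. "discovery", "objection handling"
    finding: str         # the agent's claim, in its own words
    verbatim_quote: str  # exact transcript text an auditor can verify

# Example item an agent might emit for a single transcript
item = FeedbackItem(
    engagement_id="ENG-12345",
    category="discovery",
    finding="Customer stated a hard deadline tied to renewal.",
    verbatim_quote="we need this live before the Q3 renewal",
)
print(json.dumps(asdict(item), indent=2))
```

Keeping the quote as a dedicated field, rather than buried in free text, is what makes low-friction human validation possible.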
For the transcript-driven workflow, I also configured a second-pass search that revalidates key findings against the original transcript before we promote them into feedback or trend recommendations.
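The revalidation step can be illustrated with a minimal sketch. This is not the production implementation: it assumes findings carry a `verbatim_quote` field and reduces "revalidate against the transcript" to a whitespace-normalized substring check, which is the simplest version of the idea.

```python
def revalidate(findings, transcript):
    """Keep only findings whose verbatim quote actually appears in the transcript."""
    normalized_transcript = " ".join(transcript.split()).lower()
    confirmed, rejected = [], []
    for finding in findings:
        quote = " ".join(finding["verbatim_quote"].split()).lower()
        (confirmed if quote in normalized_transcript else rejected).append(finding)
    return confirmed, rejected

transcript = (
    "Agent: thanks for your time today. "
    "Customer: we need this live before the Q3 renewal."
)
findings = [
    {"finding": "Customer has a hard deadline.",
     "verbatim_quote": "we need this live before the Q3 renewal"},
    {"finding": "Customer asked for a discount.",
     "verbatim_quote": "can we get 20 percent off"},
]
confirmed, rejected = revalidate(findings, transcript)
```

Only confirmed findings would be promoted into feedback or trend recommendations; rejected ones are surfaced for human review instead.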
Platform Limitations and Mitigations
- File search limits: Copilot search was constrained to roughly 20 files and a few hundred pages per query. I designed the workflow so that a separate "researcher" configuration could fan out queries and aggregate results when we hit those limits. I validated this approach by researching internal guidance and testing a more capable model configuration that could handle the heavier retrieval workload without changing our data boundaries.
- Pattern reuse: These mitigations were not documented in existing internal guidance; I documented them as recommendations for similar AI deployments across our organization.
- Metadata retrieval: Because the model could not reliably read identifiers from file titles, I changed our document pattern to repeat the engagement ID and key metadata in the file body so downstream agents could reliably access them.
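The fan-out mitigation can be sketched as follows. The Copilot retrieval API is not shown here; `search_fn` is a placeholder for whatever per-query call the platform exposes, and the batch size of 20 simply mirrors the approximate file limit described above.

```python
def fan_out_search(search_fn, files, query, batch_size=20):
    """Split a corpus into batches under the per-query file limit,
    run the query against each batch, and merge deduplicated results."""
    raw_results = []
    for i in range(0, len(files), batch_size):
        raw_results.extend(search_fn(query, files[i:i + batch_size]))
    # Deduplicate by (file, snippet) while preserving first-seen order
    seen, merged = set(), []
    for result in raw_results:
        key = (result["file"], result["snippet"])
        if key not in seen:
            seen.add(key)
            merged.append(result)
    return merged

def fake_search(query, files):
    """Stand-in for the real retrieval call, used only for illustration."""
    return [{"file": f, "snippet": f"match for {query!r}"} for f in files]

corpus = [f"transcript_{i}.docx" for i in range(45)]
hits = fan_out_search(fake_search, corpus, "renewal risk")
```

With 45 files and a 20-file limit, this issues three queries and returns one merged result set, keeping the data boundary unchanged since every call still targets the same Microsoft-owned store.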
Key Contributions
- I created and owned the core artifacts required to pass security and privacy reviews: the Data Flow Diagram, STRIDE Threat Model, and 1CS Narrative.
- I navigated internal policy confusion by grounding conversations in factual, deep-dive research of the latest documentation, de-risking the approval process.
- I designed a trusted, auditable architecture with hallucination mitigations that made the solution practical for non-technical front-line users.