VocaNova: Voice-First AI Interview & Learning Platform Built by Optimus Information Inc.

Overview

Spoken communication remains one of the most critical yet overlooked aspects of workforce readiness. For HR teams, screening candidates for verbal fluency is time-consuming and subjective. For students and job seekers, gaining confidence in spoken English often requires costly coaching and comes with inconsistent feedback.

Recognizing the dual need for scalable interview automation and personalized language assessment, Optimus Information developed VocaNova, a voice-first AI platform that transforms how organizations evaluate and develop verbal communication skills. Powered entirely by Microsoft Azure, VocaNova brings together advanced speech processing, generative AI, and enterprise-ready cloud architecture to deliver consistent, unbiased, and highly accessible spoken assessments at scale.

The Challenge

Recruiters were struggling to keep pace with growing candidate volumes, particularly for roles where strong verbal communication is essential. Traditional interview processes required manual scheduling, coordination, and evaluation, often leading to inconsistent experiences and unconscious bias.

At the same time, many learners lacked access to structured tools for improving spoken English. Existing resources offered limited interactivity, little real-time feedback, and no tailored progression paths. The result was inefficiency for HR and a missed opportunity for talent development.

Optimus set out to build a solution that could:

Automate voice-based interviews without compromising quality
Deliver real-time feedback on spoken English fluency, confidence, and coherence
Eliminate human bias in communication assessments
Scale seamlessly across industries and use cases

The Solution

Optimus developed VocaNova Interview and VocaNova Learn, two modular applications designed to address both recruiting and learning challenges using the latest in Azure AI, communication, and containerization technologies.

VocaNova Interview enables HR teams to automate the screening process through voice-first, AI-led interviews. Candidates respond to dynamic questions generated by GPT-4o, with their speech transcribed, analyzed, and scored in real time. Evaluations are consistent across all users, enabling fair comparisons and faster decisions.

VocaNova Learn provides a guided, interactive environment for users to practice spoken English. The platform simulates natural conversations, offers tailored feedback, and helps users improve fluency, pronunciation, and verbal confidence through repeated, low-friction engagement.

Both solutions are delivered through a secure, Azure-native environment that supports real-time communication, analytics, and integration with Microsoft Teams and Outlook.

Key Capabilities

VocaNova was designed to support two core workflows: automated interviewing and conversational language learning. Both use cases are delivered through a shared foundation of advanced Azure services and GPT-4o.

Automated Interviewing: Candidates receive a personalized interview link and complete a voice-first assessment. The system evaluates fluency, clarity, and confidence in real time using Azure Speech Services and custom scoring logic powered by GPT-4o.
Conversational Learning: Learners engage in guided, natural dialogues with the AI assistant. The system delivers feedback on pronunciation, pacing, and vocabulary usage, simulating real-world English communication scenarios.
Real-Time Feedback: Whether in an interview or a learning session, users receive feedback immediately, eliminating the lag between effort and improvement.
Integrated Communication: Azure Communication Services ensures seamless, low-latency delivery of voice-based interactions across browsers and devices.
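
The case study does not publish VocaNova's scoring logic. As a rough illustration of the kind of pacing and vocabulary signals mentioned above, the following Python sketch computes two simple metrics from a transcript and its duration. The function names, the tokenization rule, and the use of a type-token ratio are assumptions made for this example, not details of the platform.

```python
import re
from typing import List


def tokenize(transcript: str) -> List[str]:
    """Split a transcript into lowercase words; "don't" stays one word."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", transcript.lower())


def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Speaking pace: number of words scaled to a one-minute window."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(tokenize(transcript)) * 60.0 / duration_seconds


def vocabulary_variety(transcript: str) -> float:
    """Type-token ratio: distinct words divided by total words.

    Returns 0.0 for an empty transcript. This is a hypothetical stand-in
    for whatever vocabulary measure the platform actually uses.
    """
    words = tokenize(transcript)
    return len(set(words)) / len(words) if words else 0.0
```

For a six-word answer spoken in six seconds, words_per_minute returns 60.0. Metrics like these would only be inputs to a scoring step; pronunciation in particular cannot be judged from text alone and would need audio-level analysis.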

Multi-Agent Architecture

At the core of VocaNova is a modular, voice-first architecture that uses a system of specialized agents to manage interviews, assess speech, and generate personalized feedback in real time. Here’s how the architecture works:

AGENT	ROLE
Conversation Orchestrator Agent	Manages session flow, selects questions based on role profile, and routes input to analysis agents
Speech Transcription Agent	Converts spoken responses into text using Azure Speech-to-Text, with fallback accuracy tuning
Fluency Scoring Agent	Evaluates pace, clarity, and pronunciation using custom GPT-4o scoring models
Content Coherence Agent	Assesses how well responses match the question prompt, identifying logic and structure gaps
Confidence Analysis Agent	Detects vocal tone, filler words, and speaking hesitation to infer speaker confidence
Feedback Generator Agent	Summarizes results and delivers tailored insights for learners or HR reviewers
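
To make the division of labor in the table concrete, here is a minimal Python sketch of an orchestrator passing shared session state through a sequence of agents. Everything in it is hypothetical: the SessionState fields, the agent functions, the filler-word list, and the keyword-overlap coherence measure are placeholders chosen for illustration. A real transcription agent would call a speech-to-text service, and real scoring agents would call a language model. Question selection by role profile is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

FILLER_WORDS = {"um", "uh", "er", "hmm"}  # hypothetical list for this sketch


@dataclass
class SessionState:
    question: str
    audio: bytes
    transcript: str = ""
    scores: Dict[str, float] = field(default_factory=dict)
    feedback: str = ""


Agent = Callable[[SessionState], None]


def _words(text: str) -> List[str]:
    """Lowercase words with surrounding punctuation stripped."""
    stripped = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in stripped if w]


def transcription_agent(state: SessionState) -> None:
    # Placeholder: treats the audio bytes as UTF-8 text instead of
    # calling a speech-to-text service.
    state.transcript = state.audio.decode("utf-8")


def confidence_agent(state: SessionState) -> None:
    # Share of non-filler words; ignores vocal tone, which needs audio.
    words = _words(state.transcript)
    fillers = sum(1 for w in words if w in FILLER_WORDS)
    state.scores["confidence"] = (1.0 - fillers / len(words)) if words else 0.0


def coherence_agent(state: SessionState) -> None:
    # Fraction of question words echoed in the answer; a crude stand-in
    # for model-based coherence assessment.
    q, a = set(_words(state.question)), set(_words(state.transcript))
    state.scores["coherence"] = len(q & a) / len(q) if q else 0.0


def feedback_agent(state: SessionState) -> None:
    if state.scores:
        weakest = min(state.scores, key=state.scores.get)
        state.feedback = f"Focus area: {weakest}"


def run_session(question: str, audio: bytes, agents: List[Agent]) -> SessionState:
    """Orchestrator: runs each agent in order over the shared state."""
    state = SessionState(question=question, audio=audio)
    for agent in agents:
        agent(state)
    return state


PIPELINE: List[Agent] = [
    transcription_agent, confidence_agent, coherence_agent, feedback_agent,
]
```

Calling run_session("Describe your project", b"um my project was um a chatbot", PIPELINE) fills in the transcript, both scores, and a one-line feedback string. Keeping each agent as a function over shared state means agents can be added, replaced, or reordered without changing the orchestrator loop.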

Deployment

The entire solution was delivered in phases between June and October 2025. The core infrastructure was set up using Microsoft’s Cloud Adoption Framework (CAF) principles to ensure governance, scalability, and compliance.

Deployment highlights include:

Development and production environments built on Azure Container Apps
CI/CD pipelines established via GitHub Actions
Security and secret management handled via Azure Key Vault
Real-time observability implemented using Azure Monitor and Log Analytics
99.95% uptime achieved via Azure’s serverless consumption tier

The frontend was developed using React.js, with full responsiveness across mobile and desktop devices.

The Results

VocaNova has delivered measurable, high-impact results for both HR and learning stakeholders:

Interview screening time reduced by over 70%, freeing recruiters to focus on deeper evaluation and onboarding
Student engagement increased by 50%, driven by accessible, real-time feedback
Reviewer-to-reviewer variation removed from scoring, with every response evaluated using a consistent AI model
Deployment speed accelerated, with serverless infrastructure allowing for rapid iteration and scale
Cost efficiency achieved, leveraging the monthly free grant included in Azure Container Apps’ consumption plan

Tangible Business Outcomes

Across both recruitment and education use cases, these results translated into the following outcomes:

Significantly reduced manual screening time by enabling asynchronous, AI-led interviews that required no human coordination or follow-up.
Improved candidate evaluation consistency through standardized, unbiased scoring of fluency, coherence, and confidence.
Enabled scalable learning access by offering students and professionals 24/7 availability to practice spoken English in a low-pressure, voice-first environment.
Increased feedback quality and speed with real-time, automated insights, eliminating the delay and subjectivity associated with human reviewers.
Supported enterprise-scale adoption by integrating with Microsoft Teams and Outlook, allowing seamless rollout in both corporate and academic settings.

VocaNova allowed teams to do more than just automate; it redefined what’s possible in voice-based screening and learning.

Technologies Used

Azure OpenAI (GPT-4o)
Azure AI Speech Services (STT, TTS)
Azure Container Apps
Azure Cosmos DB
Azure SQL Database
Azure Blob Storage
Azure Communication Services
Azure Monitor & Log Analytics
Azure Key Vault
React.js frontend
Microsoft Teams and Outlook integration
GitHub Actions for CI/CD

What’s Next

VocaNova’s next phase is focused on deepening personalization and expanding its reach. Optimus is enhancing the platform’s adaptive capabilities, enabling it to better tailor feedback and learning paths based on a user’s individual progress and communication goals. On the enterprise side, integration with HR platforms and applicant tracking systems is underway to streamline workflows and unify hiring data. Additional language models are also being explored to support multilingual assessment and coaching, opening the door for global deployment across new markets and industries. VocaNova is evolving from a single solution into a scalable ecosystem for voice-based intelligence.

Contact Optimus to learn how we can help you leverage Agentic AI & Cloud services.
