Vaani Assistant

Multilingual voice assistant for Indian languages with STT/TTS, intent understanding, and contextual responses.

Project Details

Vaani Assistant is a Next.js application providing a multilingual voice interaction experience. Key features include:

- Multilingual Voice Interaction: Supports Speech-to-Text (STT) and Text-to-Speech (TTS) in selected regional Indian languages and English.
- Intelligent Conversation: Understands user intents (weather, time, greetings) and extracts entities.
- Specific Capabilities: Provides live weather forecasts, current time, and can answer questions about its creator and features.
- User Interface: Audio recording with visual feedback, conversation flow display, and automatic audio playback of Vaani's responses.
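To make the intent and entity step more concrete, here is a minimal sketch of what "understanding an intent and extracting an entity" means for a transcribed utterance. It is a plain keyword matcher, not the Genkit flow the project actually uses; the type names, the keyword lists (English and Hindi only), and the English-only city pattern are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; the project derives intents with a Genkit flow.
type Intent = "weather" | "time" | "greeting" | "unknown";

interface ParsedQuery {
  intent: Intent;
  entities: { city?: string };
}

// Illustrative keyword lists (assumed, not taken from the project).
// Order matters: task intents are checked before greetings.
const KEYWORDS: Array<[Exclude<Intent, "unknown">, string[]]> = [
  ["weather", ["weather", "forecast", "मौसम"]],
  ["time", ["time", "समय"]],
  ["greeting", ["hello", "hi", "नमस्ते"]],
];

function parseQuery(text: string): ParsedQuery {
  // Split into whole words so that "hi" does not match inside "Delhi".
  const tokens = text.toLowerCase().split(/[\s.,!?]+/).filter(Boolean);

  let intent: Intent = "unknown";
  for (const [candidate, words] of KEYWORDS) {
    if (words.some((word) => tokens.includes(word))) {
      intent = candidate;
      break;
    }
  }

  // English-only entity pattern: the word following "in" is treated as a city.
  const entities: { city?: string } = {};
  const match = /\bin\s+([A-Za-z]+)/i.exec(text);
  if (intent === "weather" && match) {
    entities.city = match[1];
  }

  return { intent, entities };
}
```

A call such as `parseQuery("What is the weather in Delhi?")` yields the `weather` intent with `Delhi` as the city entity. A model-based flow replaces the keyword table and the regular expression, but the shape of the result (one intent plus optional entities) stays comparable.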

Problem Statement

Voice assistants are predominantly English-focused, creating a digital divide for speakers of regional Indian languages. There is a need for an accessible voice assistant that understands and responds in multiple Indian languages and performs useful tasks.

My Role

Solo project. I handled the end-to-end development, focusing on integrating Google Cloud TTS and the Web Speech API. A major part of my work was designing the Genkit flow for intent recognition in multiple languages.
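As a rough illustration of the TTS side of that integration, the sketch below builds the JSON body for the Cloud Text-to-Speech REST method `text:synthesize`. The language set, the `LANGUAGE_TAGS` map, and the `buildSynthesizeRequest` helper are assumptions for this example; the project's actual languages, voices, and client code are not described here.

```typescript
// Hypothetical language set; the project's supported languages may differ.
// The same BCP 47 tags can also be assigned to SpeechRecognition.lang in the browser.
const LANGUAGE_TAGS = {
  english: "en-IN",
  hindi: "hi-IN",
  tamil: "ta-IN",
  bengali: "bn-IN",
} as const;

type Language = keyof typeof LANGUAGE_TAGS;

// Subset of the request body accepted by
// POST https://texttospeech.googleapis.com/v1/text:synthesize
interface SynthesizeRequest {
  input: { text: string };
  voice: { languageCode: string };
  audioConfig: { audioEncoding: "MP3" };
}

// Builds the request body for one response; sending it (with credentials)
// is left to the server-side code.
function buildSynthesizeRequest(text: string, language: Language): SynthesizeRequest {
  return {
    input: { text },
    voice: { languageCode: LANGUAGE_TAGS[language] },
    audioConfig: { audioEncoding: "MP3" },
  };
}
```

The service responds with base64-encoded audio in an `audioContent` field, which the client can decode and play back. Keeping one tag per language in a single map is one way to keep the recognition and synthesis sides in agreement about which language is active.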

Key Learnings

This project was a deep dive into the complexities of speech technologies. Integrating both STT and TTS for multiple languages was a significant technical hurdle. I learned how to design a Genkit flow that could handle intent recognition across different languages, which required thinking about language structure and common user queries. It was a powerful lesson in building inclusive technology.

Technology Stack

Next.js
React
ShadCN UI
Tailwind CSS
Genkit
Google Cloud TTS
Web Speech API (SpeechRecognition)
TypeScript