Welcome to the Voice Chat with PDFs project! Imagine talking to your PDFs and getting answers like a real conversation, powered by the magical combo of LlamaIndex and Next.js. It's like turning your PDFs into your personal assistants!
This project takes inspiration from the super cool openai-realtime-console, but we’ve spiced it up by adding a simple RAG (Retrieval-Augmented Generation) system — fancy talk for letting the model look up answers in your PDFs before it replies — built with LlamaIndexTS.
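To give a feel for the retrieve-then-answer idea behind RAG, here's a toy sketch in TypeScript. It scores each PDF chunk against your question by simple word overlap and picks the best match to hand to the model as context. This is just an illustration — the real project uses LlamaIndexTS embeddings and vector search, not word overlap, and all names here are made up:

```typescript
// Toy retrieve-then-answer (RAG) sketch. NOT the project's actual code:
// the real app ranks chunks by embedding similarity via LlamaIndexTS.
const chunks = [
  "Invoices are due within 30 days of receipt.",
  "The warranty covers manufacturing defects for two years.",
];

// Count question words (longer than 2 chars) that also appear in the chunk.
function overlapScore(question: string, chunk: string): number {
  const words = new Set(question.toLowerCase().split(/\W+/));
  return chunk
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 2 && words.has(w)).length;
}

// Pick the highest-scoring chunk; this is the "retrieval" step of RAG.
function retrieve(question: string): string {
  return chunks
    .map((c) => [overlapScore(question, c), c] as const)
    .sort((a, b) => b[0] - a[0])[0][1];
}

// The retrieved chunk would then be injected into the voice model's prompt.
const context = retrieve("How long does the warranty last?");
```

In the real app, swapping the overlap score for embedding similarity is exactly what LlamaIndexTS handles for you.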
- An OpenAI API Key – This is the magic wand that powers the whole thing. You can get one (user key or project key) and pop it into the `.env` file or set it as an environment variable (`OPENAI_API_KEY`). Trust me, it won’t work without it!
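A minimal `.env` sketch (the value below is just a placeholder — swap in your own key):

```
OPENAI_API_KEY=your-openai-api-key
```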
Run these magical lines to get all the things you need:
```shell
npm install
npm run generate
npm run dev
```
Open http://localhost:3000 in your browser and watch the magic happen! ✨
When you first start the app, it might ask you for the API key again (yeah, it’s a little annoying, we know…). Connect to start chatting, and don’t forget to give the app microphone access (so it can hear you!).

Choose between Push-to-talk (manual mode) or VAD (Voice Activity Detection) mode, where the app listens whenever you start talking. You can switch anytime, and you can even interrupt the AI mid-answer. You’re in charge!
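The two input modes boil down to one session setting: whether the server detects the end of your turn, or you do. Here's a small sketch of that switch. The type and function names are illustrative, not the app's actual code — in openai-realtime-console-style clients, this object is what gets passed as the session's `turn_detection` setting:

```typescript
// Sketch of the push-to-talk vs. VAD switch. Names are illustrative;
// only the shape of the turn_detection value reflects the Realtime API.
type InputMode = "push_to_talk" | "vad";

function turnDetectionFor(mode: InputMode): { type: "server_vad" } | null {
  // VAD mode: the server listens and decides when you've stopped talking.
  // Push-to-talk: turn detection is off (null), and the client commits
  // the recorded audio manually when you release the button.
  return mode === "vad" ? { type: "server_vad" } : null;
}

const vadSetting = turnDetectionFor("vad");
const pttSetting = turnDetectionFor("push_to_talk");
```

Switching modes at runtime is then just a matter of updating the session with the new `turn_detection` value.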
If you’re curious about LlamaIndex and want to level up your skills, check out these awesome resources: