Docai is a GPT-3 based Question Answering System that can provide answers based on PDF, DOCX, and TXT files.
Key Features • How To Use • Requirements • Copyright
- File handling
- The script supports PDF, DOCX, and TXT files
- It reads their content using pdfplumber, docx (from the python-docx package), and the built-in open() function, respectively
- GPT-3 integration
- The script uses the OpenAI GPT-3 model, specifically the text-davinci-003 engine, to generate answers to questions.
- Confidence scoring
- The script calculates confidence scores for the generated answers using log probabilities returned by the GPT-3 API.
- Concurrency
- It uses concurrent.futures.ThreadPoolExecutor to process questions concurrently, potentially speeding up the process.
- Text preprocessing
- The script splits the input document into chunks to fit within GPT-3's token limit, and post-processes the answers to remove duplicate sentences.
- Saving conversation history
- The script allows users to save the conversation history to a text file.
- Caching
- The script uses the lru_cache decorator to cache the answers generated by GPT-3. This way, if a user asks the same question again, the cached answer can be returned instead of making another API call.
- GUI
- The script provides a friendly graphical user interface built using the tkinter library and ttkthemes, allowing users to select a file, input a question, view the answer, and save the conversation history.
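The file handling feature could look roughly like the sketch below. The function name read_document is hypothetical and not taken from the script; it only illustrates dispatching on the file extension to pdfplumber, docx, and open(). The third-party imports are done lazily so that TXT files can be read without them installed.

```python
from pathlib import Path


def read_document(path: str) -> str:
    """Return the plain text of a PDF, DOCX, or TXT file (illustrative helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import pdfplumber  # third-party, imported only when needed

        with pdfplumber.open(path) as pdf:
            # extract_text() can return None for pages without a text layer
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    if suffix == ".docx":
        import docx  # provided by the python-docx package

        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    if suffix == ".txt":
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    raise ValueError(f"Unsupported file type: {suffix}")
```

Any other extension raises a ValueError, which a GUI could catch and show to the user.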
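The text preprocessing and confidence scoring features can be sketched as follows. Everything here is an assumption made for illustration: the function names, the word-based chunk size (a rough stand-in for a real token count), and the confidence formula (the exponential of the mean token log probability). The script may compute these differently.

```python
import math
import re
from typing import List


def split_into_chunks(text: str, max_words: int = 1500) -> List[str]:
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def remove_duplicate_sentences(answer: str) -> str:
    """Drop repeated sentences, keeping the first occurrence of each."""
    seen = set()
    kept = []
    # Split after sentence-ending punctuation followed by whitespace
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        key = sentence.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    return " ".join(kept)


def confidence_from_logprobs(token_logprobs: List[float]) -> float:
    """Map token log probabilities to a score in (0, 1]; 0.0 if the list is empty."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```

With this formula, an answer whose tokens each had probability 0.5 gets a confidence of 0.5, regardless of the answer's length.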
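Caching and concurrency can be combined along these lines. This is a minimal sketch, not the script's code: ask_model is a hypothetical placeholder for the GPT-3 request so that the example runs without an API key, and cached_answer and answer_all are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


def ask_model(question: str, context: str) -> str:
    """Hypothetical stand-in for the GPT-3 call; returns a dummy answer."""
    return f"(answer to {question!r} from {len(context)} characters of context)"


@lru_cache(maxsize=128)
def cached_answer(question: str, context: str) -> str:
    """Return the cached answer when the same question and context were seen before."""
    return ask_model(question, context)


def answer_all(questions, context, max_workers=4):
    """Answer several questions concurrently, preserving the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map() yields results in the order of the submitted questions
        return list(pool.map(lambda q: cached_answer(q, context), questions))
```

Note that lru_cache requires hashable arguments, so the question and context are passed as strings. If two threads ask the same uncached question at the same moment, the underlying call may still run twice before the result is stored.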
- Put your API key on line 45 of the script
- Run the script
- Select your file
- Enter your question and click submit
It's as simple as that.
Note: We will provide an executable version soon.
pip install openai pdfplumber python-docx ttkthemes
All rights reserved to Bropocalypse Team.