An OCR tool built on vision models supported by Ollama, such as Llama 3.2-Vision or MiniCPM-V 2.6. It recognizes text in images accurately while preserving the original formatting.
- 🚀 High-accuracy text recognition using the Llama 3.2-Vision or MiniCPM-V 2.6 model
- 📝 Preserves original text formatting and structure
- 🖼️ Supports multiple image formats: JPG, JPEG, PNG
- ⚡️ Customizable recognition prompts and models
- 🔍 Markdown output format option
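- 💪 Robust error handling with specific error codes for missing files, unsupported formats, Ollama server failures, and OCR processing failures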
Accurate text recognition on macOS: macos-vision-ocr.
- Node.js 18.0 or higher
- A locally running Ollama server
- Llama 3.2-Vision model installed
- Ensure the Ollama server is running before use
- Make sure the Llama 3.2-Vision model is downloaded
- Currently supported image formats: .jpg, .jpeg, .png
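Since only .jpg, .jpeg, and .png files are accepted, a caller may want to check the extension before starting an OCR call. The following is a minimal sketch: `isSupportedImage` is a hypothetical helper, not part of ollama-ocr, and it looks only at the file name, not the file contents. Treating extensions case-insensitively is an assumption here.

```javascript
// Hypothetical helper (not part of ollama-ocr): checks the file extension only.
const SUPPORTED_EXTENSIONS = [".jpg", ".jpeg", ".png"];

function isSupportedImage(filePath) {
  // Assumption: upper-case extensions such as ".PNG" are treated as supported.
  const lower = filePath.toLowerCase();
  return SUPPORTED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```

A check like this can run before the OCR call so that obviously unsupported inputs are skipped without contacting the Ollama server.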
npm install ollama-ocr
# or using pnpm
pnpm add ollama-ocr
import { ollamaOCR, DEFAULT_OCR_SYSTEM_PROMPT } from "ollama-ocr";
async function runOCR() {
  const text = await ollamaOCR({
    filePath: "./test/images/handwriting.jpg",
    systemPrompt: DEFAULT_OCR_SYSTEM_PROMPT,
  });
  console.log(text);
}
import { ollamaOCR, DEFAULT_MARKDOWN_SYSTEM_PROMPT } from "ollama-ocr";
async function runOCR() {
  const text = await ollamaOCR({
    filePath: "./test/images/trader-joes-receipt.jpg",
    systemPrompt: DEFAULT_MARKDOWN_SYSTEM_PROMPT,
  });
  console.log(text);
}
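When the Markdown prompt is used, the returned text is often saved next to the source image. As a rough sketch, the hypothetical helper below (not part of ollama-ocr) derives a `.md` path from an image path using plain string handling.

```javascript
// Hypothetical helper: maps an image path to a sibling Markdown path.
function markdownPathFor(imagePath) {
  // Strip the final extension (if there is one), then append ".md".
  return imagePath.replace(/\.[^./\\]+$/, "") + ".md";
}
```

The result could then be passed to a file write such as Node's `fs.promises.writeFile(markdownPathFor(filePath), text)`.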
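import { ollamaOCR, DEFAULT_OCR_SYSTEM_PROMPT } from "ollama-ocr";

async function runOCR() {
  const text = await ollamaOCR({
    // Use a different Ollama vision model instead of the default
    model: "minicpm-v",
    filePath: "./handwriting.jpg",
    systemPrompt: DEFAULT_OCR_SYSTEM_PROMPT,
  });
  console.log(text);
}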
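Because `model` is optional in the calls above, one way to switch models without editing code is to read the name from the environment. This is only a sketch: `buildOcrOptions` is a hypothetical helper and `OCR_MODEL` is an illustrative variable name, not something ollama-ocr reads on its own.

```javascript
// Hypothetical helper: builds the options object for ollamaOCR.
// OCR_MODEL is an assumed, illustrative environment variable name.
function buildOcrOptions(filePath, env = process.env) {
  const options = { filePath };
  const model = (env.OCR_MODEL ?? "").trim();
  if (model !== "") {
    options.model = model; // e.g. "minicpm-v"
  }
  // When no model is set, the key is omitted so the library default applies.
  return options;
}
```

The returned object can be passed directly, for example `await ollamaOCR(buildOcrOptions("./handwriting.jpg"))`.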
The tool provides comprehensive error handling:
import { ollamaOCR, LlamaOCRError, ErrorCode } from "ollama-ocr";
async function runOCR() {
  try {
    const text = await ollamaOCR({
      filePath: "./test/images/handwriting.jpg",
    });
    console.log(text);
  } catch (error) {
    if (error instanceof LlamaOCRError) {
      // Errors raised by the library carry a code that identifies the failure
      switch (error.code) {
        case ErrorCode.FILE_NOT_FOUND:
          console.error("Image file not found");
          break;
        case ErrorCode.UNSUPPORTED_FILE_TYPE:
          console.error("Unsupported image format");
          break;
        case ErrorCode.OLLAMA_SERVER_ERROR:
          console.error("Ollama server connection failed");
          break;
        case ErrorCode.OCR_PROCESSING_ERROR:
          console.error("OCR processing failed");
          break;
      }
    } else {
      // Anything else is unexpected, so rethrow instead of swallowing it
      throw error;
    }
  }
}
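The same code-to-message mapping can also be written as a lookup table with a fallback for codes that are not listed. This is a sketch rather than part of the library: `createErrorMessages` is a hypothetical helper that takes the `ErrorCode` object as a parameter, so it makes no assumption about the underlying enum values.

```javascript
// Hypothetical helper: builds a lookup from an ErrorCode-like object.
function createErrorMessages(ErrorCode) {
  const messages = new Map([
    [ErrorCode.FILE_NOT_FOUND, "Image file not found"],
    [ErrorCode.UNSUPPORTED_FILE_TYPE, "Unsupported image format"],
    [ErrorCode.OLLAMA_SERVER_ERROR, "Ollama server connection failed"],
    [ErrorCode.OCR_PROCESSING_ERROR, "OCR processing failed"],
  ]);
  // Returns a function that maps a code to a message, with a generic fallback.
  return (code) => messages.get(code) ?? `Unexpected OCR error (${String(code)})`;
}
```

Inside a `catch` block, this might be used as `const messageFor = createErrorMessages(ErrorCode);` followed by `console.error(messageFor(error.code));`.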
MIT