This repository contains code and instructions for fine-tuning Microsoft's SpeechT5 model for Hindi and English. The fine-tuned model is available on Hugging Face: https://huggingface.co/clayton07/speecht5_finetuned_hindi_mono
- Python 3.8+
- CUDA-compatible GPU (recommended)
- Git
- Jupyter Notebook
- 50GB+ disk space for datasets and models
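The prerequisites above can be checked before setup with a short script. This is a minimal sketch using only the Python standard library, not part of the repository. The function name `check_prerequisites` and the constants are hypothetical, and the 50 GB threshold is measured against free space on the drive holding the current directory. It does not check for a GPU, since that needs packages installed in a later step.

```python
import shutil
import sys

# Thresholds taken from the prerequisites list; the names are illustrative.
MIN_PYTHON = (3, 8)
MIN_FREE_GB = 50


def check_prerequisites(path="."):
    """Return a dict mapping each prerequisite to True (met) or False."""
    # Free space on the filesystem containing `path`, in decimal gigabytes.
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "python": sys.version_info >= MIN_PYTHON,
        # shutil.which returns None when the executable is not on PATH.
        "git": shutil.which("git") is not None,
        "jupyter": shutil.which("jupyter") is not None,
        "disk": free_gb >= MIN_FREE_GB,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If Jupyter is installed only inside the virtual environment, the `jupyter` entry will report as missing until that environment is activated.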
- Clone the repository:
git clone https://github.com/Claytonn7/speecht5-finetune.git
cd speecht5-finetune
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install required packages:
pip install -r requirements.txt
- Start Jupyter Notebook:
jupyter notebook
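Once the packages are installed, it can be useful to confirm whether a GPU is visible before opening the notebooks. The sketch below assumes that `requirements.txt` installs PyTorch, which this README does not state. The function name `describe_device` is hypothetical, and the code falls back gracefully if PyTorch is absent.

```python
def describe_device():
    """Return "cuda", "cpu", or "torch-missing" for the current environment."""
    try:
        # Assumption: PyTorch is provided by requirements.txt.
        import torch
    except ImportError:
        return "torch-missing"
    # torch.cuda.is_available() is True only when a usable CUDA device exists.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_device())
```

A result of `cpu` means the code can still run without a GPU, only more slowly, which matches the GPU being listed as recommended rather than required.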
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
This project is licensed under the MIT License; see the LICENSE file for details.
For any questions or issues, please open an issue in the repository.