This project shows how to build a knowledge graph for a movie dataset using Neo4j and the GROQ API. The dataset includes attributes such as movieId, title, release year, actors, director, genres, and IMDb rating. Combining a graph database with GROQ-backed querying supports semantic queries over the movie data, for example asking who directed a given film.
- Creation of a movie knowledge graph with interconnected nodes (movies, actors, genres, directors).
- Querying the knowledge graph using Neo4j's Cypher language.
- Integration with GROQ API for data augmentation and enhanced querying.
- Semantic relationships and insights into movie data trends.
- Neo4j: Graph database for storing and querying movie data.
- GROQ API: For additional data enrichment and querying capabilities.
- LangChain: Framework for intelligent chain-based querying with LLM integration.
- Python: For data preprocessing, API integration, and orchestration.
The dataset includes the following fields:
- movieId: Unique identifier for each movie.
- title: Name of the movie.
- released: Year of release.
- actors: List of main actors.
- director: Name of the director.
- genres: Categories the movie belongs to.
- imdbRating: IMDb rating of the movie.
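The fields above can be turned into typed records before loading them into the graph. The following is a minimal sketch, not the project's actual preprocessing code: the `Movie` class, the helper names, and the assumption that `actors` and `genres` arrive as pipe-separated strings are all illustrative, and the sample row is made up for demonstration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Movie:
    """One dataset row, using the field names listed in this README."""
    movieId: str
    title: str
    released: Optional[int]
    actors: List[str]
    director: str
    genres: List[str]
    imdbRating: Optional[float]


def split_list(value: str, sep: str = "|") -> List[str]:
    """Split a delimited cell into trimmed, non-empty items.

    The "|" separator is an assumption about the raw file format.
    """
    return [item.strip() for item in value.split(sep) if item.strip()]


def parse_movie(row: Dict[str, str]) -> Movie:
    """Convert one raw row (all values as strings) into a typed Movie."""
    released = row.get("released", "").strip()
    rating = row.get("imdbRating", "").strip()
    return Movie(
        movieId=row["movieId"].strip(),
        title=row["title"].strip(),
        # Keep only the leading four-digit year, so "1995" and "1995-11-22" both work.
        released=int(released[:4]) if released else None,
        actors=split_list(row.get("actors", "")),
        director=row.get("director", "").strip(),
        genres=split_list(row.get("genres", "")),
        imdbRating=float(rating) if rating else None,
    )


# Illustrative sample row, not taken from the project's dataset file.
sample_row = {
    "movieId": "10",
    "title": "GoldenEye",
    "released": "1995",
    "actors": "Pierce Brosnan|Sean Bean",
    "director": "Martin Campbell",
    "genres": "Action|Adventure|Thriller",
    "imdbRating": "7.2",
}
movie = parse_movie(sample_row)
print(movie.title, movie.released, movie.actors)
```

Keeping `actors` and `genres` as lists makes it straightforward to create one node per actor or genre later, rather than storing a single delimited string on the movie node.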
- Neo4j AuraDB: Set up a free instance on Neo4j AuraDB.
- GROQ API Key: Obtain an API key from the GROQ platform.
- Python 3.8+: Install Python on your system.
- Clone the repository:
git clone https://github.com/SAKTHIVINASH2/Movie-Knowledge-Graph-using-Neo4j-and-GROQ-API.git
cd Movie-Knowledge-Graph-using-Neo4j-and-GROQ-API
- Install dependencies:
pip install -r requirements.txt
- Add your API keys:
  - Update the config.py file with your Neo4j connection details and GROQ API key.
- Load the dataset into Neo4j:
  - Use Cypher queries or the provided scripts to populate the database with movie data.
- Start your Neo4j database instance.
- Run the Python script:
python src/main.py
- Query the knowledge graph using:
  - The Neo4j Browser.
  - Predefined GROQ-powered functions in the script.
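As an illustration of the "load the dataset into Neo4j" step, the sketch below writes movie rows with a single parameterized Cypher statement. It is a hedged example, not the repository's loader: the node labels (`Movie`, `Person`, `Genre`), the relationship types (`DIRECTED`, `ACTED_IN`, `IN_GENRE`), and the `load_movies` helper are assumptions chosen for this example. The function accepts any object with a `run(query, parameters)` method, which matches how a session from the official `neo4j` Python driver is called.

```python
from typing import Any, Dict, Iterable, Protocol

# Assumed graph model (not confirmed by the project):
# (:Person)-[:DIRECTED]->(:Movie), (:Person)-[:ACTED_IN]->(:Movie),
# (:Movie)-[:IN_GENRE]->(:Genre).
# MERGE keeps the load idempotent: re-running it does not duplicate nodes.
MERGE_MOVIE = """
MERGE (m:Movie {movieId: $movieId})
SET m.title = $title, m.released = $released, m.imdbRating = $imdbRating
MERGE (d:Person {name: $director})
MERGE (d)-[:DIRECTED]->(m)
FOREACH (name IN $actors |
  MERGE (a:Person {name: name})
  MERGE (a)-[:ACTED_IN]->(m))
FOREACH (genre IN $genres |
  MERGE (g:Genre {name: genre})
  MERGE (m)-[:IN_GENRE]->(g))
"""


class SessionLike(Protocol):
    """Anything exposing run(query, parameters), such as a neo4j driver session."""

    def run(self, query: str, parameters: Dict[str, Any]) -> Any: ...


def load_movies(session: SessionLike, movies: Iterable[Dict[str, Any]]) -> int:
    """Write each movie dict to the graph and return how many were sent.

    Each dict needs the keys used as parameters in MERGE_MOVIE, with
    `actors` and `genres` given as lists of strings.
    """
    count = 0
    for movie in movies:
        session.run(MERGE_MOVIE, movie)
        count += 1
    return count


# With the official driver installed, usage would look roughly like this
# (uri, user, and password are placeholders for your AuraDB details):
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver(uri, auth=(user, password))
#   with driver.session() as session:
#       load_movies(session, rows)
#   driver.close()
```

Note that `MERGE` fails on a null property value, so rows with a missing director would need to be filtered or given a placeholder before loading.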
To find the director of the movie GoldenEye:
# Assumes `chain` is the question-answering chain set up in main.py.
response = chain.invoke({"query": "Who was the director of the movie GoldenEye?"})
print(response)
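The example above relies on a `chain` object that is not shown here. One plausible way to build such a chain with LangChain is sketched below; this is an assumption about the setup, not the project's confirmed code. The environment variable names and the helpers `load_settings`, `build_chain`, and `extract_answer` are hypothetical, the model id must be supplied by you, and LangChain import paths and required flags have changed between releases, so adjust them to the versions in requirements.txt.

```python
import os
from typing import Any, Dict, Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read connection details from environment variables.

    The variable names are assumptions; the project itself stores these
    values in config.py.
    """
    names = ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "GROQ_API_KEY"]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in names}


def build_chain(settings: Mapping[str, str], model: str) -> Any:
    """Create a natural-language-to-Cypher question-answering chain.

    `model` is the id of a model available to your GROQ account.
    """
    # Imported lazily so the other helpers work without these packages.
    # These paths match some langchain-community releases; others differ.
    from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain
    from langchain_community.graphs import Neo4jGraph
    from langchain_groq import ChatGroq

    graph = Neo4jGraph(
        url=settings["NEO4J_URI"],
        username=settings["NEO4J_USERNAME"],
        password=settings["NEO4J_PASSWORD"],
    )
    llm = ChatGroq(groq_api_key=settings["GROQ_API_KEY"], model_name=model)
    return GraphCypherQAChain.from_llm(
        llm=llm,
        graph=graph,
        verbose=True,
        # Recent releases require this opt-in because the chain executes
        # LLM-generated Cypher; older releases may not accept the argument.
        allow_dangerous_requests=True,
    )


def extract_answer(response: Mapping[str, Any]) -> str:
    """Return the answer text from the chain's response dictionary."""
    return str(response.get("result", ""))


# Usage, once the packages are installed and the variables are set:
#
#   chain = build_chain(load_settings(), model="<your-model-id>")
#   response = chain.invoke({"query": "Who was the director of the movie GoldenEye?"})
#   print(extract_answer(response))
```

Because the chain runs generated Cypher against the database, connecting with a read-only Neo4j user is a sensible precaution.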
├── data/
│   ├── link.txt                                # Link to the movie dataset
│   └── README.md                               # Dataset documentation
├── result/
│   ├── neo4j_query_table_data_2024-11-22.csv   # Sample result CSV file
│   └── visualisation (1).png                   # Sample visualization image
├── src/
│   └── main.py                                 # Main script to run the project
│                                               # Neo4j graph initialization and utilities
│                                               # GROQ API integration functions
├── requirements.txt                            # Required Python libraries
└── README.md                                   # Project documentation
- Add recommendation system capabilities using graph traversal.
- Visualize the graph for better exploration of relationships.
- Expand the dataset to include additional attributes like box office data or reviews.
This project is licensed under the MIT License.
For any queries or suggestions, feel free to reach out at sakthivinashb@example.com.