2025 · Completed · Final Year Project

Book Recommendation System

Indian literature deserves better recommendation algorithms.

Python · ML/DL · Pandas · NumPy · Collaborative Filtering

What

A recommendation engine focused on Indian literature that benchmarks multiple ML and DL algorithms (collaborative filtering, content-based filtering, and hybrid models) to identify the most effective strategy for a culturally niche, sparse dataset.

Why

Most recommendation systems are trained on Western literature datasets. Indian literature — across languages, regions, and genres — is severely underrepresented. I wanted to understand how standard recommendation algorithms degrade when applied to sparse, niche datasets, and whether hybrid approaches could compensate.

How I built it

Sourced and cleaned a dataset of Indian literature titles, authors, genres, and reader ratings.

Performed exploratory data analysis to understand sparsity, rating distribution, and genre imbalance.

Implemented collaborative filtering (user-based and item-based), content-based filtering (TF-IDF on metadata), and a hybrid model combining both signals.

Benchmarked all approaches using RMSE, precision@k, and recall@k on a held-out test set.

Documented findings in a structured report comparing algorithm performance trade-offs for sparse cultural datasets.
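The steps above can be illustrated with a minimal sketch. It is not the project's actual code: the toy `ratings` matrix and the helper names (`sparsity`, `item_similarity`, `recommend`, `precision_recall_at_k`, `rmse`) are hypothetical, and it assumes unrated cells are stored as 0. It shows how sparsity can be measured, how an item-based collaborative filter can rank unrated books, and how the three benchmark metrics are computed.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are books, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 0],
    [4, 0, 0, 1],
    [0, 5, 4, 0],
    [1, 0, 0, 5],
], dtype=float)


def sparsity(matrix):
    """Fraction of user-book cells that hold no rating."""
    return 1.0 - np.count_nonzero(matrix) / matrix.size


def item_similarity(matrix):
    """Cosine similarity between book columns (unrated cells count as 0)."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for books with no ratings
    normalized = matrix / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)  # a book should not recommend itself
    return sim


def recommend(matrix, sim, user, k):
    """Rank a user's unrated books by similarity-weighted ratings."""
    scores = matrix[user] @ sim
    scores[matrix[user] > 0] = -np.inf  # exclude books the user already rated
    return [int(i) for i in np.argsort(-scores)[:k]]


def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user's ranked recommendations."""
    hits = len(set(recommended[:k]) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def rmse(predicted, actual):
    """Root mean squared error between predicted and held-out ratings."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


sim = item_similarity(ratings)
top = recommend(ratings, sim, user=0, k=2)
print("sparsity:", sparsity(ratings))
print("top-2 for user 0:", top)
print("precision/recall@2:", precision_recall_at_k(top, relevant={2}, k=2))
```

On a real dataset the similarity matrix would be built from the training split only, and the metrics averaged over all users in the held-out test set.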

Challenges

Dataset sparsity was severe — most books had very few ratings, which breaks standard collaborative filtering assumptions.

Content metadata (genre, themes, language) was inconsistent and required significant manual cleaning.

Hybrid weighting between collaborative and content signals required extensive tuning.
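One common way to combine the two signals is a weighted sum with a single mixing weight, tuned by a simple sweep on validation data. The sketch below assumes that approach; the source does not say which blending scheme the project used, and the names `min_max`, `hybrid_scores`, `tune_alpha`, and the example scores are hypothetical.

```python
import numpy as np


def min_max(scores):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span


def hybrid_scores(cf_scores, content_scores, alpha):
    """Weighted blend: alpha on collaborative, (1 - alpha) on content."""
    return alpha * min_max(cf_scores) + (1 - alpha) * min_max(content_scores)


def tune_alpha(cf_scores, content_scores, relevant, k, alphas):
    """Return the first alpha with the best precision@k, and that precision."""
    best_alpha, best_precision = None, -1.0
    for alpha in alphas:
        blended = hybrid_scores(cf_scores, content_scores, alpha)
        top_k = np.argsort(-blended)[:k].tolist()
        precision = len(set(top_k) & set(relevant)) / k
        if precision > best_precision:
            best_alpha, best_precision = alpha, precision
    return best_alpha, best_precision


# Hypothetical scores for four books: the collaborative signal is missing
# for books 2 and 3 (too few ratings), while content similarity still ranks them.
cf = [0.9, 0.1, 0.0, 0.0]
content = [0.2, 0.3, 0.8, 0.7]
best_alpha, best_precision = tune_alpha(
    cf, content, relevant={0, 2}, k=2, alphas=[0.0, 0.25, 0.5, 0.75, 1.0]
)
print("best alpha:", best_alpha, "precision@2:", best_precision)
```

Normalizing each signal before blending keeps one score scale from dominating the other. In practice the sweep would average the metric over many validation users rather than a single example.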

Outcome

The hybrid model outperformed both pure approaches on sparse subsets. The project contributed a structured analysis of recommendation algorithm performance on underrepresented cultural datasets — submitted as the MCA final year project.

Tech Stack

Language

Python

Collaborative Filtering · Content-Based Filtering · Hybrid Models · Scikit-Learn

Data

Pandas · NumPy · TF-IDF
