This project was realised in my first year at Telecom Physique Strasbourg. It is based on the Netflix Prize dataset.

ajayat/netflix-prize

Programming Project

This project was realised by Adrien JAYAT and Ysée JACQUET.

It consists of a recommendation system for movies based on the Netflix Prize dataset published by Netflix in 2006.

Our algorithm gives an RMSE of 0.971 on the test set. In comparison, the RMSE of Cinematch, the Netflix algorithm, was 0.9525.

Get Started

This project uses two submodules.

First, clone the project and its submodules:

git clone https://gitlab.unistra.fr/jayat/projet-programmation-2023
cd projet-programmation-2023
git submodule init && git submodule update

Dependencies

Install make, gcc, doxygen and the Zstandard compression tool (zstd):

Ubuntu

sudo apt update
sudo apt install make gcc doxygen zstd

Arch Linux

sudo pacman -Syu
sudo pacman -S make gcc doxygen zstd

Build

Build the project with the make command. It will generate the main executable in the current working directory.

You can run all tests with the make tests command:

make clean && make tests

How to use

After building the project, you can run the ./main executable to start the program. Use the -h option to print the following list of options in your terminal.

Options

Flag Argument Description
-f FORCE Force recomputation of all stats.
-r LIKES_FILE List of movies liked by the user.
-n N Length of the recommendation list returned by the algorithm.
-d DIRECTORY Path of the folder where result files will be saved.
-l LIMIT Ignore ratings with a date later than LIMIT.
-s MOVIE_ID Give statistics about the movie with the identifier MOVIE_ID.
-c IDS Take into account only the ratings of the customers with the given identifiers.
NB_CUSTOMER_IDS Number of customer ids given.
-b IDS Exclude the ratings of the customers with the given identifiers.
NB_BAD_REVIEWERS Number of bad reviewers given.
-e MIN Take into account only customers who rated at least MIN movies.
-t TIME Specify the execution time of the algorithm.
-p PERCENT Weight (between 0 and 1) quantifying the importance of personal recommendations over popular recommendations.

Note that options -r, -n, -t and -p are not used for statistics processing.

Examples

Gives the 10 best recommendations from liked movies

./main -r likes.txt -n 10

The likes.txt file contains titles of movies liked by the user. You can also give a list of movie ids directly from the command line.

./main -r 872 996 463 5582 -n 10 -p 0.8

Add the -p option to give a percentage to quantify the importance of personalized recommendations over popularity.
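One plausible reading of the -p weight is a linear blend of two scores. The sketch below illustrates that reading only. The function `blend_score` and the idea that each movie has a separate personalized score and popularity score are assumptions, and the project may combine them differently.

```c
#include <assert.h>

/* Hypothetical linear blend (an assumption, not the project's code):
 * p = 1 uses only the personalized score,
 * p = 0 uses only the popularity score. */
double blend_score(double personal, double popular, double p)
{
    assert(p >= 0.0 && p <= 1.0); /* -p is documented as between 0 and 1 */
    return p * personal + (1.0 - p) * popular;
}
```

Under this assumption, -p 0.8 would let the personalized score contribute 80% of the final ranking score and popularity the remaining 20%.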

Gives statistics about a movie

./main -s 42

It will create a file named stats_mv_000042.csv in the stats folder, containing the minimum, maximum and average score of movie 42.
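The file name in this example suggests that the movie identifier is zero-padded to six digits. A minimal sketch of building such a name follows. The helper `stats_file_name` is hypothetical, and the six-digit width is inferred from this single example rather than from the project's source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "stats_mv_<id padded to 6 digits>.csv"
 * into buf. Returns the number of characters written (excluding the
 * terminating null byte), or -1 if buf is too small. */
int stats_file_name(char *buf, size_t size, int movie_id)
{
    assert(movie_id >= 0);
    int n = snprintf(buf, size, "stats_mv_%06d.csv", movie_id);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling `stats_file_name(buf, sizeof buf, 42)` with a large enough buffer fills `buf` with `stats_mv_000042.csv`.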
