Context
Introduction
In 2006, Netflix released an anonymized dataset of movie ratings from approximately 500,000 subscribers as part of the Netflix Prize, a public machine learning competition. Researchers Arvind Narayanan and Vitaly Shmatikov demonstrated that by cross-referencing this dataset with non-anonymous IMDb user reviews, they could re-identify specific individuals and uncover their personal viewing histories.
Role
The learner acts as a data scientist within Netflix’s recommendation team.
Business Objectives
The task is to develop models that accurately predict movie ratings to optimize content recommendations.
The learner’s experience with collaborative filtering and machine learning equips them to tackle this prediction challenge.
Products
The product is a predictive model, or set of models, that generates accurate rating predictions for unseen user-movie pairs.
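As a rough illustration of what such a product might look like, the sketch below fits a simple baseline predictor (global mean plus per-user and per-movie offsets) and scores it with RMSE. It is a minimal sketch under stated assumptions, not the method used by Netflix or any competition entrant: the toy ratings, the 1-5 rating scale, and the names fit_baseline, predict, and rmse are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: (user_id, movie_id, rating), assuming a 1-5 scale.
ratings = [
    ("u1", "m1", 5), ("u1", "m2", 3),
    ("u2", "m1", 4), ("u2", "m3", 2),
    ("u3", "m2", 4), ("u3", "m3", 1),
]


def fit_baseline(ratings):
    """Estimate the global mean, then per-user and per-movie offsets."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # User offset: average deviation of a user's ratings from the global mean.
    user_res = defaultdict(list)
    for u, _, r in ratings:
        user_res[u].append(r - mu)
    b_user = {u: sum(v) / len(v) for u, v in user_res.items()}

    # Movie offset: average deviation left over after removing the user offset.
    movie_res = defaultdict(list)
    for u, m, r in ratings:
        movie_res[m].append(r - mu - b_user[u])
    b_movie = {m: sum(v) / len(v) for m, v in movie_res.items()}

    return mu, b_user, b_movie


def predict(model, user, movie):
    """Predict a rating; unknown users or movies fall back to a zero offset."""
    mu, b_user, b_movie = model
    raw = mu + b_user.get(user, 0.0) + b_movie.get(movie, 0.0)
    return min(5.0, max(1.0, raw))  # clip to the assumed 1-5 scale


def rmse(model, pairs):
    """Root mean squared error over (user, movie, rating) triples."""
    errors = [(predict(model, u, m) - r) ** 2 for u, m, r in pairs]
    return sqrt(sum(errors) / len(errors))


model = fit_baseline(ratings)
print(predict(model, "u1", "m3"))  # a user-movie pair absent from the toy data
print(rmse(model, ratings))        # training error only, not a held-out score
```

A baseline like this is typically a starting point that collaborative filtering models (for example, neighborhood or matrix factorization methods) are compared against. A meaningful evaluation would compute RMSE on held-out ratings rather than on the training data shown here.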
Codebook
Dataset
License
Not Provided
Tags
Data Provenance