Abstract:
This study analyzes several data mining approaches to explore the attributes that affect the
success or failure of a movie and to develop a movie rating. Four data mining algorithms and an
ensemble approach are considered for movie success prediction, and the correlation between a
movie’s success or failure and its attributes is demonstrated. Several evaluation criteria are
used to assess the prediction performance of these models.
Further, a spatial clustering technique, the Associated Keyword Space (ASKS), was applied.
Similarities between movies were calculated using cosine similarity, and these affinity values
were used in the clustering model. Movies were grouped by degree of success into four clusters:
Most Successful Movies, Successful Movies, Unsuccessful Movies and Least Successful Movies. The
attributes with the greatest influence on the success or failure of a movie were identified.
Movie makers can use these results to decide which attributes to prioritize in their future
productions. Also, using the correlation coefficient, a mathematical model that predicts a
movie’s success or failure is proposed, and a movie rating (on a scale of 1 to 10) is developed.
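
As an illustration of the similarity step mentioned above, the following is a minimal sketch, assuming each movie is represented by a numeric attribute vector. The movie names, the vectors, and the helper names `cosine_similarity` and `affinity_matrix` are hypothetical and are not taken from the study. The ASKS clustering itself is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def affinity_matrix(vectors):
    """Pairwise cosine similarities for a list of attribute vectors."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical attribute vectors (illustrative values only, not study data).
movies = {
    "Movie A": [1.0, 0.0, 2.0],
    "Movie B": [2.0, 0.0, 4.0],  # same direction as Movie A
    "Movie C": [0.0, 3.0, 0.0],  # orthogonal to Movie A
}

names = list(movies)
affinity = affinity_matrix([movies[n] for n in names])

for name, row in zip(names, affinity):
    print(name, [round(value, 3) for value in row])
```

A matrix of this kind could serve as the affinity input to a clustering step. How the study actually encoded movie attributes is not stated in the abstract.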