Turkish Sentiment Analysis Dataset

Please fill out the form to download the datasets.

This dataset was created for the paper 'SentiWordNet for New Language: Automatic Translation Approach', Ucan et al., SITIS 2016.
Please cite the paper if you want to use it.
It contains sentences labelled with positive or negative sentiment.

Dataset Metadata

property value
name Turkish Sentiment Analysis Dataset
description We selected two of the most popular movie and hotel recommendation websites among those with a high Alexa ranking: “beyazperde.com” for movie reviews and “otelpuan.com” for hotel reviews. Reviews of 5,660 movies were examined. All 220,000 extracted reviews had already been rated by their authors on a scale of 1 to 5 stars. Because most of the reviews were positive, we selected as many positive reviews as negative ones to obtain a balanced set. There were 26,700 negative reviews rated 1 or 2 stars, so we randomly selected 26,700 of the 130,210 positive reviews rated 4 or 5 stars. Overall, 53,400 movie reviews with an average length of 33 words were selected. The same procedure was applied to hotel reviews, except that hotel reviews were rated on a scale from 0 to 100 instead of stars. From 18,478 reviews extracted from 550 hotels, a balanced set of positive and negative reviews was selected. As there were only 5,802 negative hotel reviews rated from 0 to 40, we selected 5,800 of the 6,499 positive reviews rated from 80 to 100. The average length of the 11,600 selected positive and negative hotel reviews was 74 words, more than twice that of the movie reviews.
sameAs http://humirapps.cs.hacettepe.edu.tr/tsad.aspx
citation http://ieeexplore.ieee.org/document/7907484/
provider
property value
name Hacettepe University Multimedia Information Retrieval Laboratory
sameAs http://humir.cs.hacettepe.edu.tr/
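
The balancing procedure in the description above (label reviews by rating threshold, then randomly downsample so both classes are the same size) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name balance_reviews, its parameters, and the (text, rating) input layout are all assumptions made for the example.

```python
import random


def balance_reviews(reviews, neg_max, pos_min, per_class=None, seed=0):
    """Label (text, rating) pairs by threshold and return a balanced sample.

    Hypothetical helper: ratings <= neg_max count as negative, ratings
    >= pos_min count as positive, and anything in between is dropped.
    """
    negatives = [text for text, rating in reviews if rating <= neg_max]
    positives = [text for text, rating in reviews if rating >= pos_min]

    # Default to the size of the smaller class unless a size is given.
    n = min(len(negatives), len(positives)) if per_class is None else per_class

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled_neg = rng.sample(negatives, n)
    sampled_pos = rng.sample(positives, n)

    return ([(text, "negative") for text in sampled_neg]
            + [(text, "positive") for text in sampled_pos])


# Toy data standing in for star-rated movie reviews (not real reviews).
toy_reviews = [("a", 1), ("b", 2), ("c", 3), ("d", 4),
               ("e", 5), ("f", 5), ("g", 4)]
balanced = balance_reviews(toy_reviews, neg_max=2, pos_min=4)
```

Under the thresholds given in the description, the movie reviews would correspond to neg_max=2 and pos_min=4, and the hotel reviews to neg_max=40 and pos_min=80 with per_class=5800.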