EXPERT SYSTEMS WITH APPLICATIONS, vol.42, pp.5256-5263, 2015 (SCI-Expanded)
Abstract
Microblogs are among the most popular social networking platforms, where users share their opinions, daily activities, interests, and other content. Because microblogs generally reflect their users’ interests, a user’s fields of interest can be extracted from the posted content. In this study, we group microblog users as normal or bot according to the content they supply and, essentially through content mining, evaluate how well each group reflects its categories with fresh entries. Traditional content mining studies do not evaluate whether user entries are up to date. Unlike similar studies, we check the up-to-dateness of users’ content by retrieving user entries and RSS news feeds simultaneously. If a term in the user content is absent from the feature set formed from the RSS news feeds, it is not regarded as a feature for checking the freshness of the content. For each user group, we divide users into predefined categories and inspect how well the users in the group post relevant entries, while also checking the up-to-dateness of their content. Our experimental results show that bot users always post fresher and more category-relevant entries. Finally, we visualize the categorization performance of each user group’s entries with Cobweb. The Cobweb presentation reveals the miscategorization tendencies of the user groups.
Keywords
Microblog categorization, Short text classification, Social media, Twitter
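
The freshness check described in the abstract (a term counts as a feature only if it also appears in the feature set built from RSS news feeds) can be illustrated with a minimal sketch. This is not the authors' implementation: the regex tokenizer, the function names, and the `freshness_ratio` score are all assumptions introduced here for illustration, since the abstract does not specify how terms are extracted or how freshness is scored.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphabetic terms.

    Assumption: a simple regex tokenizer; the abstract does not
    describe the preprocessing that was actually used.
    """
    return re.findall(r"[a-z]+", text.lower())


def build_feature_set(rss_items):
    """Collect every term that occurs in the given RSS news items."""
    features = set()
    for item in rss_items:
        features.update(tokenize(item))
    return features


def fresh_terms(entry, features):
    """Keep only the entry terms that are present in the RSS feature set.

    Terms absent from the feature set are dropped, mirroring the rule
    stated in the abstract.
    """
    return [term for term in tokenize(entry) if term in features]


def freshness_ratio(entry, features):
    """Hypothetical score: share of entry terms found in the RSS feature set."""
    terms = tokenize(entry)
    if not terms:
        return 0.0
    return len(fresh_terms(entry, features)) / len(terms)
```

For example, with a feature set built from the two made-up headlines "Central bank raises interest rates" and "Election results announced", the entry "Bank raises rates again!" keeps the terms `bank`, `raises`, and `rates`, while `again` is discarded because it does not occur in the feeds.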