A Self-Pruning Classification Model for News
Date
2019
Language
en
Keyword
Abstract
News aggregators are online services that collect articles from numerous reputable media outlets and news providers and reorganize them so that users can more easily reach the information they seek. One of the most important tools they offer is the classification of articles into a fixed set of categories. In this article, we introduce a supervised classification method for news articles that analyzes their titles and constructs multiple types of tokens, including single words and n-grams of variable sizes. It then employs several statistics, such as frequencies and token-class correlations, to assign two importance scores to each token. These scores reflect the ambiguity of a token, namely, how significant it is for assigning an article to a category. The tokens and their scores are stored in a support structure that is subsequently used to classify unlabeled articles. In addition, we propose a dimensionality reduction approach that shrinks the model without significantly degrading its classification performance. The algorithm is evaluated experimentally on a popular dataset of news articles and is found to outperform standard classification methods. © 2019 IEEE.
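The pipeline outlined in the abstract (title tokenization into words and n-grams, per-token scoring, a lookup structure, and pruning) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the two importance scores, so the sketch substitutes a single assumed score, the share of a token's occurrences that fall in a category, damped by the logarithm of the token's frequency. The function names (`tokens`, `train`, `prune`, `classify`), the `max_n` and `min_score` parameters, and the threshold-based pruning rule are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def tokens(title, max_n=3):
    """Return single words plus word n-grams (n = 2..max_n) of a title."""
    words = title.lower().split()
    out = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            out.append(" ".join(words[i:i + n]))
    return out


def train(labeled_titles, max_n=3):
    """Build {token: {category: score}} from (title, category) pairs."""
    counts = defaultdict(Counter)  # token -> category -> number of titles
    for title, category in labeled_titles:
        for tok in set(tokens(title, max_n)):
            counts[tok][category] += 1
    model = {}
    for tok, per_cat in counts.items():
        total = sum(per_cat.values())
        # Assumed score (not from the paper): fraction of the token's
        # occurrences in this category, damped by log(1 + frequency).
        model[tok] = {c: (f / total) * math.log(1 + total)
                      for c, f in per_cat.items()}
    return model


def prune(model, min_score):
    """Illustrative pruning: keep tokens whose best score reaches min_score."""
    return {t: s for t, s in model.items() if max(s.values()) >= min_score}


def classify(model, title, max_n=3):
    """Sum the scores of known tokens per category; None if none is known."""
    votes = Counter()
    for tok in tokens(title, max_n):
        for category, score in model.get(tok, {}).items():
            votes[category] += score
    return votes.most_common(1)[0][0] if votes else None


# Toy data, invented for illustration only.
data = [
    ("stocks fall as markets tumble", "business"),
    ("markets rally on earnings", "business"),
    ("team wins championship final", "sports"),
    ("star striker wins award", "sports"),
]
model = train(data)
small = prune(model, min_score=1.0)
label = classify(small, "markets tumble again")
```

On this toy data, pruning discards every token seen in only one title, and the smaller model still assigns the query title to the same category as the full one. This mirrors, in a very simplified way, the trade-off between model size and accuracy that the abstract mentions.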
Collections
Related items
Showing items related by title, author, creator and subject.
- Fuzzy Cognitive Maps for Interpretable Image-based Classification
  Sovatzidi G., Vasilakakis M.D., Iakovidis D.K. (2022) Image classification is a fundamental component of intelligent vision systems. Developing classifiers capable of explaining how or why a classification result occurs, in a way compatible with human perception, remains a ...
- A comparable study employing weka clustering/classification algorithms for web page classification
  Charalampopoulos, I.; Anagnostopoulos, I. (2011) Documents and web pages share many similarities. Thus classification methods used in documents can be applied to advanced web content, with or even without modifications. Algorithms for document and web classification are ...
- Examining travelers "optimal strategies" in transit trip choices, applying a classification tree approach on transit quality of service indicators
  Tsami, M. T.; Nathanail, E. G. (2014) According to Spiess and Florian (1989) [1], at each transfer point passengers may shape the "optimal" for them strategy taking into account minimum generalized travel cost. The generalized cost of trip choices is formulated ...