University of Thessaly Institutional Repository

Crowd Sourcing as an Improvement of N-Grams Text Document Classification Algorithm

Author
Saloun P., Andrsic D., Cigankova B., Anagnostopoulos I.
Date
2020
Language
en
DOI
10.1109/SMAP49528.2020.9248454
Keywords
Computational linguistics
Crowdsourcing
Decision trees
Information retrieval systems
Natural language processing systems
Nearest neighbor search
Neural networks
Semantics
Social networking (online)
Statistical tests
Support vector machines
Text processing
Automated classification
Classification accuracy
Design and implements
Improvement mechanism
K-nearest neighbours
Natural language processing
Real-world implementation
Text document classifications
Classification (of information)
Institute of Electrical and Electronics Engineers Inc.
Abstract
A common task in natural language processing is text classification, which is useful for, e.g., spam filtering, document sorting, classification of scientific articles, or plagiarism detection. Humans still perform this task best and most accurately; on the other hand, we can often accept a certain classification error in exchange for speed. Here, a natural language processing mechanism transforms natural-language text into a form understandable by a classifier such as K-Nearest Neighbours, Decision Trees, an Artificial Neural Network, or Support Vector Machines. We can also use this human element to improve the accuracy of automated classification by means of crowdsourcing. This work deals with the classification of text documents and its improvement through crowdsourcing. Its goal is to design and implement a prototype text document classifier based on document similarity, and to design a mechanism for evaluation and crowdsourcing-based classification improvement. The N-grams algorithm was chosen for classification and implemented in Java. The crowdsourcing interface was created using the WordPress CMS. In addition to data collection, the purpose of the interface is to evaluate classification accuracy, which leads to an extension of the classifier's test data set and thus to more successful classification. We have tested our approach on two data sets, with promising preliminary results even across different languages. This led to a real-world implementation started at the beginning of 2019 in cooperation between two universities: VŠB-TUO and OSU. © 2020 IEEE.
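
Illustrative sketch (not part of the record): the abstract describes a similarity-based N-gram classifier whose reference data grows through crowd feedback. The Python sketch below shows one common way such a classifier can work, under stated assumptions. The authors' prototype was written in Java and its details are not given here, so everything in the sketch is hypothetical: the use of character trigrams, cosine similarity as the similarity measure, the names `char_ngrams`, `cosine_similarity`, and `NGramClassifier`, and the sample texts. The `add_example` method stands in for the step where a crowd-confirmed label extends the classifier's data set.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class NGramClassifier:
    """Assigns a text to the class whose aggregated n-gram profile is most similar."""

    def __init__(self, n=3):
        self.n = n
        self.profiles = {}  # class label -> aggregated n-gram counts

    def add_example(self, label, text):
        """Add a labeled document, e.g. one whose label was confirmed by the crowd."""
        self.profiles.setdefault(label, Counter()).update(char_ngrams(text, self.n))

    def classify(self, text):
        """Return the label of the most similar class profile."""
        if not self.profiles:
            raise ValueError("classifier has no labeled examples")
        query = char_ngrams(text, self.n)
        return max(
            self.profiles,
            key=lambda label: cosine_similarity(query, self.profiles[label]),
        )


# Hypothetical usage with made-up sample texts.
clf = NGramClassifier(n=3)
clf.add_example("spam", "win a free prize now, click here to claim your free money")
clf.add_example("science", "the experiment measured classification accuracy on two data sets")
print(clf.classify("claim your free prize"))

# A crowd-confirmed label is fed back, extending the reference data.
clf.add_example("science", "we evaluate a decision tree and a support vector machine")
```

Character n-grams need no language-specific tokenizer, which is one plausible reason an approach of this kind can carry over across languages; whether the authors used character or word n-grams is not stated in the abstract.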
URI
http://hdl.handle.net/11615/78741
Collections
  • Publications in journals, conferences, book chapters, etc. [19735]

Related items

Showing items related by title, author, creator and subject.

  • Fuzzy Cognitive Maps for Interpretable Image-based Classification

    Sovatzidi G., Vasilakakis M.D., Iakovidis D.K. (2022)
    Image classification is a fundamental component of intelligent vision systems. Developing classifiers capable of explaining how or why a classification result occurs, in a way compatible with human perception, remains a ...
  • A comparable study employing weka clustering/classification algorithms for web page classification

    Charalampopoulos, I.; Anagnostopoulos, I. (2011)
    Documents and web pages share many similarities. Thus classification methods used in documents can be applied to advanced web content, with or even without modifications. Algorithms for document and web classification are ...
  • A Self-Pruning Classification Model for News

    Akritidis L., Fevgas A., Bozanis P., Alamaniotis M. (2019)
    News aggregators are on-line services that collect articles from numerous reputable media and news providers and reorganize them in a convenient manner with the aim of assisting their users to access the information they ...