Online theses of the Université 8 Mai 1945 Guelma

Selection of Co-occurring Terms with Minimal Entropy for Text Classification


dc.contributor.author BENSSAADA, ARIDJ
dc.date.accessioned 2022-10-13T09:51:46Z
dc.date.available 2022-10-13T09:51:46Z
dc.date.issued 2022
dc.identifier.uri http://dspace.univ-guelma.dz/jspui/handle/123456789/13228
dc.description.abstract Feature selection, as a dimensionality reduction technique, aims to select a small subset of relevant features from the original set by removing irrelevant, redundant or noisy ones. Feature selection generally leads to better learning performance, i.e. higher learning accuracy, lower computational cost and better model interpretability. Feature selection methods such as Information Gain (IG), Mutual Information (MI) and Chi-square (Chi2) are statistical methods based on document frequency, but they do not take into account the frequency of terms within documents, nor do they consider their semantics. Based on the idea that terms that frequently co-occur may share a common meaning and thus have a higher discrimination capacity than isolated terms, we propose a feature selection method for text classification that considers two measures: term co-occurrence frequency and term entropy. A term that frequently co-occurs with other terms and minimizes the uncertainty (entropy) of the class variable is considered relevant. The performance of our method is compared to the four most commonly used selection metrics: Information Gain (IG), Mutual Information (MI), Chi-square (Chi2) and Document Frequency (DF), using two classifiers, Naïve Bayes (NB) and Support Vector Machine (SVM), and three datasets. en_US
dc.language.iso fr en_US
dc.publisher Université de Guelma en_US
dc.subject selection, term, co-occurrence, entropy, text, classification en_US
dc.title Selection of Co-occurring Terms with Minimal Entropy for Text Classification en_US
dc.type Working Paper en_US
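
The abstract names two measures, term co-occurrence frequency and term entropy, without giving formulas. The following is a minimal sketch of how such a selection could look, not the thesis's actual method. It assumes that co-occurrence is counted at document level, that a term's entropy is the entropy of the class distribution among documents containing it, and that terms are ranked by lowest entropy after a co-occurrence threshold. The function names and the `min_cooc` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(docs):
    """For each term, count the (document, other term) pairs it appears in.

    Assumption: two terms co-occur when they appear in the same document.
    """
    counts = Counter()
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[a] += 1
            counts[b] += 1
    return counts


def class_entropy_given_term(docs, labels):
    """Entropy (in bits) of the class distribution over documents containing each term."""
    per_term = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            per_term[term][label] += 1
    entropy = {}
    for term, dist in per_term.items():
        total = sum(dist.values())
        entropy[term] = sum(
            (n / total) * math.log2(total / n) for n in dist.values()
        )
    return entropy


def select_terms(docs, labels, k, min_cooc=1):
    """Keep terms with at least min_cooc co-occurrences, then take the k lowest-entropy ones.

    Ties are broken by higher co-occurrence count, then alphabetically.
    This ranking rule is an illustrative assumption.
    """
    cooc = cooccurrence_counts(docs)
    ent = class_entropy_given_term(docs, labels)
    candidates = [t for t in ent if cooc[t] >= min_cooc]
    candidates.sort(key=lambda t: (ent[t], -cooc[t], t))
    return candidates[:k]
```

Calling `select_terms(docs, labels, k)` on a list of tokenized documents and their class labels returns the `k` retained terms. Raising `min_cooc` filters out isolated terms before the entropy ranking is applied.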

