
dc.creator: Dallas I.L., Vrahatis A.G., Tasoulis S.K., Plagianakos V.P. [en]
dc.date.accessioned: 2023-01-31T07:49:55Z
dc.date.available: 2023-01-31T07:49:55Z
dc.date.issued: 2022
dc.identifier: 10.1007/978-3-031-20837-9_18
dc.identifier.isbn: 9783031208362
dc.identifier.issn: 03029743
dc.identifier.uri: http://hdl.handle.net/11615/73044
dc.description.abstract: We are going through the last years of the COVID-19 pandemic, where almost the entire research community has focused on the challenges that constantly arise. From the computational and mathematical perspective, we have to deal with a dataset with ultra-high volume and ultra-high dimensionality in several experimental studies. An indicative example is DNA sequencing technologies, which offer a more realistic picture of human diseases at the molecular biology level. However, these technologies produce data with high complexity and ultra-high dimensionality. On the other hand, dimensionality reduction techniques are the first choice to address this complexity, revealing the hidden data structure in the original multidimensional space. Also, such techniques can improve the efficiency of machine learning tasks such as classification and clustering. Towards this direction, we study the behavior of seven well-known and cutting-edge dimensionality reduction techniques tailored for RNA-sequencing data. Along with the study of the effect of these algorithms, we propose the extension of the Random projection and Geodesic distance t-Stochastic Neighbor Embedding (RGt-SNE) algorithm, a recent t-Stochastic Neighbor Embedding (t-SNE) improvement. We suggest a new distance criterion for the kernel matrix construction. Our results show the potential of the proposed algorithm and, at the same time, highlight the complexity of the COVID-19 data, which are not separable, creating a significant challenge that the Machine Learning field will have to face. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG. [en]
dc.language.iso: en [en]
dc.source: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) [en]
dc.source.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85144249540&doi=10.1007%2f978-3-031-20837-9_18&partnerID=40&md5=b458dbcf999487e1483e2b67f2ec2b0a
dc.subject: Data reduction [en]
dc.subject: DNA sequences [en]
dc.subject: Embeddings [en]
dc.subject: Gene encoding [en]
dc.subject: Machine learning [en]
dc.subject: Molecular biology [en]
dc.subject: RNA [en]
dc.subject: Stochastic systems [en]
dc.subject: Dimensionality reduction [en]
dc.subject: Dimensionality reduction techniques [en]
dc.subject: High dimensionality [en]
dc.subject: High-dimensional [en]
dc.subject: High-dimensional COVID-19 data [en]
dc.subject: Higher-dimensional [en]
dc.subject: Machine-learning [en]
dc.subject: Single cells [en]
dc.subject: Single-cell RNA-sequencing [en]
dc.subject: Ultra-high [en]
dc.subject: COVID-19 [en]
dc.subject: Springer Science and Business Media Deutschland GmbH [en]
dc.title: Recent Dimensionality Reduction Techniques for High-Dimensional COVID-19 Data [en]
dc.type: conferenceItem [en]


