Data dimension reduction (DDR) maps data from a high-dimensional space to a low-dimensional one. Various DDR techniques are used for image dimension reduction, including Random Projections, Principal Component Analysis (PCA), the Variance approach, LSA-Transform, the Combined and Direct approaches, and the New Random Approach; auto-encoders (AE) are used to learn an end-to-end mapping. In this paper, we demonstrate that pre-processing not only speeds up the algorithms but also improves accuracy in both supervised and unsupervised learning. For pre-processing, PCA-based DDR is applied to supervised learning, and AE-based DDR is explored for unsupervised learning. In PCA-based DDR, we compare the accuracy and running time of supervised learning algorithms before and after applying PCA; similarly, in AE-based DDR, we compare the accuracy and running time of an unsupervised learning algorithm before and after AE representation learning. The supervised learning algorithms used for classification are support-vector machines (SVM), Decision Tree with the Gini index, Decision Tree with entropy, and the Stochastic Gradient Descent classifier (SGDC); the unsupervised learning algorithm is K-means clustering. We use two datasets, MNIST and FashionMNIST. Our experiments show a substantial improvement in accuracy and a reduction in time after pre-processing in both supervised and unsupervised learning.
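The PCA-based comparison described above can be sketched as follows. This is a minimal illustration, not the paper's actual experimental setup: it uses scikit-learn's small `load_digits` dataset as a stand-in for MNIST, an SVM classifier, and an assumed reduction to 20 principal components.

```python
import time
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Small stand-in for MNIST; the paper itself uses MNIST and FashionMNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

def fit_score(Xtr, Xte):
    """Train an SVM and return (test accuracy, training time in seconds)."""
    clf = SVC()
    t0 = time.perf_counter()
    clf.fit(Xtr, y_train)
    elapsed = time.perf_counter() - t0
    acc = accuracy_score(y_test, clf.predict(Xte))
    return acc, elapsed

# Baseline: train on the raw 64-dimensional pixel vectors.
acc_raw, t_raw = fit_score(X_train, X_test)

# PCA-based DDR as pre-processing: 64 -> 20 dimensions (assumed value),
# fit on the training split only, then applied to both splits.
pca = PCA(n_components=20).fit(X_train)
acc_pca, t_pca = fit_score(pca.transform(X_train), pca.transform(X_test))

print(f"raw: accuracy={acc_raw:.3f}, train time={t_raw:.4f}s")
print(f"pca: accuracy={acc_pca:.3f}, train time={t_pca:.4f}s")
```

The same before/after pattern extends to the other classifiers and, with an auto-encoder's bottleneck representation in place of `pca.transform`, to the unsupervised K-means comparison.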