Quadric hypersurface intersection for manifold learning in feature space

The knowledge that data lies close to a particular submanifold of the ambient Euclidean space may be useful in a number of ways. For instance, one may want to automatically mark any point far away from the submanifold as an outlier, or to use its geodesic distance to measure similarity between points. Classical problems for manifold learning are often posed in a very high dimension, e.g. for spaces of images or spaces of representations of words. Today, with deep representation learning on the rise in areas such as computer vision and natural language processing, many problems of this kind may be transformed into problems of moderately high dimension, typically of the order of hundreds. Motivated by this, we propose a manifold learning technique suitable for moderately high dimension and large datasets. The manifold is learned from the training data in the form of an intersection of quadric hypersurfaces -- simple but expressive objects. At test time, this manifold can be used to introduce an outlier score for arbitrary new points and to improve a given similarity metric by incorporating learned geometric structure into it.
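The idea of representing a manifold as an intersection of quadric hypersurfaces, and of scoring new points by how far they are from it, can be illustrated with a minimal sketch. The abstract does not describe the training procedure, so this is not the authors' method: it substitutes a plain algebraic least-squares fit, in which each quadric is a coefficient vector over the monomials of degree at most 2, and the outlier score is the size of the quadric residuals rather than a true geometric or geodesic distance. The function names (`quadratic_features`, `fit_quadrics`, `outlier_score`) are hypothetical and chosen only for this illustration.

```python
import numpy as np


def quadratic_features(X):
    """Map each row x to the monomials [x_i * x_j for i <= j, x_i, 1]."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    i, j = np.triu_indices(d)           # index pairs with i <= j
    quad = X[:, i] * X[:, j]            # degree-2 monomials
    return np.hstack([quad, X, np.ones((n, 1))])


def fit_quadrics(X, k):
    """Return k unit-norm coefficient vectors (as rows) of quadrics
    q(x) = c . quadratic_features(x) that are as close to zero as
    possible on the training points X (assumption: least-squares fit,
    not the procedure used in the paper)."""
    Phi = quadratic_features(X)
    # Eigenvectors of the Gram matrix with the smallest eigenvalues
    # minimise ||Phi c||^2 subject to ||c|| = 1. eigh sorts ascending.
    _, V = np.linalg.eigh(Phi.T @ Phi)
    return V[:, :k].T


def outlier_score(X, C):
    """Algebraic residual of each point with respect to all quadrics in C.
    Close to zero on the learned intersection, larger away from it."""
    Phi = quadratic_features(X)
    return np.sqrt(np.sum((Phi @ C.T) ** 2, axis=1))


# Toy data: points on the unit circle in the plane, which is the zero
# set of the single quadric x^2 + y^2 - 1.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
train = np.column_stack([np.cos(angles), np.sin(angles)])

quadrics = fit_quadrics(train, k=1)
on_manifold = outlier_score(train, quadrics)
far_away = outlier_score(np.array([[3.0, 0.0]]), quadrics)
```

In this toy case the training points score near zero and the point (3, 0) scores clearly higher. The number of monomials grows quadratically with the dimension, so in the moderately high dimensions the abstract targets (hundreds) the Gram matrix has tens of thousands of rows and columns, which is one reason a practical method would likely need a different fitting strategy.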

Related content

Manifold learning

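Manifold learning has been an active research topic in information science since it was first proposed in the journal Science in 2000, and it is of significant interest both in theory and in applications. Assuming that the data are sampled uniformly from a low-dimensional manifold embedded in a high-dimensional Euclidean space, manifold learning recovers the low-dimensional manifold structure from the high-dimensional samples: it finds the low-dimensional manifold inside the high-dimensional space and computes the corresponding embedding map, in order to achieve dimensionality reduction or data visualization. In other words, it looks past the observed phenomena for the underlying regularities that generate the data.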
