using GeometricDatasets
using CairoMakie
Sampling data
Methods for extracting subsets of the original point cloud.
Farthest points sample
farthest_points_sample(X::PointCloud, n::Integer; metric = Euclidean())
Given \(X\) and an integer \(n\), return a subset of \(X\) whose points are as far apart from each other as possible.
Details
Let \(X\) be a metric space with \(k\) points. Select a random point \(x_1 \in X\). Then select \(x_2\) as the point farthest from \(x_1\) with respect to the given metric. Next, choose \(x_3\) as the point farthest from both \(x_1\) and \(x_2\) at once; in the usual formulation of farthest point sampling, this is the point whose distance to the nearer of the two is largest. Continue in this way until \(n\) points have been chosen.
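The greedy procedure above can be sketched in a few lines of plain Julia. This is only an illustration and not the package's implementation: the name `fps_sketch` is hypothetical, points are assumed to be stored as the columns of a matrix, the distance is assumed to be Euclidean, and "most distant to all chosen points" is read as maximizing the minimum distance to the points already selected.

```julia
using LinearAlgebra  # norm
using Random         # reproducible random starting point

# Hypothetical helper, not part of GeometricDatasets: greedy farthest-point
# sampling on a d×k matrix whose columns are points. Returns column indices.
function fps_sketch(X::AbstractMatrix, n::Integer; rng = Random.default_rng())
    k = size(X, 2)
    1 <= n <= k || throw(ArgumentError("n must satisfy 1 <= n <= $k"))

    chosen = Vector{Int}(undef, n)
    chosen[1] = rand(rng, 1:k)          # x_1: a random starting point

    # mindist[j] = distance from point j to its nearest chosen point so far
    mindist = [norm(X[:, j] - X[:, chosen[1]]) for j in 1:k]

    for i in 2:n
        chosen[i] = argmax(mindist)     # farthest from everything chosen so far
        for j in 1:k
            d = norm(X[:, j] - X[:, chosen[i]])
            mindist[j] = min(mindist[j], d)
        end
    end
    return chosen
end

X = rand(MersenneTwister(1), 2, 200)
idx = fps_sketch(X, 10; rng = MersenneTwister(2))
```

Keeping a running vector of minimum distances means each new point costs one pass over the data, so selecting \(n\) of \(k\) points takes on the order of \(n \cdot k\) distance evaluations. The sampled points are then `X[:, idx]`.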
Let \(X\) be a set of 5000 random points in the unit square:
X = rand(2, 5000);
fig, ax, plt = scatter(X);
fig
Let’s apply the farthest points sample for different values of \(n\):
L = farthest_points_sample(X, 50)
plt2 = scatter!(ax, X[:, L], color = :red);
fig

plt2.alpha = 0
L = farthest_points_sample(X, 100)
plt2 = scatter!(ax, X[:, L], color = :red)
fig

plt2.alpha = 0
L = farthest_points_sample(X, 500)
plt2 = scatter!(ax, X[:, L], color = :red)
fig

plt2.alpha = 0
L = farthest_points_sample(X, 1000)
plt2 = scatter!(ax, X[:, L], color = :red)
fig
\(ϵ\)-net
An \(ϵ\)-net is a subset \(Y\) of a point cloud \(X\) such that every point \(x \in X\) lies in the \(ϵ\)-ball centered at \(y\) for some \(y \in Y\).
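The covering condition in this definition can be checked directly. The sketch below is an illustration only: `is_epsilon_net` is a hypothetical helper, not a function of the package, and it assumes points stored as matrix columns, Euclidean distance, and closed balls (distance at most \(ϵ\)).

```julia
using LinearAlgebra  # norm

# Hypothetical helper, not part of GeometricDatasets: returns true when every
# column of X is within distance ϵ of at least one column of Y.
function is_epsilon_net(X::AbstractMatrix, Y::AbstractMatrix, ϵ::Real)
    return all(1:size(X, 2)) do i
        any(j -> norm(X[:, i] - Y[:, j]) <= ϵ, 1:size(Y, 2))
    end
end

# Four points on a line; the candidate net keeps the first and third.
X = [0.0 0.1 1.0 1.1;
     0.0 0.0 0.0 0.0]
Y = X[:, [1, 3]]

is_epsilon_net(X, Y, 0.15)  # true: each point is within 0.15 of a center
is_epsilon_net(X, Y, 0.05)  # false: points 2 and 4 are about 0.1 from their nearest center
```

The check compares every point against every center, which is fine for small examples but grows with the product of the two sizes.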
Example
First we define \(X\) to be \(10^4\) random points in the unit square of \(\mathbb{R}^2\):
X = rand(2, 10^4)
fig, ax, plt = scatter(X)
fig
Then we create the \(\epsilon\)-net of \(X\) and plot it in red:
ϵ = 0.1
L = epsilon_net(X, ϵ)
Y = X[:, L]
scatter!(ax, Y, color = :red)
fig