Say I want to pick a fixed number of samples from a large 2D dataset, such that they are relatively evenly distributed over the whole sample area. Imagine places in a country, so the border of the data is an irregular polygon.
I'm aware of Poisson-disc sampling, but that takes a minimum distance as input, which I don't know. The number of points I want is the input variable.
I can just divide the area into a grid (as long as the sample size divides evenly), but that tends to favour the edges, i.e. the parts of the grid with little coverage.
As an exaggerated example, picking 16 samples from a circle tends to put a lot of samples around the edges, because the grid will still pick one from each square. (The circle is just to demonstrate a kind of worst case, but the grid will never exactly match the polygon.)
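To make the problem concrete, here is a rough sketch of the naive grid approach I mean (made-up example data: points in a unit circle, 4x4 grid over the bounding box, one random sample per non-empty cell). Corner cells only barely overlap the circle, yet still get a full sample each, which is the edge bias:

```python
import numpy as np

rng = np.random.default_rng(42)

# Example data: points uniformly inside a unit circle.
pts = rng.uniform(-1, 1, size=(20000, 2))
pts = pts[(pts ** 2).sum(axis=1) <= 1.0]

# Naive grid sampling: n x n grid over the bounding box, one random
# sample per non-empty cell. Cells that barely overlap the circle
# (the corners) still contribute a sample, biasing toward the edge.
n = 4
xmin, ymin = pts.min(axis=0)
xmax, ymax = pts.max(axis=0)
ix = np.minimum((n * (pts[:, 0] - xmin) / (xmax - xmin)).astype(int), n - 1)
iy = np.minimum((n * (pts[:, 1] - ymin) / (ymax - ymin)).astype(int), n - 1)

samples = []
for cell in range(n * n):
    mask = (ix * n + iy) == cell
    if mask.any():
        samples.append(pts[mask][rng.integers(mask.sum())])
samples = np.array(samples)
```

With the circle, all 16 cells are non-empty, so you get exactly 16 samples, but the four corner cells can only ever pick a point from the thin sliver of circle they contain, so those samples are forced right up against the boundary.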
My best idea at the moment is to take the samples and perform something like https://en.wikipedia.org/wiki/Lloyd%27s_algorithm so the points 'relax' to be spread out over a rectangle, then just pick the points from the regular grid. ... but it's never going to be very efficient, as it will take multiple rounds of Lloyd's over a large dataset.
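For reference, this is roughly the Lloyd's-style relaxation I have in mind, sketched directly on the data (function name and parameters are my own invention, not from any library): run k-means with k equal to the number of samples wanted, then snap each centroid back to its nearest real data point so the result is an actual subset of the samples and stays inside the irregular region:

```python
import numpy as np

def lloyd_sample(pts, k, iters=10, seed=0):
    """Pick k representative points by running Lloyd's algorithm
    (k-means) on the data itself; centroids spread out according
    to data density rather than the bounding box."""
    rng = np.random.default_rng(seed)
    # Start from k distinct random data points.
    centres = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centre.
        d = ((pts[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each centre to the mean of its assigned points.
        for j in range(k):
            members = pts[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    # Snap each centre to the nearest actual data point, so the
    # output is a subset of the input and lies inside the polygon.
    d = ((pts[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return pts[d.argmin(axis=0)]
```

This is exactly the cost I'm worried about: every iteration is a full pass over the dataset computing distances to all k centres, which is why I suspect there's a better-known algorithm for "pick n well-spread points".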
It feels like there is an algorithm for this, but I don't actually know what to look for!
Edit: actually, here is a better demo of the issue with picking points from a grid, using real data. I'm getting lots of points around the coast, because lots of squares only contain a small bit of Ireland.