We start by scanning all the pixels of the image. We record each color, and how many times that color appears, which we call "frequency". The collection of colors is then sorted by frequency in highest-to-lowest order.
In our example above, the specified palette size was 256. This means that out of the 83,882 unique colors that are used in the image, we have to choose 256 that will be represent the original image. The example above shows 3 different schemes I tried before obtaining a good result.
The final solution was to take the sorted colors list and calculate a stepping distance. This stepping distance is the number of colors to skip between each selection. The stepping distance is calculated using floor(num_of_colors / palette_size). This allowed us to sample from a wide range of frequencies which gave the best result.
Once the scheme is chosen, we have to loop back through all the pixels replacing them with their palette equivalents. There are 2 cases:
Above we mentioned "finding the closest color", but how do we do this? Think of it this way, colors are made of 3 primary colors. These primary colors are completely independent of one another. For whatever reason, computers don't use the three primary colors but instead use red, blue, and green, but thats okay because the result is the same. Think of our 3D world, how each location is specified as a distance along the x, y, and z axis. Well we can think of red, green, and blue as independent axes in color space. So, finding the distance between colors is the same thing as finding the distance between to points in 3D space. We simply use the distance formula.
Let color1 be made of (r1, g1, b1) and color2 be made of of (r2, g2, b2). Then the distance between the colors is
d = sqrt( (r2 - r1)^2 + (g2 - g1)^2 + (b2 - b1)^2)