Creating Oddly Satisfying Visuals with Multi-processing-based Image Evolution in Python 🎨

Mohammed El komy
Published in Analytics Vidhya · 8 min read · Feb 17, 2021

Mona Lisa reconstruction

Hello folks! This time I have some very fascinating animations: we are going to evolve images reconstructed from a small number of polygons, optimized by the dual-annealing optimization algorithm in Python.

I think the evolution GIF above may have caught your interest; we are going to take a quick walkthrough of how it was made and how it was set up as a distributed job for better CPU utilization.

At first, I was mainly inspired by a great episode of the Two Minute Papers channel, which discussed the process of converting raster images into vector images. The method covered in the video is a powerful deep-learning-based approach that relies on differentiable graphics. However, there is another popular yet very simple family of methods that approximates or reconstructs images using polygons (or graphics shapes in general) by framing the reconstruction process as an optimization problem, solved with gradient-free optimization algorithms such as evolutionary algorithms and other nature-inspired algorithms, for instance, simulated annealing.

Raster vs. vector images [I don't own the rights], vector-conversions.com

Differentiable graphics is an active area of research, and the video revolves around that newly introduced method, but we are still capable of performing this reconstruction with a fairly simple approach and getting reasonably acceptable results in a few lines of code, and that's what we are going to do here. 🐍🐍

As I'm writing this on Valentine's Day, reconstructing your soulmate's image using this code is legal, but unfortunately, no such soulmate exists in my set 🤣🤣.

Apart from that, let's give a formal definition of the problem. We would like to find the optimal set of shapes, with their corresponding colours and centres, that approximates the reference image as closely as possible, the Mona Lisa image in our case.
This means that at each optimisation step, the renderer renders the image as polygons or shapes overlaid on top of each other (each shape has an alpha channel that allows overlaying). (Live online demo I don't own.)

It's worth noting that finding a plausible configuration of shapes and their corresponding colours could be addressed using a variety of algorithms and techniques; some are based solely on image segmentation and other computer-vision methods, which are entirely algorithmic and don't involve any kind of optimization as we are going to do. I came across this impressive work during my search. But as alluded to above, in this post we formulate finding the optimal configuration as a problem for global optimizers.

As most of you know, optimization has been, and still is, of interest to many researchers due to its importance in engineering in general and operations research in particular; for example, tuning the parameters of a PID controller or a neural network, which is extremely popular nowadays and changing our lives.
For those of you who are familiar with deep learning: neural networks are mostly optimized using gradient-based methods because they compute a well-defined function (apart from RL settings), for which the gradients of the weights can be found by direct application of the chain rule. We can therefore use a gradient-based optimizer (the gradient-descent family, in deep learning) to iteratively find a good configuration of parameters.

In contrast, the problem we are dealing with today involves inherently non-differentiable operations, namely the rasterization process: the rendering of the polygons onto the drawing canvas. (Differentiable rasterizers are a hot research topic, as in the aforementioned Two Minute Papers video.)

So: 1) how is it possible to optimize such a function and find the global minimum, or at least a plausible local minimum? And if that works well, 2) why can't we use these methods to find the optimal parameters of my beloved neural networks 🤗🤗?

  1. There is a group of algorithms called meta-heuristic algorithms, which includes evolutionary algorithms that mimic the process of evolution, such as genetic algorithms, as well as nature-inspired algorithms that mimic different kinds of animals and swarms of bees and ants.
    These are population-based algorithms, where multiple solutions compete and evolve, applying the rules of nature iteratively and finding better solutions based on a fitness (error) function that measures how good a solution is. These algorithms work pretty well for certain problems in the literature, and they are an active area of research in numerical optimization and operations research.
    [Disclaimer: The field of optimization is a wide area of research, and there are other optimizers that don't belong to the meta-heuristic family.]
  2. To be honest, these global, gradient-free optimizers could be used to optimize neural networks, although empirically gradient-based methods have been shown to be the go-to choice for that. I guess this is because gradient-based methods exploit our prior knowledge of the gradient of a well-defined function, rather than treating it as a black box the way gradient-free optimizers do.

Enough terminology; let's get into a little detail. For every optimization problem, we need to represent the world state as a data structure before we can manipulate it. In our case, the thing we observe as humans is the generated vector image (the world state), and as you may know from basic computer-graphics courses, an image is rendered from a set of primitive shapes, such as polygons, with their corresponding colours. This can simply be represented as one long vector encoding that data.

The image below depicts the encoding process of the rasterizer state:

1. The red rectangular prism represents the set of polygons, of size
[num_polygon x 2 x sides]; for example, I used 250 polygons, each with 3 sides (a triangle), and the 2 represents the X and Y coordinates, since we are drawing points in a two-dimensional space.
2. The yellow rectangle represents the corresponding RGBA (4-dimensional) colour of each polygon; that's why the height (in purple) is shared between them.

You might have guessed that the long state vector at the bottom (with a gradient fill from red to yellow) is the flattened version of the concatenated red and yellow tensors.

Illustration of the state vector encoding

The process of converting the problem state into a plain vector is called encoding, and the reverse operation is called decoding. The optimization algorithm perturbs the encoded state as a plain vector, then decodes it and sends it to the rasterizer to be rendered. The error is obtained through a very simple L2 element-wise distance between the reconstructed image and the reference image. (The image below demonstrates the optimization process and what the loss means; this is done iteratively.)

On the left, iteration 7 with a loss of 0.1875; on the right, iteration 7200 with a loss of 0.0636
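A minimal sketch of what these three pieces might look like; the function names mirror the repo's encode, decode, and custom_fitness, but the exact shapes and vector layout are my assumptions:

```python
import numpy as np

NUM_POLYGONS, SIDES = 250, 3  # 250 triangles, as in the post

def encode(polygons, colors):
    """Flatten (N, sides, 2) coordinates and (N, 4) RGBA colours
    into one plain state vector the optimizer can perturb."""
    return np.concatenate([polygons.ravel(), colors.ravel()])

def decode(state):
    """Inverse of encode: split the flat vector back into two tensors."""
    split = NUM_POLYGONS * SIDES * 2
    polygons = state[:split].reshape(NUM_POLYGONS, SIDES, 2)
    colors = state[split:].reshape(NUM_POLYGONS, 4)
    return polygons, colors

def custom_fitness(state, reference, render):
    """L2 element-wise distance between reconstruction and reference;
    `render` is the non-differentiable rasterizer."""
    polygons, colors = decode(state)
    reconstructed = render(polygons, colors)
    return float(np.mean((reconstructed - reference) ** 2))
```

The optimizer only ever sees the flat vector; encode/decode sit at the boundary between the optimizer and the rasterizer.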

You can have a look at those operations in the 00.lowpoly_distributed.py file in functions encode, decode, and custom_fitness.

The functions used to encode, decode and compute the fitness for every state vector

Using this setup of the aforementioned functions, we are ready to use any gradient-free optimization algorithm to reconstruct vector images. I used a genetic algorithm at first, and it was quite good. Then I resorted to stochastic fractal search (you may check out my blog post on it), another optimization algorithm based on the concept of fractals, but it was terrible. Then I tried the optimizers shipped with SciPy, which offers a wide variety of optimizers with many options, and I came across the dual-annealing optimization algorithm; it worked incredibly well!

According to the SciPy documentation, dual annealing is an improved version of simulated annealing (a method inspired by metallurgy that mimics the heating and controlled cooling of a material to increase the size of its crystals and reduce their defects), which couples the generalized annealing process with a strategy for refining the solutions it finds.

In that context, we can just feed our custom fitness function to the dual-annealing optimizer, do some graphics work to get a real-time overview of what's going on, and start evolving images. That's it 😇.
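Wiring a fitness function into SciPy looks roughly like this. The toy objective below is a simple quadratic standing in for the image loss; the real objective would decode the state vector and render it before comparing against the reference:

```python
import numpy as np
from scipy.optimize import dual_annealing

def toy_fitness(state):
    """Stand-in for the image loss: minimum at 0.5 in every dimension."""
    return np.sum((state - 0.5) ** 2)

# Every entry of the state vector (coordinates and colours) lives in [0, 1].
bounds = [(0.0, 1.0)] * 10

result = dual_annealing(toy_fitness, bounds, maxiter=100, seed=42)
print(result.x)  # close to 0.5 in every dimension
```

The same call works for the image fitness; only the objective and the bounds (one pair per state-vector entry) change.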

But how was multi-processing beneficial?
I think you might have noticed the black separation lines forming a grid of 16 rectangles. That's right: every sub-image is reconstructed independently using the optimization scheme defined above, and, as you guessed 🎉, each one is assigned to a separate process while drawing to a shared canvas 🤔. This is not straightforward with Python multiprocessing because global variables aren't shared by default; they are copied into each process. To get around this, I found this great answer on Stack Overflow.
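The shared-canvas trick can be sketched with multiprocessing's RawArray, in the spirit of that Stack Overflow answer. This toy version uses a 2x2 grid instead of 4x4, and each worker just fills its sub-rectangle with a constant instead of running the optimizer; no locks are needed because the sub-rectangles never overlap:

```python
import multiprocessing as mp
import numpy as np

H = W = 64   # toy canvas; the real one matches the reference image
GRID = 2     # 2x2 grid of sub-images here; the post uses 4x4

def worker(shared, idx):
    # Wrap the shared buffer as a numpy array without copying. No lock is
    # needed because each process writes only to its own sub-rectangle.
    canvas = np.frombuffer(shared, dtype=np.float64).reshape(H, W)
    r, c = divmod(idx, GRID)
    h, w = H // GRID, W // GRID
    canvas[r * h:(r + 1) * h, c * w:(c + 1) * w] = idx + 1  # stand-in for rendering

def evolve_all():
    shared = mp.RawArray('d', H * W)  # shared, zero-initialized canvas buffer
    procs = [mp.Process(target=worker, args=(shared, i)) for i in range(GRID * GRID)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return np.frombuffer(shared, dtype=np.float64).reshape(H, W).copy()

if __name__ == "__main__":
    canvas = evolve_all()
    print(canvas[0, 0], canvas[-1, -1])
```

Because each worker owns a disjoint region, the writes never race; the main process periodically snapshots the shared canvas to disk for the animation.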

The optimization process

Extra fun 🐶:

The next sections use this low-polygon representation to create some fancy visuals.
When the optimization process finishes, a pickle file holds a dictionary of all the polygons evolved for the 16 sub-images.

1) To generate the optimization animation above, I simply save the global shared canvas to disk every few minutes and render the image sequence as a video or GIF.
I experimented with removing duplicate subsequent frames based on hashing, as in the 01.remove_duplicate.py file (following a pyimagesearch tutorial).
I also drop frames in a decaying manner (the closer to the end of the video, the more frames are dropped), as in 02. drop frames.py, then used the scripts 03.generate_video_cmd.sh and 04.generate_gif.sh to create the videos and GIFs from the image sequence.
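The duplicate-frame removal can be sketched with a difference hash (dhash). This is a simplified, numpy-only version of the idea from the pyimagesearch tutorial; the real scripts may resize with OpenCV or PIL instead of the crude index sampling used here:

```python
import numpy as np

def dhash(gray, hash_size=8):
    """Difference hash of a grayscale frame: shrink to a tiny grid,
    then compare horizontally adjacent pixels to get a bit string."""
    h, w = gray.shape
    rows = np.linspace(0, h - 1, hash_size).astype(int)
    cols = np.linspace(0, w - 1, hash_size + 1).astype(int)
    small = gray[np.ix_(rows, cols)].astype(float)
    diff = small[:, 1:] > small[:, :-1]  # compare horizontal neighbours
    return sum(1 << i for i, bit in enumerate(diff.ravel()) if bit)

def drop_duplicates(frames):
    """Keep only frames whose hash differs from the previously kept frame."""
    kept, last = [], None
    for frame in frames:
        h = dhash(frame)
        if h != last:
            kept.append(frame)
            last = h
    return kept
```

Because hashing ignores tiny pixel noise, long stretches where the optimizer makes no visible progress collapse into a single frame.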

2) I also created a vector graphic in SVG format and a PDF using 06.generate_svg.py; as you already know, this is not a raster image, it's a set of shapes overlaid on top of each other.
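Generating the SVG by hand is straightforward, since SVG is just XML. Below is a hand-rolled sketch of how the evolved polygons could be serialized; the actual 06.generate_svg.py may use a library and a different layout:

```python
import numpy as np

def polygons_to_svg(polygons, colors, size=(256, 256)):
    """Serialize evolved polygons as overlaid SVG <polygon> elements.
    polygons: (N, sides, 2) coords normalized to [0, 1];
    colors: (N, 4) RGBA values in [0, 1]."""
    w, h = size
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{w}" height="{h}">']
    for poly, (r, g, b, a) in zip(polygons, colors):
        points = " ".join(f"{x * w:.1f},{y * h:.1f}" for x, y in poly)
        fill = f"rgb({int(r * 255)},{int(g * 255)},{int(b * 255)})"
        parts.append(f'<polygon points="{points}" fill="{fill}" fill-opacity="{a:.2f}"/>')
    parts.append("</svg>")
    return "\n".join(parts)
```

The document order matters: later polygons are painted on top of earlier ones, which is exactly the overlaying the alpha channel relies on.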

Mona Lisa in PDF format

3) Fancy overlaid animation
This animation is generated by 05.Animated polygons effect.py, where the evolved polygons are drawn on top of each other while following random trajectories.

That was a pretty long post, I know 🐍😅. Thanks for your patience,
and I hope you enjoyed it 🙏.

Please check the GitHub repo for more details.

I used the gifski open-source tool to generate high-quality GIFs 🎊🎊

References:

1. GitHub repo: https://github.com/mohammed-elkomy/low-poly-image-evolution

2. Sharing global state in Python multiprocessing (we don't need locks in our case because each process writes to one specific sub-image, simplifying the implementation): https://stackoverflow.com/questions/7894791/use-numpy-array-in-shared-memory-for-multiprocessing/

3. Reseeding for each sub-process: https://stackoverflow.com/questions/32172054/how-can-i-retrieve-the-current-seed-of-numpys-random-number-generator

4. Removing duplicate images using dhashing: https://www.pyimagesearch.com/2020/04/20/detect-and-remove-duplicate-images-from-a-dataset-for-deep-learning/

5. Scipy dual-annealing: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.dual_annealing.html

6. I used Bézier curves for smoother trajectories (I came across this great solution): https://stackoverflow.com/questions/50731785/create-random-shape-contour-using-matplotlib/50732357

7. FFmpeg project for video creation: https://ffmpeg.org/

8. Gifski: https://gif.ski/


Teaching and Research Assistant, a deep-learning enthusiast, passionate about algorithms and problem-solving, an origami practitioner.