Working with the Artificial Bee Colony Algorithm
Most techniques in AI and machine learning need to solve a continuous optimization problem. Usually this optimization problem is non-linear and cannot be solved by hand. A vast literature exists on the types of optimization problems and on ways to solve them efficiently.
The main difference between these methods lies in their use of derivatives. Gradient-based methods, as the name suggests, require the calculation of derivatives. The same holds for Newton's method, which seeks the roots of the first derivative.
These methods are known to be fast when they work. But they are also known to sometimes fail entirely or to get stuck in local minima.
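To make the derivative-based family concrete, here is a tiny sketch of a one-dimensional Newton iteration for minimization; the quadratic toy function, its hand-coded derivatives, and the starting point are my own choices for illustration, not part of this article's algorithm.

```rust
// Newton's method for minimization: find a root of f'(x) via
//   x <- x - f'(x) / f''(x).
// Toy function f(x) = (x - 3)^2 + 1 with hand-coded derivatives.
fn main() {
    let f_prime = |x: f64| 2.0 * (x - 3.0); // f'(x)
    let f_second = |_x: f64| 2.0; // f''(x)

    let mut x = 10.0; // arbitrary starting point
    for _ in 0..5 {
        x -= f_prime(x) / f_second(x);
    }
    println!("{x}"); // for a quadratic, Newton lands on the minimizer x = 3 in one step
}
```

On a quadratic this converges in a single step; on a general function it is fast near the minimum but, as noted above, may diverge or stop at a local minimum when started far away.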
In this story, I want to present an optimization technique that is itself based on AI. Like most AI-based algorithms, it is non-deterministic.
Its name is 'artificial bee colony algorithm', and it follows a pattern observed in nature among bees:
A fixed number of N 'bees' are placed randomly (uniformly) in the domain. Each bee remembers its current position and value, and the algorithm remembers the global best.
Local Search Phase: At each stage, each bee tries to find a better position by randomly selecting another bee and moving a random distance towards or away from that counterpart. It only moves when the new position is better than its current one.
Onlooker Phase: Once all bees have done a local search, they are re-distributed according to a categorical distribution over their fitness values. In detail, each bee is assigned a fitness by computing the distance of its value to the current 'worst' bee's value; the greater this distance, the better the fitness. Each fitness value, regarded as a category, is weighted by dividing it by the sum of all fitness values. This yields a categorical distribution over all bees from which we can draw random samples. Note that the higher the fitness, the higher the probability that the corresponding category is selected. By drawing N times, which may select some bees multiple times and others not at all, we find a new location for each bee. Thus, the better the fitness value of a location, the more bees are allocated there.
Scout Phase: Next, all bees that have not improved their current value after X consecutive attempts are moved to new random positions in the domain, in the same way as at initialization.
The above is iterated several times to improve the best optimum found so far.
Furthermore, this method is very stable, it works in high dimensions, the objective function only needs to be continuous, and it is very easy to implement.
In addition, it allows for a number of custom improvements. Its rate of convergence is difficult to estimate, but it has proven in several applications to work particularly well at finding global minima without being trapped in local minima.
Therefore, in cases where a very accurate solution at the global minimum is required, this method is best used to get close to that point and then followed up with steepest-descent or Newton iterations to reach the desired accuracy.
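As a sketch of that hand-off, the helper below (my own illustration, not part of the article's code) refines a rough estimate with a few steepest-descent steps using central finite differences; the quadratic test function and the step size 0.1 are assumptions chosen so the demo converges.

```rust
// Minimal steepest-descent polish, assuming f is smooth near the point.
fn polish(f: &dyn Fn(&[f64]) -> f64, x0: &[f64], step: f64, iters: usize) -> Vec<f64> {
    let mut x = x0.to_vec();
    let h = 1e-6; // finite-difference width
    for _ in 0..iters {
        // central-difference gradient approximation
        let grad: Vec<f64> = (0..x.len())
            .map(|i| {
                let (mut xp, mut xm) = (x.clone(), x.clone());
                xp[i] += h;
                xm[i] -= h;
                (f(&xp) - f(&xm)) / (2.0 * h)
            })
            .collect();
        for i in 0..x.len() {
            x[i] -= step * grad[i]; // step downhill
        }
    }
    x
}

fn main() {
    // toy quadratic with minimum at (1, -2)
    let f = |x: &[f64]| (x[0] - 1.0).powi(2) + (x[1] + 2.0).powi(2);
    // pretend the bee colony returned this rough estimate
    let rough = [0.9, -1.9];
    let refined = polish(&f, &rough, 0.1, 100);
    println!("{refined:?}"); // close to [1.0, -2.0]
}
```

In practice you would feed the position returned by the bee colony into such a routine instead of the hard-coded `rough` point.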
More information, including the inventor, can be found at:
Artificial bee colony algorithm. (2022, July 8). In Wikipedia, https://en.wikipedia.org/wiki/Artificial_bee_colony_algorithm
We are going to look at a possible implementation in Rust. For those who need a little refresher on Rust, I recommend reading My Brief Introduction.
To generate random values, we add the rand crate to our dependencies:

```toml
[dependencies]
rand = "0.8.5"
```
Throughout the program, we use the following types:

```rust
use rand::{prelude::Distribution, rngs::ThreadRng, thread_rng, Rng};

pub struct Params {
    pub n_bees: usize,
    pub abandon_limit: usize,
    pub max_iter: usize,
}

type X = Vec<f64>;
type Target = dyn Fn(&X) -> f64;

struct Bee {
    position: X,
    value: f64,
    fitness: f64, // distance to worst
    not_improved_since: usize,
}
```
A Bee is described by its position, its value, its fitness, and how many times it has failed to improve itself. X is just a convenient placeholder for the domain, and Target is the type of function that we want to minimize.
Next, we list a few handy functions that are needed:

```rust
// computes each bee's fitness as the distance to the worst bee's value
fn compute_fitness_on_bees(bees: &mut Vec<Bee>) {
    let worst_value = bees[find_worst_bee_idx(bees)].value;
    for bee in bees.iter_mut() {
        bee.fitness = worst_value - bee.value;
    }
}

// returns the best position and value among all the bees
fn find_best_position_and_value(bees: &Vec<Bee>) -> (X, f64) {
    let best = bees
        .iter()
        .min_by(|b1, b2| b1.value.partial_cmp(&b2.value).unwrap())
        .unwrap();
    (best.position.clone(), best.value)
}

// returns the index of the bee with the highest (worst) value
fn find_worst_bee_idx(bees: &Vec<Bee>) -> usize {
    bees.iter()
        .enumerate()
        .max_by(|(_, b1), (_, b2)| b1.value.partial_cmp(&b2.value).unwrap())
        .map(|(idx, _)| idx)
        .unwrap()
}

// creates a random point within the n-dim. interval [a, b],
// that is a <= x <= b
fn create_random_position(a: &X, b: &X, rng: &mut ThreadRng) -> X {
    let mut res = vec![];
    for (u, v) in a.iter().zip(b.iter()) {
        res.push(rng.gen_range(*u..=*v));
    }
    res
}
```
Finally, we implement each step of the algorithm one by one. You will see that all the steps are straightforward to realize:
```rust
// create n bees and position them randomly in [a, b]
fn init_bees(f: &Target, a: &X, b: &X, n_bees: usize) -> Vec<Bee> {
    let mut res = vec![];
    let mut rng = thread_rng();
    for _ in 0..n_bees {
        let position = create_random_position(a, b, &mut rng);
        let value = f(&position);
        res.push(Bee {
            position,
            value,
            fitness: 0.0,
            not_improved_since: 0,
        });
    }
    res
}
```
```rust
// let each bee locally search by moving a random distance towards or away from
// another random bee.
// At the same time, keep the globally best known value/position up to date.
fn do_local_search_phase(
    f: &Target,
    bees: &mut Vec<Bee>,
    best_position: &mut X,
    best_value: &mut f64,
    a: &X,
    b: &X,
) {
    let mut rng = thread_rng();
    for idx in 0..bees.len() {
        let mut new_position = bees.get(idx).unwrap().position.clone();
        let other_bee_idx = rng.gen_range(0..bees.len());
        let position_idx = rng.gen_range(0..new_position.len());
        let phi = rng.gen_range(-1.0..1.0);
        let x_i_other = bees.get(other_bee_idx).unwrap().position[position_idx];
        let x_i = new_position[position_idx];
        // move towards or away from the other bee, clipped to [a, b]
        new_position[position_idx] =
            (x_i + phi * (x_i_other - x_i)).clamp(a[position_idx], b[position_idx]);
        let new_value = f(&new_position);
        let old_value = bees.get(idx).unwrap().value;
        let bee = bees.get_mut(idx).unwrap();
        if new_value < old_value {
            bee.position = new_position;
            bee.value = new_value;
            bee.not_improved_since = 0;
        } else {
            bee.not_improved_since += 1;
        }
        if new_value < *best_value {
            *best_value = bee.value;
            *best_position = bee.position.clone();
        }
    }
}
```
```rust
// Re-distribute bees onto other locations and prioritize locations with
// higher fitness. This is done by laying a categorical distribution over all
// the bees.
fn do_onlooker_phase(bees: &mut Vec<Bee>) {
    compute_fitness_on_bees(bees);
    let sum_fitness: f64 = bees.iter().map(|bee| bee.fitness).sum::<f64>();
    let weights: Vec<f64> = bees.iter().map(|bee| bee.fitness / sum_fitness).collect();
    let dist = rand::distributions::WeightedIndex::new(&weights).unwrap();
    let mut rng = thread_rng();
    let new_bees: Vec<Bee> = (0..bees.len())
        .map(|_| {
            let source = &bees[dist.sample(&mut rng)];
            Bee {
                position: source.position.clone(),
                value: source.value,
                fitness: source.fitness,
                not_improved_since: source.not_improved_since,
            }
        })
        .collect();
    *bees = new_bees;
}
```
```rust
// Abandon positions that have not improved for a long time.
// This avoids being trapped at local minima.
fn do_scout_phase(bees: &mut Vec<Bee>, abandon_limit: usize, a: &X, b: &X, f: &Target) {
    let mut rng = thread_rng();
    for bee in bees
        .iter_mut()
        .filter(|bee| bee.not_improved_since > abandon_limit)
    {
        bee.position = create_random_position(a, b, &mut rng);
        bee.value = f(&bee.position);
        bee.not_improved_since = 0;
    }
}
```
Now, all we have to do is iterate these steps together:
```rust
pub fn optimize(f: &Target, a: &X, b: &X, params: Params) -> (X, f64) {
    let mut bees = init_bees(f, a, b, params.n_bees);
    let (mut best_position, mut best_value) = find_best_position_and_value(&bees);
    for _ in 0..params.max_iter {
        do_local_search_phase(f, &mut bees, &mut best_position, &mut best_value, a, b);
        do_onlooker_phase(&mut bees);
        do_scout_phase(&mut bees, params.abandon_limit, a, b, f);
    }
    (best_position, best_value)
}
```
As a test function, I am using the Ackley function (see Here), which has lots of local minima:

```rust
use std::f64::consts::{E, PI};

fn ackley_fn(x: &X) -> f64 {
    -20.0 * f64::exp(
        -0.2 * f64::sqrt(0.5 * (x[0] * x[0] + x[1] * x[1]))
    ) - f64::exp(
        0.5 * (f64::cos(2.0 * PI * x[0]) + f64::cos(2.0 * PI * x[1]))
    ) + E + 20.0
}
```
```rust
println!("{}", ackley_fn(&vec![0.0, 0.0])); // this is the minimum

let params = Params {
    n_bees: 50,        // example values; tune these for your own problem
    abandon_limit: 10,
    max_iter: 1000,
};
optimize(&ackley_fn, &vec![-5.0, -5.0], &vec![5.0, 5.0], params)
```
([0.007315634561002593, 9.547793830446227e-5], 0.02211822639045735)
You can get the complete code to play with Here.