MapReduce in The Farmer Was Replaced

Code discussion. Part II.

In the previous post we discussed the code which uses map reduce approach to farm sunflowers. We reviewed the drone_map_reduce(objects, f_map, f_red) function line-by-line and realized that:
๐Ÿ‘‰๐Ÿป objects is an array of tasks descriptions
๐Ÿ‘‰๐Ÿป f_map - a factory function that creates a task from its description
๐Ÿ‘‰๐Ÿป f_red - reducer function, merges results into one big result if necessary

Today we are going to discuss how this framework applies to solving three sunflower tasks: till the entire field, plant sunflowers, harvest sunflowers with x5 bonus.

Till the field

def till_field():
 columns = []

 for x in range(get_world_size()):
  columns.append(x)

 def till_column(columns):
  def g():
   for q in columns:
    for p in range(get_world_size()):
     nav(q, p)
     till_soil()
  return g

 drone_map_reduce(columns, till_column, no_reduce)

Let's check this code. First, we prepare a list of columns to till. Then we prepare the factory function till_column, which takes a list of columns to process and returns... Attention!!! - a function to till all these columns. It uses helper functions: nav for navigation and till_soil for tilling grassland into Soil (it doesn't till it back). We use quite an interesting trick - nested function g inside the function till_column. It captures the columns variable and therefore is self-aware, it can run with no parameters. In "normal" programming it is like "partial" function which binds parameters with values. This function is called inside the spawn_drone function and determines the drone's actions and its lifetime.

Plant sunflowers

def plant_sunflowers():
  drone_jobs = []

  for q in range(get_world_size()):
    drone_jobs.append(q)

  def drone_job(columns):
    def g():
      heights = dict()
      for q in columns:
        for p in range(get_world_size()):
          nav(q, p)
          plant(Entities.Sunflower)
          heights[(q, p)] = measure()
      return heights
    return g

  def reduce_hights(h1, h2):
    for k in h2:
      h1[k] = h2[k]
    return h1

  return drone_map_reduce(drone_jobs, drone_job, reduce_hights)

The first part is quite familiar - forming a list of columns to process. The drone_job function is again a factory of functions. Again, these functions are quite similar to previous till_column. But now we have an interesting detail. Each sunflower has its own petal count and we are to use these petals to determine which sunflower to collect first. So, after planting a sunflower we call the measure() function and put its result into a dictionary:

heights[(q, p)] = measure()

Keys of this dictionary are tuples of coordinates and the values - the numbers of petals.

And last, but not least detail: reduce_heights function. It takes two dictionaries as parameters, updates the first with the values from the second and returns this first dictionary. Here the logic of the whole affair is a little bit complicated and I intentionally did it. In my first version the code was clearer: every time I created a fresh dictionary and filled it with the content of two parameters. But in this case drone suddenly stopped for a noticeably long time. It turned out that the price of clean code was quadratic complexity of the algorithm. Therefore I changed it and now you see messier, but faster code.

I'm not sure that one post can contain so much code, so let's use links to github. Starting from this line I convert a dictionary from format (x,y) -> h to h -> [(x1, y1), (x2, y2), ...]. For each height of sunflower now we have a list of cells with sunflowers of this height in these cells.

Finally , we use our map_reduce approach to collect all sunflowers in a correct order. "gather_all" is a factory function for this task.