
# 2. Planning and control — Acting in the real world

Perceiving the world is only one part of the equation. Acting on it is even more challenging. Passive perception is just about contemplating the world, but action requires us to consider how the world interacts with us and vice versa. That’s why Tesla made Autopilot’s most basic principle “to never crash.” Whatever the car does, it must not touch anything else.

Today, Autopilot solves planning and control with an explicit approach: Tesla’s engineers hard-code the methods by hand rather than using machine learning techniques. However, some situations can’t be handled adequately this way, so they plan to use learning-based methods in the future.

They envision a two-system structure for planning and control that combines:

- Explicit planning
- Learning-based planning

## Explicit planning — Hardcoding their way to the destination

There are three crucial differences between the visual system I described in the previous section and the current action system, which comprises planning and control.

The two systems serve different purposes. I make this comparison because most people are familiar with AI systems like the visual neural network, but not with the planning/control system. My aim here is to give a sense of its complexity.

The first difference is that the visual system is completely virtual, in the sense that no one except the car is directly affected by it. In contrast, the action system has effects on the real world. It influences our lives in a direct manner — either as passengers or as pedestrians.

That’s why the engineering team prioritized **safety, comfort, and efficiency**. Getting to the destination only matters if those priorities are fulfilled adequately.

The second difference is that any system that faces the real world will encounter, as Elon Musk implied, the highest number of degrees of freedom possible.

Solving chess, Go, or even StarCraft is easy compared with navigating the real world. In more technical terms, the search space is **non-convex** (the system can get stuck in a local minimum that solves a specific situation but isn’t valid as a general solution) and **high-dimensional** (the system needs to process many parameters, such as acceleration and trajectory, to plan what to do next).

They faced a dilemma when trying to tackle both problems at the same time.

On the one hand, **discrete search algorithms** handle non-convexity because they don’t get stuck in local minima, but they can become computationally intractable as dimensionality grows. On the other hand, **continuous function optimization** algorithms handle high-dimensional spaces easily but can get stuck in local minima.
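To make the trade-off concrete, here is a minimal sketch on a toy one-dimensional cost function. The function `f`, the step size, and the grid resolution are illustrative assumptions, not anything taken from Autopilot. Gradient descent started on the wrong side settles in the shallower basin, while a coarse grid search lands in the deeper one.

```python
def f(x):
    # Toy non-convex cost (assumed for illustration): a deep basin near
    # x = -1.30 and a shallower one near x = 1.13.
    return x**4 - 3 * x**2 + x

def grad(x):
    # Analytical derivative of f.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x, lr=0.01, steps=2000):
    # Continuous optimization: follows the local slope only.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def grid_search(lo, hi, n):
    # Discrete search: evaluates n evenly spaced candidates and keeps the best.
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(xs, key=f)

x_gd = gradient_descent(2.0)          # settles in the shallower basin
x_grid = grid_search(-2.0, 2.0, 41)   # finds the deeper basin, at grid precision
print(f"gradient descent: x={x_gd:.3f}, cost={f(x_gd):.3f}")
print(f"grid search:      x={x_grid:.3f}, cost={f(x_grid):.3f}")
```

The grid search only needs 41 evaluations here, but the number of grid points grows exponentially with the number of dimensions, which is the intractability problem described above.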

They found a **hybrid solution**:

In short, starting from the vector space predictions provided by the visual system, they solve the non-convexity with a **coarse search** to obtain a **convex corridor** and then apply a **continuous optimization method** to that output to get a **smooth trajectory**.

The idea is to avoid the high-dimensionality bottleneck in the search method by using physics models with a low number of features and applying a coarse grid. They found that a trade-off between grid resolution and computing time works well, reaching 2500 searches in 2.5 ms.

Here’s a visualization of how the system searches for the best route. The system rapidly transforms the space from non-convex to convex, finding a set of plan candidates that make up what they call the convex corridor. It then picks the best candidate by maximizing safety, comfort, and efficiency; this is the convex corridor solution.
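One common way to rank plan candidates against several goals is a weighted cost. The sketch below assumes that form; the `Candidate` fields, the weights, and the example numbers are all hypothetical and only illustrate how safety, comfort, and efficiency could be combined into one score.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    min_gap_m: float        # closest approach to any obstacle (safety)
    max_accel_mps2: float   # peak acceleration felt by passengers (comfort)
    travel_time_s: float    # time needed to complete the maneuver (efficiency)

# Hypothetical weights: safety dominates, then comfort, then travel time.
W_SAFETY, W_COMFORT, W_TIME = 10.0, 1.0, 0.5

def cost(c: Candidate) -> float:
    # Smaller gaps are penalized more heavily; the floor avoids division by zero.
    safety_penalty = 1.0 / max(c.min_gap_m, 0.1)
    return (W_SAFETY * safety_penalty
            + W_COMFORT * c.max_accel_mps2
            + W_TIME * c.travel_time_s)

candidates = [
    Candidate("tight_fast", min_gap_m=0.5, max_accel_mps2=3.0, travel_time_s=8.0),
    Candidate("wide_smooth", min_gap_m=2.0, max_accel_mps2=1.0, travel_time_s=10.0),
    Candidate("wide_slow", min_gap_m=2.5, max_accel_mps2=0.8, travel_time_s=16.0),
]

best = min(candidates, key=cost)
print(best.name, round(cost(best), 2))
```

With these made-up numbers the fastest candidate loses because it passes too close to an obstacle, and the slowest one loses on travel time.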

The convex corridor is the space in which the best trajectory is to be found. However, due to the lack of precision of the coarse search method, the final solution is only achieved after the information is passed through a continuous optimization function.

Now, the car has to do the required actions to carry out the plan. Here’s where it uses a continuous optimization algorithm to rapidly process the high-dimensional parameter space within the convex corridor set of possible trajectories to find a smooth path to follow.
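The two stages can be sketched on a toy problem: a car must pass an obstacle that blocks part of the road. Everything below (the road width, the obstacle, the cost terms, and the projected gradient descent solver) is an assumption chosen for illustration, not Tesla’s implementation. A coarse search over a few lateral offsets decides which side to pass on, which turns the problem into simple per-waypoint bounds (a convex corridor). A continuous optimizer then smooths the path inside those bounds.

```python
N = 11                         # number of waypoints along the road
ROAD = (-3.0, 3.0)             # lateral limits of the road, in meters
OBSTACLE_STEPS = range(4, 7)   # waypoints where the obstacle is present
OBSTACLE_SPAN = (-0.5, 1.5)    # lateral extent of the obstacle
MARGIN = 0.5                   # required clearance

def coarse_search(step=1.0):
    """Discrete stage: choose the passing side on a coarse lateral grid."""
    lo_free = OBSTACLE_SPAN[0] - MARGIN    # pass at or below this offset...
    hi_free = OBSTACLE_SPAN[1] + MARGIN    # ...or at or above this one
    n = int((ROAD[1] - ROAD[0]) / step) + 1
    candidates = [ROAD[0] + i * step for i in range(n)]
    feasible = [c for c in candidates if c <= lo_free or c >= hi_free]
    best = min(feasible, key=abs)          # least deviation from the lane center
    lower, upper = [ROAD[0]] * N, [ROAD[1]] * N
    for i in OBSTACLE_STEPS:               # the corridor is a box per waypoint
        if best <= lo_free:
            upper[i] = lo_free
        else:
            lower[i] = hi_free
    return lower, upper

def cost(y, w=0.1):
    """Convex cost: smoothness plus a small pull toward the lane center."""
    smooth = sum((y[i + 1] - y[i]) ** 2 for i in range(N - 1))
    return smooth + w * sum(v * v for v in y)

def refine(lower, upper, w=0.1, lr=0.1, iters=5000):
    """Continuous stage: projected gradient descent inside the corridor."""
    y = [min(max(0.0, lo), hi) for lo, hi in zip(lower, upper)]
    for _ in range(iters):
        g = [2 * w * y[i] for i in range(N)]
        for i in range(N):
            if i > 0:
                g[i] += 2 * (y[i] - y[i - 1])
            if i < N - 1:
                g[i] += 2 * (y[i] - y[i + 1])
        for i in range(1, N - 1):          # endpoints stay fixed at the center
            y[i] = min(max(y[i] - lr * g[i], lower[i]), upper[i])
    return y

lower, upper = coarse_search()
trajectory = refine(lower, upper)
print([round(v, 2) for v in trajectory])
```

Because the corridor is convex and the cost is quadratic, the second stage has a single minimum, so the optimizer cannot get trapped the way it could in the original non-convex space.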

The third difference, as I said at the beginning of this section, is that we have to take into account how other agents are acting — or going to act — to find the best joint plan. The visual system perceives this info but does nothing with it. But the planning system has to adapt and change accordingly because other agents’ actions may modify ours in drastic ways.

To recap, here are the main points about the planning and control system:

- Safety, comfort, and efficiency are maximized.
- A hybrid approach combining coarse search and continuous optimization functions solves both non-convexity and high-dimensionality.
- The system first searches for the best set of trajectories, the convex corridor, using physical models.
- Then it finds a definitive solution using continuous optimization methods.
- The control system carries out the planned solution to get the car to the destination.

## Learning-based planning — Handling complex situations

There are situations in which the coarse search + continuous optimization mix isn’t efficient enough to solve the planning problem (e.g., densely populated city centers at rush hour). For these cases, they will implement **learning-based methods**, which are not yet in use in the current version of Autopilot.

To make an analogy, the vector space predictions from the visual NN provide a framework similar to a multiplayer Atari game. Because DeepMind’s MuZero had already mastered Atari games, Tesla decided to borrow its approach, which seems to fit the planning problem well.

**A combination of neural networks and Monte Carlo Tree Search (MCTS) provides a global solution.** The neural network produces state and action distributions, which the MCTS algorithm then uses to reach the goal while accounting for cost functions such as proximity to objects, interventions, discomfort, or travel time. The NN provides global context to the search algorithm so it doesn’t get stuck in local minima. Here’s the comparison between methods, which illustrates how the NN + MCTS approach surpasses the other options.
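To show how a network can guide a tree search, here is a toy sketch on a three-lane road with a few blocked cells. It is not Tesla’s or DeepMind’s implementation: `prior` and `value_estimate` are hand-written stand-ins for the policy and value networks, the selection rule is a PUCT-style formula, and the grid world, costs, and constants are all assumptions.

```python
import math

LANES, HORIZON = 3, 5
ACTIONS = (-1, 0, 1)                  # steer left, keep lane, steer right
BLOCKED = {(2, 1), (3, 1), (4, 0)}    # (time step, lane) cells taken by other agents

def step(state, action):
    """Deterministic toy dynamics; returns (next_state, reward)."""
    t, lane = state
    nxt = (t + 1, min(max(lane + action, 0), LANES - 1))
    cost = 0.1 * abs(action)          # discomfort of a lane change
    if nxt in BLOCKED:
        cost += 10.0                  # collision
    return nxt, -cost

def prior(state):
    """Stand-in for a policy network: softmax over one-step rewards."""
    scores = [math.exp(step(state, a)[1]) for a in ACTIONS]
    total = sum(scores)
    return [s / total for s in scores]

def value_estimate(state):
    """Stand-in for a value network: optimistic guess of zero future cost."""
    return 0.0

class Node:
    def __init__(self, state, prior_p):
        self.state, self.prior_p = state, prior_p
        self.children = {}            # action -> (child Node, edge reward)
        self.visits, self.value_sum = 0, 0.0

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def simulate(node, c_puct=1.5):
    """One search iteration: select, expand, evaluate, back up."""
    if node.state[0] >= HORIZON:
        value = 0.0                   # terminal: no future cost
    elif not node.children:
        for a, p in zip(ACTIONS, prior(node.state)):
            nxt, r = step(node.state, a)
            node.children[a] = (Node(nxt, p), r)
        value = value_estimate(node.state)
    else:
        def puct(item):
            child, r = item[1]
            bonus = c_puct * child.prior_p * math.sqrt(node.visits) / (1 + child.visits)
            return r + child.q() + bonus
        _, (child, r) = max(node.children.items(), key=puct)
        value = r + simulate(child, c_puct)
    node.visits += 1
    node.value_sum += value
    return value

def plan(start_lane, iterations=1000):
    root = Node((0, start_lane), 1.0)
    for _ in range(iterations):
        simulate(root)
    path, node = [], root
    while node.children:              # follow the most-visited branch
        a, (node, _) = max(node.children.items(), key=lambda kv: kv[1][0].visits)
        path.append(a)
    return path

def path_cost(start_lane, path):
    state, total = (0, start_lane), 0.0
    for a in path:
        state, r = step(state, a)
        total -= r
    return total

path = plan(start_lane=1)
print(path, path_cost(1, path))
```

The prior assigns almost no probability to actions that lead straight into a blocked cell, so the search spends its budget on the remaining branches instead of expanding the tree uniformly.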
