Skip to main content

Chapter 5: Navigation and mapping

So far, we have defined state, built models of how a robot moves and designed controllers that choose actions. Throughout that discussion, however, we have been quietly relying on a very strong assumption: we know where the robot is and what the world looks like. This chapter we remove that assumption.

SLAM

Let’s start with the core problem:

how do I know where I am if I don’t know where anything is?

This is the SLAM problem, or simultaneous localisation and mapping. In SLAM, the robot must estimate its own state and the map of the environment at the same time.

The difficulty is that these two estimates depend on each other. To build a map, the robot needs to know where it is. To know where it is, the robot needs a map. This creates a circular dependency.

The key idea in SLAM is to solve both problems jointly. As the robot moves, it updates its belief about its position and its belief about the map. Each estimate supports and improves the other.


Joint state estimation

Lets illustrate this for a robot keeping track of its state using a topological map of N landmarks. Instead of estimating only the robot state,

xr=(x,y,θ),x_r = (x, y, \theta),

we estimate a joint state that stacks the robot pose together with all NN landmark positions:

x=(xrm1m2mN)R3+2N.\mathbf{x} = \begin{pmatrix} x_r \\ m_1 \\ m_2 \\ \vdots \\ m_N \end{pmatrix} \in \mathbb{R}^{3 + 2N}.

Each landmark mi=(mi,x,mi,y)m_i = (m_{i,x},\, m_{i,y})^\top is a 2D position in the world frame.

We maintain a Gaussian belief over this joint state:

p(xkz1:k,u1:k)=N(μk,Σk).p(\mathbf{x}_k \mid z_{1:k}, u_{1:k}) = \mathcal{N}(\boldsymbol{\mu}_k,\, \boldsymbol{\Sigma}_k).

The covariance matrix Σk\boldsymbol{\Sigma}_k has a block structure:

Σk=(ΣrrΣrm1ΣrmNΣm1rΣm1m1Σm1mNΣmNrΣmNmN).\boldsymbol{\Sigma}_k = \begin{pmatrix} \Sigma_{rr} & \Sigma_{rm_1} & \cdots & \Sigma_{rm_N} \\ \Sigma_{m_1 r} & \Sigma_{m_1 m_1} & \cdots & \Sigma_{m_1 m_N} \\ \vdots & & \ddots & \vdots \\ \Sigma_{m_N r} & \cdots & & \Sigma_{m_N m_N} \end{pmatrix}.

The off-diagonal blocks, such as Σrmi\Sigma_{r m_i}, capture the correlations between robot pose and landmark positions. This is the key insight. Once the robot returns to a known landmark, uncertainty in the entire map can be reduced because the robot pose and landmark estimates are statistically linked.


EKF SLAM

One concrete implementation of SLAM uses the Extended Kalman Filter (EKF) to maintain the Gaussian belief over robot pose and landmarks.

Predict step

During the prediction step, the robot moves according to a nonlinear motion model, such as a differential-drive model:

xk+1=f(xr,k,uk)+ϵk,ϵkN(0,Qk).x_{k+1} = f(x_{r,k},\, u_k) + \epsilon_k, \qquad \epsilon_k \sim \mathcal{N}(0, Q_k).

The landmarks are assumed to be stationary. This means the robot part of the state changes, while the landmark estimates remain the same:

μk=(f(μr,k1,uk)μm1μmN).\overline{\boldsymbol{\mu}}_k = \begin{pmatrix} f(\mu_{r,k-1},\, u_k) \\ \mu_{m_1} \\ \vdots \\ \mu_{m_N} \end{pmatrix}.

Because the motion model is nonlinear, we linearise it using the Jacobian

Fk=fxrμr,k1,uk.F_k = \left.\frac{\partial f}{\partial x_r}\right|_{\mu_{r,k-1},\, u_k}.

This robot-state Jacobian is then embedded into the full-state Jacobian:

Fk=(Fk00I2N).\mathbf{F}_k = \begin{pmatrix} F_k & 0 \\ 0 & I_{2N} \end{pmatrix}.

The predicted covariance becomes

Σk=FkΣk1Fk+Qk,\overline{\boldsymbol{\Sigma}}_k = \mathbf{F}_k\, \boldsymbol{\Sigma}_{k-1}\, \mathbf{F}_k^\top + \mathbf{Q}_k,

where Qk\mathbf{Q}_k contains QkQ_k in the robot block and zeros elsewhere, since landmarks do not move.

Measurement model

Now suppose the robot observes landmark ii using a range-bearing sensor. The expected observation is

z^ki=hi(μk)=((μmi,xμˉx)2+(μmi,yμˉy)2atan2(μmi,yμˉy,  μmi,xμˉx)μˉθ).\hat{z}_k^i = h^i(\overline{\boldsymbol{\mu}}_k) = \begin{pmatrix} \sqrt{(\mu_{m_i,x} - \bar{\mu}_x)^2 + (\mu_{m_i,y} - \bar{\mu}_y)^2} \\ \text{atan2}(\mu_{m_i,y} - \bar{\mu}_y,\; \mu_{m_i,x} - \bar{\mu}_x) - \bar{\mu}_\theta \end{pmatrix}.

The Jacobian HkiH_k^i maps the full state to the observation space:

Hki=hixμk.H_k^i = \left.\frac{\partial h^i}{\partial \mathbf{x}}\right|_{\overline{\boldsymbol{\mu}}_k}.

This matrix is sparse. Only the columns corresponding to the robot pose and landmark ii are non-zero, because a single landmark observation directly depends only on the robot pose and that landmark.

Update step

For each observed landmark ii, we first compute the innovation, which is the difference between the actual observation and the expected observation:

yki=zkiz^ki.y_k^i = z_k^i - \hat{z}_k^i.

The innovation covariance is

Ski=HkiΣk(Hki)+Rk,S_k^i = H_k^i\, \overline{\boldsymbol{\Sigma}}_k\, (H_k^i)^\top + R_k,

and the Kalman gain is

Kki=Σk(Hki)(Ski)1.K_k^i = \overline{\boldsymbol{\Sigma}}_k\, (H_k^i)^\top\, (S_k^i)^{-1}.

The mean and covariance are then updated as

μk=μk+Kkiyki\boldsymbol{\mu}_k = \overline{\boldsymbol{\mu}}_k + K_k^i\, y_k^i

and

Σk=(IKkiHki)Σk.\boldsymbol{\Sigma}_k = (I - K_k^i\, H_k^i)\, \overline{\boldsymbol{\Sigma}}_k.

Although the observation is of only one landmark, the gain KkiK_k^i can have non-zero entries for all landmarks because of the off-diagonal covariance blocks. This means that observing one landmark can update the estimated positions of other landmarks.

Initialising new landmarks

When a new landmark jj is seen for the first time, we initialise its position by inverting the observation model:

μmj=(μˉx+rcos(ϕ+μˉθ)μˉy+rsin(ϕ+μˉθ)),\mu_{m_j} = \begin{pmatrix} \bar{\mu}_x + r\cos(\phi + \bar{\mu}_\theta) \\ \bar{\mu}_y + r\sin(\phi + \bar{\mu}_\theta) \end{pmatrix},

where (r,ϕ)(r, \phi) is the observed range and bearing. The landmark’s initial uncertainty is computed using the Jacobian of this inverse model, propagating both the sensor noise and the current robot pose uncertainty.

Data association

Before applying a measurement update, the robot must decide which observed feature corresponds to which landmark in the map. This is the data association problem.

A common approach is nearest-neighbour matching using the Mahalanobis distance:

dM2=(zz^)S1(zz^).d_M^2 = (z - \hat{z})^\top S^{-1} (z - \hat{z}).

A new landmark is initialised only when no existing landmark lies within a chosen threshold. Incorrect associations can corrupt the map, which makes data association one of the most challenging parts of practical SLAM.

Key intuition

The key intuition is that uncertainty is shared between the robot and the map. When the robot re-observes a known landmark, the error in that observation flows back through the correlated covariance structure and can reduce uncertainty throughout the map. This is loop closure: returning to a previously visited place can dramatically improve the map.

Smith, R., Self, M., and Cheeseman, P. (1986). "On the representation and estimation of spatial uncertainty." International Journal of Robotics Research, 5(4), 56–68.

Durrant-Whyte, H. and Bailey, T. (2006). "Simultaneous Localisation and Mapping (SLAM): Part I." IEEE Robotics and Automation Magazine, 13(2), 99–110.


Occupancy grid SLAM

Landmarks are useful when the world contains nice discrete features. But many environments do not naturally break into clean landmark observations. In those cases, we often use an occupancy grid.

An occupancy grid represents the world as a regular grid of cells. Each cell cc stores the log-odds of occupancy:

lc=logp(c=occ)1p(c=occ).l_c = \log \frac{p(c = \text{occ})}{1 - p(c = \text{occ})}.

We usually start from a prior l0=0l_0 = 0, which corresponds to p=0.5p = 0.5. Each time a sensor ray passes through or terminates at a cell, we update that cell:

lclc+logp(c=occz)1p(c=occz)l0.l_c \leftarrow l_c + \log \frac{p(c = \text{occ} \mid z)}{1 - p(c = \text{occ} \mid z)} - l_0.

In practice, cells at the end of a sensor ray receive a positive occupancy increment loccl_\text{occ}, while cells along the free part of the ray receive a negative free-space decrement lfreel_\text{free}.

We can always convert the log-odds value back into a probability:

p(c=occ)=111+elc.p(c = \text{occ}) = 1 - \frac{1}{1 + e^{l_c}}.

The log-odds representation is convenient because sequential updates become additions, and the values are symmetric around zero.

The main change from landmark SLAM is the size of the map. Instead of a small set of landmark positions, the map may contain W×HW \times H cells, each updated independently. We still need a robot pose estimate to know which cells each ray passes through, often using ray-casting methods such as Bresenham’s line algorithm. However, because the map is so large, joint EKF-style estimation quickly becomes infeasible.

Occupancy grids are flexible and work directly with lidar or depth cameras without requiring discrete landmark extraction. The tradeoff is that they can be computationally expensive in large environments and they require accurate localisation; otherwise, the resulting map becomes blurry or inconsistent.


Rao-Blackwellisation and FastSLAM

As the joint state grows, especially with occupancy grids, estimating everything together becomes expensive. FastSLAM addresses this by exploiting the conditional independence structure of the SLAM problem.

Given a full robot trajectory x0:kx_{0:k}, the map cells or landmarks become conditionally independent of each other:

p(x0:k,mz1:k,u1:k)=p(mx0:k,z1:k)  p(x0:kz1:k,u1:k).p(\mathbf{x}_{0:k}, m \mid z_{1:k}, u_{1:k}) = p(m \mid \mathbf{x}_{0:k}, z_{1:k})\; p(\mathbf{x}_{0:k} \mid z_{1:k}, u_{1:k}).

This factorisation is called Rao-Blackwellisation. Instead of estimating everything jointly, we sample possible robot trajectories with a particle filter, and for each particle we maintain a separate map conditioned on that trajectory.

FastSLAM algorithm

Each particle ii stores a robot trajectory hypothesis x0:k(i)x_{0:k}^{(i)} and a set of landmark EKFs,

{μmj(i),Σmj(i)}j=1N,\{\mu_{m_j}^{(i)}, \Sigma_{m_j}^{(i)}\}_{j=1}^N,

with one small EKF per landmark, per particle.

During prediction, a new pose is sampled from the motion model:

xk(i)p(xkxk1(i),uk).x_k^{(i)} \sim p(x_k \mid x_{k-1}^{(i)}, u_k).

During the update step, each observed landmark jj is used to compute the particle weight under that particle’s landmark estimate:

wk(i)N ⁣(zkj;  h(xk(i),μmj(i)),  Hk(i)Σmj(i)(Hk(i))+R).w_k^{(i)} \propto \mathcal{N}\!\left(z_k^j;\; h(x_k^{(i)}, \mu_{m_j}^{(i)}),\; H_k^{(i)}\Sigma_{m_j}^{(i)}(H_k^{(i)})^\top + R\right).

The relevant EKF for landmark jj is then updated inside particle ii, and particles are resampled according to their weights.

FastSLAM scales as O(MlogN)O(M \log N) in the number of particles MM and landmarks NN, compared with O(N2)O(N^2) for EKF SLAM. This is a significant practical improvement.

Montemerlo, M., Thrun, S., Koller, D., and Wegbreit, B. (2002). "FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem." In Proceedings of the AAAI National Conference on Artificial Intelligence, pp. 593–598.


Global vs Local Planning

Once we have a map, we can ask a new question:

how do we get from A to B?

Before looking at specific algorithms, it is useful to distinguish two levels of planning that operate together in most autonomous robots.

A global planner computes a path from the robot’s current pose to a distant goal using a complete, or nearly complete, map of the environment. It operates over the full map and usually runs relatively infrequently, such as once per goal or when the map changes significantly. Its output is normally a reference path or waypoint sequence. Global planners include methods such as A*, D*, RRT and PRM.

A local planner, in contrast, is responsible for short-term motion around the robot’s immediate surroundings. It operates over a small local window, often only a few metres around the robot, and runs at high frequency. It must react to dynamic obstacles such as people, other vehicles, or objects that were not in the global map. Local planners include the Dynamic Window Approach, tentacle-based navigation and trajectory rollout.

In practice, these two planners interact continuously. The global planner provides a desired heading, path or waypoint, while the local planner chooses a feasible motion toward that reference under the robot’s kinematic constraints and current sensor observations. If the local planner cannot make progress, for example because of a deadlock or newly discovered obstacle, the global planner must re-plan.


Bug algorithms

Bug algorithms are among the simplest local reactive strategies. They assume that the robot knows where the goal is, can sense obstacles locally and does not have a map.

Bug 0

Bug 0 moves directly toward the goal whenever possible. When the robot hits an obstacle boundary, it follows that boundary, for example by always keeping the obstacle on its left, until the direction to the goal is clear again. It then leaves the boundary and resumes heading for the goal.

while not at goal:
if path to goal is clear:
move toward goal
else:
follow obstacle boundary
if direction to goal is free:
leave boundary and move toward goal

Bug 0 is simple, but it is not guaranteed to terminate in all environments. It can get trapped in loops.

Bug 1

Bug 1 is more cautious. When it encounters an obstacle, it fully circumnavigates the obstacle boundary, records the boundary point closest to the goal and then leaves the obstacle from that point.

while not at goal:
move toward goal
if obstacle hit at hit point H:
circumnavigate entire obstacle
record the point L closest to the goal
return to L
leave toward goal

Bug 1 is guaranteed to terminate if a path exists. Its path length is at most

d(qstart,qgoal)+12iPi,d(q_{\text{start}}, q_{\text{goal}}) + \frac{1}{2}\sum_{i} P_i,

where PiP_i is the perimeter of obstacle ii.

Bug 2

Bug 2 avoids full circumnavigation when possible. It uses the M-line, the straight line from start to goal. When the robot hits an obstacle, it follows the obstacle boundary only until the M-line is crossed again at a point closer to the goal than the original hit point.

Compute the M-line: the straight line from start to goal
while not at goal:
move along M-line toward goal
if obstacle hit at H:
follow obstacle boundary
if M-line crossed at point L with d(L, goal) < d(H, goal):
leave boundary, resume along M-line

Bug 2 is also guaranteed to terminate if a path exists. Its path length is bounded by

d(qstart,qgoal)+12iniPi,d(q_{\text{start}}, q_{\text{goal}}) + \frac{1}{2}\sum_{i} n_i P_i,

where nin_i is the number of times the M-line intersects obstacle ii. In practice, Bug 2 is generally shorter than Bug 1.

These algorithms are simple and surprisingly effective in the right conditions.

Lumelsky, V. J. and Stepanov, A. A. (1987). "Path-planning strategies for a point mobile automaton moving amidst unknown obstacles of arbitrary shape." Algorithmica, 2(1–4), 403–430.

Their limitations are also important. They can be inefficient, their path length can grow with obstacle perimeter, they provide no global optimality guarantee and they do not benefit from accumulated map knowledge. They are purely reactive methods.


A* and D*

Now suppose we do have a map. Planning can then be formulated as a graph search problem.

We discretise the environment into a grid or graph G=(V,E)G = (V, E). Each node nVn \in V represents a configuration, such as a grid cell, and each edge (n,n)E(n, n') \in E has a cost c(n,n)c(n, n'), such as the Euclidean distance between cell centres. The goal is to find a minimum-cost path from a start node ss to a goal node gg.

A* algorithm

A* maintains two quantities for each node nn. The first is g(n)g(n), the cost of the cheapest path found so far from the start to nn. The second is h(n)h(n), a heuristic estimate of the remaining cost from nn to the goal.

The priority function is

f(n)=g(n)+h(n).f(n) = g(n) + h(n).

Nodes are expanded in order of increasing f(n)f(n).

open_set = {start}
g(start) = 0
f(start) = h(start)

while open_set is not empty:
n = node in open_set with smallest f(n)
if n == goal: return path

remove n from open_set
for each neighbour n' of n:
tentative_g = g(n) + c(n, n')
if tentative_g < g(n'):
g(n') = tentative_g
f(n') = g(n') + h(n')
parent(n') = n
add n' to open_set if not already there

A heuristic h(n)h(n) is admissible if it never overestimates the true cost to the goal:

h(n)h(n)n,h(n) \leq h^*(n) \qquad \forall n,

where h(n)h^*(n) is the true cheapest cost from nn to the goal. For grid-based planning, a common choice is Euclidean distance,

h(n)=ng2,h(n) = \|n - g\|_2,

or Manhattan distance when only 4-connected moves are allowed:

h(n)=nxgx+nygy.h(n) = |n_x - g_x| + |n_y - g_y|.

Theorem (Hart et al., 1968): if hh is admissible, A* returns an optimal path. A* is also complete, meaning that if a path exists, it will be found.

Hart, P. E., Nilsson, N. J., and Raphael, B. (1968). "A Formal Basis for the Heuristic Determination of Minimum Cost Paths." IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107.

D*

D*, or Dynamic A*, extends A* to handle changing maps. Instead of repeatedly replanning from scratch whenever the map changes, D* repairs only the affected portion of the plan. It searches from the goal back toward the start while maintaining back-pointers. When an edge cost changes, such as when a new obstacle is detected, only the relevant parts of the search tree need to be updated.

Stentz, A. (1994). "Optimal and Efficient Path Planning for Partially-Known Environments." In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 3310–3317.


Probabilistic Roadmaps (PRM)

Instead of planning on a dense grid, a Probabilistic Roadmap represents free space as a sparse graph built by random sampling.

During the build phase, random configurations are sampled. Collision-free configurations are added as graph nodes, and nearby nodes are connected if the path between them is also collision-free.

while roadmap is not dense enough:
q = sample random configuration
if q is collision-free:
add q to V
for each q' in V near q:
if path q -> q' is collision-free:
add edge (q, q') to E

During the query phase, the start and goal configurations are connected to the roadmap, and a graph search method such as A* or Dijkstra’s algorithm is used to find a path.

connect q_start and q_goal to the roadmap
find shortest path using A* or Dijkstra

Kavraki, L. E., Švestka, P., Latombe, J. C., and Overmars, M. H. (1996). "Probabilistic roadmaps for path planning in high-dimensional configuration spaces." IEEE Transactions on Robotics and Automation, 12(4), 566–580.

PRMs are effective for high-dimensional configuration spaces, such as robot arms, and the roadmap can be reused for multiple queries in the same environment. Their main weakness is that they struggle with narrow passages, because the probability of sampling useful connections in a narrow region can be low. They also assume a mostly static environment and are not naturally suited to online replanning.


RRT

Rapidly-exploring Random Trees, or RRTs, take a different approach. Instead of building a full roadmap, an RRT grows a single tree from the start state toward unexplored regions of the space and, eventually, toward the goal.

T = {x_start}

for i = 1 to N:
x_rand = sample random state (with probability p_goal, use x_goal)
x_near = nearest node in T to x_rand
x_new = Steer(x_near, x_rand, step_size)

if x_new is collision-free:
add x_new to T, with parent x_near
if ||x_new - x_goal|| < goal_threshold:
return path from x_start to x_new

The Steer function moves from xnearx_\text{near} toward xrandx_\text{rand} by at most step_size:

xnew=xnear+Δxrandxnearxrandxnear.x_\text{new} = x_\text{near} + \Delta \cdot \frac{x_\text{rand} - x_\text{near}}{\|x_\text{rand} - x_\text{near}\|}.

LaValle, S. M. (1998). "Rapidly-Exploring Random Trees: A New Tool for Path Planning." Technical Report TR 98-11, Iowa State University.

The key property of RRT is its Voronoi bias. The probability of extending a node is proportional to the volume of its Voronoi region:

P(node n extended)Vol(Voronoi region of n).P(\text{node } n \text{ extended}) \propto \text{Vol}(\text{Voronoi region of } n).

This means the tree naturally expands toward unexplored regions.

RRT*

The original RRT is not asymptotically optimal. The first solution found may be poor, and continuing to sample does not necessarily improve it. RRT* adds two ideas: near-set rewiring and tree rewiring.

When adding xnewx_\text{new}, RRT* checks nearby nodes and chooses the parent that gives the cheapest path to xnewx_\text{new}:

xparent=argminxNear(xnew)[c(xstartx)+c(xxnew)].x_\text{parent} = \arg\min_{x' \in \text{Near}(x_\text{new})} \left[c(x_\text{start} \to x') + c(x' \to x_\text{new})\right].

Then, for each nearby node xx', it checks whether routing through xnewx_\text{new} would be cheaper:

if c(xstartxnew)+c(xnewx)<c(xstartx):reparent x to xnew.\text{if } c(x_\text{start} \to x_\text{new}) + c(x_\text{new} \to x') < c(x_\text{start} \to x'):\quad \text{reparent } x' \text{ to } x_\text{new}.

RRT* is asymptotically optimal: as the number of samples NN \to \infty, the path cost converges to the optimal cost.

Karaman, S. and Frazzoli, E. (2011). "Sampling-based algorithms for optimal motion planning." International Journal of Robotics Research, 30(7), 846–894.

RRT is especially useful for systems with complex kinematics or dynamics, continuous high-dimensional configuration spaces and single-query planning problems where there is one start and one goal.


Local planning methods

Global planners produce a reference path, but they cannot react quickly to dynamic obstacles or directly handle low-level kinematic constraints. For that we need a local planner.

Dynamic Window Approach (DWA)

The Dynamic Window Approach searches directly in velocity space, using candidate commands (v,ω)(v, \omega), to find a safe and goal-directed motion.

The first step is to compute the dynamic window. Given the current velocity (vc,ωc)(v_c, \omega_c) and the robot’s acceleration limits (v˙max,ω˙max)(\dot{v}_\text{max}, \dot{\omega}_\text{max}), the velocities reachable within one time step τ\tau are

Vd={(v,ω)v[vcv˙maxτ,  vc+v˙maxτ],ω[ωcω˙maxτ,  ωc+ω˙maxτ]}.V_d = \{(v,\omega) \mid v \in [v_c - \dot{v}_\text{max}\tau,\; v_c + \dot{v}_\text{max}\tau],\quad \omega \in [\omega_c - \dot{\omega}_\text{max}\tau,\; \omega_c + \dot{\omega}_\text{max}\tau]\}.

This set is intersected with the admissible velocities VaV_a, which are the velocities that allow the robot to stop before hitting an obstacle:

Va={(v,ω)v2d(v,ω)v˙max}.V_a = \{(v,\omega) \mid v \leq \sqrt{2\, d(v,\omega)\, \dot{v}_\text{max}}\}.

Here, d(v,ω)d(v, \omega) is the distance to the nearest obstacle along the circular arc traced by the command (v,ω)(v, \omega).

The second step is to score each candidate velocity in VdVaV_d \cap V_a. Each candidate arc is simulated over a short horizon and evaluated using

G(v,ω)=σ ⁣[αheading(v,ω)+βclearance(v,ω)+γv].G(v, \omega) = \sigma\!\left[\alpha \cdot \text{heading}(v,\omega) + \beta \cdot \text{clearance}(v,\omega) + \gamma \cdot v\right].

The heading term measures alignment with the goal direction after the arc, the clearance term measures the distance to the nearest obstacle along the arc and the speed term rewards faster motion. The weights α\alpha, β\beta and γ\gamma tune the relative importance of these objectives, while σ()\sigma(\cdot) is a smoothing normalisation.

The final command is the highest-scoring admissible velocity:

(v,ω)=argmax(v,ω)VdVaG(v,ω).(v^*, \omega^*) = \arg\max_{(v,\omega) \in V_d \cap V_a} G(v, \omega).

DWA runs at sensor rate, typically around 20–50 Hz, and directly produces smooth, kinematically feasible commands.

Fox, D., Burgard, W., and Thrun, S. (1997). "The Dynamic Window Approach to Collision Avoidance." IEEE Robotics and Automation Magazine, 4(1), 23–33.

Trajectory rollout

Trajectory rollout is closely related to DWA. Instead of scoring only a short arc, it simulates each candidate velocity (v,ω)(v, \omega) over a longer trajectory of TT steps:

xt+1(v,ω)=f(xt(v,ω),v,ω).\mathbf{x}_{t+1}^{(v,\omega)} = f(\mathbf{x}_t^{(v,\omega)},\, v,\, \omega).

The full rollout is scored using a cost function such as

J(v,ω)=t=0Tc(xt(v,ω)),J(v,\omega) = \sum_{t=0}^{T} c(\mathbf{x}_t^{(v,\omega)}),

where c()c(\cdot) can encode clearance, deviation from the reference path and speed. This can be viewed as an early form of model predictive control applied to local navigation.

Driving with tentacles

Tentacle-based navigation uses a fixed set of candidate arcs, or tentacles, that are precomputed offline. Each tentacle kk is defined by a constant curvature

κk=ωkv,k=1,,K.\kappa_k = \frac{\omega_k}{v}, \qquad k = 1, \dots, K.

This gives KK precomputed arc paths that fan out in front of the robot, often with K=7K = 71515 and symmetry about the forward direction.

At runtime, each tentacle is checked against the local occupancy grid. A tentacle is blocked if any cell along it is occupied. The free tentacles are then scored using

sk=wggoal_alignment(κk)+wcclearance(κk)+wvvk.s_k = w_g \cdot \text{goal\_alignment}(\kappa_k) + w_c \cdot \text{clearance}(\kappa_k) + w_v \cdot v_k.

The robot selects the highest-scoring free tentacle and applies the corresponding (v,ω)(v, \omega) command.

Tentacle methods are extremely fast at runtime because the geometry is precomputed. The online computation is mostly occupancy lookup and a linear scoring pass. This makes them useful in high-speed autonomous driving, where compute can be limited and reaction time must be very short.

Urmson, C. et al. (2008). "Autonomous driving in urban environments: Boss and the Urban Challenge." Journal of Field Robotics, 25(8), 425–466.

Comparing local planning methods

Bug algorithms are simple and reactive, but they do not operate in velocity space and do not explicitly handle robot kinematics. DWA searches continuous velocity space, handles kinematics and reacts quickly. Trajectory rollout also handles kinematics but simulates longer candidate trajectories, making it somewhat more computationally expensive. Tentacle methods use precomputed arcs, which makes them extremely fast, but less flexible than continuous optimisation over velocity space.

MethodVelocity spacePre-computedHandles kinematicsReactive speed
Bug algorithmsNoNoNoSlow
DWAYes (continuous)NoYesFast
Trajectory rolloutYes (discrete)NoYesModerate
TentaclesYes (discrete)Yes (arcs)YesVery fast

Big picture

This chapter connects the major components of robot navigation. Modelling tells us how the robot moves. Estimation tells us where the robot is. Mapping tells us what the world looks like. Planning tells us what the robot should do, both globally and locally.

The full navigation stack can be viewed as

Sensor data

SLAM (localise + build map)

Global planner (A*, RRT, …) → reference path

Local planner (DWA, tentacles, …) → velocity commands

Controller → wheel torques

Each layer operates at a different timescale and level of abstraction.


Final thought

Robotics is not just choosing actions. It is choosing actions under uncertainty, in a world we do not fully know, that changes while we act. That is what makes it hard, and also what makes it interesting.

Key Papers

Smith, R., Self, M., and Cheeseman, P. (1986). "On the representation and estimation of spatial uncertainty." International Journal of Robotics Research, 5(4), 56–68.

Durrant-Whyte, H. and Bailey, T. (2006). "Simultaneous Localisation and Mapping (SLAM): Part I." IEEE Robotics and Automation Magazine, 13(2), 99–110.

Montemerlo, M., Thrun, S., Koller, D., and Wegbreit, B. (2002). "FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem." In Proceedings of AAAI, pp. 593–598.

Lumelsky, V. J. and Stepanov, A. A. (1987). "Path-planning strategies for a point mobile automaton moving amidst unknown obstacles of arbitrary shape." Algorithmica, 2(1–4), 403–430.

Hart, P. E., Nilsson, N. J., and Raphael, B. (1968). "A Formal Basis for the Heuristic Determination of Minimum Cost Paths." IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107.

Stentz, A. (1994). "Optimal and Efficient Path Planning for Partially-Known Environments." In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 3310–3317.

Kavraki, L. E., Švestka, P., Latombe, J. C., and Overmars, M. H. (1996). "Probabilistic roadmaps for path planning in high-dimensional configuration spaces." IEEE Transactions on Robotics and Automation, 12(4), 566–580.

LaValle, S. M. (1998). "Rapidly-Exploring Random Trees: A New Tool for Path Planning." Technical Report TR 98-11, Iowa State University.

Karaman, S. and Frazzoli, E. (2011). "Sampling-based algorithms for optimal motion planning." International Journal of Robotics Research, 30(7), 846–894.

Fox, D., Burgard, W., and Thrun, S. (1997). "The Dynamic Window Approach to Collision Avoidance." IEEE Robotics and Automation Magazine, 4(1), 23–33.

Urmson, C. et al. (2008). "Autonomous driving in urban environments: Boss and the Urban Challenge." Journal of Field Robotics, 25(8), 425–466.

Coming up next

We now have models, estimators and planners, both global and local. Next, we look at more complex articulated robots.

Chapter 6: Kinematics