# COMP7801 Topic 3 Spatial Networks

### Background

#### Network Distance

• In many real applications accessibility of objects is restricted by a spatial network
• Examples
• Driver looking for nearest gas station
• Mobile user looking for nearest restaurant
• Shortest path distance used instead of Euclidean distance
• SP(a,b) = path between a and b with the minimum accumulated length

#### Challenges

• Euclidean distance is no longer relevant
• R-tree may not be useful, when search is based on shortest path distance
• Graph cannot be flattened to a one-dimensional space
• Special storage and indexing techniques for graphs are required
• Graph properties may vary
• directed vs. undirected
• length, time, etc. as edge weights

### Modeling and Storing Spatial Networks

#### Modeling Spatial Networks

• Adjacency matrix only appropriate for dense graphs #### Storing Large Spatial Networks

• Problem: adjacency lists representation may not fit in memory if graph is large
• Solution:
• partition adjacency lists to disk blocks (based on proximity)
• create B+-tree index on top of partitions (based on node-id) • Given a graph G(V,E), and two nodes s,t in V, find the shortest path from s to t
• A classic algorithmic problem
• Studied extensively since the 1950’s
• Several methods:
• Dijkstra’s algorithm
• A*-search
• Bi-directional search
• idea: incrementally explore the graph around s, visiting nodes in distance order to s until t is found (like NN)  #### Algorithm #### Example #### Illustrating

• Find the shortest path between a and b.
• Worst-case performance O(|E| + |V|log|V| )

#### Description

• Dijkstra’s search explores nodes around s without a specific search direction until t is found

• Idea: improve Dijkstra’s algorithm by directing search towards t

• Due to triangular inequality, Euclidean distance is a lower bound of network distance

• Use Euclidean distance to lower bound network distance based on known information:

• Nodes are visited in increasing SPD(s,v)+dist(v,t) order
• SPD(s,v): shortest path distance from s to v (computed by Dijkstra)
• dist(v,t): Euclidean distance between v and t
• Original Dijkstra visits nodes in increasing SPD(s,v) order #### Example #### Illustrating

• Find the shortest path between s and t.
• f(p) = Dijkstra_dist(s, p) + Euclidean_dist(p, t)
• Dijkstra’s search explores nodes around s without a specific search direction until t is found
• Idea: search can be performed concurrently from s and from t (backwards)
• The shortest path tree of s and the (backward) shortest path tree of t are computed in concurrently
• One queue Q_s for forward and one queue Q_t for backward search
• Node visits are prioritized based on min(SPD(s,v), SPD(v,t))
• If v already visited from s and v is in Qt, then candidate shortest path: p(s,v)+p(v,t) (if v already visited from t and v in Q_s symmetric)
• If v is visited by both s and t terminate search; report best candidate shortest path

#### Example ### Discussions

• A* and bi-directional search can be combined to powerful search techniques
• A* can only be applied if lower distance bounds are available
• All versions of Dijkstra’s search require non-negative edge weights
• Bellman-Ford is an algorithm for arbitrary negative edges

## Spatial queries over spatial networks

### Introduction

#### Source/Destination on Edges

• We have assumed that points s and t are nodes of the network
• In practice s and t could be arbitrary points on edges
• Mobile user locations
• Solve problem by introducing 2 more nodes #### Spatial Queries over Spatial Networks

• Data:
• A (static) spatial network (e.g., city map)
• A (dynamic) set of spatial objects
• Spatial queries based on network distance:
• Selections. Ex: find gas stations within 10km driving distance from here
• Nearest neighbor search. Ex: find k nearest restaurants from present position
• Joins. Ex: find pairs of restaurants and hotels at most 100m from each other #### Methodology

• Store (and index) the spatial network
• Graph component (indexes connectivity information)
• Spatial component (indexes coordinates of nodes, edges, etc.)
• Store (and index) the sets of spatial objects
• Ex., one spatial relation for restaurants, one spatial relation for hotels, one relation for mobile users, etc.
• Given a spatial location p, use spatial component of network to find the network edge containing p
• Given a network edge, use network component to traverse neighboring edges
• Given a neighboring edge, use spatial indexes to find objects on them

### Evaluation of Spatial Selections (1)

• Query: find all objects in spatial relation R, within network distance ε from location q
• Method:
• Use spatial index of network (R-tree indexing network edges) to find edge n_1n_2, which includes q
• Use adjacency index of network (graph component) and apply Dijkstra’s algorithm to progressively retrieve edges that are within network distance ε from location q
• For all these edges apply a spatial selection on the R-tree that indexes R to find the results

#### Example

• Example: Find restaurants at most distance 10 from q

• Step 1: find network edge which contains q • Step 2: traverse network to find all edges (or parts of them within distance 10 from q) • Step 3: find restaurants that intersect the subnetwork computed at step 2 ### Evaluation of Spatial Selections (2)

#### Description

• Query: find all objects in spatial relation R, within network distance ε from location q
• Alternative method based on Euclidean bounds:
• Assumption: Euclidean distance is a lower-bound of network distance:
• dist(v,u) ≤ SPD(v,u), for any v,u
• Use R-tree on R to find set S of objects such that for each o in S: dist(q,o) ≤ ε
• For each o in S:
• find where o is located in the network (use Network R-tree)
• compute SPD(q,o) (e.g. use A*)
• If SPD(q,o) ≤ ε then output o

#### Example

• Example: Find restaurants at most distance 10 from q

• Step 1: find restaurants for which the Euclidean distance to q is at most 10: S={r1,r2,r3} • Step 2: for each restaurant in S, compute SPD to q and verify if it is indeed a correct result ### Evaluation of NN search (1)

• Query: find in spatial relation R the nearest object to a given location q
• Method:
• Use spatial index of network (R-tree indexing network edges) to find edge n_1n_2, which includes q
• Use adjacency index of network (graph component) and apply Dijkstra’s algorithm to progressively retrieve edges in order of their distance to q
• For each edge apply a spatial selection on the R-tree that indexes R to find any objects
• Keep track of nearest object found so far; use its shortest path distance to terminate network browsing

#### Example

• Example: Find nearest restaurant to q
• Step: in ppt 31

### Evaluation of NN search (2)

• Query: find in spatial relation R the nearest object to a given location q
• Alternative method based on Euclidean bounds:
• Assumption: Euclidean distance lower-bounds network distance:
• dist(v,u) ≤ SPD(v,u), for any v,u ### Spatial Join Queries

#### Description

• Query: find pairs (r,s), such that r in relation R, s in relation S, and SPD(r,s)≤ε
• Methods:
• For each r in R, do an ε-distance selection queries for objects in S (Index Nested Loops)
• For each pair (r,s), such that Euclidean dist(r,s)≤ε compute SPD(r,s) and verify SPD(r,s)≤ε

### Notes on Query Evaluation based on Network Distance

• For each query type, there are methods based on network browsing and methods based on Euclidean bounds
• Network browsing methods are fast if network edges are densely populated with points of interest
• A limited network traversal can find the result fast
• Methods based on Euclidean bounds are good if the searched POIs are sparsely distributed in the network
• Few verifications with exact SP searches are required
• Directed SP search (e.g. using A*) avoids visiting empty parts of the network

## Advanced indexing techniques for spatial networks

### Shortest Path Materialization and Indexing in Large Graphs

• Dijkstra’s algorithm and related methods could be very expensive on very large graphs
• (Partial) materialization of shortest paths in static graphs can accelerate search ### Hierarchical Path Materialization

• Idea: Partition graph G into G_1,G_2,G_3,… based on connectivity and proximity of nodes
• Every edge of G goes to exactly one G_i
• Border nodes belong to more than one G_i’s
• For each G_i compute and materialize SPs between every pair of nodes in G_i (matrix M_i)
• Partitions are small enough for materialization space overhead to be low
• Compute and materialize SPs between every pair of border nodes (matrix B)
• If border nodes too many, hierarchically partition them into 2nd-level partitions #### algorithm #### Illustrating

• Good partitioning if:
• small partitions
• few combinations examined for SP search
• Non-highway nodes in local partitions
• Highway nodes become border nodes ### Compressing Materialized Paths

• Distance matrix with successors has O(n_2) space cost
• Motivation: reduce space by grouping targets based on common successors #### algorithm

• Create and encode one space partitioning defined by targets of the same successor
• For each node s, index Is a set of <succ,R> pairs:
• succ: a successor of s
• R: a continuous region, such that for each t in R, the successor of s in SP(s,t) is succ • To compute SP(s,t) for a given s, t:
1. SP=s
2. Use spatial index Is to find <succ,R>, such that t in R
3. SP = SP + (s,succ)
4. If succ = t, report SP and terminate
5. Otherwise s=succ; Goto step 2

## Summary

• Indexing and search of spatial networks is different than spatial indexing
• Shortest path distance is used instead of Euclidean distance, to define range queries, nearest neighbor search, and spatial joins
• Spatial networks could be too large to fit in memory
• Disk-based index for adjacency lists is used
• Several shortest path algorithms
• Spatial queries can be evaluated using Euclidean bounds
• Advanced indexing methods for shortest path search on large graphs

## 相关文章

2021-03-06
COMP7801 Topic 4 Top-k
2021-02-06
COMP7801 Topic 2 Spatial Data Management
2021-01-30
COMP7801 Topic 1b Database Indexing
2021-01-23
COMP7801 Topic 1a Relational Database

Arthur