This chapter explains the purpose and properties of routing protocols and switching. It begins with Layer 2-specific technologies, including virtual LANs (VLANs) and the Spanning Tree Protocol (STP). It then continues with a high-level analysis of network routing protocols, with a focus on functionality and technical constraints. The characteristics of both IPv4 and IPv6 routing protocols are discussed, although from a design perspective they are similar in many respects. You configure many routing protocols in our Cisco CCNA lab and video course.
Layer 2 Technologies
Layer 2 technologies relate to the OSI Data Link Layer. Modern enterprises, with their distributed networks, multimedia traffic, and client applications, require greater bandwidth and a greater degree of control. Over the past decade, almost all organizations have replaced their shared networking technology (hubs) with switches to create switched networks.
A collision domain (also called a bandwidth domain) is made up of nodes and devices that share the same bandwidth. For instance, everything that is connected to a switch port, via a hub for example, is in a collision domain. This means that there is always the possibility of a collision in the operations of that Ethernet.
A broadcast domain, on the other hand, represents a collection of devices that can see each other’s Broadcast or Multicast packets. Nodes that are in the same collision domain are also in the same broadcast domain. For instance, all devices associated with the port of a router are in a broadcast domain.
Note: By default, Broadcasts do not traverse a router’s port interface.
When shared technology is used (e.g., hubs and repeaters), all the devices share the bandwidth of the specific network segment. When using switched technologies (switches), each device connected to a switch port is in its own collision domain. However, all the devices are in the same broadcast domain.
Switched technologies, as opposed to shared technologies, offer many advantages, including the following:
- Provides greater than 10Mbps bandwidth
- Can be used for greater distances because they can connect a matrix of switches
- Provides support for intelligent services
- Provides high availability with only a small increase in costs
Virtual LANs
Virtual LANs (VLANs) define broadcast domains in a Layer 2 network. They represent an administratively defined subset of switch ports that are in the same broadcast domain. As mentioned before, a broadcast domain is the area in which a Broadcast frame propagates through a network.
The separation of broadcast domains is accomplished with routers, which prevent Broadcasts from being propagated through router interfaces. On the other hand, Layer 2 switches create broadcast domains by special configuration on the switch. By defining broadcast domains on the switch, you can configure switch ports to forward a received Broadcast frame to other specified ports.
Broadcast domains cannot be observed by analyzing the physical topology of the network. A VLAN is a logical concept that can be studied only with access to the configuration on the switches. Another way of thinking about VLANs is to consider them virtual switches, defined in one physical switch. Each new virtual switch defined creates a new broadcast domain (VLAN). Traffic from one VLAN cannot pass directly to another VLAN within a (Layer 2) switch. In order to interconnect VLANs, a router must be used to route packets between them.
Ports can be grouped into different VLANs on a single switch or on multiple interconnected switches. Broadcast frames sent by a device in one VLAN will reach the devices in that specific VLAN only.
VLANs represent a group of devices that participate in the same Layer 2 domain and that can communicate without needing to pass through a router. This means that they share the same broadcast domain. Best design practices suggest a one-to-one relationship between VLANs and IP subnets. Devices in a single VLAN are typically also in the same IP subnet.
Figure 4.1 – Virtual LANs
Figure 4.1 above shows two VLANs, each associated with an IP subnet. VLAN 10 contains Router 1, Host A and Router 2, configured on Switch 1 and Switch 3, and has the 10.10.10.0/24 IP subnet allocated. VLAN 20 contains Host B, Host C, and Host D, configured on Switch 2 and Switch 3, and has the 10.10.20.0/24 IP subnet allocated.
Because vendors took individual approaches to creating VLANs, a multivendor VLAN must be carefully handled to deal with interoperability issues. For example, Cisco developed the proprietary Inter-Switch Link (ISL) protocol, which operates by adding a new 26-byte header, plus a new 4-byte trailer, encapsulating the original frame (see Figure 4.2 below). To solve incompatibility problems, the IEEE developed 802.1Q, a vendor-independent method of creating interoperable VLANs.
Figure 4.2 – ISL Marking Method
802.1Q is often referred to as frame tagging because it inserts a 32-bit (4-byte) 802.1Q VLAN Tag field into the original frame, after the Source Address field, without modifying other fields. The first 2 bytes in the 802.1Q VLAN Tag field hold a registered Ethernet-type value of 0x8100, which indicates that the frame contains an 802.1Q header. The next 3 bits represent the 802.1Q User Priority field, and these bits are used as Class of Service (CoS) bits in Quality of Service (QoS) techniques. The next 1 bit represents the Canonical Format Indicator field, followed by the 12-bit VLAN ID field. The 12-bit field gives a total of 4096 VLAN ID values (0 through 4095) when using the 802.1Q marking method, of which 0 and 4095 are reserved, as illustrated in Figure 4.3 below:
Figure 4.3 – 802.1Q Marking Method
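The tag layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names build_dot1q_tag and parse_dot1q_tag are hypothetical, and the sketch packs only the 4-byte tag, not a complete Ethernet frame.

```python
import struct

TPID_8021Q = 0x8100  # registered Ethernet-type value that marks an 802.1Q tag


def build_dot1q_tag(vlan_id: int, priority: int = 0, cfi: int = 0) -> bytes:
    """Pack the 4-byte tag: 16-bit type, 3-bit priority, 1-bit CFI, 12-bit VLAN ID."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    if cfi not in (0, 1):
        raise ValueError("CFI is a single bit")
    # Combine the three subfields into the second 16-bit word of the tag.
    control = (priority << 13) | (cfi << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, control)  # "!" = network byte order


def parse_dot1q_tag(tag: bytes) -> dict:
    """Split a 4-byte tag back into its fields (hypothetical helper)."""
    tpid, control = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,
        "priority": control >> 13,
        "cfi": (control >> 12) & 1,
        "vlan_id": control & 0xFFF,
    }


tag = build_dot1q_tag(vlan_id=10, priority=5)
print(tag.hex())             # 8100a00a
print(parse_dot1q_tag(tag))
```

For VLAN 10 with a priority of 5, the tag bytes come out as 81 00 a0 0a: the first two bytes are 0x8100, and the remaining 16 bits hold the priority, CFI, and VLAN ID.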
A port that carries data from multiple VLANs is called a trunk. It can use either the ISL or the 802.1Q protocols. A special concept in the 802.1Q world is native VLAN. This is a particular type of VLAN in which frames are not tagged. The purpose of a native VLAN is to allow a switch to use 802.1Q trunking (i.e., multiple VLANs on a single link) on an interface, but if the other device does not support trunking, the traffic for the native VLAN can still be sent over the link. Cisco uses VLAN 1 as the default native VLAN.
Among the reasons for using VLANs, the most important are:
- Network security
- Broadcast distribution
- Bandwidth utilization
An important benefit of using VLANs concerns network security issues. By creating VLANs within switched network devices, a logical level of protection is created. This can be useful in situations in which a group of hosts must not receive data destined for another group of hosts (e.g., departments in a large company, as depicted in Figure 4.4 below).
Figure 4.4 – Departmental VLAN Segmentation
VLANs can mitigate situations in which Broadcasts represent a problem in a network. Creating additional VLANs and attaching fewer devices to each isolates Broadcasts within smaller areas. The effectiveness of this action depends on the source of the Broadcast. If Broadcast frames come from a localized server, that server might need to be isolated in another domain. If Broadcasts come from workstations, creating multiple domains helps reduce the number of Broadcasts in each domain. In the example depicted in Figure 4.4 above, each department VLAN creates a standalone broadcast domain.
Users attached to the same network segment share the bandwidth of that particular segment. As the number of users attached to the segment grows, the average bandwidth assigned to each user decreases and the applications might start to suffer. Implementing VLANs can offer more bandwidth to users. In Figure 4.4 above, each department VLAN has a 100Mbps bandwidth shared between the workstations in that specific department.
Spanning Tree Protocol
The Spanning Tree Protocol (STP), defined by IEEE 802.1D, is a loop-prevention protocol that allows switches to communicate with each other in order to discover physical loops in a network. If a loop is found, STP specifies an algorithm that switches can use to create a loop-free logical topology. This algorithm creates a tree structure of loop-free leaves and branches that spans across the Layer 2 topology.
Loops mainly occur as a result of multiple connections between switches, and the main purpose of these multiple connections is to provide redundancy.
Figure 4.5 – Layer 2 Loop Scenario
Analyzing the topology in Figure 4.5 above, if none of the switches run STP, the following process takes place: Host A sends a frame to the Broadcast MAC address FF:FF:FF:FF:FF:FF and the frame arrives at both Switch 1 and Switch 2. When Switch 1 receives the frame on its Fa0/1 interface, it will flood the frame out of the Fa0/2 port. The frame will reach Host B and Switch 2’s Fa0/2 interface. Switch 2 will flood the frame out of its Fa0/1 port and Switch 1 will receive the same frame it transmitted before. Following the same procedure, Switch 1 will re-transmit the frame on its Fa0/2 interface and a broadcast loop occurs. A broadcast loop also occurs in the opposite direction (i.e., the frame received by Switch 2’s Fa0/1 will be flooded to the Fa0/2 interface and received back from Switch 1).
Bridging loops are more dangerous than routing loops because, as mentioned before, a Layer 3 packet contains a special TTL (Time To Live) field that decrements as it passes through Layer 3 devices, whereas a Layer 2 frame has no equivalent field. In a routing loop, the TTL field will eventually reach 0 and the packet will be discarded. A Layer 2 frame loop will not stop until the loop is broken, for example, by shutting down a switch interface.
The negative effects of Layer 2 loops grow as network complexity (i.e., number of switches) grows because as the frame is flooded out of multiple switch ports, the total number of frames multiplies at an exponential rate.
Broadcast storms also have a big negative impact on network hosts because the Broadcasts must be processed by the CPU in all devices on the segment. In Figure 4.5, both Host A and Host B will try to process all the frames they receive. This will eventually deplete their resources unless these frames are removed from the network.
STP calculations are based on two key concepts:
- Bridge ID
- Path cost
A Bridge ID (BID) is an 8-byte field composed of two subfields: the high-order Bridge Priority subfield (2 bytes), which is expressed in decimal format with values from 0 to 65535, and the low-order MAC Address subfield (6 bytes), which is expressed in hexadecimal format. The default Bridge Priority value is 32768.
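As a rough sketch of why the priority subfield dominates any comparison, the BID can be packed into its 8-byte form and compared byte by byte. The helper name bridge_id is hypothetical, and the MAC addresses are illustrative values.

```python
import struct


def bridge_id(priority: int, mac: str) -> bytes:
    """Return the 8-byte BID: 2-byte priority followed by the 6-byte MAC address."""
    if not 0 <= priority <= 0xFFFF:
        raise ValueError("priority must fit in 2 bytes")
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace(".", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return struct.pack("!H", priority) + mac_bytes


default_a = bridge_id(32768, "AA.AA.AA.AA.AA.AA")
default_b = bridge_id(32768, "BB.BB.BB.BB.BB.BB")
tuned_c = bridge_id(4096, "CC.CC.CC.CC.CC.CC")

print(default_a.hex())        # 8000aaaaaaaaaaaa
print(default_a < default_b)  # True: same priority, lower MAC wins
print(tuned_c < default_a)    # True: lower priority wins despite the higher MAC
```

Because the priority occupies the high-order bytes, lowering it on one switch makes that switch's BID the lowest regardless of its MAC address.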
Switches use the concept of cost to evaluate how close they are to other switches. The original 802.1D standard defined cost as 1000Mbps divided by the bandwidth of the link in Mbps. For example, a 10Mbps link has a cost of 100 and a FastEthernet link has a cost of 10; lower STP costs are better. However, as higher-bandwidth connections gained popularity, a problem arose because the cost was stored as an integer value only. Using a cost of 1 for all links of 1Gbps or faster would make it impossible for STP to distinguish between them, so this option was rejected. As a solution to this problem, the IEEE decided to modify the cost values to a non-linear scale, as shown in Table 4.1 below:
Table 4.1 – Bandwidth and STP Cost Values
Bandwidth | STP Cost |
10Mbps | 100 |
45Mbps | 39 |
100Mbps | 19 |
622Mbps | 6 |
1Gbps | 4 |
10Gbps | 2 |
These values were carefully chosen so that the old and new schemes interoperate for the link speeds in common use today.
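The problem with the original linear rule can be seen with a short calculation. The sketch below is illustrative only: the function name original_cost is hypothetical, and clamping the result to a minimum of 1 is an assumption made here to model an integer-only cost.

```python
# Revised non-linear costs from Table 4.1, keyed by bandwidth in Mbps.
REVISED_COST = {10: 100, 45: 39, 100: 19, 622: 6, 1000: 4, 10000: 2}


def original_cost(bandwidth_mbps: int) -> int:
    """Original linear rule: 1000 Mbps / bandwidth, integer only.

    Assumption: results below 1 are clamped to 1.
    """
    return max(1, 1000 // bandwidth_mbps)


for mbps in sorted(REVISED_COST):
    print(f"{mbps:>6} Mbps  original={original_cost(mbps):>3}  revised={REVISED_COST[mbps]:>3}")
```

Under the linear rule, 1Gbps and 10Gbps links both end up with a cost of 1, while the revised scale still ranks the faster link as better (2 versus 4).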
In the process of creating a loop-free logical topology, STP uses a four-step decision sequence:
- Lowest Root BID
- Lowest path cost to Root Bridge
- Lowest Sender BID
- Lowest Port ID
Switches exchange STP information using special frames called bridge protocol data units (BPDUs). Switches evaluate all the BPDUs received on a port and store the best BPDU seen on every port. Every BPDU received on a port is checked against the four-step sequence to see if it is more attractive than the existing BPDU saved for that port.
When a switch first becomes active, all of its ports send BPDUs every 2 seconds. If a port hears a BPDU from another switch that is more attractive than the BPDUs it has been sending, the port stops sending BPDUs. If the more attractive BPDU stops arriving after a period of 20 seconds (by default), the local port will resume sending its own BPDUs.
There are two types of BPDUs:
- Configuration BPDUs: sent by the Root Bridge; flow across the active paths
- Topology Change Notification (TCN) BPDUs: sent to announce a topology change
The initial STP convergence process is accomplished in three steps:
- Root Bridge election
- Root Ports election
- Designated Ports election
When a network is powered on, all the switches announce their own BPDUs. After they analyze the BPDUs received, a single Root Bridge is elected. All switches except the Root Bridge calculate a set of Root Ports and Designated Ports to build a loop-free topology. After the network converges, BPDUs flow from the Root Bridge to every segment in the network. Additional changes in the network are handled using TCN BPDUs.
The first step in the convergence process is electing a Root Bridge. The switches do this by analyzing the BPDUs received and looking for the switch with the lowest BID.
Figure 4.6 – STP Convergence
Referring to Figure 4.6 above, Switch 1 has the lowest BID of 32768.AA.AA.AA.AA.AA.AA and is elected as the Root Bridge because it has the lowest MAC address, assuming they all have the same bridge priority (i.e., the default of 32768).
The switches learned that Switch 1 was the Root Bridge by exchanging BPDUs at a default interval of 2 seconds. BPDUs contain a series of fields, including the following:
- Root BID field: identifies the Root Bridge
- Root Path Cost field: contains information about the distance to the Root Bridge
- Sender BID field: identifies the bridge that sent the specific BPDU
- Port ID field: identifies the port on the sending bridge that placed the BPDU on the link
Only the Root BID and Sender BID fields are considered in the Root Bridge election process. When a switch first boots, it places its BID in both the Root BID and the Sender BID fields. Suppose that Switch 2 boots first and starts sending BPDUs announcing itself as the Root Bridge every 2 seconds. After some time, Switch 3 boots and announces itself as the Root Bridge. When Switch 2 receives these BPDUs, it discards them because its own BID has a lower value. As soon as Switch 3 receives a BPDU generated by Switch 2, it starts sending BPDUs that list Switch 2 as the Root BID (instead of itself) and Switch 3 as the Sender BID. The two switches now agree that Switch 2 is the Root Bridge.
If Switch 1 boots a few minutes later, it initially assumes that it is the Root Bridge and starts advertising this fact in the BPDUs it generates. As soon as these BPDUs arrive at Switch 2 and Switch 3, these two switches give up the Root Bridge position in favor of Switch 1. All three switches are now sending BPDUs that announce Switch 1 as the Root Bridge.
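The election walkthrough above (Switch 2 boots, then Switch 3, then Switch 1) can be replayed with a small simulation. This is a simplified sketch, not how a switch is implemented: the function name elect is hypothetical, and the BPDU exchange is assumed to converge instantly after each boot.

```python
# BIDs as (priority, MAC) pairs, taken from the example topology.
BIDS = {
    "Switch 1": (32768, "AA.AA.AA.AA.AA.AA"),
    "Switch 2": (32768, "BB.BB.BB.BB.BB.BB"),
    "Switch 3": (32768, "CC.CC.CC.CC.CC.CC"),
}


def elect(boot_order):
    """Return (booted switch, agreed Root Bridge) after each switch boots."""
    believed_root = {}  # active switch -> switch it currently lists as Root Bridge
    history = []
    for name in boot_order:
        # A newly booted switch places its own BID in the Root BID field.
        believed_root[name] = name
        # Simplification: after exchanging BPDUs, every active switch
        # adopts the lowest Root BID it has heard.
        best = min(believed_root.values(), key=BIDS.get)
        for switch in believed_root:
            believed_root[switch] = best
        history.append((name, best))
    return history


for booted, root in elect(["Switch 2", "Switch 3", "Switch 1"]):
    print(f"after {booted} boots, the Root Bridge is {root}")
```

Switch 2 holds the Root Bridge role until Switch 1, which has the lowest BID, becomes active.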
The next step is electing the Root Ports. A Root Port on a switch is the port that is closest to the Root Bridge. Every switch except the Root Bridge must elect one Root Port. As mentioned before, switches use the concept of cost to determine how close they are to other switches. The Root Path Cost is the cumulative cost of all links to the Root Bridge.
When Switch 1 sends BPDUs, they contain a Root Path Cost of 0. As Switch 2 receives them, it adds the path cost of its interface Fa0/1 (a value of 19 for a FastEthernet link) to the Root Path Cost value. Switch 2 sends the newly calculated Root Path Cost of 19 in the BPDUs it generates on the Fa0/2 interface. When Switch 3 receives the BPDUs from Switch 2, it increases the Root Path Cost by adding 19, the cost of its Fa0/2 interface, for a total of 38. At the same time, Switch 3 also receives BPDUs directly from the Root Bridge on Fa0/1. These BPDUs arrive at Switch 3 with a Root Path Cost of 0, and Switch 3 increases the cost to 19 because Fa0/1 is a FastEthernet interface. At this point, Switch 3 must select a single Root Port based on the two different BPDUs it received, one with a Root Path Cost of 38 from Switch 2 and the other with a Root Path Cost of 19 from Switch 1. The lowest cost wins, so Fa0/1 becomes the Root Port and Switch 3 begins advertising this Root Path Cost of 19 to downstream switches. Switch 2 goes through the same set of calculations and elects its Fa0/1 interface as the Root Port. The Root Port selection process on Switch 3 is based on the lowest Root Path Cost calculated from the BPDUs it receives, as presented in Table 4.2 below:
Table 4.2 – Switch 3 Root Port Selection
BPDU Received on Port | Root Path Cost |
Fa0/1 (winner) | 19 |
Fa0/2 | 38 |
Note: The path cost is a value assigned to each port and added to BPDUs received on that port in order to calculate the Root Path Cost. The Root Path Cost represents the cumulative cost to the Root Bridge, and it is calculated by adding the receiving port’s path cost to the value contained in the BPDU.
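The arithmetic behind Table 4.2 can be written out as a short calculation. The function name root_path_cost is hypothetical; the port names and the cost of 19 come from the example topology.

```python
FAST_ETHERNET_COST = 19  # path cost of a 100Mbps port (Table 4.1)


def root_path_cost(cost_in_bpdu: int, receiving_port_cost: int) -> int:
    """Add the receiving port's path cost to the value carried in the BPDU."""
    return cost_in_bpdu + receiving_port_cost


# Switch 2 hears the Root Bridge (cost 0 in the BPDU) on its Fa0/1 port.
switch2_cost = root_path_cost(0, FAST_ETHERNET_COST)

# Switch 3 hears the Root Bridge on Fa0/1 and Switch 2 on Fa0/2.
switch3_candidates = {
    "Fa0/1": root_path_cost(0, FAST_ETHERNET_COST),
    "Fa0/2": root_path_cost(switch2_cost, FAST_ETHERNET_COST),
}

# The port with the lowest Root Path Cost becomes the Root Port.
root_port = min(switch3_candidates, key=switch3_candidates.get)
print(switch3_candidates)  # {'Fa0/1': 19, 'Fa0/2': 38}
print(root_port)           # Fa0/1
```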
The next step in the STP convergence process is electing Designated Ports. Each segment in a Layer 2 topology has one Designated Port. This port sends and receives traffic to and from that segment and the Root Bridge. Because only one port handles traffic for each link, this guarantees a loop-free topology. The bridge that contains the Designated Port for a certain segment is considered the Designated Switch on that segment.
Analyzing the link between Switch 1 and Switch 2, you can see that Switch 1’s Fa0/1 has a Root Path Cost of 0 (i.e., it is the Root Bridge) and Switch 2’s Fa0/1 has a Root Path Cost of 19. Switch 1’s Fa0/1 becomes the Designated Port for that link because of the lower Root Path Cost.
A similar election takes place for the link between Switch 1 and Switch 3. Switch 1’s Fa0/2 has a Root Path Cost of 0 and Switch 3’s Fa0/1 has a Root Path Cost of 19, so Switch 1’s Fa0/2 becomes the Designated Port.
Note: Every active port on the Root Bridge becomes a Designated Port.
When considering the link between Switch 2 and Switch 3, both Switch 2’s Fa0/2 and Switch 3’s Fa0/2 ports have a Root Path Cost of 19. To break this tie, STP uses the four-step decision process described before. Let’s examine this sequence in detail:
- Lowest Root BID: all three bridges are in agreement that Switch 1 is the Root Bridge → advance to the next step.
- Lowest Root Path Cost: both Switch 2 and Switch 3 have a cost of 19 → advance to the next step.
- Lowest Sender BID: Switch 2’s BID (32768.BB.BB.BB.BB.BB.BB) is lower than Switch 3’s BID (32768.CC.CC.CC.CC.CC.CC), so Switch 2’s Fa0/2 becomes the Designated Port and Switch 3’s Fa0/2 is considered a Non-Designated Port → election process has ended.
- Lowest Port ID: this step is not necessary in this particular example.
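The four-step decision sequence maps naturally onto an ordered tuple comparison, as the following sketch suggests. The Bpdu class and the deciding_step helper are hypothetical names, and the port ID value of 2 is an illustrative assumption, since the example never reaches that step.

```python
from typing import NamedTuple, Optional


class Bpdu(NamedTuple):
    # Field order matches the four-step sequence, so ordinary tuple
    # comparison evaluates the steps in the right order.
    root_bid: tuple
    root_path_cost: int
    sender_bid: tuple
    port_id: int


def deciding_step(a: Bpdu, b: Bpdu) -> Optional[str]:
    """Return the name of the first field in which the two BPDUs differ."""
    for field, x, y in zip(Bpdu._fields, a, b):
        if x != y:
            return field
    return None


ROOT = (32768, "AA.AA.AA.AA.AA.AA")
switch2 = Bpdu(ROOT, 19, (32768, "BB.BB.BB.BB.BB.BB"), 2)  # sent from Switch 2 Fa0/2
switch3 = Bpdu(ROOT, 19, (32768, "CC.CC.CC.CC.CC.CC"), 2)  # sent from Switch 3 Fa0/2

winner = min(switch2, switch3)          # lowest tuple is the most attractive BPDU
print(winner is switch2)                # True
print(deciding_step(switch2, switch3))  # sender_bid
```

The first two fields tie, so the comparison is settled by the Sender BID, and the Port ID is never consulted.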
In a loop-free topology, Root and Designated Ports forward traffic and Non-Designated Ports block traffic. Actually, there are five STP port states, as listed in Table 4.3 below:
Table 4.3 – Spanning Tree Protocol Port States
State | Purpose |
Forwarding | Sending/receiving user data |
Learning | Building bridge table |
Listening | Building “active” topology |
Blocking | Receives BPDUs only |
Disabled | Administratively down |
The Disabled state means that the port is administratively shut down. After initialization, the port starts in the Blocking state, where it listens for BPDUs. The port will transition to the Listening state after the booting process, either when it thinks it is the Root Bridge or after not receiving BPDUs for a certain period of time. In the Listening state, no user data passes through the port; rather, it is just sending and receiving BPDUs in order to determine the Layer 2 topology. This is the phase in which the election of the Root Bridge, Root Ports, and Designated Ports occurs. Ports that remain Designated or Root Ports after 15 seconds progress to the Learning state. This is another 15-second period in which the bridge builds its MAC address table but still does not forward user data. After this 15-second period, the port enters the Forwarding state, in which it sends and receives data frames. The various timers used in this process are listed in Table 4.4 below:
Table 4.4 – Spanning Tree Protocol Timers
Timer | Purpose | Default Value |
Hello Time | Time between BPDUs sent by the Root Bridge | 2 seconds |
Forward Delay | Duration of the Listening and Learning states | 15 seconds |
Max Age | Duration the BPDU is stored | 20 seconds |
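The default timers add up to the familiar STP convergence delays. The calculation below is a rough sketch: the function name is hypothetical, and it assumes that a port waiting on a stale BPDU spends the full Max Age in the Blocking state before it starts Listening.

```python
HELLO_TIME = 2      # seconds between BPDUs sent by the Root Bridge
FORWARD_DELAY = 15  # seconds spent in each of Listening and Learning
MAX_AGE = 20        # seconds a stored BPDU is kept without being refreshed


def seconds_until_forwarding(wait_for_max_age: bool) -> int:
    """Time for a port to go from Blocking to Forwarding with default timers."""
    blocking = MAX_AGE if wait_for_max_age else 0  # assumption: full Max Age wait
    listening = FORWARD_DELAY
    learning = FORWARD_DELAY
    return blocking + listening + learning


print(seconds_until_forwarding(wait_for_max_age=False))  # 30
print(seconds_until_forwarding(wait_for_max_age=True))   # 50
```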
A modern variation of STP is Rapid STP (RSTP), as defined by IEEE 802.1w. The main advantage of RSTP is its ability to achieve fast convergence, as the neighbor switches are able to communicate with each other and determine the state of the links in less time. The RSTP Bridge Port roles include the following:
- Root: the best forwarding port from the Non-Root Bridge to the Root Bridge
- Designated: forwarding port for every LAN
- Alternate: alternate path to the Root Bridge (different from the Root Port path in STP)
- Backup: backup path to a segment where another bridge port already connects
- Disabled: port can be manually disabled by an administrator
The RSTP port states are listed in Table 4.5 below:
Table 4.5 – Rapid STP Port States
State | Purpose |
Discarding | No user data sent over the port |
Learning | Building bridge table |
Forwarding | Port is fully operational |
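Although some important differences exist between RSTP and STP, they are compatible and can work together in the same network; an RSTP switch falls back to standard STP behavior on ports that connect to STP-only switches.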
Layer 3 Technologies
The Routing Logic
Routers look at the packet’s destination address to determine where the packet is going so they can select the best path to get the packet there. In order to calculate the best path, routers must know which interface should be used to reach the packet’s destination network. Routers learn about networks either by being physically connected to them or by learning information from other routers or from a network administrator. The process of learning about networks from other routers’ advertisements is called dynamic routing and different routing protocols can be used to achieve this. The process by which a network administrator manually defines routing rules on the device is called static routing. Finally, the routes to which a router is physically connected are known as directly connected routes.
Routers keep the best path to destinations learned via direct connections, static routing, or dynamic routing in internal data structures called routing tables. A routing table contains a list of networks the router has learned about and information about how to reach them.
As mentioned before, dynamic routing is the process by which a router exchanges routing information and learns about remote networks from other routers. Different routing protocols can accomplish this task:
- Routing Information Protocol (RIP)
- Enhanced Interior Gateway Routing Protocol (EIGRP)
- Open Shortest Path First (OSPF)
- Intermediate System-to-Intermediate System (IS-IS)
- Border Gateway Protocol (BGP)
The most important information routing tables contain includes the following:
- How the route was learned (i.e., statically, dynamically, or directly connected)
- The address of the neighbor router from which the network was learned
- The interface through which the network can be reached
- The route metric: a measurement that gives routers information about how far or how preferred a network is (the exact meaning of the metric value depends on the routing protocol used)
Figure 4.7 – Routing Tables
Figure 4.7 above illustrates a scenario with two routers that use hop count as the metric. The topology contains three networks known by both routers. A hop count represents the number of routers a packet must go through to reach a specific destination. Router A has two directly connected networks: 10.10.10.0 and 192.168.10.0; thus, the metric to each of them is 0. Router A knows about the 10.10.20.0 network from Router B, so the metric for this network is 1, because a packet sent by Router A must traverse Router B to reach the 10.10.20.0 network. Router B has two directly connected networks – 10.10.20.0 and 192.168.10.0 – and one remote network learned from Router A – 10.10.10.0, with the metric of 1.
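Router A's table from this scenario can be modeled as a small data structure holding the fields listed earlier. This is a sketch under stated assumptions: the /24 masks and the interface names if0 and if1 are not given in the figure description and are illustrative only, as are the Route class and the lookup helper.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    network: ipaddress.IPv4Network
    learned_via: str         # "connected", "static", or "dynamic"
    next_hop: Optional[str]  # neighbor the route was learned from, if any
    interface: str           # hypothetical interface name
    metric: int              # hop count in this example


def net(prefix: str) -> ipaddress.IPv4Network:
    return ipaddress.ip_network(prefix)


# Router A's view of the three networks (masks are assumed to be /24).
ROUTER_A = [
    Route(net("10.10.10.0/24"), "connected", None, "if0", 0),
    Route(net("192.168.10.0/24"), "connected", None, "if1", 0),
    Route(net("10.10.20.0/24"), "dynamic", "Router B", "if1", 1),
]


def lookup(table, destination: str) -> Optional[Route]:
    """Return the first route whose network contains the destination address."""
    address = ipaddress.ip_address(destination)
    return next((route for route in table if address in route.network), None)


route = lookup(ROUTER_A, "10.10.20.7")
print(route.next_hop, route.interface, route.metric)  # Router B if1 1
```

The networks in this example do not overlap, so returning the first match is enough for the illustration.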
Routing Protocol Concepts
Before analyzing details about each individual routing protocol, we will first look at some general information about IP routing. Network engineers should know the key characteristics that different routing protocols have because they will be in a position to recommend specific routing protocols for different projects.
The first key decision criterion is whether you should use static or dynamic routing. Static routing implies manually defining routes on devices and dynamic routing implies using a dedicated routing protocol that will build the routing table.
Static Routing
A static route is entered manually by the network administrator. Even though static routes may not seem necessary in modern networks, there are situations in which they offer granular control and optimization of the information learned by the routing protocols. Static routes can be used in conjunction with dynamic routing protocols to reach specific networks or to provide the default gateway (e.g., pointing to the ISP), which is useful in situations where the destination network is not part of the routing protocol database.
Another scenario in which static routes are used is to override dynamically learned routing information. Static routes can also be used in the form of floating static routes: for failover purposes, the administrative distance (AD) of a static route is set to a higher (worse) value than the AD of the same route learned via a routing protocol, so the static route is used only if the dynamically learned route disappears.
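The floating static route behavior can be sketched as a selection on administrative distance. The class and function names are hypothetical, as are the next-hop labels; 110 is the Cisco default AD for OSPF, and 200 is simply an illustrative value higher than that.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    source: str
    admin_distance: int
    next_hop: str     # hypothetical label for the exit path
    up: bool = True   # whether the route is currently available


def best_route(candidates) -> Optional[Candidate]:
    """Pick the available route with the lowest administrative distance."""
    usable = [c for c in candidates if c.up]
    return min(usable, key=lambda c: c.admin_distance, default=None)


ospf = Candidate("OSPF", 110, "primary-link")
floating_static = Candidate("static", 200, "backup-link")
candidates = [ospf, floating_static]

print(best_route(candidates).source)  # OSPF

ospf.up = False                       # the dynamically learned route is lost
print(best_route(candidates).source)  # static
```

While the OSPF route is present, the static route stays out of the routing table; it takes over only when the preferred route is withdrawn.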
Dynamic Routing
An important thing to decide when choosing a routing protocol is whether you need an Interior Gateway Protocol (IGP) or an Exterior Gateway Protocol (EGP). When you are routing between the devices within your organization (i.e., an autonomous system), you have many IPv4-based IGPs to choose from, including the following:
- RIP version 1
- RIP version 2
- OSPF
- IS-IS
- IGRP
- EIGRP
- ODR
Note: RIP version 1 and IGRP are considered legacy protocols and some modern network devices do not support them.
On Demand Routing (ODR) is a Cisco proprietary protocol designed for hub-and-spoke topologies. It offers basic routing functionality and works over Cisco Discovery Protocol (CDP). The most common interior protocols used in non-hub-and-spoke environments are RIPv2, OSPF, IS-IS, and EIGRP.
IPv6 uses specially developed versions of routing protocols, including the following:
- IS-IS
- RIPng
- OSPFv3
- EIGRP for IPv6
Routing between autonomous systems (i.e., from large corporations to the Internet or between service providers) is accomplished using special routing protocols called Exterior Gateway Protocols (EGPs). The most common EGP for both IPv4 and IPv6 is Border Gateway Protocol (BGP). Some companies are so large, spanning the entire globe, that they also run BGP inside their own networks.
Large networks, including the Internet, are based on the autonomous system (AS) concept. An AS defines a group of network devices under a common administration, and most often this defines a large company or a service provider. Routing protocols can be classified based on different criteria. Depending on the zone in which they operate, they can be considered interior (intra-AS) routing protocols or exterior (inter-AS) routing protocols. Interior routing protocols can be further classified as Distance Vector protocols or Link-State protocols, based on their behavior regarding the router update exchange process. Each routing protocol type is detailed in the following sections, along with their respective design considerations.
Interior Routing Protocols
Interior routing protocols or interior gateway protocols (IGPs) are configured on groups of routers from the same AS; thus, the IGP routing activity never leaves the enterprise premises. An important aspect that must be considered when selecting the routing protocol is the difference between Distance Vector and Link-State protocols. Link-State protocols, developed after Distance Vector protocols, are much more sophisticated. A special category involves hybrid routing protocols, which feature the best characteristics of Distance Vector and Link-State technologies. The only hybrid routing protocol used in modern networks is EIGRP. These different routing protocols are illustrated in Figure 4.8 below:
Figure 4.8 – Routing Protocol Technologies
Distance Vector routing protocols include:
- RIP version 1
- RIP version 2
- IGRP
- RIPng
Link-State routing protocols include:
- OSPF
- IS-IS
- OSPF version 3
Distance Vector Routing Protocols
Distance Vector routing is a property of certain routing protocols that build an internal picture of the topology by periodically exchanging full routing tables between neighbor devices. The main difference between Distance Vector routing protocols and Link-State routing protocols is the way they exchange routing updates. Distance Vector protocols function using the “routing by rumor” technique, as every router relies on its neighbors to maintain correct routing information. This means that the entire routing table is sent periodically to neighbors, as shown in Figure 4.9 below:
Figure 4.9 – Distance Vector Routing Protocol Behavior
The most important advantage of Distance Vector routing protocols is they are easy to implement and maintain. One downside, however, is convergence times. A converged network is a network in which every router has the same perspective of the topology. When a topology change occurs, the routers in the respective area propagate the new information to the rest of the network. Considering this is done on a hop-by-hop basis (i.e., every router passes its fully updated routing information to each neighbor), network convergence won’t be established until after a significant amount of time has passed.
In addition to slow convergence, because full routing tables are exchanged between routers, Distance Vector protocols are also bandwidth-intensive. This happens especially in large networks, where routing tables can be of considerable size. Considering these aspects, Distance Vector protocols are recommended only in small enterprise network implementations.
An example of a Distance Vector routing protocol still used in modern networks is RIPv2 (Routing Information Protocol version 2, described in RFC 2453). RIPv2 uses hop count as the metric for path selection, with a maximum hop count of 15. RIPv2 updates are sent using Multicast by default, although they can be configured as Unicast or Broadcast, and, unlike its predecessor (RIPv1), RIPv2 permits VLSM on the network.
Devices receive routing information from their neighbors and pass it on to other neighbors. RIP repeats this process every 30 seconds. The downside in this scenario is that when the network is stable and there are no changes in the topology, RIP still sends its routing table every 30 seconds, which is not very efficient as it wastes bandwidth.
Note: Although the general idea in the networking world concerning Distance Vector routing protocol updates is that all the routing table information is exchanged between neighbors, the truth is that only the best routes are exchanged through the routing updates. Alternate routes are not included in the routing update packets.
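How a router might process a neighbor's update can be sketched as follows. This is a deliberately simplified model rather than the RIP specification: it follows the hop-count convention used in Figure 4.7 (directly connected networks have a metric of 0), treats any route that would exceed 15 hops as unreachable, and ignores timers and route withdrawal. The function name process_update and the 172.16.0.0 network are hypothetical.

```python
RIP_MAX_HOPS = 15  # largest usable hop count


def process_update(table: dict, neighbor: str, advertised: dict) -> bool:
    """Merge a neighbor's advertised routes into the local table.

    table:      network -> (metric, next_hop)
    advertised: network -> metric as seen by the neighbor
    Returns True if any entry was added or improved.
    """
    changed = False
    for network, metric in advertised.items():
        new_metric = metric + 1        # one more hop to go through the neighbor
        if new_metric > RIP_MAX_HOPS:  # beyond 15 hops: treated as unreachable
            continue
        current = table.get(network)
        if current is None or new_metric < current[0]:
            table[network] = (new_metric, neighbor)  # keep only the best route
            changed = True
    return changed


router_a = {"10.10.10.0": (0, None), "192.168.10.0": (0, None)}
update_from_b = {"10.10.20.0": 0, "192.168.10.0": 0}

process_update(router_a, "Router B", update_from_b)
print(router_a["10.10.20.0"])    # (1, 'Router B')
print(router_a["192.168.10.0"])  # (0, None): the connected route is still better
```

Because each router stores only the best metric it has heard and passes that on, an error in one table is repeated by every neighbor that accepts it.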
If a router that uses Distance Vector protocols has inaccurate information, that information will be propagated through the entire network. Distance Vector routing protocols are also prone to major problems, including routing loops.
Link-State Routing Protocols
Link-State routing protocols do not “route by rumor.” Instead, routing devices exchange information about their link states. Devices independently build a loop-free map of the network (i.e., they do not rely on a map of a particular node) based on the Link-State information each router generates and propagates to the other routers.
Unlike Distance Vector routing protocols, Link-State protocols flood information about their links to a specific area or to all the routers in the network. This way every router in the topology has detailed knowledge of the entire network, unlike the routers using Distance Vector routing protocols, where only the best routes are exchanged between neighbors. The routing decisions are made by applying the Shortest Path First (SPF) algorithm, also known as Dijkstra’s algorithm, to the information received from various sources. This calculation results in the shortest path to each destination in the network, as shown in Figure 4.10 below:
Figure 4.10 – Link-State Routing Protocol Behavior
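As a rough illustration of the SPF calculation, the sketch below runs Dijkstra's algorithm over a small Link-State database. The four-router topology, the link costs, and the name `shortest_paths` are assumptions made up for the example; they do not come from Figure 4.10.

```python
import heapq


def shortest_paths(lsdb, source):
    """Return the lowest total cost from source to every reachable router.

    lsdb maps router -> {neighbor: link_cost}, i.e. the flooded link states.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical topology: the direct R1-R2 link is expensive.
lsdb = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R4": 1},
    "R3": {"R1": 1, "R4": 1},
    "R4": {"R2": 1, "R3": 1},
}
print(shortest_paths(lsdb, "R1"))
```

Every router runs the same calculation on the same database with itself as the source, which is why each one builds its own loop-free view independently.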
This is a much more efficient approach to building routing databases, and there is no short fixed update timer like with Distance Vector technologies. Link-State protocols do periodically reflood their Link-State information to ensure that the network is properly converged, but at a much longer interval (OSPF, for example, does this every 30 minutes).
Link-State protocols offer a series of important advantages compared with Distance Vector protocols. The most important advantage relates to the convergence factor. Convergence is a lot faster because as soon as a network topology changes, only that specific information is sent to the routers in a given area. The routing updates stop after all the routers learn about the specific change, thus decreasing the need for bandwidth, unlike with Distance Vector protocols that periodically exchange routing tables, even if no topology change occurs. In Link-State routing, updates are triggered only when a Link-State changes somewhere in the network. Depending on the routing protocol in use, this can mean a link going up/down or changing some of its parameters (e.g., bandwidth).
Examples of Link-State routing protocols are OSPF, described in RFC 2328, and IS-IS, described in RFC 1142.
Note: An interesting and special routing protocol is EIGRP, a Cisco proprietary protocol, because it has both Distance Vector and Link-State characteristics. It is also called a hybrid or an advanced Distance Vector routing protocol. |
Exterior Routing Protocols
Exterior routing protocols run between autonomous systems (inter-AS) and the most common example is BGP version 4. The main reason for using different types of routing protocols to carry routes outside the AS boundaries is the need to exchange a large amount of route entries. In this regard, exterior routing protocols support special options and features that are used to implement various policies. The routing metrics of these protocols include more parameters than in the IGP case because of the crucial need for flexible policy control and choosing the best possible path.
While IGPs are used within enterprise-level networks, BGP is typically used in ISP-to-ISP or ISP-to-enterprise connections. Unlike intra-AS protocols that make routing decisions based exclusively on the metric value, inter-AS protocol policies can also include other factors, like business decisions or possible AS vulnerabilities. These are technically implemented by configuring different BGP parameters.
A typical scenario in which the use of BGP is beneficial because of its flexible policy implementation is an enterprise connecting to multiple ISPs (i.e., multihoming). BGP can interconnect with any interior routing protocol used inside the enterprise network, so administrators have maximum flexibility when it comes to choosing a suitable interior routing protocol. A simple example of this scenario is presented in Figure 4.11 below:
Figure 4.11 – Enterprise Multihoming Scenario
Other Routing Protocol Considerations
Another key parameter of routing protocols and a measure of their sophistication is whether they have a hierarchical or flat behavior. IS-IS, OSPF, and EIGRP can be configured in a hierarchical manner and they offer improved scalability. For example, OSPF splits the topology into multiple areas and uses the Area 0 (backbone) concept, which allows connections to every other area in the topology. Routes can be summarized as they enter or leave the backbone, which leads to increased efficiency and bandwidth optimization.
IGRP and RIP are examples of routing protocols that are based on a flat behavior because they are not optimized and they use a single structure, no matter how large the network is.
One of the things a router has to do is decide the best way to get to a destination. If a router learns different paths to the exact same destination from different protocols, it has to decide which source to trust. To make this decision it uses administrative distance (AD) values. Lower AD values are preferred over higher AD values so, for example, OSPF routes (AD=110) will be preferred over RIP routes (AD=120). The AD value represents how trustworthy a particular routing protocol is, and the most common AD values are summarized in Table 4.6 below:
Table 4.6 – Administrative Distance Values
Routing Protocol | AD Value |
Connected | 0 |
Static (pointing at IP address) | 1 |
EIGRP Summary | 5 |
External BGP | 20 |
EIGRP | 90 |
OSPF | 110 |
RIP | 120 |
External EIGRP | 170 |
Internal BGP | 200 |
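The selection rule can be sketched in a few lines. The dictionary simply restates Table 4.6; the key names and the function `preferred_source` are hypothetical labels chosen for this example.

```python
# AD values restated from Table 4.6 (key names are illustrative only).
ADMIN_DISTANCE = {
    "connected": 0,
    "static": 1,
    "eigrp_summary": 5,
    "ebgp": 20,
    "eigrp": 90,
    "ospf": 110,
    "rip": 120,
    "eigrp_external": 170,
    "ibgp": 200,
}


def preferred_source(candidates):
    """Return the route source with the lowest administrative distance."""
    return min(candidates, key=ADMIN_DISTANCE.__getitem__)


# The same prefix learned from RIP and OSPF: OSPF wins (110 < 120).
print(preferred_source(["rip", "ospf"]))
```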
Using AD values, a router selects a routing path based on one protocol instead of another, but something that also has to be decided is the way in which the device will select a routing table entry over another entry from the same protocol. Routing protocol metrics are used to make this decision.
Different routing protocols use different metrics. RIP uses hop count as a metric, selecting the best route based on the lowest number of routers it needs to traverse. This is not very efficient because the path with the fewest hops can have a lower bandwidth than other paths. OSPF is more evolved and takes bandwidth into consideration, creating a metric called cost. Cost is directly generated from the bandwidth value, so a low bandwidth has a high cost and a high bandwidth has a low cost.
EIGRP is even more sophisticated and uses a composite metric, considering both bandwidth and delay values to create the metric value. BGP, the most sophisticated of all, uses many different attributes grouped in path vectors that can be used to calculate the best path.
Note: One of the reasons RIP has a high AD value is that it uses the hop count metric, which is not very efficient in complex environments. As a general rule, the more sophisticated an interior routing protocol’s metric calculation is, the lower its AD value. |
Routing Problems Avoidance Mechanisms
As mentioned before, Distance Vector routing protocols are prone to major problems as a result of their simplistic “routing by rumor” approach. Distance Vector and Link-State protocols use different techniques to prevent routing problems. The most important mechanisms include the following:
- Invalidation timers: These are used to mark routes as unreachable when updates for those routes are not received for a long time.
- Hop count limit: This parameter marks routes as unreachable when they are more than a predefined number of hops away. The hop count limit for RIP is 15, as it is not usually used in large networks. Unreachable routes are not installed in the routing table as best routes. The hop count limit prevents updates from looping in the network, just like the TTL field in the IP header.
- Triggered updates: This feature allows the update timer to be bypassed in the case of important updates. For example, the RIP 30-second timer can be ignored if a critical routing update must be propagated through the network.
- Holddown timers: If a metric for a particular route keeps getting worse, updates for that route are not accepted for a delayed period.
- Asynchronous updates: These represent another safety mechanism that prevents the routers from flooding their entire routing information at the same time. As mentioned before, OSPF does this every 30 minutes. The asynchronous updates mechanism generates a small delay for every device so they do not flood information at exactly the same time. This improves bandwidth utilization and processing capabilities.
- Route poisoning: This prevents routers from sending packets through a route that has become invalid. Distance Vector protocols use this to indicate that a route is no longer reachable. This is accomplished by setting the route metric to a maximum value.
- Split horizon: This prevents updates from being sent out of the same interface they came from because routers in that area should already know about that specific update.
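- Poison reverse: This is an exception to the split horizon rule for poisoned routes. A router advertises such a route back out of the interface on which it was learned, with the maximum (unreachable) metric, so the neighbor does not try to use it.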
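Split horizon and poison reverse can be sketched as a filter applied when a router builds the update for one interface. This is an illustrative model, not vendor code: it follows the "split horizon with poisoned reverse" variant described in RFC 2453, where every route learned on an interface is advertised back out of it as unreachable. The function name, table layout, and interface names are hypothetical.

```python
INFINITY = 16  # RIP's "unreachable" metric


def build_update(table, out_interface, poison_reverse=False):
    """Build the routes to advertise out of one interface.

    table maps prefix -> (metric, interface_the_route_was_learned_on).
    """
    update = {}
    for prefix, (metric, learned_on) in table.items():
        if learned_on == out_interface:
            if poison_reverse:
                # Advertise it back, but explicitly as unreachable.
                update[prefix] = INFINITY
            # Plain split horizon: say nothing about this route.
        else:
            update[prefix] = metric
    return update


table = {"10.1.0.0/16": (2, "eth0"), "10.2.0.0/16": (1, "eth1")}
print(build_update(table, "eth0"))
print(build_update(table, "eth0", poison_reverse=True))
```

Poison reverse makes updates larger, but the explicit unreachable metric removes a bad route from the neighbor immediately instead of waiting for a timer to expire.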
RIP
The Routing Information Protocol (RIP) comes in two versions: RIPv1, which is a legacy protocol that has some shortcomings, and RIPv2.
RIP version 1
The major drawback of RIPv1 is its classful behavior, meaning it does not send subnet mask information in its routing updates. Because the updates carry no subnet mask information, a consistent mask is assumed for all the prefixes in use. This means that RIPv1 does not offer VLSM support (you have to stick with the default Class A, B, or C subnet mask).
Another issue with RIPv1 is that it broadcasts updates. In addition to the unnecessary use of bandwidth, this also means that routers not running RIP constantly receive unnecessary RIP updates from the network. Modern routing protocols use a Multicast approach to solve this issue, sending updates only to routers that really need to receive them (i.e., devices that subscribe to hearing RIP information).
RIPv1 does not allow authentication, so there is no element of security that can be added to the routing protocol to ensure that you are not sending information to devices that should not receive it.
RIP version 2
Examining RIP version 2, you can see that many of the RIPv1 shortcomings have been addressed. RIPv2 has a classless behavior, meaning subnet mask information is sent in updates so VLSM can be achieved. It also supports authentication, so routers exchange routing information only with neighbors that are authorized to take part.
RIPv2 multicasts routing updates instead of broadcasting them like with RIPv1, which allows for efficient routing update exchanges. Another special feature of RIPv2 is the automatic summarization applied to prefixes on classful boundaries, as illustrated in Figure 4.12 below. However, this behavior can induce problems in real-world scenarios.
Figure 4.12 – RIPv2 Automatic Summarization
Consider the scenario in which you have a router (R1) that connects to the following networks: 10.10.10.0, 10.10.20.0, and 10.10.30.0. R1 connects to R2 and then to R3, which has connectivity to the 10.10.40.0 and 10.10.50.0 networks. You also have other networks between the routers, like 172.16.0.0 between R1 and R2 and 192.168.0.0 between R2 and R3.
Notice the change in classful boundaries that makes RIPv2 automatically summarize the networks behind R1 and R3 as 10.0.0.0/8 toward R2. This leads to a real problem: R2 will receive the same route from both directions. If R2 gets a packet destined to 10.10.10.0, it can send it in both directions based on the automatically summarized prefixes it received. This problem is called discontiguous subnets and it is generated by the automatic summarization behavior of the routing protocol that aggregates those subnets.
Solutions for this problem involve not using discontiguous subnets in different areas in the network topology or disabling the automatic summarization behavior.
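The discontiguous subnet problem can be reproduced with Python's standard `ipaddress` module. The helper `classful_network` is a hypothetical name for this sketch; it maps an address to its Class A, B, or C network, which is what automatic summarization advertises across a classful boundary.

```python
import ipaddress


def classful_network(address):
    """Return the classful (Class A/B/C) network that contains address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        prefix_len = 8    # Class A
    elif first_octet < 192:
        prefix_len = 16   # Class B
    elif first_octet < 224:
        prefix_len = 24   # Class C
    else:
        raise ValueError("Class D/E addresses have no classful network")
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)


behind_r1 = ["10.10.10.0", "10.10.20.0", "10.10.30.0"]
behind_r3 = ["10.10.40.0", "10.10.50.0"]

r1_summary = {classful_network(a) for a in behind_r1}
r3_summary = {classful_network(a) for a in behind_r3}
print(r1_summary, r3_summary)
```

Both sets collapse to the single prefix 10.0.0.0/8, so R2 cannot tell which neighbor actually owns 10.10.10.0.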
Another interesting aspect about RIP is that it relies on a series of timers for its operation:
- Update (updates are sent every 30 seconds by default)
- Invalid (the route is invalidated if no update was received before this timer expires)
- Flush (determines the time a route gets flushed from the RIP table)
- Holddown (updates are not accepted for a route that keeps getting a bad metric)
- Sleep (adds a delay to triggered updates)
Note: The holddown and sleep timers are Cisco-specific and are used to enhance RIP functionality. They were not originally specified in the RFCs for RIP. |
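How the update, invalid, and flush timers interact can be sketched as a simple classification of a route by the time since its last update. The 180-second and 240-second values below are commonly cited Cisco IOS defaults and are an assumption here, as the text above states only the 30-second update interval. The holddown and sleep timers are deliberately left out of this simplified model.

```python
# Assumed default timer values in seconds (only UPDATE comes from the text).
UPDATE = 30
INVALID = 180
FLUSH = 240


def route_state(seconds_since_last_update):
    """Classify a RIP route by how long ago it was last refreshed."""
    if seconds_since_last_update < INVALID:
        return "valid"
    if seconds_since_last_update < FLUSH:
        return "invalid"  # still in the table, but marked unreachable
    return "flushed"      # removed from the RIP table


for age in (UPDATE, 200, 240):
    print(age, route_state(age))
```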
EIGRP
The Enhanced Interior Gateway Routing Protocol (EIGRP) is a unique protocol in that it uses a hybrid approach, combining Distance Vector and Link-State characteristics. Combining these features makes EIGRP very robust and allows for fast convergence even in large topologies.
EIGRP is a Cisco proprietary protocol that is used only in environments that contain Cisco devices. This protocol is not suitable in a multivendor architecture. Just like RIPv2, EIGRP is a classless protocol and it allows for VLSM. Another similarity between the two protocols is the automatic summarization behavior, but this can be disabled just as easily.
The algorithm that EIGRP uses is called the Diffusing Update Algorithm (DUAL). DUAL is the engine that makes EIGRP such a powerful protocol. DUAL operates based on a topology table that contains all of the possible prefixes and information about how to reach those prefixes. The topology table is used to identify the best route to each prefix, called the Successor, which is placed in the routing table. After determining the best route in the topology table, EIGRP identifies second-best routes called Feasible Successors. Feasible Successors are not installed in the routing table until the best route is lost. At that moment, a Feasible Successor from the topology table is installed in the routing table almost immediately because there is no need for other computations. This is the reason EIGRP provides such fast convergence times.
EIGRP is the only IGP that can perform unequal cost load balancing across different paths, as illustrated in Figure 4.13 below. This is accomplished on Cisco routers by using the variance command, which defines a tolerance multiplier that can be applied to the best metric and that will result in the maximum allowed metric.
Figure 4.13 – EIGRP Unequal Cost Load Balancing
Let’s consider an example in which you have two routes to a destination, each with a cumulative metric of 100, and a third route to the same destination with a cumulative metric of 200. By default, EIGRP performs only equal cost load balancing so it will send traffic only across the first two links, which have the best metric of 100. If you want to send traffic over the third link as well, you need to set the variance to 2, meaning the maximum allowed metric is 2 times the lowest metric, equaling 200. Traffic will be shared in inverse proportion to the metric, meaning for each packet sent over the third link, two packets are sent over each of the first two links because their metrics are better.
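The variance example can be worked through in code. This is a simplified sketch under stated assumptions: the function name `traffic_shares` and the next-hop labels are hypothetical, the boundary is treated as inclusive to match the example in the text, and a real EIGRP router also requires the extra paths to be Feasible Successors, which this model ignores.

```python
def traffic_shares(routes, variance=1):
    """Return the relative packet share per next hop.

    routes maps next hop -> cumulative metric. A route is usable when its
    metric does not exceed best_metric * variance.
    """
    best = min(routes.values())
    usable = {hop: m for hop, m in routes.items() if m <= best * variance}
    worst = max(usable.values())
    # Share is inversely proportional to the metric.
    return {hop: worst // metric for hop, metric in usable.items()}


routes = {"via_A": 100, "via_B": 100, "via_C": 200}
print(traffic_shares(routes))              # equal cost only
print(traffic_shares(routes, variance=2))  # third path admitted
```

With a variance of 2 the shares come out as 2:2:1, matching the two-packets-to-one ratio described in the text.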
EIGRP creates neighbor relationships with adjacent routers and exchanges information with them using the Reliable Transport Protocol (RTP). This protocol makes sure neighbors can exchange information in a reliable manner.
Note: Do not confuse EIGRP-specific RTP with the Real-time Transport Protocol used in VoIP environments. |
By default, EIGRP calculates route metrics based on bandwidth and delay but it can also use other parameters in the calculation, including:
- Load
- Reliability
Enabling the metric calculation based on load and reliability is not recommended by Cisco because the network might become unstable. The MTU is also carried in EIGRP updates, but it is not used in the metric formula itself.
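With the default settings, the classic EIGRP composite metric reduces to a function of the slowest link bandwidth and the total delay along the path. The sketch below assumes the commonly documented classic (non-wide) formula, 256 × (10^7 / minimum bandwidth in kbps + total delay in tens of microseconds), with integer truncation. The function name and the sample interface values are illustrative assumptions.

```python
def eigrp_metric(min_bandwidth_kbps, total_delay_usec):
    """Classic EIGRP composite metric with default weights.

    Only bandwidth and delay contribute, as described in the text.
    """
    scaled_bandwidth = 10_000_000 // min_bandwidth_kbps
    scaled_delay = total_delay_usec // 10  # delay in tens of microseconds
    return 256 * (scaled_bandwidth + scaled_delay)


# Assumed sample links: a 1544 kbps serial link with 20,000 usec delay,
# and a 100 Mbps link with 100 usec delay.
print(eigrp_metric(1544, 20_000))
print(eigrp_metric(100_000, 100))
```

Because the bandwidth term uses the minimum bandwidth along the path while the delay term is cumulative, one slow link dominates the metric of the whole route.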
OSPF
The Open Shortest Path First (OSPF) protocol is one of the most complex routing protocols that can be deployed in modern networks. As opposed to EIGRP, OSPF is an open standard protocol.
OSPF is a classless routing protocol and this allows it to support VLSM. Just like EIGRP uses the DUAL algorithm, OSPF uses the Dijkstra Shortest Path First (SPF) algorithm to select loop-free paths throughout the topology. OSPF is designed to be very scalable because it is a hierarchical routing protocol, using the concept of areas to split the topology into smaller sections.
OSPF may not converge as fast as EIGRP, but it offers efficient updating and convergence and it takes bandwidth into consideration when calculating route metrics (or costs). A higher bandwidth generates a lower cost and lower costs are preferred in OSPF. OSPF supports authentication, just like EIGRP and RIPv2. It is also very extensible, just like BGP and IS-IS, meaning the protocol can be modified in the future to handle other forms of traffic.
OSPF Functionality
OSPF discovers neighbors and exchanges topology information with them, acting a lot like EIGRP. Based on the information collected and the link costs, OSPF calculates the shortest paths to each destination using the SPF algorithm. The formula for calculating the interface cost is Reference Bandwidth/Link Bandwidth. The default Reference Bandwidth is 100Mbps but it can be modified, just as the Link Bandwidth that is taken into consideration can be modified using the bandwidth command.
Note: The Reference Bandwidth should be modified in networks that contain a combination of 100Mbps and 1Gbps links because, by default, all of these interfaces will be assigned the same OSPF cost. |
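The cost formula and the issue described in the note can be checked with a short calculation. The function name `ospf_cost` is hypothetical, and the integer truncation and minimum cost of 1 are assumptions about how implementations typically round the result.

```python
def ospf_cost(link_bw_bps, reference_bw_bps=100_000_000):
    """Interface cost = Reference Bandwidth / Link Bandwidth (minimum 1)."""
    return max(1, reference_bw_bps // link_bw_bps)


MBPS = 1_000_000
# Default 100 Mbps reference: 100 Mbps and 1 Gbps links look identical.
print(ospf_cost(10 * MBPS), ospf_cost(100 * MBPS), ospf_cost(1000 * MBPS))

# Raising the reference to 10 Gbps separates them again.
ref = 10_000 * MBPS
print(ospf_cost(100 * MBPS, ref), ospf_cost(1000 * MBPS, ref))
```

If the Reference Bandwidth is changed, it should be changed consistently on all routers, otherwise costs calculated by different routers are not comparable.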
Another aspect that adds to the design complexity of OSPF is that it can be configured to behave differently depending on the topology in which it is implemented. OSPF recognizes different network types and this controls things like:
- How updates are sent
- How many adjacencies are made between the OSPF speakers
- How the next hop is calculated
OSPF supports six network types:
- Broadcast
- Non-Broadcast
- Point-to-Point
- Point-to-Multipoint
- Point-to-Multipoint Non-Broadcast
- Loopback
OSPF does a good job of automatically selecting the network type that is most appropriate for a given technology. For example, if you configure OSPF in a Broadcast-based Ethernet environment, it will default to the Broadcast network. If you configure it on a Frame Relay physical interface, it will default to the Non-Broadcast network. OSPF configured on a point-to-point serial link will default to the Point-to-Point network.
Two network types that are never automatically assigned are Point-to-Multipoint and Point-to-Multipoint Non-Broadcast. These are most appropriate for partial-mesh (hub-and-spoke) environments and must be manually configured.
The network types can influence the underlying OSPF protocol in many ways. The Broadcast network will be the default on Broadcast media, and once OSPF is configured in a Broadcast environment, the systems will elect a Designated Router (DR) and a Backup Designated Router (BDR) on each segment. To communicate with the DRs, OSPF will multicast updates to 224.0.0.6, while communicating with every OSPF router requires packets to multicast to 224.0.0.5.
In a Broadcast network, the DR is the device that all the other routers will form their adjacency with and this is a protection mechanism against the network being overwhelmed with a full mesh of adjacencies. In addition to minimizing adjacencies, the DR also helps minimize the amount of OSPF traffic between OSPF nodes because they must communicate only with the DR. The BDR is just a node that will replace the DR if it fails.
On a Broadcast OSPF segment, if every node had to form adjacencies to exchange information with everybody else, the number of total neighbor relationships would be n*(n-1)/2, where n is the number of routers. Using a DR helps reduce the total number of adjacencies and makes the process more efficient because nodes do not need a full mesh of relationships.
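The saving can be quantified with a small calculation. The full-mesh formula comes from the text; the DR/BDR count assumes that every other router forms an adjacency with both the DR and the BDR, and that the DR and BDR are adjacent to each other. Both function names are hypothetical.

```python
def full_mesh_adjacencies(n):
    """Adjacencies if every router pairs with every other: n*(n-1)/2."""
    return n * (n - 1) // 2


def dr_bdr_adjacencies(n):
    """Adjacencies with one DR and one BDR elected on the segment.

    Assumption: the (n - 2) other routers each pair with the DR and the
    BDR, plus one adjacency between the DR and the BDR themselves.
    """
    if n < 2:
        return 0
    return 2 * (n - 2) + 1


for n in (4, 10, 50):
    print(n, full_mesh_adjacencies(n), dr_bdr_adjacencies(n))
```

The full mesh grows quadratically while the DR/BDR design grows linearly, which is why the difference becomes significant on large segments.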
OSPF Router Types
The OSPF hierarchy uses the concept of areas to improve the scalability of the protocol. Link-State protocols operate by flooding information about the status of their links, but when you divide the network into areas, only the routers in a specific area have to agree on the topology map. By setting up areas you can reduce the convergence domain size because you have the ability to hide topology details between areas. This leads to the protocol becoming much more efficient.
Area 0 (backbone) is the critical area in an OSPF environment and every OSPF design must start from this area. It is also called the transit area because all areas must connect to it and traffic between areas must go through Area 0. Another feature of the backbone area is that it must be contiguous, meaning it cannot be broken up into multiple parts. Once the backbone area is designed, other areas called non-transit areas can be included and they can be assigned any number. This is illustrated in Figure 4.14 below:
Figure 4.14 – OSPF Area Types and Router Roles
Network engineers should also understand the different router roles that exist within OSPF. These roles are as follows:
- Backbone router: This is the terminology given to a router that has at least one link in Area 0.
- Internal router: This router has all links participating in one non-transit area.
- Area Border Router (ABR): This router is positioned between multiple areas. This means that the router has at least one link in Area 0 and one link in a non-transit area. ABRs are used to pass information between the backbone area and non-transit areas. They are also used to summarize information between the two areas, thus improving the efficiency and the scalability of the OSPF design.
- Autonomous System Boundary Router (ASBR): An ASBR has at least one link to the OSPF domain and at least one link outside the OSPF domain, touching another routing protocol, like EIGRP, IS-IS, or BGP. It is used to redistribute information to and from other routing domains and OSPF.
Virtual Links
If the backbone area were split into multiple parts, you could restore continuity by creating virtual links. A virtual link can be considered an Area 0 tunnel that connects the dispersed backbone areas. Virtual links are not considered best design practices but they can be useful in particular situations, such as company mergers. In Figure 4.15 below, a virtual link is configured between ABRs as a temporary fix to the problem (i.e., split Area 0). The virtual link tunnels the backbone area between the devices, repairing the topology until a network redesign takes place.
Figure 4.15 – OSPF Virtual Links (Example 1)
Another classic case in which you might use virtual links is a situation in which you have an OSPF area not connected to the backbone. Looking at the example in Figure 4.16 below, you have Area 100 connected to Area 0 and Area 200 connected only to Area 100. This poses a design problem because it goes against the principle that every area must be connected to Area 0. The solution in this case would be to configure a virtual link between Area 0 and Area 200 so the backbone is extended to reach Area 200.
Figure 4.16 – OSPF Virtual Links (Example 2)
Note: In the scenario depicted in Figure 4.16, virtual links are often considered an extension of the non-transit area (Area 200 in this case) to reach Area 0. However, this is actually the opposite of what the virtual link is used for. Because the virtual link is part of Area 0, it is Area 0 that is extended to reach the non-transit area (Area 200 in this case). |
Link-State Advertisements
Another important OSPF aspect is represented by the different Link-State Advertisement (LSA) types. Each LSA type has a unique format that is defined by the type of information it contains (either internal or external prefixes). The LSA types are as follows:
- Type 1 – Router LSA: Type 1 LSAs are used by routers in an area to advertise a link to another router in the same area.
- Type 2 – Network LSA: Type 2 LSAs are generated by the DR to send updates about attached routers.
- Type 3 – Network Summary LSA: Type 3 LSAs are generated by the ABR to advertise information from one area to another.
- Type 4 – ASBR Summary LSA: Type 4 LSAs are generated by the ABR to send information about the location of the ASBR.
- Type 5 – External LSA: Type 5 LSAs are used by the ASBR to advertise external prefixes to the OSPF domain.
- Type 6 – Multicast LSA: Type 6 LSAs are not implemented by Cisco.
- Type 7 – NSSA External LSA: Type 7 LSAs are used in Not-So-Stubby Areas to advertise external prefixes.
- Types 9, 10, and 11 – Opaque LSAs: These LSAs are used for extensibility. (Type 8 is a separate External Attributes LSA and is rarely encountered.)
The LSA types allow for a hierarchical structure:
- LSAs that flow only within an area (intra-area routes): Type 1 and Type 2 (O)
- LSAs that flow between areas (inter-area routes): Type 3 and Type 4 (O IA)
- External routes: Type 5 (E1/E2) or Type 7 (N1/N2)
OSPF Area Types
OSPF offers the capability to create different area types, which relate to the various LSA types presented above and the way they flow inside a specific area. The different area types are as follows:
- Regular Area: This is the normal OSPF area, with no restrictions to the flow of LSAs.
- Stub Area: This area prevents external Type 5 LSAs from entering the area. It will also stop Type 4 LSAs, as they are only used in conjunction with Type 5 LSAs.
- Totally Stubby Area: This area prevents Type 5, Type 4, and Type 3 LSAs from entering the area. A default route is automatically injected so that routers in the area can reach destinations outside the area.
- Not-So-Stubby Area (NSSA): The NSSA blocks Type 4 and Type 5 LSAs, but it can still connect to other domains and ASBRs are allowed in this area. NSSAs will not receive external routes injected in other areas but they can inject external routes into the OSPF domain. The external routes are injected as Type 7 LSAs, which are converted to Type 5 LSAs by the NSSA’s ABR (i.e., the router that connects to the backbone) and they reach other OSPF areas as Type 5 LSAs.
- NSSA Totally Stubby Area: This area has the same characteristics as NSSA, except that it will also block Type 3 LSAs from entering the area.
Note: All routers in an OSPF area must agree on the stub flag. |
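The filtering rules for the area types can be captured in a small lookup table. The key names and the function `lsa_permitted_in_area` are hypothetical; the sets restate the list above. The sketch does not model the single default route that the ABR still injects into the stub variants.

```python
# LSA types kept out of each area type, restated from the list above.
BLOCKED_LSA_TYPES = {
    "regular": set(),
    "stub": {4, 5},
    "totally_stubby": {3, 4, 5},
    "nssa": {4, 5},
    "nssa_totally_stubby": {3, 4, 5},
}

# Type 7 LSAs exist only inside NSSA variants.
TYPE7_AREAS = {"nssa", "nssa_totally_stubby"}


def lsa_permitted_in_area(area_type, lsa_type):
    """Return True if an LSA of this type may exist in the given area type."""
    if lsa_type == 7:
        return area_type in TYPE7_AREAS
    return lsa_type not in BLOCKED_LSA_TYPES[area_type]


for area in BLOCKED_LSA_TYPES:
    allowed = [t for t in (1, 2, 3, 4, 5, 7) if lsa_permitted_in_area(area, t)]
    print(area, allowed)
```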
The various areas and LSA types are summarized in Figure 4.17 below:
Figure 4.17 – OSPF Area Types
All of these areas and LSA types make OSPF a very hierarchical and scalable routing protocol. You can tweak and tune it for very large environments because of all these design elements.
OSPF allows for summarization, which can be carried out in two locations:
- Between areas (inter-area summarization), using Type 3 LSAs
- At the ASBR, summarizing external prefixes, using Type 5 and Type 7 LSAs
IS-IS
The Intermediate System-to-Intermediate System (IS-IS) protocol is a pure Link-State protocol, just like OSPF, and it is defined in RFC 1142. IS-IS is currently used mostly in very large service provider environments, as it lost the battle with OSPF for Internet-wide supremacy. Considering this aspect, from a design perspective, not many engineers have strong knowledge about IS-IS.
IS-IS has many similarities to OSPF: it is a classless routing protocol, it supports VLSM, and it supports authentication. However, there are also some key differences from OSPF, including increased scalability features and the ability to support more routers in a single area.
Another difference between IS-IS and OSPF is that IS-IS is not as strict with the area concept. The backbone concept also exists in IS-IS but it offers much more flexibility. Routers that make up the IS-IS topology can be classified as Level 1, Level 2, or both (Level 1-2), and the area border function is simply performed by a router that supports both levels. Level 2 and Level 1-2 routers are connected, thus acting like the backbone structure of the topology, and each Level 1-2 node from the backbone can connect to Level 1 routers.
The concept of OSPF DRs also has an equivalent in IS-IS and it is called the Designated Intermediate System (DIS); one difference from OSPF is that there is no backup DIS in the IS-IS topology and no concept of special area types.
Another difference between IS-IS and OSPF concerns the routing protocol metric. The IS-IS metric is not based on interface bandwidth and the default metric has a value of 10 on Cisco routers. The IS-IS metric is similar to a hop count and must be manipulated in order to take bandwidth into consideration in the path selection process.
BGP
Border Gateway Protocol (BGP) is the only exterior gateway protocol in use today and its role is to exchange routing information between organizations. BGP is a standards-based protocol defined in RFC 4271 and is the successor of the Exterior Gateway Protocol (EGP).
The Need for BGP
BGP is used to route between autonomous systems (ASs) and is considered to be a path vector routing protocol. The BGP metric is based on multiple attributes that you can tune and control in order to influence which path traffic takes between ASs (this is in fact the routing decision). This is more of a policy-based routing approach, and policy routing is very important for Internet Service Providers that route traffic between each other’s ASs.
BGP is a classless routing protocol and it supports VLSM and summarization (i.e., route aggregation). While IGPs can scale to thousands of routes, BGP can scale to hundreds of thousands of routes, making it the most scalable routing protocol ever developed. Currently, the global BGP routing table has over 300,000 routes.
Another characteristic of BGP is its high stability, especially considering that there is never a solid convergence of the Internet routing table (i.e., something is always changing in such a large routing table). It is stable enough to handle routing and decision-making at the same time. With BGP it is all about the enforcement of policies, so it does not use a simple metric value that might be tied to a single parameter (like bandwidth). Instead, BGP has a group of attributes that can be manipulated to dictate a particular routing policy.
BGP is used in two particular scenarios, which are depicted in Figure 4.18 below:
- Transit networks: ISPs that want to provide transit to other destinations on the public Internet
- Multihomed networks: big enterprise networks that rely heavily on Internet traffic and have sophisticated connectivity requirements for two or more ISPs; BGP allows them to control inbound and outbound routing policies
Figure 4.18 – BGP Deployment Scenarios
Most of the enterprise networks do not need BGP for various reasons:
- The network requires single ISP connectivity and default routing configuration is sufficient. A default route will point to the ISP so all Internet traffic is routed to that single ISP.
- The memory and CPU resources are limited and do not support a BGP implementation. The global routing table needs more than 1GB of memory just for storage.
- BGP cannot be used if you don’t own the IPv4 address space in use. This happens in situations in which the company’s addresses are owned by the ISP and the ISP takes care of that address space advertising on the Internet. This is the case for small and medium-sized organizations.
BGP Functionality
Similar to OSPF, IS-IS, and EIGRP, BGP uses a three-table data structure:
- Neighbor table: contains information about adjacency with peers
- BGP table (topology table): contains all the prefixes learned from the peers and by itself
- IP routing table: contains the best routes from the BGP table
The devices running BGP will establish the peerings in order to build the neighbor table and will then exchange updates to build the BGP table. After the BGP table is built, the best paths for routing information are chosen and are used to build the IP routing table.
BGP allows for different types of peerings to be created, as illustrated in Figure 4.19 below:
- External BGP (eBGP) peerings create BGP peerings with neighbors that are outside of the AS
- Internal BGP (iBGP) peerings create BGP peerings with devices inside the AS
Figure 4.19 – BGP Peering Types
The BGP peering types that a route is being sent to and received from will influence the update and path selection rules. One example is that eBGP peers are assumed to be directly connected. If they are not, the ebgp-multihop option must be configured for that neighbor so the device knows the peer is not directly connected and can establish the BGP peering. This assumption has no equivalent when considering iBGP peerings, where there is no requirement for direct connectivity.
Another example of iBGP versus eBGP behavior is that an iBGP-learned route will not be advertised between iBGP peers because of a special loop prevention mechanism that prevents an update learned via iBGP from being sent to other iBGP peers. This happens because BGP assumes that all routers within an AS have complete information about each other. Multiple solutions exist to solve this issue:
- Configure a full mesh of iBGP peers
- Use BGP Route Reflectors
- Organize the AS into BGP Confederations
The solution that involves a full mesh of iBGP peers is the least preferred because of the increased number of connections. The total number of connections is n*(n-1)/2, where n equals the number of BGP routers, so for 1000 routers you would have 499,500 peerings. This is very hard to implement and maintain, which is why Route Reflector and Confederation solutions are recommended instead.
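The scaling argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the single Route Reflector case assumes one RR with every other router as its client, which ignores the redundant RRs a real design would normally include.

```python
def full_mesh_sessions(n: int) -> int:
    """iBGP sessions needed for a full mesh of n routers: n*(n-1)/2."""
    if n < 0:
        raise ValueError("router count cannot be negative")
    return n * (n - 1) // 2


def single_rr_sessions(n: int) -> int:
    """Sessions when one router is the RR and the other n-1 are its clients
    (a simplifying assumption, with no redundant RR)."""
    return max(n - 1, 0)


if __name__ == "__main__":
    for routers in (10, 100, 1000):
        print(routers, full_mesh_sessions(routers), single_rr_sessions(routers))
```

For 1000 routers the full mesh needs 499,500 sessions, while the simplified single-RR layout needs 999.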
Route Reflectors (RRs) are nodes that reflect iBGP updates to devices that are configured as RR clients. This solution is easy to design and implement and works around the iBGP split horizon rule. You still have to configure a full mesh of connections between the RR nodes and the normal nodes (i.e., the iBGP routers that are not RR clients), but the clients only need to peer with their RR rather than with each other. This is illustrated in Figure 4.20 below:
Figure 4.20 – BGP Route Reflectors
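The advertisement rules discussed in this section can be summarized as a small decision function. The sketch below is a simplified model, not vendor code: the PeerType names and the may_advertise function are hypothetical, and details such as not sending a route back to the peer it came from are left out.

```python
from enum import Enum


class PeerType(Enum):
    EBGP = "ebgp"
    IBGP_CLIENT = "ibgp-client"          # iBGP peer configured as an RR client
    IBGP_NON_CLIENT = "ibgp-non-client"  # ordinary iBGP peer


def may_advertise(learned_from: PeerType, send_to: PeerType,
                  is_route_reflector: bool) -> bool:
    """Return True if a route learned from one peer type may be sent to another."""
    # Routes learned over eBGP are advertised to every peer.
    if learned_from is PeerType.EBGP:
        return True
    # iBGP-learned routes may always be advertised to eBGP peers.
    if send_to is PeerType.EBGP:
        return True
    # iBGP to iBGP: blocked by the split horizon rule on an ordinary router.
    if not is_route_reflector:
        return False
    # A Route Reflector reflects when the source or the target is a client.
    return PeerType.IBGP_CLIENT in (learned_from, send_to)
```

On an ordinary router every iBGP-to-iBGP case returns False, which is why a full mesh is otherwise needed. On an RR, only the non-client to non-client case remains blocked, which matches the requirement for a full mesh between RRs and non-clients.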
BGP Confederations are more complex than Route Reflectors, and they function by creating a sub-AS within the main AS. The connections between sub-ASs behave like eBGP peerings and the connections inside sub-ASs are pure iBGP peerings. This means that you only need full-mesh configuration inside sub-ASs. Inside sub-ASs you can also configure RRs, so you can have a combination of BGP design technologies. BGP Confederations are illustrated in Figure 4.21 below:
Figure 4.21 – BGP Confederations
Note: AS numbers (ASNs) were originally defined as 16-bit integers that range from 0 to 65535. Sub-ASs are usually assigned private ASNs, which range from 64512 to 65534 (65535 itself is reserved). Due to public ASN exhaustion, IANA introduced 32-bit ASNs, which it has begun to allocate over the last few years.
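As a quick illustration of the ASN ranges, the following Python sketch tests whether a number is a private ASN and whether it fits in the original 16-bit field. It assumes the private-use ranges reserved in RFC 6996 (64512 to 65534 for 16-bit ASNs and 4200000000 to 4294967294 for 32-bit ASNs); the function names are invented for this example.

```python
MAX_16_BIT_ASN = 65535
MAX_32_BIT_ASN = 4294967295


def is_private_asn(asn: int) -> bool:
    """True if the ASN lies in a private-use range (assumed from RFC 6996)."""
    if not 0 <= asn <= MAX_32_BIT_ASN:
        raise ValueError("ASN out of range")
    return 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294


def fits_in_16_bits(asn: int) -> bool:
    """True if the ASN can be carried in the original 16-bit field."""
    if not 0 <= asn <= MAX_32_BIT_ASN:
        raise ValueError("ASN out of range")
    return asn <= MAX_16_BIT_ASN


if __name__ == "__main__":
    for asn in (64511, 64512, 65534, 65535, 65536, 4200000000):
        print(asn, is_private_asn(asn), fits_in_16_bits(asn))
```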
BGP Path Vector Attributes
BGP can use multiple attributes to define a routing policy and the most important ones include the following:
- Next Hop: This is an attribute that must be present in each BGP update and it indicates where the traffic should be sent in order to reach a particular destination.
- AS-Path: This attribute lists all the ASs through which the prefix has passed. AS-Path is similar to a hop count, except that it uses ASNs and provides more details about the path.
- Origin: This attribute gives information about how the prefix entered the BGP system, either directly advertised into BGP with the network command or redistributed from other routing protocols.
- Local Preference: This attribute can influence the way packets leave the AS and the path that it takes.
- Multi-Exit Discriminator (MED): MED can influence the way traffic enters an AS and the path that it takes.
- Atomic Aggregate: This attribute is used when performing BGP summarization.
- Aggregator: This attribute is also used when performing BGP summarization.
- Community: Used to assign a tag value to a prefix. The value can be matched by policies at other devices in order to take specific actions on the prefix.
BGP attributes can be grouped into several categories. They can be either well-known or optional: well-known attributes must be supported by all BGP implementations, while optional attributes are supported only by certain implementations. Well-known attributes can also be either mandatory or discretionary, with the mandatory attributes sent in every routing update, while the discretionary attributes may or may not be present in routing updates. Another categorization, which applies to optional attributes, relies on the path attribute’s transitivity. They can be either transitive (i.e., they pass between eBGP and iBGP neighbors) or non-transitive (i.e., they pass between only iBGP neighbors).
Valid combinations of BGP attributes include the following:
- Well-known mandatory (Next Hop, AS-Path, and Origin)
- Well-known discretionary (Local Preference and Atomic Aggregate)
- Optional transitive (Aggregator and Community)
- Optional non-transitive (MED).
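The classification above can be captured in a simple lookup table. The Python sketch below only restates the four categories and the attributes listed in this section; the dictionary and helper names are hypothetical.

```python
from collections import defaultdict

VALID_CATEGORIES = (
    "well-known mandatory",
    "well-known discretionary",
    "optional transitive",
    "optional non-transitive",
)

# Attribute name -> category, exactly as listed in the text.
CATEGORY = {
    "Next Hop": "well-known mandatory",
    "AS-Path": "well-known mandatory",
    "Origin": "well-known mandatory",
    "Local Preference": "well-known discretionary",
    "Atomic Aggregate": "well-known discretionary",
    "Aggregator": "optional transitive",
    "Community": "optional transitive",
    "MED": "optional non-transitive",
}


def attributes_by_category() -> dict:
    """Group the attribute names under their category."""
    grouped = defaultdict(list)
    for name, category in CATEGORY.items():
        if category not in VALID_CATEGORIES:
            raise ValueError(f"invalid category for {name}: {category}")
        grouped[category].append(name)
    return dict(grouped)


def required_in_every_update(name: str) -> bool:
    """Only well-known mandatory attributes must appear in every update."""
    return CATEGORY[name] == "well-known mandatory"
```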
BGP systems analyze all of these attributes and determine the best path to a destination through a fairly complex decision-making process. Only the best route is sent to the routing table and to the peers. The first step in this process is checking whether the next hop is reachable. If it is not, the route is dropped. If it is, the candidate paths are compared using the following criteria, in order, until one path wins:
- Weight (Cisco-specific attribute): the highest weight is preferred
- Local preference: the highest local preference is preferred
- Locally originated routes: routes originated by the local router are preferred
- AS-Path: the shortest AS-Path is preferred
- Origin: routes with the lowest origin type are preferred
- MED: the lowest MED is preferred
- Neighbor type: routes that came via eBGP are preferred over those that were learned via iBGP
- IGP metric: if there is still a tie, the lowest IGP metric wins
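The decision steps above amount to comparing candidate paths field by field, which can be sketched in Python as a sort key. This is a simplified model and not an actual BGP implementation: the Path class and its field names are hypothetical, the defaults (weight 0, local preference 100) follow common Cisco behavior, the origin ranking assumes IGP is lower than EGP, which is lower than incomplete, and MED is compared across all paths even though real implementations usually compare it only between paths from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    name: str
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: Tuple[int, ...] = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0


def preference_key(p: Path) -> tuple:
    """Smaller tuples win; 'highest wins' fields are negated."""
    return (
        -p.weight,                 # highest weight
        -p.local_pref,             # highest local preference
        not p.locally_originated,  # locally originated first
        len(p.as_path),            # shortest AS-Path
        ORIGIN_RANK[p.origin],     # lowest origin type
        p.med,                     # lowest MED
        not p.is_ebgp,             # eBGP over iBGP
        p.igp_metric,              # lowest IGP metric to the next hop
    )


def best_path(paths: Sequence[Path]) -> Optional[Path]:
    """Drop paths with an unreachable next hop, then pick the most preferred."""
    candidates = [p for p in paths if p.next_hop_reachable]
    return min(candidates, key=preference_key) if candidates else None


if __name__ == "__main__":
    a = Path("via-ISP-A", as_path=(65001, 65010))
    b = Path("via-ISP-B", local_pref=200, as_path=(65002, 65020, 65010))
    print(best_path([a, b]).name)
```

In the usage example, the path through ISP B is selected even though its AS-Path is longer, because local preference is evaluated before AS-Path length.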
IPv6 Routing Concepts
Cisco routers do not route IPv6 by default, so this capability must be activated with the ipv6 unicast-routing command. Once IPv6 routing is enabled, Cisco routers can operate in dual-stack mode, meaning they are capable of running IPv4 and IPv6 simultaneously on the same interfaces.
IPv6 allows the use of static routing and also supports specific dynamic routing protocols that are variations of the IPv4 routing protocols modified or redesigned to support IPv6, such as:
- RIPng (RIP next generation)
- OSPFv3
- EIGRPv6
- IS-IS
- BGP
Note: IS-IS and BGP required the fewest modifications to support IPv6 because they were built with extensibility in mind.
RIPng, OSPFv3, and EIGRPv6 are new routing protocols that work independently of the IPv4 versions and they run in a completely separate process on the device. BGP and IS-IS are exceptions to this rule, as they route IPv6 traffic using the same process as for IPv4 traffic but they also use the concept of address families that hold the entire IPv6 configuration.
Many of the issues with IPv4 (such as name resolution or NBMA environments which are covered later) still exist in the IPv6 routing world. An important aspect is that IPv6 routing protocols communicate with the remote Link-Local addresses when establishing their adjacencies and exchanging routing information. When examining the routing table of an IPv6 router, notice that the next hops are the Link-Local addresses of the neighbors.
As mentioned, static routing is one of the options you can use in IPv6 and it has the same implications as in IPv4. The route can point to:
- The next hop (the next hop must be resolved)
- A Multipoint interface (the final destination must be resolved)
- A Point-to-Point interface (no resolution is required)
RIP next generation (RIPng), also called RIP for IPv6, was specified in RFC 2080 and is similar in operation to RIPv1 and RIPv2. While RIPv2 uses the multicast address 224.0.0.9 to exchange routing information with its neighbors, RIPng uses the similar FF02::9 address and UDP port 521. Another difference between the versions is that RIPng is configured at the interface level, while RIPv1 and RIPv2 are configured at the global routing configuration level.
OSPF version 3 was originally defined in RFC 2740 (since obsoleted by RFC 5340) and is similar in operation to OSPFv2 (for IPv4). OSPFv3 even supports the same network types as OSPFv2:
- Broadcast
- Non-Broadcast
- Point-to-Point
- Point-to-Multipoint
- Point-to-Multipoint Non-Broadcast.
EIGRPv6 is similar in operation to EIGRP for IPv4 and it uses IP protocol 88 to multicast updates to FF02::A.
Note: An important aspect to consider when implementing EIGRPv6 is that, unlike EIGRP for IPv4, the process is shut down until you manually enable it by issuing the no shutdown routing configuration command.
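The IPv6 addresses mentioned in this section can be examined with Python's standard ipaddress module. The sketch below is illustrative only: the helper names and the sample next-hop addresses are assumptions, and the check for link-local multicast scope simply tests membership in FF02::/16.

```python
import ipaddress

# Multicast groups named in the text.
MULTICAST_GROUPS = {"RIPng": "ff02::9", "EIGRPv6": "ff02::a"}

LINK_LOCAL_SCOPE_MULTICAST = ipaddress.IPv6Network("ff02::/16")


def is_link_local_scope_multicast(addr: str) -> bool:
    """True for multicast groups whose scope is the local link (FF02::/16)."""
    ip = ipaddress.IPv6Address(addr)
    return ip.is_multicast and ip in LINK_LOCAL_SCOPE_MULTICAST


def is_link_local_next_hop(addr: str) -> bool:
    """True if the next hop is a Link-Local unicast address (FE80::/10)."""
    return ipaddress.IPv6Address(addr).is_link_local


if __name__ == "__main__":
    for protocol, group in MULTICAST_GROUPS.items():
        print(protocol, group, is_link_local_scope_multicast(group))
    # Hypothetical next hops: a neighbor's Link-Local address and a global one.
    print(is_link_local_next_hop("fe80::1"))
    print(is_link_local_next_hop("2001:db8::1"))
```

Because both multicast groups are scoped to the local link, these updates are never forwarded by a router, which is consistent with adjacencies being formed between Link-Local addresses.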
BGP for IPv6 is configured in Address Family Configuration mode but it is based on the same configuration principles used by BGP for IPv4:
- An underlying transport IGP is required
- There is an implicit iBGP loop-prevention mechanism that prevents iBGP-learned routes from being advertised to other iBGP neighbors (this can be solved using Route Reflectors or Confederations)
- There is an implicit eBGP loop-prevention mechanism that does not accept routes entering an AS that have the same AS in the path
- It uses the same best-path selection process
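The eBGP loop-prevention check described in the list above is simple enough to sketch directly. In this hypothetical Python model, a router prepends its own ASN when advertising to an eBGP neighbor and rejects any route whose AS-Path already contains its own ASN; the function names and the ASNs used are made up for illustration.

```python
from typing import List, Sequence


def advertise_to_ebgp(local_as: int, as_path: Sequence[int]) -> List[int]:
    """Prepend the local ASN before sending the route to an eBGP neighbor."""
    return [local_as, *as_path]


def accept_from_ebgp(local_as: int, as_path: Sequence[int]) -> bool:
    """Reject routes that already carry the local ASN in their AS-Path."""
    return local_as not in as_path


if __name__ == "__main__":
    # AS 65001 originates a prefix and sends it to AS 65002, then AS 65003.
    path = advertise_to_ebgp(65001, [])
    path = advertise_to_ebgp(65002, path)
    path = advertise_to_ebgp(65003, path)
    print(path)
    # If the route is offered back to AS 65001, it is refused.
    print(accept_from_ebgp(65001, path))
```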
Layer 3 Switching
Historically, LAN switching typically involved Layer 2 switching at the Access Layer and sometimes at the Distribution Layer. Layer 2 switches forward information based only on the MAC address (the Layer 2 frame address). Layer 3 switches, however, use the MAC address and also the Layer 3 address (e.g., an IP address) when making forwarding decisions.
Three options exist when designing a switched environment:
- Layer 2 switching throughout the network
- A combination of Layer 2 and Layer 3 switching; this requires a higher degree of planning
- Layer 3 switching throughout the network
When trying to decide between Layer 2 and Layer 3 switching, you must understand the impact this decision has on five main areas, which are illustrated in Figure 4.22 below:
- How the policies are implemented
- How intelligently load sharing can be performed
- How network failures are dealt with
- Convergence issues
- The cost factor
Figure 4.22 – Layer 2 vs. Layer 3 Switching
Note: Using Layer 2 switching, Layer 3 switching, or a combination of both also depends on the switching platforms available. Not all switches support Layer 3 technologies. Layer 3 switches are also called multilayer switches or routing switches.
At the heart of switched networking is the concept of virtual LANs (VLANs). This basically represents the process of creating logical broadcast domains by grouping particular nodes attached to different switches within a switched environment.
When designing a full Layer 2 environment using VLANs, a router might be used to provide routing between VLANs. This technique is called “router on a stick” because only one router interface is used, which carries all the VLANs.
The advantage of having a combination of Layer 2 and Layer 3 switches at the Distribution Layer or Core Layer, or Layer 3 switches throughout the network, is that a routing process or router switch module is built into the switch itself.
When using Layer 2 switches and VLANs exclusively throughout the network, all the policies, access control lists, and QoS rules are managed at the Data Link Layer. The policy capabilities are very limited at the Data Link Layer but they are greatly enhanced in Layer 3 switches.
Another area in which Layer 2 switches are limited is load sharing across redundant links (i.e., multiple paths) throughout the network. This is because Layer 2 switches only know about MAC addresses. They cannot perform intelligent load sharing based, for example, on the destination network, as Layer 3 switches that support dynamic routing protocols can. With Layer 2 switching, the load can only be shared on a per-VLAN basis.
In addition, in a network built only from Layer 2 switches, the failure domain can be narrowed no further than the VLAN. On the other hand, in a multilayer environment, the failures can be better isolated to the Access Layer, to the Core Layer, and even to particular network segments.
Convergence and loop control are offered only by the Spanning Tree Protocol (STP) in a Layer 2-only switched environment. However, when using Layer 3 switching, this feature can also be implemented at the Distribution and Core Layers using routing protocol technologies (e.g., OSPF, EIGRP, and others).
When considering the cost, using Layer 2 switches everywhere is the cheapest solution but this is also the least flexible and manageable option. Using Layer 3 switches throughout the network is the most expensive option but it is very powerful and flexible. A compromise would be to implement Layer 3 switches first only in the Distribution Layer, and then eventually, as the budget allows and the network scales, extend the Layer 3 switches into the Core Layer, which will give you full Layer 3 switching at the Distribution and Core Layers.
Summary
The Spanning Tree Protocol (STP), defined by IEEE 802.1D, is a loop-prevention protocol that allows switches to communicate with each other in order to discover physical loops in a network.
Virtual LANs (VLANs) define broadcast domains in a Layer 2 network. They represent an administratively defined subset of switch ports that are in the same broadcast domain. A broadcast domain is the area in which a Broadcast frame propagates through a network.
VLANs represent a group of devices that participate in the same Layer 2 domain and can communicate without needing to pass through a router. This means that they share the same broadcast domain. Best design practices suggest a one-to-one relationship between VLANs and IP subnets. Devices in a single VLAN are typically also in the same IP subnet.
IP routing is the process of forwarding a packet based on the destination IP address. Routers keep the best paths to destinations learned via direct connections, static routing, or dynamic routing in internal data structures called routing tables. A routing table contains a list of networks the router has learned about and information about how to reach them.
The most important information a routing table contains includes the following:
- How the route was learned (statically, dynamically, or directly connected)
- The address of the neighbor router from which the network was learned
- The interface through which the network can be reached
- The route metric: a measurement that gives routers information about how far or how preferred a network is (the exact meaning of the metric value depends on the routing protocol used)
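The fields listed above map naturally onto a small data structure. The Python sketch below is a toy model, not a router implementation: the Route class, the sample entries, and the interface names are hypothetical, and the lookup uses longest-prefix matching, a standard forwarding rule that this summary does not spell out.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    prefix: ipaddress.IPv4Network
    source: str              # how the route was learned
    next_hop: Optional[str]  # neighbor address (None if directly connected)
    interface: str           # outgoing interface
    metric: int              # meaning depends on the routing protocol


TABLE: List[Route] = [
    Route(ipaddress.ip_network("0.0.0.0/0"), "static", "192.0.2.1", "Gi0/0", 0),
    Route(ipaddress.ip_network("10.0.0.0/8"), "static", "10.255.0.1", "Gi0/1", 0),
    Route(ipaddress.ip_network("10.1.0.0/16"), "ospf", "10.255.0.2", "Gi0/2", 20),
    Route(ipaddress.ip_network("10.255.0.0/24"), "connected", None, "Gi0/1", 0),
]


def lookup(table: List[Route], destination: str) -> Optional[Route]:
    """Return the matching route with the longest prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in table if dest in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen) if matches else None


if __name__ == "__main__":
    for address in ("10.1.2.3", "10.9.9.9", "198.51.100.7"):
        route = lookup(TABLE, address)
        print(address, route.prefix, route.source, route.interface)
```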
The dynamic routing protocols that are most commonly used in modern networks are RIP, EIGRP, OSPF, and BGP.
Distance Vector routing protocols include:
- RIP version 1
- RIP version 2
- IGRP
- RIPng
Link-State routing protocols include:
- OSPF
- IS-IS
- OSPF version 3
The main difference between Distance Vector routing protocols and Link-State routing protocols is the way they exchange routing updates. Distance Vector protocols function using the “routing by rumor” technique, as every router relies on its neighbors to maintain correct routing information. This means that the entire routing table is sent periodically to neighbors.
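The "routing by rumor" idea can be illustrated with a minimal Distance Vector update step in Python. This is a teaching sketch under stated assumptions rather than any protocol's real algorithm: the table layout (destination mapped to a metric and next hop), the function name, and the router names are hypothetical, and protocol features such as timers, split horizon, and a maximum metric are omitted.

```python
from typing import Dict, Tuple

# destination -> (metric, next hop)
RoutingTable = Dict[str, Tuple[int, str]]


def merge_update(table: RoutingTable, neighbor: str, link_cost: int,
                 advertised: Dict[str, int]) -> bool:
    """Merge a neighbor's advertised table into ours; return True if we changed."""
    changed = False
    for dest, metric in advertised.items():
        candidate = metric + link_cost
        current = table.get(dest)
        better = current is None or candidate < current[0]
        # News from the current next hop is always believed, even if worse.
        same_source = current is not None and current[1] == neighbor
        if better or (same_source and candidate != current[0]):
            table[dest] = (candidate, neighbor)
            changed = True
    return changed


if __name__ == "__main__":
    r1: RoutingTable = {"10.0.1.0/24": (0, "connected")}
    merge_update(r1, "R2", 1, {"10.0.2.0/24": 0, "10.0.3.0/24": 1})
    merge_update(r1, "R3", 1, {"10.0.3.0/24": 0})
    for dest, (metric, via) in sorted(r1.items()):
        print(dest, metric, via)
```

The router never sees the topology itself. It only adds its own link cost to whatever its neighbors report, which is why it depends entirely on those neighbors maintaining correct information.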