Explain different methods of and rationales for network performance optimization. This chapter describes network performance techniques, including QoS, traffic shaping, load balancing, high availability, caching, and fault tolerance. You will learn more about QoS in our Cisco CCNP ENCOR course.
Performance Optimization Considerations
As technology evolves, networks increase in complexity and have to support more and more applications, so you have to make sure that you monitor and optimize the performance of your environment. One of the major reasons for this is to keep every application and network service running and available. Network uptime is often the number one priority in an organization, because users expect everything to work properly when they access different types of data and applications across the network and over the Internet.
Because of these considerations, you should design and build your networks with redundancy and resiliency, so if there is a problem in any particular part of the network, you have systems and procedures in place to minimize the impact of that outage.
Some of the most critical applications that require peak network performance include:
- Transactional applications (banking, insurance, stocks, etc.)
- Healthcare applications (imaging solutions, hospital applications, etc.)
- Multimedia applications (voice and video)
If you use applications that need high bandwidth and they run all the time, the network connection may become congested. In such situations it is critical to make sure that your network is optimized so every application runs as it should and is guaranteed a certain amount of the network resources.
Network performance optimization may include taking care of several network parameters, as follows:
- Bandwidth: you should make sure that every application/service has enough bandwidth to function properly.
- Latency: you should make sure that packets arrive at the destination in a reasonable amount of time – as requested by the application/service (this is especially important when using real-time applications).
- Jitter (variable latency): this is also very important in multimedia and real-time applications.
Some of the most sensitive applications from a latency (delay) and jitter perspective are VoIP applications. If the packets do not arrive at a constant and fast rate to the other side, then the voice call may be broken and the service cannot be used. The same applies to video applications.
Quality of Service
Quality of Service (QoS) is a key aspect that needs to be considered when designing and implementing enterprise campus solutions to ensure performance optimization across the network. There are many different categories of the QoS approach, including:
- Shaping
- Policing
- Congestion management
- Congestion avoidance
- Link efficiency mechanisms
QoS is usually needed when traffic goes from a high-bandwidth connection (LAN) to a low-bandwidth connection (Internet or WAN). This is the reason QoS techniques are usually applied at the network’s exit point.
QoS involves a wide variety of techniques used, especially in networks that offer multimedia services (voice and/or video) because these services are usually delay-sensitive and require low latency and low jitter. Traffic generated by these applications must be prioritized and this is the role of QoS techniques.
Congestion Management
In situations in which a specific link is constantly congested, the link may need to be upgraded. However, when experiencing occasional congestion on a particular link, you can use QoS congestion management techniques. Congestion happens for a lot of different reasons in modern networks.
The primary approach to congestion management is queuing. Applying queuing techniques means using methods other than the default FIFO (First In, First Out) behavior. An interface consists of two different queue areas, which are shown in Figure 26.1 below:
- Hardware queue (or Transmit Ring – TX Ring)
- Software queue
Figure 26.1 – Interface Queue Types
The hardware queue on the interface always uses the FIFO method for packet treatment. This mode of operation assures that the first packet in the hardware queue is the first packet that will leave the interface. The only TX Ring parameter that can be modified on most devices is the queue length.
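On Cisco IOS platforms, for example, the hardware queue depth can typically be tuned with the tx-ring-limit interface command; the sketch below is only illustrative, as command availability and supported values vary by platform and interface type.

```
interface Serial0/0
 ! Shrink the hardware (TX Ring) queue so packets spend more time in the
 ! software queue, where the configured QoS policy can act on them.
 ! The value 3 is purely illustrative; valid ranges are platform-dependent.
 tx-ring-limit 3
```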
The software queue is where most of the congestion management manipulations are performed. The software queue is used to order packets before they use the hardware queue and they can be configured with different queuing strategies.
Congestion might happen because of the high-speed LAN connections that aggregate into the lower-speed WAN connections. Aggregation refers to being able to support the cumulative effect of all the users wanting to use the connection.
There are many different approaches (queuing strategies) that can be used in congestion management:
- FIFO (First In First Out)
- PQ (Priority Queuing)
- RR (Round Robin)
- WRR (Weighted Round Robin)
- DRR (Deficit Round Robin)
- MDRR (Modified Deficit Round Robin)
- SRR (Shaped Round Robin)
- CQ (Custom Queuing)
- FQ (Fair Queuing)
- WFQ (Weighted Fair Queuing)
- CBWFQ (Class Based Weighted Fair Queuing)
- LLQ (Low Latency Queuing)
Note: All of the techniques mentioned above are used in the interface software queue. The hardware queue always uses FIFO.
FIFO is a technique used in the hardware queue and is the least complex method of queuing. It operates by giving priority to the packets first received. This is also usually the default queuing mechanism for software queues on high-speed interfaces. If you have a sufficient budget to overprovision the congested links, you can use FIFO on all your interfaces (hardware and software queues). However, in most situations this is not the case, so you will need to use some kind of advanced queuing technique, like WFQ, CBWFQ, or LLQ. These are the most modern queuing strategies that will enable you to make sure that important packets are getting priority during times of congestion.
FIFO used in the software queue will not make a determination on packet priorities that are usually signaled using QoS markings. If you rely on FIFO and experience congestion, the traffic will be affected by things like delay or jitter and important traffic might be starved and might not reach its destination.
Weighted Fair Queuing (WFQ) is usually the default technique used on slow-speed interfaces (less than 2Mbps) because it is considered to be more efficient than FIFO in this case. Weighted Fair Queuing functions by dynamically sorting the traffic into flows and then dedicating a queue for each flow while trying to allocate the bandwidth fairly. It does this by inspecting the QoS markings and giving priority to higher priority traffic.
WFQ is not the best solution in every scenario because it does not provide enough control in the configuration (i.e., it does everything automatically), but it is far better than the FIFO approach because interactive flows that generally use small packets (e.g., VoIP) get prioritized to the front of the software queue. The WFQ fairness element makes sure that high priority interactive conversations don’t get starved by high volume traffic flows. In other words, high volume talkers won’t use all of the interface bandwidth.
Figure 26.2 – Weighted Fair Queuing Logic
As shown in Figure 26.2 above, the different WFQ flows are placed into different queues, and then they hit a WFQ scheduler that allows them to pass to the hardware queue based on the defined logic. If one queue fills to the limit, the packets will be dropped, but this will also be based on the WFQ approach (lower priority packets are dropped first), as opposed to the FIFO approach of tail dropping.
Because WFQ lacks a certain level of control, another congestion management technology was created called Custom Queuing (CQ). Even though CQ is a legacy technology it is still implemented in some environments. CQ is similar to WFQ but it operates by manually defining 16 static queues. Network administrators can assign a byte count for each queue (i.e., the number of bytes that are to be sent from each queue). Queue number 0 is reserved for the system to avoid starvation of key router messages. CQ allows for manual allocation of the number of bytes or packets for each queue.
Even though CQ provides flexible congestion management, this does not work well with VoIP implementations because of the round-robin nature of Custom Queuing. Let’s consider an example with four queues that are allocated a different number of packets (Q1: 10 packets; Q2: 20 packets; Q3: 50 packets; and Q4: 100 packets) over a particular time interval. Even though Queue 4 has priority, the interface is still using a round-robin approach (Q4-Q3-Q2-Q1-Q4…and so on). This is not appropriate for VoIP scenarios because voice traffic needs strict priority to maintain a constant traffic flow that will minimize jitter. As a result, another legacy technology was invented called Priority Queuing (PQ). PQ places packets into four priority queues:
- Low
- Normal
- Medium
- High
VoIP traffic is placed in the high priority queue to ensure absolute priority. However, this can lead to the starvation of other queues. For this reason, PQ is not recommended for use in modern networks.
If VoIP is not used in the network, the most recommended congestion management technique is Class Based Weighted Fair Queuing (CBWFQ). CBWFQ defines the amounts of bandwidth that the various forms of traffic will get. Minimum bandwidth reservations are defined for different classes of traffic.
Figure 26.3 – Class Based Weighted Fair Queuing Logic
Referencing Figure 26.3 above, CBWFQ logic is based on a WFQ scheduler that receives information from queues defined for different forms of traffic. The traffic that does not fit any manually defined queue automatically falls into the “class-default” queue. Minimum bandwidth guarantees can be assigned to each of these traffic classes, so CBWFQ offers a powerful method for controlling exactly how much bandwidth the various classifications will receive. Within each individual queue, packets are handled using FIFO, so be careful not to combine too many forms of traffic inside a single class.
Because CBWFQ is not efficient when using VoIP, another QoS technique was developed: Low Latency Queuing (LLQ). As can be seen in Figure 26.4 below, LLQ adds a priority queue (usually for voice traffic) to the CBWFQ system, so LLQ is often referred to as an extension of CBWFQ:
Figure 26.4 – Low Latency Queuing Logic
Adding a priority queue to CBWFQ will not lead to starvation because this queue is policed, which means that the amount of bandwidth guaranteed for voice is also policed to ensure that it doesn’t exceed a particular value. In other words, voice traffic gets its own priority treatment and the remaining traffic forms will fall under WFQ based on the bandwidth reservation values.
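As an illustration, a minimal Cisco IOS MQC (Modular QoS CLI) sketch of LLQ combined with CBWFQ might look like the following; the class names, DSCP values, and bandwidth figures are assumptions chosen for the example rather than recommendations.

```
class-map match-any VOICE
 match dscp ef
class-map match-any CRITICAL-DATA
 match dscp af31
!
policy-map WAN-EDGE
 class VOICE
  ! LLQ: strict priority queue, policed to 256 kbps so voice cannot starve other classes
  priority 256
 class CRITICAL-DATA
  ! CBWFQ: minimum bandwidth guarantee of 512 kbps during congestion
  bandwidth 512
 class class-default
  ! All remaining traffic is fair-queued
  fair-queue
!
interface Serial0/0
 ! Queuing policies act on the outbound (congested) direction
 service-policy output WAN-EDGE
```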
Congestion Avoidance
Congestion avoidance is another category of Differentiated Services QoS, often deployed in Wide Area Networks. When both the hardware and the software queues fill up, packets are tail dropped at the end of the queue, which can lead to voice traffic starvation and/or to the TCP global synchronization process described earlier. Using congestion avoidance techniques can guard against these global synchronization problems.
The most popular congestion avoidance mechanism is called Random Early Detection (RED). The job of this QoS tool is to try to prevent congestion from occurring by randomly dropping unimportant traffic before the queue gets full. As the queue fills up, more and more unimportant packets are randomly dropped.
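On Cisco IOS devices, RED is implemented as Weighted RED (WRED) and is enabled per class with the random-detect command. A minimal sketch, assuming a BULK-DATA class-map has already been defined elsewhere, might be:

```
policy-map WAN-EDGE
 class BULK-DATA
  bandwidth 512
  ! Drop packets selectively (weighted by DSCP value) as the queue grows,
  ! instead of tail dropping everything once the queue is completely full
  random-detect dscp-based
```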
Shaping and Policing
Shaping and policing do not represent one and the same thing, as many people think. Shaping is the process that tries to control the way traffic is sent, by buffering excess packets. Policing, on the other hand, will drop or re-mark (penalize) packets that exceed a given rate.
Policing might be used when fast WAN access is available but the full rate is not needed; this prevents certain applications from using all the connection resources. Another situation is where certain applications have clear bandwidth requirements and should be offered only as many resources as they need.
Shaping is often used to prevent congestion in situations where you have asymmetric bandwidth. An example in this regard can be a headquarters router that connects to a branch router that has a lower bandwidth connection. In this type of environment, you can set up shaping when the HQ router sends data so it does not overwhelm the branch office router.
Many times the contract between a service provider and its customer specifies a committed information rate (CIR) value. This represents the amount of bandwidth purchased from the service provider. Shaping can be used when you want to make sure that the data you send conforms to the specified CIR.
When comparing shaping and policing, notice that shaping can only be done outbound, while policing can be used both in the ingress and egress direction. Another key distinction is that policing will drop or re-mark the packet (i.e., it is not a queuing mechanism, as it can also handle inbound traffic), while shaping will queue the excess traffic. Because of this behavior, policing will require less buffering. With shaping, one advantage is that it supports Frame Relay congestion indicators by responding to FECN and BECN messages.
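The difference is easiest to see in configuration. In the hedged Cisco IOS MQC sketch below, all outbound traffic is shaped to a 512 kbps CIR, while a SCAVENGER class (assumed to be defined elsewhere) is policed to 128 kbps; all names and rates are illustrative.

```
policy-map SHAPE-TO-CIR
 class class-default
  ! Shaping buffers excess packets so the egress rate conforms to the 512 kbps CIR
  shape average 512000
!
policy-map POLICE-SCAVENGER
 class SCAVENGER
  ! Policing drops (or could re-mark) traffic above 128 kbps instead of buffering it
  police 128000 conform-action transmit exceed-action drop
!
interface Serial0/0
 ! Shaping can only be applied outbound
 service-policy output SHAPE-TO-CIR
!
interface GigabitEthernet0/1
 ! Unlike shaping, policing can also be applied inbound
 service-policy input POLICE-SCAVENGER
```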
Link Efficiency Mechanisms
Link efficiency mechanisms are divided into two categories:
- Compression
- Link fragmentation and interleaving (LFI)
Compression involves reducing the size of certain packets in order to increase the available bandwidth and decrease delay. Multiple types of compression exist:
- TCP header compression (compresses the IP and TCP headers, reducing the overhead from 40 bytes to 3 to 5 bytes)
- RTP header compression (compresses the IP, UDP, and RTP headers of voice packets, reducing the overhead to 2 to 4 bytes)
There are three different flavors of LFI used today:
- Multilink PPP with interleaving (used in PPP environments)
- FRF.12 (used with Frame Relay data connections)
- FRF.11 Annex C (used with Voice over Frame Relay – VoFR)
LFI techniques are efficient on slow links where certain problems might appear even after applying congestion management features. These problems are generated by big data packets that arrive at the interface before other smaller, more important packets. If a big packet enters the FIFO TX Ring before a small VoIP packet arrives at the software queue, the VoIP packet will get stuck behind the data packet and might have to wait a long time before its transmission is finished. To solve this problem, LFI splits the large data packet into smaller pieces (fragments) so the voice packets can be interleaved between them, as illustrated in Figure 26.5 below. Using LFI, the voice packets do not have to wait behind the large packet until it is completely transmitted.
Figure 26.5 – Link Fragmentation and Interleaving
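For Multilink PPP with interleaving, a hedged Cisco IOS sketch could resemble the following; the interface numbers, addressing, and fragment delay value are illustrative, and the exact command syntax differs slightly between IOS releases.

```
interface Multilink1
 ip address 192.168.12.1 255.255.255.252
 ppp multilink
 ! Fragment large packets so that no single fragment occupies the link
 ! for more than roughly 10 ms
 ppp multilink fragment delay 10
 ! Allow small packets (e.g., VoIP) to be interleaved between the fragments
 ppp multilink interleave
 ppp multilink group 1
!
interface Serial0/0
 encapsulation ppp
 ppp multilink
 ppp multilink group 1
```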
QoS Design Recommendations for Voice Transport
Network engineers should use general guidelines when designing Quality of Service for voice transport. These QoS techniques are typically applied to WAN connections at T1/E1 speeds or lower. On high-bandwidth connections, QoS techniques are less of a concern because congestion is less likely to occur.
QoS techniques are most effective on bursty connections, typically in Frame Relay environments where committed information rates and burst rates are usually specified in the contract. Traffic bursts occur when large packets are sent over the network or when the network is very busy during certain periods of the day. QoS techniques should be mandatory if the enterprise uses any kind of delay-sensitive applications, for example, applications that must function in real-time (e.g., presentations over the Internet or video training sessions).
Congestion management should be considered only when the network experiences congestion and this should be planned by analyzing the organizational policies and goals and following the network design steps (PDIOO – Plan, Design, Implement, Operate, Optimize). Before applying any QoS configuration, traffic should be carefully analyzed to detect congestion problems. The best QoS mechanism should be chosen by consulting with the implementation team and this can include packet classification and marking, queuing techniques, congestion avoidance techniques (traffic shaping or policing), or bandwidth reservation mechanisms (e.g., RSVP).
Network engineers should also be familiar with the most important QoS mechanisms available for IP telephony:
- cRTP
- LFI
- PQ-WFQ
- LLQ
- Auto QoS
The compressed Real-time Transport Protocol (cRTP) is a compression mechanism that reduces the IP, UDP, and RTP headers’ size from 40 bytes to 2 or 4 bytes. cRTP is configured on a link-by-link basis and the recommendation is to use this technique for links that are lower than 768Kbps.
Note: cRTP should not be configured on devices that have high processor utilization (above 75%).
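A hedged Cisco IOS example of enabling cRTP on a slow serial link follows; the interface and addressing are illustrative, and the command must be enabled on both ends of the link.

```
interface Serial0/0
 encapsulation ppp
 ip address 10.1.1.1 255.255.255.252
 ! Compress the 40-byte IP/UDP/RTP header down to roughly 2-4 bytes
 ip rtp header-compression
```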
LFI is a QoS mechanism used to reduce serialization delay. PQ-WFQ is also referred to as IP RTP Priority. This technique adds a single priority queue to WFQ, which is used for VoIP traffic. All of the other traffic is queued based on the WFQ algorithm. When using PQ-WFQ, the router places VoIP RTP packets in a strict priority queue that is always serviced first.
LLQ, also known as Priority Queuing-Class Based Weighted Fair Queuing (PQ-CBWFQ), also provides a single priority queue but it is preferred over the PQ-WFQ technique because it guarantees bandwidth for different classes of traffic. All voice call traffic is assigned to the priority queue, while VoIP signaling and video traffic is assigned to its own traffic class. For example, FTP can be assigned to the low priority traffic class and all other data traffic can be assigned to a regular class.
Note: QoS techniques should be carefully configured on all the devices involved in voice transport, not just on individual devices.
Some of the best practices to be used in a VoIP environment to ensure optimal performance include the following:
- Separate voice traffic into its own subnets and VLANs
- Use private addressing (RFC 1918) for IP phones, combined with NAT/PAT if necessary
- Strategically place the IP telephony servers on filtered VLANs in the data center
- Use QoS IP precedence or DSCP for packet classification and markings
- Use LLQ on WAN links
- Use LFI on slower WAN links
- Use the CAC (Call Admission Control) mechanism to avoid oversubscribing priority queues; the main goal of CAC is to protect voice traffic from being affected by other voice traffic by rerouting it through alternative network paths or to the PSTN
Analyzing Delay-Sensitive Traffic
When using Multicasting or Web streaming, e-commerce, e-learning solutions, or IP telephony, the traffic involved in this process will be delay-sensitive and QoS techniques might be necessary to ensure that the traffic is treated with priority.
At Layer 3, for example in Frame Relay environments that use EIGRP, OSPF, or BGP as the routing protocol with the ISP, it is very common to use QoS techniques to shape and control traffic at the IP layer. You can also use QoS at Layer 2. When using QoS to analyze or control delay-sensitive traffic at Layer 2, there are four categories of QoS techniques available:
- Tagging and traffic classification
- Congestion control
- Policing and traffic shaping
- Scheduling
Figure 26.6 – Layer 2 QoS Techniques
Examining Figure 26.6 above, tagging and traffic classification takes place between the end-user nodes, through the Access Layer and up to the Distribution Layer. This is where packets are classified, and they can be grouped and partitioned based on different priority levels or Classes of Service. This procedure involves inspecting the Layer 2 frame headers and determining the priority of the traffic based on the type of traffic (e.g., IP telephony, Telnet traffic, printer traffic, file services, or other traffic). The traffic is then tagged and classified, and the Layer 2 frame can be changed depending on the priority.
Note: Tagging and traffic classification is also called traffic marking.
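As a hedged illustration, classification and marking at the Access Layer on a Cisco IOS switch might look like the sketch below; the ACL, class names, port, and DSCP values are assumptions for the example, and some switch platforms also require QoS to be enabled globally first.

```
ip access-list extended VOICE-RTP
 permit udp any any range 16384 32767
!
class-map match-any VOICE-TRAFFIC
 match access-group name VOICE-RTP
!
policy-map MARK-INBOUND
 class VOICE-TRAFFIC
  ! Mark voice traffic with DSCP EF as it enters the network
  set dscp ef
 class class-default
  set dscp default
!
interface GigabitEthernet0/5
 switchport access vlan 10
 ! Classify and mark traffic as close to the source as possible
 service-policy input MARK-INBOUND
```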
The next three techniques (congestion control, policing and traffic shaping, and scheduling) occur at the Distribution Layer (building distribution and edge distribution), primarily on Layer 3 switches. You should avoid applying any QoS techniques at the Core Layer because you want as little overhead as possible on the backbone devices so they can successfully achieve their goal, which is fast connectivity, high availability, and reliability.
Congestion control involves looking at the interfaces of the access switches and the queuing mechanisms configured on them. Queuing techniques should be used to deal with the congestion of packets coming in and going out of the switch ports. This also ensures that traffic from critical applications will be forwarded properly. Congestion control is especially helpful when using real-time applications (e.g., VoIP) that require a reduced amount of jitter and latency so the specific application will function in an optimal way, with as little delay as possible.
Policing and shaping help move important traffic to the top of the list and dynamically adjust the priorities of certain types of traffic during periods of congestion. Scheduling is the process of establishing the order in which the congested queues will be served.
Quality of Service techniques can be used to allow Layer 2 switches to help out in the queuing process by scheduling certain processes. Much of the QoS activity occurs at Layer 3, but many features can also be taken into consideration at Layer 2 and they can be implemented on higher-end switches to provide support for tagging, traffic classification, congestion control, policing/shaping traffic, and traffic scheduling.
Load Balancing
Load balancing is a common method of taking information destined to a single device and distributing it to multiple devices. Load balancing techniques and algorithms were described in Chapter 21.
Load balancing is usually implemented in large clustered environments, where servers are grouped based on their function. It facilitates server maintenance, reduces downtime, and allows load sharing in such environments. If you don’t have a busy server there is no need to use load balancing, but as more users need access to that service, you might consider using server clusters with load balancing instead.
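Load balancing can be implemented in several ways. As one hedged illustration, the IOS SLB feature available on some Cisco platforms can present a single virtual IP address in front of a group of real servers; the addresses and names below are assumptions for the example.

```
ip slb serverfarm WEBFARM
 real 10.1.1.11
  inservice
 real 10.1.1.12
  inservice
!
ip slb vserver WEB-VIP
 ! Clients connect to the virtual address; connections are distributed
 ! across the real servers in the farm
 virtual 10.1.1.100 tcp 80
 serverfarm WEBFARM
 inservice
```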
Load balancing is usually implemented with the Common Address Redundancy Protocol (CARP), which allows you to assign the same IP address to multiple hosts. CARP is an open standard protocol and is similar in functionality to HSRP or VRRP First Hop Redundancy Protocols. CARP offers transparency to the user because if a server fails, the service is still available via other servers with the same IP address.
High Availability and Fault Tolerance
High availability and fault tolerance are two of the most critical technology areas in the networking world and they impact all the other technologies presented in a network.
High availability is often a factor taken into consideration when designing and implementing end-to-end solutions. This ensures redundancy for the network services and for the end-users and is accomplished by making sure that the network devices are reliable and fault tolerant.
You should design some level of high availability solutions into any network module that you want to propose. For example, the Access Layer can have multiple ways of connecting access devices, it can have multiple connections to multiple devices in the Distribution Layer, or the Distribution Layer can have some redundancy when connecting to the Core Layer.
Redundancy is a typical design element and high availability is a function of two parameters:
- The budget available
- How mission critical a particular service or application is to the business goals of the organization
There are many redundancy options that can be utilized in different modules of modern networks:
- Workstation-to-router redundancy at the Access Layer
- Server redundancy in the Server Farm module
- Route redundancy
- Media redundancy at the Access Layer
Each of these areas will be covered in detail in the following sections.
Workstation-to-Router Redundancy
The most important topic in the list above is workstation-to-router redundancy, because it is very important that the Access Layer devices maintain their default gateway information. Modern networks usually follow the 20/80 model, in which only 20% of the destinations are local and 80% of the traffic passes through the default gateway toward remote resources. This means that making the default gateway highly available is critical.
Workstation-to-router redundancy can be accomplished in multiple ways, including:
- Proxy ARP on routers
- Explicit configuration
- IRDP
- RIP
- HSRP
- VRRP
- GLBP
Proxy ARP involves a workstation that has no default gateway configured. To communicate with a remote host, the workstation sends an ARP request for the address of that host; a router that hears the request realizes it can reach the destination and responds on behalf of the remote host using Proxy ARP. The router effectively pretends to be the remote host, so the workstation sends traffic destined to that host to the router.
Explicit configuration is the most common way of accomplishing workstation-to-router redundancy because some of the operating systems allow multiple default gateways to be configured. The problem with this is the latency that appears while the device figures out that one gateway is down and switches to another gateway. Another problem with the explicit configuration of multiple default gateways is that not all operating systems support this feature.
The optimal solution is one that can be used by every host in the infrastructure and, if possible, in a transparent way (i.e., configure network devices only). Some routers can run the ICMP Router Discovery Protocol (IRDP). If both routers and hosts can run IRDP, this is another option for hosts to dynamically discover an available default gateway. RIP is another protocol that can be used in this scenario, but this also means that all the hosts must know this technology and it implies some configuration on them.
The preferred solution is a technology that does not place any burden on the hosts and that is completely transparent to them, for example, the hosts just need to configure a single default gateway because the entire redundancy configuration is made on the routers. The protocols that can be used to accomplish this are generically called First Hop Redundancy Protocols (FHRPs) and they include the following:
- Hot Standby Router Protocol (HSRP)
- Virtual Router Redundancy Protocol (VRRP)
- Gateway Load Balancing Protocol (GLBP)
HSRP is a Cisco proprietary protocol that inspired the IETF to create the open standard protocol VRRP (their functionality is almost identical). GLBP is the most recent protocol, again a Cisco invention, and it is the most sophisticated FHRP.
HSRP
Analyzing Figure 26.7 below, the network has two gateway routers that connect to one Layer 2 switch that aggregates the network hosts:
Figure 26.7 – Hot Standby Router Protocol
Router 1 has one potential default gateway address (10.10.10.1) and Router 2 has another potential default gateway address (10.10.10.2). The two routers are configured in an HSRP group and they present to the clients a virtual default gateway address of 10.10.10.3. This host address is considered the default gateway address, although it is not assigned to any router interface because it is a virtual address.
One of the two routers is the active device (Router 1 in this example) and it is the one that is actually forwarding traffic for the 10.10.10.3 virtual address. Router 2 is the standby HSRP device. The two routers exchange HSRP messages in order to check on one another’s health status. For instance, if Router 2 no longer hears from Router 1, it realizes that Router 1 is down and it will take over as the active HSRP device.
These devices transparently provide access for the clients, as they are transparently serving up the virtual default gateway address.
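Using the addresses from Figure 26.7, a minimal HSRP sketch on Cisco IOS could look like the following; the group number and priority values are illustrative.

```
! Router 1 (active)
interface GigabitEthernet0/0
 ip address 10.10.10.1 255.255.255.0
 standby 1 ip 10.10.10.3
 ! The higher priority makes this router the active gateway
 standby 1 priority 110
 standby 1 preempt
!
! Router 2 (standby)
interface GigabitEthernet0/0
 ip address 10.10.10.2 255.255.255.0
 standby 1 ip 10.10.10.3
```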
VRRP
VRRP works in a way that is similar to HSRP. The differences are that the two routers are configured in a VRRP group and one of them is the master device (instead of the active router) and performs all the forwarding, while the other one is the backup device (instead of the standby router), as illustrated in Figure 26.8 below:
Figure 26.8 – Virtual Router Redundancy Protocol
As was the case for HSRP, the VRRP group presents a virtual IP address to the clients. An interesting aspect about VRRP is that you can utilize as the virtual IP address the same address that is on the master device (this is very useful if you are using public IP addresses). In this case, the virtual address is configured as 10.10.10.1, which is identical to the address on the Router 1 interface.
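A hedged VRRP equivalent, reusing the Router 1 interface address as the virtual address as described above, might be:

```
! Router 1 (master, owns the virtual address)
interface GigabitEthernet0/0
 ip address 10.10.10.1 255.255.255.0
 vrrp 1 ip 10.10.10.1
!
! Router 2 (backup)
interface GigabitEthernet0/0
 ip address 10.10.10.2 255.255.255.0
 vrrp 1 ip 10.10.10.1
 ! Lower priority than the address owner, so it forwards only if Router 1 fails
 vrrp 1 priority 90
```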
GLBP
GLBP is the most unique of the FHRPs. With GLBP, not only do you have the ability to achieve redundancy but also you have the ability to perform load balancing and it is much easier to use more than two devices. This scenario is depicted in Figure 26.9 below:
Figure 26.9 – Gateway Load Balancing Protocol
In Figure 26.9 above, there are three routers configured in a GLBP group, which is assigned a virtual default gateway address (10.10.10.4) that is also configured on the clients. One of the devices (Router 1 in this example) is elected as the AVG (Active Virtual Gateway) and the other devices are AVFs (Active Virtual Forwarders).
When the hosts send ARP requests for the 10.10.10.4 address, the AVG responds to the ARP requests and round-robins the replies using the virtual MAC addresses of the AVF devices. Router 1 responds to the first ARP request it receives with its own virtual MAC address, then it responds to the second ARP request with Router 2’s virtual MAC address, and then with Router 3’s virtual MAC address. In other words, the AVG round-robins the traffic over the available AVF devices. This simple round-robin approach can be changed in the configuration to other GLBP load balancing methods.
Note: The AVG can also function as an AVF and it usually does so.
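A minimal GLBP sketch for the topology in Figure 26.9 follows; the group number, priorities, and load-balancing method are illustrative.

```
! Router 1 (highest priority, elected as the AVG)
interface GigabitEthernet0/0
 ip address 10.10.10.1 255.255.255.0
 glbp 1 ip 10.10.10.4
 glbp 1 priority 110
 glbp 1 preempt
 ! Hand out the AVF virtual MAC addresses in round-robin fashion
 glbp 1 load-balancing round-robin
!
! Routers 2 and 3 (AVFs) join the same group with the same virtual IP
interface GigabitEthernet0/0
 ip address 10.10.10.2 255.255.255.0
 glbp 1 ip 10.10.10.4
```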
Server Redundancy
Server-based redundancy technologies can be implemented in server farms or data centers. This is often needed to ensure high availability for key server functions, like file or application sharing. One way to achieve this is to mirror multiple servers so in case of a failure of one server you can dynamically fail over to another server.
Another example of redundancy technology is VoIP, which has a critical component called the PBX (or Call Manager in Cisco networks). Because PBXs are so critical for routing call traffic, you will typically configure them in a cluster with the same idea as in server redundancy: if one device fails, the other device starts servicing the VoIP clients transparently.
Route Redundancy
If you want to configure redundancy between campus infrastructures across a WAN, you can achieve this by implementing load balancing at the routing protocol level.
Figure 26.10 – WAN Route Redundancy
This increases availability because in the case of a direct path failure between Site 1 and Site 2 in Figure 26.10 above, the two sites can still reach each other by going through another location (Site 1 > Site 3 > Site 2).
Although it features high availability functionality, the scenario presented above has a downside regarding its full-mesh connectivity. This implies configuration overhead and a high cost because of the number of circuits that need to be created. This is one of the reasons full-mesh topologies are not implemented very often.
To calculate the necessary number of circuits for a full-mesh architecture, you can use the n*(n-1)/2 formula, where “n” equals the number of sites (nodes). In Figure 26.10 above, there are four sites, so the number of connections equals 4*3/2 for a total of six circuits.
Details concerning the specific capabilities of high availability and load balancing for each particular protocol will not be covered in this chapter or in the remainder of the manual.
Media Redundancy
Another type of redundancy is media redundancy, which occurs in places where devices with multiple paths are connected. This is always useful in case one of the links fails. Media redundancy demands the configuration of Spanning Tree Protocol at Layer 2 to avoid loops that can bring the network down.
In a WAN environment, there might be floating static routes, meaning static routes that point to a backup path, as illustrated in Figure 26.11 below:
Figure 26.11 – Floating Static Routes
Analyzing Figure 26.11 above, there are two sites (Site 1 and Site 2) connected via a direct WAN circuit and via a slower backup path that involves more hops (via Site 3). The traffic from Site 1 to Site 2 should use the direct circuit (the primary path); in this example, that route is learned via EIGRP and has an AD (administrative distance) of 90.
To achieve redundancy over the backup path, you can create a static route on Site 1 pointing to Site 3. To make this a floating static route, you need to intentionally set its administrative distance higher than the AD of the main path. In this example, the floating static route is AD 91 so it will be less preferred than the EIGRP route. This means that the static route will not be used unless the EIGRP route goes away. In other words, the floating route will not be used unless there is a loss of the primary route.
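A hedged configuration sketch for the Site 1 router follows; the destination prefix and next-hop address are illustrative.

```
! Primary path: learned dynamically via EIGRP (AD 90), no static configuration needed
! Backup path: floating static route toward Site 3 with AD 91,
! installed in the routing table only if the EIGRP route disappears
ip route 10.2.0.0 255.255.0.0 172.16.13.3 91
```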
The floating static route is a technique often used in WAN scenarios to provide circuit redundancy. Another technology used to achieve media redundancy is EtherChannel. This is a Layer 2 or Layer 3 logical bundling or channel aggregation technique that can be used between switches. The bundled links can appear as one single link on the specific devices, as shown in Figure 26.12 below:
Figure 26.12 – EtherChannel Example
EtherChannels are important when using the Spanning Tree Protocol because when all of the links look like one link, STP will not consider them a possible loop threat and will not shut down any link from that specific bundle. Therefore, the links in an EtherChannel can be dynamically load balanced without STP interfering in this situation.
As explained previously in Chapter 19, EtherChannels come in two forms:
- PAgP (Port Aggregation Protocol): a Cisco proprietary protocol
- LACP (Link Aggregation Control Protocol): an open standard protocol
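A minimal LACP EtherChannel sketch on a Cisco IOS switch could look like the following; the interface numbers and trunking settings are illustrative.

```
interface range GigabitEthernet0/1 - 2
 ! "mode active" negotiates the bundle with LACP; "mode desirable" would use PAgP
 channel-group 1 mode active
!
interface Port-channel1
 switchport mode trunk
```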
Channel aggregation techniques can also be used in WANs. An example in this regard is the Multilink PPP (MPPP) technology.
Caching Engines
In large environments there is the challenge of providing all users with quick access to internal and external resources over the network. As the number of users who try to access network resources grows, so does the challenge of providing proper services to each of them.
Cache engines receive a request from the user, forward it to the Internet (or intranet), receive the content, and forward it to the user while storing a copy. When another user asks for the same content, the cache engine can immediately respond with the stored copy without the need to request it from an external server, resulting in optimized communication. Cache engines can come in many forms:
- Dedicated appliances
- Integrated in existing network devices (switches, routers, etc.)
- Software applications
Note: Caching functionality is usually implemented in proxy servers, as they act as a middle-man in the network.
Caching usually works just fine with static information, but it may not be very effective with information that keeps changing, like dynamic Web pages or streaming media. Such dynamic content is increasingly common in modern organizations, so caching can become challenging to implement.
One of the most commonly used techniques for implementing caching is the open source product called Squid. It is basically a proxy application that also offers many caching options.
Details about proxy server functionality can be found in Chapter 21.
Summary
As technology evolves, networks increase in complexity and have to support more and more applications, so you have to make sure that you monitor and optimize the performance of your environment. One of the major reasons for this is to keep every application and network service running and available. Network uptime is often the number one priority in an organization, because users expect everything to work properly when they access different types of data and applications across the network and over the Internet.
Some of the most critical applications that require peak network performance include:
- Transactional applications (banking, insurance, stocks, etc.)
- Healthcare applications (imaging solutions, hospital applications, etc.)
- Multimedia applications (voice and video)
Network optimization may include taking care of several network parameters:
- Bandwidth
- Latency
- Jitter (variable latency)
Quality of Service (QoS) is a key aspect that needs to be considered when designing and implementing enterprise campus solutions to ensure performance optimization across the network. There are many different categories of the QoS approach, including:
- Shaping
- Policing
- Congestion management
- Congestion avoidance
- Link efficiency mechanisms
There are many redundancy options that can be utilized in different modules of modern networks:
- Workstation-to-router redundancy at the Access Layer
- Server redundancy in the Server Farm module
- Route redundancy
- Media redundancy at the Access Layer
Cache engines receive a request from the user, forward it to the Internet (or intranet), receive the content, and forward it to the user while storing a copy. When another user asks for the same content, the cache engine can immediately respond with the stored copy without the need to request it from an external server.
Configure QoS in our 101 Labs – CompTIA Network+ book.