Visual UpTime™:
A Core Competence Product Evaluation

Case Study: Large-scale Frame Relay Enterprise Network

The subject company provides information services to several hundred broker locations nationally over a router-based network composed of a number of information distribution centers served by T1 Frame Relay. The majority of broker services customers have 56 Kbps or 64 Kbps Frame Relay circuits. Backbone links between information distribution centers are dedicated T1 access lines. The primary application for this WAN is a UDP-based broadcast of stock exchange "ticker" information from distribution centers to brokerage locations.

Why Visual UpTime™?

The subject company acquired the Visual UpTime™ WAN Service Level Management System to improve its time to respond to Frame Relay service outages and abnormalities. Monitoring tools used by operations staff usually identified link failures, but staff could not identify whether the failure occurred in the Frame Relay network or at FR terminating equipment. Service restoration following a failure was often hampered by the all-too-familiar finger-pointing between a customer and a service provider.

The subject company also needed detailed performance and event correlation data to deal proactively with latency and data loss and to identify changes in information flows. They were reluctant to assign skilled staff away from daily operations to program their existing systems to collect and process performance data and generate reports for analysis, and they were unwilling to burden WAN circuits and routers with the considerable polling required to collect performance data.

Daily Operations and the roles for Visual UpTime™

The subject company runs a sophisticated network operations center (NOC). A commercial SNMP NMS is used to collect routing information, router performance statistics, and router configurations. The same application the subject company provides to brokerage locations also runs on client computers located in the NOC, to allow NOC staff to observe the same broadcast delivered to customers. A trouble ticketing system, a T1 circuit management system and LAN analysis equipment are also used.

The subject company uses Visual UpTime™ to complement these traditional NOC tools. The primary daily applications for Visual UpTime™ in the NOC are real-time event monitoring and troubleshooting.

Visual UpTime™ and Trouble Sectionalization

NOC staff watch the Event Processing window to monitor access line status. The ops staff indicate that the added dimension Visual UpTime™ brings to daily alarm monitoring is the timely and accurate isolation and identification of the network component that is in trouble or has failed.

With Visual UpTime™ added to existing daily operations, Frame Relay is no longer an opaque "cloud". All the individual components of each Frame Relay connection (router, serial interface, DSU/CSU, telco DDS or T1 facility) are visible and can be managed individually. Armed with the ability to view components rather than a cloud, ops staff now consult the Visual UpTime™ Event Processor first to determine the root cause of access line trouble or failure.

Once an event is noted, the NOC staff consult the Troubleshooting window to view signal and error levels to ascertain whether the access line is OK, degraded, or failed. The subject company accurately identifies the troubled access line to the telco using telco circuit IDs (configured in the database). Location information is also readily available to identify the customer(s) impacted by each event. The ability to isolate a network fault and quickly associate it with readily identifiable, externally visible WAN services and facilities is what distinguishes service level management from network management.

In our opinion, the company will better understand traffic behavior over its WAN circuits when it begins to monitor additional threshold events: Channel and PVC throughput and utilization. Monitoring these events will provide early indications that high-priority circuits are in danger of becoming oversubscribed, and the monitoring itself will not interfere with application traffic. ASEs collect and analyze utilization statistics, and only event notifications are transmitted to the PAM or MIC. SNMP NMSs do not poll network elements, so processor cycles of network elements are not constantly diverted from packet switching to management processing. The ability to monitor performance and provide event notifications in a non-intrusive, low-overhead manner is a unique feature of the Visual UpTime™ product.
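To make the idea of a utilization threshold event concrete, here is a minimal Python sketch. It is not Visual UpTime™ code: the function names, the sample layout, and the 80 percent threshold are all assumptions chosen for illustration. It only shows the arithmetic of turning an octet count over a measurement interval into percent utilization of a circuit and reporting the intervals that cross a threshold.

```python
# Illustrative sketch only; names and the 80% default are hypothetical,
# not taken from the Visual UpTime(TM) product.

def utilization_percent(octets, interval_seconds, line_rate_bps):
    """Percent of line capacity used during one measurement interval."""
    if interval_seconds <= 0 or line_rate_bps <= 0:
        raise ValueError("interval and line rate must be positive")
    bits_sent = octets * 8
    capacity_bits = interval_seconds * line_rate_bps
    return 100.0 * bits_sent / capacity_bits


def threshold_events(samples, line_rate_bps, threshold_percent=80.0):
    """Return (label, utilization) for every sample at or above the threshold.

    samples is a list of (label, octets, interval_seconds) tuples.
    """
    events = []
    for label, octets, interval_seconds in samples:
        util = utilization_percent(octets, interval_seconds, line_rate_bps)
        if util >= threshold_percent:
            events.append((label, round(util, 1)))
    return events


# Example: one-minute samples on a 56 Kbps circuit.
samples = [
    ("09:30", 210_000, 60),  # about half of capacity
    ("09:31", 378_000, 60),  # close to capacity
]
events = threshold_events(samples, line_rate_bps=56_000)
```

Only the samples that cross the threshold produce an entry in `events`, which mirrors the point made above: the statistics are analyzed where they are collected, and only the notification needs to travel.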

Troubleshooting Frame Relay using Visual UpTime™ Toolsets

When a troubled access line has been identified using the Event Processor, the ops staff use the Troubleshooting toolset to identify the trouble source. They use Access Line statistics and summary graphical displays from this toolset to help determine exactly what is wrong with an access line. The ops staff can determine the status of DDS and T1 access lines. Graphs depicting the percentage of errored to normal seconds over time are used to conclude whether a certain error condition is transient or has been occurring frequently enough to warrant further investigation.
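The transient-versus-recurring judgment described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the product's algorithm: the function names, the 1 percent errored-second ratio, and the rule of three bad intervals are hypothetical values picked to show the reasoning an operator applies when reading such a graph.

```python
# Illustrative sketch only; thresholds and names are hypothetical.

def errored_second_percent(errored_seconds, total_seconds):
    """Percentage of seconds in an interval that contained errors."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * errored_seconds / total_seconds


def classify_error_history(intervals, percent_threshold=1.0, min_bad_intervals=3):
    """Classify a series of (errored_seconds, total_seconds) intervals.

    Returns "clean" if no interval reaches the threshold, "persistent" if
    at least min_bad_intervals do, and "transient" otherwise.
    """
    bad = sum(
        1
        for errored, total in intervals
        if errored_second_percent(errored, total) >= percent_threshold
    )
    if bad == 0:
        return "clean"
    if bad >= min_bad_intervals:
        return "persistent"
    return "transient"


# Four 15-minute (900-second) intervals for two hypothetical access lines.
line_a = [(0, 900), (45, 900), (0, 900), (0, 900)]    # one bad interval
line_b = [(20, 900), (30, 900), (45, 900), (0, 900)]  # three bad intervals
status_a = classify_error_history(line_a)
status_b = classify_error_history(line_b)
```

With these assumed thresholds, a single burst of errors is labeled transient, while errors that recur across several intervals are flagged for further investigation.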

When the ops staff place a call to the telco, they identify the troubled access line using customer data and circuit failure terminology the telco understands. This is a win-win situation for all parties: when Frame Relay users provide complete circuit and service identification and a detailed diagnosis of a service failure to the FR provider, they typically reduce the time required to restore service.

Real-time traffic analysis over Frame Relay

In large-scale data networks, there are times when all network components appear to be operational, but traffic flows are not proceeding as expected. In a LAN environment, such anomalies are routinely resolved using packet or traffic analysis. Visual UpTime™'s Traffic Capture toolset gives daily ops staff the ability to examine the traffic emanating from and arriving across any Frame Relay access line. The features of the Traffic Capture toolset compete with the best of LAN analyzers, and are accessible in a distributed manner. In our opinion, ASE traffic collection is more likely to yield useful snapshot data because the data collected are closer to being instantaneous than data collected via remote polling.

The benefits of such a tool are obvious. Misconfigured equipment connected to WAN services consumes bandwidth unnecessarily or disrupts service. The sooner the ops staff isolate the problem, the sooner expected performance levels can be restored. By providing centralized access to traffic details at every Frame Relay access line, the Traffic Capture toolset reduces the number of occasions on which staff must be dispatched to a remote location to diagnose a problem.
