Configuration and Network Change Management
NetQoS Performance Experts
Often the hardest part of change is understanding and documenting its impact. Once you understand the impact, the change becomes easier to handle and manage. With historical reporting and evaluation systems, you can quantify environmental changes and their respective impacts after a change occurs. This matters because planned and unplanned changes alike frequently produce unexpected outcomes.
Performance reports let you validate that a change actually improved performance levels. In Figure 1, network latency is choppy and inconsistent for a particular business-critical application with remote users. After a QoS rate-limiting policy is applied to the large FTP transfers that compete with the business-critical application for resources and priority, network latency improves to an average of less than a second and becomes more consistent.
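The before-and-after comparison described above can be sketched in a few lines. This is a minimal illustration, not the reporting product's own calculation: the function names and the sample values are hypothetical, and standard deviation is assumed here as a simple stand-in for how "choppy" the latency is.

```python
from statistics import mean, pstdev


def summarize(samples_ms):
    """Return (mean, population standard deviation) of latency samples in ms."""
    return mean(samples_ms), pstdev(samples_ms)


def compare_change(before_ms, after_ms):
    """Compare latency samples collected before and after a change."""
    b_mean, b_sd = summarize(before_ms)
    a_mean, a_sd = summarize(after_ms)
    return {
        "mean_before_ms": b_mean,
        "mean_after_ms": a_mean,
        # Negative values mean latency went down after the change.
        "mean_change_pct": (a_mean - b_mean) / b_mean * 100.0,
        # A lower standard deviation means more consistent latency.
        "stdev_before_ms": b_sd,
        "stdev_after_ms": a_sd,
    }


# Hypothetical samples (milliseconds), invented purely for illustration.
before = [1200, 400, 2500, 900, 3000]
after = [700, 650, 720, 680, 750]

report = compare_change(before, after)
for key, value in report.items():
    print(f"{key}: {value:.2f}")
```

Reporting the spread alongside the mean matters because a change such as rate limiting may improve consistency even when the average moves only slightly.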

On the server side, response times for a business-critical application server are very high, averaging 1.5 seconds. After the server's CPU and memory capacity were evaluated, a change was implemented to improve caching on the server. When the server was rebooted (the large dip in data shown in Figure 2), response times dropped dramatically to about half their previous levels. Response time trends over time allow users to quantify the impact of change and ensure that upgrades or changes actually improve performance levels for applications, servers, and networks.

With application rollouts, users are typically added gradually to ensure that the increased load does not degrade overall performance for the newly added users. As shown in Figure 3, additional users start using a human resources application over a weekend, as illustrated by the increased number of TCP sessions from users to the server. Monitoring the number of sessions helps establish how many users are connecting and how many sessions they initiate to a server or group of servers.

In the next application rollout example, Figure 4 illustrates that while the new users start appearing, performance levels remain the same. For this application rollout, the new users did not affect the application's performance levels.

In another example of change management, the newly rolled-out application shows degraded performance as the number of users grows. As shown in Figure 5, the number of users increased from 45 to 70 over the weekend. The users graph calculates the number of users or subnets connecting into a server or application pair.
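As a rough sketch of the kind of count a users graph represents, the following tallies distinct client addresses and distinct client subnets per server from a list of observed sessions. The record layout, the function name, the /24 subnet size, and the addresses are all assumptions made for illustration; the monitoring product's actual method is not described here.

```python
import ipaddress
from collections import defaultdict


def count_clients(sessions, prefix_len=24):
    """Count distinct clients and client subnets per server.

    sessions: iterable of (client_ip, server_ip) string pairs.
    Returns {server_ip: (distinct_clients, distinct_subnets)}.
    """
    clients = defaultdict(set)
    subnets = defaultdict(set)
    for client_ip, server_ip in sessions:
        clients[server_ip].add(client_ip)
        # strict=False lets a host address be mapped to its enclosing network.
        network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
        subnets[server_ip].add(network)
    return {server: (len(clients[server]), len(subnets[server])) for server in clients}


# Hypothetical session records, invented for illustration.
sessions = [
    ("10.1.1.5", "192.0.2.10"),
    ("10.1.1.6", "192.0.2.10"),
    ("10.1.1.5", "192.0.2.10"),  # repeat session from the same client
    ("10.2.7.9", "192.0.2.10"),
]

counts = count_clients(sessions)
print(counts)
```

Counting subnets as well as individual addresses helps show whether new users come from an existing site or from a new location.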
Data volume increased along with the user count. Figure 6 shows the data volumes to and from the server for the users, which rise in step with the number of users.

In terms of response times and the effect on end-user interaction, the increase in users also corresponds to a 40% increase in overall transaction times. As illustrated in Figure 7, the increase comes primarily from network latency for the new users coming into the infrastructure.
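To make the arithmetic concrete, the sketch below computes the overall percentage change in transaction time and each component's share of that change. The component names and every number are hypothetical, chosen only so that the total rises by 40% with network latency contributing most of the increase; they are not taken from the figures.

```python
def breakdown(before, after):
    """Return (overall percent change, each component's share of the change).

    before, after: dicts mapping component name to average time in ms.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    delta = total_after - total_before
    overall_pct = delta / total_before * 100.0
    if delta == 0:
        return overall_pct, {}
    shares = {
        name: (after[name] - before[name]) / delta * 100.0
        for name in before
    }
    return overall_pct, shares


# Hypothetical per-transaction averages in milliseconds.
before = {"network": 200, "server": 500, "transfer": 300}
after = {"network": 580, "server": 510, "transfer": 310}

overall_pct, shares = breakdown(before, after)
print(f"overall change: {overall_pct:.1f}%")
for name, share in shares.items():
    print(f"{name}: {share:.1f}% of the increase")
```

Splitting the increase by component is what lets you attribute a slowdown to the network rather than to the server or the application.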

For the newly added users, the network latency itself increased. The new users are at a remote site on higher-latency links, which causes the increase in response times. As shown in Figure 8, the larger network latencies for the application are caused by the additional users on the longer-latency links.

Each time you introduce a change, there is an inherent risk that the change will do nothing, or even do harm, rather than improve performance as intended. To that end, you must have sufficient instrumentation in place before you make a change so that you can observe and precisely understand its impact. As the examples show, this lets you mitigate the risks if something goes wrong or the change has unintended consequences.