nGeniusONE Service Assurance Platform Overview Demo

The NETSCOUT nGeniusONE business assurance solution monitors the performance and end-user experience of business-critical enterprise applications. The visibility and workflows it provides help IT quickly identify, triage, and resolve issues in the complex infrastructure behind today’s digital transformation initiatives.

For this demonstration, we’ll use a typical Fortune 100 company with thousands of employees and tens of billions of dollars in annual revenue. As one can imagine, its environment includes applications hosted in various data centers, applications hosted in the newly deployed software-defined private cloud, and applications hosted off premises in the public cloud. Some of these applications are used internally from dozens of remote sites; others are externally facing and used by customers via the internet and mobile devices.

Let’s start the demo. nGeniusONE features a global dashboard that shows the real-time and historical performance of all applications and infrastructure devices. With a global operation spanning hundreds of cities across all continents, nGeniusONE provides visibility of all applications and users in a single mobile dashboard. For example, we can look at performance by geography and keep track of key performance indicators, or KPIs, for our apps in each global region. In the US East region shown here, about 6% of our transactions have failures. Drilling down, we establish that the failures are mostly attributable to New York, with 10% failures. Drilling down further, we discover that most of these issues are related to Citrix VDI, where, out of 8,000 transactions in the last hour, close to a quarter (22%) are failing. Of course, we can select any timeframe of interest in the past to see how the apps were behaving at a specific historical time, and whether this problem has just started or has been going on for a long time.

Drilling back out, we can take a different perspective and examine each enterprise application individually, regardless of geography: for example, Exchange, SharePoint, Oracle E-Business Suite, Oracle CRM, and Citrix for this enterprise. This dashboard reveals a number of applications suffering, including Microsoft Exchange, with 17% of transactions not meeting the SLA.
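To make the drill-down from region to city to application concrete, here is a minimal Python sketch of the idea. It is not nGeniusONE code or its API: the record layout, the `drill_down` function, and the sample counts are all hypothetical, and the resulting percentages are illustrative rather than the figures quoted in the demo.

```python
from collections import defaultdict

def drill_down(records, prefix=()):
    """Failure rate per child one level below `prefix`.

    `records` is an iterable of (path, ok) pairs, where `path` is a
    (region, city, application) tuple and `ok` is True for a
    successful transaction. This layout is an assumption.
    """
    depth = len(prefix)
    totals = defaultdict(int)
    failed = defaultdict(int)
    for path, ok in records:
        # Skip records outside the branch we are drilling into.
        if path[:depth] != prefix or len(path) <= depth:
            continue
        child = path[depth]
        totals[child] += 1
        if not ok:
            failed[child] += 1
    return {child: failed[child] / totals[child] for child in totals}

# Hypothetical sample data, not taken from the demo.
records = (
    [(("US East", "New York", "Citrix VDI"), False)] * 2
    + [(("US East", "New York", "Citrix VDI"), True)] * 6
    + [(("US East", "New York", "Exchange"), True)] * 12
    + [(("US East", "Boston", "Exchange"), True)] * 20
)

by_region = drill_down(records)                        # all regions
by_city = drill_down(records, ("US East",))            # cities in US East
by_app = drill_down(records, ("US East", "New York"))  # apps in New York
```

Each call narrows the scope by one level, which mirrors how a modest regional failure rate can turn out to be concentrated in one city and one application.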
As we open up Exchange, we can see all the components that make up our Exchange email service, and we can establish that most of the issues involve Active Directory, with 30% of authentication transactions failing. By focusing on the high-level performance of the application and then drilling down into the tiers as needed, we can quickly identify which components are causing problems. nGeniusONE is unique in its ability to scale to hundreds of locations, thousands of applications, and millions of end-user transactions, all tracked in real time and aggregated historically into a single cohesive and easy-to-navigate global dashboard.

Now let’s troubleshoot the specific root causes of the issues we’ve seen. In this example, users and executives are complaining about the performance of the business-critical CRM application, especially since a software upgrade was applied a few days ago. They complain that some sites are slow and that other users aren’t able to complete transactions because of system errors. We’ll start at the dashboard and open up the CRM application view. The web component, which is how users interact with the application via their browsers, has 5% of transactions resulting in failures. We’ll change to the over-time view to examine performance by remote site and see which sites are most problematic. The top row shows the response time by site. For example, we can see that most sites experience around 50 milliseconds, which is acceptable. However, Singapore users, second from the right, see a much higher response time of up to one and a half seconds, which could explain the complaints about slow application performance. The bottom row shows us whether transactions are completing. Transactions from most sites are completing successfully. However, London has a high failure rate.
About half of all London transactions are red, which means they are not completing and are returning application errors to the users. This maps to the complaint that some users are unable to complete their CRM transactions successfully. By expanding the view, we can establish that this is an ongoing problem that has persisted throughout the last 24 hours.

Drilling down into the service monitor for London’s transactions, we can find out more about what’s going on. The service monitor presents information about CRM transactions from London, including network and end-user experience KPIs. Focusing on the graph in the bottom right, error code distribution over time, we discover the transactions with errors and the nature of those errors. In this case, the CRM server is returning the error 507: Insufficient Storage. We can conclude that London users are having application issues, not network issues, and we even know the specific error code to present as evidence to the CRM DevOps teams.

If we need to provide more evidence, the session analysis view provides visibility of each and every London transaction that had issues. We see the exact time at which it occurred, the client and server IPs involved, the web services URL, the response time, the return code, and dozens of other key data points about each user transaction. nGeniusONE is unique in its ability to passively collect data for millions of transactions every minute and store indexed information about each transaction. This allows IT to easily search for and find any transaction that occurred, even weeks or months ago. Packets are also stored and can be retrieved for any transaction with one or two clicks. The packet analysis view presents protocol headers and payload values, but in reality, the packets were already analyzed by nGeniusONE, which has already summarized the reason for the failure, so taking the time to review packets won’t enhance the conclusion.
nGeniusONE already did that for us. By using the dashboard and service monitor, we know why transactions from London are failing, and we have the specific transactions and error codes that we can bring to the DevOps teams that manage the application code.

nGeniusONE provides additional views to help us answer a few more performance-related questions about our enterprise. For example, if users from all sites are leveraging the private cloud or SDN data center to use the CRM application, then why are only users from London getting the errors? They’re all accessing the same load balancers and servers, so logically, we should see similar errors on transactions from all sites. nGeniusONE service dependency views provide automated, dynamic views visualizing how users and servers interact, based on real-life interactions in the application server farms. To take a look at what’s going on in the CRM application, let’s launch service dependency. This map details exactly how the application works and which tiers interact: the various client locations talking to the web, app, and database servers in the CRM map, plus other dependencies. This view is true discovery, built from the traffic nGeniusONE sees in real time and historically. It’s therefore possible to compare how the application runtime architecture differs before and after an application upgrade or any other event of interest.

In this case, we’re using the diagram as a real-time dashboard with key performance indicators such as transactions, latency or response time, and errors. Viewing transactions processed, we note that Oracle RAC1 processed three times the volume of transactions that RAC2 did. This imbalance should be looked at by the DB and DevOps teams, as we would expect the load to be similar across the cluster. Changing the KPI to response time, we see that the Oracle G2-2 server is much slower than its peer database servers: 833 milliseconds versus 11 and 9 milliseconds.
Again, this is something the DB and DevOps teams should take a look at, as we expect the database servers to have similar response time KPIs. Let’s not forget that all these automated analytics and KPIs are collected entirely passively from wire data, with no agents or active polling of servers involved. nGeniusONE understands transactions between tiers at a protocol and business level.

Anyway, back to our original question: why are London users the only ones with errors? If we look at the right-hand side of the map, we see that London users are not using the common app infrastructure, but instead are executing transactions against a lone server. This explains why only London was having problems. In this specific case, users with London profiles were mistakenly pointed towards a QA server as part of the last code upgrade. Because of the automated service dependency views generated by nGeniusONE, we know the exact runtime architecture of the application and can identify issues that escape traditional monitoring, including cases where “as planned” does not turn out to be “as built.”

This concludes this demonstration of the NETSCOUT nGeniusONE business assurance solution. We’ve shown the power of the nGeniusONE global dashboard to manage the complexity of multi-tier applications overlaid onto the typical enterprise infrastructure of today. We’ve shown how service monitors help diagnose specific issues and provide the evidence IT needs to work with DevOps to resolve application issues quickly. Finally, we’ve demonstrated how service dependency helps IT visualize, in real time, the real deployment of applications and exactly how they’re interacting on the infrastructure. Thank you for watching.
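The peer comparisons in the service dependency walkthrough (one database server at 833 milliseconds against peers at 11 and 9, and one cluster node handling three times the transactions of the other) can be approximated with a simple rule. The sketch below is not how nGeniusONE computes anything. The `flag_outliers` function, the server names, the thresholds, and the transaction counts are assumptions for illustration; only the three response times come from the transcript.

```python
from statistics import median

def flag_outliers(kpi_by_server, factor=3.0):
    """Return servers whose KPI exceeds `factor` times the median of their peers.

    `kpi_by_server` maps a server name to a numeric KPI such as
    response time in milliseconds. The threshold is arbitrary.
    """
    flagged = []
    for name, value in kpi_by_server.items():
        peers = [v for n, v in kpi_by_server.items() if n != name]
        if peers and value > factor * median(peers):
            flagged.append(name)
    return flagged

# Response times from the demo; the server names are placeholders.
response_ms = {"db-a": 833, "db-b": 11, "db-c": 9}
slow = flag_outliers(response_ms)

# Hypothetical transaction counts reflecting the 3:1 imbalance.
transactions = {"RAC1": 3000, "RAC2": 1000}
overloaded = flag_outliers(transactions, factor=2.0)
```

Comparing each server against the median of its peers, rather than the mean of the whole group, keeps a single extreme value from masking itself.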
