Over the last couple of years, the performance of web applications has become more important to businesses as search engines (such as Google) factor performance into their rankings. This ultimately leads to Performance == Better Visibility == More Users == More Revenue. Read more on Why Web Performance Optimization is as Important as SEO.
The best way to measure the performance of your website is by looking at certain Key Performance Indicators (KPIs) that tell you how fast or slow your website is for the end user. Driven by efforts from web performance specialists such as Steve Souders and companies like Google and Yahoo, the industry has learned that factors such as page load time, number of network roundtrips and transferred size are important performance indicators for a web page.
There are 3 interesting phases of a web site from an end-user performance perspective. The dynaTrace AJAX Edition visualizes the page lifecycle in the TimeLine View where we can highlight First Impression, onLoad and Fully Loaded Time:
Time to First Impression is the time from when the URL is entered into the browser until the user sees the first visual indication of the page being loaded. That first visual indication is the first drawing activity by the browser and can be traced with dynaTrace AJAX Edition. When the browser can start drawing content depends on the initial HTML document. Several best practices describe different strategies for this. Google, for example, downloads a minimalistic page to provide fast first visual rendering. It then delay-loads more content after onLoad, or even later when the user starts interacting with the page.
From the Network View we can read several network-resource-related KPIs that help us understand the structure and size of the page.
This is the total number of network requests made to load the page. The ultimate goal is to keep this number as low as possible in order to reduce roundtrips. Monitoring this KPI gives you an early indication of newly introduced content that can negatively impact page performance.
This is the total number of requests to the server that were answered with an HTTP status code in the 3xx (Redirect), 4xx (Client Error, such as an authorization problem or a missing resource) or 5xx (Server Error) range. These requests should be avoided as they have a negative impact on the page load time. The root cause is often a server-side implementation, configuration or deployment issue.
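As a rough illustration of this KPI, the sketch below groups response status codes by class. The function name and the sample status codes are hypothetical and not taken from dynaTrace AJAX Edition; it only shows one way such a count could be derived from a list of responses.

```python
from collections import Counter


def count_problem_requests(status_codes):
    """Count responses per problem class (3xx, 4xx, 5xx)."""
    # Integer division by 100 maps e.g. 404 -> 4 and 500 -> 5.
    classes = Counter(code // 100 for code in status_codes)
    return {
        "redirects_3xx": classes[3],
        "client_errors_4xx": classes[4],
        "server_errors_5xx": classes[5],
    }


# Hypothetical status codes recorded for one page load.
statuses = [200, 200, 301, 302, 404, 200, 500]
summary = count_problem_requests(statuses)
print(summary)
```

A rising count in any of the three buckets between builds is a hint to look at the server-side configuration or deployment.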
The browser's underlying network connection has a major impact on the download speed of website content. Downloading content runs through different phases, each of which affects the overall download time. The dynaTrace AJAX Edition shows all phases for every network request.
A request that gets handled by the browser runs through different stages. The following list explains these phases, what the measures tell us and how they get impacted by the browser, the network and other requests.
One DNS Lookup happens for every domain that hosts resources for the current web site. If you move between multiple pages the browser does not require another DNS lookup for a domain that has been resolved on the previous page. It is interesting to look at the total DNS time to identify problems with DNS Lookup Times that can be caused by DNS configuration problems.
Depending on the browser and the number of resources served by a domain, the browser establishes one or multiple connections to each domain that hosts resources for the page. Connect Time is the time it takes to establish the TCP/IP connection to the web server. Connections usually stay open unless the web server directs the browser to close the connection (Connection HTTP header). When using secure communication via SSL, the Connect Time also includes the time of the SSL handshake. High Connect Time can therefore have the following reasons: a slow network connection to the web server, usage of SSL, and not allowing the browser to keep the connection open.
High Server Time means that the web/application server required a long time to process the request. This is particularly relevant for requests that trigger application logic on the application server, where higher Server Times can be expected, especially during heavy load periods. Monitoring Server Time is important to identify bottlenecks as well as performance and scalability problems with the application server. It is usually easier to scale static content delivery by adding more web servers with load balancers or by using a Content Delivery Network. It is not that easy to scale a dynamic application in the same way, so keeping an eye on this metric is important.
Transfer Time correlates directly with the size of the resource and the connection speed between browser and server. Keeping Transfer Time low is important to ensure faster load times. It can be improved by lowering the Total Page Size and by bringing content closer to the end user through Content Delivery Networks (CDNs).
Wait Time correlates directly with the number of resources that are served by the same domain. Browsers limit the number of physical connections they open per domain, so resources have to wait for a free connection. Reducing the number of resources or spreading them over different domains brings this time down. The average Wait Time tells a better story than the total Wait Time about whether Wait Time is a concern.
The number of domains that host the website's resources is important as it affects DNS, Connect and Wait Time. Using additional domains to download resources directly reduces Wait Time because the browser ultimately uses more physical connections. This can have the opposite effect when more DNS lookups are needed and more time is spent establishing the physical connections. Single Resource Domains should be avoided, as you pay a high price for the DNS lookup and the connect just to download a single resource. This is sometimes unavoidable when downloading content from external content providers (such as ad services). When the deployment is under your own control, make sure not to have single resource domains.
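To make the phase-based view more concrete, here is a minimal sketch that sums the individual phases over all requests of a page and derives the average Wait Time. The `RequestTiming` record, its field names and the sample numbers are assumptions made for illustration; they do not reflect an export format of dynaTrace AJAX Edition.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-request timings in milliseconds."""
    url: str
    dns_ms: float
    connect_ms: float
    server_ms: float
    transfer_ms: float
    wait_ms: float


PHASES = ("dns_ms", "connect_ms", "server_ms", "transfer_ms", "wait_ms")


def summarize(timings):
    """Return the total time per phase and the average Wait Time."""
    totals = {phase: sum(getattr(t, phase) for t in timings) for phase in PHASES}
    avg_wait = totals["wait_ms"] / len(timings) if timings else 0.0
    return totals, avg_wait


# Made-up sample: the first request pays for DNS and connect,
# later requests on the same domain mostly wait for a free connection.
timings = [
    RequestTiming("http://www.example.com/", 40, 60, 200, 30, 0),
    RequestTiming("http://www.example.com/app.js", 0, 0, 20, 50, 10),
    RequestTiming("http://www.example.com/hero.png", 0, 0, 15, 120, 80),
]
totals, avg_wait = summarize(timings)
print(totals, avg_wait)
```

Comparing the average Wait Time rather than the total keeps pages with many small resources from looking worse than they are.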
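Single resource domains are easy to spot once the requested URLs are grouped by host. The following sketch assumes you have a plain list of resource URLs for the page; the helper names and example URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def resources_per_domain(urls):
    """Count how many resources each host name serves."""
    return Counter(urlparse(url).hostname for url in urls)


def single_resource_domains(urls):
    """Return the hosts that serve exactly one resource, sorted by name."""
    counts = resources_per_domain(urls)
    return sorted(host for host, n in counts.items() if n == 1)


# Hypothetical resources of one page.
urls = [
    "http://www.example.com/",
    "http://www.example.com/style.css",
    "http://static.example.com/a.png",
    "http://static.example.com/b.png",
    "http://ads.example.net/banner.js",
]
domain_counts = resources_per_domain(urls)
singles = single_resource_domains(urls)
print(len(domain_counts), singles)
```

In this made-up example the ad host is the only single resource domain, which matches the case the article calls hard to avoid.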
dynaTrace AJAX Edition calculates a total page rank based on some of the KPIs that were discussed in this article. We consider the 3 Page Load Times the most important indicators, and we identified the following threshold values to define what is great, acceptable and bad page speed:
The most important factor is the Time to First Impression followed by Time to onLoad and then Time to Fully Loaded. We penalize a threshold violation higher for Time to First Impression and onLoad as compared to Fully Loaded. For details see the example calculation below.
We also factor in the total number of HTTP Requests, as the number of roundtrips greatly impacts overall download time. Great sites require fewer than 40 requests; up to 100 requests is acceptable. Sites with more than 100 HTTP roundtrips are considered bad.
Of course – these are generic rules and may not be applicable for all web sites. CPU Power and Network Connectivity also have a great impact on load times. Across the board these thresholds seem to be fair.
The Rank Calculation is best suited to analyzing pages until they are fully loaded. When you interact with a page, more dynamic content gets downloaded, which affects the metrics that influence the rank calculation.
Our goal is to adjust all these rules over time based on more feedback we receive from our community.
A page starts with a rank of 100, which is lowered based on missed thresholds. Assume our page has a Time to First Impression of 1.6 seconds. For every 200ms this KPI is slower than the value we specified as being great (1s in this case), we degrade the rank by 1. The page is 600ms over that value, so its rank is reduced by 3 due to Time to First Impression.
If the page has an onLoad time of 3.2s, it is penalized an additional 6 points, as 3.2s exceeds the 2s goal for a great time by 1.2s. The same 200ms rule applies here as for the First Impression time.
If the page has a Fully Loaded time of 4s, it is penalized by a further 4 points (the 2s difference from the 2s goal, at only 1 point for every 500ms). Together, onLoad and Fully Loaded time reduce the page rank by 10 (6 + 4) points.
Even though a page might be fast - if it requires too many roundtrips to download all resources we penalize the Rank. If a page causes more than 40 roundtrips we penalize 1 point for every 5 requests more than 40. If the page has 55 roundtrips we reduce the rank by 3.
The Rank calculation based on these KPI's therefore is 100-3-6-4-3=84.
This gives us the following calculation: 84*0.6+60*0.1+80*0.1+80*0.1+70*0.1=79.4, which rounds to 79 and corresponds to a Grade C.
Limiting the Time Based Penalties
We only penalize up to twice the values we consider bad, which caps the time-based penalties at 5s/8s/10s. That means that a page with a Fully Loaded Time of 12s is penalized the same way as a page with a load time of 10s.
The Rank concept has been taken from tools like YSlow and PageSpeed, which present their result as a Rank (100 is best, 0 is worst) that also corresponds to a Grade (A=100-90, B=89-80, C=79-70, D=69-60, E=59-50, F=49-0).
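The penalty rules and the cap described above can be put together in a small sketch. It is not the actual dynaTrace implementation. It assumes that partial steps are rounded down, that the caps 5s/8s/10s apply to First Impression, onLoad and Fully Loaded in that order, and that the rank never drops below 0. All function names are made up for illustration, and times are given in milliseconds to avoid floating-point rounding.

```python
def time_penalty(value_ms, great_ms, cap_ms, step_ms):
    """1 point per full `step_ms` above `great_ms`, measured up to `cap_ms`."""
    over_ms = min(value_ms, cap_ms) - great_ms
    # Assumption: partial steps do not count (integer division rounds down).
    return max(0, over_ms // step_ms)


def request_penalty(requests, great=40, step=5):
    """1 point for every 5 requests beyond 40."""
    return max(0, (requests - great) // step)


def kpi_rank(first_impression_ms, onload_ms, fully_loaded_ms, requests):
    """Start at 100 and subtract the penalties for the four KPIs."""
    rank = 100
    rank -= time_penalty(first_impression_ms, great_ms=1000, cap_ms=5000, step_ms=200)
    rank -= time_penalty(onload_ms, great_ms=2000, cap_ms=8000, step_ms=200)
    rank -= time_penalty(fully_loaded_ms, great_ms=2000, cap_ms=10000, step_ms=500)
    rank -= request_penalty(requests)
    # Assumption: the rank is floored at 0.
    return max(0, rank)


# The worked example from the text: 1.6s / 3.2s / 4s and 55 roundtrips.
example_rank = kpi_rank(1600, 3200, 4000, 55)
print(example_rank)  # 100 - 3 - 6 - 4 - 3 = 84
```

With the cap in place, `kpi_rank(1600, 3200, 12000, 55)` and `kpi_rank(1600, 3200, 10000, 55)` return the same value, which mirrors the 12s versus 10s example.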
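The grade bands listed above translate directly into a lookup. The function below is a hypothetical helper that only encodes those bands for integer ranks between 0 and 100.

```python
def grade(rank):
    """Map a rank from 0 to 100 to a letter grade (A is best, F is worst)."""
    if not 0 <= rank <= 100:
        raise ValueError("rank must be between 0 and 100")
    # Lower bound of each band, checked from best to worst.
    for lower_bound, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D"), (50, "E")):
        if rank >= lower_bound:
            return letter
    return "F"


print(grade(84), grade(79))  # B C
```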