How Scott Hanselman is Helping to Destroy the Internet

3/29/2011

At work last week, a coworker complained to me that his personal website was just too slow (it took about 10 seconds to load). He asked me to take a look at it, and one of the first things I did was look at the output of the site. To say the least, it was a nightmare. Not because of the table layout, or the blinking images, or the obnoxious Flash movie on the page. No, the real reason it was a nightmare was simply how inefficient it was. The HTML had tons of white space (about 50k extra), about 10 separate CSS files, and 5 JavaScript files (neither the CSS nor the JavaScript was minified). On top of that, most of the actual code/markup wasn't even being used (or at least not being used well). Then you add in all of the VERY large images and you get about a 10 second load time. After a couple minutes of weeping from the sheer pain of looking at the markup, I wrote a bit of code to combine the CSS and JavaScript files on the fly, minify them, etc. (and made the project open source; it's now available on CodePlex), cut down on the image files a bit, and was able to get the load time down to about 1.2-1.5 seconds (luckily, what little code-behind there was wasn't too bad).
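To give a sense of what "combining and minifying on the fly" means in practice, here's a minimal sketch of the idea as an ASP.NET handler. This is illustrative and simplified, not the actual code from the project: it takes a comma-separated list of stylesheet names, concatenates them, strips comments and extra whitespace, and serves the result as a single cacheable response.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

// Bare-bones sketch of "combine and minify CSS on the fly".
// Illustrative only -- not the project's actual implementation.
public class CssCombineHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // e.g. css.axd?files=reset.css,layout.css,theme.css
        string[] files = (context.Request.QueryString["files"] ?? "")
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

        var sb = new StringBuilder();
        foreach (string file in files)
        {
            // GetFileName keeps requests from walking outside ~/css.
            string path = context.Server.MapPath("~/css/" + Path.GetFileName(file));
            if (File.Exists(path))
                sb.AppendLine(File.ReadAllText(path));
        }

        string css = sb.ToString();
        css = Regex.Replace(css, @"/\*.*?\*/", "", RegexOptions.Singleline); // strip comments
        css = Regex.Replace(css, @"\s+", " ");                               // collapse whitespace
        css = Regex.Replace(css, @"\s*([{};:,])\s*", "$1");                  // tighten punctuation

        context.Response.ContentType = "text/css";
        context.Response.Cache.SetCacheability(HttpCacheability.Public);
        context.Response.Cache.SetExpires(DateTime.UtcNow.AddDays(7));
        context.Response.Write(css);
    }
}
```

The win is twofold: the browser makes one request instead of ten, and each request carries fewer bytes.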

When I had finished updating the site, I started to wonder if anyone else out there had similar issues. I never really bother to look at the actual markup of the sites I visit, but surely they would be better; I mean, these were created by professionals (hopefully). So I picked about 20 sites that I frequent, took their normal CSS, JavaScript, and HTML output, and ran it through the bit of code I had just created. Every single one was inefficient... well, except two: Google and Bing were both rather efficient. Everything else had, on average, about 10% wasted code being transmitted across the wire (HTML usually 10-15%; CSS and JavaScript varied wildly, from 5% to 20%). As an example, on Lifehacker I was able to save about 10% (or 5kb) on the HTML but only about 2% on the CSS (and they used minified versions of jQuery, etc., so that made no real difference). Penny Arcade only stood to save about 1kb (a little less than 10%), and once again only a small gain on the CSS (and once again they were using a minified jQuery). Twitter: another 6kb that could be saved on the HTML (again, about 10%). In fact, every site I came to had the same issues. It didn't matter what it was written in (PHP, Ruby, ASP.NET). Usually they were using a version of jQuery on a CDN, but the HTML was inefficient and there were a couple of changes that could be made to the CSS. After my normal, everyday sites, I checked a couple of CMS systems that I use when I'm creating sites for other people (DNN and Umbraco). DNN: again, about 10%, or 6kb. Umbraco was one of the higher ones at 20%, or 6kb (their site is much smaller, but very inefficient)...
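If you want to reproduce the check yourself, the idea is simple: fetch a page, run a minify pass over it, and compare byte counts. Here's a rough sketch of that kind of comparison (the URL is a placeholder, and a real minifier does a much better job than this crude whitespace pass):

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

class WasteCheck
{
    static void Main()
    {
        // Placeholder URL -- point this at whatever site you want to audit.
        string url = "http://example.com/";
        using (var client = new WebClient())
        {
            string html = client.DownloadString(url);

            // A crude "minify": drop whitespace between tags, collapse runs of spaces.
            string slim = Regex.Replace(html, @">\s+<", "><");
            slim = Regex.Replace(slim, @"[ \t]{2,}", " ");

            int before = Encoding.UTF8.GetByteCount(html);
            int after = Encoding.UTF8.GetByteCount(slim);
            Console.WriteLine("{0}: {1:N0} -> {2:N0} bytes ({3:P1} waste)",
                url, before, after, (before - after) / (double)before);
        }
    }
}
```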

I started to think to myself: how can this be? Surely there must be someone out there... a hero of... efficiency? So I turned to the one man in the .NET community that I respect more than anyone (other than ScottGu)... Scott Hanselman. I listen to his podcast, I read his blog; I get more information from that man than probably anyone else (I've never met him and probably never will, but in all seriousness, someone tell him thank you). I mean, CodeProject, StackOverflow, etc. are great sites when I run into a problem, and I listen to .NET Rocks, Herding Code (which is probably my favorite), Polymorphic Podcast (when the hell is that coming back?), etc. I try to stay plugged in as much as possible, but I can honestly say that I get more information that helps me in my day-to-day work from Scott Hanselman than from anyone else. So I thought that if anyone was going to do well on this test, it might be him. So I went to his site, plugged in the code, and... it wasn't good... Well, it wasn't great anyway. The stats for his blog were about on par with everyone else's. Although he did do better from a percentage standpoint on the HTML (6%, or 5kb), he did much worse on the CSS (11 files, 12 if you count the IE6-specific one, each with about 10% waste by size).

OK, so I've been rambling about percentages, a couple kb saved per page view, etc. Why does that matter? Well, let's just focus on the HTML part of Hanselman's site. On a given weekday he gets about 12,000 page views by the looks of it, and 6,000 on weekends (note that those numbers are from 2008, and if his traffic trends like mine does, which is still much, much lower than his, his site is receiving a few more people now). So, just so we have a nice round number, we'll say about 10,000 a day. At 5kb of waste per view, in a given day he's wasting about 48.8 megabytes of bandwidth. In a 30-day month, that's 1.43 gigs. In a year, about 17.4 gigs. And since bandwidth isn't free (it might be for him, but someone is paying the cost), we can assume that money is going to waste (I've heard rather wild numbers ranging from $0.03 to $2 per gig for bandwidth, so he's costing himself or someone else somewhere between $0.52 and $34.80 a year)...
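To make the arithmetic explicit, here's the back-of-the-envelope calculation with the same assumptions as above (10,000 views a day, 5kb of removable HTML per view):

```csharp
using System;

class BandwidthWaste
{
    static void Main()
    {
        const double wasteKbPerView = 5;       // ~5 KB of removable HTML per view
        const double viewsPerDay = 10000;      // rounded-down daily page views

        double mbPerDay = wasteKbPerView * viewsPerDay / 1024;   // ~48.8 MB
        double gbPerMonth = mbPerDay * 30 / 1024;                // ~1.43 GB
        double gbPerYear = mbPerDay * 365 / 1024;                // ~17.4 GB

        Console.WriteLine("{0:N1} MB/day, {1:N2} GB/month, {2:N1} GB/year",
            mbPerDay, gbPerMonth, gbPerYear);
        Console.WriteLine("Yearly cost at $0.03-$2.00/GB: ${0:N2} to ${1:N2}",
            gbPerYear * 0.03, gbPerYear * 2.00);
    }
}
```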

OK, that's still not that bad, but think about it for a second. Almost every site I ran into had similar stats to his. He was the everyman (who has a website) in this situation: each site had about 10% waste on average, and he was right in line with that. And I'm pretty sure that in the process of writing this post I've figured out a way to shrink the CSS, HTML, and some of the JavaScript files of the sites I polled even more, but for now I'll leave that number at 10%. Anyway, that means that 10% of the traffic on the web (that isn't video, or Google, or Facebook [which was rather efficient, actually], etc., which account for most of the web traffic out there, and thus this post doesn't have too much merit in the grand scheme of things) is a complete waste. So maybe saying that Hanselman is helping to destroy the internet is a bit much, I mean it's not like the internet is a series of tubes, but he isn't helping matters. (Once again, I think he is a fantastic blogger/person to learn from, and someone please tell him thank you on my behalf. And if you couldn't tell from my various asides, I don't really care that much about this topic; I just wanted to write a long, rambling post to announce my new project instead of the usual "Hey, released something else this week" sort of post, and I found it interesting that most of the sites out there could potentially benefit from the project. Speaking of which, all of my various projects have been updated and released on CodePlex as well as NuGet. Also, the new project on CodePlex, Optimizer Prime, compresses the page output, CSS files, and JavaScript files on the fly for a site and works with both Web Forms and MVC. So try it out, leave feedback, and happy coding.)
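For the curious: the reason one library can cover both Web Forms and MVC is that they share the ASP.NET pipeline, so a response filter sees the final HTML either way. Here's a minimal sketch of that general technique (illustrative only; this is not Optimizer Prime's actual implementation, and a production filter would need to buffer across chunked writes so a tag split between two Write calls isn't mangled):

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// A response filter that collapses whitespace in the outgoing HTML.
// Minimal sketch of the general technique; assumes UTF-8 output and
// that each Write call contains whole tags.
public class WhitespaceFilter : Stream
{
    private readonly Stream _inner;
    public WhitespaceFilter(Stream inner) { _inner = inner; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        string html = Encoding.UTF8.GetString(buffer, offset, count);
        html = Regex.Replace(html, @">\s+<", "><");   // drop whitespace between tags
        byte[] slim = Encoding.UTF8.GetBytes(html);
        _inner.Write(slim, 0, slim.Length);
    }

    // Boilerplate required by Stream:
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { _inner.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

// Hooked up in Global.asax, it runs for Web Forms pages and MVC views alike
// (a real implementation would also check the response content type):
//   protected void Application_BeginRequest(object sender, EventArgs e)
//   {
//       Response.Filter = new WhitespaceFilter(Response.Filter);
//   }
```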


