Richard Pitt's Personal Site
Marketing and Sales with a Technical Bent

The Content Managed Web Site

By rights this belongs on either the Digital-Rag or on my Blog. Maybe I'll copy it there too - it is a lesson in the evolution of the internet and web sites but it specifically is about my own personal trek, so here is where it starts.

This web site (richard.pacdat.net) is the latest of many sites I've been involved with to move from one of the first "page generation" web production tools to one of the latest crop of open-source content management systems (CMS). The old system was Frontpage from Microsoft, starting with Frontpage 98 and progressing through to Frontpage 2002. Prior to that I did all sites by hand, crafting HTML with a text editor. The evolution of page creation systems progressed quickly. Today we have things like Adobe's Dreamweaver and all manner of others - but when I started looking there really wasn't much to choose from. In fact, one of the first major sites I was involved in was a re-publishing of material originally set up for a paper catalogue, and we had to write software for it from scratch to make it viable.

This one site created about 5000 new pages of content each week - real estate listings, all interlinked and with menus, etc. The software we put together for this one purpose took in the publisher's file and put out all the new pages in about 2 minutes elapsed time - and this was on hardware that was vintage 1994 - Pentium 90s and such. This was a "one-off" project, not usable for other sites, but it turned out to be a portent of things to come. It created a whole site in 2 minutes once per week from a primitive database. Today's CMS systems do individual pages on demand from a SQL database.
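As a rough illustration of that batch approach (this is not the original software, and the field names `id`, `title` and `body` are assumptions made up for the example), a whole-site build can be sketched in a few lines of Python: read every listing, build the menu once, then write out every page with that menu baked in.

```python
from html import escape


def build_site(listings):
    """Return {filename: html} for every listing plus an index page.

    `listings` is a list of dicts with hypothetical keys: id, title, body.
    """
    # Build the menu once from the full set of listings so every page agrees.
    menu = "\n".join(
        f'<li><a href="listing-{item["id"]}.html">{escape(item["title"])}</a></li>'
        for item in listings
    )
    pages = {"index.html": f"<h1>Listings</h1>\n<ul>\n{menu}\n</ul>"}
    # Every page carries its own static copy of the menu.
    for item in listings:
        pages[f'listing-{item["id"]}.html'] = (
            f'<h1>{escape(item["title"])}</h1>\n'
            f'<p>{escape(item["body"])}</p>\n'
            f"<ul>\n{menu}\n</ul>"
        )
    return pages


if __name__ == "__main__":
    sample = [
        {"id": 1, "title": "Two bedroom house", "body": "Close to schools."},
        {"id": 2, "title": "Downtown condo", "body": "Great view."},
    ]
    for name in sorted(build_site(sample)):
        print(name)
```

The point of the sketch is that the menu is copied into every page, so adding a single listing means regenerating all of them.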

I wanted something that would keep the menus straight as pages were added manually, much as the automated system re-built the menus each week. I'd hand-tweaked the old Digital Rag menus each month and was not looking forward to having to teach others how to do them. A couple of years went by where I was too busy with being MIS manager and doing marketing for others. The next time I looked the crop had grown a bit - but most still didn't do what I wanted. Then Microsoft purchased a product called FrontPage from a small company, Vermeer.

I chose Frontpage because at the time I could not find anything that came even close to its utility in allowing otherwise untrained HTML editors (webmasters) to edit and update a site once it was initially laid out. All the hosting I've done of these sites has been on Unix/Linux systems - where the original web grew up. Microsoft had created a kit that allowed Apache to do what was necessary on the server to deal with things like authentication of external editors (without having to add them to the underlying system's password system) and various extra facilities such as indexing and search update without creating special scripts and such. Microsoft's own server, IIS, was only just starting to be created at that time.

You see, even back in the late '90s I was telling people that just having a "static" web site - really nothing more than an online brochure - was not going to be enough to attract and keep the attention of the potential customers and the various methods they would use to find the sites. Search engines were just starting to crawl the web, indexing it for search. The basic premise was that if a site remained the same, then the engine did not come back as frequently. If it had changed each time the crawler came, then the crawler came more often.

Allowing and encouraging people to add content consistently to their site was the intention. By building a "magazine" style of site, similar to my first site, the Digital Rag at Wimsey, many businesses and organizations used their episodic nature to build up a huge following and a history of information and comment.

This was before the advent of social sites where all and sundry are encouraged to come to the site and contribute.

Today we go much farther...


I guess the first question you might ask is "why move away from Frontpage?" - the second being "why not stay with Microsoft completely?"

Why move from Frontpage?

The first reason is that Microsoft in their infinite wisdom have stopped supporting Frontpage extensions on Unix/Linux systems. I've been able to get them running on Apache 2.0 - but the Apache 2.2 version has some features I want for other things - and Frontpage is problematic on it.

The second reason is that Frontpage in general is just not up to the standards being set by newer content management systems.

Why not stick with Microsoft?

Microsoft has all manner of CMS functionality in their latest offerings - but I just can't justify their cost of ownership. I've written about this in other forums and lots of others have too. Suffice it to say that to me this is simply not an option.

What's out there other than Microsoft?

The answer is "Lots" - both proprietary and open source. I spent quite a bit of time trying various options before I settled on one: everything from WordPress to Joomla to Drupal and lots of proprietary offerings that were similar. Some of the proprietary offerings appear to be far more capable than anything else I've seen - but they also cost hundreds of thousands of dollars to license, with ongoing costs in the tens of thousands/month.

Many of these products I first learned about through my reading of various web and internet security forums and postings - they were the subject of warnings and articles on hackings, etc. Some of them appeared literally daily in some of my reading. Contrast this with Frontpage's history, which to my knowledge has had only one major flaw - one which my implementations simply did not come close to, as it was related to remotely setting up new sites - which I did by hand instead. I hate to say it, but I was spoiled by Microsoft's security in this product (something I can't say about many of their others). I had to find something that was as easy to maintain, otherwise I'd end up annoying both myself and my customers by having to do (and get paid for) far more maintenance than previously.

I've had all manner of customers and their members tell me that XYZ is better than "what we're running" - or ask for features that are specific to something else and ask why "we" don't have it. The bottom line for most of the answers came down to long-term goals and maintenance costs. I've shown over and over that given a long-term set of goals and outlook, the most secure/maintenance-free system that has a reasonable following and ongoing support and development is the best choice. Maintenance costs are the killer - either because they cost so much compared to the expected return on investment of the project, or because the maintenance is not done and something happens to the site that ends up costing more: bad publicity, extra costs for hosting due to hackers stealing bandwidth, etc.

In the long run features that are a "must" will be added to the successful packages - and this is getting easier as more and more of them start to use standardized tools and add-ons such as layout managers and editors. The "plugin" concept is quickly growing and maturing such that many plugins are cross-platform capable now, and the trend is accelerating. It is the underlying core components along with the talent and drive of the developers that make the system "better" or "poorer" than another similar one. Choose this core wisely and reap the benefits for a long time.

So what did you choose and why?

The new system I use is the open source package called Geeklog - or in this case a "fork" of the project called glFusion. A "fork" in open source terms is where someone decides that the original code is a good starting point for going off in a direction that the originators either don't want to go, or can't for some reason. In this case glFusion is only one of a couple of such forks of Geeklog - another being Nextide's nexContent.

Geeklog endeared itself to me for several reasons, not the least of which is that it is written by (and largely for) a web security firm for their own internal use. If they can't write secure code, then nobody can. Security to me means that the sites I help create are (far) less likely to suffer from hacking and cracking - activities unwanted by the site owners that can include such things as putting up nasty viruses, changing page contents to confuse or enrage viewers, or using the underlying server for other purposes such as sending out spam, distributing porn, communicating with other hackers, etc. - all of which I've seen and had to deal with in other contexts at one time or another for various reasons.

It also means that there is far less maintenance necessary for any site - so I and my staff can look after far more such sites without fear of losing control or missing updates. Contrast this with some other CMS systems that have "critical" updates almost daily - and stories of hackings and crackings almost as frequently.

One of the sites I first used Geeklog on is the www.hancockwildlife.org site. The expectation and design goal was to have viewers of the bald eagle nest cameras they're famous for create an ever increasing amount of content about the foundation's major subjects. This would serve both to keep the cost of creation of the site down, and provide lots of new content to keep the search engines coming back and increasing the exposure - a synergistic circle that has more than proven true. Today the site gets millions of page-views per month, has a viewership of close to a half million/month and a dedicated core of members in the thousands. Together with the discussion forum (which runs on phpBB but will soon be running on glFusion's own forum module) this site is one of the most active nature sites on the planet - and yet it has a budget measured in just $thousands/year rather than $millions as some of the others have. The site has tens of thousands of video and image posts, comments, etc. and hundreds of articles and news items; all added daily from all over the planet.

Another site that I've just started moving from Frontpage to glFusion is the www.centa.com site. The original intent of this site was to build a world-wide following and customer base for CEN-TA's cross-border tax, visa and immigration services. The Frontpage version has undergone several theme and layout changes (done by staff there with little web knowledge) since it was created in 1999, and has had well in excess of 5000 pages of question/answer postings added by David Ingram, the owner, and others in his organization. CEN-TA rates in the top 10 for many of its chosen search terms, typically number 1 in most jurisdictions. This without resorting to tricks or techniques that the search engines might at some point discount.

The key to these sites has been the ability of the staff, principals, members and even the general public to add relevant content - content which keeps the site fresh and the search engines happy; all because they use a content management system.

Of course setting up such a system requires a bit of thought. Just because staff can change a page does not mean they should. Lots of times the system has been set such that only content in a small portion of the overall site can be changed without going through extra steps and hands. The same thing goes for basic look/feel decisions - colors, menus, styles, etc.

The major thing that sets a CMS system apart from other web sites is that adding content to it does not involve copying/"ftp-ing" files from the creator's computer to the host system. A built-in editor allows content to be created interactively or at worst cut/pasted from local files. This means that content creators don't need the latest version of Dreamweaver or other web creation tools. They simply need a web browser. It also means that the underlying server does not have to have new (or worse, shared) accounts added to it, further adding to security. This is in direct contrast to Frontpage and other systems where a client program on the creator's computer does the design work, then either the whole web site or at least the changed pages are uploaded to the host computer. In the case of Frontpage, for example, it might take several minutes to add a new page to a site that was fairly large (the CEN-TA site being a great example - it took 1/2 hour at one point before I split the site up). The software had to go through and adjust the page layout for each and every page that might reference the new one - rebuilding menus and re-laying pages because in the end, all the pages are "static" - no content created "on the fly" when they are viewed.

In contrast, adding a new page via the CMS system is done in seconds, no matter how many other pages there are already on the site. All the pages of the site are created from a database at the time a viewer views them. This means that things like links and menus are created in real time - from the latest information available. Page content itself can be pushed into the database either manually or automatically, and layout can be tailored to the viewer's browser abilities including such things as the iPhone and other smaller, portable viewers. I take advantage of this with one site where items from an e-mail list are also added to the page database. Many other sites do similar things.
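A minimal sketch of the "on the fly" idea, assuming a made-up `pages` table with `slug`, `title` and `body` columns and using Python's built-in `sqlite3` module so it runs anywhere (this is not glFusion's actual schema or code): adding a page is one database insert, and the menu is rebuilt from the database every time a page is viewed.

```python
import sqlite3
from html import escape


def make_db():
    """Create an in-memory database with a hypothetical pages table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (slug TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    return conn


def add_page(conn, slug, title, body):
    """Adding content is a single insert; no other page is touched."""
    conn.execute("INSERT INTO pages VALUES (?, ?, ?)", (slug, title, body))
    conn.commit()


def render_page(conn, slug):
    """Build the HTML at view time, menu included; None if no such page."""
    row = conn.execute(
        "SELECT title, body FROM pages WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return None
    # The menu comes from whatever is in the database right now.
    menu_rows = conn.execute(
        "SELECT slug, title FROM pages ORDER BY title"
    ).fetchall()
    menu = "".join(
        f'<li><a href="/{s}">{escape(t)}</a></li>' for s, t in menu_rows
    )
    title, body = row
    return f"<h1>{escape(title)}</h1><p>{escape(body)}</p><ul>{menu}</ul>"


if __name__ == "__main__":
    db = make_db()
    add_page(db, "eagles", "Eagle cam", "Live from the nest.")
    add_page(db, "news", "News", "Nesting season has started.")
    print(render_page(db, "eagles"))
```

In the sketch, the second `add_page` call shows up in the menu of the first page on its next view without that page being regenerated. The price is a couple of database queries per page view.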

The Downside - sort of

The one downside to using a real CMS system (as opposed to a page maintenance system like Frontpage) is that the underlying server hardware has to be MUCH faster.

Contrast one site initially put together with Frontpage - www.vanishingtattoo.com - with the Geeklog site www.hancockwildlife.org.

VanishingTattoo.com gets over a million page views/day - about 5-6 Mbytes/second and runs quite nicely on a dual Xeon 2.6GHz with 2 Gigs of RAM along with a number of other sites and facilities. Only twice to my knowledge has the server ever been stressed, and both times a bit of tuning allowed the system to deal with peaks well in excess of 10 times normal - about 40-50Mbytes/second.

The HancockWildlife.org site was initially hosted on identical hardware (in the same facility, on the same network), along with the major supporter's own web site and a small number of others of no importance or traffic load. At the beginning of this nesting season the site started to get busy at just 200,000 page views/day, pushing similar bandwidth. At 300,000 pages/day and 10Mbytes/second the machine all but stopped. This site is now hosted on a machine that is about 10 times larger - an 8-core 64-bit Xeon with 8 Gigs of RAM - which should do for the next while.
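To put the daily figures for the two sites on a common footing, they can be converted to average page views per second. This is simple arithmetic on the numbers quoted above; it assumes traffic is spread evenly over the day, which real traffic never is, so peaks will be well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def views_per_second(views_per_day):
    """Average page views per second for a given daily total."""
    return views_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    figures = [
        ("Static site, 1,000,000 views/day", 1_000_000),
        ("CMS site getting busy, 200,000 views/day", 200_000),
        ("CMS site stalling, 300,000 views/day", 300_000),
    ]
    for label, per_day in figures:
        rate = views_per_second(per_day)
        print(f"{label}: {rate:.1f} page views/second on average")
    # Prints 11.6, 2.3 and 3.5 page views/second respectively.
```

So the static site was comfortable at roughly three times the average page rate that brought the database-driven site to a halt on identical hardware.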

The point is that creating each page "on the fly" causes a bottleneck. The good thing is that the hardware today is so much faster than what was available as little as a couple of years ago (when these sites were originally set up), and at little or no premium, that this should not be an issue for most sites.

The other good thing is that the art of clustering and multiple processor based web sites has also matured to the point where growing a successful site to handle literally millions of members and tens of millions of page-views per hour is well understood.

All in all, the move from individual page generation to content managed systems is a win all around. Fewer hassles and less software on the (untrained) creators' systems - only a browser needed in most cases - and far more flexibility.

I'm happy - and so are my customers.

Saturday, July 31 2010 @ 01:06 PM PDT