The WebHub Way of Thinking

In the early days of the world wide web (1994), I was lucky enough to learn HTML from an expert who worked in a makeshift office on the second floor of a warehouse near San Francisco, and CGI-WIN from Robert Denny, the author of WebSite (a great product subsequently side-lined by Microsoft's HTTP-server, IIS). Internet lifetimes have gone by since then, and yet, looking back, some ideas have withstood the trial by fire of Windows OS upgrades, Delphi migrating to CodeGear, Unicode, CSS, the iPhone ...

At HREF Tools in 1995, we created WebHub, a totally-cool, extremely-flexible web development framework for Delphi. I have worn quite a few hats at HREF, but revealing WebHub's secrets has been my passion all along.

With this article, I invite you to peek inside the world of WebHub, to learn from our way of thinking about web development with Delphi, and to aim for a better world wide web.

Fig. 1: Woven strands form cables which suspend the Brooklyn Bridge in New York

The observation that I would like to start with is the almost inconceivable strength of individual strands when woven suitably together. For me, this is evident in the Brooklyn Bridge (see figure 1), where each cable is made of 4 main strands, each consisting of 19 wires and, in total, containing over 391km of wire (see ref. 1)! 

Explaining WebHub is a little like showing wire after wire of an intricate structure, hoping that the listener will imagine how they combine together.

Indeed, WebHub contains many threads (pun intended), or groups of functionality. When taken together, they enable you to create a web application which can carry a New York rush-hour quantity of traffic. You can build anything from off-Broadway static sites to Wall Street business-suit applications to Paris cafes.

We have a portfolio section on www.href.com, and you are invited to have a look there to see what some of our customers have built over the years, often on small budgets. WebHub sites easily support a million dynamic page requests per day ... and more, if you throw more than $2500 worth of hardware at the problem.

Fig. 2: Snail shell

The second metaphor to conjure up appears in figure 2 - the snail - illustrating the notion of slow but steady progress. WebHub evolution may not always go in a straight line, but it moves consistently toward something greater. Similarly, many web sites start in a fairly primitive state, and evolve over time, as the owners become more ambitious and the developers become more skilled. The WebHub framework supports entry-level static sites, simplistic dynamic sites and full-featured database-driven sites. Throughout that evolution, WebHub developers can use one syntax, one paradigm, one approach. Over the lifetime of a web site, that saves hundreds of hours and potentially tens of thousands of dollars.

Yes, WebHub can help you build static sites with great ease. In the section below on teamwork, there are some examples of reusable droplets and page definitions. Any page can be exported to a static file; all pages can be exported by a single verb on the main application object. So while I use WebHub to build dynamic sites for my day job, I also use it to maintain static sites for relatives and friends!

Four essential facets

Every web application has one job, which is to process requests that arrive over HTTP, calculate the answer and optionally perform any other task(s), and then return that answer in the appropriate format (HTML, CSS, XML, etc.).

(A ‘web application’ as defined for this article is a dynamic web site which probably has database-driven content and may, or may not, look and feel like a Windows application.)

Let's have a look at four essential facets of WebHub which combine to accomplish that job elegantly and super-fast: teamwork, overall architecture, consistent URLs with saved state, and Delphi hooks.

Teamwork

You may have noticed that some people are more left-brained, logical, and methodical than others. An optimized web development team needs one or more left-brained people to write the logic in Delphi (or PHP or C#) and to do the database work. That team also needs some right-brain talent to design the page layout, choose the color scheme, imagine the best user interface, create graphics, logos, and so on.

Once in a great while, a single person has strong talent in both areas, but almost always a web site moves forward best with a team of at least two people.

To keep our story simple, consider the left-brained person(s) to be the Delphi developer and the right-brained person(s) to be the web artist in charge of the HTML. In reality, sometimes the Delphi developer will do a bit of HTML or CSS coding, and sometimes the web artist will offer an observation that leads to better logic.

Supporting teamwork means ensuring that each person can do their work without interfering with the other person, and without waiting too long between tasks for some shared resource. To talk concretely, this means dividing up the web site assets into at least two groups of files: one set for the Delphi coder and one set for the web artist. It is then easy to use version control, and everyone can work pretty much in parallel.

WebHub, by default, keeps all web page definitions in plain-text files outside the Delphi application. Every detail about the page layout, colors and user interface can be altered without recompiling or even stopping the Delphi application.

For any web artist who uses Dreamweaver, we have a custom plug-in which provides live content directly in "design mode" – and this means 100% live, 100% accurate content. The web artist can enjoy all the normal benefits of Dreamweaver plus context-sensitive help for editing any WebHub expression, shortcuts for testing, and a built-in reference to all WebHub commands.

Most importantly, the web artist can accomplish great things with a fairly minimal understanding of WebHub, components, and database issues, while the Delphi developer can keep the Delphi IDE as home base.

Now that we have a file-based territory for each side of the brain, let's talk about iterative development. There are few organizations with the luxury of completely specifying a software project in advance of its commencement. Rather, someone builds a prototype and then features are refined according to priorities and budget.

You may be surprised to learn that the syntax in the external files actively assists with iterative development, via the "sketch" tag. This will become clear shortly.

WebHub uses its own markup language, which is a superset of HTML, with special tags to mark out pages, reusable droplets, text substitution macros, plus expressions for calls to reusable pieces. This markup language is called WebHub-HTML, or W-HTML for short, and is saved in "teko" files (*.whteko).

A WebHub page definition file (*.whteko) can contain any number of page, droplet or macro definitions, in any order. Page attributes can be shared by all, or some, pages within that file. We wanted to convey the idea that our files could contain more than just one web page, and we chose the word 'teko' (from Esperanto, where it means "briefcase") as the keyword. To remember the file extension, think of (WebHub + teko) = .whteko, "a briefcase of declarations." Full documentation about the .whteko file format can be found at http://webhub.com/dynhelp.

Within any page, droplet or macro, you can use a WebHub expression. A few examples will help you get the pattern quickly:

(~mcHelloWorld~)
(~waHelloWorld.execute~)
(~waHello.execute|world~)

Inside Delphi, you can make use of any WebHub expression by saying

pWebApp.SendMacro(name-here);

WebHub includes about 35 built-in commands, of which JUMP is the most commonly used. The JUMP command outputs an <a href> tag to "jump" the surfer to the desired page. Details vary based on configuration settings for the type of HTTP server software and any optional filters.

Thus (~JUMP|home|Home Page~) could generate any of the following, depending on configuration settings, in a web app identified as "a", for the surfer using session number 123:

  • <a href="/scripts/runisa.dll?a:home:123">Home Page</a>
  • <a href="/a:home:123">Home Page</a>
  • <a href="http://www.href.com/a:home:123">Home Page</a>
  • <a href="/a/home/123">Home Page</a>
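To make the pattern concrete, here is a minimal sketch in Python (not WebHub code) of how a single expression could be rendered in several URL shapes. The function name render_jump and the style names are hypothetical, invented for this illustration; the real configuration settings are not shown in this article.

```python
# Illustrative only: render_jump and the style names are hypothetical,
# not part of WebHub. The four outputs mirror the examples listed above.

def render_jump(page_id, phrase, app_id, session, style):
    """Return an <a href> tag for the given page in one of four URL styles."""
    if style == "isapi":        # long form, through the ISAPI runner
        href = f"/scripts/runisa.dll?{app_id}:{page_id}:{session}"
    elif style == "short":      # short form, colon-separated
        href = f"/{app_id}:{page_id}:{session}"
    elif style == "absolute":   # short form with a host name
        href = f"http://www.href.com/{app_id}:{page_id}:{session}"
    elif style == "slash":      # short form, slash-separated
        href = f"/{app_id}/{page_id}/{session}"
    else:
        raise ValueError(f"unknown style: {style}")
    return f'<a href="{href}">{phrase}</a>'


if __name__ == "__main__":
    for style in ("isapi", "short", "absolute", "slash"):
        print(render_jump("home", "Home Page", "a", 123, style))
```

The page author writes the expression once; only the style chosen at configuration time changes the generated link.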

Listing 1 shows a minimalist .whteko file:

<!DOCTYPE whteko PUBLIC
  "-//HREF//DTD whteko stage 2.14//Strict//EN//"
  "http://webhub.com/dtd/0214/whteko.dtd">
<whteko>

<whpage pageid="home">
<html><head>
<whdroplet name="drHeadTags" show="yes">
<title>(~pageid~)</title>
<style type="text/css">
  body { font-family:Verdana, Arial; }
</style>
</whdroplet>
</head>
<body>
<p>hello this is a page with id (~pageid~).</p>
<p>(~JUMP|home|show again~)</p>
<p>(~JUMP|clock|check server time~)</p>
</body></html>
</whpage>

<whpage pageid="clock">
<html><head>(~drHeadTags~)</head>
<body>
<p>The time on the server is
(~CentralInfo.WebTimeLocal~).</p>
<p>(~JUMP|clock|show again~)</p>
<p>(~JUMP||back home~)</p>
</body></html>
</whpage>

</whteko>

Listing 1: A WHTEKO file containing two page declarations (home and clock)

In Dreamweaver and in the WebHub editor, syntax highlighting makes the code much easier to read. Figure 3 shows a subset of Listing 1, with colors, to give you the idea.

Fig. 3: Part of listing 1, showing syntax highlighting within Dreamweaver

What to notice in listing 1:

  1. A droplet is used to define a portion of the header within the home page. The droplet is called (reused) on the clock page.
  2. Links from one page to another are generated by the JUMP macro as needed at calculation time; see lines 27 and 28 in figure 3. When the target PageID is omitted, as in (~JUMP||back home~) on line 28, the default PageID is filled in automatically.
  3. The doctype allows content to be validated by suitable programs (including Dreamweaver).
  4. Parentils surround WebHub expressions, e.g. (~pageid~) and (~drHeadTags~).

This .whteko file would be loaded by the WebHub application (compiled in Delphi), and all the content would be available for use when the web pages are served.

The relationship between EXEs and AppIDs is many-to-many. A single EXE can load many AppIDs (albeit one at a time). An AppID can be loaded by many EXE instances.

To see more examples, you may visit http://demos.href.com where we have more than three dozen WebHub demos. All include full source (Delphi code plus WHTEKO code). Please feel free to browse through them. Use the [Source] link in the page footer of each demo to explore the source code. All the "lite" demos can be loaded by whLite.exe as they all use the same basic components.

In today's world, teamwork often involves players in more than one country, with strengths in different human languages ("lingvos"). Winning web sites cater for visitors from multiple countries and provide content in more than one lingvo.

With its full Unicode support, Delphi 2009 offers us a tremendous leg up in terms of development for such a multi-lingvo world. We definitely recommend that you use Delphi 2009 if any of your database or other source data is in Unicode format.

The WHTEKO files are generally in UTF-8 format; Delphi 2009 users can use UTF-16.

Regardless of Delphi version, WebHub has built-in support for translating content to multiple lingvos. This is ideally done using Dreamweaver, where a translator can set the design-lingvo to their own native lingvo and immediately see how the page would look using their wording. Even without Dreamweaver, a translator can enter translations for words, phrases and entire droplets if needed.

On a public site, the application will operate with a default lingvo. For our sites, the default lingvo is English, so any macros or droplets which have not yet been translated will appear in English. Meanwhile all translated content will appear in the surfer's selected lingvo.
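The fallback rule can be sketched in a few lines of Python. This is an illustration only: the table layout, the lingvo codes and the macro names are assumptions made for the example, and say nothing about how WebHub actually stores translations.

```python
# Illustrative only: the table layout and all names here are assumptions.
DEFAULT_LINGVO = "eng"

TRANSLATIONS = {
    "mcWelcome": {"eng": "Welcome", "nld": "Welkom"},
    "mcGoodbye": {"eng": "Goodbye"},   # not yet translated into Dutch
}


def lookup(name, lingvo):
    """Return the text for `name` in `lingvo`, else in the default lingvo."""
    entry = TRANSLATIONS[name]
    return entry.get(lingvo, entry[DEFAULT_LINGVO])


if __name__ == "__main__":
    print(lookup("mcWelcome", "nld"))   # translated text
    print(lookup("mcGoodbye", "nld"))   # falls back to English
```

A Dutch-speaking surfer would therefore see a mix of Dutch and English until the translator catches up, rather than an empty space or an error.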

A seemingly minor but actually tremendously powerful feature of WebHub is that its expressions can be nested, to a large degree. A macro can call a droplet; the droplet can call another macro; that macro can call a web action component, which can call another macro. Thus control can go back and forth between .whteko and .pas as often as needed. Not all web artists master the concepts of dynamic content, subroutines and variables; for those that do, WebHub is an excellent fit.
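For readers who like to see nesting in action, here is a minimal sketch of a recursive expander, written in Python rather than Delphi. It assumes a simple dictionary of definitions in which a string stands in for a macro or droplet and a function stands in for a web action; the names and the depth limit are hypothetical, and the real WebHub parser certainly does more.

```python
import re

# Matches a WebHub-style expression such as (~mcGreeting~).
EXPR = re.compile(r"\(~(.+?)~\)")


def expand(text, definitions, depth=0, max_depth=20):
    """Replace each (~name~) with its definition, expanding nested calls."""
    if depth > max_depth:
        raise RecursionError("expressions nested too deeply")

    def replace(match):
        value = definitions[match.group(1)]
        if callable(value):   # stands in for a web action written in code
            value = value()
        return expand(value, definitions, depth + 1, max_depth)

    return EXPR.sub(replace, text)


# Hypothetical definitions: macro -> droplet -> web action and macro.
definitions = {
    "mcGreeting": "Hello, (~drName~)!",
    "drName": "(~waUser.execute~) from (~mcPlace~)",
    "waUser.execute": lambda: "Ann",
    "mcPlace": "Paris",
}

if __name__ == "__main__":
    print(expand("<p>(~mcGreeting~)</p>", definitions))
```

The depth limit is a design choice for the sketch: it turns an accidental circular definition into a clear error instead of an endless loop.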

Overall architecture: runner, hub, application(s)

The WebHub architecture is fairly straightforward. It consists of three moving parts: (a) a runner; (b) the hub; and (c) your application(s). See figure 4 for a diagram.

Fig. 4: WebHub architecture

The runner (yellow circle "5" in figure 4) is a small program which receives the request from the HTTP server. When used with Microsoft IIS, the runner is an ISAPI DLL; with Apache, the runner is an EXE which supports CGI-BIN. As new HTTP servers come on the market, we add or modify the runners to adapt to them. We have phased out the first runner from 1995, which was for the CGI-WIN interface.

The Hub (yellow-6) is the process controller. It aggregates requests from the runner(s), assigns session numbers to newly arriving surfers, and directs traffic to the least-busy application instance available to serve the request. The Hub makes WebHub scalable.

For a low-traffic web site (roughly 1 to 5 dynamic requests per 5 seconds), running one instance of your application (yellow-7) is probably sufficient. (Of course, it depends on how long your average page request takes to calculate, which often depends on the speed of your database connection. As they say, "your mileage may vary.")

For a high-traffic web site, it helps significantly to run two or more instances of your application. The Hub then directs traffic to the instance most ready to handle the request.
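The dispatching idea can be sketched as follows, in Python. This is a simplified model under stated assumptions: "least busy" is taken to mean "fewest requests currently pending", and the class and method names are hypothetical. The Hub's real bookkeeping is not described in this article.

```python
import itertools


class Hub:
    """Toy model: hand out session numbers and pick the least-busy instance."""

    def __init__(self, instance_names, first_session=1):
        # Pending request count per application instance.
        self.pending = {name: 0 for name in instance_names}
        self._sessions = itertools.count(first_session)

    def new_session(self):
        """Assign the next session number to a newly arriving surfer."""
        return next(self._sessions)

    def dispatch(self):
        """Return the instance with the fewest pending requests."""
        name = min(self.pending, key=self.pending.get)
        self.pending[name] += 1
        return name

    def done(self, name):
        """Record that an instance has finished calculating a page."""
        self.pending[name] -= 1


if __name__ == "__main__":
    hub = Hub(["app-1", "app-2"])
    print(hub.new_session(), hub.dispatch(), hub.dispatch())
```

With two instances, a slow request on one of them no longer holds up every other surfer, which is the point of running more than one.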

The application calculates the page and hands the response back to the runner, which hands it back to the HTTP server.

For really high-traffic sites or for very old-hardware scenarios, the functionality of a web site can be divided up, with each group of pages and features being associated with an Application ID ("AppID"). A great way to eliminate bottlenecks on a site is to find the feature which is most popular and/or slowest, and give it its own AppID. That allows all other page requests to go through at an easy pace because a dedicated instance is available for the troublemaker.

WebHub saves state across applications. Even if you use multiple AppIDs, the surfer will not notice any problem or data loss. Surfer data is easily shared among WebHub EXEs, even if they are running on different machines, as long as those machines share a network drive.
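One way to picture state shared through a common drive is a small file per session, which any process can read or write. The Python sketch below is only that, a picture: the JSON format, the file naming and the function names are all assumptions, and the article does not say how WebHub stores surfer data.

```python
import json
import tempfile
from pathlib import Path


def state_path(shared_dir, session_id):
    """Hypothetical naming scheme: one file per session in a shared folder."""
    return Path(shared_dir) / f"session-{session_id}.json"


def save_state(shared_dir, session_id, data):
    """Write the surfer's data where every application instance can find it."""
    state_path(shared_dir, session_id).write_text(
        json.dumps(data), encoding="utf-8")


def load_state(shared_dir, session_id):
    """Read the surfer's data; an unknown session yields an empty dict."""
    path = state_path(shared_dir, session_id)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    shared = tempfile.mkdtemp()   # stands in for the shared network drive
    save_state(shared, 123, {"email": "ann@example.com"})
    print(load_state(shared, 123))
```

Because the session number travels in every URL, whichever instance receives the next request can look up the same data.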

You can start by putting all features under a single AppID, and if needed, later separate one feature out into its own AppID, or, in extreme cases, onto a dedicated separate server.

Coming back for a moment to the JUMP macro, part of the simplicity of WebHub is that linking to a page is just a matter of referencing the PageID, e.g. (~JUMP|home|Home Page~), where "home" is the PageID and "Home Page" is the visible phrase seen by the surfer. (Yes, the visible phrase can be a graphic image and you can add further HTML tags if needed.) 

If a particular page, say "CalcReport", needed to be moved to a separate AppID, you would simply change JUMP|CalcReport|... to JUMP|NewAppID:CalcReport|... and that would take the surfer to the NewAppID.
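The rule for reading a JUMP target can be sketched like this, in Python. The function name resolve_target is hypothetical, and treating "home" as the default PageID is an assumption based on listing 1.

```python
def resolve_target(target, current_app_id, default_page_id="home"):
    """Split a JUMP target into (app_id, page_id).

    "CalcReport"          -> stays within the current AppID
    "NewAppID:CalcReport" -> goes to another AppID
    ""                    -> default page of the current AppID
    """
    if ":" in target:
        app_id, page_id = target.split(":", 1)
    else:
        app_id, page_id = current_app_id, target
    return app_id, page_id or default_page_id


if __name__ == "__main__":
    print(resolve_target("CalcReport", "a"))
    print(resolve_target("NewAppID:CalcReport", "a"))
```

Moving a page to a new AppID therefore touches only the links that point to it, not the page itself.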

Fig. 5: WebHubAdmin provides a way to see inside the Hub

The Hub runs as a non-GUI service, and a helper program named WebHubAdmin is used to find out what is going on. Figure 5 shows the "Connected Panel" of WebHubAdmin, where you can see the WebHub applications that are running.

In figure 5, there are three columns worth pointing out: Process, AppID and PID. The Process column lists the name of the EXE running. The AppID column lists the string identifying the WebHub application. The PID is the process identifier as provided by the operating system. If you analyze the PIDs carefully, you will find that every Process is a separately running instance of the EXE. The Process named "WHLITE" runs many times, once for each "lite" demo.

A WebHub developer license includes a 1-CPU, 1-EXE unlock code for use on the developer's computer and the same capacity on one production server. A developer can build unlimited EXEs in sequence; the limit is on the number of EXEs running at the same time.

Unlocking the Hub to support more EXEs or more CPUs requires a more expensive license.

The idea behind the pricing is that you can get started for a reasonably small amount of money (considering that almost all technical support is included), and most of your payments are deferred in time until you complete your project and deploy it for your customer in a high-traffic or significantly complicated scenario. Presumably by that time you are financially benefiting from your use of the WebHub framework.

Consistent URLs

One of the early decisions in the design of WebHub was to use completely consistent URLs for all requests. In particular, we allowed for 4 parameters:

  • AppID: the application identifier
  • PageID: the page identifier, e.g. "home" or "clock" in listing 1
  • SessionID: the surfer's session number
  • Command String: a string containing any further details, formatted as desired

In other words, all custom data fields went into the command string.

In 1995, WebHub URLs looked like this:

http://localhost/cgi-win/runwin.exe?
  appid:pageid:sessionid:command

The separator character was originally a colon (':') and can be configured as slash ('/') instead.
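Splitting such a request into its four parameters is simple, which is part of the appeal. Here is a minimal sketch in Python; the names WebHubRequest and parse_query are hypothetical, and letting the command string keep any further separators is an assumption of this example.

```python
from typing import NamedTuple


class WebHubRequest(NamedTuple):
    app_id: str
    page_id: str
    session_id: str
    command: str


def parse_query(query, sep=":"):
    """Split 'appid:pageid:sessionid:command' into its four parts.

    Missing trailing parts become empty strings. The command string keeps
    any further separators it contains (an assumption of this sketch).
    """
    parts = query.split(sep, 3)
    parts += [""] * (4 - len(parts))
    return WebHubRequest(*parts)


if __name__ == "__main__":
    print(parse_query("a:home:123:color=red"))
    print(parse_query("a/home/123", sep="/"))
```

Any custom data simply rides along in the last field, so the first three fields can always be read the same way.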

When Microsoft added IIS to the operating system, WebHub URLs usually looked like this:

http://localhost/scripts/runisa.dll?
  appid:pageid:sessionid:command

Then we added the Coolness Layer, which was an ISAPI filter enabling short URLs by sitting between Microsoft IIS and the runner, and WebHub URLs could look like this:

http://localhost/appid:pageid:sessionid:command

On domains using only one AppID, this could be shortened to

http://localhost/pageid:sessionid:command

In 2002, we retired the Coolness Layer in favor of StreamCatcher, an ISAPI filter which, among other things, recognized web robots based on their user agent. (While not all web robots play fair and use a clear user agent string, the important ones such as Googlebot do.) Detected web robots could be assigned a pre-arranged session number, and this session number could be hidden from the URL. This meant that links to WebHub pages in Google, Yahoo, etc., could omit the session number, and when used by humans, would receive a session number. A WebHub URL when using StreamCatcher for Google could look this clean:

http://localhost/pageid

When a WebHub developer wants to add support for StreamCatcher, he or she makes a few adjustments to the application-level configuration file and the "server profile", and thereafter, the JUMP macro automatically generates short URLs instead of long ones.
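The robot-handling idea can be sketched as follows, in Python. Everything here is illustrative: the marker list, the pre-arranged session number "1001" and the function names are assumptions, and StreamCatcher itself is an ISAPI filter, not a Python script.

```python
# Illustrative only: marker list and session number are assumptions.
ROBOT_MARKERS = ("googlebot",)   # lowercase substrings of known user agents
ROBOT_SESSION = "1001"           # hypothetical pre-arranged session number


def is_robot(user_agent):
    """Recognize a web robot from its user agent string."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def session_for(user_agent, new_session):
    """Robots share the pre-arranged session; humans get their own."""
    return ROBOT_SESSION if is_robot(user_agent) else new_session


def page_url(page_id, session_id, user_agent):
    """Hide the session number from robots so that indexed links stay clean."""
    if is_robot(user_agent):
        return f"/{page_id}"
    return f"/{page_id}:{session_id}"


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(page_url("rubicon", session_for(bot, "123"), bot))
    print(page_url("rubicon", "123", "Mozilla/5.0"))
```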

The point is that by keeping a simple, consistent set of parameters in the URL, we were lucky enough to survive 14 years of HTTP server software changes without any need to change the syntax of the JUMP macro.

That is not to imply that our upgrade path has been completely trivial; it has not. However, we have provided wizards and conversion tools for the pieces that needed converting, as well as free technical support for most customers needing assistance.

Hooks to Delphi

The fourth major strand deals with the way that we hook from a WHTEKO file into Delphi. There are quite a few ways to get either property values or complex calculated values from Delphi. We saw one example of obtaining a component property value in the minimalist example in listing 1:

(~CentralInfo.WebTimeLocal~)

CentralInfo is the reserved component name of the TwhCentralInfo component, and WebTimeLocal is a property which returns the time on the server (local to the server, not local to the surfer).

Thus obtaining the value of any string, integer, or TStringList property is easy.

In addition, obtaining any data entered by the surfer into a form on any page within the web application is also easy. If you had an input field named "email", you could display the data entered into that field using

(~email~)

and you could access it in Delphi using

pWebApp.StringVar['email']

What about more complex requirements? WebHub has a "web action component" which can be used to trigger any behavior and output any result. The component is called by name, so if you name it "waHello" (for web-action-hello), you could hook to it like this:

(~waHello.execute~)

and you could pass it parameters like this:

(~waHello.execute|a,b,c~)

In the component's OnExecute event handler, you would write your calculation and/or response code. The component's response will be added into the output stream at the exact point where the page designer called it.

Generally, WebHub pages are created from top to bottom. However, there is a unique feature called an ANCHOR. A page designer can set an ANCHOR at any point on the page, and subsequently back-fill content into that position by calling ANCHORMODIFY. This means you can build shared droplets which are generic, with details filled in "a bit later" when some Delphi code has done the necessary database lookups. This is not always needed, but occasionally it enables the impossible.
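The back-fill idea can be pictured with a small output buffer, sketched here in Python. The class PageBuffer and its method names are hypothetical stand-ins chosen to echo ANCHOR and ANCHORMODIFY; this is not how WebHub implements them, only one way to see why the technique works.

```python
class PageBuffer:
    """Collects output chunks; an anchor reserves a slot to be filled later."""

    def __init__(self):
        self._chunks = []
        self._anchors = {}   # anchor name -> index of its reserved chunk

    def send(self, text):
        """Append output in the normal top-to-bottom order."""
        self._chunks.append(text)

    def anchor(self, name, default=""):
        """Reserve a position in the page under the given name."""
        self._anchors[name] = len(self._chunks)
        self._chunks.append(default)

    def anchor_modify(self, name, text):
        """Back-fill content into a previously reserved position."""
        self._chunks[self._anchors[name]] = text

    def render(self):
        return "".join(self._chunks)


if __name__ == "__main__":
    page = PageBuffer()
    page.send("<title>")
    page.anchor("title")                       # value not known yet
    page.send("</title><body>...</body>")
    page.anchor_modify("title", "Order 42")    # known after a later lookup
    print(page.render())
```

The header droplet can stay generic because the detail that depends on a database lookup is supplied after the header has already been written.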

Listing 2 shows what the OnExecute event handler might look like.

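procedure TDM001.waHelloExecute(Sender: TObject);
begin
  // The text sent here appears on the page at the exact point
  // where the page designer wrote (~waHello.execute~).
  pWebApp.Response.Send('<p>hello!</p>');
end;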

Listing 2: OnExecute event handler for a web action component which says hello

Weaving the strands

Now you may start to put the pieces together. Let's look at a real example that you can follow in your web browser. A request to

http://www.href.com/hrefsite:rubicon

is transformed by StreamCatcher into

http://www.href.com/scripts/runisa.dll?hrefsite:rubicon

and is then sent by Microsoft IIS to our runner (runisa.dll), which asks the Hub for a new session number. The EXE serving the AppID "hrefsite" then responds. That EXE contains WebHub components which enable it to know what should happen on the page with id "rubicon". This page happens to be a product-information page, so a series of standard droplets are sent out to give information about the product, Rubicon. When the EXE has finished building the web page, the data goes back to the runner, back to IIS, and back to the surfer's browser to be displayed.

The time to calculate this page is 31 milliseconds, as reported in the page footer for anyone interested. This web application is running on a server bought in 2005, with one Xeon processor. Eight WebHub application instances, running eight different AppIDs, are on that server. With newer hardware, performance could be much quicker, but this is fast enough for our little web site.
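A quick back-of-the-envelope calculation puts the 31 milliseconds in perspective. The Python sketch below assumes, purely for the sake of the arithmetic, that an instance serves one request at a time and that every page takes as long as this one; real traffic is burstier and real pages vary.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_day(ms_per_page, instances=1):
    """Upper bound on pages per day if each instance serves one request at a time."""
    per_second = 1000 / ms_per_page
    return int(per_second * SECONDS_PER_DAY * instances)


def required_rate(pages_in_a_day):
    """Average requests per second needed to serve that many pages in a day."""
    return pages_in_a_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(pages_per_day(31))          # ceiling for one instance at 31 ms per page
    print(required_rate(1_000_000))   # average rate for a million pages per day
```

Under those assumptions a single instance tops out at roughly 2.8 million pages per day, while a million pages per day averages out to fewer than 12 requests per second.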

To be continued…

References:

  1. http://www.endex.com/gf/buildings/bbridge/bbridgefacts.htm
 