UPDATED
Dec 2, 2006
Artwork by Don
21st Century Coding Standards
by James J Keene
Introduction
What happens when a compiler is written based on Intel Pentium opcode and Microsoft Windows specifications? Short answer: the HotBasic compiler.
In early 2003, I needed a compiler. In my survey, no existing compiler in any computer language met even minimal standards for professional software development.
The various C language entries were horse-and-buggy, full of obsolete, misguided design concepts from a bygone era. Many other compilers were in the kiddie, trainer-wheel category. Amazingly, there was nothing available for professional purposes. It appeared that compiler makers felt no obligation to write a good compiler, and professional software writers had been left out in the cold. This was reminiscent, perhaps, of Detroit's big automakers ... before Japanese competitors entered the picture.
So I decided to write a good compiler meeting minimum professional quality standards for my own use, since nobody else had done so. As work proceeded, it became apparent that HotBasic was becoming the best compiler available in any language. I plead total innocence since I did not set out to write the best compiler; it just turned out that way. This article tries to explain why.
Looking back, I now see that I was not alone. It is as if about fifteen years ago, all compiler makers agreed to drop support for professional software writers and target the trainer-wheel, hobbyist market. Thus, all professionals -- defined as anybody who writes programs as part of their work -- not just myself, were essentially ignored. As a result -- and again, I did not plan it this way -- HotBasic has no competition, and the kiddie compilers by Microsoft, Borland and others continue to compete among themselves.
Best Executables
The "under the hood" analysis of HotBasic quality as the Premium Brand, as "well-built and highly desirable," is easily documented; it is not just hype or ego.
The name of the game for a compiler is the executable it produces. Every aspect of the compile task requires a decision regarding exactly which sequences of Pentium opcodes will be written into the application executables which writers will use and distribute. Hence, best executables are obtained to the degree that these decisions are efficient and secure.
1. HotBasic is an original work. My assumption was that a new compiler must have its own library and therefore the task required writing a library and the compiler itself.
When was the last time anybody did this? Decades ago? Almost everything these days is copy-cat, based on existing general-purpose libraries, and therefore has no chance to compete with HotBasic. Most of the recent compilers are in the category of student projects in computing 102: "Your assignment is to write a compiler for twenty language keywords using an existing general-purpose library."
Let us translate "HotBasic is based on Intel Pentium opcodes and Microsoft Windows specifications" into English.
First, the only constraints were (a) what a Pentium CPU can do and (b) what the Windows Operating System (OS) can do. How simple! We are free to be totally ignorant of everything else -- including what general-purpose libraries exist and what methods other authors have used to write compilers. It almost goes without saying that this latter information is irrelevant if no existing compilers meet even minimal coding standards for professional work. Further, in HotBasic's development and use, there is no need to worry about bugs in existing libraries, numerous security issues, poorly-designed computing methods and the like -- all of these are "baggage from the past" which are left for historians to ponder.
Second, focus on these minimal specifications with which the compiler must comply leads, I think, to compiler efficiency. Visual Basic (VB) may be a case in point ... of the opposite. It appears that VB compiler authors were largely unaware of the Windows OS, even though they worked in the same company -- Microsoft.
Today, we can inform them that Microsoft Windows has a user32.dll on all installations from Windows 95 to the present, and it does just about everything regarding display of a graphic form for an application on the screen. Had VB creators been better informed, they would have noticed several other standard .dll files with all the functions needed to write the Visual Basic compiler. Was it sheer ignorance of Windows OS specifications, found in the API details, that led to the creation of special run-time .dll files for VB? Maybe somebody knows the real reason. But that was the day that professional coders were thrown to the dogs and bloatware run-time libraries were born. And wonder of wonders, it seems that all compiler makers competing with Microsoft followed this same template. Nobody adopted the Japanese automaker philosophy of making a better product.
2. HotBasic is independent. The compiler creates an application executable designed to do its own independent data processing with the absolute minimum need for OS services. What are we talking about? HotBasic applications call OS functions for things like file and screen input/output, retrieving resources, things which only the OS can do. In contrast, everything else is done at the application level with HotBasic's own code-generation protocols and original library.
And this is saying a lot. A few examples: the LIST Object is entirely implemented by the executable, with no calls to OS functions regarding "atoms" and the like. Arrays, variants, math operations, string objects and numerous other standard computing items are handled entirely at the application level.
While various compilers may each offer their own mix on this design principle, it is clear that most do not hesitate to call on the OS or a run-time .dll for the most trivial of computing tasks. Proof? Easy. Just load non-HotBasic executables, or even Microsoft .dll files, into a text viewer -- yes, that's right, load any of these binary files into a text viewer -- and scroll down and look at what you see in plain text. Typically, you will see names of functions that determine string length, copy strings, and all sorts of trivia. In summary, you can see for yourself all the trivia that these non-HotBasic executables cannot do themselves. Case closed.
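For readers who prefer a script to a text viewer, the check described above can be approximated with a minimal Python sketch. It assumes only that imported function names are stored in the file as plain ASCII text. The helper name printable_runs and the sample bytes are hypothetical, made up for illustration; they are not taken from any real executable.

```python
import re

def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII characters at least min_len long."""
    # 0x20..0x7e covers the printable ASCII range, space through tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Synthetic stand-in for a slice of a binary file (an assumption for the demo):
# readable names separated by non-printable bytes, plus one run that is too short.
sample = (
    b"\x00\x01" + b"lstrlenA" +
    b"\x00\x02\x03" + b"lstrcpyA" +
    b"\x00\xff" + b"ab" +
    b"\x00" + b"KERNEL32.dll" + b"\x00"
)

names = printable_runs(sample)
print(names)  # ['lstrlenA', 'lstrcpyA', 'KERNEL32.dll']
```

To try it on a real file, pass the file's bytes instead of the sample, for example printable_runs(open(path, "rb").read()), where path is whatever executable or .dll you want to inspect. The sketch only lists the readable names in the file; it does not show how often each function is called.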
With HotBasic, we have grown up, so to speak. We don't run to Mama for every little detail of our computing work; HotBasic's generated and library code does all of these functions.
Thus, any machine running HotBasic executables will suddenly become more productive for all the other applications running. Since HotBasic off-loads from the OS much data processing which it does like a big boy, all by itself, the OS can attend more to the sniffling brats compiled with obsolete compilers, including Microsoft's own .dll files.
3. HotBasic has its own link library. It is simply not possible to write a new compiler without writing a matching library -- that is, if the compiler is to be competitive. C compilers are still library-dumb and the library is compiler-dumb, for example, as documented in detail in Compiler Library Statistics.
A library is a set of procedures which programs may use. It is most often preferable to place such procedures in a library, since a program can call them with a single Pentium opcode. An alternative would be to repeat the procedure code within the program each time it is needed.
General-purpose libraries have no knowledge of where input values are located in a program, and therefore require that all such values be put in a known location -- usually the application stack. There are literally thousands of useful procedures in .dll and other library files on every Microsoft Windows machine and program source code can access these as needed. A program uses these general-purpose libraries only when the source code language keywords do not natively perform the needed action.
The source code of a program does much more than that. It is filled with the keywords of the computer language used, such as PRINT, SHOWMESSAGE, IF ... THEN statements, and so forth. When a compiler writes an executable implementing these keywords, it may include calls to library functions. However, the library containing these procedures is not a general-purpose library, if the compiler is well-written.
Why? Remember that a compiler and its library are a pair of tools, where each should be aware of how the other works. The HotBasic compiler writes Pentium opcodes according to rules which dictate where values are stored -- CPU registers, known memory locations, etc. Its library procedures know where these values are. Hence, there is seldom any need to place input values (known as arguments or parameters) on the application stack in a HotBasic best executable before a library procedure call, because the library was written for the compiler and already knows where the input values are. In sum, HotBasic is more efficient since needless movement of data to and from the application stack is avoided for the most part.
On the other hand, compilers that implement source code language keywords by calling procedures in a general-purpose library, as is presently done in all well-known compilers, are intrinsically inefficient and obsolete. Alas, the only remedy for software writers is to upgrade from their current language/compiler to HotBasic to deliver better workmanship in their applications. And for compiler makers, the remedy is to rewrite both their compiler and library. In that case, HotBasic would have its first real competition -- it can get lonely being the only "premium brand" compiler; there is room for more.
Let us document the foregoing with a test program, overhead.bas:
$APPTYPE CONSOLE
defstr k$
k$="HotBasic General-Purpose Library Overhead Test"
color 15: print k$: color 7: print
defint i,j,k = 5000000
defreal10 t0,t,t1
declare sub lstrlen lib "kernel32" (c$ as string)
t=timer
for i=one to k 'empty loop
next i
t0=timer - t: k$=left$(str$(t0),6)
print k; " empty FOR NEXT iterations : "; k$; " seconds"
t=timer
for i=one to k
j=len("A") 'short string factors out function time
next i
t1=timer - t: dec(t1,t0): k$=left$(str$(t1),6)
print k; " native HotBasic LEN() call: "; k$; " seconds"
t=timer
for i=one to k
lstrlen("A"): j=retfunc
next i
t=timer - t: dec(t,t0): k$=left$(str$(t),6)
print k; " kernel32 lstrlen API call : "; k$; " seconds"
print
print "[Native HB]/[Windows API] ratio = "; t1/t
print
PAUSE
END
Overhead.bas produces these results on a 300 MHz machine:
HotBasic General-Purpose Library Overhead Test
5000000 empty FOR NEXT iterations : 0.0700 seconds
5000000 native HotBasic LEN() call: 0.2909 seconds
5000000 kernel32 lstrlen API call : 1.0309 seconds
[Native HB]/[Windows API] ratio = 0.282250242483025
Notice that the 0.07 sec for empty FOR NEXT loop time is subtracted from the two test condition times, to better estimate their times independently of the FOR NEXT loop time in the code. Further, we use a short string "A" to minimize the actual time used by HotBasic's and kernel32's string length functions.
If we assume near-zero overhead in the HotBasic LEN() call (a single "call" opcode; no stack arguments), a high percentage of the 0.2909 secs reported above is the actual time used by the LEN() function in the HotBasic library.
Let us also assume, for our present purposes, that the kernel32 lstrlen function uses about the same time as HotBasic's LEN(). But the kernel32 calls took 1.0309 secs. Why all the extra time? 1.0309 - 0.2909 is 0.7400 secs, much more than HotBasic's total time of 0.2909 seconds. Another way to look at the overhead involved in calling general-purpose library functions is the ratio above showing that HotBasic gets the job done in less than 30% of the time required when a compiler-dumb general-purpose library is used.
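The arithmetic in the preceding paragraphs can be re-checked with a short Python sketch. It uses only the two displayed timings from the results listing; the variable names are chosen here for illustration. The ratio it prints differs slightly in the later decimal places from the one printed by overhead.bas, presumably because the program computes its ratio from the untruncated timer values, while the displayed times are cut to six characters.

```python
# Displayed timings from the overhead.bas results (seconds, empty-loop time already subtracted).
native_len = 0.2909   # native HotBasic LEN() calls
api_call = 1.0309     # kernel32 lstrlen API calls

# Extra time attributed to calling through the general-purpose library,
# under the text's assumption that both length functions do similar work.
extra = api_call - native_len

# Share of the API-call time taken by the native call, and by the extra time.
ratio = native_len / api_call
extra_share = extra / api_call

print(f"extra time : {extra:.4f} s")       # extra time : 0.7400 s
print(f"ratio      : {ratio:.4f}")         # ratio      : 0.2822
print(f"extra share: {extra_share:.1%}")   # extra share: 71.8%
```

Read this way, roughly 72% of the time measured for the API condition is accounted for by the extra 0.74 seconds, which is the same finding as the "less than 30%" ratio stated from the other direction.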
Please note that kernel32.dll is by definition a general-purpose library and, as part of the Windows OS, it must be, as explained above. And by definition, a general-purpose library is "compiler-dumb". So the intent is not to throw mud at kernel32.dll, but to underscore what a general-purpose library is and when its usage is warranted.
For example, in HotBasic and most computer languages, the software writer can access any procedure written by anybody, if it is not covered by the language's native keywords. Most often, this is implemented with DECLARE statements of some sort. In these cases, the called procedures are almost always in general-purpose libraries.
In contrast, when a computer language, such as any variation of C or VB, implements its own native keywords with calls to an associated general-purpose library, then it is appropriate to call the library "compiler-dumb" and raise the question of why this is so. The answer is simple: the compiler is poorly written and lacks a matching flagship library. Hence, the question reduces to: How is it possible that the library associated with the language does not know where its input parameters are in most cases? The implication is that the compiler generates binary code for the executable in a chaotic, haphazard manner.
The foregoing demonstration provides concrete data on the rationale supporting HotBasic design principle #2 above: HotBasic is independent. Instead of running to Mama for every little nit and nat in routine computing, a 21st century compiler should produce executables which can do their own work.
Please note that the "Windows API" condition in the overhead.bas experiment is a simulation of what happens with all C language compilers, which use general-purpose libraries for nearly everything. Thus, the above data also provide strong support for HotBasic design principle #3 above: HotBasic has its own link library. This factor alone is a fine introduction to appreciating how outdated, how primitive, C compilers and a whole host of software development tools based on C software are.
It seems C was a great victory early on since its goal was modest: free coders from the drudgery of writing binary opcodes or assembler keywords, by providing a higher level language. That's it. Anything that worked in any manner was a great tool, great progress, a great victory. At first, nobody had the perspective we now have to see that there is plenty of foolishness in C and some of these horse-and-buggy features, such as using general-purpose libraries to implement native keywords, became "givens" for what followed. Apparently, after the initial glow of a successful higher language, nobody stopped to ask, "Does this make any sense?"
It may be noteworthy that design principles #2 and #3 depend on #1: HotBasic is an original work, which asserts that the only pertinent information for the task of producing best executables is the processor and platform specifications.
And now you know why. Nothing in technical specifications from these outstanding corporations -- Intel for the CPU hardware and Microsoft for OS software -- requires that compilers be poorly designed or that executables be inefficient or insecure. Indeed, no doubt both Intel and Microsoft welcome the advent of best executables.
We have only begun to enumerate what we lovingly call 21st century coding standards. Please stay tuned for items #4 and up in continuation articles further documenting the HotBasic revolution.
Copyright © 2006 James J Keene PhD
HotBasic™ is a trademark of James J Keene
Original Publication: Nov 29, 2006