Updated: Apr 10, 2006
Compiler Library Statistics
by James J Keene
Introduction
Most compilers use general-purpose libraries. A general-purpose library assumes no knowledge of where the values used by its procedures (known as parameters or arguments) are located. Thus, any application can call the procedures in the library, but it must pay the price of putting the arguments in a known location, usually the application stack.
Windows' .dll modules, such as kernel32.dll, must be written as general-purpose libraries so Windows can provide services to a wide variety of applications which differ internally in how they handle and locate data.
On the other hand, a well-written compiler cannot use a general-purpose library in production of executables. The compiler generates the code, so it knows where the common types of arguments are located, and its library can therefore also know where most argument values are. In short, a well-written compiler needs its own library specifically written for producing executables.
The number of library procedures that have no stack arguments may indicate the degree to which a particular computer language and compiler meet this quality criterion. The present study asks: Do the various C language compilers (C, C++, C#, etc.) use general-purpose libraries to produce executables?
Methods
Windows kernel32.dll was used as a baseline for a general-purpose library, although it is not a library that a compiler uses to produce executables. The stack arguments of each of its procedures were counted from the procedure's prototype, as the number of dword values placed on the stack.
The C language prototypes shown in the header (.h) files for stdio, stdlib and the other libraries listed in Table 1 were used to count arguments as the number of dword values. The counting method was conservative: (1) ",..." arguments were counted as one, although this might more accurately be estimated as two or more; and (2) REAL10 ("long double") values were counted as two (same as "double"), although these 80-bit values might more accurately be counted as three dwords on the stack.
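The counting rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the procedure actually used for Table 1; the function names (split_params, count_dword_args) are hypothetical, and the sketch assumes a 32-bit target where every non-double argument occupies one dword. Prototypes whose return type is itself a function pointer are not handled.

```python
def split_params(param_text):
    """Split a C parameter list on top-level commas only, so that a
    function-pointer parameter with its own commas stays in one piece."""
    params, depth, current = [], 0, []
    for ch in param_text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            params.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        params.append(tail)
    return params


def count_dword_args(prototype):
    """Count dword stack arguments for one prototype (hypothetical helper).

    Rules taken from the text: "..." counts as one; "double" and
    "long double" passed by value count as two; "(void)" counts as zero.
    Assumption: everything else, including pointers, counts as one.
    """
    start = prototype.index("(")
    end = prototype.rindex(")")
    params = split_params(prototype[start + 1:end])
    if not params or params == ["void"]:
        return 0
    total = 0
    for param in params:
        if param == "...":
            total += 1          # conservative: variadic part counted once
        elif "*" in param or "[" in param:
            total += 1          # an address is a single dword
        elif "double" in param.split():
            total += 2          # double and long double both counted as two
        else:
            total += 1
    return total


if __name__ == "__main__":
    for proto in ("int rand(void);",
                  "int printf(const char *format, ...);",
                  "double pow(double x, double y);"):
        print(proto, "->", count_dword_args(proto))
```

Under these rules rand counts as a procedure with no arguments, printf as two dwords and pow as four.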
Table 1: Compiler Library Statistics
Library       Procs   With no args   Av. args/proc   Total args
~~~~~~~       ~~~~~   ~~~~~~~~~~~~   ~~~~~~~~~~~~~   ~~~~~~~~~~
HotBasic..      205   135 (65.9%)    0.87 [2.56]            179
C
  stdio          64     5 ( 7.8%)    1.84 [2.00]            118
  stdlib         57     3 ( 5.3%)    2.14 [2.26]            122
  string         37     0 ( 0.0%)    2.16 [2.16]             80
  math           78     1 ( 1.2%)    2.35 [2.39]            184
  time           13     2 (15.4%)    1.31 [1.55]             17
  conio          18     3 (16.7%)    1.11 [1.33]             20
  ctype          19     0 ( 0.0%)    1.00 [1.00]             19
  malloc         12     3 (25.0%)    1.00 [1.33]             12
  process        21     2 ( 9.5%)    2.95 [3.26]             62
C total...      319    19 ( 5.9%)    2.14 [2.28]            684
kernel32..      803    35 ( 4.4%)    2.42 [2.53]           1946
Legend: "double" and "long double" counted as two stack arguments; ",..." counted as one; based on lcc C libraries. kernel32 is Windows kernel32.dll. Average (Av.) in [] based on procedures with arguments.
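For readers who want to check the tabulation, the derived columns of Table 1 follow from three counts per library. The sketch below shows that arithmetic; the function name row_stats is hypothetical, and it rounds to the nearest digit, so the last digit may differ from a table entry that was truncated.

```python
def row_stats(procs, no_args, total_args):
    """Derive the Table 1 columns from raw counts (hypothetical helper).

    Returns (percent with no args, average args per procedure,
    average args per procedure that has arguments).
    """
    pct_no_args = round(100.0 * no_args / procs, 1)
    avg_all = round(total_args / procs, 2)
    with_args = procs - no_args
    # Guard against a library in which no procedure takes arguments.
    avg_with_args = round(total_args / with_args, 2) if with_args else 0.0
    return pct_no_args, avg_all, avg_with_args


if __name__ == "__main__":
    print("HotBasic:", row_stats(205, 135, 179))   # counts from Table 1
    print("kernel32:", row_stats(803, 35, 1946))   # counts from Table 1
```

With the HotBasic counts this gives 65.9%, 0.87 and 2.56; with the kernel32 counts it gives 4.4%, 2.42 and 2.53, matching the table rows.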
Results
Table 1 shows that only 4.4% of the 803 procedures in kernel32.dll have no stack arguments. This library serves as a "model" or baseline for a general-purpose library: its procedures can have no idea of argument location, and therefore all arguments must be passed to the procedure, in this case on the stack.
Of 319 C language procedures counted, only 5.9% have no stack arguments. In contrast, 65.9% of 205 HotBasic library procedures have no stack arguments.
Discussion
The C libraries appear similar to general-purpose libraries. They do not resemble a "flagship" library built for a compiler to use in making executables. Such a compiler should know where most arguments are, typically in CPU registers or global locations, and therefore its library procedures should know as well.
Comparing C and HotBasic, one can observe why a compiler cannot use a general-purpose library and also produce "best executables" -- namely, there is a significant unneeded overhead in shuffling data to/from the stack.
This difference may be magnified by two additional facts: (1) the 205 HotBasic procedures cover a much greater scope of functionality than the 319 C procedures, and (2) many of the C procedures would not be required at all with a better-written compiler. That is, various C and other compilers add unneeded overhead by calling many procedures in the first place, instead of improving compiler quality on this design issue.
The software writer may take a big hit when trying to do math in C. Because the library is "compiler-unaware", arguments for all the math functions are put on the stack. Even worse, these are most often double values, which need two stack operations per argument. Not only is double a second-class level of precision on today's FPU-based CPUs, but calculations are also slowed by even larger overhead, compared to a "real compiler" and its library as found in HotBasic.
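The "two stack operations per argument" figure is simply the size of the value divided by the dword size, rounded up. A minimal sketch, assuming a 32-bit x86 target with 4-byte stack slots; the names dwords_needed and push_cost are hypothetical.

```python
import struct

DWORD_BYTES = 4  # assumption: 32-bit x86, one stack slot is one dword

FLOAT_BYTES = struct.calcsize("<f")    # IEEE 754 single precision, 4 bytes
DOUBLE_BYTES = struct.calcsize("<d")   # IEEE 754 double precision, 8 bytes
EXTENDED_BYTES = 10                    # x87 80-bit extended format (REAL10)


def dwords_needed(size_bytes):
    """Number of dword stack slots needed to hold a value of this size."""
    return -(-size_bytes // DWORD_BYTES)   # ceiling division


def push_cost(arg_sizes):
    """Total dword slots for a call whose arguments have these byte sizes."""
    return sum(dwords_needed(size) for size in arg_sizes)


if __name__ == "__main__":
    print("double:", dwords_needed(DOUBLE_BYTES))            # 2 dwords
    print("REAL10:", dwords_needed(EXTENDED_BYTES))          # 3 dwords
    print("two doubles:", push_cost([DOUBLE_BYTES] * 2))     # 4 dwords
```

A two-argument math function taking doubles therefore places four dwords on the stack, and an 80-bit value would need three slots rather than the two counted in Table 1.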
Notice that the low percentage of procedures with no stack arguments in kernel32.dll and in the C language prototypes does not result from designing compiler code generation to match the parameter needs of library procedures. It is simply a "baseline": the share of procedures that require no parameters from the application.
These results may be surprising. It appears that C compiler makers have not optimized the compiler/library package for producing executables. Hence, for C and other language compilers to be competitive in this aspect of 21st Century coding standards, a major rewrite of both the compiler and its library is required.
Summary
o About 2/3 (65.9%) of HotBasic.lib procedures have no stack arguments, because the procedures already know where the arguments are, whereas only 5.9% of C library procedures counted have no arguments "(void)". This is over a 10x efficiency factor favoring HotBasic; more like 20x when one considers that the C compiler mindlessly puts the arguments on the stack and the "argument-dumb" library has to then get them (two unneeded movements of data).
Not to mention the bloating of the .exe by all the code to do the mindless, pointless data shuffling in C language compilers.
o All compilers/languages based on these sorts of C libraries are clearly "not 21st Century", not efficient and not capable of producing best executables.
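The "over 10x" figure in the first point is the ratio of the two no-argument shares in Table 1. A short sketch of that arithmetic follows; the function name share_ratio is hypothetical, and the further doubling to "20x" is the author's estimate rather than something this calculation derives.

```python
def share_ratio(no_args_a, procs_a, no_args_b, procs_b):
    """Ratio of the no-argument share in library A to that in library B."""
    share_a = no_args_a / procs_a
    share_b = no_args_b / procs_b
    return share_a / share_b


if __name__ == "__main__":
    # Counts from Table 1: HotBasic 135 of 205, C total 19 of 319.
    ratio = share_ratio(135, 205, 19, 319)
    print(f"HotBasic vs C no-argument share: {ratio:.1f}x")
```

Computed from the raw counts, the ratio comes out at about 11.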
Note: Comments on, and checks of, the data tabulation in Table 1 are welcome.
Copyright © 2006 James J Keene PhD
HotBasic™ is a trademark of James J Keene
Original Publication: Apr 9, 2006