The B::C, B::CC, B::Bytecode Perl Compiler Kit Copyright (c) 1996, 1997, Malcolm Beattie Copyright (c) 2008, 2009 Reini Urban Releases are at http://search.cpan.org/dist/B-C/ Code is at http://code.google.com/p/perl-compiler/ svn checkout http://perl-compiler.googlecode.com/svn/trunk/ perl-compiler USAGE The Bytecode, C and CC backends are now all functional enough to compile almost the whole of the main perl test suite. In the case of the CC backend, any failures are all due to differences and/or known bugs documented below. See the file TESTS. In the following examples, you'll need to replace "perl" by perl -Iblib/arch if you have built the extensions for a dynamic loading platform but haven't installed the extensions completely. You'll need to replace "perl" by ./perl if you have built the extensions into a statically linked perl binary. (1) To compile perl program foo.pl with the C backend, do perl -MO=C,-ofoo.c foo.pl Then use the cc_harness perl program to compile the resulting C source: perl cc_harness -O2 -o foo foo.c If you are using a non-ANSI pre-Standard C compiler that can't handle pre-declaring static arrays, then add -DBROKEN_STATIC_REDECL to the options you use: perl cc_harness -O2 -o foo -DBROKEN_STATIC_REDECL foo.c If you are using a non-ANSI pre-Standard C compiler that can't handle static initialisation of structures with union members then add -DBROKEN_UNION_INIT to the options you use. If you want command line arguments passed to your executable to be interpreted by perl (e.g. -Dx) then compile foo.c with -DALLOW_PERL_OPTIONS. Otherwise, all command line arguments passed to foo will appear directly in @ARGV. The resulting executable foo is the compiled version of foo.pl. See the file NOTES for extra options you can pass to -MO=C. There are some constraints on the contents on foo.pl if you want to be able to compile it successfully. Some problems can be fixed fairly easily by altering foo.pl; some problems with the compiler are known to be straightforward to solve and I'll do so soon. The file Todo lists a number of known problems. See the XSUB section lower down for information about compiling programs which use XSUBs. (2) To compile foo.pl with the CC backend (which generates actual optimised C code for the execution path of your perl program), use perl -MO=CC,-ofoo.c foo.pl and proceed just as with the C backend. You should almost certainly use an option such as -O2 with the subsequent cc_harness invocation so that your C compiler uses optimisation. The C code generated by the Perl compiler's CC backend looks ugly to humans but is easily optimised by C compilers. To make the most of this optimizing compiler backend, you need to tell the compiler when you're using int or double variables so that it can optimise appropriately (although this part of the compiler is the most buggy). You currently do that by naming lexical variables ending in "_i" for ints, "_d" for doubles, "_ir" for int "register" variables or "_dr" for double "register" variables. Here "register" is a promise that you won't pass a reference to the variable into a sub which then modifies the variable. The compiler ought to catch attempts to use "\$i" just as C compilers catch attempts to do "&i" for a register int i but it doesn't at the moment. Bugs in the CC backend may make your program fail in mysterious ways and give wrong answers rather than just crash in boring ways. But, hey, this CC is still on the beta level so you knew that anyway. See the XSUB section lower down for information about compiling programs which use XSUBs. If your program uses classes which define methods (or other subs which are not exported and not apparently used until runtime) then you'll need to use -u compile-time options (see the NOTES file) to force the subs to be compiled. Future releases will probably default the other way, do more auto-detection and provide more fine-grained control. Since compiled executables need linking with libperl, you may want to turn libperl.a into a shared library if your platform supports it. For example, with Digital UNIX, do something like ld -shared -o libperl.so -all libperl.a -none -lc and with Linux/ELF, rebuild the perl .c files with -fPIC (and I also suggest -fomit-frame-pointer for Linux on Intel architectures), do "make libperl.a" and then do gcc -shared -Wl,-soname,libperl.so.5 -o libperl.so.5.3 `ar t libperl.a` and then cp libperl.so.5.3 /usr/lib cd /usr/lib ln -s libperl.so.5.3 libperl.so.5 ln -s libperl.so.5 libperl.so ldconfig When you compile perl executables with cc_harness, append -L/usr/lib otherwise the -L for the perl source directory will override it. For example, perl -Iblib/arch -MO=CC,-O2,-ofoo3.c foo3.bench perl cc_harness -o foo3 -O2 foo3.c -L/usr/lib ls -l foo3 -rwxr-xr-x 1 mbeattie xzdg 11218 Jul 1 15:28 foo3 You'll probably also want to link your main perl executable against libperl.so; it's nice having an 11K perl executable. (3) To compile foo.pl into bytecode do perl -MO=Bytecode,-ofoo.plc foo.pl To run the resulting bytecode file foo.plc, you use the ByteLoader module which should have been built along with the extensions. perl -MByteLoader foo.plc Previous Perl releases had ByteLoader in CORE, so you can omit -MByteLoader there. You can also do -H to automatically use ByteLoader. perl -MO=Bytecode,-H,-ofoo.plc foo.pl perl foo.plc Any extra arguments are passed in as @ARGV; they are not interpreted as perl options. See the NOTES file for details of these and other options (including optimisation options and ways of getting at the intermediate "assembler" code that the Bytecode backend uses). (3) There are little Bourne shell scripts and perl programs to aid with some common operations: perlcc, assemble, disassemble, cc_harness, t/testc.sh, t/testplc.sh, pl2exe.pl XSUBS The C and CC backends can successfully compile some perl programs which make use of XSUB extensions. [I'll add more detail to this section in a later release.] As a prerequisite, such extensions must not need to do anything in their BOOT: section which needs to be done at runtime rather than compile time. Normally, the only code in the boot_Foo() function is a list of newXS() calls which xsubpp puts there and the compiler handles saving those XS subs itself. For each XSUB used, the C and CC compiler will generate an initialiser in their C output which refers to the name of the relevant C function (XS_Foo_somesub). What is not yet automated is the necessary commands and cc command-line options (e.g. via "perl cc_harness") which link against the extension libraries. For now, you need the XSUB extension to have installed files in the right format for using as C libraries (e.g. Foo.a or Foo.so). As the Foo.so files (or your platform's version) aren't suitable for linking against, you will have to reget the extension source and rebuild it as a static extension to force the generation of a suitable Foo.a file. Then you need to make a symlink (or copy or rename) of that file into a libFoo.a suitable for cc linking. Then add the appropriate -L and -l options to your "perl cc_harness" command line to find and link against those libraries. You may also need to fix up some platform-dependent environment variable to ensure that linked-against .so files are found at runtime too. DIFFERENCES The result of running a compiled Perl program can sometimes be different from running the same program with standard perl. Think of the compiler as having a slightly different implementation of the language Perl. Unfortunately, since Perl has had a single implementation until now, there are no formal standards or documents defining what behaviour is guaranteed of Perl the language and what just "happens to work". Some of the differences below are almost impossible to change because of the way the compiler works. Others can be changed to produce "standard" perl behaviour if it's deemed proper and the resulting performance hit is accepted. I'll use "standard perl" to mean the result of running a Perl program using the perl executable from the perl distribution. I'll use "compiled Perl program" to mean running an executable produced by this compiler kit ("the compiler") with the CC backend. Loops Standard perl calculates the target of "next", "last", and "redo" at run-time. The compiler calculates the targets at compile-time. For example, the program sub skip_on_odd { next NUMBER if $_[0] % 2 } NUMBER: for ($i = 0; $i < 5; $i++) { skip_on_odd($i); print $i; } produces the output 024 with standard perl but gives a compile-time error with the compiler. Context of ".." The context (scalar or array) of the ".." operator determines whether it behaves as a range or a flip/flop. Standard perl delays until runtime the decision of which context it is in but the compiler needs to know the context at compile-time. For example, @a = (4,6,1,0,0,1); sub range { (shift @a)..(shift @a) } print range(); while (@a) { print scalar(range()) } generates the output 456123E0 with standard Perl but gives a compile-time error with compiled Perl. Arithmetic Optimized Compiled Perl programs use native C arithmetic much more frequently than standard perl. Operations on large numbers or on boundary cases may produce different behaviour. Deprecated features Features of standard perl such as $[ which have been deprecated in standard perl since version 5 was released have not been implemented in the compiler. STATUS See STATUS CHANGES See Changes BUGS Here are some things which may cause the compiler problems. The following render the compiler useless (without serious hacking): * Use of the DATA filehandle (via __END__ or __DATA__ tokens) on certain Perl versions. * Operator overloading with %OVERLOAD * The (deprecated) magic array-offset variable $[ does not work * For -O1 and -O2 copy-on-grow of static pvs and heks since 5.10 is disabled. Fixes thof e destruction failures there is work in progress. * The following operators are not yet implemented for CC goto sort with a non-default comparison (i.e. a named sub or inline block) continue/next/last to a outer LABEL * You can't use "last" to exit from a non-loop block. The following may give significant problems: * BEGIN blocks containing complex initialisation code * Code which is only ever referred to at runtime (e.g. via eval "..." or via method calls): see the -u option for the C and CC backends. * Run-time lookups of lexical variables in "outside" closures The following may cause problems (not thoroughly tested): * Dependencies on whether values of some "magic" Perl variables are determined at compile-time or runtime. * For the C and CC backends: compile-time strings which are longer than your C compiler can cope with in a single line or definition. * Reliance on intimate details of global destruction * For the Bytecode backend: high -On optimisation numbers with code that has complex flow of control. * Any "-w" option in the first line of your perl program is seen and acted on by perl itself before the compiler starts. The compiler itself then runs with warnings turned on. This may cause perl to print out warnings about the compiler itself since I haven't tested it thoroughly with warnings turned on. There is a terser but more complete list in the Todo file. LICENSE This program is free software; you can redistribute it and/or modify it under the terms of either: a) the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version, or b) the "Artistic License" which comes with this kit. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either the GNU General Public License or the Artistic License for more details. You should have received a copy of the Artistic License with this kit, in the file named "Artistic". If not, you can get one from the Perl distribution. You should also have received a copy of the GNU General Public License, in the file named "Copying". If not, you can get one from the Perl distribution or else write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. Reini Urban Fri 18 December 2009 Malcolm Beattie 2 September 1996