9

A preprocessor is a tool that takes source code as input and outputs a modified version of it for input to a downstream tool, such as a compiler or interpreter. Preprocessors can be used to implement software variability (e.g., by conditionally including code specific to certain target architectures) or to realize control structures and other features not present in the programming language of the source code.

Perhaps the most famous preprocessor is the C preprocessor, CPP, which according to Dennis Ritchie was developed around 1972 or 1973 and became a standard part of the language. However, CPP wasn't the only preprocessor in use in 1972; IFTRAN, for example, was a FORTRAN preprocessor intended to provide support for various structured programming concepts that didn't yet exist in the language.

I'm interested in learning more about the history of preprocessors, in particular the earliest known source code preprocessor and what programming language it was intended for. Did preprocessors get their start with high-level programming languages, or were they in widespread use back when all we had was assembly?

6
  • 1
    The question is rather pointless, as it asks for examples of a concept that was only codified much later. After all, it's next to impossible to distinguish the first languages, like Flow-Matic, from preprocessors, and macro assemblers even less so.
    – Raffzahn
    Commented May 24, 2020 at 22:04
  • In the ancient times compilers had multiple passes, each slowly moving towards the final goal. I've seen that the Algol60 compiler for the GIER did this - I would suggest that the first step or two were exactly this.
    Commented May 25, 2020 at 9:25
  • 1
    One reason for having more than one pass in a compiler is to allow forward references. The first pass scans the source for defined variables or macros. Subsequent passes use those definitions. Languages such as Algol-60 evolved away from allowing forward references, enabling compiling with one less pass.
    Commented May 25, 2020 at 10:38
  • 1
    @another-dave: "Here is a language so far ahead of its time, that it was not only an improvement on its predecessors, but also on nearly all its successors." – Tony Hoare on Algol 60.
    Commented May 25, 2020 at 19:06
  • 1
    I know that quote. However, unlike Hoare, I also thought Algol 68 a decent language, though whether it should have been called 'Algol' is a separate question.
    – dave
    Commented May 25, 2020 at 20:57

2 Answers

15

The very name we have for one of the constructs -- "macro", short for "macroinstruction" -- comes from the assembly-language era, the prefix "macro" having its usual meaning (as a modifier) of something large, in this case larger than one instruction.

A macroinstruction looks like an instruction in the assembly language, generally follows the same syntax rules as the assembly language, but generates one or more actual machine instructions.
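As a rough illustration of that expansion step, here is a minimal sketch in Python. It is not modelled on any historical assembler: the mnemonics (`LOAD`, `ADD`, `STORE`, `HALT`), the `INCR` macro, and the `{addr}` placeholder syntax are all made up for the example.

```python
# Hypothetical macro table: each macro name maps to (parameter names, body lines).
# The mnemonics and the INCR macro are invented for illustration only.
MACROS = {
    "INCR": (["addr"], ["LOAD {addr}", "ADD ONE", "STORE {addr}"]),
}


def expand(source_lines, macros=MACROS):
    """Replace each macro call with its body; pass all other lines through unchanged."""
    output = []
    for line in source_lines:
        parts = line.split()
        if parts and parts[0] in macros:
            params, body = macros[parts[0]]
            args = parts[1:]
            if len(args) != len(params):
                raise ValueError(f"{parts[0]} expects {len(params)} argument(s)")
            bindings = dict(zip(params, args))
            # One macroinstruction becomes several ordinary instructions.
            output.extend(template.format(**bindings) for template in body)
        else:
            output.append(line)
    return output


program = ["LOAD X", "INCR COUNT", "HALT"]
expanded = expand(program)
for instruction in expanded:
    print(instruction)
```

The single `INCR COUNT` line reads like any other instruction in the source, but it is replaced by the three-instruction body before anything is assembled.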

There are at least 4 cases to consider when looking for the first macroprocessor/preprocessor.

  1. A standalone preprocessor (which is what the question asks).

  2. Macroinstructions built into an assembler for special cases, with no facility for defining more in the source code.

  3. Macro assemblers permitting definition and use in source code.

  4. Preprocessing facilities built into higher-level language compilers.

Having established my taxonomy, I am unable to answer the question of "first". I should note that before C came into being, the parties involved had worked on Multics, whose system programming language was PL/I. PL/I came with extensive compile-time, i.e. preprocessing, facilities (my case #4). Ritchie, McIlroy, and co. would certainly have been aware of these facilities, and we may well suppose that the simplicity of the C preprocessor is a reaction to the rococo complexity of PL/I in that respect.

One possible contender is the 705 Autocoder System from IBM, manual dated 1957. I think the manual is describing a macro-assembler rather than a preprocessor to a later assembler. "The Autocoder compiles instructions written in a simple notation and translates them into a program in the language of the machine".


I found this paper on The History of Macro Processing in Programming Language Extensibility, which looks like it might be relevant and interesting.

3
  • With suitable hand-waving, you can convert a macro-assembler into a "preprocessor to a later assembler" by running its output through a disassembler. So I'm not sure you have to worry too much about which of the two the manual is for, as long as you're not actually planning to run it!
    Commented May 25, 2020 at 23:29
  • "generates one or more actual machine instructions" should be "generates zero or more actual machine instructions"
    Commented Oct 20, 2021 at 14:15
  • Not so sure the word "macroinstruction" applies if it generates less than an actual instruction!
    – dave
    Commented Oct 20, 2021 at 16:57
10

Preprocessing is older than high-level languages. Macro systems came into use in the mid-1950s as ways to reduce the amount of assembler code that needed to be written and to make it easier to comply with programming standards. At first, preprocessors were separate programs, but by the late 1950s they were followed by "macro-assemblers", which had built-in preprocessors. Identifying the very first preprocessor may well come down to a matter of definition.

1
  • 2
    Agreed, and to your point, it might be tricky to come up with "the first".
    – dave
    Commented May 24, 2020 at 21:33
