The RDP parser generator

RDP compiles attributed LL(1) grammars decorated with C-language semantic actions into recursive descent parsers. It can automatically build Abstract Syntax Trees using our Reduced Derivation Tree model. It has built-in support for symbol table handling, set manipulation and generalised graph representation.
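
To give a feel for the kind of parser described here, the following is a hand-written sketch of a recursive descent evaluator for a three-rule LL(1) expression grammar. It is not RDP output, and the grammar in the comment is generic EBNF rather than RDP's input syntax. Each rule becomes one C function, the next input character acts as the single symbol of lookahead, and the arithmetic stands in for semantic actions. The names `struct parser`, `parse_expr`, `parse_term`, `parse_factor` and `evaluate` are invented for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Grammar (generic EBNF, not RDP syntax):
 *   expr   ::= term { ('+' | '-') term }
 *   term   ::= factor { '*' factor }
 *   factor ::= INTEGER | '(' expr ')'
 */

/* Hypothetical parser state: the input text and a sticky error flag. */
struct parser {
  const char *p;   /* next unread character */
  int error;       /* set to 1 on the first syntax error */
};

static long parse_expr(struct parser *ps);

/* Skip white space so that *ps->p is the one-character lookahead. */
static void skip_space(struct parser *ps)
{
  while (isspace((unsigned char) *ps->p))
    ps->p++;
}

/* factor ::= INTEGER | '(' expr ')' */
static long parse_factor(struct parser *ps)
{
  long value = 0;

  skip_space(ps);
  if (isdigit((unsigned char) *ps->p)) {
    while (isdigit((unsigned char) *ps->p))
      value = value * 10 + (*ps->p++ - '0');
  } else if (*ps->p == '(') {
    ps->p++;
    value = parse_expr(ps);
    skip_space(ps);
    if (*ps->p == ')')
      ps->p++;
    else
      ps->error = 1;           /* missing closing parenthesis */
  } else {
    ps->error = 1;             /* lookahead starts no alternative */
  }
  return value;
}

/* term ::= factor { '*' factor } */
static long parse_term(struct parser *ps)
{
  long value = parse_factor(ps);

  skip_space(ps);
  while (*ps->p == '*') {
    ps->p++;
    value *= parse_factor(ps); /* semantic action: multiply */
    skip_space(ps);
  }
  return value;
}

/* expr ::= term { ('+' | '-') term } */
static long parse_expr(struct parser *ps)
{
  long value = parse_term(ps);

  skip_space(ps);
  while (*ps->p == '+' || *ps->p == '-') {
    char op = *ps->p++;
    long rhs = parse_term(ps);

    value = (op == '+') ? value + rhs : value - rhs;
    skip_space(ps);
  }
  return value;
}

/* Evaluate a whole string; returns 1 on success and stores the result. */
int evaluate(const char *text, long *result)
{
  struct parser ps;

  ps.p = text;
  ps.error = 0;
  *result = parse_expr(&ps);
  skip_space(&ps);
  if (*ps.p != '\0')
    ps.error = 1;              /* trailing input after a complete expr */
  return !ps.error;
}
```

With this sketch, `evaluate("2 + 3 * (4 - 1)", &r)` returns 1 and leaves 11 in `r`, while malformed input such as `"2 + "` returns 0.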

RDP is written in strict ANSI C and produces strict ANSI C. RDP was developed using Borland C++ versions 3.1 and 5.1 on a PC and has also been built with no problems on Alpha/OSF-1, DECstation/Ultrix, HP Apollo/HPUX, Sun 4/SunOS, Solaris, Linux and NetBSD 0.9 hosts, all using GCC as well as a variety of vendors' own compilers. RDP also compiles for MS-DOS under Microsoft C V7.0, gcc (using the djgpp port) and several other compilers. I have reports of successful builds on Mac, Amiga and the Acorn Archimedes.

RDP is C++ 'clean', i.e. it uses no identifiers that clash with C++ reserved words. RDP-generated code has been used with g++ applications, and compiles with g++ and the Borland C++ (as opposed to ANSI C) compiler.


  • Tutorial manual for new users (PDF, 69 pages)
  • User manual (PDF, 103 pages)
  • Support library manual (PDF, 59 pages)
  • Language development case study (PDF, 133 pages)
  • Paper describing RDP from SIGPLAN Notices (PDF, 8 pages)
  • The RDP ftp archive


RDP itself, and the language processors it generates, use standard library modules to manage symbol tables, sets, graphs, memory allocation, text buffering, command line argument processing and scanning. The RDP scanner is programmed by loading tokens into a symbol table at the start of each run. In principle, the RDP scanner can be used to support runtime extensible languages, such as user defined operators in Algol-68, although nobody has had the nerve to try this yet.
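
The idea of a scanner that is programmed by loading tokens at the start of a run can be sketched as below. This is an illustration only: RDP's scanner keeps its tokens in the hash-coded symbol table, whereas this sketch uses a small fixed array and a linear longest-match search, and the names `struct token_table`, `load_token` and `scan_token` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 32

/* Hypothetical token table, filled at run time rather than compiled in. */
struct token_table {
  const char *text[MAX_TOKENS];  /* token spellings */
  int code[MAX_TOKENS];          /* caller-chosen token codes */
  int count;                     /* number of entries in use */
};

/* Register one token spelling; returns 0 if the table is full. */
int load_token(struct token_table *t, const char *text, int code)
{
  if (t->count >= MAX_TOKENS)
    return 0;
  t->text[t->count] = text;
  t->code[t->count] = code;
  t->count++;
  return 1;
}

/* Match the longest loaded token at *input.  On success, advance *input
 * past the token and return its code; otherwise return -1 and leave
 * *input unchanged. */
int scan_token(const struct token_table *t, const char **input)
{
  size_t best_len = 0;
  int best_code = -1;
  int i;

  for (i = 0; i < t->count; i++) {
    size_t len = strlen(t->text[i]);

    if (len > best_len && strncmp(*input, t->text[i], len) == 0) {
      best_len = len;
      best_code = t->code[i];
    }
  }
  *input += best_len;
  return best_code;
}
```

Because the table is ordinary data, a call to `load_token` made part-way through a run would make a new operator visible to every later `scan_token` call, which is the property that a runtime extensible language would rely on.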

RDP produces complete runnable programs with built-in help information and command line switches that are specified as part of the EBNF file. In this sense RDP output is far more shrink-wrapped than that of the usual parser generators, which helps beginning students.

The RDP text buffering routines automatically handle nested files, error message reporting and text data buffering to provide an efficient general purpose front end. This is also a great help to new users since writing efficient (and correct) text buffering and scanning routines from scratch is, in my experience, harder than it appears.

The RDP graph handling package provides a general framework for building graph data structures that may then be output in a form suitable for display with the VCG (Visualisation of Compiler Graphs) tool. RDP generated parsers can be set to automatically build derivation trees in a form suitable for human viewing.
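
As a rough sketch of what emitting a tree for VCG involves, the function below writes nodes and edges using the `graph:`, `node:` and `edge:` entries of the VCG input language. It is not the RDP graph library: the parent-array tree representation and the name `write_vcg` are assumptions made for this example, and labels are assumed to contain no double quotes.

```c
#include <stdio.h>

/* Write a tree in VCG's textual format.  Node i has label labels[i] and
 * parent parents[i], with -1 marking the root.  Returns the number of
 * edges written. */
int write_vcg(FILE *out, const char *const labels[], const int parents[],
              int count)
{
  int edges = 0;
  int i;

  fprintf(out, "graph: {\n");

  /* One node entry per tree node; the title is its unique identifier. */
  for (i = 0; i < count; i++)
    fprintf(out, "  node: { title: \"n%d\" label: \"%s\" }\n", i, labels[i]);

  /* One edge entry from each parent to its child. */
  for (i = 0; i < count; i++) {
    if (parents[i] >= 0) {
      fprintf(out, "  edge: { sourcename: \"n%d\" targetname: \"n%d\" }\n",
              parents[i], i);
      edges++;
    }
  }

  fprintf(out, "}\n");
  return edges;
}
```

Passing `stdout` as the first argument prints the description directly; redirecting it to a file gives something that could be handed to a VCG viewer.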

RDP generates itself (you mean you use a parser generator which _isn't_ written in itself?) which is a nice demonstration of the bootstrapping technique used for porting compilers to new architectures.

What you get

  • The machine generated source for the RDP translator (rdp.c). RDP checks that the source grammar is LL(1) and explains exactly why a non-LL(1) grammar is unacceptable. This version of RDP does not attempt to rework a grammar by itself.

  • The decorated EBNF file describing RDP that was processed by the RDP executable to produce its own source code (rdp.bnf). This is good for boggling undergraduates' minds with.

  • Decorated EBNF files for the languages minicalc, minicond, miniloop and minitree that are used as examples in the case study manual. The languages illustrate the development of a simple programming language by way of a syntax checker, two interpreters and finally two syntax-directed compilers that produce assembly language for a mythical machine called MVM (Mini Virtual Machine). One of the compilers is a single pass translator and the other uses a tree-based intermediate form.

  • The decorated EBNF file for mvmasm, an assembler for the language produced by the above compilers along with a simulator called mvmsim for the resulting executable files.

  • An EBNF file describing a not particularly standard Pascal, with some extensions for Turbo Pascal, from which RDP generates a fully working parser.

  • A set of functions to automate the handling of command line arguments (arg.c).

  • Routines to implement general graph data structures (graph.c).

  • A set of wrapper functions for the standard C memory allocation routines with built-in fatal error handling for out-of-memory errors (memalloc.c).

  • A programmable scanner with integrated error handling (scan.c and scanner.c).

  • A set-handling package that supports dynamically sizable sets (set.c).

  • A hash-coded symbol table with support for multiple symbol tables, nested scope rules and arbitrary user data (symbol.c).

  • A standard text buffering package with integrated messaging utilities that are used for all communication with the user (textio.c).

  • Sources and makefiles for everything, which you may use freely on condition that you send copies of any modifications, enhancements and ill-conceived changes you might make back to me so that I can improve RDP.

  • User, library support, tutorial and case study manuals.
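
Two of the library ideas listed above, an allocation wrapper that treats running out of memory as a fatal error and a set that grows on demand, fit together naturally and can be sketched as follows. This is not the interface of memalloc.c or set.c: the names `checked_calloc`, `struct int_set`, `set_init`, `set_add`, `set_contains` and `set_free` are invented here, and the bit-vector representation is an assumption.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate zeroed memory or terminate with a message; never returns NULL,
 * so callers need no error handling of their own. */
static void *checked_calloc(size_t count, size_t size)
{
  void *p = calloc(count, size);

  if (p == NULL) {
    fprintf(stderr, "Fatal error: out of memory\n");
    exit(EXIT_FAILURE);
  }
  return p;
}

/* A set of small non-negative integers held in a growable bit vector. */
struct int_set {
  unsigned char *bits;  /* NULL while the set has never been grown */
  size_t bytes;         /* allocated length of bits */
};

void set_init(struct int_set *s)
{
  s->bits = NULL;
  s->bytes = 0;
}

void set_free(struct int_set *s)
{
  free(s->bits);
  set_init(s);
}

/* Add an element, enlarging the bit vector first if it does not fit. */
void set_add(struct int_set *s, unsigned element)
{
  size_t needed = element / CHAR_BIT + 1;

  if (needed > s->bytes) {
    unsigned char *grown = (unsigned char *) checked_calloc(needed, 1);

    if (s->bytes > 0)
      memcpy(grown, s->bits, s->bytes);  /* keep existing members */
    free(s->bits);
    s->bits = grown;
    s->bytes = needed;
  }
  s->bits[element / CHAR_BIT] |= (unsigned char) (1u << (element % CHAR_BIT));
}

/* Returns 1 if element is in the set, 0 otherwise. */
int set_contains(const struct int_set *s, unsigned element)
{
  size_t index = element / CHAR_BIT;

  return index < s->bytes && ((s->bits[index] >> (element % CHAR_BIT)) & 1);
}
```

Because the set starts empty and only grows when `set_add` needs more room, a caller can add element 3 and then element 1000 without declaring a maximum size in advance.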

Versions and support

RDP has had six main releases including the original 1.0 release in February 1994. The current version is 1.5, released in May 1998. RDP has now been used pretty ferociously by lots of people in industry and academia and has stood up very well. We intend to stabilise RDP with this release, although we will, of course, continue to respond to bug reports. We have an internal beta release (version 1.6) that you can pick up from the gtb page if you are brave. Version 1.6 has a lot of extra features designed to support advanced users and is mainly intended as a bootstrap vehicle for GTB, our new Grammar ToolBox tool. RDP version 1.6 is rather profligate in its use of memory, and is not as well documented as version 1.5, so our advice is to use version 1.5 in the first instance and contact us before making a decision to move up to the later version.
