Anatomy of LLVM Backends


Published:
Tags:LLVM

One difficulty when working with LLVM backends is the sheer number of files, classes, and ultimately lines of code that need to be present to get an LLVM backend going. The aim of this post is to give a very brief overview of the necessary classes.

The backend referenced here can be found in my GitHub repository.

The structure on a high level

The backend's directory under lib/Target contains all the code that is necessary to go from a target-independent representation to a valid machine-specific instruction stream.

[Figure: simple subdirectory structure]

Tablegen files

The split is not uniform across backends, but there is a clear trend to divide the backend definitions into a handful of files:

| File | Content |
|---|---|
| LEG.td | Top-level TableGen file. Includes all other TableGen files and instantiates only a few high-level definitions |
| LEGDevices.td | Contains optional architectural features and processor definitions |
| LEGRegisterInfo.td | Contains all structural information regarding registers. This includes register names, subregister information, and register classes |
| LEGInstrFormat.td | Describes instruction formats and operands |
| LEGInstrInfo.td | Describes the individual instructions and defines instruction selection patterns |
| LEGCallingConv.td | Describes the calling convention(s) |
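
To make the role of LEGCallingConv.td a little more concrete, here is a minimal C++ sketch of the kind of decision a calling convention encodes. It is a toy model, not TableGen output and not LLVM's API: the register names r0 to r3, the 4-byte stack slots, and the names ArgLocation and assignArguments are all assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: where a single argument ends up.
struct ArgLocation {
  bool inRegister;      // true: passed in a register, false: on the stack
  std::string reg;      // register name when inRegister is true
  unsigned stackOffset; // byte offset when passed on the stack
};

// Hypothetical convention: the first four 32-bit arguments go into
// r0..r3, every further argument gets its own 4-byte stack slot.
std::vector<ArgLocation> assignArguments(std::size_t numArgs) {
  static const char *const ArgRegs[] = {"r0", "r1", "r2", "r3"};
  const std::size_t numArgRegs = sizeof(ArgRegs) / sizeof(ArgRegs[0]);

  std::vector<ArgLocation> locations;
  unsigned nextStackOffset = 0;
  for (std::size_t i = 0; i < numArgs; ++i) {
    if (i < numArgRegs) {
      locations.push_back({true, ArgRegs[i], 0});
    } else {
      locations.push_back({false, "", nextStackOffset});
      nextStackOffset += 4;
    }
  }
  assert(locations.size() == numArgs);
  return locations;
}
```

In a real backend this rule would be written declaratively in the calling convention file, and the lowering code for calls and returns would consume the resulting assignments.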

Classes in the base directory

The classes in the base directory contain everything necessary for going from the generic, target-independent directed acyclic graph (DAG) to a target-specific instruction stream.

| Class / file | What it does |
|---|---|
| LEGAsmPrinter | Assists with outputting human-readable assembly |
| LEGExpandPseudo | Replaces pseudo instructions (which simplify instruction selection) with valid machine instructions |
| LEGFrameLowering | Handles function frame lowering, e.g. saving and restoring callee-saved registers in the prologue and epilogue |
| LEGInstrInfo | Handles register copying, spilling, and loading. Also handles branch optimization |
| LEGDAGToDAGISel | Selects machine instructions for the target-independent DAG. Contains selection functions for complex patterns declared in LEGInstrFormat.td |
| LEGISelLowering | Sets up instruction selection, taking target-specific features and instructions such as divmul into account. Also contains the code to lower calls (setting up the arguments etc.), returns (putting return values into the right locations), and addresses |
| LEGMCInstLower | Translates architecture-specific operands into their MC versions |
| LEGRegisterInfo | Assists LLVM in selecting registers as well as lowering operands that access local variables |
| LEGSubtarget | Holds information about active CPU features |
| LEGTargetMachine | Registers the backend's specific optimization passes with LLVM |
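
As an illustration of what a pass like LEGExpandPseudo does conceptually, the following C++ sketch replaces a pseudo instruction with real ones. It is a toy under stated assumptions, not the backend's code and not LLVM's API: the LOAD_IMM32 pseudo, the MOVLO/MOVHI pair, and the ToyInst and expandPseudos names are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy instruction: an opcode name, a destination register and an immediate.
struct ToyInst {
  std::string opcode;
  std::string dest;
  std::uint32_t imm;
};

// Replace every hypothetical LOAD_IMM32 pseudo with a MOVLO/MOVHI pair that
// carries the low and high 16 bits; all other instructions pass through.
std::vector<ToyInst> expandPseudos(const std::vector<ToyInst> &input) {
  std::vector<ToyInst> output;
  for (const ToyInst &inst : input) {
    if (inst.opcode == "LOAD_IMM32") {
      output.push_back({"MOVLO", inst.dest, inst.imm & 0xFFFFu});
      output.push_back({"MOVHI", inst.dest, inst.imm >> 16});
    } else {
      output.push_back(inst);
    }
  }
  // In this toy, expansion can only keep or grow the instruction count.
  assert(output.size() >= input.size());
  return output;
}
```

The point of the pseudo is that instruction selection only has to match one simple pattern ("load a 32-bit constant"); the split into encodable instructions happens later, in one place.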

Classes in MCTargetDesc

These classes pick up where those in the base directory left off. Here we already start with instructions for our architecture, and the goal is to serialize them either as human-readable assembly or as machine-executable binary code.

| Class | What it does |
|---|---|
| LEGAsmBackend | Handles fixups. These are values that are not known at the time the instruction is selected and thus need to be fixed up later. Instruction-pointer-relative operands are among the values needing fixups |
| LEGELFObjectWriter | Translates fixups into relocations |
| LEGInstPrinter | Prints instructions and operands as human-readable assembly |
| LEGMCCodeEmitter | Encodes instructions and operands into binary machine code |
| LEGTargetDesc | Registers the machine-code-generating aspects of the backend |
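
The fixup idea is easier to see in code. The sketch below is a toy model in plain C++, not LLVM's MC API: the 4-byte branch layout, the 0xB0 opcode, the 16-bit little-endian offset field, and the names ToyFixup, encodeBranch, and applyFixup are all assumptions made up for illustration. The encoder writes a placeholder and records a fixup; once the target address is known, the fixup is applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy fixup: "the 16-bit field at `offset` must hold the PC-relative
// distance to `targetAddress`", which is unknown while encoding.
struct ToyFixup {
  std::size_t offset;        // byte position of the field to patch
  std::size_t instAddress;   // address of the branch instruction
  std::size_t targetAddress; // filled in later, once the layout is known
};

// Hypothetical 4-byte branch: opcode byte, padding byte, then a 16-bit
// offset field written as zero for now. Returns the matching fixup.
ToyFixup encodeBranch(std::vector<std::uint8_t> &code) {
  const std::size_t start = code.size();
  code.push_back(0xB0); // made-up branch opcode
  code.push_back(0x00); // padding
  code.push_back(0x00); // placeholder, low byte of the offset
  code.push_back(0x00); // placeholder, high byte of the offset
  return {start + 2, start, 0};
}

// Patch the placeholder with (target - instruction address), little-endian.
void applyFixup(std::vector<std::uint8_t> &code, const ToyFixup &fixup) {
  assert(fixup.offset + 1 < code.size());
  const std::int64_t delta = static_cast<std::int64_t>(fixup.targetAddress) -
                             static_cast<std::int64_t>(fixup.instAddress);
  assert(delta >= INT16_MIN && delta <= INT16_MAX);
  const std::uint16_t field = static_cast<std::uint16_t>(delta);
  code[fixup.offset] = static_cast<std::uint8_t>(field & 0xFF);
  code[fixup.offset + 1] = static_cast<std::uint8_t>(field >> 8);
}
```

When the target lives in another object file, the value cannot be computed at assembly time at all; that is the case in which a fixup has to be turned into a relocation for the linker to resolve.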