The Saffire Debugger

Debugging

A rudimentary debugging system is available in src/components/debugger/dbgp. It’s based on Derick Rethans’s DBGP protocol, a language-agnostic debugging protocol. However, most IDEs do not work well with custom languages. For instance, PhpStorm, a popular IDE, will not accept debugger connections from anything other than PHP.

The only IDE debugger currently available that deals with DBGP decently is the one in Komodo IDE.

For more information about DBGP, see the Xdebug site.

A debugging connection is made to an IDE, usually on an IPv4 host at port 9000. Over this connection, the engine sends information to the IDE as XML. The IDE sends information back as commands with argument options, like property_get -i 5 -n "$x['a b']" -d 0 -c 0 -p 0.

The reason for this asymmetry is that it makes it easier for a debugger to parse the incoming data from an IDE. The IDE itself is normally already large and capable of handling XML; a debugger isn’t always. Sending out XML, however, is a lot easier than parsing it.

With Saffire, debugging works by running the virtual machine in a different run mode: VM_RUNMODE_DEBUG. This mode is only set when we want to debug, which is done by passing the “--debug” flag to saffire exec. We will try to make this more flexible, for instance by allowing requests to set up debugging sessions, but for security reasons and to keep things simple for ourselves, it currently has to be triggered manually.

Once the VM_RUNMODE_DEBUG option is set, the VM initialization function vm_init() will call dbgp_init(). This creates a debug_info structure: an (unfortunately, for now) global structure with information about the current debugging context.

Then, it will set up a connection to the configured IDE (dbgp_sock_init()) and wait until the connection has been established (it retries every second).
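The retry behaviour could look roughly like the sketch below. It is not the actual dbgp_sock_init() code: the name ide_connect, its parameters and the max_attempts limit are assumptions made for illustration, and it only handles IPv4 literals.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not Saffire's dbgp_sock_init): connect to the IDE at
 * an IPv4 address, retrying once per second. Returns a connected socket, or
 * -1 when the address is invalid or max_attempts is exhausted. Pass 0 as
 * max_attempts to retry forever. */
int ide_connect(const char *ipv4, unsigned short port, int max_attempts) {
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        return -1;                      /* not a valid IPv4 literal */
    }

    for (int attempt = 1; ; attempt++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd >= 0) {
            if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
                return fd;              /* the IDE accepted the connection */
            }
            close(fd);
        }
        if (max_attempts > 0 && attempt >= max_attempts) {
            return -1;
        }
        sleep(1);                       /* wait a second before retrying */
    }
}
```

A call such as ide_connect("127.0.0.1", 9000, 0) would then block until an IDE is listening on the default port.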

As soon as a socket is opened, an init XML is sent out. This XML contains information about Saffire (and the debugger) that is of interest to the IDE. For instance, it tells which language the debugger is using (so: saffire), which version, which thread, the IDE key, the application ID, etc.
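As a rough illustration, such an init packet can be produced with a single snprintf() call. The function name format_init_xml is hypothetical, and the attribute names are the ones the DBGP specification lists for init; the exact set Saffire sends may differ.

```c
#include <stdio.h>

/* Hypothetical helper: format an init packet into buf. Returns the number of
 * bytes written, or -1 if buf is too small. The values are assumed to
 * contain no characters that need XML escaping. */
int format_init_xml(char *buf, size_t size, const char *appid,
                    const char *idekey, const char *thread,
                    const char *fileuri) {
    int n = snprintf(buf, size,
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        "<init xmlns=\"urn:debugger_protocol_v1\" appid=\"%s\" idekey=\"%s\" "
        "thread=\"%s\" language=\"saffire\" protocol_version=\"1.0\" "
        "fileuri=\"%s\"/>",
        appid, idekey, thread, fileuri);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Building the message this way needs no XML library at all, which is the point made earlier about sending XML being much easier than parsing it.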

At this point, the debugger goes into a “listening mode” by calling dbgp_parse_incoming_commands(). This waits for incoming commands from the IDE and executes them. It continues until a command places the debugger in RUNNING mode.

dbgp_read_commandline() reads the commands from the IDE (which works the same way as reading command line arguments), and dbgp_parse_incoming_command handles them.
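To show what argv-style reading of a command means in practice, here is a minimal tokenizer sketch. It is not the dbgp_read_commandline() implementation; the name split_command and its quoting rules (double quotes group words, a backslash inside quotes escapes the next character) are assumptions.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical tokenizer: split line in place into at most max_args
 * arguments, argv-style. Double quotes group words; a backslash inside
 * quotes escapes the next character. Returns the argument count. */
int split_command(char *line, char **argv, int max_args) {
    int argc = 0;
    char *p = line;

    while (*p != '\0' && argc < max_args) {
        while (isspace((unsigned char)*p)) p++;     /* skip separators */
        if (*p == '\0') break;

        char *out = p;                              /* write position */
        argv[argc++] = out;
        int quoted = 0;
        while (*p != '\0' && (quoted || !isspace((unsigned char)*p))) {
            if (*p == '"') {
                quoted = !quoted;                   /* drop the quote itself */
                p++;
            } else {
                if (quoted && *p == '\\' && p[1] != '\0') p++;
                *out++ = *p++;
            }
        }
        if (*p != '\0') p++;                        /* step past the separator */
        *out = '\0';                                /* terminate this argument */
    }
    return argc;
}
```

Applied to the property_get example from earlier, this yields eleven arguments, with $x['a b'] kept together as a single value for the -n option.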

Note that we do not do anything with the VM here. The debugger is just inside a loop accepting, parsing and executing commands.

Each command comes with some information, like a transaction ID. When we return information to the IDE, we must use the same transaction ID (an IDE can send out multiple commands without synchronously waiting for a response).
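Echoing the transaction ID could be sketched as follows, assuming the command has already been split into argv-style arguments and that the ID is passed with the -i option, as in the property_get example. Both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the value following option flag (for example
 * "-i") in an argv-style command, or NULL if the flag is absent. */
const char *find_option(int argc, char **argv, const char *flag) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], flag) == 0) return argv[i + 1];
    }
    return NULL;
}

/* Format a response element that echoes the command name and the
 * transaction ID sent by the IDE. Returns bytes written, or -1 on error. */
int format_response(char *buf, size_t size, int argc, char **argv) {
    const char *tid = find_option(argc, argv, "-i");
    if (argc < 1 || tid == NULL) return -1;
    int n = snprintf(buf, size,
        "<response xmlns=\"urn:debugger_protocol_v1\" command=\"%s\" "
        "transaction_id=\"%s\"/>", argv[0], tid);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Because the ID is copied verbatim from the request, the IDE can match each response to the command that caused it, even when several commands are in flight.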

The commands that can be executed are listed in the dbgp_commands structure in src/component/debugger/dbgp/commands.c. Not all of them are implemented yet, though.
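A command table of this kind is often just an array of name and handler pairs. The sketch below is illustrative only: the struct layout, the handler signature and the use of a NULL handler for unimplemented commands are assumptions, not the real dbgp_commands definition.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler signature: receives the tokenized command. */
typedef int (*command_handler)(int argc, char **argv);

struct command_entry {
    const char *name;
    command_handler handler;    /* NULL marks a command not implemented yet */
};

/* Placeholder handler used only to make the sketch complete. */
static int do_run(int argc, char **argv) { (void)argc; (void)argv; return 0; }

/* Illustrative table; "eval" is shown as unimplemented purely as an example. */
static const struct command_entry commands[] = {
    { "run",  do_run },
    { "eval", NULL   },
};

/* Look up a command by name. Returns NULL for unknown commands. */
const struct command_entry *find_command(const char *name) {
    for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
        if (strcmp(commands[i].name, name) == 0) return &commands[i];
    }
    return NULL;
}
```

The command loop can then look up the first argument of each incoming command and report an error to the IDE when the lookup fails or the handler is missing.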

feature_get

This will return the features that the debugger has. For instance, the debugger does not support multiple sessions, so this feature is reported as 0 to the IDE.

context_names

This will return the different contexts in which variables reside. Currently Saffire supports ‘local’, ‘global’ and ‘built-ins’.

stack_get

This will return the complete stack of the currently running program. This is why the stack frame is saved in the debug info structure.

context_get

Returns a variable from a given context. We are talking about Saffire objects, but we can easily convert them to something else. For instance, we can just return the value from a numerical object, or the string from a string object. We do this for some of the objects (boolean, string, numerical, null), and others could follow. For any other object, we dump a list of the properties and their values for the class the IDE asks about.
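The conversion of simple objects can be pictured with the sketch below. The object representation here is invented for the example and is far simpler than real Saffire objects; only the split between simple types and everything else follows the description above.

```c
#include <stdio.h>

/* Hypothetical, heavily simplified object representation. */
enum object_type { OBJ_NULL, OBJ_BOOLEAN, OBJ_NUMERICAL, OBJ_STRING, OBJ_OTHER };

struct object {
    enum object_type type;
    long number;            /* used by OBJ_BOOLEAN and OBJ_NUMERICAL */
    const char *string;     /* used by OBJ_STRING */
};

/* Write a display value for obj into buf. Returns 1 for the simple types
 * handled here, 0 when the caller should dump the object's properties. */
int format_simple_value(const struct object *obj, char *buf, size_t size) {
    switch (obj->type) {
    case OBJ_NULL:
        snprintf(buf, size, "null");
        return 1;
    case OBJ_BOOLEAN:
        snprintf(buf, size, "%s", obj->number ? "true" : "false");
        return 1;
    case OBJ_NUMERICAL:
        snprintf(buf, size, "%ld", obj->number);
        return 1;
    case OBJ_STRING:
        snprintf(buf, size, "%s", obj->string);
        return 1;
    default:
        return 0;           /* not a simple type */
    }
}
```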

run

Strangely enough, this does nothing by itself; it just places the debugger into “RUNNING” mode. This means that some commands (such as setting features) can no longer be called, while others become available.

breakpoint_set

This sets a breakpoint somewhere in code. There are many different breakpoint types: on a certain line, when a certain call is made, when a certain condition is met, when an exception occurs etc.

The function takes the breakpoint and adds it to the list of breakpoints in the debug info structure, but does not do anything with it itself.

step_into

This command sets the step-into flag of the debug info structure to 1 and stores the current line number. This way, the debugger can figure out whether the virtual machine has moved to another line.

The step_over and step_out commands work the same way.

source

It’s possible for an IDE to debug code that it cannot access by itself, for instance code running on a remote server to which the IDE has no file access. In that case, it can ask the debugger to send over the source files so they can be displayed while debugging. Normally, this is done through direct file access and path mappings.

stdout / stderr

These commands tell the debugger how to deal with stdout and stderr. For instance, an IDE can capture stdout (and/or stderr) and display it directly within the IDE. There are three modes: normal mode (output on the server), split mode (output on the server and also sent to the IDE), and IDE mode (output only in the IDE, not on the server).

This is done through a fairly simple system: both stdout and stderr have a channel mode in the debug info structure.

During the debugging init phase (dbgp_init), we actually change the default output helpers to the _debug_output_char_helper and the _debug_output_string_helper, so all output will pass through these helpers.

These helpers check the current channel modes and the file number to which the string or character must be printed. If the file number is 1 (STDOUT) and the channel mode for stdout is “normal”, the output is passed to the original output helpers so it is printed on the server.

If the mode is IDE (or split), the output is sent to the IDE as XML and, in the case of split mode, also passed to the original output helper.
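The routing decision made by the helpers can be summarised in a small sketch. The enum and struct names are hypothetical; the real helpers write the output themselves instead of returning a set of targets.

```c
/* Hypothetical channel modes, mirroring the three modes described above. */
enum channel_mode { CHANNEL_NORMAL, CHANNEL_SPLIT, CHANNEL_IDE };

/* Where a piece of output must go; the values can be combined. */
enum { TARGET_SERVER = 1, TARGET_IDE = 2 };

/* Hypothetical subset of the debug info structure. */
struct channels {
    enum channel_mode out;  /* mode for file number 1 (stdout) */
    enum channel_mode err;  /* mode for file number 2 (stderr) */
};

/* Decide where output written to file number fd must be delivered. */
unsigned output_targets(const struct channels *ch, int fd) {
    enum channel_mode mode = (fd == 1) ? ch->out : ch->err;
    switch (mode) {
    case CHANNEL_SPLIT: return TARGET_SERVER | TARGET_IDE;
    case CHANNEL_IDE:   return TARGET_IDE;
    default:            return TARGET_SERVER;   /* normal mode */
    }
}
```

A helper built on this would call the original output helper when TARGET_SERVER is set and send an XML packet to the IDE when TARGET_IDE is set.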

Running

Once the debugger is in RUNNING mode (mostly through the “run” command), normal activities are resumed and the loaded code runs through the VM. On each iteration in the virtual machine, there is a check whether the VM is running in VM_RUNMODE_DEBUG and whether the debugger is still attached. It is possible that the debugger is no longer attached while the VM still runs in debug mode (although nothing can control it anymore).

If in VM_RUNMODE_DEBUG, then dbgp_debug() is called with the global debug_info structure, plus the current stack frame.

Inside dbgp_debug(), it checks if step-into has been set. If so, it checks if we are on the given line. If so, we clear the step-into flag, set the debug state to BREAK, set the reason to OK and return this info to the IDE as XML. At that point, we go to dbgp_parse_incoming_commands() and wait for further commands that will fetch context variables, set or get breakpoints, or run, step in, step out and so on.
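The step-into check could be sketched like this. It assumes that a step completes once the VM reaches a line other than the one stored by the step_into command; the struct and function names are hypothetical and cover only the fields needed here.

```c
enum debug_state { STATE_RUNNING, STATE_BREAK };

/* Hypothetical subset of the debug info structure. */
struct step_info {
    int step_into;          /* set to 1 by the step_into command */
    int step_line;          /* line that was current when it was set */
    enum debug_state state;
};

/* Called once per VM iteration with the current line. Returns 1 when the
 * debugger should stop and wait for IDE commands again. */
int check_step_into(struct step_info *info, int current_line) {
    if (!info->step_into || current_line == info->step_line) {
        return 0;           /* not stepping, or still on the same line */
    }
    info->step_into = 0;    /* the step is complete */
    info->state = STATE_BREAK;
    return 1;
}
```

When the function returns 1, the caller would send the break status to the IDE and re-enter the command loop.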

We basically follow the same setup when dealing with step-out and step-over.

Breakpoints are also handled here. Since there are multiple types of breakpoints, we need multiple ways to handle them. Currently, Saffire only supports line breakpoints, as handling them looks similar to the step-into code. It’s easy enough to add other types here as well.
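A line breakpoint check can be as simple as the sketch below. The record layout and function name are assumptions for illustration; the real breakpoint list in the debug info structure may be organised differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical line breakpoint record; field names are illustrative. */
struct line_breakpoint {
    const char *fileuri;
    int line;
    int enabled;
};

/* Return the index of the first enabled breakpoint matching the current
 * file and line, or -1 when execution should simply continue. */
int find_line_breakpoint(const struct line_breakpoint *bps, size_t count,
                         const char *fileuri, int line) {
    for (size_t i = 0; i < count; i++) {
        if (bps[i].enabled && bps[i].line == line &&
            strcmp(bps[i].fileuri, fileuri) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

Other breakpoint types, such as call or exception breakpoints, would need their own match functions, since they depend on more than the current file and line.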