Testing Terminology

Regressions :’(

A few weeks ago, Egmont Koblinger, a developer on VTE (a virtual terminal library), opened a ticket on Terminology about incorrect parsing of escape codes. The following day, I pushed a commit to master to fix that issue.

I introduced a regression: some other escape codes were no longer parsed correctly. Within a few hours of my change, the ticket was reopened and I got notified twice on IRC.

I felt bad about it. I had only tested the specific change: the bug described in the ticket was fixed and my shell was drawing correctly, nothing fancier than that.

I quickly reverted my changes in the repository, but I needed to act.

Testing tools

There is already some testing in place. On each change to the master branch, Coverity and Codacy run static analysis. For Terminology itself, I wrote tyfuzz, a tool to run with American Fuzzy Lop. It needs to run for a long period of time (weeks), and it would not have helped in this case.

In the Enlightenment/EFL world, we have exactness, a pixel-perfect regression tester, but that’s a bit too complex for my needs. Then there is vttest, a VT100/VT220/XTerm test utility. The main issue is that it is interactive: you press a key, a message tells you what you should see, and you need to figure out whether what is displayed is right or not. This would be too complex to automate.

Introducing tytest

I needed to write something new. I wanted something simple with as few dependencies as possible.

I already had tyfuzz, which takes escape codes as input, parses and interprets them, and just exits. At first, I wanted to dump some kind of representation of what is drawn, but that would be quite some work. I ended up taking a checksum of the different parts of the terminal and printing it. I am using an MD5 checksum as it was already in use in Terminology. I write the tests as POSIX shell scripts.

Here is a simple test that sets a cursor shape:

    #!/bin/sh
    printf '\033[0 q'

And how to take a checksum:

    tests/cursor-shape-0.sh | build/src/bin/tytest
    c91e01b0e859cc043f21d804a01bcd50

I wrote a shell script on top of that simple concept, and I am now able to check any new change for regressions against the tests I have already written.
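To give an idea of what such a wrapper can look like, here is a minimal sketch. It is not the actual script from the Terminology repository: the function names and the format of the checksum list (one test script path followed by its expected checksum per line) are assumptions made for illustration. The TYTEST variable defaults to the tytest path used above.

```shell
#!/bin/sh
# Sketch of a regression runner. The list format and function names are
# hypothetical; only the tytest path comes from the example above.
TYTEST="${TYTEST:-build/src/bin/tytest}"

# check_one SCRIPT EXPECTED
# Succeeds when the checksum printed by $TYTEST matches EXPECTED.
check_one() {
    # The test script reads from /dev/null so it cannot consume the list
    # that run_all is iterating over.
    actual=$(sh "$1" < /dev/null | "$TYTEST")
    [ "$actual" = "$2" ]
}

# run_all LIST
# LIST holds one "script checksum" pair per line.
# Prints a line per failing test and succeeds only if none failed.
run_all() {
    failures=0
    while read -r script expected; do
        if ! check_one "$script" "$expected"; then
            echo "FAIL: $script"
            failures=$((failures + 1))
        fi
    done < "$1"
    echo "$failures failure(s)"
    [ "$failures" -eq 0 ]
}
```

Calling run_all on a list of known-good checksums before pushing a change would then flag any test whose output checksum has moved. A changed checksum only says that something is drawn differently, so the failing test still has to be looked at by hand.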

Continuous Integration

I use CircleCI to run the tests on each change to the master branch of Terminology. Based on an Alpine image, it compiles Terminology with both clang and gcc, and runs the tests. Then it uploads the coverage of the tests to codecov.io.

How do I know what to test?

So far, I have been reading the VT510 documentation and writing tests based on each command. I write the shell script, run it in XTerm and in Terminology, then compare visually. Then I fix Terminology ☺

I also rely on the ECMA-48 documentation. Something I discovered lately is the Terminal Guide. It is of great help since it lists bugs in other implementations (urxvt, xterm, vte, konsole and linux-vc).

I also check the coverage reports from codecov.io.

Then I try to do some abstract art!

Testing cursor movements:

[Image: tytest on cursor movements]

Testing IL - Insert Lines:

[Image: tytest on IL]
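As an illustration of what a test for IL can look like, here is a small sketch in the same style as the cursor-shape test. It is a hypothetical example, not one of the tests from the Terminology repository, and the il_test function name is only there to keep the sequence in one place. It uses standard ECMA-48 sequences: ED to clear the screen, CUP to move the cursor, and IL to insert lines.

```shell
#!/bin/sh
# Hypothetical IL test, not taken from the Terminology test suite.
il_test() {
    printf '\033[2J\033[H'      # ED 2 then CUP: clear the screen, cursor to home
    for i in 1 2 3 4 5; do
        printf 'line %s\r\n' "$i"
    done
    printf '\033[2;1H'          # CUP: move to row 2, column 1
    printf '\033[2L'            # IL 2: insert two blank lines at the cursor row
    printf 'inserted'           # written on row 2; "line 2" has moved to row 4
}

il_test
```

Run in a terminal, this should leave "line 1" on the first row, "inserted" on the second, an empty third row, and "line 2" to "line 5" shifted down by two rows. Piped into tytest, the same output yields a checksum that can be recorded once the display has been checked by eye.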

At the moment, there are 65 tests, which have found lots of small bugs. I will keep on adding tests and making Terminology even better!