Determining affected tests

Automatically determining affected tests sounds too good to be true. Python developers are rightfully suspicious of any tool that tries to be too clever about their source code. Code completion and symbol search don’t need to be 100% reliable, but messing with test suite execution? This page explains what testmon tries to achieve and what it does not.

No heuristics are involved. testmon works with these fairly solid assumptions:

  • the coverage.py library can reliably determine the executed and unexecuted lines of code of a tested program
  • even in a dynamic language, an unexecuted line of code doesn’t influence the outcome of executing the surrounding code
  • a method body that is not reached/executed by a specific test cannot influence the outcome of that test
  • the lines that are executed can have so many side effects that we don’t try to determine their real dependencies; we re-execute dependent tests on even the smallest change

For example, given test_s.py:

    1    def add(a, b):
    2        return a + b
    3    
    4    def subtract(a, b):
    5        return a - b
    6
    7    def test_add():
    8        assert add(1, 2) == 3

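If you run coverage run -m pytest test_s.py::test_add and annotate the result (for example with coverage annotate), you’ll get the following, where > marks an executed line and ! an unexecuted one: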

    1>   def add(a, b):
    2>       return a + b
    3     
    4>   def subtract(a, b):
    5!       return a - b
    6    
    7>   def test_add():
    8>       assert add(1, 2) == 3

Now you can change the unexecuted line (marked !) return a - b to nuclear_bomb.explode() and it still won’t affect the result of running test_s.py::test_add.
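The claim can be checked with a minimal sketch that uses the standard sys.settrace hook as a stand-in for coverage.py. This is not how testmon or coverage.py are implemented; the executed_lines helper and the CHANGED variable are hypothetical names introduced only for this illustration.

```python
import sys

SOURCE = '''\
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def test_add():
    assert add(1, 2) == 3
'''

# Same file with the unexecuted line replaced by a call to an undefined name.
CHANGED = SOURCE.replace("return a - b", "nuclear_bomb.explode()")


def executed_lines(source, test_name, filename="test_s.py"):
    """Import `source` and run one test, returning the line numbers executed."""
    code = compile(source, filename, "exec")
    namespace = {}
    lines = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the file under test.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            lines.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, namespace)      # module level: runs the `def` statements
        namespace[test_name]()     # the test itself
    finally:
        sys.settrace(previous)     # restore whatever tracer was active before
    return lines


before = executed_lines(SOURCE, "test_add")
after = executed_lines(CHANGED, "test_add")
print(5 in before)      # False: the body of subtract() never ran
print(before == after)  # True: the edit did not change what the test executed
```

The test still passes against CHANGED because the name nuclear_bomb is only looked up if line 5 actually runs.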

Implementation details

How does testmon process the source code and determine the dependencies? It splits the code into blocks. Blocks can have holes, which are denoted by a placeholder (the “transformed_into_block” token). Each block also has a start and an end (line numbers, 1-based, closed interval).

The above code is transformed into 4 blocks:

Block1: 1-8 (start-end)

    def add(a, b):
        transformed_into_block
    def subtract(a, b):
        transformed_into_block
    def test_add():
        transformed_into_block

Block2: 2-2

    return a + b

Block3: 5-5

    return a - b

Block4: 8-8

    assert add(1, 2) == 3
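The splitting shown above can be approximated with the standard ast module. The following is only a sketch under simplifying assumptions, not testmon's actual parser: it handles top-level functions only, assumes each body starts on the line after its def, and keeps blank lines in the module block. The split_into_blocks function is a hypothetical name.

```python
import ast
import textwrap

PLACEHOLDER = "transformed_into_block"

SOURCE = '''\
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def test_add():
    assert add(1, 2) == 3
'''


def split_into_blocks(source):
    """Return (start, end, text) tuples; line numbers are 1-based and inclusive."""
    lines = source.splitlines()
    body_blocks = []
    holes = {}  # first line of a function body -> last line of that body
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            start, end = node.body[0].lineno, node.body[-1].end_lineno
            text = textwrap.dedent("\n".join(lines[start - 1:end]))
            body_blocks.append((start, end, text))
            holes[start] = end

    # Build the module block, replacing every function body with the placeholder.
    module_lines = []
    lineno = 1
    while lineno <= len(lines):
        line = lines[lineno - 1]
        if lineno in holes:
            indent = line[:len(line) - len(line.lstrip())]
            module_lines.append(indent + PLACEHOLDER)
            lineno = holes[lineno] + 1  # skip the rest of the body
        else:
            module_lines.append(line)
            lineno += 1
    module_block = (1, len(lines), "\n".join(module_lines))
    return [module_block] + body_blocks


blocks = split_into_blocks(SOURCE)
print([(start, end) for start, end, _ in blocks])
# [(1, 8), (2, 2), (5, 5), (8, 8)]
```

The end_lineno attribute used here requires Python 3.8 or newer.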

After running the test with coverage analysis and parsing the source code, testmon determines which blocks test_s.py::test_add depends on. In our example these are Blocks 1, 2 and 4 (and not Block 3). testmon doesn’t store the whole code of a block, just a checksum of it. Block 3 can be changed to anything. As long as Blocks 1, 2 and 4 stay the same, the execution path of test_s.py::test_add and its outcome will stay the same.
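A rough sketch of that bookkeeping follows. It assumes each executed line is attributed to the innermost block containing it, and that a test counts as affected once one of its stored checksums no longer appears in the file. The Block type, the choice of SHA-1, and the dependencies and is_affected functions are all hypothetical and are not taken from testmon's code.

```python
import hashlib
from typing import NamedTuple


class Block(NamedTuple):
    start: int  # first line, 1-based
    end: int    # last line, inclusive
    text: str


def checksum(text):
    # The hash function is an arbitrary choice for this sketch.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def dependencies(blocks, executed):
    """Checksums of the innermost block around each executed line."""
    deps = set()
    for lineno in executed:
        enclosing = [b for b in blocks if b.start <= lineno <= b.end]
        innermost = min(enclosing, key=lambda b: b.end - b.start)
        deps.add(checksum(innermost.text))
    return deps


def is_affected(stored, current_blocks):
    """A test is affected when a checksum it depends on has disappeared."""
    current = {checksum(b.text) for b in current_blocks}
    return not stored <= current


MODULE = (
    "def add(a, b):\n    transformed_into_block\n"
    "def subtract(a, b):\n    transformed_into_block\n"
    "def test_add():\n    transformed_into_block"
)
blocks = [
    Block(1, 8, MODULE),
    Block(2, 2, "return a + b"),
    Block(5, 5, "return a - b"),
    Block(8, 8, "assert add(1, 2) == 3"),
]
# Lines reported as executed for test_s.py::test_add in the example above.
stored = dependencies(blocks, executed={1, 2, 4, 7, 8})

edit_block3 = blocks[:2] + [Block(5, 5, "nuclear_bomb.explode()")] + blocks[3:]
edit_block2 = blocks[:1] + [Block(2, 2, "return a * b")] + blocks[2:]
print(len(stored))                       # 3
print(is_affected(stored, edit_block3))  # False
print(is_affected(stored, edit_block2))  # True
```

Only checksums are kept in stored, which mirrors the point that the block source itself does not need to be saved.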

The limits and reliability of this method are pretty much the same as those of coverage.py (see “Things that cause trouble” in the coverage.py documentation).