All about co_lnotab, the line number table.

Code objects store a field named co_lnotab.  This is an array of unsigned bytes
disguised as a Python bytes object.  It is used to map bytecode offsets to
source code line #s for tracebacks and to identify line number boundaries for
line tracing.

The array is conceptually a compressed list of
    (bytecode offset increment, line number increment)
pairs.  The details are important and delicate, best illustrated by example:

    byte code offset    source code line number
        0                   1
        6                   2
       50                   7
      350                 207
      361                 208

Instead of storing these numbers literally, we compress the list by storing only
the difference from one row to the next.  Conceptually, the stored list might
look like:

    0, 1,  6, 1,  44, 5,  300, 200,  11, 1
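
As a rough illustration of the differencing step only (not the real encoder),
the conceptual list can be derived from the table as follows.  The name
naive_deltas is made up for this sketch, and it assumes both running values
start at 0, so that the first pair describes the first row:

```python
# Rows of (bytecode offset, source line number) from the example table.
rows = [(0, 1), (6, 2), (50, 7), (350, 207), (361, 208)]

def naive_deltas(rows):
    """Return the flat list of (offset delta, line delta) values.

    Hypothetical helper: no byte-range limits are applied here, which is
    exactly why the result "doesn't really work" as a byte array.
    """
    out = []
    prev_addr = prev_line = 0
    for addr, line in rows:
        out.extend((addr - prev_addr, line - prev_line))
        prev_addr, prev_line = addr, line
    return out

deltas = naive_deltas(rows)
print(deltas)   # [0, 1, 6, 1, 44, 5, 300, 200, 11, 1]
```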

The above doesn't really work, but it's a start.  An unsigned byte (byte code
offset) can't hold negative values or values larger than 255; a signed byte
(line number) can't hold values larger than 127 or less than -128; and the
above example contains two such values (300 and 200).  (Note that before 3.6,
the line number increment was also encoded as an unsigned byte.)  So we make
two tweaks:

 (a) there's a deep assumption that byte code offsets increase monotonically,
 and
 (b) if byte code offset jumps by more than 255 from one row to the next, or if
 source code line number jumps by more than 127 or less than -128 from one row
 to the next, more than one pair is written to the table.

In case (b), there's no way to know from looking at the table later how many
pairs were written.  That's the delicate part.  A user of co_lnotab desiring to
find the source line number corresponding to a bytecode address A should do
something like this:

    def addr2line(co_lnotab, A):
        lineno = addr = 0
        # co_lnotab is a flat bytes object: even indices hold address
        # increments, odd indices hold line increments.
        for addr_incr, line_incr in zip(co_lnotab[0::2], co_lnotab[1::2]):
            addr += addr_incr
            if addr > A:
                return lineno
            if line_incr >= 0x80:
                line_incr -= 0x100    # reinterpret as a signed byte
            lineno += line_incr
        return lineno

(In C, this is implemented by PyCode_Addr2Line().)  In order for this to work,
when the addr field increments by more than 255, the line # increment in each
pair generated must be 0 until the remaining addr increment is < 256.  So, in
the example above, assemble_lnotab in compile.c should not (as was actually done
until 2.2) expand 300, 200 to
    255, 255, 45, 45,
but to
    255, 0, 45, 127, 0, 73.
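
The splitting rule can be sketched in Python as below.  This is only an
illustration of the rule described above, not the code of assemble_lnotab; the
names encode_pair and encode_rows are made up for the example, and the table
is assumed to start from offset 0 and line 0:

```python
def encode_pair(addr_incr, line_incr):
    """Split one (address, line) increment into byte-sized lnotab pairs."""
    if addr_incr < 0:
        raise ValueError("bytecode offsets must not decrease")
    out = []
    # Spend the address increment first, always with a line increment of 0.
    while addr_incr > 255:
        out += [255, 0]
        addr_incr -= 255
    # Then spend the line increment in signed-byte-sized steps.
    while line_incr > 127 or line_incr < -128:
        step = 127 if line_incr > 0 else -128
        out += [addr_incr, step & 0xFF]   # & 0xFF stores the signed value
        addr_incr = 0
        line_incr -= step
    out += [addr_incr, line_incr & 0xFF]
    return bytes(out)

def encode_rows(rows):
    """Encode absolute (offset, line) rows, starting from (0, 0)."""
    table = b""
    prev_addr = prev_line = 0
    for addr, line in rows:
        table += encode_pair(addr - prev_addr, line - prev_line)
        prev_addr, prev_line = addr, line
    return table

table = encode_rows([(0, 1), (6, 2), (50, 7), (350, 207), (361, 208)])
print(list(encode_pair(300, 200)))   # [255, 0, 45, 127, 0, 73]
print(list(table))
```

Because the line increment stays 0 until the address increment has been used
up, a lookup for any offset between 50 and 349 still lands on line 7.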

The above is sufficient to reconstruct line numbers for tracebacks, but not for
line tracing.  Tracing is handled by PyCode_CheckLineNumber() in codeobject.c
and maybe_call_line_trace() in ceval.c.

*** Tracing ***

To a first approximation, we want to call the tracing function when the line
number of the current instruction changes.  Re-computing the current line for
every instruction is a little slow, though, so each time we compute the line
number we save the bytecode indices where it's valid:

     *instr_lb <= frame->f_lasti < *instr_ub

is true so long as execution does not change lines.  That is, *instr_lb holds
the first bytecode index of the current line, and *instr_ub holds the first
bytecode index of the next line.  As long as the above expression is true,
maybe_call_line_trace() does not need to call PyCode_CheckLineNumber().  Note
that the same line may appear multiple times in the lnotab, either because the
bytecode jumped more than 255 indices between line number changes or because
the compiler inserted the same line twice.  Even in that case, *instr_ub holds
the first index of the next line.
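
A Python sketch of how such bounds might be computed from the table follows.
It is loosely modeled on the description above rather than copied from the C
code; line_bounds and signed are names invented for the example, the starting
line defaults to 0 to match the earlier example table, and sys.maxsize stands
in for "no upper bound":

```python
import sys

def signed(byte):
    """Reinterpret an unsigned byte value (0..255) as a signed byte."""
    return byte - 0x100 if byte >= 0x80 else byte

def line_bounds(co_lnotab, lasti, first_lineno=0):
    """Return (line, instr_lb, instr_ub) for the bytecode index lasti."""
    pairs = [(co_lnotab[i], signed(co_lnotab[i + 1]))
             for i in range(0, len(co_lnotab) - 1, 2)]
    addr, line, lower = 0, first_lineno, 0
    i = 0
    # Consume every entry that begins at or before lasti.
    while i < len(pairs) and addr + pairs[i][0] <= lasti:
        addr += pairs[i][0]
        if pairs[i][1]:
            lower = addr      # only a real line change starts a new range
        line += pairs[i][1]
        i += 1
    if i == len(pairs):
        return line, lower, sys.maxsize   # last line: no upper bound
    # Skip entries with a zero line increment to find the next line's start.
    while i < len(pairs):
        addr += pairs[i][0]
        i += 1
        if pairs[i - 1][1]:
            break
    return line, lower, addr

# The example table, with 300, 200 expanded to 255, 0, 45, 127, 0, 73.
table = bytes([0, 1, 6, 1, 44, 5, 255, 0, 45, 127, 0, 73, 11, 1])
print(line_bounds(table, 320))   # (7, 50, 350)
```

Offset 320 lies past the padding entry (255, 0), yet the bounds still span the
whole of line 7, from 50 up to the first index of the next line, 350.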

However, we don't *always* want to call the line trace function when the above
test fails.

Consider this code:

1: def f(a):
2:    while a:
3:       print(1)
4:       break
5:    else:
6:       print(2)

which compiles to this:

  2           0 SETUP_LOOP              26 (to 28)
        >>    2 LOAD_FAST                0 (a)
              4 POP_JUMP_IF_FALSE       18

  3           6 LOAD_GLOBAL              0 (print)
              8 LOAD_CONST               1 (1)
             10 CALL_FUNCTION            1
             12 POP_TOP

  4          14 BREAK_LOOP
             16 JUMP_ABSOLUTE            2
        >>   18 POP_BLOCK

  6          20 LOAD_GLOBAL              0 (print)
             22 LOAD_CONST               2 (2)
             24 CALL_FUNCTION            1
             26 POP_TOP
        >>   28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

If 'a' is false, execution will jump to the POP_BLOCK instruction at offset 18
and the co_lnotab will claim that execution has moved to line 4, which is wrong.
In this case, we could instead associate the POP_BLOCK with line 5, but that
would break jumps around loops without else clauses.

We fix this by only calling the line trace function for a forward jump if the
co_lnotab indicates we have jumped to the *start* of a line, i.e. if the current
instruction offset matches the offset given for the start of a line by the
co_lnotab.  For backward jumps, however, we always call the line trace function,
which lets a debugger stop on every evaluation of a loop guard (which usually
won't be the first opcode in a line).
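
The rule can be tried out against the disassembly above with a small Python
sketch.  It is a simplification, not the ceval.c code: LINE_STARTS is typed in
by hand from the listing, the helper names are invented, and the line bounds
are recomputed for every instruction instead of being cached:

```python
# Offsets at which each source line starts, read off the disassembly above.
LINE_STARTS = {0: 2, 6: 3, 14: 4, 20: 6}

def line_start_for(lasti):
    """Return the start offset (instr_lb) of the line containing lasti."""
    return max(start for start in LINE_STARTS if start <= lasti)

def should_trace_line(lasti, prev_lasti):
    """Fire a line event at the start of a line or after a backward jump."""
    return lasti == line_start_for(lasti) or lasti < prev_lasti

def traced_lines(path):
    """Return the source lines reported for a sequence of executed offsets."""
    events = []
    prev = -1
    for lasti in path:
        if should_trace_line(lasti, prev):
            events.append(LINE_STARTS[line_start_for(lasti)])
        prev = lasti
    return events

# 'a' is false: the jump from offset 4 lands on 18, mid-way through line 4.
print(traced_lines([0, 2, 4, 18, 20, 22, 24, 26, 28, 30]))   # [2, 6]
# 'a' is true: BREAK_LOOP at 14 jumps to 28, mid-way through line 6.
print(traced_lines([0, 2, 4, 6, 8, 10, 12, 14, 28, 30]))     # [2, 3, 4]
```

In the first path no event is reported for line 4, because offset 18 is not
the start of a line and the jump that reached it was a forward one.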

Why do we set f_lineno when tracing, and only just before calling the trace
function?  Well, consider the code above when 'a' is true.  If stepping through
this with 'n' in pdb, you would stop at line 1 with a "call" type event, then
line events on lines 2, 3, and 4, then a "return" type event -- but because the
code for the return actually falls in the range of the "line 6" opcodes, you
would be shown line 6 during this event.  This is a change from the behaviour in
2.2 and before, and I've found it confusing in practice.  By setting and using
f_lineno when tracing, one can report a line number different from that
suggested by f_lasti on this one occasion where it's desirable.