# Syscall descriptions syntax

Pseudo-formal grammar of a syscall description:

```
syscallname "(" [arg ["," arg]*] ")" [type]
arg = argname type
argname = identifier
type = typename [ "[" type-options "]" ]
typename = "const" | "intN" | "intptr" | "flags" | "array" | "ptr" |
	   "buffer" | "string" | "strconst" | "filename" | "len" |
	   "bytesize" | "bytesizeN" | "bitsize" | "vma" | "proc"
type-options = [type-opt ["," type-opt]*]
```

Common type-options include:

```
"opt" - the argument is optional (like mmap fd argument, or accept peer argument)
```

The rest of the type-options are type-specific:

```
"const": integer constant, type-options:
	value, underlying type (one of "intN", "intptr")
"intN"/"intptr": an integer without a particular meaning, type-options:
	optional range of values (e.g. "5:10", or "100:200")
"flags": a set of flags, type-options:
	reference to flags description (see below)
"array": a variable/fixed-length array, type-options:
	type of elements, optional size (fixed "5", or ranged "5:10", boundaries inclusive)
"ptr"/"ptr64": a pointer to an object, type-options:
	type of the object; direction (in/out/inout)
	ptr64 has size of 8 bytes regardless of target pointer size
"buffer": a pointer to a memory buffer (like read/write buffer argument), type-options:
	direction (in/out/inout)
"string": a zero-terminated memory buffer (no pointer indirection implied), type-options:
	either a string value in quotes for constant strings (e.g. "foo"),
	or a reference to string flags (special value `filename` produces file names),
	optionally followed by a buffer size (string values will be padded with \x00 to that size)
"stringnoz": a non-zero-terminated memory buffer (no pointer indirection implied), type-options:
	either a string value in quotes for constant strings (e.g. "foo"),
	or a reference to string flags
"fmt": a string representation of an integer (not zero-terminated), type-options:
	format (one of "dec", "hex", "oct") and the value (a resource, int, flags, const or proc)
	the resulting data is always fixed-size (formatted as "%020llu", "0x%016llx" or "%023llo", respectively)
"fileoff": offset within a file
"len": length of another field (for array it is number of elements), type-options:
	argname of the object
"bytesize": similar to "len", but always denotes the size in bytes, type-options:
	argname of the object
"bitsize": similar to "len", but always denotes the size in bits, type-options:
	argname of the object
"vma": a pointer to a set of pages (used as input for mmap/munmap/mremap/madvise), type-options:
	optional number of pages (e.g. vma[7]), or a range of pages (e.g. vma[2-4])
"proc": per process int (see description below), type-options:
	value range start, how many values per process, underlying type
"text": machine code of the specified type, type-options:
	text type (x86_real, x86_16, x86_32, x86_64, arm64)
"void": type with static size 0
	mostly useful inside of templates and varlen unions, can't be syscall argument
```
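
The fixed-size property of `fmt` can be checked with a few lines of Python. This is only a sketch: `encode_fmt` and `FORMATS` are hypothetical names, not part of syzkaller, and the code simply applies the three printf-style formats listed above to a value treated as an unsigned 64-bit integer.

```python
# Hypothetical helper (not syzkaller code): render a value the way the
# "fmt" type-options above describe.
FORMATS = {
    "dec": "%020d",    # Python spelling of C "%020llu"
    "hex": "0x%016x",  # Python spelling of C "0x%016llx"
    "oct": "%023o",    # Python spelling of C "%023llo"
}


def encode_fmt(kind: str, value: int) -> bytes:
    """Return the fixed-size, non-zero-terminated text form of value."""
    value &= 0xFFFFFFFFFFFFFFFF  # assumption: values are unsigned 64-bit
    return (FORMATS[kind] % value).encode("ascii")


# The width does not depend on the value, even for the largest 64-bit one.
sizes = {kind: len(encode_fmt(kind, 2**64 - 1)) for kind in FORMATS}
```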

`flags` and `len` also take a trailing underlying type type-option when used in structs/unions/pointers.

Flags are described as:

```
flagname = const ["," const]*
```

or for string flags as:

```
flagname = "\"" literal "\"" ["," "\"" literal "\""]*
```

## Ints

`int8`, `int16`, `int32` and `int64` denote an integer of the corresponding size.
`intptr` denotes a pointer-sized integer, i.e. C `long` type.

Appending the `be` suffix (e.g. `int16be`) makes an integer big-endian.

It's possible to specify a range of values for an integer in the format of `int32[0:100]`.

To denote a bitfield of size N use `int64:N`.

It's possible to use these various kinds of ints as base types for `const`, `flags`, `len` and `proc`.

```
example_struct {
	f0	int8			# random 1-byte integer
	f1	const[0x42, int16be]	# const 2-byte integer with value 0x4200 (big-endian 0x42)
	f2	int32[0:100]		# random 4-byte integer with values from 0 to 100 inclusive
	f3	int64:20		# random 20-bit bitfield
}
```
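
The comments in `example_struct` can be reproduced with plain Python arithmetic. The sketch below is illustrative only; the variable names are made up for this example and nothing here is syzkaller code.

```python
import struct

# const[0x42, int16be]: the two bytes that end up in memory.
f1 = struct.pack(">H", 0x42)

# Read back as a little-endian integer, the same bytes give 0x4200,
# which is the value mentioned in the comment on f1 above.
f1_as_le = int.from_bytes(f1, "little")

# int32[0:100]: both boundaries are inclusive.
f2_values = range(0, 100 + 1)

# int64:20: a 20-bit bitfield holds values 0 .. 2**20 - 1.
f3_max = (1 << 20) - 1
```

The `f1` case shows why the comment mentions both `0x42` and `0x4200`: the big-endian byte sequence `00 42` reads as `0x4200` on a little-endian machine.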

## Structs

Structs are described as:

```
structname "{" "\n"
	(fieldname type "\n")+
"}" ("[" attribute* "]")?
```

Structs can have attributes specified in square brackets after the struct.
Attributes are:

```
"packed": the struct does not have padding and has default alignment 1
"align_N": the struct has alignment N
"size": the struct is padded up to the specified size
```
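
To see how the three attributes affect the size of a struct, here is a rough Python sketch. `struct_size` is a hypothetical helper, and it assumes C-like layout rules (without `packed`, each field is aligned to its own size and the struct to its largest field); the actual layout computed by the tooling may differ.

```python
def struct_size(field_sizes, packed=False, align=None, size=None):
    """Size in bytes of a struct under assumed C-like layout rules.

    Assumption: without "packed", each field is aligned to its own size
    and the struct to its largest field.
    """
    offset, max_align = 0, 1
    for fsize in field_sizes:
        falign = 1 if packed else fsize
        max_align = max(max_align, falign)
        offset += -offset % falign  # padding before the field
        offset += fsize
    struct_align = align or max_align  # "align_N" sets the alignment
    offset += -offset % struct_align   # tail padding
    if size is not None:               # "size": pad up to the given size
        offset = max(offset, size)
    return offset
```

For a struct with an `int8` followed by an `int32`, this gives 8 bytes by default, 5 bytes with `packed`, and 16 bytes when padded to a size of 16.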

## Unions

Unions are described as:

```
unionname "[" "\n"
	(fieldname type "\n")+
"]"
```

Unions can have a trailing "varlen" attribute (specified in square brackets after the union),
which means that the union length is not the maximum of all option lengths,
but rather the length of the particular chosen option.
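
The difference between a regular and a `varlen` union comes down to one `max`. A minimal Python sketch, with a hypothetical `union_size` helper and made-up option sizes:

```python
def union_size(option_sizes, chosen, varlen=False):
    """Size in bytes of a union value when option `chosen` is selected."""
    if varlen:
        return option_sizes[chosen]    # only the chosen option counts
    return max(option_sizes.values())  # as large as the largest option


# Hypothetical union with a 4-byte option "a" and a 16-byte option "b".
options = {"a": 4, "b": 16}
```

With these sizes, choosing `a` occupies 16 bytes in a regular union but only 4 bytes in a `varlen` one.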

## Resources

Resources represent values that need to be passed from the output of one syscall to the input of another syscall. For example, the `close` syscall requires an input value (fd) previously returned by the `open` or `pipe` syscall. To achieve this, `fd` is declared as a resource. Resources are described as:

```
"resource" identifier "[" underlying_type "]" [ ":" const ("," const)* ]
```

`underlying_type` is either one of `int8`, `int16`, `int32`, `int64`, `intptr` or another resource (which models inheritance; for example, a socket is a subtype of fd). The optional set of constants represents resource special values, for example, `0xffffffffffffffff` (-1) for "no fd", or `AT_FDCWD` for "the current dir". Special values are used once in a while as resource values. If no special values are specified, a special value of `0` is used. Resources can then be used as types, for example:

```
resource fd[int32]: 0xffffffffffffffff, AT_FDCWD, 1000000
resource sock[fd]
resource sock_unix[sock]

socket(...) sock
accept(fd sock, ...) sock
listen(fd sock, backlog int32)
```
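
Resource inheritance can be pictured as a chain of parent links. The Python sketch below encodes the three resources from the example and checks subtype relations; `PARENT` and `is_subtype` are hypothetical names, and the assumption is that a more specific resource is acceptable wherever one of its ancestors is expected.

```python
# Parent links taken from the example above; None marks a base resource.
PARENT = {"fd": None, "sock": "fd", "sock_unix": "sock"}


def is_subtype(resource, wanted):
    """True if `resource` is `wanted` or inherits from it."""
    while resource is not None:
        if resource == wanted:
            return True
        resource = PARENT[resource]
    return False
```

Under this model a `sock_unix` value qualifies as an `fd`, while a plain `fd` does not qualify as a `sock`.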

## Type Aliases

Complex types that are often repeated can be given short type aliases using the
following syntax:

```
type identifier underlying_type
```

For example:

```
type signalno int32[0:65]
type net_port proc[20000, 4, int16be]
```

A type alias can then be used instead of the underlying type in any context.
The underlying type needs to be described as if it were a struct field, that is,
with the base type if one is required. However, a type alias can be used as a syscall
argument as well. Underlying types are currently restricted to integer types,
`ptr`, `ptr64`, `const`, `flags` and `proc` types.

There are some builtin type aliases:
```
type bool8	int8[0:1]
type bool16	int16[0:1]
type bool32	int32[0:1]
type bool64	int64[0:1]
type boolptr	intptr[0:1]

type filename string[filename]
```

## Type Templates

Type templates can be declared as follows:
```
type buffer[DIR] ptr[DIR, array[int8]]
type fileoff[BASE] BASE
type nlattr[TYPE, PAYLOAD] {
	nla_len		len[parent, int16]
	nla_type	const[TYPE, int16]
	payload		PAYLOAD
} [align_4]
```

and later used as follows:
```
syscall(a buffer[in], b fileoff[int64], c ptr[in, nlattr[FOO, int32]])
```
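
One way to think about template instantiation is as substitution of the arguments into the template body. The Python sketch below does this textually for two of the templates above; `TEMPLATES` and `expand` are hypothetical, and this is a mental model rather than a description of how the description compiler actually works.

```python
import re

# Two of the templates declared above: name -> (parameters, body).
TEMPLATES = {
    "buffer": (["DIR"], "ptr[DIR, array[int8]]"),
    "fileoff": (["BASE"], "BASE"),
}


def expand(name, args):
    """Substitute template arguments into the template body (textually)."""
    params, body = TEMPLATES[name]
    if len(params) != len(args):
        raise ValueError(f"{name} expects {len(params)} argument(s)")
    mapping = dict(zip(params, args))
    # Replace whole identifiers only, so "DIR" never matches inside a longer name.
    return re.sub(
        r"\b[A-Za-z_]\w*\b",
        lambda m: mapping.get(m.group(0), m.group(0)),
        body,
    )
```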

There is a builtin type template `optional` defined as:
```
type optional[T] [
	val	T
	void	void
] [varlen]
```

## Length

You can specify the length of a particular field in a struct or of a named argument by using the `len`, `bytesize` and `bitsize` types, for example:

```
write(fd fd, buf buffer[in], count len[buf]) len[buf]

sock_fprog {
	len	len[filter, int16]
	filter	ptr[in, array[sock_filter]]
}
```

If `len`'s argument is a pointer (or a `buffer`), then the length of the pointee argument is used.

To denote the length of a field in N-byte words use `bytesizeN`; possible values for N are 1, 2, 4 and 8.
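
The relationship between the length types is plain arithmetic. The Python sketch below computes them for an array of fixed-size elements; `length_values` is a hypothetical helper, and it assumes the byte size is a multiple of N for each `bytesizeN` (how a remainder would be handled is not stated here).

```python
def length_values(num_elems, elem_size):
    """Values of the length types for an array of fixed-size elements."""
    nbytes = num_elems * elem_size
    return {
        "len": num_elems,          # number of elements for arrays
        "bytesize": nbytes,        # size in bytes
        "bitsize": nbytes * 8,     # size in bits
        "bytesize2": nbytes // 2,  # size in 2-byte words
        "bytesize4": nbytes // 4,  # size in 4-byte words
        "bytesize8": nbytes // 8,  # size in 8-byte words
    }


# A hypothetical array[int32] with 6 elements (24 bytes).
values = length_values(6, 4)
```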

To denote the length of the parent struct, you can use `len[parent, int8]`.
To denote the length of a higher-level parent when structs are embedded into one another, you can specify the type name of the particular parent:

```
s1 {
    f0      len[s2]  # length of s2
}

s2 {
    f0      s1
    f1      array[int32]
}
```

## Proc

The `proc` type can be used to denote per-process integers.
The idea is to have a separate range of values for each executor, so they don't interfere.

The simplest example is a port number.
The `proc[20000, 4, int16be]` type means that we want to generate an `int16be`
integer starting from `20000` and assign `4` values for each process.
As a result, executor number `n` gets values in the `[20000 + n * 4, 20000 + (n + 1) * 4)` range.
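
The range formula is easy to check in Python. `proc_values` below is a hypothetical helper that evaluates it for a given executor number.

```python
def proc_values(start, per_proc, executor):
    """Values available to one executor for proc[start, per_proc, ...]."""
    first = start + executor * per_proc
    return range(first, first + per_proc)  # half-open, as in the formula


# proc[20000, 4, int16be] for executors 0 and 2.
ports_0 = list(proc_values(20000, 4, 0))
ports_2 = list(proc_values(20000, 4, 2))
```

Because each executor gets its own half-open range, the ranges of different executors never overlap.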

## Integer Constants

Integer constants can be specified as decimal literals, as `0x`-prefixed
hex literals, as `'`-surrounded char literals, or as symbolic constants
extracted from kernel headers or defined by `define` directives. For example:

```
foo(a const[10], b const[-10])
foo(a const[0xabcd])
foo(a int8['a':'z'])
foo(a const[PATH_MAX])
foo(a ptr[in, array[int8, MY_PATH_MAX]])
define MY_PATH_MAX	PATH_MAX + 2
```
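
The literal forms can be illustrated with a small evaluator. This Python sketch is not the real parser: `parse_const` is a hypothetical helper, symbolic constants are looked up in a caller-supplied table, and `int(text, 0)` is slightly more permissive than the forms listed above because it also accepts Python's `0o` and `0b` prefixes.

```python
def parse_const(text, symbols=None):
    """Evaluate one integer constant written in one of the forms above."""
    symbols = symbols or {}
    if len(text) == 3 and text[0] == text[2] == "'":
        return ord(text[1])   # char literal such as 'a'
    if text in symbols:
        return symbols[text]  # symbolic constant, resolved by the caller
    return int(text, 0)       # decimal or 0x-prefixed hex literal
```

For example, `parse_const("'a'")` yields 97 and `parse_const("0xabcd")` yields 43981.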

## Misc

Description files also contain `include` directives that refer to Linux kernel header files,
`incdir` directives that refer to custom Linux kernel header directories
and `define` directives that define symbolic constant values.