1////
2	vim.syntax: asciidoc
3
4	Copyright (c) 2011 Thomas Graf <tgraf@suug.ch>
5////
6
7Routing Family Netlink Library (libnl-route)
8============================================
9Thomas Graf <tgraf@suug.ch>
103.1, Aug 11 2011:
11
12== Introduction
13
14This library provides APIs to the kernel interfaces of the routing family.
15
16
17NOTE: Work in progress.
18
19== Addresses
20
21[[route_link]]
22== Links (Network Devices)
23
24The link configuration interface is part of the +NETLINK_ROUTE+ protocol
25family and implements the following netlink message types:
26
27- View and modify the configuration of physical and virtual network devices.
28- Create and delete virtual network devices (e.g. dummy devices, VLAN devices,
29  tun devices, bridging devices, ...)
30- View and modify per link network configuration settings (e.g.
31  +net.ipv6.conf.eth0.accept_ra+, +net.ipv4.conf.eth1.forwarding+, ...)
32
33.Naming Convention (network device, link, interface)
34
35In networking several terms are commonly used to refer to network devices.
36While they have distinct meanings they have been used interchangeably in
37the past. Within the Linux kernel, the term _network device_ or _netdev_ is
38commonly used In user space the term _network interface_ is very common.
39The routing netlink protocol uses the term _link_ and so does the _iproute2_
40utility and most routing daemons.
41
42=== Netlink Protocol
43
44This section describes the protocol semantics of the netlink based link
45configuration interface. The following messages are defined:
46
47[options="header", cols="1,2,2"]
48|==============================================================================
49| Message Type   | User -> Kernel                    | Kernel -> User
50| +RTM_NEWLINK+  | Create or update virtual network device
51| Reply to +RTM_GETLINK+ request or notification of link added or updated
52| +RTM_DELLINK+  | Delete virtual network device
53| Notification of link deleted or disappeared
54| +RTM_GETLINK+  | Retrieve link configuration and statistics |
55| +RTM_SETLINK+  | Modify link configuration |
56|==============================================================================
57
58See link:core.html#core_msg_types[Netlink Library - Message Types] for more
59information on common semantics of these message types.
60
61==== Link Message Format
62
63All netlink link messages share a common header (+struct ifinfomsg+) which
64is appended after the netlink header (+struct nlmsghdr+).
65
66image:ifinfomsg.png["Link Message Header"]
67
68The meaning of each field may differ depending on the message type. A
69+struct ifinfomsg+ is defined in +<linux/rtnetlink.h>+ to represent the
70header.
71
72Address Family (8bit)::
73The address family is usually set to +AF_UNSPEC+ but may be specified in
74+RTM_GETLINK+ requests to limit the returned links to a specific address
75family.
76
77Link Layer Type (16bit)::
78Currently only used in kernel->user messages to report the link layer type
79of a link. The value corresponds to the +ARPHRD_*+ defines found in
80+<linux/if_arp.h>+. Translation from/to strings can be done using the
81functions nl_llproto2str()/nl_str2llproto().
82
83Link Index (32bit)::
84Carries the interface index and is used to identify existing links.
85
86Flags (32bit)::
87In kernel->user messages the value of this field represents the current
88state of the link flags. In user->kernel messages this field is used to
89change flags or set the initial flag state of new links. Note that in order
90to change a flag, the flag must also be set in the _Flags Change Mask_ field.
91
92Flags Change Mask (32bit)::
93The primary use of this field is to specify a mask of flags that should be
94changed based on the value of the _Flags_ field. A special meaning is given
95to this field when present in link notifications, see TODO.
96
97Attributes (variable)::
98All link message types may carry netlink attributes. They are defined in the
99header file <linux/if_link.h> and share the prefix +IFLA_+.
100
101==== Link Message Types
102
103.RTM_GETLINK (user->kernel)
104
105Lookup link by 1. interface index or 2. link name (+IFLA_IFNAME+) and return
106a single +RTM_NEWLINK+ message containing the link configuration and statistics
107or a netlink error message if no such link was found.
108
109*Parameters:*
110
111* *Address family*
112** If the address family is set to +PF_BRIDGE+, only bridging devices will be
113   returned.
114** If the address family is set to +PF_INET6+, only ipv6 enabled devices will
115   be returned.
116
117*Flags:*
118
119* +NLM_F_DUMP+ If set, all links will be returned in form of a multipart
120  message.
121
122*Returns:*
123
124* +EINVAL+ if neither interface nor link name are set
125* +ENODEV+ if no link was found
126* +ENOBUFS+ if allocation failed
127
128.RTM_NEWLINK (user->kernel)
129
130Creates a new or updates an existing link. Only virtual links may be created
131but all links may be updated.
132
133*Flags:*
134
135- +NLM_F_CREATE+ Create link if it does not exist
136- +NLM_F_EXCL+ Return +EEXIST+ if link already exists
137
138*Returns:*
139
140- +EINVAL+ malformed message or invalid configuration parameters
141- +EAFNOSUPPORT+ if a address family specific configuration (+IFLA_AF_SPEC+)
142  is not supported.
143- +EOPNOTSUPP+ if the link does not support modification of parameters
144- +EEXIST+ if +NLM_F_EXCL+ was set and the link exists alraedy
145- +ENODEV+ if the link does not exist and +NLM_F_CREATE+ is not set
146
147.RTM_NEWLINK (kernel->user)
148
149This message type is used in reply to a +RTM_GETLINK+ request and carries
150the configuration and statistics of a link. If multiple links need to
151be sent, the messages will be sent in form of a multipart message.
152
153The message type is also used for notifications sent by the kernel to the
154multicast group +RTNLGRP_LINK+ to inform about various link events. It is
155therefore recommended to always use a separate link socket for link
156notifications in order to separate between the two message types.
157
158TODO: document how to detect different notifications
159
160.RTM_DELLINK (user->kernel)
161
162Lookup link by 1. interface index or 2. link name (+IFLA_IFNAME+) and delete
163the virtual link.
164
165*Returns:*
166
167* +EINVAL+ if neither interface nor link name are set
168* +ENODEV+ if no link was found
169* +ENOTSUPP+ if the operation is not supported (not a virtual link)
170
171.RTM_DELLINK (kernel->user)
172
173Notification sent by the kernel to the multicast group +RTNLGRP_LINK+ when
174
175a. a network device was unregistered (change == ~0)
176b. a bridging device was deleted (address family will be +PF_BRIDGE+)
177
178=== Get / List
179
180[[link_list]]
181==== Get list of links
182
183To retrieve the list of links in the kernel, allocate a new link cache
184using +rtnl_link_alloc_cache()+ to hold the links. It will automatically
185construct and send a +RTM_GETLINK+ message requesting a dump of all links
186from the kernel and feed the returned +RTM_NEWLINK+ to the internal link
187message parser which adds the returned links to the cache.
188
189[source,c]
190-----
191#include <netlink/route/link.h>
192
193int rtnl_link_alloc_cache(struct nl_sock *sk, int family, struct nl_cache **result)
194-----
195
196The cache will contain link objects (+struct rtnl_link+, see <<link_object>>)
197and can be accessed using the standard cache functions. By setting the
198+family+ parameter to an address familly other than +AF_UNSPEC+, the resulting
199cache will only contain links supporting the specified address family.
200
201The following direct search functions are provided to search by interface
202index and by link name:
203
204[source,c]
205-----
206#include <netlink/route/link.h>
207
208struct rtnl_link *rtnl_link_get(struct nl_cache *cache, int ifindex);
209struct rtnl_link *rtnl_link_get_by_name(struct nl_cache *cache, const char *name);
210-----
211
212.Example: Link Cache
213
214[source,c]
215-----
216struct nl_cache *cache;
217struct rtnl_link *link;
218
219if (rtnl_link_alloc_cache(sock, AF_UNSPEC, &cache)) < 0)
220	/* error */
221
222if (!(link = rtnl_link_get_by_name(cache, "eth1")))
223	/* link does not exist */
224
225/* do something with link */
226
227rtnl_link_put(link);
228nl_cache_put(cache);
229-----
230
231[[link_direct_lookup]]
232==== Lookup Single Link (Direct Lookup)
233
234If only a single link is of interest, the link can be looked up directly
235without the use of a link cache using the function +rtnl_link_get_kernel()+.
236
237[source,c]
238-----
239#include <netlink/route/link.h>
240
241int rtnl_link_get_kernel(struct nl_sock *sk, int ifindex, const char *name, struct rtnl_link **result);
242-----
243
244It will construct and send a +RTM_GETLINK+ request using the parameters
245provided and wait for a +RTM_NEWLINK+ or netlink error message sent in
246return. If the link exists, the link is returned as link object
247(see <<link_object>>).
248
249.Example: Direct link lookup
250[source,c]
251-----
252struct rtnl_link *link;
253
254if (rtnl_link_get_kernel(sock, 0, "eth1", &link) < 0)
255	/* error */
256
257/* do something with link */
258
259rtnl_link_put(link);
260-----
261
262NOTE: While using this function can save a substantial amount of bandwidth
263      on the netlink socket, the result will not be cached, subsequent calls
264      to rtnl_link_get_kernel() will always trigger sending a +RTM_GETLINK+
265      request.
266
267[[link_translate_ifindex]]
268==== Translating interface index to link name
269
270Applications which require to translate interface index to a link name or
271vice verase may use the following functions to do so. Both functions require
272a filled link cache to work with.
273
274[source,c]
275-----
276char *rtnl_link_i2name (struct nl_cache *cache, int ifindex, char *dst, size_t len);
277int rtnl_link_name2i (struct nl_cache *cache, const char *name);
278-----
279
280=== Add / Modify
281
282Several types of virtual link can be added on the fly using the function
283+rtnl_link_add()+.
284
285[source,c]
286-----
287#include <netlink/route/link.h>
288
289int rtnl_link_add(struct nl_sock *sk, struct rtnl_link *link, int flags);
290-----
291
292=== Delete
293
294The deletion of virtual links such as VLAN devices or dummy devices is done
295using the function +rtnl_link_delete()+. The link passed on to the function
296can be a link from a link cache or it can be construct with the minimal
297attributes needed to identify the link.
298
299[source,c]
300-----
301#include <netlink/route/link.h>
302
303int rtnl_link_delete(struct nl_sock *sk, const struct rtnl_link *link);
304-----
305
306The function will construct and send a +RTM_DELLINK+ request message and
307returns any errors returned by the kernel.
308
309.Example: Delete link by name
310[source,c]
311-----
312struct rtnl_link *link;
313
314if (!(link = rtnl_link_alloc()))
315	/* error */
316
317rtnl_link_set_name(link, "my_vlan");
318
319if (rtnl_link_delete(sock, link) < 0)
320	/* error */
321
322rtnl_link_put(link);
323-----
324
325[[link_object]]
326=== Link Object
327
328A link is represented by the structure +struct rtnl_link+. Instances may be
329created with the function +rtnl_link_alloc()+ or via a link cache (see
330<<link_list>>) and are freed again using the function +rtnl_link_put()+.
331
332[source,c]
333-----
334#include <netlink/route/link.h>
335
336struct rtnl_link *rtnl_link_alloc(void);
337void rtnl_link_put(struct rtnl_link *link);
338-----
339
340[[link_attr_name]]
341==== Name
342The name serves as unique, human readable description of the link. By
343default, links are named based on their type and then enumerated, e.g.
344eth0, eth1, ethn but they may be renamed at any time.
345
346Kernels >= 2.6.11 support identification by link name.
347
348[source,c]
349-----
350#include <netlink/route/link.h>
351
352void rtnl_link_set_name(struct rtnl_link *link, const char *name);
353char *rtnl_link_get_name(struct rtnl_link *link);
354-----
355
356*Accepted link name format:* +[^ /]*+ (maximum length: 15 characters)
357
358[[link_attr_ifindex]]
359==== Interface Index (Identifier)
360The interface index is an integer uniquely identifying a link. If present
361in any link message, it will be used to identify an existing link.
362
363[source,c]
364-----
365#include <netlink/route/link.h>
366
367void rtnl_link_set_ifindex(struct rtnl_link *link, int ifindex);
368int rtnl_link_get_ifindex(struct rtnl_link *link);
369-----
370
371[[link_attr_group]]
372==== Group
373Each link can be assigned a numeric group identifier to group a bunch of links
374together and apply a set of changes to a group instead of just a single link.
375
376
377[source,c]
378-----
379#include <netlink/route/link.h>
380
381void rtnl_link_set_group(struct rtnl_link *link, uint32_t group);
382uint32_t rtnl_link_get_group(struct rtnl_link *link);
383-----
384
385[[link_attr_address]]
386==== Link Layer Address
387The link layer address (e.g. MAC address).
388
389[source,c]
390-----
391#include <netlink/route/link.h>
392
393void rtnl_link_set_addr(struct rtnl_link *link, struct nl_addr *addr);
394struct nl_addr *rtnl_link_get_addr(struct rtnl_link *link);
395-----
396
397[[link_attr_broadcast]]
398==== Broadcast Address
399The link layer broadcast address
400
401[source,c]
402-----
403#include <netlink/route/link.h>
404
405void rtnl_link_set_broadcast(struct rtnl_link *link, struct nl_addr *addr);
406struct nl_addr *rtnl_link_get_broadcast(struct rtnl_link *link);
407-----
408
409[[link_attr_mtu]]
410==== MTU (Maximum Transmission Unit)
411The maximum transmission unit specifies the maximum packet size a network
412device can transmit or receive. This value may be lower than the capability
413of the physical network device.
414
415[source,c]
416-----
417#include <netlink/route/link.h>
418
419void rtnl_link_set_mtu(struct rtnl_link *link, unsigned int mtu);
420unsigned int rtnl_link_get_mtu(struct rtnl_link *link);
421-----
422
423[[link_attr_flags]]
424==== Flags
425The flags of a link enable or disable various link features or inform about
426the state of the link.
427
428[source,c]
429-----
430#include <netlink/route/link.h>
431
432void rtnl_link_set_flags(struct rtnl_link *link, unsigned int flags);
433void rtnl_link_unset_flags(struct rtnl_link *link, unsigned int flags);
434unsigned int rtnl_link_get_flags(struct rtnl_link *link);
435-----
436
437[options="compact"]
438[horizontal]
439IFF_UP::           Link is up (administratively)
440IFF_RUNNING::      Link is up and carrier is OK (RFC2863 OPER_UP)
441IFF_LOWER_UP::     Link layer is operational
442IFF_DORMANT::      Driver signals dormant
443IFF_BROADCAST::    Link supports broadcasting
444IFF_MULTICAST::    Link supports multicasting
445IFF_ALLMULTI::     Link supports multicast routing
446IFF_DEBUG::        Tell driver to do debugging (currently unused)
447IFF_LOOPBACK::     Link loopback network
448IFF_POINTOPOINT::  Point-to-point link
449IFF_NOARP::        ARP is not supported
450IFF_PROMISC::      Status of promiscious mode
451IFF_MASTER::       Master of a load balancer (bonding)
452IFF_SLAVE::        Slave to a master link
453IFF_PORTSEL::      Driver supports setting media type (only used by ARM ethernet)
454IFF_AUTOMEDIA::    Link selects port automatically (only used by ARM ethernet)
455IFF_ECHO::         Echo sent packets (testing feature, CAN only)
456IFF_DYNAMIC::      Unused (BSD compatibility)
457IFF_NOTRAILERS::   Unused (BSD compatibility)
458
459To translate a link flag to a link flag name or vice versa:
460
461[source,c]
462-----
463#include <netlink/route/link.h>
464
465char *rtnl_link_flags2str(int flags, char *buf, size_t size);
466int rtnl_link_str2flags(const char *flag_name);
467-----
468
469[[link_attr_txqlen]]
470==== Transmission Queue Length
471
472The transmission queue holds packets before packets are delivered to
473the driver for transmission. It is usually specified in number of
474packets but the unit may be specific to the link type.
475
476[source,c]
477-----
478#include <netlink/route/link.h>
479
480void rtnl_link_set_txqlen(struct rtnl_link *link, unsigned int txqlen);
481unsigned int rtnl_link_get_txqlen(struct rtnl_link *link);
482-----
483
484[[link_attr_operstate]]
485==== Operational Status
486The operational status has been introduced to provide extended information
487on the link status. Traditionally the link state has been described using
488the link flags +IFF_UP, IFF_RUNNING, IFF_LOWER_UP+, and +IFF_DORMANT+ which
489was no longer sufficient for some link types.
490
491[source,c]
492-----
493#include <netlink/route/link.h>
494
495void rtnl_link_set_operstate(struct rtnl_link *link, uint8_t state);
496uint8_t rtnl_link_get_operstate(struct rtnl_link *link);
497-----
498
499[options="compact"]
500[horizontal]
501IF_OPER_UNKNOWN::          Unknown state
502IF_OPER_NOTPRESENT::       Link not present
503IF_OPER_DOWN::             Link down
504IF_OPER_LOWERLAYERDOWN::   L1 down
505IF_OPER_TESTING::          Testing
506IF_OPER_DORMANT::          Dormant
507IF_OPER_UP::               Link up
508
509Translation of operational status code to string and vice versa:
510
511[source,c]
512-----
513#include <netlink/route/link.h>
514
515char *rtnl_link_operstate2str(uint8_t state, char *buf, size_t size);
516int rtnl_link_str2operstate(const char *name);
517-----
518
519[[link_attr_mode]]
520==== Mode
521Currently known link modes are:
522
523[options="compact"]
524[horizontal]
525IF_LINK_MODE_DEFAULT::   Default link mode
526IF_LINK_MODE_DORMANT::   Limit upward transition to dormant
527
528[source,c]
529-----
530#include <netlink/route/link.h>
531
532void rtnl_link_set_linkmode(struct rtnl_link *link, uint8_t mode);
533uint8_t rtnl_link_get_linkmode(struct rtnl_link *link);
534-----
535
536Translation of link mode to string and vice versa:
537
538[source,c]
539-----
540char *rtnl_link_mode2str(uint8_t mode, char *buf, size_t len);
541uint8_t rtnl_link_str2mode(const char *name);
542-----
543
544[[link_attr_alias]]
545==== IfAlias
546Alternative name for the link, primarly used for SNMP IfAlias.
547
548[source,c]
549-----
550#include <netlink/route/link.h>
551
552const char *rtnl_link_get_ifalias(struct rtnl_link *link);
553void rtnl_link_set_ifalias(struct rtnl_link *link, const char *alias);
554-----
555
556*Length limit:* 256
557
558[[link_attr_arptype]]
559==== Hardware Type
560
561[source,c]
562-----
563#include <netlink/route/link.h>
564#include <linux/if_arp.h>
565
566void rtnl_link_set_arptype(struct rtnl_link *link, unsigned int arptype);
567unsigned int rtnl_link_get_arptype(struct rtnl_link *link);
568----
569
570Translation of hardware type to character string and vice versa:
571
572[source,c]
573-----
574#include <netlink/utils.h>
575
576char *nl_llproto2str(int arptype, char *buf, size_t len);
577int nl_str2llproto(const char *name);
578-----
579
580[[link_attr_qdisc]]
581==== Qdisc
582The name of the queueing discipline used by the link is of informational
583nature only. It is a read-only attribute provided by the kernel and cannot
584be modified. The set function is provided solely for the purpose of creating
585link objects to be used for comparison.
586
587For more information on how to modify the qdisc of a link, see section
588<<route_tc>>.
589
590[source,c]
591-----
592#include <netlink/route/link.h>
593
594void rtnl_link_set_qdisc(struct rtnl_link *link, const char *name);
595char *rtnl_link_get_qdisc(struct rtnl_link *link);
596-----
597
598[[link_attr_promiscuity]]
599==== Promiscuity
600The number of subsystem currently depending on the link being promiscuous mode.
601A value of 0 indicates that the link is not in promiscuous mode. It is a
602read-only attribute provided by the kernel and cannot be modified. The set
603function is provided solely for the purpose of creating link objects to be
604used for comparison.
605
606[source,c]
607-----
608#include <netlink/route/link.h>
609
610void rtnl_link_set_promiscuity(struct rtnl_link *link, uint32_t count);
611uint32_t rtnl_link_get_promiscuity(struct rtnl_link *link);
612-----
613
614[[link_num_rxtx_queues]]
615==== RX/TX Queues
616The number of RX/TX queues the link provides. The attribute is writable but
617will only be considered when creating a new network device via netlink.
618
619[source,c]
620-----
621#include <netlink/route/link.h>
622
623void rtnl_link_set_num_tx_queues(struct rtnl_link *link, uint32_t nqueues);
624uint32_t rtnl_link_get_num_tx_queues(struct rtnl_link *link);
625
626void rtnl_link_set_num_rx_queues(struct rtnl_link *link, uint32_t nqueues);
627uint32_t rtnl_link_get_num_rx_queues(struct rtnl_link *link);
628-----
629
630[[link_attr_weight]]
631==== Weight
632This attribute is unused and obsoleted in all recent kernels.
633
634
635[[link_modules]]
636=== Modules
637
638[[link_bonding]]
639==== Bonding
640
641.Example: Add bonding link
642[source,c]
643-----
644#include <netlink/route/link.h>
645
646struct rtnl_link *link;
647
648link = rtnl_link_bond_alloc();
649rtnl_link_set_name(link, "my_bond");
650
651/* requires admin privileges */
652if (rtnl_link_add(sk, link, NLM_F_CREATE) < 0)
653	/* error */
654
655rtnl_link_put(link);
656-----
657
658[[link_vlan]]
659==== VLAN
660
661[source,c]
662-----
663extern char *		rtnl_link_vlan_flags2str(int, char *, size_t);
664extern int		rtnl_link_vlan_str2flags(const char *);
665
666extern int		rtnl_link_vlan_set_id(struct rtnl_link *, int);
667extern int		rtnl_link_vlan_get_id(struct rtnl_link *);
668
669extern int		rtnl_link_vlan_set_flags(struct rtnl_link *,
670						 unsigned int);
671extern int		rtnl_link_vlan_unset_flags(struct rtnl_link *,
672						   unsigned int);
673extern unsigned int	rtnl_link_vlan_get_flags(struct rtnl_link *);
674
675extern int		rtnl_link_vlan_set_ingress_map(struct rtnl_link *,
676						       int, uint32_t);
677extern uint32_t *	rtnl_link_vlan_get_ingress_map(struct rtnl_link *);
678
679extern int		rtnl_link_vlan_set_egress_map(struct rtnl_link *,
680						      uint32_t, int);
681extern struct vlan_map *rtnl_link_vlan_get_egress_map(struct rtnl_link *,
682						      int *);
683-----
684
685.Example: Add a VLAN device
686[source,c]
687-----
688struct rtnl_link *link;
689int master_index;
690
691/* lookup interface index of eth0 */
692if (!(master_index = rtnl_link_name2i(link_cache, "eth0")))
693	/* error */
694
695/* allocate new link object of type vlan */
696link = rtnl_link_vlan_alloc();
697
698/* set eth0 to be our master device */
699rtnl_link_set_link(link, master_index);
700
701rtnl_link_vlan_set_id(link, 10);
702
703if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
704	/* error */
705
706rtnl_link_put(link);
707-----
708
709[[link_macvlan]]
710==== MACVLAN
711
712[source,c]
713-----
714extern struct rtnl_link *rtnl_link_macvlan_alloc(void);
715
716extern int		rtnl_link_is_macvlan(struct rtnl_link *);
717
718extern char *		rtnl_link_macvlan_mode2str(int, char *, size_t);
719extern int		rtnl_link_macvlan_str2mode(const char *);
720
721extern char *		rtnl_link_macvlan_flags2str(int, char *, size_t);
722extern int		rtnl_link_macvlan_str2flags(const char *);
723
724extern int		rtnl_link_macvlan_set_mode(struct rtnl_link *,
725			                           uint32_t);
726extern uint32_t		rtnl_link_macvlan_get_mode(struct rtnl_link *);
727
728extern int		rtnl_link_macvlan_set_flags(struct rtnl_link *,
729						 uint16_t);
730extern int		rtnl_link_macvlan_unset_flags(struct rtnl_link *,
731						   uint16_t);
732extern uint16_t		rtnl_link_macvlan_get_flags(struct rtnl_link *);
733-----
734
735.Example: Add a MACVLAN device
736[source,c]
737-----
738struct rtnl_link *link;
739int master_index;
740struct nl_addr* addr;
741
742/* lookup interface index of eth0 */
743if (!(master_index = rtnl_link_name2i(link_cache, "eth0")))
744	/* error */
745
746/* allocate new link object of type macvlan */
747link = rtnl_link_macvlan_alloc();
748
749/* set eth0 to be our master device */
750rtnl_link_set_link(link, master_index);
751
752/* set address of virtual interface */
753addr = nl_addr_build(AF_LLC, ether_aton("00:11:22:33:44:55"), ETH_ALEN);
754rtnl_link_set_addr(link, addr);
755nl_addr_put(addr);
756
757/* set mode of virtual interface */
758rtnl_link_macvlan_set_mode(link, rtnl_link_macvlan_str2mode("bridge"));
759
760if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
761	/* error */
762
763rtnl_link_put(link);
764-----
765
766[[link_vxlan]]
767==== VXLAN
768
769[source,c]
770-----
771extern struct rtnl_link *rtnl_link_vxlan_alloc(void);
772
773extern int	rtnl_link_is_vxlan(struct rtnl_link *);
774
775extern int	rtnl_link_vxlan_set_id(struct rtnl_link *, uint32_t);
776extern int	rtnl_link_vxlan_get_id(struct rtnl_link *, uint32_t *);
777
778extern int	rtnl_link_vxlan_set_group(struct rtnl_link *, struct nl_addr *);
779extern int	rtnl_link_vxlan_get_group(struct rtnl_link *, struct nl_addr **);
780
781extern int	rtnl_link_vxlan_set_link(struct rtnl_link *, uint32_t);
782extern int	rtnl_link_vxlan_get_link(struct rtnl_link *, uint32_t *);
783
784extern int	rtnl_link_vxlan_set_local(struct rtnl_link *, struct nl_addr *);
785extern int	rtnl_link_vxlan_get_local(struct rtnl_link *, struct nl_addr **);
786
787extern int	rtnl_link_vxlan_set_ttl(struct rtnl_link *, uint8_t);
788extern int	rtnl_link_vxlan_get_ttl(struct rtnl_link *);
789
790extern int	rtnl_link_vxlan_set_tos(struct rtnl_link *, uint8_t);
791extern int	rtnl_link_vxlan_get_tos(struct rtnl_link *);
792
793extern int	rtnl_link_vxlan_set_learning(struct rtnl_link *, uint8_t);
794extern int	rtnl_link_vxlan_get_learning(struct rtnl_link *);
795extern int	rtnl_link_vxlan_enable_learning(struct rtnl_link *);
796extern int	rtnl_link_vxlan_disable_learning(struct rtnl_link *);
797
798extern int	rtnl_link_vxlan_set_ageing(struct rtnl_link *, uint32_t);
799extern int	rtnl_link_vxlan_get_ageing(struct rtnl_link *, uint32_t *);
800
801extern int	rtnl_link_vxlan_set_limit(struct rtnl_link *, uint32_t);
802extern int	rtnl_link_vxlan_get_limit(struct rtnl_link *, uint32_t *);
803
804extern int	rtnl_link_vxlan_set_port_range(struct rtnl_link *,
805					       struct ifla_vxlan_port_range *);
806extern int	rtnl_link_vxlan_get_port_range(struct rtnl_link *,
807					       struct ifla_vxlan_port_range *);
808
809extern int	rtnl_link_vxlan_set_proxy(struct rtnl_link *, uint8_t);
810extern int	rtnl_link_vxlan_get_proxy(struct rtnl_link *);
811extern int	rtnl_link_vxlan_enable_proxy(struct rtnl_link *);
812extern int	rtnl_link_vxlan_disable_proxy(struct rtnl_link *);
813
814extern int	rtnl_link_vxlan_set_rsc(struct rtnl_link *, uint8_t);
815extern int	rtnl_link_vxlan_get_rsc(struct rtnl_link *);
816extern int	rtnl_link_vxlan_enable_rsc(struct rtnl_link *);
817extern int	rtnl_link_vxlan_disable_rsc(struct rtnl_link *);
818
819extern int	rtnl_link_vxlan_set_l2miss(struct rtnl_link *, uint8_t);
820extern int	rtnl_link_vxlan_get_l2miss(struct rtnl_link *);
821extern int	rtnl_link_vxlan_enable_l2miss(struct rtnl_link *);
822extern int	rtnl_link_vxlan_disable_l2miss(struct rtnl_link *);
823
824extern int	rtnl_link_vxlan_set_l3miss(struct rtnl_link *, uint8_t);
825extern int	rtnl_link_vxlan_get_l3miss(struct rtnl_link *);
826extern int	rtnl_link_vxlan_enable_l3miss(struct rtnl_link *);
827extern int	rtnl_link_vxlan_disable_l3miss(struct rtnl_link *);
828-----
829
830.Example: Add a VXLAN device
831[source,c]
832-----
833struct rtnl_link *link;
834struct nl_addr* addr;
835
836/* allocate new link object of type vxlan */
837link = rtnl_link_vxlan_alloc();
838
839/* set interface name */
840rtnl_link_set_name(link, "vxlan128");
841
842/* set VXLAN network identifier */
843if ((err = rtnl_link_vxlan_set_id(link, 128)) < 0)
844	/* error */
845
846/* set multicast address to join */
847if ((err = nl_addr_parse("239.0.0.1", AF_INET, &addr)) < 0)
848	/* error */
849
850if ((err = rtnl_link_set_group(link, addr)) < 0)
851	/* error */
852
853nl_addr_put(addr);
854
855if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
856	/* error */
857
858rtnl_link_put(link);
859-----
860
861[[link_ipip]]
862==== IPIP
863
864[source,c]
865-----
866extern struct rtnl_link *rtnl_link_ipip_alloc(void);
867extern int rtnl_link_ipip_add(struct nl_sock *sk, const char *name);
868
869extern int rtnl_link_ipip_set_link(struct rtnl_link *link,  uint32_t index);
870extern uint32_t rtnl_link_ipip_get_link(struct rtnl_link *link);
871
872extern int rtnl_link_ipip_set_local(struct rtnl_link *link, uint32_t addr);
873extern uint32_t rtnl_link_ipip_get_local(struct rtnl_link *link);
874
875extern int rtnl_link_ipip_set_remote(struct rtnl_link *link, uint32_t addr);
876extern uint32_t rtnl_link_ipip_get_remote(struct rtnl_link *link);
877
878extern int rtnl_link_ipip_set_ttl(struct rtnl_link *link, uint8_t ttl);
879extern uint8_t rtnl_link_ipip_get_ttl(struct rtnl_link *link);
880
881extern int rtnl_link_ipip_set_tos(struct rtnl_link *link, uint8_t tos);
882extern uint8_t rtnl_link_ipip_get_tos(struct rtnl_link *link);
883
884extern int rtnl_link_ipip_set_pmtudisc(struct rtnl_link *link, uint8_t pmtudisc);
885extern uint8_t rtnl_link_ipip_get_pmtudisc(struct rtnl_link *link);
886
887-----
888
889.Example: Add a ipip tunnel device
890[source,c]
891-----
892struct rtnl_link *link
893struct in_addr addr
894
895/* allocate new link object of type vxlan */
896if(!(link = rtnl_link_ipip_alloc()))
897        /* error */
898
899/* set ipip tunnel name */
900if ((err = rtnl_link_set_name(link, "ipip-tun")) < 0)
901         /* error */
902
903/* set link index  */
904if ((err = rtnl_link_ipip_set_link(link, if_index)) < 0)
905        /* error */
906
907/* set local address */
908inet_pton(AF_INET, "192.168.254.12", &addr.s_addr);
909if ((err = rtnl_link_ipip_set_local(link, addr.s_addr)) < 0)
910        /* error */
911
912/* set remote address */
913inet_pton(AF_INET, "192.168.254.13", &addr.s_addr
914if ((err = rtnl_link_ipip_set_remote(link, addr.s_addr)) < 0)
915        /* error */
916
917/* set tunnel ttl  */
918if ((err = rtnl_link_ipip_set_ttl(link, 64)) < 0)
919        /* error */
920
921if((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
922        /* error */
923
924rtnl_link_put(link);
925-----
926
927[[link_ipgre]]
928==== IPGRE
929
930[source,c]
931-----
932extern struct rtnl_link *rtnl_link_ipgre_alloc(void);
933extern int rtnl_link_ipgre_add(struct nl_sock *sk, const char *name);
934
935extern int rtnl_link_ipgre_set_link(struct rtnl_link *link,  uint32_t index);
936extern uint32_t rtnl_link_ipgre_get_link(struct rtnl_link *link);
937
938extern int rtnl_link_ipgre_set_iflags(struct rtnl_link *link, uint16_t iflags);
939extern uint16_t rtnl_link_get_iflags(struct rtnl_link *link);
940
941extern int rtnl_link_ipgre_set_oflags(struct rtnl_link *link, uint16_t oflags);
942extern uint16_t rtnl_link_get_oflags(struct rtnl_link *link);
943
944extern int rtnl_link_ipgre_set_ikey(struct rtnl_link *link, uint32_t ikey);
945extern uint32_t rtnl_link_get_ikey(struct rtnl_link *link);
946
947extern int rtnl_link_ipgre_set_okey(struct rtnl_link *link, uint32_t okey);
948extern uint32_t rtnl_link_get_okey(struct rtnl_link *link)
949
950extern int rtnl_link_ipgre_set_local(struct rtnl_link *link, uint32_t addr);
951extern uint32_t rtnl_link_ipgre_get_local(struct rtnl_link *link);
952
953extern int rtnl_link_ipgre_set_remote(struct rtnl_link *link, uint32_t addr);
954extern uint32_t rtnl_link_ipgre_get_remote(struct rtnl_link *link);
955
956extern int rtnl_link_ipgre_set_ttl(struct rtnl_link *link, uint8_t ttl);
957extern uint8_t rtnl_link_ipgre_get_ttl(struct rtnl_link *link);
958
959extern int rtnl_link_ipgre_set_tos(struct rtnl_link *link, uint8_t tos);
960extern uint8_t rtnl_link_ipgre_get_tos(struct rtnl_link *link);
961
962extern int rtnl_link_ipgre_set_pmtudisc(struct rtnl_link *link, uint8_t pmtudisc);
963extern uint8_t rtnl_link_ipgre_get_pmtudisc(struct rtnl_link *link);
964
965-----
966
967.Example: Add a ipgre tunnel device
968[source,c]
969-----
970struct rtnl_link *link
971struct in_addr addr
972
973/* allocate new link object of type vxlan */
974if(!(link = rtnl_link_ipgre_alloc()))
975	/* error */
976
977/* set ipgre tunnel name */
978if ((err = rtnl_link_set_name(link, "ipgre-tun")) < 0)
979	/* error */
980
981/* set link index  */
982if ((err = rtnl_link_ipgre_set_link(link, if_index)) < 0)
983	/* error */
984
985/* set local address */
986inet_pton(AF_INET, "192.168.254.12", &addr.s_addr);
987if ((err = rtnl_link_ipgre_set_local(link, addr.s_addr)) < 0)
988	/* error */
989
990/* set remote address */
991inet_pton(AF_INET, "192.168.254.13", &addr.s_addr
992if ((err = rtnl_link_ipgre_set_remote(link, addr.s_addr)) < 0)
993	/* error */
994
995/* set tunnel ttl  */
996if ((err = rtnl_link_ipgre_set_ttl(link, 64)) < 0)
997	/* error */
998
999if((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
1000	/* error */
1001
1002rtnl_link_put(link);
1003-----
1004
1005[[link_sit]]
1006==== SIT
1007
1008[source,c]
1009-----
1010extern struct rtnl_link *rtnl_link_sit_alloc(void);
1011extern int rtnl_link_sit_add(struct nl_sock *sk, const char *name);
1012
1013extern int rtnl_link_sit_set_link(struct rtnl_link *link,  uint32_t index);
1014extern uint32_t rtnl_link_sit_get_link(struct rtnl_link *link);
1015
1016extern int rtnl_link_sit_set_iflags(struct rtnl_link *link, uint16_t iflags);
1017extern uint16_t rtnl_link_get_iflags(struct rtnl_link *link);
1018
1019extern int rtnl_link_sit_set_oflags(struct rtnl_link *link, uint16_t oflags);
1020extern uint16_t rtnl_link_get_oflags(struct rtnl_link *link);
1021
1022extern int rtnl_link_sit_set_ikey(struct rtnl_link *link, uint32_t ikey);
1023extern uint32_t rtnl_link_get_ikey(struct rtnl_link *link);
1024
1025extern int rtnl_link_sit_set_okey(struct rtnl_link *link, uint32_t okey);
1026extern uint32_t rtnl_link_get_okey(struct rtnl_link *link)
1027
1028extern int rtnl_link_sit_set_local(struct rtnl_link *link, uint32_t addr);
1029extern uint32_t rtnl_link_sit_get_local(struct rtnl_link *link);
1030
1031extern int rtnl_link_sit_set_remote(struct rtnl_link *link, uint32_t addr);
1032extern uint32_t rtnl_link_sit_get_remote(struct rtnl_link *link);
1033
1034extern int rtnl_link_sit_set_ttl(struct rtnl_link *link, uint8_t ttl);
1035extern uint8_t rtnl_link_sit_get_ttl(struct rtnl_link *link);
1036
1037extern int rtnl_link_sit_set_tos(struct rtnl_link *link, uint8_t tos);
1038extern uint8_t rtnl_link_sit_get_tos(struct rtnl_link *link);
1039
1040extern int rtnl_link_sit_set_pmtudisc(struct rtnl_link *link, uint8_t pmtudisc);
1041extern uint8_t rtnl_link_sit_get_pmtudisc(struct rtnl_link *link);
1042
1043-----
1044
1045.Example: Add a sit tunnel device
1046[source,c]
1047-----
1048struct rtnl_link *link
1049struct in_addr addr
1050
1051/* allocate new link object of type vxlan */
1052if(!(link = rtnl_link_sit_alloc()))
1053	/* error */
1054
1055/* set sit tunnel name */
1056if ((err = rtnl_link_set_name(link, "sit-tun")) < 0)
1057	/* error */
1058
1059/* set link index  */
1060if ((err = rtnl_link_sit_set_link(link, if_index)) < 0)
1061	/* error */
1062
1063/* set local address */
1064inet_pton(AF_INET, "192.168.254.12", &addr.s_addr);
1065if ((err = rtnl_link_sit_set_local(link, addr.s_addr)) < 0)
1066	/* error */
1067
1068/* set remote address */
1069inet_pton(AF_INET, "192.168.254.13", &addr.s_addr
1070if ((err = rtnl_link_sit_set_remote(link, addr.s_addr)) < 0)
1071	/* error */
1072
1073/* set tunnel ttl  */
1074if ((err = rtnl_link_sit_set_ttl(link, 64)) < 0)
1075	/* error */
1076
1077if((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
1078        /* error */
1079
1080rtnl_link_put(link);
1081-----
1082
1083
1084[[link_ipvti]]
1085==== IPVTI
1086
1087[source,c]
1088-----
1089extern struct rtnl_link *rtnl_link_ipvti_alloc(void);
1090extern int rtnl_link_ipvti_add(struct nl_sock *sk, const char *name);
1091
1092extern int rtnl_link_ipvti_set_link(struct rtnl_link *link,  uint32_t index);
1093extern uint32_t rtnl_link_ipvti_get_link(struct rtnl_link *link);
1094
1095extern int rtnl_link_ipvti_set_ikey(struct rtnl_link *link, uint32_t ikey);
1096extern uint32_t rtnl_link_get_ikey(struct rtnl_link *link);
1097
1098extern int rtnl_link_ipvti_set_okey(struct rtnl_link *link, uint32_t okey);
1099extern uint32_t rtnl_link_get_okey(struct rtnl_link *link)
1100
1101extern int rtnl_link_ipvti_set_local(struct rtnl_link *link, uint32_t addr);
1102extern uint32_t rtnl_link_ipvti_get_local(struct rtnl_link *link);
1103
1104extern int rtnl_link_ipvti_set_remote(struct rtnl_link *link, uint32_t addr);
1105extern uint32_t rtnl_link_ipvti_get_remote(struct rtnl_link *link);
1106
1107-----
1108
1109.Example: Add a ipvti tunnel device
1110[source,c]
1111-----
1112struct rtnl_link *link
1113struct in_addr addr
1114
1115/* allocate new link object of type vxlan */
1116if(!(link = rtnl_link_ipvti_alloc()))
1117	/* error */
1118
1119/* set ipvti tunnel name */
1120if ((err = rtnl_link_set_name(link, "ipvti-tun")) < 0)
1121	/* error */
1122
1123/* set link index  */
1124if ((err = rtnl_link_ipvti_set_link(link, if_index)) < 0)
1125	/* error */
1126
1127/* set local address */
1128inet_pton(AF_INET, "192.168.254.12", &addr.s_addr);
1129if ((err = rtnl_link_ipvti_set_local(link, addr.s_addr)) < 0)
1130	/* error */
1131
1132/* set remote address */
1133inet_pton(AF_INET, "192.168.254.13", &addr.s_addr
1134if ((err = rtnl_link_ipvti_set_remote(link, addr.s_addr)) < 0)
1135	/* error */
1136
1137if((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0)
1138	/* error */
1139
1140rtnl_link_put(link);
1141-----
1142
1143[[link_ip6tnl]]
1144==== IP6TNL
1145
1146[source,c]
1147-----
1148extern struct rtnl_link *rtnl_link_ip6_tnl_alloc(void);
1149extern int rtnl_link_ip6_tnl_add(struct nl_sock *sk, const char *name);
1150
1151extern int rtnl_link_ip6_tnl_set_link(struct rtnl_link *link,  uint32_t index);
1152extern uint32_t rtnl_link_ip6_tnl_get_link(struct rtnl_link *link);
1153
1154extern int rtnl_link_ip6_tnl_set_local(struct rtnl_link *link, struct in6_addr *);
1155extern int rtnl_link_ip6_tnl_get_local(struct rtnl_link *link, struct in6_addr *);
1156
1157extern int rtnl_link_ip6_tnl_set_remote(struct rtnl_link *link, struct in6_addr *);
1158extern int rtnl_link_ip6_tnl_get_remote(struct rtnl_link *link, struct in6_addr *);
1159
1160extern int rtnl_link_ip6_tnl_set_ttl(struct rtnl_link *link, uint8_t ttl);
1161extern uint8_t rtnl_link_ip6_tnl_get_ttl(struct rtnl_link *link);
1162
1163extern int rtnl_link_ip6_tnl_set_tos(struct rtnl_link *link, uint8_t tos);
1164extern uint8_t rtnl_link_ip6_tnl_get_tos(struct rtnl_link *link);
1165
1166extern int rtnl_link_ip6_tnl_set_encaplimit(struct rtnl_link *link, uint8_t encap_limit);
1167extern uint8_t rtnl_link_ip6_tnl_get_encaplimit(struct rtnl_link *link);
1168
1169extern int rtnl_link_ip6_tnl_set_flags(struct rtnl_link *link, uint32_t flags);
1170extern uint32_t rtnl_link_ip6_tnl_get_flags(struct rtnl_link *link);
1171
1172extern uint32_t rtnl_link_ip6_tnl_get_flowinfo(struct rtnl_link *link);
1173extern int rtnl_link_ip6_tnl_set_flowinfo(struct rtnl_link *link, uint32_t flowinfo);
1174
1175extern int rtnl_link_ip6_tnl_set_proto(struct rtnl_link *link, uint8_t proto);
1176extern uint8_t rtnl_link_ip6_tnl_get_proto(struct rtnl_link *link);
1177
1178-----
1179
1180.Example: Add a ip6tnl tunnel device
1181[source,c]
1182-----
1183struct rtnl_link *link
1184struct in6_addr addr
1185
1186link = rtnl_link_ip6_tnl_alloc();
1187
1188rtnl_link_set_name(link, "ip6tnl-tun");
1189rtnl_link_ip6_tnl_set_link(link, if_index);
1190
1191inet_pton(AF_INET6, "2607:f0d0:1002:51::4", &addr);
1192rtnl_link_ip6_tnl_set_local(link, &addr);
1193
1194inet_pton(AF_INET6, "2607:f0d0:1002:52::5", &addr);
1195rtnl_link_ip6_tnl_set_remote(link, &addr);
1196
1197rtnl_link_add(sk, link, NLM_F_CREATE);
1198rtnl_link_put(link);
1199
1200-----
1201
1202
1203== Neighbouring
1204
1205== Routing
1206
1207[[route_tc]]
1208== Traffic Control
1209
1210The traffic control architecture allows the queueing and
1211prioritization of packets before they are enqueued to the network
1212driver. To a limited degree it is also possible to take control of
1213network traffic as it enters the network stack.
1214
1215The architecture consists of three different types of modules:
1216
1217- *Queueing disciplines (qdisc)* provide a mechanism to enqueue packets
1218  in different forms. They may be used to implement fair queueing,
1219  prioritization of differentiated services, enforce bandwidth
1220  limitations, or even to simulate network behaviour such as packet
1221  loss and packet delay. Qdiscs can be classful in which case they
1222  allow traffic classes described in the next paragraph to be attached
1223  to them.
1224
1225- *Traffic classes (class)* are supported by several qdiscs to build
1226  a tree structure for different types of traffic. Each class may be
1227  assigned its own set of attributes such as bandwidth limits or
1228  queueing priorities. Some qdiscs even allow borrowing of bandwidth
1229  between classes.
1230
1231- *Classifiers (cls)* are used to decide which qdisc/class the packet
1232  should be enqueued to. Different types of classifiers exists,
1233  ranging from classification based on protocol header values to
1234  classification based on packet priority or firewall marks.
1235  Additionally most classifiers support *extended matches (ematch)*
1236  which allow extending classifiers by a set of matcher modules, and
1237  *actions* which allow classifiers to take actions such as mangling,
1238  mirroring, or even rerouting of packets.
1239
1240.Default Qdisc
1241
1242The default qdisc used on all network devices is `pfifo_fast`.
1243Network devices which do not require a transmit queue such as the
1244loopback device do not have a default qdisc attached. The `pfifo_fast`
1245qdisc provides three bands to prioritize interactive traffic over bulk
1246traffic. Classification is based on the packet priority (diffserv).
1247
1248image:qdisc_default.png["Default Qdisc"]
1249
1250.Multiqueue Default Qdisc
1251
1252If the network device provides multiple transmit queues the `mq`
1253qdisc is used by default. It will automatically create a separate
1254class for each transmit queue available and will also replace
1255the single per device tx lock with a per queue lock.
1256
1257image:qdisc_mq.png["Multiqueue default Qdisc"]
1258
1259.Example of a customized classful qdisc setup
1260
1261The following figure illustrates a possible combination of different
1262queueing and classification modules to implement quality of service
1263needs.
1264
1265image:tc_overview.png["Classful Qdisc diagram"]
1266
1267=== Traffic Control Object
1268
1269Each type traffic control module (qdisc, class, classifier) is
1270represented by its own structure. All of them are based on the traffic
1271control object represented by `struct rtnl_tc` which itself is based
1272on the generic object `struct nl_object` to make it cacheable. The
1273traffic control object contains all attributes, implementation details
1274and statistics that are shared by all of the traffic control object
1275types.
1276
1277image:tc_obj.png["struct rtnl_tc hierarchy"]
1278
1279It is not possible to allocate a `struct rtnl_tc` object, instead the
1280actual tc object types must be allocated directly using
1281`rtnl_qdisc_alloc()`, `rtnl_class_alloc()`, `rtnl_cls_alloc()` and
1282then casted to `struct rtnl_tc` using the `TC_CAST()` macro.
1283
1284.Usage Example: Allocation, Casting, Freeing
1285[source,c]
1286-----
1287#include <netlink/route/tc.h>
1288#include <netlink/route/qdisc.h>
1289
1290struct rtnl_qdisc *qdisc;
1291
1292/* Allocation of a qdisc object */
1293qdisc = rtnl_qdisc_alloc();
1294
1295/* Cast the qdisc to a tc object using TC_CAST() to use rtnl_tc_ functions. */
1296rtnl_tc_set_mpu(TC_CAST(qdisc), 64);
1297
1298/* Free the qdisc object */
1299rtnl_qdisc_put(qdisc);
1300-----
1301
1302[[tc_attr]]
1303==== Attributes
1304
1305Handle::
1306The handle uniquely identifies a tc object and is used to refer
1307to other tc objects when constructing tc trees.
1308+
1309[source,c]
1310-----
1311void rtnl_tc_set_handle(struct rtnl_tc *tc, uint32_t handle);
1312uint32_t rtnl_tc_get_handle(struct rtnl_tc *tc);
1313-----
1314
1315Interface Index::
1316The interface index specifies the network device the traffic object
1317is attached to. The function `rtnl_tc_set_link()` should be preferred
1318when setting the interface index. It stores the reference to the link
1319object in the tc object and allows retrieving the `mtu` and `linktype`
1320automatically.
1321+
1322[source,c]
1323-----
1324void rtnl_tc_set_ifindex(struct rtnl_tc *tc, int ifindex);
1325void rtnl_tc_set_link(struct rtnl_tc *tc, struct rtnl_link *link);
1326int rtnl_tc_get_ifindex(struct rtnl_tc *tc);
1327-----
1328
1329Link Type::
1330The link type specifies the kind of link that is used by the network
1331device (e.g. ethernet, ATM, ...). It is derived automatically when
1332the network device is specified with `rtnl_tc_set_link()`.
1333The default fallback is `ARPHRD_ETHER` (ethernet).
1334+
1335[source,c]
1336-----
1337void rtnl_tc_set_linktype(struct rtnl_tc *tc, uint32_t type);
1338uint32_t rtnl_tc_get_linktype(struct rtnl_tc *tc);
1339-----
1340
1341Kind::
1342The kind character string specifies the type of qdisc, class,
1343classifier. Setting the kind results in the module specific
1344structure being allocated. Therefore it is imperative to call
1345`rtnl_tc_set_kind()` before using any type specific API functions
1346such as `rtnl_htb_set_rate()`.
1347+
1348[source,c]
1349-----
1350int rtnl_tc_set_kind(struct rtnl_tc *tc, const char *kind);
1351char *rtnl_tc_get_kind(struct rtnl_tc *tc);
1352-----
1353
1354MPU::
1355The Minimum Packet Unit specifies the minimum packet size which will
1356be transmitted
1357ever be seen by this traffic control object. This value is used for
1358rate calculations. Not all object implementations will make use of
1359this value. The default value is 0.
1360+
1361[source,c]
1362-----
1363void rtnl_tc_set_mpu(struct rtnl_tc *tc, uint32_t mpu);
1364uint32_t rtnl_tc_get_mpu(struct rtnl_tc *tc);
1365-----
1366
1367MTU::
1368The Maximum Transmission Unit specifies the maximum packet size which
1369will be transmitted. The value is derived from the link specified
1370with `rtnl_tc_set_link()` if not overwritten with `rtnl_tc_set_mtu()`.
1371If no link and MTU is specified, the value defaults to 1500
1372(ethernet).
1373+
1374[source,c]
1375-----
1376void rtnl_tc_set_mtu(struct rtnl_tc *tc, uint32_t mtu);
1377uint32_t rtnl_tc_get_mtu(struct rtnl_tc *tc);
1378-----
1379
1380Overhead::
1381The overhead specifies the additional overhead per packet caused by
1382the network layer. This value can be used to correct packet size
1383calculations if the packet size on the wire does not match the packet
1384size seen by the kernel. The default value is 0.
1385+
1386[source,c]
1387-----
1388void rtnl_tc_set_overhead(struct rtnl_tc *tc, uint32_t overhead);
1389uint32_t rtnl_tc_get_overhead(struct rtnl_tc *tc);
1390-----
1391
1392Parent::
1393Specifies the parent traffic control object. The parent is identifier
1394by its handle. Special values are:
1395- `TC_H_ROOT`: attach tc object directly to network device (root
1396  qdisc, root classifier)
1397- `TC_H_INGRESS`: same as `TC_H_ROOT` but on the ingress side of the
1398  network stack.
1399+
1400[source,c]
1401-----
1402void rtnl_tc_set_parent(struct rtnl_tc *tc, uint32_t parent);
1403uint32_t rtnl_tc_get_parent(struct rtnl_tc *tc);
1404-----
1405
1406Statistics::
1407Generic statistics, see <<tc_stats>> for additional information.
1408+
1409[source,c]
1410-----
1411uint64_t rtnl_tc_get_stat(struct rtnl_tc *tc, enum rtnl_tc_stat id);
1412-----
1413
1414[[tc_stats]]
1415==== Accessing Statistics
1416
1417The traffic control object holds a set of generic statistics. Not all
1418traffic control modules will make use of all of these statistics. Some
1419modules may provide additional statistics via their own APIs.
1420
1421.Statistic identifiers `(enum rtnl_tc_stat)`
1422[cols="m,,", options="header", frame="topbot"]
1423|====================================================================
1424| ID                 | Type    | Description
1425| RTNL_TC_PACKETS    | Counter | Total # of packets transmitted
1426| RTNL_TC_BYTES      | Counter | Total # of bytes transmitted
1427| RTNL_TC_RATE_BPS   | Rate    | Current bytes/s rate
1428| RTNL_TC_RATE_PPS   | Rate    | Current packets/s rate
1429| RTNL_TC_QLEN       | Rate    | Current length of the queue
1430| RTNL_TC_BACKLOG    | Rate    | # of packets currently backloged
1431| RTNL_TC_DROPS      | Counter | # of packets dropped
1432| RTNL_TC_REQUEUES   | Counter | # of packets requeued
1433| RTNL_TC_OVERLIMITS | Counter | # of packets that exceeded the limit
1434|====================================================================
1435
1436NOTE: `RTNL_TC_RATE_BPS` and `RTNL_TC_RATE_PPS` only return meaningful
1437      values if a rate estimator has been configured.
1438
1439.Usage Example: Retrieving tc statistics
1440[source,c]
1441-------
1442#include <netlink/route/tc.h>
1443
1444uint64_t drops, qlen;
1445
1446drops = rtnl_tc_get_stat(TC_CAST(qdisc), RTNL_TC_DROPS);
1447qlen  = rtnl_tc_get_stat(TC_CAST(qdisc), RTNL_TC_QLEN);
1448-------
1449
1450==== Rate Table Calculations
1451
1452[[tc_qdisc]]
1453=== Queueing Discipline (qdisc)
1454
1455.Classless Qdisc
1456
1457The queueing discipline (qdisc) is used to implement fair queueing,
1458priorization or rate control. It provides a _enqueue()_ and
1459_dequeue()_ operation. Whenever a network packet leaves the networking
1460stack over a network device, be it a physical or virtual device, it
1461will be enqueued to a qdisc unless the device is queueless. The
1462_enqueue()_ operation is followed by an immediate call to _dequeue()_
1463for the same qdisc to eventually retrieve a packet which can be
1464scheduled for transmission by the driver. Additionally, the networking
1465stack runs a watchdog which polls the qdisc regularly to dequeue and
1466send packets even if no new packets are being enqueued.
1467
1468This additional watchdog is required due to the fact that qdiscs may
1469hold on to packets and not return any packets upon _dequeue()_ in
1470order to enforce bandwidth restrictions.
1471
1472image:classless_qdisc_nbands.png[alt="Multiband Qdisc", float="right"]
1473
1474The figure illustrates a trivial example of a classless qdisc
1475consisting of three bands (queues). Use of multiple bands is a common
1476technique in qdiscs to implement fair queueing between flows or
1477prioritize differentiated services.
1478
1479Classless qdiscs can be regarded as a blackbox, their inner workings
1480can only be steered using the configuration parameters provided by the
1481qdisc. There is no way of taking influence on the structure of its
1482internal queues itself.
1483
1484.Classful Qdisc
1485
1486Classful qdiscs allow for the queueing structure and classification
1487process to be created by the user.
1488
1489image:classful_qdisc.png["Classful Qdisc"]
1490
1491The figure above shows a classful qdisc with a classifier attached to
1492it which will make the decision whether to enqueue a packet to traffic
1493class +1:1+ or +1:2+. Unlike with classless qdiscs, classful qdiscs
1494allow the classification process and the structure of the queues to be
1495defined by the user. This allows for complex traffic class rules to
1496be applied.
1497
1498.List of Qdisc Implementations
1499[options="header", frame="topbot", cols="2,1^,8"]
1500|======================================================================
1501| Qdisc     | Classful | Description
1502| ATM       | Yes      | FIXME
1503| Blackhole | No       | This qdisc will drop all packets passed to it.
1504| CBQ       | Yes      |
1505The CBQ (Class Based Queueing) is a classful qdisc which allows
1506creating traffic classes and enforce bandwidth limitations for each
1507class.
1508| DRR       | Yes      |
1509The DRR (Deficit Round Robin) scheduler is a classful qdisc
1510impelemting fair queueing. Each class is assigned a quantum specyfing
1511the maximum number of bytes that can be served per round.  Unused
1512quantum at the end of the round is carried over to the next round.
1513| DSMARK   | Yes       | FIXME
1514| FIFO     | No        | FIXME
1515| GRED     | No        | FIXME
1516| HFSC     | Yes       | FIXME
1517| HTB      | Yes       | FIXME
1518| mq       | Yes       | FIXME
1519| multiq   | Yes       | FIXME
1520| netem    | No        | FIXME
1521| Prio     | Yes       | FIXME
1522| RED      | Yes       | FIXME
1523| SFQ      | Yes       | FIXME
1524| TBF      | Yes       | FIXME
1525| teql     | No        | FIXME
1526|======================================================================
1527
1528
1529.QDisc API Overview
1530[cols="a,a", options="header", frame="topbot"]
1531|====================================================================
1532| Attribute | C Interface
1533|
1534Allocation / Freeing::
1535|
1536[source,c]
1537-----
1538struct rtnl_qdisc *rtnl_qdisc_alloc(void);
1539void rtnl_qdisc_put(struct rtnl_qdisc *qdisc);
1540-----
1541|
1542Addition::
1543|
1544[source,c]
1545-----
1546int rtnl_qdisc_build_add_request(struct rtnl_qdisc *qdisc, int flags,
1547				 struct nl_msg **result);
1548int rtnl_qdisc_add(struct nl_sock *sock, struct rtnl_qdisc *qdisc,
1549                   int flags);
1550-----
1551|
1552Modification::
1553|
1554[source,c]
1555-----
1556int rtnl_qdisc_build_change_request(struct rtnl_qdisc *old,
1557				    struct rtnl_qdisc *new,
1558				    struct nl_msg **result);
1559int rtnl_qdisc_change(struct nl_sock *sock, struct rtnl_qdisc *old,
1560		      struct rtnl_qdisc *new);
1561-----
1562|
1563Deletion::
1564|
1565[source,c]
1566-----
1567int rtnl_qdisc_build_delete_request(struct rtnl_qdisc *qdisc,
1568				    struct nl_msg **result);
1569int rtnl_qdisc_delete(struct nl_sock *sock, struct rtnl_qdisc *qdisc);
1570-----
1571|
1572Cache::
1573|
1574[source,c]
1575-----
1576int rtnl_qdisc_alloc_cache(struct nl_sock *sock,
1577			   struct nl_cache **cache);
1578struct rtnl_qdisc *rtnl_qdisc_get(struct nl_cache *cache, int, uint32_t);
1579
1580struct rtnl_qdisc *rtnl_qdisc_get_by_parent(struct nl_cache *, int, uint32_t);
1581-----
1582|====================================================================
1583
1584[[qdisc_get]]
1585==== Retrieving Qdisc Configuration
1586
1587The function rtnl_qdisc_alloc_cache() is used to retrieve the current
1588qdisc configuration in the kernel. It will construct a +RTM_GETQDISC+
1589netlink message, requesting the complete list of qdiscs configured in
1590the kernel.
1591
1592[source,c]
1593-------
1594#include <netlink/route/qdisc.h>
1595
1596struct nl_cache *all_qdiscs;
1597
1598if (rtnl_link_alloc_cache(sock, &all_qdiscs) < 0)
1599	/* error while retrieving qdisc cfg */
1600-------
1601
1602The cache can be accessed using the following functions:
1603
1604- Search qdisc with matching ifindex and handle:
1605+
1606[source,c]
1607--------
1608struct rtnl_qdisc *rtnl_qdisc_get(struct nl_cache *cache, int ifindex, uint32_t handle);
1609--------
1610- Search qdisc with matching ifindex and parent:
1611+
1612[source,c]
1613--------
1614struct rtnl_qdisc *rtnl_qdisc_get_by_parent(struct nl_cache *cache, int ifindex , uint32_t parent);
1615--------
1616- Or any of the generic cache functions (e.g. nl_cache_search(), nl_cache_dump(), etc.)
1617
1618.Example: Search and print qdisc
1619[source,c]
1620-------
1621struct rtnl_qdisc *qdisc;
1622int ifindex;
1623
1624ifindex = rtnl_link_get_ifindex(eth0_obj);
1625
1626/* search for qdisc on eth0 with handle 1:0 */
1627if (!(qdisc = rtnl_qdisc_get(all_qdiscs, ifindex, TC_HANDLE(1, 0))))
1628	/* no such qdisc found */
1629
1630nl_object_dump(OBJ_CAST(qdisc), NULL);
1631
1632rtnl_qdisc_put(qdisc);
1633-------
1634
1635[[qdisc_add]]
1636==== Adding a Qdisc
1637
1638In order to add a new qdisc to the kernel, a qdisc object needs to be
1639allocated. It will hold all attributes of the new qdisc.
1640
1641[source,c]
1642-----
1643#include <netlink/route/qdisc.h>
1644
1645struct rtnl_qdisc *qdisc;
1646
1647if (!(qdisc = rtnl_qdisc_alloc()))
1648	/* OOM error */
1649-----
1650
1651The next step is to specify all generic qdisc attributes using the tc
1652object interface described in the section <<tc_attr>>.
1653
1654The following attributes must be specified:
1655- IfIndex
1656- Parent
1657- Kind
1658
1659[source,c]
1660-----
1661/* Attach qdisc to device eth0 */
1662rtnl_tc_set_link(TC_CAST(qdisc), eth0_obj);
1663
1664/* Make this the root qdisc */
1665rtnl_tc_set_parent(TC_CAST(qdisc), TC_H_ROOT);
1666
1667/* Set qdisc identifier to 1:0, if left unspecified, a handle will be generated by the kernel. */
1668rtnl_tc_set_handle(TC_CAST(qdisc), TC_HANDLE(1, 0));
1669
1670/* Make this a HTB qdisc */
1671rtnl_tc_set_kind(TC_CAST(qdisc), "htb");
1672-----
1673
1674After specyfing the qdisc kind (rtnl_tc_set_kind()) the qdisc type
1675specific interface can be used to set attributes which are specific
1676to the respective qdisc implementations:
1677
1678[source,c]
1679------
1680/* HTB feature: Make unclassified packets go to traffic class 1:5 */
1681rtnl_htb_set_defcls(qdisc, TC_HANDLE(1, 5));
1682------
1683
1684Finally, the qdisc is ready to be added and can be passed on to the
1685function rntl_qdisc_add() which takes care of constructing a netlink
1686message requesting the addition of the new qdisc, sends the message to
1687the kernel and waits for the response by the kernel. The function
1688returns 0 if the qdisc has been added or updated successfully or a
1689negative error code if an error occured.
1690
1691CAUTION: The kernel operation for updating and adding a qdisc is the
1692         same. Therefore when calling rtnl_qdisc_add() any existing
1693         qdisc with matching handle will be updated unless the flag
1694         NLM_F_EXCL is specified.
1695
1696The following flags may be specified:
1697[horizontal]
1698NLM_F_CREATE::  Create qdisc if it does not exist, otherwise
1699                -NLE_OBJ_NOTFOUND is returned.
1700NLM_F_REPLACE:: If another qdisc is already attached to the same
1701                parent and their handles mismatch, replace the qdisc
1702                instead of returning -EEXIST.
1703NLM_F_EXCL::    Return -NLE_EXISTS if a qdisc with matching handles
1704                exists already.
1705
1706WARNING: The function rtnl_qdisc_add() requires administrator
1707         privileges.
1708
1709[source,c]
1710------
1711/* Submit request to kernel and wait for response */
1712err = rtnl_qdisc_add(sock, qdisc, NLM_F_CREATE);
1713
1714/* Return the qdisc object to free memory resources */
1715rtnl_qdisc_put(qdisc);
1716
1717if (err < 0) {
1718	fprintf(stderr, "Unable to add qdisc: %s\n", nl_geterror(err));
1719	return err;
1720}
1721------
1722
1723==== Deleting a qdisc
1724
1725[source,c]
1726------
1727#include <netlink/route/qdisc.h>
1728
1729struct rtnl_qdisc *qdisc;
1730
1731qdisc = rtnl_qdisc_alloc();
1732
1733rtnl_tc_set_link(TC_CAST(qdisc), eth0_obj);
1734rtnl_tc_set_parent(TC_CAST(qdisc), TC_H_ROOT);
1735
1736rtnl_qdisc_delete(sock, qdisc)
1737
1738rtnl_qdisc_put(qdisc);
1739------
1740
1741WARNING: The function rtnl_qdisc_delete() requires administrator
1742         privileges.
1743
1744
1745[[qdisc_htb]]
1746==== HTB - Hierarchical Token Bucket
1747
1748.HTB Qdisc Attributes
1749
1750Default Class::
1751The default class is the fallback class to which all traffic which
1752remained unclassified is directed to. If no default class or an
1753invalid default class is specified, packets are transmitted directly
1754to the next layer (direct transmissions).
1755+
1756[source,c]
1757-----
1758uint32_t rtnl_htb_get_defcls(struct rtnl_qdisc *qdisc);
1759int rtnl_htb_set_defcls(struct rtnl_qdisc *qdisc, uint32_t defcls);
1760-----
1761
1762Rate to Quantum (r2q)::
1763TODO
1764+
1765[source,c]
1766-----
1767uint32_t rtnl_htb_get_rate2quantum(struct rtnl_qdisc *qdisc);
1768int rtnl_htb_set_rate2quantum(struct rtnl_qdisc *qdisc, uint32_t rate2quantum);
1769-----
1770
1771
1772.HTB Class Attributes
1773
1774Priority::
1775+
1776[source,c]
1777-----
1778uint32_t rtnl_htb_get_prio(struct rtnl_class *class);
1779int rtnl_htb_set_prio(struct rtnl_class *class, uint32_t prio);
1780-----
1781
1782Rate::
1783The rate (bytes/s) specifies the maximum bandwidth an invidivual class
1784can use without borrowing. The rate of a class should always be greater
1785or erqual than the rate of its children.
1786+
1787[source,c]
1788-----
1789uint32_t rtnl_htb_get_rate(struct rtnl_class *class);
1790int rtnl_htb_set_rate(struct rtnl_class *class, uint32_t ceil);
1791-----
1792
1793Ceil Rate::
1794The ceil rate specifies the maximum bandwidth an invidivual class
1795can use. This includes bandwidth that is being borrowed from other
1796classes. Ceil defaults to the class rate implying that by default
1797the class will not borrow. The ceil rate of a class should always
1798be greater or erqual than the ceil rate of its children.
1799+
1800[source,c]
1801-----
1802uint32_t rtnl_htb_get_ceil(struct rtnl_class *class);
1803int rtnl_htb_set_ceil(struct rtnl_class *class, uint32_t ceil);
1804-----
1805
1806Burst::
1807TODO
1808+
1809[source,c]
1810-----
1811uint32_t rtnl_htb_get_rbuffer(struct rtnl_class *class);
1812int rtnl_htb_set_rbuffer(struct rtnl_class *class, uint32_t burst);
1813-----
1814
1815Ceil Burst::
1816TODO
1817+
1818[source,c]
1819-----
1820uint32_t rtnl_htb_get_bbuffer(struct rtnl_class *class);
1821int rtnl_htb_set_bbuffer(struct rtnl_class *class, uint32_t burst);
1822-----
1823
1824Quantum::
1825TODO
1826+
1827[source,c]
1828-----
1829int rtnl_htb_set_quantum(struct rtnl_class *class, uint32_t quantum);
1830-----
1831
1832extern int	rtnl_htb_set_cbuffer(struct rtnl_class *, uint32_t);
1833
1834
1835
1836
1837[[tc_class]]
1838=== Class
1839
1840[options="header", cols="s,a,a,a,a"]
1841|=======================================================================
1842|        | UNSPEC             | TC_H_ROOT          | 0:pY  | pX:pY
1843| UNSPEC 3+^|
1844[horizontal]
1845qdisc =:: root-qdisc
1846class =:: root-qdisc:0
1847|
1848[horizontal]
1849qdisc =:: pX:0
1850class =:: pX:0
1851| 0:hY 3+^|
1852[horizontal]
1853qdisc =:: root-qdisc
1854class =:: root-qdisc:hY
1855|
1856[horizontal]
1857qdisc =:: pX:0
1858class =:: pX:hY
1859| hX:hY 3+^|
1860[horizontal]
1861qdisc =:: hX:
1862class =:: hX:hY
1863|
1864if pX != hX
1865    return -EINVAL
1866[horizontal]
1867qdisc =:: hX:
1868class =:: hX:hY
1869|=======================================================================
1870
1871[[tc_cls]]
1872=== Classifier (cls)
1873
1874TODO
1875
1876[[tc_classid_mngt]]
1877=== ClassID Management
1878
1879TODO
1880
1881[[tc_pktloc]]
1882=== Packet Location Aliasing (pktloc)
1883
1884TODO
1885
1886[[tc_api]]
1887=== Traffic Control Module API
1888
1889TODO
1890