.. _reassembly:

IP Reassembly
=============

Some VPP features need access to the whole packet and/or need to
classify streams based on L4 headers. Reassembly makes both possible.

Full reassembly vs shallow (virtual) reassembly
-----------------------------------------------

There are two kinds of reassembly available in VPP:

1. Full reassembly turns a stream of packet fragments into one packet
   containing all reassembled data, with the fragment bits cleared and
   the fragment header stripped (in the case of ip6). Note that the
   resulting packet may come out of reassembly as a buffer chain.
   Because it is impractical to parse headers which are split over
   multiple buffers, vlib_buffer_chain_linearize() is called after
   reassembly so that the L2/L3/L4 headers can be found in the first
   buffer. Full reassembly is costly and should not be used unless
   necessary. By default, full reassembly is enabled for both ipv4 and
   ipv6 "forus" traffic, that is, packets addressed to VPP itself. This
   can be disabled via API if desired, in which case "forus" fragments
   are dropped.

2. Shallow (virtual) reassembly (SVR) allows various classifying
   and/or translating features to work with fragments without having
   to understand fragmentation. It works by extracting L4 data and
   adding it to vnet_buffer for each packet/fragment passing through
   the SVR nodes. This is done for both fragments and regular packets,
   allowing consuming code to treat all packets the same way. SVR
   caches incoming packet fragments (buffers) until the first fragment
   is seen. It then extracts L4 data from that first fragment, fills it
   in for any cached fragments and transmits them in the same order as
   they were received. From that point on, any further fragments get
   L4 data populated in vnet_buffer based on the reassembly context.
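
The SVR caching behaviour described in item 2 can be sketched as
follows. This is a minimal, self-contained illustration and not VPP
code: the types ``svr_pkt_t`` and ``svr_ctx_t``, the function
``svr_input()`` and the limit ``SVR_MAX_CACHED`` are hypothetical, and
the L4 data is reduced to a pair of ports.

```c
#include <assert.h>
#include <stdint.h>

#define SVR_MAX_CACHED 8 /* hypothetical per-context cache limit */

typedef struct
{
  uint16_t frag_offset;	       /* 0 for the first fragment */
  uint16_t src_port, dst_port; /* only readable when frag_offset == 0 */
  /* Metadata filled in by SVR (stand-in for vnet_buffer fields). */
  uint16_t l4_src_port, l4_dst_port;
  int l4_valid;
} svr_pkt_t;

typedef struct
{
  int have_l4; /* set once the first fragment has been seen */
  uint16_t src_port, dst_port;
  svr_pkt_t *cached[SVR_MAX_CACHED];
  int n_cached;
} svr_ctx_t;

static void
svr_fill (const svr_ctx_t *ctx, svr_pkt_t *p)
{
  p->l4_src_port = ctx->src_port;
  p->l4_dst_port = ctx->dst_port;
  p->l4_valid = 1;
}

/* Process one fragment. Packets that are ready to be forwarded are
 * written to out[] (room for SVR_MAX_CACHED + 1 entries) in arrival
 * order. Returns their count, or -1 if the cache is full. */
static int
svr_input (svr_ctx_t *ctx, svr_pkt_t *p, svr_pkt_t **out)
{
  int n = 0;

  if (!ctx->have_l4 && p->frag_offset != 0)
    {
      /* L4 data not known yet: hold the fragment back. */
      if (ctx->n_cached == SVR_MAX_CACHED)
	return -1;
      ctx->cached[ctx->n_cached++] = p;
      return 0;
    }

  if (!ctx->have_l4)
    {
      /* First fragment: learn L4 data, then flush the cache. */
      ctx->src_port = p->src_port;
      ctx->dst_port = p->dst_port;
      ctx->have_l4 = 1;
      for (int i = 0; i < ctx->n_cached; i++)
	{
	  svr_fill (ctx, ctx->cached[i]);
	  out[n++] = ctx->cached[i];
	}
      ctx->n_cached = 0;
    }

  /* Current packet gets its metadata from the context. */
  svr_fill (ctx, p);
  out[n++] = p;
  return n;
}
```

In this sketch, fragments arriving before the first fragment produce no
output, the first fragment releases everything cached so far followed
by itself, and later fragments pass straight through with metadata
copied from the context.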

Multi-worker behaviour
^^^^^^^^^^^^^^^^^^^^^^

Both reassembly types handle fragments arriving on different workers
via the handoff mechanism. All reassembly contexts are stored in pools.
A bihash mapping a 5-tuple key to a value containing the pool index and
thread index is used for lookups. When a lookup finds an existing
reassembly owned by a different thread, the fragment is handed off to
that thread. If the lookup fails, a new reassembly context is created
and the current worker becomes the owner of that context. Further
fragments received on other worker threads are then handed off to the
owner worker thread.
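
The lookup-and-handoff decision described above can be sketched as
follows. This is only an illustration under stated assumptions: a small
linear-probing table stands in for the bihash, the key is simplified,
and every name (``ho_key_t``, ``ho_lookup_or_create()``,
``HO_TABLE_SIZE``, ``HO_MAX_THREADS``) is hypothetical rather than
taken from the VPP sources.

```c
#include <assert.h>
#include <stdint.h>

#define HO_TABLE_SIZE  64 /* hypothetical; stands in for the bihash */
#define HO_MAX_THREADS 8  /* hypothetical number of workers */

typedef struct
{
  uint32_t src, dst;
  uint16_t frag_id;
  uint8_t proto;
} ho_key_t; /* simplified lookup key */

typedef struct
{
  ho_key_t key;
  uint32_t pool_index;	 /* index of the context in the owner's pool */
  uint32_t thread_index; /* worker which owns the context */
  int in_use;
} ho_entry_t;

typedef enum
{
  HO_PROCESS_LOCALLY,
  HO_HANDOFF,
  HO_DROP,
} ho_action_t;

static ho_entry_t ho_table[HO_TABLE_SIZE];
static uint32_t ho_next_pool_index[HO_MAX_THREADS];

static int
ho_key_equal (const ho_key_t *a, const ho_key_t *b)
{
  return a->src == b->src && a->dst == b->dst && a->frag_id == b->frag_id &&
	 a->proto == b->proto;
}

static uint32_t
ho_key_hash (const ho_key_t *k)
{
  return (k->src * 31u + k->dst * 17u + k->frag_id * 7u + k->proto) %
	 HO_TABLE_SIZE;
}

/* Decide what to do with a fragment seen on this_thread. On return,
 * *owner holds the thread which owns the reassembly context. */
static ho_action_t
ho_lookup_or_create (const ho_key_t *k, uint32_t this_thread,
		     uint32_t *owner)
{
  uint32_t slot = ho_key_hash (k);

  assert (this_thread < HO_MAX_THREADS);
  for (uint32_t i = 0; i < HO_TABLE_SIZE; i++)
    {
      ho_entry_t *e = &ho_table[slot];
      if (e->in_use && ho_key_equal (&e->key, k))
	{
	  /* Existing context: stay local only if we own it. */
	  *owner = e->thread_index;
	  return e->thread_index == this_thread ? HO_PROCESS_LOCALLY :
						   HO_HANDOFF;
	}
      if (!e->in_use)
	{
	  /* Lookup failed: create a context owned by this worker. */
	  e->key = *k;
	  e->pool_index = ho_next_pool_index[this_thread]++;
	  e->thread_index = this_thread;
	  e->in_use = 1;
	  *owner = this_thread;
	  return HO_PROCESS_LOCALLY;
	}
      slot = (slot + 1) % HO_TABLE_SIZE;
    }
  return HO_DROP; /* table full */
}
```

The first worker to see a fragment of a given packet becomes the owner.
Any other worker looking up the same key is told to hand the fragment
off to that owner instead of touching the context itself.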

Full reassembly also remembers the thread index where the first
fragment (that is, the fragment with fragment offset 0) was seen and
uses the handoff mechanism to send the reassembled packet out on that
thread, even if the pool owner is a different thread. This then
requires an additional handoff to free the reassembly context, as only
the pool owner can do that in a thread-safe way.

Limits
^^^^^^

Because reassembly could be an attack vector, there are configurable
limits on the number of concurrent reassemblies and on the maximum
number of fragments per packet.

Custom applications
^^^^^^^^^^^^^^^^^^^

Both reassembly features can be used by custom applications which are
not part of the VPP source tree. Be it patches or 3rd party plugins,
they can build their own graph paths by using the "-custom*" versions
of the nodes. Reassembly then reads next_index and error_next_index
for each buffer from vnet_buffer, allowing the custom application to
steer both reassembled packets and any packets which are considered an
error in whatever way it requires.

Full reassembly
---------------

Configuration
^^^^^^^^^^^^^

Configuration is via API (``ip_reassembly_enable_disable``) or CLI:

``set interface reassembly <interface-name> [on|off|ip4|ip6]``

Here, ``on`` enables reassembly for both ip4 and ip6.

A show command is provided to see reassembly contexts:

For ip4:

``show ip4-full-reassembly [details]``

For ip6:

``show ip6-full-reassembly [details]``

Global full reassembly parameters can be modified using the API call
``ip_reassembly_set`` and retrieved using ``ip_reassembly_get``.

Defaults
""""""""

For default values, see the #defines in

`ip4_full_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip4_full_reass.c>`_

========================================= ==========================================
#define                                   description
========================================= ==========================================
IP4_REASS_TIMEOUT_DEFAULT_MS              timeout in milliseconds
IP4_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
IP4_REASS_MAX_REASSEMBLIES_DEFAULT        maximum number of concurrent reassemblies
IP4_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT   maximum number of fragments per reassembly
========================================= ==========================================

and

`ip6_full_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip6_full_reass.c>`_

========================================= ==========================================
#define                                   description
========================================= ==========================================
IP6_REASS_TIMEOUT_DEFAULT_MS              timeout in milliseconds
IP6_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
IP6_REASS_MAX_REASSEMBLIES_DEFAULT        maximum number of concurrent reassemblies
IP6_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT   maximum number of fragments per reassembly
========================================= ==========================================

Finished/expired contexts
^^^^^^^^^^^^^^^^^^^^^^^^^

Reassembly contexts are freed either when reassembly is finished (when
all data has been received) or when they time out. A process
periodically walks all reassemblies and frees any expired ones.
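
A single pass of such an expiry walk might look like the sketch below.
It is a simplified illustration only: ``exp_ctx_t`` and ``exp_walk()``
are hypothetical names, and measuring the timeout from the last
received fragment is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
  int in_use;
  uint64_t last_heard_ms; /* assumed: time the last fragment arrived */
} exp_ctx_t;

/* Walk n contexts and free those idle for longer than timeout_ms.
 * Assumes now_ms >= last_heard_ms. Returns the number freed. */
static int
exp_walk (exp_ctx_t *pool, int n, uint64_t now_ms, uint64_t timeout_ms)
{
  int freed = 0;

  for (int i = 0; i < n; i++)
    if (pool[i].in_use && now_ms - pool[i].last_heard_ms > timeout_ms)
      {
	pool[i].in_use = 0; /* release the context */
	freed++;
      }
  return freed;
}
```

A periodic process would call such a function once per walk interval,
with the timeout taken from the configured reassembly parameters.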

Shallow (virtual) reassembly
----------------------------

Configuration
^^^^^^^^^^^^^

Configuration is via API (``ip_reassembly_enable_disable``) only, as
there is no value in turning SVR on by hand without a feature consuming
the buffer metadata. SVR is designed to be turned on programmatically
by a feature which requires it.

A show command is provided to see reassembly contexts:

For ip4:

``show ip4-sv-reassembly [details]``

For ip6:

``show ip6-sv-reassembly [details]``

Global shallow reassembly parameters can be modified using the API call
``ip_reassembly_set`` and retrieved using ``ip_reassembly_get``.

Defaults
""""""""

For default values, see the #defines in

`ip4_sv_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip4_sv_reass.c>`_

============================================ ==========================================
#define                                      description
============================================ ==========================================
IP4_SV_REASS_TIMEOUT_DEFAULT_MS              timeout in milliseconds
IP4_SV_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
IP4_SV_REASS_MAX_REASSEMBLIES_DEFAULT        maximum number of concurrent reassemblies
IP4_SV_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT   maximum number of fragments per reassembly
============================================ ==========================================

and

`ip6_sv_reass.c <__REPOSITORY_URL__/src/vnet/ip/reass/ip6_sv_reass.c>`_

============================================ ==========================================
#define                                      description
============================================ ==========================================
IP6_SV_REASS_TIMEOUT_DEFAULT_MS              timeout in milliseconds
IP6_SV_REASS_EXPIRE_WALK_INTERVAL_DEFAULT_MS interval between reaping expired sessions
IP6_SV_REASS_MAX_REASSEMBLIES_DEFAULT        maximum number of concurrent reassemblies
IP6_SV_REASS_MAX_REASSEMBLY_LENGTH_DEFAULT   maximum number of fragments per reassembly
============================================ ==========================================

Expiring contexts
^^^^^^^^^^^^^^^^^

There is no way of knowing when a reassembly is finished without
performing (an almost) full reassembly, so contexts in SVR cannot be
freed in the same way as in full reassembly. Instead, a least recently
used (LRU) list is maintained, in which reassembly contexts are ordered
by last update. Whenever SVR hits the limit on the number of concurrent
reassembly contexts, the oldest context is freed. There is also a
process reaping expired sessions, similar to the one in full
reassembly.
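
The LRU eviction described above can be sketched with an index-linked
list over a fixed array of contexts. This is an illustration under
assumptions, not the VPP implementation: ``lru_pool_t``,
``lru_alloc()``, ``lru_touch()`` and the limit ``LRU_MAX_CTX`` are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define LRU_MAX_CTX 4 /* hypothetical limit on concurrent contexts */
#define LRU_NONE    (-1)

typedef struct
{
  int prev, next; /* LRU list links, as indices into ctx[] */
  int in_use;
  uint32_t id; /* stand-in for the reassembly key */
} lru_ctx_t;

typedef struct
{
  lru_ctx_t ctx[LRU_MAX_CTX];
  int head; /* least recently updated context */
  int tail; /* most recently updated context */
  int n_used;
} lru_pool_t;

static void
lru_init (lru_pool_t *p)
{
  for (int i = 0; i < LRU_MAX_CTX; i++)
    p->ctx[i].in_use = 0;
  p->head = p->tail = LRU_NONE;
  p->n_used = 0;
}

static void
lru_unlink (lru_pool_t *p, int i)
{
  lru_ctx_t *c = &p->ctx[i];
  if (c->prev != LRU_NONE)
    p->ctx[c->prev].next = c->next;
  else
    p->head = c->next;
  if (c->next != LRU_NONE)
    p->ctx[c->next].prev = c->prev;
  else
    p->tail = c->prev;
}

static void
lru_push_tail (lru_pool_t *p, int i)
{
  lru_ctx_t *c = &p->ctx[i];
  c->prev = p->tail;
  c->next = LRU_NONE;
  if (p->tail != LRU_NONE)
    p->ctx[p->tail].next = i;
  else
    p->head = i;
  p->tail = i;
}

/* Call whenever context i is updated by a passing fragment. */
static void
lru_touch (lru_pool_t *p, int i)
{
  lru_unlink (p, i);
  lru_push_tail (p, i);
}

/* Allocate a context. If the limit is reached, the least recently
 * updated context is reused and *evicted is set to 1. */
static int
lru_alloc (lru_pool_t *p, uint32_t id, int *evicted)
{
  int i;

  *evicted = 0;
  if (p->n_used == LRU_MAX_CTX)
    {
      i = p->head;
      lru_unlink (p, i);
      *evicted = 1;
    }
  else
    {
      for (i = 0; p->ctx[i].in_use; i++)
	;
      p->n_used++;
    }
  p->ctx[i].in_use = 1;
  p->ctx[i].id = id;
  lru_push_tail (p, i);
  return i;
}
```

Keeping the list ordered by last update makes both operations cheap:
touching a context and picking the eviction victim are constant-time
list manipulations.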

Truncated packets
^^^^^^^^^^^^^^^^^

When SVR detects that a packet has been truncated in such a way that
the L4 headers are not available, it marks the packet as such in
vnet_buffer, allowing downstream features to handle such packets as
they see fit.

Fast path/slow path
^^^^^^^^^^^^^^^^^^^

SVR is implemented with a fast path and a slow path. By default, it
assumes that passing traffic doesn't contain fragments and processes
buffers in a dual-loop. If it sees a fragment, it jumps to single-loop
processing.
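
The control flow of such a fast path/slow path split can be sketched as
below. The sketch is illustrative only: ``fp_pkt_t``, ``fp_process()``
and the ``handled_by`` marker are hypothetical, and handing the whole
remainder of the frame to the slow path once a fragment is seen is an
assumption of this example.

```c
#include <assert.h>

enum
{
  FP_FAST = 1,
  FP_SLOW = 2,
};

typedef struct
{
  int is_fragment;
  int handled_by; /* FP_FAST or FP_SLOW, recorded for illustration */
} fp_pkt_t;

static void
fp_fast_one (fp_pkt_t *p)
{
  p->handled_by = FP_FAST; /* cheap processing, no reassembly lookup */
}

static void
fp_slow_one (fp_pkt_t *p)
{
  p->handled_by = FP_SLOW; /* full per-packet processing */
}

/* Process a frame of n packets. Returns how many packets were handled
 * by the fast path before falling back to the slow path. */
static int
fp_process (fp_pkt_t *pkts, int n)
{
  int i = 0;
  int n_fast;

  /* Fast path: dual loop, valid only while no fragment is seen. */
  while (n - i >= 2 && !pkts[i].is_fragment && !pkts[i + 1].is_fragment)
    {
      fp_fast_one (&pkts[i]);
      fp_fast_one (&pkts[i + 1]);
      i += 2;
    }
  /* A single trailing non-fragment can still take the fast path. */
  if (n - i == 1 && !pkts[i].is_fragment)
    {
      fp_fast_one (&pkts[i]);
      i++;
    }
  n_fast = i;

  /* Slow path: single loop over whatever is left. */
  for (; i < n; i++)
    fp_slow_one (&pkts[i]);

  return n_fast;
}
```

For a frame without fragments the slow loop never runs, so the common
case pays only for the cheap per-pair check.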

Feature enabled by other features/reference counting
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The SVR feature is enabled by other features, such as NAT, when those
features are themselves enabled. To make this work, SVR implements a
reference counted API for enabling/disabling it.
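
The idea behind a reference counted enable/disable call can be sketched
as follows. This is a minimal illustration, not the VPP API:
``rc_svr_enable_disable()``, the per-interface arrays and
``RC_MAX_IF`` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RC_MAX_IF 16 /* hypothetical number of interfaces */

static uint32_t rc_refcnt[RC_MAX_IF];
static int rc_feature_on[RC_MAX_IF];

/* Enable or disable SVR on an interface on behalf of one consumer.
 * The feature is switched on by the first enable and switched off by
 * the last matching disable. Returns 0 on success, -1 on a bad
 * interface index or an unbalanced disable. */
static int
rc_svr_enable_disable (uint32_t sw_if_index, int is_enable)
{
  if (sw_if_index >= RC_MAX_IF)
    return -1;

  if (is_enable)
    {
      if (rc_refcnt[sw_if_index]++ == 0)
	rc_feature_on[sw_if_index] = 1;
      return 0;
    }

  if (rc_refcnt[sw_if_index] == 0)
    return -1;
  if (--rc_refcnt[sw_if_index] == 0)
    rc_feature_on[sw_if_index] = 0;
  return 0;
}
```

With this scheme, two features that both need SVR on the same interface
can enable and disable it independently, and SVR stays on until the
last of them releases it.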