Rework how data is queued for the TCP connections:
* net_context no longer allocates net_pkt for TCP connections. This
was not only inefficient (net_context has no knowledge of the TX
window size), but also error-prone in certain configuration (for
example when IP fragmentation was enabled, net_context may attempt
to allocate enormous packet, instead of let the data be fragmented
for the TCP stream.
* Instead, implement already defined `net_tcp_queue()` API, which
takes raw buffer and length. This allows to take TX window into
account and also better manage the allocated net_buf's (like for
example avoid allocation if there's still room in the buffer). In
result, the TCP stack will not only no longer exceed the TX window,
but also prevent empty gaps in allocated net_buf's, which should
lead to less out-of-mem issues with the stack.
* As net_pkt-based `net_tcp_queue_data()` is no longer in use, it was
removed.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Modify internal L4 protocols APIs, to allow to enforce checksum
calculation, regardless of the checksum HW offloading capability.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
There is a pending TODO regarding unifying net_tcp_unref() and
net_tcp_put(). Given that net_tcp_unref() is no longer used by the upper
layer (it should only use get/put APIs), the function can be removed
from external TCP API, as referencing/dereferencing is now only used
internally by the TCP stack.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
TCP2 is no longer needed as it is the unique implementation since the
legacy one has been removed.
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Similar to UDP, some drivers can make use of the following functions:
net_tcp_get_hdr()
net_tcp_set_hdr()
Let's expose them as <net/tcp.h> and change all internal references
to "tcp_internal.h".
Signed-off-by: Michael Scott <michael@opensourcefoundries.com>
Move core TCP functionality from net_context.c to tcp.c. Create empty
functions that the compiler can remove if TCP is not configured. As a
result remove TCP ifdefs from net_context.
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
Queue a TCP FIN packet when needed if the socket was connected or
listening and where FIN wasn't already received.
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
With CONFIG_NET_TCP is not set, provide empty static inline
prototypes for all TCP functions available to other parts of
the IP stack.
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
If context is bound to IPv6 unspecified addresss and some port
number, then unspecified address is passed in TCP reset packet
message preparation. Eventually packet dropped at the peer.
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@linux.intel.com>
This field is set and maintained, but not actually used for anything.
The only purpose for it would be to validate ACK numbers from peer,
but such a validation is now implemented by using send_seq field
directly.
Fixes: #4653
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
Per RFC 793:
A new acknowledgment (called an "acceptable ack"), is one for which
the inequality below holds:
SND.UNA < SEG.ACK =< SND.NXT
If acknowledgement is received for sequence number which wasn't yet
sent, log an error and ignore it.
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
Right now in FIN_WAIT1 state, if we receive FIN+ACK message, then
tcp state changed to FIN_WAIT2 on ACK flag and immediately on FIN
flag state changed to TIME_WAIT. Then final ACK is prepared and sent
(in queue at-least) to peer. Again immediately state changed to
TCP_CLOSED, where context is freed. net_context_put frees context
and releases tcp connection. Final ACK packet which is in queue
is dropped.
As a side effect of freed ACK packet, peer device keep on sending
FIN+ACK messages (that's why we see a lot of "TCP spurious
retransimission" messages in wireshark). As a result
of context free (respective connection handler also removed), we see
lot of packets dropped at connection input handler and replying with
ICMP error messages (destination unreachable).
To fix this issue, timewait timer support is required. When tcp
connection state changed to TIMEWAIT state, it should wait until
TIMEWAIT_TIMETOUT before changing state to TCP_CLOSED. It's
appropriate to close the tcp connection after timewait timer expiry.
Note: Right now timeout value is constant (250ms). But it should
be 2 * MSL (Maximum segment lifetime).
Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@linux.intel.com>
Add a generic function for TCP option parsing. So far we're
interested only in MSS option value, so that's what it handles.
Use it to parse MSS value in net_context incoming SYN packet
handler.
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
Calculates full TCP header length (with options). Macro introduced
for reuse, to avoid "magic formula". (E.g., it would be needed to
parse TCP options).
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
MSS is Maximum Segment Size (data payload) of TCP. In SYN packets,
each side of the connection shares an MSS it wants to use (receive)
via the corresponding TCP option. If the option is not available,
the RFC mandates use of the value 536.
This patch handles storage of the send MSS (in the TCP structure,
in TCP backlog), with follow up patch handling actual parsing it
from the SYN TCP options.
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
The expire function can call net_context_unref() which tries to
get a semaphore with K_FOREVER. This is not allowed in interrupt
context. To overcome this, run the expire functionality from
system work queue instead.
Fixes#4683
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
The net_tcp_get/set_hdr() and net_udp_get/set_hdr() documentation
was not clear in corresponding header file. Clarify how the return
value of the function is supposed to be used.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
This reverts commit 817245c564.
In certain cases the peer seems to discard the FIN packet we are
sending, which means that the TCP stream is not closed properly.
This needs more work so revert this for time being.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
If network context is closed, send FIN by placing it to the end
of send queue instead of sending it immediately. This way all
pending data is sent before the connection is closed.
Jira: ZEP-1853
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
k_delayed_work_cancel now only fail if it hasn't been submitted which
means it is not in use anyway so it safe to reset its data regardless
of its return.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
It's never a good idea to redefine functions as macros if intended
to be unused in some configuration
- "statement with no effect" warnings
- "unused argument" warnings
- No type checking done if the macros are used
These have been redefined as empty inline functions.
Fixes compiler warnings with XCC.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The second 'const' is misguided, indicating that the returns pointer
value itself cannot be changed, but since pointers are passed by value
anyway this is not useful and was generating warnings with XCC.
The leading 'const' indicates that the memory pointed to is constant,
which is all we needed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Inline functions declared in header files need to be declared
static. Fixes a compiler warning with XCC compiler.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This fixes the existing situation that "if application buffers data,
it's the problem of application". It's actually the problem of the
stack, as it doesn't allow application to control receive window,
and without this control, any buffer will overflow, peer packets
will be dropped, peer won't receive acks for them, and will employ
exponential backoff, the connection will crawl to a halt.
This patch adds net_context_tcp_recved() function which an
application must explicitly call when it *processes* data, to
advance receive window.
Jira: ZEP-1999
Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
If either UDP or TCP is enabled but not both, then connectivity
fails. This was a side effect of commit 3604c391e ("net: udp:
Remove NET_UDP_HDR() macro and direct access to net_buf")
Jira: ZEP-2380
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
The commit 210c30805b ("net: context: Close connection fast
if TIME_WAIT support is off") was not a proper way of closing
the connection. So if Zephyr closes the connection (active close),
then send FIN and install a timer that makes sure that if the peer
FIN + ACK is lost, we close the connection properly after a timeout.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Remove NET_TCP_HDR() macro as we cannot safely access TCP header
via it if the network packet header spans over multiple net_buf
fragments.
Fixed also the TCP unit tests so that they pass correctly.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
We must check if we receive RST in any of the TCP states.
If we do not do this, then the net_context might leak as it
would never be released in some of the states. Receiving RST
in any TCP state is not described in TCP state diagram but is
described in RFC 793 which says in chapter "Reset Processing"
that system "...aborts the connection and advises the user and
goes to the CLOSED state."
We need to also validate the received RST and accept only those
TCP reset packets that contain valid sequence number.
The validate_state_transitions() function is also changed to
accept CLOSED state transition from various other states.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Convert code to use u{8,16,32,64}_t and s{8,16,32,64}_t instead of C99
integer types.
Jira: ZEP-2051
Change-Id: I4ec03eb2183d59ef86ea2c20d956e5d272656837
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
We must not let ACK timer to run if we have already cancelled it.
So keep track that the timer is cancelled and refuse to run it
if it was indeed cancelled. The reason why the timer might be run
in this case is because the timer might be scheduled to be triggered
after which one cannot cancel it.
Change-Id: I1c8b8cee72bc7a644e02db154d9d009b8d98ade2
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Most of these macros are not exactly exposing a buffer, but a specific
header pointer (ipv6, ivp4, ethernet and so on), so it relevant to
rename them accordingly.
Change-Id: I66e32f7c3f2bc75994befb28d823e24299a53f5c
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
- net_pkt becomes a stand-alone structure with network packet meta
information.
- network packet data is still managed through net_buf, mostly named
'frag'.
- net_pkt memory management is done through k_mem_slab
- function got introduced or relevantly renamed to target eithe net_pkt
or net_buf fragments.
- net_buf's sent_list ends up in net_pkt now, and thus helps to save
memory when TCP is enabled.
Change-Id: Ibd5c17df4f75891dec79db723a4c9fc704eb843d
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
There have been long lasting confusion between net_buf and net_nbuf.
While the first is actually a buffer, the second one is not. It's a
network buffer descriptor. More precisely it provides meta data about a
network packet, and holds the chain of buffer fragments made of net_buf.
Thus renaming net_nbuf to net_pkt and all names around it as well
(function, Kconfig option, ..).
Though net_pkt if the new name, it still inherit its logic from net_buf.
'
This patch is the first of a serie that will separate completely net_pkt
from net_buf.
Change-Id: Iecb32d2a0d8f4647692e5328e54b5c35454194cd
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
This is a start to move away from the C99 {u}int{8,16,32,64}_t types to
Zephyr defined u{8,16,32,64}_t and s{8,16,32,64}_t. This allows Zephyr
to define the sized types in a consistent manor across all the
architectures we support and not conflict with what various compilers
and libc might do with regards to the C99 types.
We introduce <zephyr/types.h> as part of this and have it include
<stdint.h> for now until we transition all the code away from the C99
types.
We go with u{8,16,32,64}_t and s{8,16,32,64}_t as there are some
existing variables defined u8 & u16 as well as to be consistent with
Zephyr naming conventions.
Jira: ZEP-2051
Change-Id: I451fed0623b029d65866622e478225dfab2c0ca8
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
There are no users of net_tcp_set_state() left outside of
subsys/net/ip/tcp.c.
And the naming of this function is confusing -- it could easily
be mistaken for net_tcp_change_state() which contains additional
logic for certain tcp states.
Let's remove it entirely and fix the remaining uses to set
tcp->state directly.
Change-Id: I92855ad180e8682780fcff11e50af06adcbc177c
Signed-off-by: Michael Scott <michael.scott@linaro.org>
The connection close paths were a little tangle. The use of separate
callbacks for "active" and "passive" close obscured the fundamentally
symmetric operation of those modes and made it hard to check sequence
numbers for validation (they didn't). Similarly the use of the
official TCP states missed some details we need, like the distinction
between having "queued" a FIN packet for transmission and the state
reached when it's actually transmitted.
Remove the state-specific callbacks (which actually had very little to
do) and just rely on the existing packet queuing and generic sequence
number handling in tcp_established(). A few new state bits in the
net_tcp struct help us track current state in a way that doesn't fall
over the asymmetry of the TCP state diagram. We can also junk the
FIN-specific timer and just use the same retransmit timer we do for
data packets (though long term we should investigate choosing
different timeouts by state).
Change-Id: I09951b848c63fefefce33962ee6cff6a09b4ca50
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
If the parameter "timeout" is set in net_context_connect(), the
assumption by the user is that the function would wait for SYNACK
to be received before returning to the caller.
Currently this is not the case. The timeout parameter is handed
off to net_l2_offload_ip_connect() if CONFIG_NET_L2_OFFLOAD_IP is
defined but never handled in a normal call.
To implement the timeout, let's use a semaphore to wait for
tcp_synack_received() to get a SYNACK before returning from
net_context_connect().
Change-Id: I7565550ed5545e6410b2d99c429367c1fb539970
Signed-off-by: Michael Scott <michael.scott@linaro.org>
net_context is used for more than just TCP contexts. However,
the accept_cb field is only used for TCP. Let's move it from
the generic net_context structure to the TCP specific net_tcp
structure.
Change-Id: If923c7aba1355cf5f91c07a7e7e469d385c7c365
Signed-off-by: Michael Scott <michael.scott@linaro.org>
No more than 4 bits are necessary to store the state of a TCP connection,
so better pack it using bitfields so that it uses only 4 bits instead of
32, by sharing space with `retry_timeout_shift` and `flags` fields.
There are 12 (or 14, if you count the 2 unused bits in the `flags`
field) bits remaining in the same dword, but I don't know what to to
stuff there yet.
This also changes all direct field access for the `state` field to
function calls. These functions are provided as `static inline`
functions and they perform only casts, so there's no function call
overhead.
Change-Id: I0197462caa0b71b287c0773ec5cd2dd4101a4766
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
This frees up some more memory as well, by computing the maximum segment
size whenever needed. A flag is set in the TCP context to signal if
the value has been already computed.
Change-Id: Idb228d4682540f92b269e3878fcee45cbc28038a
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
This value is never set (always zero), so it's safe to remove it from
the net_tcp struct.
Change-Id: Ie4c1d90204a9834f2223b09828af42ee101bd045
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
Rename the variable to `retry_timeout_shift`, and shift-right the value
each time there's a timeout. This saves some memory in that structure
by using the holes left due to alignment.
Change-Id: I18f45d00ecc434a588758a8d331921db902f4419
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>