Proposed text for RTT computation and ACK_MP scheduling #217
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -889,31 +889,36 @@ Terrestrial | 100ms | 350ms | |
Satellite | 350ms | 600ms | ||
{: #fig-example-ack-delay title="Example of ACK delays using multiple paths"} | ||
|
||
Using the default algorithm specified in {{QUIC-RECOVERY}} would result | ||
in suboptimal performance, computing average RTT and standard | ||
deviation from series of different delay measurements of different | ||
combined paths. At the same time, early tests showed that it is | ||
desirable to send ACKs through the shortest path because a shorter | ||
ACK delay results in a tighter control loop and better performances. | ||
The tests also showed that it is desirable to send copies of the ACKs | ||
on multiple paths, for robustness if a path experiences sudden losses. | ||
|
||
An early implementation mitigated the delay variation issue by using | ||
time stamps, as specified in {{QUIC-Timestamp}}. When the timestamps | ||
are present, the implementation can estimate the transmission delay | ||
on each one-way path, and can then use these one way delays for more | ||
efficient implementations of recovery and congestion control | ||
algorithms. | ||
|
||
If timestamps are not available, implementations could estimate one | ||
way delays using statistical techniques. For example, in the example | ||
shown in Table 1, implementations can use "same path" | ||
measurements to estimate the one way delay of the terrestrial path to | ||
about 50ms in each direction, and that of the satellite path to about | ||
300ms. Further measurements can then be used to maintain estimates | ||
of one way delay variations, using logical similar to Kalman filters. | ||
But statistical processing is error-prone, and using time stamps | ||
provides more robust measurements. | ||
The ACK_MP frames describe packets that were sent on the specified path, | ||
but they may be received through any available path. There is an | ||
understandable concern that if successive acknowledgements are received | ||
on different paths, the measured RTT samples will fluctuate widely, | ||
and that might result in poor performance. In fact, this concern is | ||
probably not justified. | ||
|
||
The computed values reflect both the state of the network path and the | ||
scheduling decisions by the sender of the ACK_MP frames. In the example | ||
above, we may assume that the ACK_MP will be sent over the terrestrial | ||
link, because that provides the best response time. In that case, the | ||
computed RTT value for the satellite path will be about 350ms. This is | ||
lower than the 600ms that would be measured if the ACK_MP came over | ||
the satellite channel, but it is still the right value for computing | ||
for example the PTO timeout: if an ACK_MP is not received after more | ||
than 350ms, either the data packet or its ACK_MP were probably lost. | ||
|
||
The simplest implementation is to compute smoothedRTT and RTTvar per | ||
{{Section 5.3 of QUIC-RECOVERY}} regardless of the path through which ACK_MP frames are | ||
received. This algorithm will provide good results, | ||
except if the set of paths changes and the ACK_MP sender | ||
revisits its sending preferences. This is not very | ||
different from what happens on a single path if the routing changes. | ||
The RTT, RTT variance and PTO estimates will rapidly converge to | ||
reflect the new conditions. | ||
There is however an exception: some congestion | ||
control functions rely on estimates of the minimum RTT. It might be prudent | ||
for nodes to remember the path over which the ACK_MP that produced | ||
the minimum RTT was received, and to restart the minimum RTT computation | ||
if that path is abandoned. | ||
|
||
## Packet Scheduling | ||
|
||
|
@@ -931,9 +936,14 @@ implementation. | |
|
||
Note that this implies that an endpoint may send and receive ACK_MP | ||
frames on a path different from the one that carried the acknowledged | ||
packets. A reasonable default consists in sending ACK_MP frames on the | ||
path they acknowledge packets, but the receiver must not assume its | ||
peer will do so. | ||
packets. As noted in {{compute-rtt}} the values computed using | ||
the standard algorithm reflect both the characteristics of the | ||
path and the scheduling algorithm of ACK_MP frames. The estimates will converge | ||
faster if the scheduling strategy is stable, but besides that | ||
Not sure if convergence is the right term here. I think it might actually not work if you keep using multiple paths for ACKs.
I think "convergence" generally means that the algorithm has produced results that reflect the current network state, and that the smoothed RTT and RTTvar values will reflect that.
Just pushed a fix in the latest commit. |
||
implementations can choose between multiple strategies such as sending | ||
ACK_MP frames on the path whose packets they acknowledge, or sending | ||
ACK_MP frames on the shortest path, which results in shorter control loops | ||
and thus better performance. | ||
|
||
## Retransmissions | ||
|
||
|
Isn't the PTO 3xRTT?
However, I'm not sure about this. In this example you assume that the terrestrial path will be used in a stable way for ACKs. In the paragraph above you discuss what happens if different paths are used, which is a different case. So if I send half my ACKs over each path, I will see an average RTT of 475ms, which is too low for those ACKs that are sent over the satellite path.
Commenting on this PR after the merge: in the example that you cite, the smoothed RTT will actually be very close to the RTT via the shortest path + maybe 1 ACK delay. ACK_MP frames are redundant. If you send ACKs alternately on the short and long paths, the ACK on the short path will arrive before the one on the long path. When the ACK on the long path arrives, its "highest" packet will already have been acked as part of a range carried on the short path, so it will not contribute to RTT computation.
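To make the argument above concrete, here is a minimal Python sketch; it is not taken from any implementation. It assumes the one-way delays implied by the example table (about 50ms terrestrial, 300ms satellite), data sent on the satellite path, a receiver that acknowledges every packet immediately with a cumulative range while alternating the return path, and a sender that takes an RTT sample only when an ACK newly acknowledges a larger packet number, in the spirit of RFC 9002 Section 5.1. The name `rtt_samples` and its parameters are hypothetical.

```python
# Assumed one-way delays, derived from the example table (100ms / 600ms same-path RTTs).
TERRESTRIAL_ONE_WAY_MS = 50
SATELLITE_ONE_WAY_MS = 300


def rtt_samples(num_packets, send_interval_ms):
    """Return the RTT samples (ms) a sender would take for data sent on the
    satellite path when ACKs alternate between the two return paths."""
    send_time = [pn * send_interval_ms for pn in range(num_packets)]

    # Each packet triggers one cumulative ACK covering packets 0..pn.
    # Even packet numbers are acknowledged over the terrestrial path,
    # odd ones over the satellite path.
    arrivals = []
    for pn, sent in enumerate(send_time):
        received = sent + SATELLITE_ONE_WAY_MS
        if pn % 2 == 0:
            return_delay = TERRESTRIAL_ONE_WAY_MS
        else:
            return_delay = SATELLITE_ONE_WAY_MS
        arrivals.append((received + return_delay, pn))

    # The sender processes ACKs in the order they arrive.
    arrivals.sort()

    largest_acked = -1
    samples = []
    for arrival, largest in arrivals:
        # Only an ACK that newly acknowledges a larger packet number
        # produces an RTT sample; overtaken ACKs are ignored.
        if largest > largest_acked:
            largest_acked = largest
            samples.append(arrival - send_time[largest])
    return samples
```

With one packet every 20ms, `rtt_samples(41, 20)` contains only 350ms samples, because every ACK on the satellite return path is overtaken by a later ACK on the terrestrial path. With one packet every 400ms, `rtt_samples(4, 400)` gives `[350, 600, 350, 600]`, so in this model sparse ACKs do feed both values into the estimator.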
Only if you send ACKs multiple times per RTT and the latency difference between the paths is large.
So maybe we should rather not make any default recommendations but just explain the problems and non-problems in different alternatives...?
(btw. which merge are you talking about? This PR is not merged yet)
Merge: I was misled by the PR search feature. Sorry.
General idea:
1- loss detection will be mostly triggered by the ACKs sent on the shortest return path, because with RACK most loss detection is triggered by packet number comparisons, not by timers. That means losses are detected faster if enough ACKs travel on the shortest return path, which tends to improve performance quite a bit.
2- ACKing packets faster leads to lower memory utilisation, as packets get out of the retransmit queue faster.
3- Still, we need timer measurements to compute the PTO and deal with the loss of the "last packet". Just feeding all RTT samples into the algorithms works, because the combination of `SmoothedRTT` and `RTTvar` will capture both the characteristics of the paths and whatever algorithm is implemented by the peer. The PTO formula incorporates both average and variance, and thus the PTO ends up making sense.
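As a rough illustration of point 3, the sketch below feeds mixed samples into an estimator modelled on RFC 9002 Section 5.3 and computes the PTO as `smoothed_rtt + max(4 * rttvar, kGranularity) + max_ack_delay`. It is a simplified sketch under stated assumptions: the ACK delay adjustment is omitted, `rttvar` is updated before `smoothed_rtt` (the RFC 6298 order), and the class name `RttEstimator` and the 1ms granularity are choices made for this example.

```python
K_GRANULARITY_MS = 1  # assumed timer granularity for this example


class RttEstimator:
    """Simplified smoothed RTT / RTT variance estimator (times in ms)."""

    def __init__(self):
        self.smoothed_rtt = None
        self.rttvar = None
        self.min_rtt = None

    def update(self, latest_rtt):
        # First sample: initialise as in RFC 9002 Section 5.3.
        if self.smoothed_rtt is None:
            self.min_rtt = latest_rtt
            self.smoothed_rtt = latest_rtt
            self.rttvar = latest_rtt / 2
            return
        # Later samples: exponentially weighted averages.
        # The ACK delay adjustment is deliberately left out of this sketch.
        self.min_rtt = min(self.min_rtt, latest_rtt)
        rttvar_sample = abs(self.smoothed_rtt - latest_rtt)
        self.rttvar = 0.75 * self.rttvar + 0.25 * rttvar_sample
        self.smoothed_rtt = 0.875 * self.smoothed_rtt + 0.125 * latest_rtt

    def pto(self, max_ack_delay=0):
        # PTO combines the average and the variance of the samples.
        return (self.smoothed_rtt
                + max(4 * self.rttvar, K_GRANULARITY_MS)
                + max_ack_delay)


def pto_after(samples, max_ack_delay=0):
    """Feed all samples, regardless of return path, and return the PTO."""
    estimator = RttEstimator()
    for sample in samples:
        estimator.update(sample)
    return estimator.pto(max_ack_delay)
```

After a single 350ms sample the PTO is 3 x 350 = 1050ms (with `max_ack_delay` set to 0), which is where the "3xRTT" figure comes from. If half the samples are 350ms and half are 600ms, the smoothed RTT settles near 475ms but `rttvar` grows to roughly 130ms, so in this sketch the resulting PTO stays well above the 600ms satellite round trip.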
From RFC 9002: