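QUIC, originally short for Quick UDP Internet Connections, is a low-latency Internet transport protocol built on UDP and first developed at Google. In 2018 the IETF renamed HTTP-over-QUIC to HTTP/3, the third major version of HTTP. This article covers QUIC's advantages, features, and principles.
QUIC was therefore proposed to solve these problems.
QUIC essentially reimplements the features of HTTP/2 on top of UDP. On that base it integrates the TLS handshake, provides the feedback needed for reliable delivery and congestion control, and supports connection migration, so a connection can continue after the network topology or address mapping changes.
Like TLS 1.3, QUIC uses Diffie-Hellman key exchange, which needs only one round trip to agree on keys. With QUIC, only the first connection to a server spends 1 RTT on key exchange; later connections can resume with 0 RTT. This minimizes the cost of handshake latency, and the gain is most noticeable on high-latency mobile networks.
A TCP connection is identified by the four-tuple (SrcIp, SrcPort, DstIp, DstPort). A QUIC connection is identified by a connection ID instead: in Google's original QUIC this was a 64-bit random number generated by the client, while in IETF QUIC (RFC 9000) each endpoint chooses connection IDs of up to 20 bytes. When the network or port changes or is interrupted, the connection survives as long as the connection ID remains valid.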
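To make the contrast with the four-tuple concrete, here is a minimal sketch of a connection table keyed by connection ID rather than by address. Every name in it (`conn_entry`, `conn_lookup`, `on_packet`, the fixed 8-byte ID) is hypothetical and chosen for illustration; it is not how MsQuic stores connections.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CID_LEN 8  /* assumption: fixed 8-byte IDs; RFC 9000 allows 0 to 20 bytes */

typedef struct conn_entry {
    uint8_t  cid[CID_LEN];  /* connection ID chosen by this endpoint */
    uint32_t peer_ip;       /* last seen peer address (IPv4 only, for brevity) */
    uint16_t peer_port;
    int      in_use;
} conn_entry;

/* Find a connection by the destination connection ID carried in a packet. */
static conn_entry* conn_lookup(conn_entry* table, size_t n, const uint8_t* cid)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].in_use && memcmp(table[i].cid, cid, CID_LEN) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Handle a packet whose source address may differ from the stored one, for
 * example after a switch from Wi-Fi to cellular. The lookup ignores the
 * address, so the connection is still found.
 * Returns 1 if the packet matched a connection, 0 otherwise. */
static int on_packet(conn_entry* table, size_t n, const uint8_t* cid,
                     uint32_t src_ip, uint16_t src_port)
{
    conn_entry* c = conn_lookup(table, n, cid);
    if (c == NULL) {
        return 0;
    }
    /* A real stack validates the new path first (RFC 9000, Section 9). */
    c->peer_ip = src_ip;
    c->peer_port = src_port;
    return 1;
}
```

A TCP stack doing the same lookup by four-tuple would miss the entry as soon as the source address or port changed.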
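QUIC runs over UDP, which does not enforce packet ordering, and QUIC tracks ordering and loss recovery per stream rather than per connection. Transport-level head-of-line blocking across streams is therefore avoided: if a packet carrying one resource is lost, only that resource waits for the retransmission while the others keep flowing.
QUIC is implemented at the application layer, so updates are comparatively lightweight and carry no burden of hardware upgrades.
I chose MsQuic, Microsoft's QUIC implementation, which is written in C. The build followed the official BUILD.md. I built on Windows 11; the steps are recorded below:
Download the source. It is best to clone it from GitHub with git, because msquic also depends on other libraries, which the --recursive option fetches along with it.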
git clone --recursive https://github.com/microsoft/msquic
Open PowerShell (run as administrator) in the source root and run the following command:
./scripts/prepare-machine.ps1 -Configuration Dev
This command makes sure all dependencies are installed.
Run the build command:
./scripts/build.ps1
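After a successful build, open the Visual Studio solution file msquic.sln in msquic\msquic\bld\windows\x64_schannel (the directory used with the default build options). It lists every project that msquic contains. The build is now complete.
This sample is simple: it implements a basic server and client. The client tries to connect to the server, opens a bidirectional stream, sends some data, and then closes. The server accepts every connection and receives streams and data. Once the stream is closed, the server sends its own data and then shuts down its send direction. The connection closes when the idle timer fires.
After the build, the quicsample executable can be found under msquic\artifacts\bin\windows\x64_Debug_schannel.
First, generate a certificate with the tooling built into Windows:
Then run the following command in PowerShell to start the server, passing the thumbprint just generated as cert_hash:
Run the following command to start the client. -target specifies the server address and port (4567 is the default port in the source), and -unsecure disables certificate validation.
Server output:
The traffic can be captured with Wireshark: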
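Let's take a brief look at the sample's source to get familiar with the MsQuic API.
While reading the source, note that MsQuic uses SAL annotations, which can feel obscure on first contact.
Microsoft's documentation introduces SAL as follows:
The Microsoft source-code annotation language (SAL) provides a set of annotations that you can use to describe how a function uses its parameters, the assumptions that it makes about them, and the guarantees that it makes when it finishes. The annotations are defined in the header file <sal.h>. Visual Studio code analysis for C++ uses SAL annotations to modify its analysis of functions. For more information about SAL 2.0 for Windows driver development, see SAL 2.0 Annotations for Windows Drivers.
Natively, C and C++ provide only limited ways for developers to consistently express intent and invariance. By using SAL annotations, you can describe your functions in greater detail so that developers who are consuming them can better understand how to use them.
A related example is __cdecl, which is a calling-convention keyword rather than a SAL annotation.
It denotes the default C calling convention: arguments are pushed onto the stack from right to left, and the caller cleans up the stack afterwards. See Microsoft's documentation for details.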
BOOLEAN
GetFlag(
_In_ int argc,
_In_reads_(argc) _Null_terminated_ char* argv[],
_In_z_ const char* name
)
{
const size_t nameLen = strlen(name);
for (int i = 0; i < argc; i++) {
if (_strnicmp(argv[i] + 1, name, nameLen) == 0
&& strlen(argv[i]) == nameLen + 1) {
return TRUE;
}
}
return FALSE;
}
GetFlag checks whether the command-line arguments include the flag name, written with a one-character prefix such as -name.
_Ret_maybenull_ _Null_terminated_ const char*
GetValue(
_In_ int argc,
_In_reads_(argc) _Null_terminated_ char* argv[],
_In_z_ const char* name
)
{
const size_t nameLen = strlen(name);
for (int i = 0; i < argc; i++) {
if (_strnicmp(argv[i] + 1, name, nameLen) == 0
&& strlen(argv[i]) > 1 + nameLen + 1
&& *(argv[i] + 1 + nameLen) == ':') {
return argv[i] + 1 + nameLen + 1;
}
}
return NULL;
}
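GetValue reads the value of the command-line option name, given in the form -name:value. It returns NULL if the option is absent.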
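The `-name:value` convention can be tried outside Windows with a portable sketch. The version below replaces the MSVC-specific `_strnicmp` with a small helper; `ieq_n` and `arg_value` are hypothetical names, and this is an illustration of the idea, not the sample's code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n characters, a portable stand-in
 * for _strnicmp. Returns 1 on a match. Never reads past a terminator. */
static int ieq_n(const char* a, const char* b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i])) {
            return 0;
        }
        if (a[i] == '\0') {
            return 1;  /* both strings ended here and matched so far */
        }
    }
    return 1;
}

/* Return the text after "<prefix>name:" in argv, or NULL if the option is
 * absent or has an empty value. The prefix is any single character. */
static const char* arg_value(int argc, char* argv[], const char* name)
{
    const size_t len = strlen(name);
    for (int i = 0; i < argc; i++) {
        const char* a = argv[i];
        if (a[0] != '\0' &&
            ieq_n(a + 1, name, len) &&
            a[1 + len] == ':' &&
            a[2 + len] != '\0') {
            return a + 2 + len;
        }
    }
    return NULL;
}
```

Called with the arguments `-client -Target:127.0.0.1`, `arg_value(argc, argv, "target")` yields the address string, while looking up `client` yields NULL because that argument is a bare flag with no value.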
// Helper function to convert a hex character to its decimal value.
//
uint8_t
DecodeHexChar(
_In_ char c
)
{
if (c >= '0' && c <= '9') return c - '0';
if (c >= 'A' && c <= 'F') return 10 + c - 'A';
if (c >= 'a' && c <= 'f') return 10 + c - 'a';
return 0;
}
//
// Helper function to convert a string of hex characters to a byte buffer.
//
uint32_t
DecodeHexBuffer(
_In_z_ const char* HexBuffer,
_In_ uint32_t OutBufferLen,
_Out_writes_to_(OutBufferLen, return)
uint8_t* OutBuffer
)
{
uint32_t HexBufferLen = (uint32_t)strlen(HexBuffer) / 2;
if (HexBufferLen > OutBufferLen) {
return 0;
}
for (uint32_t i = 0; i < HexBufferLen; i++) {
OutBuffer[i] =
(DecodeHexChar(HexBuffer[i * 2]) << 4) |
DecodeHexChar(HexBuffer[i * 2 + 1]);
}
return HexBufferLen;
}
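These two helpers convert hexadecimal text into bytes: DecodeHexChar maps one hex digit to its numeric value, and DecodeHexBuffer decodes a whole hex string into a byte buffer. Their likely use in the sample is turning the certificate thumbprint passed as cert_hash into binary form.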
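As a rough illustration of hex decoding, here is a stricter, SAL-free variant. Unlike the sample's helpers, it rejects invalid digits and odd-length input instead of silently treating them as zero or ignoring them; `hex_value` and `hex_decode` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map one hex digit to 0..15, or return -1 for any other character. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return 10 + (c - 'A');
    if (c >= 'a' && c <= 'f') return 10 + (c - 'a');
    return -1;
}

/* Decode a hex string into out. Returns the number of bytes written, or 0 on
 * any error: odd length, invalid digit, or an output buffer that is too small. */
static size_t hex_decode(const char* hex, uint8_t* out, size_t out_len)
{
    const size_t n = strlen(hex);
    if (n % 2 != 0 || n / 2 > out_len) {
        return 0;
    }
    for (size_t i = 0; i < n / 2; i++) {
        const int hi = hex_value(hex[2 * i]);      /* high nibble */
        const int lo = hex_value(hex[2 * i + 1]);  /* low nibble */
        if (hi < 0 || lo < 0) {
            return 0;
        }
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return n / 2;
}
```

A 40-character SHA-1 thumbprint decodes to 20 bytes this way, which is why the caller needs to size the output buffer at half the string length.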
//
// Allocates and sends some data over a QUIC stream.
//
void
ServerSend(
_In_ HQUIC Stream
)
{
//
// Allocates and builds the buffer to send over the stream.
//
void* SendBufferRaw = malloc(sizeof(QUIC_BUFFER) + SendBufferLength);
if (SendBufferRaw == NULL) {
printf("SendBuffer allocation failed!\n");
MsQuic->StreamShutdown(Stream, QUIC_STREAM_SHUTDOWN_FLAG_ABORT, 0);
return;
}
QUIC_BUFFER* SendBuffer = (QUIC_BUFFER*)SendBufferRaw;
SendBuffer->Buffer = (uint8_t*)SendBufferRaw + sizeof(QUIC_BUFFER);
SendBuffer->Length = SendBufferLength;
printf("[strm][%p] Sending data...\n", Stream);
//
// Sends the buffer over the stream. Note the FIN flag is passed along with
// the buffer. This indicates this is the last buffer on the stream and the
// the stream is shut down (in the send direction) immediately after.
//
QUIC_STATUS Status;
if (QUIC_FAILED(Status = MsQuic->StreamSend(Stream, SendBuffer, 1, QUIC_SEND_FLAG_FIN, SendBuffer))) {
printf("StreamSend failed, 0x%x!\n", Status);
free(SendBufferRaw);
MsQuic->StreamShutdown(Stream, QUIC_STREAM_SHUTDOWN_FLAG_ABORT, 0);
}
}
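This shows how to send data once a stream exists. First a single block is allocated whose size is the payload length plus sizeof(QUIC_BUFFER). The QUIC_BUFFER header at the start of the block is then filled in, and the buffer is sent with StreamSend.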
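The header-plus-payload allocation can be isolated in a short sketch. `my_buffer` is a hypothetical stand-in for QUIC_BUFFER (a length plus a data pointer), and `my_buffer_create` is an illustrative helper, not an MsQuic function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for QUIC_BUFFER: a length plus a pointer to the bytes. */
typedef struct my_buffer {
    uint32_t length;
    uint8_t* data;
} my_buffer;

/* Allocate the header and the payload in one block, so that a single free()
 * on the returned pointer releases both. Returns NULL on allocation failure. */
static my_buffer* my_buffer_create(const void* payload, uint32_t length)
{
    void* raw = malloc(sizeof(my_buffer) + length);
    if (raw == NULL) {
        return NULL;
    }
    my_buffer* b = (my_buffer*)raw;
    b->data = (uint8_t*)raw + sizeof(my_buffer);  /* payload follows the header */
    b->length = length;
    if (length > 0) {
        memcpy(b->data, payload, length);
    }
    return b;
}
```

One allocation means one pointer to track. The sample passes that same pointer as the last argument of StreamSend, presumably so that it can be freed once the send has completed.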
The server's callback functions are defined next:
_IRQL_requires_max_(DISPATCH_LEVEL)
_Function_class_(QUIC_STREAM_CALLBACK)
QUIC_STATUS
QUIC_API
ServerStreamCallback(
_In_ HQUIC Stream,
_In_opt_ void* Context,
_Inout_ QUIC_STREAM_EVENT* Event
)
//
// The server's callback for connection events from MsQuic.
//
_IRQL_requires_max_(DISPATCH_LEVEL)
_Function_class_(QUIC_CONNECTION_CALLBACK)
QUIC_STATUS
QUIC_API
ServerConnectionCallback(
_In_ HQUIC Connection,
_In_opt_ void* Context,
_Inout_ QUIC_CONNECTION_EVENT* Event
)
//
// The server's callback for listener events from MsQuic.
//
_IRQL_requires_max_(PASSIVE_LEVEL)
_Function_class_(QUIC_LISTENER_CALLBACK)
QUIC_STATUS
QUIC_API
ServerListenerCallback(
_In_ HQUIC Listener,
_In_opt_ void* Context,
_Inout_ QUIC_LISTENER_EVENT* Event
)
They share a similar structure: each function body mainly dispatches on the type of Event and acts accordingly.
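The shape of such a callback can be sketched without the MsQuic headers. The event names and structures below are hypothetical and only loosely modeled on a stream's life cycle; the real QUIC_STREAM_EVENT has more event types and payloads.

```c
#include <stdint.h>

/* Hypothetical event types. */
typedef enum demo_event_type {
    DEMO_EVENT_RECEIVE,
    DEMO_EVENT_PEER_SEND_SHUTDOWN,
    DEMO_EVENT_SHUTDOWN_COMPLETE
} demo_event_type;

/* A tagged union: `type` says which member of `u` is valid. */
typedef struct demo_event {
    demo_event_type type;
    union {
        struct { uint32_t length; } receive;  /* valid for DEMO_EVENT_RECEIVE */
    } u;
} demo_event;

typedef struct demo_stream_state {
    uint64_t bytes_received;
    int peer_done;
    int closed;
} demo_stream_state;

/* One callback per object; the body switches on the event type.
 * Returns 0 when the event was handled, -1 for an unknown event. */
static int demo_stream_callback(demo_stream_state* state, const demo_event* event)
{
    switch (event->type) {
    case DEMO_EVENT_RECEIVE:
        state->bytes_received += event->u.receive.length;
        break;
    case DEMO_EVENT_PEER_SEND_SHUTDOWN:
        state->peer_done = 1;  /* a server could start its reply here */
        break;
    case DEMO_EVENT_SHUTDOWN_COMPLETE:
        state->closed = 1;     /* last event: release per-stream resources */
        break;
    default:
        return -1;
    }
    return 0;
}
```

Keeping all per-object logic in one switch makes the state machine easy to read, at the cost of one function that grows with every new event type.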
void
RunServer(
_In_ int argc,
_In_reads_(argc) _Null_terminated_ char* argv[]
)
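The main job of RunServer is to register the callbacks and set up the configuration, such as the certificate and the idle timeout.
The client side follows a similar structure and is not covered again here.
The main function shows a key point of using the API: call MsQuicOpen to obtain the API table, and call MsQuicClose to close MsQuic when finished: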
//
// Open a handle to the library and get the API function table.
//
if (QUIC_FAILED(Status = MsQuicOpen(&MsQuic))) {
printf("MsQuicOpen failed, 0x%x!\n", Status);
goto Error;
}
...
if (MsQuic != NULL) {
if (Configuration != NULL) {
MsQuic->ConfigurationClose(Configuration);
}
if (Registration != NULL) {
//
// This will block until all outstanding child objects have been
// closed.
//
MsQuic->RegistrationClose(Registration);
}
MsQuicClose(MsQuic);
}
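The open, use, close discipline with a single `goto Error` exit can be sketched with a stand-in library. The functions `demo_open` and `demo_close` and the handle counter are hypothetical; they exist only so the pattern can be compiled and checked without MsQuic.

```c
#include <stddef.h>

/* Hypothetical library: handles are counted so leaks are easy to detect. */
static int g_open_handles = 0;

static int demo_open(int** handle)
{
    static int storage;
    *handle = &storage;
    g_open_handles++;
    return 0;  /* 0 means success */
}

static void demo_close(int* handle)
{
    (void)handle;
    g_open_handles--;
}

/* Acquire in order, release in reverse order on a single exit path. */
static int demo_run(int fail_midway)
{
    int status;
    int* library = NULL;
    int* registration = NULL;

    if ((status = demo_open(&library)) != 0) {
        goto Error;
    }
    if ((status = demo_open(&registration)) != 0) {
        goto Error;
    }
    if (fail_midway) {
        status = -1;  /* simulate a failure after both objects exist */
        goto Error;
    }
    status = 0;

Error:
    /* Reached on success and on failure; children close before the parent. */
    if (library != NULL) {
        if (registration != NULL) {
            demo_close(registration);
        }
        demo_close(library);
    }
    return status;
}
```

Because every handle starts as NULL, the cleanup section can test each one and release only what was actually acquired, whichever step failed.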
After the build, use secnetperf.exe to run a test.
Run the following command to start the server:
.\secnetperf.exe
Run the following command to start the client and run the test:
.\secnetperf.exe -test:tput -target:127.0.0.1 -upload:1000000000 -tcp:0
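This uploads 1000000000 bytes over QUIC.
Compare this with the result over TCP:
When the bottleneck is the CPU rather than the network, msquic performs worse than TCP.
I do not have a suitable environment for testing, and a local loopback test differs so much from a real network that it has little value, so I looked up published QUIC test results.
In this article, Fastly engineers compare QUIC and TCP. They did not use MsQuic; they used their own QUIC implementation.
Their test environment was also highly idealized: it used fairly low-end hardware so that the network would not saturate. As expected, QUIC started out much slower than TCP.
In the end, after three reasonable adjustments, QUIC caught up with TCP. The three adjustments were reducing the acknowledgment frequency, coalescing packets with Generic Segmentation Offload (GSO), and increasing the packet size.
Microsoft provides an official performance dashboard.
The test methodology follows the standardized draft-banks-quic-performance.
I will not reproduce the detailed results here. I am more interested in the comparison with TCP, and fortunately they also publish WAN perf results, where msquic does better than TCP in most cases. Only one chart is excerpted here as an example:
The QUIC implementation is too large to cover in full, so only some observations are given here.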
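Before looking at the code, here are the basic terms used in QUIC:
QUIC:
The transport protocol. QUIC is a name, not an acronym.
Endpoint:
An entity that can participate in a QUIC connection by generating, receiving, and processing QUIC packets. There are only two types of endpoints in QUIC: client and server.
Client:
The endpoint that initiates a QUIC connection.
Server:
The endpoint that accepts a QUIC connection.
QUIC packet:
A complete processable unit of QUIC that can be encapsulated in a UDP datagram. One or more QUIC packets can be encapsulated in a single UDP datagram.
Ack-eliciting packet:
A QUIC packet that contains frames other than ACK, PADDING, and CONNECTION_CLOSE. These cause a recipient to send an acknowledgment.
Frame:
A unit of structured protocol information. There are multiple frame types, each of which carries different information. Frames are contained in QUIC packets.
Address:
When used without qualification, the tuple of IP version, IP address, and UDP port number that represents one end of a network path.
Connection ID:
An identifier that is used to identify a QUIC connection at an endpoint. Each endpoint selects one or more connection IDs for its peer to include in packets sent towards the endpoint. This value is opaque to the peer.
Stream:
A unidirectional or bidirectional channel of ordered bytes within a QUIC connection. A QUIC connection can carry multiple simultaneous streams.
Application:
An entity that uses QUIC to send and receive data.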
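The definition of an ack-eliciting packet translates directly into code. The sketch below uses the frame type values from RFC 9000, Section 19; the function and constant names are hypothetical and do not come from MsQuic.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame type values from RFC 9000, Section 19. */
enum {
    FRAME_PADDING              = 0x00,
    FRAME_PING                 = 0x01,
    FRAME_ACK                  = 0x02,  /* ACK without ECN counts */
    FRAME_ACK_ECN              = 0x03,  /* ACK with ECN counts */
    FRAME_CONNECTION_CLOSE     = 0x1c,  /* transport-level close */
    FRAME_CONNECTION_CLOSE_APP = 0x1d   /* application-level close */
};

/* Every frame except ACK, PADDING, and CONNECTION_CLOSE is ack-eliciting. */
static int frame_is_ack_eliciting(uint64_t frame_type)
{
    switch (frame_type) {
    case FRAME_PADDING:
    case FRAME_ACK:
    case FRAME_ACK_ECN:
    case FRAME_CONNECTION_CLOSE:
    case FRAME_CONNECTION_CLOSE_APP:
        return 0;
    default:
        return 1;
    }
}

/* A packet is ack-eliciting if at least one of its frames is. */
static int packet_is_ack_eliciting(const uint64_t* frame_types, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (frame_is_ack_eliciting(frame_types[i])) {
            return 1;
        }
    }
    return 0;
}
```

A packet holding only ACK and PADDING frames therefore never triggers an acknowledgment of its own, which keeps two endpoints from acknowledging each other's acknowledgments forever.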
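We focus mainly on the core project, because that is where the QUIC protocol itself is implemented.
The terminology above already lays out the layers of abstraction, so let's see how msquic implements them.
Frames are implemented in frame.h and frame.c.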
//
// Different types of QUIC frames
//
typedef enum QUIC_FRAME_TYPE {
QUIC_FRAME_PADDING = 0x0ULL,
QUIC_FRAME_PING = 0x1ULL,
QUIC_FRAME_ACK = 0x2ULL, // to 0x3
QUIC_FRAME_ACK_1 = 0x3ULL,
QUIC_FRAME_RESET_STREAM = 0x4ULL,
QUIC_FRAME_STOP_SENDING = 0x5ULL,
QUIC_FRAME_CRYPTO = 0x6ULL,
QUIC_FRAME_NEW_TOKEN = 0x7ULL,
QUIC_FRAME_STREAM = 0x8ULL, // to 0xf
QUIC_FRAME_STREAM_1 = 0x9ULL,
QUIC_FRAME_STREAM_2 = 0xaULL,
QUIC_FRAME_STREAM_3 = 0xbULL,
QUIC_FRAME_STREAM_4 = 0xcULL,
QUIC_FRAME_STREAM_5 = 0xdULL,
QUIC_FRAME_STREAM_6 = 0xeULL,
QUIC_FRAME_STREAM_7 = 0xfULL,
QUIC_FRAME_MAX_DATA = 0x10ULL,
QUIC_FRAME_MAX_STREAM_DATA = 0x11ULL,
QUIC_FRAME_MAX_STREAMS = 0x12ULL, // to 0x13
QUIC_FRAME_MAX_STREAMS_1 = 0x13ULL,
QUIC_FRAME_DATA_BLOCKED = 0x14ULL,
QUIC_FRAME_STREAM_DATA_BLOCKED = 0x15ULL,
QUIC_FRAME_STREAMS_BLOCKED = 0x16ULL, // to 0x17
QUIC_FRAME_STREAMS_BLOCKED_1 = 0x17ULL,
QUIC_FRAME_NEW_CONNECTION_ID = 0x18ULL,
QUIC_FRAME_RETIRE_CONNECTION_ID = 0x19ULL,
QUIC_FRAME_PATH_CHALLENGE = 0x1aULL,
QUIC_FRAME_PATH_RESPONSE = 0x1bULL,
QUIC_FRAME_CONNECTION_CLOSE = 0x1cULL, // to 0x1d
QUIC_FRAME_CONNECTION_CLOSE_1 = 0x1dULL,
QUIC_FRAME_HANDSHAKE_DONE = 0x1eULL,
/* 0x1f to 0x2f are unused currently */
QUIC_FRAME_DATAGRAM = 0x30ULL, // to 0x31
QUIC_FRAME_DATAGRAM_1 = 0x31ULL,
/* 0x32 to 0xad are unused currently */
QUIC_FRAME_ACK_FREQUENCY = 0xafULL,
QUIC_FRAME_MAX_SUPPORTED
} QUIC_FRAME_TYPE;
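QUIC_FRAME_TYPE defines the frame types. Further down, frame.h declares each specific frame type, and frame.c provides the implementations. We take the RESET_STREAM frame as the only example: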
//
// QUIC_FRAME_RESET_STREAM Encoding/Decoding
//
typedef struct QUIC_RESET_STREAM_EX {
QUIC_VAR_INT StreamID;
QUIC_VAR_INT ErrorCode;
QUIC_VAR_INT FinalSize;
} QUIC_RESET_STREAM_EX;
_Success_(return != FALSE)
BOOLEAN
QuicResetStreamFrameEncode(
_In_ const QUIC_RESET_STREAM_EX * const Frame,
_Inout_ uint16_t* Offset,
_In_ uint16_t BufferLength,
_Out_writes_to_(BufferLength, *Offset)
uint8_t* Buffer
);
_Success_(return != FALSE)
BOOLEAN
QuicResetStreamFrameDecode(
_In_ uint16_t BufferLength,
_In_reads_bytes_(BufferLength)
const uint8_t * const Buffer,
_Inout_ uint16_t* Offset,
_Out_ QUIC_RESET_STREAM_EX* Frame
);
RFC 9000 describes the frame as follows:
RESET_STREAM frames contain the following fields:
Stream ID:
A variable-length integer encoding of the stream ID of the stream being terminated.
Application Protocol Error Code:
A variable-length integer containing the application protocol error code (see Section 20.2) that indicates why the stream is being closed.
Final Size:
A variable-length integer indicating the final size of the stream by the RESET_STREAM sender, in units of bytes; see Section 4.5.
QUIC_RESET_STREAM_EX holds exactly these fields of the RESET_STREAM frame.
QuicResetStreamFrameEncode defines how a RESET_STREAM frame is encoded. It is implemented as follows:
_Success_(return != FALSE)
BOOLEAN
QuicResetStreamFrameEncode(
_In_ const QUIC_RESET_STREAM_EX * const Frame,
_Inout_ uint16_t* Offset,
_In_ uint16_t BufferLength,
_Out_writes_to_(BufferLength, *Offset) uint8_t* Buffer
)
{
uint16_t RequiredLength =
sizeof(uint8_t) + // Type
QuicVarIntSize(Frame->ErrorCode) +
QuicVarIntSize(Frame->StreamID) +
QuicVarIntSize(Frame->FinalSize);
if (BufferLength < *Offset + RequiredLength) {
return FALSE;
}
Buffer = Buffer + *Offset;
Buffer = QuicUint8Encode(QUIC_FRAME_RESET_STREAM, Buffer);
Buffer = QuicVarIntEncode(Frame->StreamID, Buffer);
Buffer = QuicVarIntEncode(Frame->ErrorCode, Buffer);
QuicVarIntEncode(Frame->FinalSize, Buffer);
*Offset += RequiredLength;
return TRUE;
}
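It first checks whether Buffer has enough room and returns FALSE if not. Otherwise it encodes the frame type, StreamID, error code, and final size into Buffer in that order, advances *Offset, and returns TRUE.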
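The excerpt relies on QuicVarIntSize and QuicVarIntEncode without showing them. As a rough idea of what such helpers do, here is a sketch of the variable-length integer encoding from RFC 9000, Section 16, followed by a RESET_STREAM encoder built on it. The names `varint_size`, `varint_encode`, and `reset_stream_encode` are hypothetical, and this is not MsQuic's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for v (1, 2, 4, or 8), or 0 if v does not fit in 62 bits. */
static size_t varint_size(uint64_t v)
{
    if (v < (1ULL << 6))  return 1;
    if (v < (1ULL << 14)) return 2;
    if (v < (1ULL << 30)) return 4;
    if (v < (1ULL << 62)) return 8;
    return 0;
}

/* Write v big-endian with the length in the two top bits of the first byte.
 * Returns the number of bytes written (0 if v is too large). */
static size_t varint_encode(uint64_t v, uint8_t* out)
{
    const size_t n = varint_size(v);
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(v >> (8 * (n - 1 - i)));
    }
    if (n > 0) {
        /* prefix: 1 byte -> 00, 2 bytes -> 01, 4 bytes -> 10, 8 bytes -> 11 */
        out[0] |= (uint8_t)(n == 1 ? 0x00 : n == 2 ? 0x40 : n == 4 ? 0x80 : 0xC0);
    }
    return n;
}

/* Encode a RESET_STREAM frame (type 0x04) at buf + *offset.
 * Returns 1 and advances *offset on success, 0 if the frame does not fit. */
static int reset_stream_encode(uint64_t stream_id, uint64_t error_code,
                               uint64_t final_size, uint8_t* buf,
                               size_t buf_len, size_t* offset)
{
    const size_t s1 = varint_size(stream_id);
    const size_t s2 = varint_size(error_code);
    const size_t s3 = varint_size(final_size);
    if (s1 == 0 || s2 == 0 || s3 == 0) {
        return 0;
    }
    const size_t need = 1 + s1 + s2 + s3;  /* 1 byte for the frame type */
    if (*offset > buf_len || buf_len - *offset < need) {
        return 0;
    }
    uint8_t* p = buf + *offset;
    *p++ = 0x04;
    p += varint_encode(stream_id, p);
    p += varint_encode(error_code, p);
    p += varint_encode(final_size, p);
    *offset += need;
    return 1;
}
```

The value 15293, for example, needs two bytes and is written as 0x7b 0xbd, which matches the worked example in the RFC.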
QuicResetStreamFrameDecode defines how a RESET_STREAM frame is decoded. It is implemented as follows:
_Success_(return != FALSE)
BOOLEAN
QuicResetStreamFrameDecode(
_In_ uint16_t BufferLength,
_In_reads_bytes_(BufferLength)
const uint8_t * const Buffer,
_Inout_ uint16_t* Offset,
_Out_ QUIC_RESET_STREAM_EX* Frame
)
{
if (!QuicVarIntDecode(BufferLength, Buffer, Offset, &Frame->StreamID) ||
!QuicVarIntDecode(BufferLength, Buffer, Offset, &Frame->ErrorCode) ||
!QuicVarIntDecode(BufferLength, Buffer, Offset, &Frame->FinalSize)) {
return FALSE;
}
return TRUE;
}
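It is the reverse of Encode: the StreamID, error code, and final size are decoded into the QUIC_RESET_STREAM_EX structure Frame. The frame type byte is not read here, so the caller has presumably consumed it already.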
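The decoding direction can be sketched in the same spirit. The code below follows the variable-length integer rules of RFC 9000, Section 16; `varint_decode`, `reset_stream_decode`, and `reset_stream_fields` are hypothetical names, and the sketch assumes the frame type byte has already been consumed.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct reset_stream_fields {
    uint64_t stream_id;
    uint64_t error_code;
    uint64_t final_size;
} reset_stream_fields;

/* Decode one variable-length integer at buf + *offset.
 * Returns 1 and advances *offset on success, 0 if the buffer is too short. */
static int varint_decode(const uint8_t* buf, size_t buf_len,
                         size_t* offset, uint64_t* value)
{
    if (*offset >= buf_len) {
        return 0;
    }
    const uint8_t first = buf[*offset];
    const size_t n = (size_t)1 << (first >> 6);  /* top two bits: 1, 2, 4, or 8 bytes */
    if (buf_len - *offset < n) {
        return 0;
    }
    uint64_t v = first & 0x3f;                   /* strip the length prefix */
    for (size_t i = 1; i < n; i++) {
        v = (v << 8) | buf[*offset + i];
    }
    *value = v;
    *offset += n;
    return 1;
}

/* Decode the three RESET_STREAM fields. On failure neither *offset nor *out
 * is modified, so the caller never sees a half-decoded frame. */
static int reset_stream_decode(const uint8_t* buf, size_t buf_len,
                               size_t* offset, reset_stream_fields* out)
{
    size_t pos = *offset;
    reset_stream_fields tmp;
    if (!varint_decode(buf, buf_len, &pos, &tmp.stream_id) ||
        !varint_decode(buf, buf_len, &pos, &tmp.error_code) ||
        !varint_decode(buf, buf_len, &pos, &tmp.final_size)) {
        return 0;
    }
    *out = tmp;
    *offset = pos;
    return 1;
}
```

Decoding into a temporary and committing only at the end is a design choice of this sketch; the excerpt above writes straight into Frame and relies on the caller to discard it when FALSE is returned.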
Packets are implemented in packet.h and packet.c.
packet.h defines the packet headers:
//
// The layout invariant (not specific to a particular version) fields
// of a QUIC packet.
//
typedef struct QUIC_HEADER_INVARIANT {
union {
struct {
uint8_t VARIANT : 7;
uint8_t IsLongHeader : 1;
};
struct {
uint8_t VARIANT : 7;
uint8_t IsLongHeader : 1;
uint32_t Version;
uint8_t DestCidLength;
uint8_t DestCid[0];
//uint8_t SourceCidLength;
//uint8_t SourceCid[SourceCidLength];
} LONG_HDR;
struct {
uint8_t VARIANT : 7;
uint8_t IsLongHeader : 1;
uint8_t DestCid[0];
} SHORT_HDR;
};
} QUIC_HEADER_INVARIANT;
In a sense, QUIC_HEADER_INVARIANT acts as the base class of all headers.
//
// Represents the long header format. All values in Network Byte order.
// The 4 least significant bits are protected by header protection.
//
typedef struct QUIC_LONG_HEADER_V1 {
uint8_t PnLength : 2;
uint8_t Reserved : 2; // Must be 0.
uint8_t Type : 2;
uint8_t FixedBit : 1; // Must be 1.
uint8_t IsLongHeader : 1;
uint32_t Version;
uint8_t DestCidLength;
uint8_t DestCid[0];
//uint8_t SourceCidLength;
//uint8_t SourceCid[SourceCidLength];
// QUIC_VAR_INT TokenLength; {Initial}
// uint8_t Token[0]; {Initial}
//QUIC_VAR_INT Length;
//uint8_t PacketNumber[PnLength];
//uint8_t Payload[0];
} QUIC_LONG_HEADER_V1;
This defines the long header. The file also contains specializations of the long header, such as the Retry packet header:
//
// Represents the long header retry packet format. All values in Network Byte
// order.
//
typedef struct QUIC_RETRY_V1 {
uint8_t UNUSED : 4;
uint8_t Type : 2;
uint8_t FixedBit : 1; // Must be 1.
uint8_t IsLongHeader : 1;
uint32_t Version;
uint8_t DestCidLength;
uint8_t DestCid[0];
//uint8_t SourceCidLength;
//uint8_t SourceCid[SourceCidLength];
//uint8_t Token[*];
//uint8_t RetryIntegrityField[16];
} QUIC_RETRY_V1;
//
// Represents the short header format. All values in Network Byte order.
// The 5 least significant bits are protected by header protection.
//
typedef struct QUIC_SHORT_HEADER_V1 {
uint8_t PnLength : 2;
uint8_t KeyPhase : 1;
uint8_t Reserved : 2; // Must be 0.
uint8_t SpinBit : 1;
uint8_t FixedBit : 1; // Must be 1.
uint8_t IsLongHeader : 1;
uint8_t DestCid[0]; // Length depends on connection.
//uint8_t PacketNumber[PnLength];
//uint8_t Payload[0];
} QUIC_SHORT_HEADER_V1;
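This defines the short header. Note that, to support the variable-length Destination Connection ID, these structures all use the flexible-array technique, written here as a zero-length array: uint8_t DestCid[0]; // Length depends on connection.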
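Because the connection IDs vary in length, the fields that follow them cannot sit at fixed struct offsets, which is why they appear only as comments in the definitions above. A minimal sketch of walking the version-independent long header fields with explicit offsets follows; the layout is taken from RFC 8999, while `long_header_view` and `parse_long_header` are hypothetical names rather than MsQuic functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointers into the packet; nothing is copied. */
typedef struct long_header_view {
    uint32_t       version;
    uint8_t        dcid_len;
    const uint8_t* dcid;
    uint8_t        scid_len;
    const uint8_t* scid;
} long_header_view;

/* Parse first byte, Version, DCID Len, DCID, SCID Len, SCID.
 * Returns 1 on success, 0 for a short header or a truncated packet. */
static int parse_long_header(const uint8_t* pkt, size_t len, long_header_view* out)
{
    size_t off;
    if (len < 6 || (pkt[0] & 0x80) == 0) {  /* top bit set means long header */
        return 0;
    }
    /* The version is in network byte order (big-endian). */
    out->version = ((uint32_t)pkt[1] << 24) | ((uint32_t)pkt[2] << 16) |
                   ((uint32_t)pkt[3] << 8)  |  (uint32_t)pkt[4];
    out->dcid_len = pkt[5];
    off = 6;
    if (len - off < out->dcid_len) {
        return 0;
    }
    out->dcid = pkt + off;
    off += out->dcid_len;
    if (len - off < 1) {                    /* SCID length byte must be present */
        return 0;
    }
    out->scid_len = pkt[off++];
    if (len - off < out->scid_len) {
        return 0;
    }
    out->scid = pkt + off;
    return 1;
}
```

Every length is checked against the remaining bytes before it is used, since these values come straight from the network.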
The file also defines many functions, for example to determine the type of a packet or to encode and decode packets.
For example:
//
// Returns TRUE for a handshake packet (non-0RTT long header).
//
inline
BOOLEAN
QuicPacketIsHandshake(
_In_ const QUIC_HEADER_INVARIANT* Packet
)
{
if (!Packet->IsLongHeader) {
return FALSE;
}
switch (Packet->LONG_HDR.Version) {
case QUIC_VERSION_1:
case QUIC_VERSION_DRAFT_29:
case QUIC_VERSION_MS_1:
return ((QUIC_LONG_HEADER_V1*)Packet)->Type != QUIC_0_RTT_PROTECTED;
default:
return TRUE;
}
}
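This function determines whether a packet is a handshake packet. It takes the QUIC_HEADER_INVARIANT "base class" and first checks whether the packet has a long header, because all handshake packets use long headers. If it does, the function reads Version from the LONG_HDR struct in the union. For a supported version it casts Packet to QUIC_LONG_HEADER_V1* and checks the type: anything other than QUIC_0_RTT_PROTECTED counts as a handshake packet. For any other version it returns TRUE.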
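The same test can be expressed on the raw first byte instead of through bit-field structs. The sketch below assumes QUIC version 1, where bits 5 and 4 of a long header's first byte carry the packet type (RFC 9000, Section 17.2); the names are hypothetical, and the version check done by the real function is left out.

```c
#include <stdint.h>

/* Long header packet types in QUIC version 1. */
typedef enum long_header_type {
    LH_INITIAL   = 0,
    LH_0RTT      = 1,
    LH_HANDSHAKE = 2,
    LH_RETRY     = 3
} long_header_type;

/* The most significant bit of the first byte selects the header form. */
static int is_long_header(uint8_t first_byte)
{
    return (first_byte & 0x80) != 0;
}

/* Bits 5..4 hold the type; only meaningful when is_long_header() is true. */
static long_header_type long_type_v1(uint8_t first_byte)
{
    return (long_header_type)((first_byte >> 4) & 0x3);
}

/* Mirrors the idea of the function above: long header and not 0-RTT. */
static int is_handshake_like(uint8_t first_byte)
{
    return is_long_header(first_byte) && long_type_v1(first_byte) != LH_0RTT;
}
```

Masking and shifting the byte is independent of how a compiler lays out bit-fields, which is the main reason to prefer it in portable code; the struct form in the excerpt is more readable but ties the definition to a known layout.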
binding.h and binding.c define the QUIC_BINDING structure and its related functions. QUIC_BINDING abstracts a UDP binding.
Take QuicBindingInitialize as an example:
_IRQL_requires_max_(PASSIVE_LEVEL)
QUIC_STATUS
QuicBindingInitialize(
_In_ const CXPLAT_UDP_CONFIG* UdpConfig,
_Out_ QUIC_BINDING** NewBinding
)
{
QUIC_STATUS Status;
QUIC_BINDING* Binding;
BOOLEAN HashTableInitialized = FALSE;
Binding = CXPLAT_ALLOC_NONPAGED(sizeof(QUIC_BINDING), QUIC_POOL_BINDING);
if (Binding == NULL) {
QuicTraceEvent(
AllocFailure,
"Allocation of '%s' failed. (%llu bytes)",
"QUIC_BINDING",
sizeof(QUIC_BINDING));
Status = QUIC_STATUS_OUT_OF_MEMORY;
goto Error;
}
Binding->RefCount = 0; // No refs until it's added to the library's list
Binding->Exclusive = !(UdpConfig->Flags & CXPLAT_SOCKET_FLAG_SHARE);
Binding->ServerOwned = !!(UdpConfig->Flags & CXPLAT_SOCKET_SERVER_OWNED);
Binding->Connected = UdpConfig->RemoteAddress == NULL ? FALSE : TRUE;
Binding->StatelessOperCount = 0;
CxPlatDispatchRwLockInitialize(&Binding->RwLock);
CxPlatDispatchLockInitialize(&Binding->StatelessOperLock);
CxPlatListInitializeHead(&Binding->Listeners);
QuicLookupInitialize(&Binding->Lookup);
if (!CxPlatHashtableInitializeEx(&Binding->StatelessOperTable, CXPLAT_HASH_MIN_SIZE)) {
Status = QUIC_STATUS_OUT_OF_MEMORY;
goto Error;
}
HashTableInitialized = TRUE;
CxPlatListInitializeHead(&Binding->StatelessOperList);
//
// Random reserved version number for version negotation.
//
CxPlatRandom(sizeof(uint32_t), &Binding->RandomReservedVersion);
Binding->RandomReservedVersion =
(Binding->RandomReservedVersion & ~QUIC_VERSION_RESERVED_MASK) |
QUIC_VERSION_RESERVED;
#ifdef QUIC_COMPARTMENT_ID
Binding->CompartmentId = UdpConfig->CompartmentId;
BOOLEAN RevertCompartmentId = FALSE;
QUIC_COMPARTMENT_ID PrevCompartmentId = QuicCompartmentIdGetCurrent();
if (PrevCompartmentId != UdpConfig->CompartmentId) {
Status = QuicCompartmentIdSetCurrent(UdpConfig->CompartmentId);
if (QUIC_FAILED(Status)) {
QuicTraceEvent(
BindingErrorStatus,
"[bind][%p] ERROR, %u, %s.",
Binding,
Status,
"Set current compartment Id");
goto Error;
}
RevertCompartmentId = TRUE;
}
#endif
#if QUIC_TEST_DATAPATH_HOOKS_ENABLED
QUIC_TEST_DATAPATH_HOOKS* Hooks = MsQuicLib.TestDatapathHooks;
CXPLAT_UDP_CONFIG HookUdpConfig = *UdpConfig;
if (Hooks != NULL) {
QUIC_ADDR RemoteAddressCopy;
if (UdpConfig->RemoteAddress != NULL) {
RemoteAddressCopy = *UdpConfig->RemoteAddress;
}
QUIC_ADDR LocalAddressCopy;
if (UdpConfig->LocalAddress != NULL) {
LocalAddressCopy = *UdpConfig->LocalAddress;
}
Hooks->Create(
UdpConfig->RemoteAddress != NULL ? &RemoteAddressCopy : NULL,
UdpConfig->LocalAddress != NULL ? &LocalAddressCopy : NULL);
HookUdpConfig.LocalAddress = (UdpConfig->LocalAddress != NULL) ? &LocalAddressCopy : NULL;
HookUdpConfig.RemoteAddress = (UdpConfig->RemoteAddress != NULL) ? &RemoteAddressCopy : NULL;
HookUdpConfig.CallbackContext = Binding;
Status =
CxPlatSocketCreateUdp(
MsQuicLib.Datapath,
&HookUdpConfig,
&Binding->Socket);
} else {
#endif
((CXPLAT_UDP_CONFIG*)UdpConfig)->CallbackContext = Binding;
Status =
CxPlatSocketCreateUdp(
MsQuicLib.Datapath,
UdpConfig,
&Binding->Socket);
#if QUIC_TEST_DATAPATH_HOOKS_ENABLED
}
#endif
#ifdef QUIC_COMPARTMENT_ID
if (RevertCompartmentId) {
(void)QuicCompartmentIdSetCurrent(PrevCompartmentId);
}
#endif
if (QUIC_FAILED(Status)) {
QuicTraceEvent(
BindingErrorStatus,
"[bind][%p] ERROR, %u, %s.",
Binding,
Status,
"Create datapath binding");
goto Error;
}
QUIC_ADDR DatapathLocalAddr, DatapathRemoteAddr;
QuicBindingGetLocalAddress(Binding, &DatapathLocalAddr);
QuicBindingGetRemoteAddress(Binding, &DatapathRemoteAddr);
QuicTraceEvent(
BindingCreated,
"[bind][%p] Created, Udp=%p LocalAddr=%!ADDR! RemoteAddr=%!ADDR!",
Binding,
Binding->Socket,
CASTED_CLOG_BYTEARRAY(sizeof(DatapathLocalAddr), &DatapathLocalAddr),
CASTED_CLOG_BYTEARRAY(sizeof(DatapathRemoteAddr), &DatapathRemoteAddr));
*NewBinding = Binding;
Status = QUIC_STATUS_SUCCESS;
Error:
if (QUIC_FAILED(Status)) {
if (Binding != NULL) {
QuicLookupUninitialize(&Binding->Lookup);
if (HashTableInitialized) {
CxPlatHashtableUninitialize(&Binding->StatelessOperTable);
}
CxPlatDispatchLockUninitialize(&Binding->StatelessOperLock);
CxPlatDispatchRwLockUninitialize(&Binding->RwLock);
CXPLAT_FREE(Binding, QUIC_POOL_BINDING);
}
}
return Status;
}
It initializes a binding from the given UDP configuration.
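One detail in the function above is the random reserved version. RFC 9000, Section 15 reserves every version number of the form 0x?a?a?a?a for exercising version negotiation, and the masking step forces a random value into that form. The sketch below shows the arithmetic; the constant values are my assumption based on the RFC pattern, not copied from MsQuic's headers, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed values: the low nibble of every byte must equal 0xa. */
#define VERSION_RESERVED      0x0a0a0a0aU
#define VERSION_RESERVED_MASK 0x0f0f0f0fU

/* Keep the random high nibbles, force each low nibble to 0xa. */
static uint32_t make_reserved_version(uint32_t random_bits)
{
    return (random_bits & ~VERSION_RESERVED_MASK) | VERSION_RESERVED;
}

/* True if v matches the reserved pattern 0x?a?a?a?a. */
static int is_reserved_version(uint32_t v)
{
    return (v & VERSION_RESERVED_MASK) == VERSION_RESERVED;
}
```

With the input 0x12345678, the high nibbles 1, 3, 5, and 7 survive and the result is 0x1a3a5a7a.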
The packet builder abstracts the logic of producing a chain of UDP datagrams, each of which may contain multiple QUIC packets.
_IRQL_requires_max_(DISPATCH_LEVEL)
_Success_(return != FALSE)
BOOLEAN
QuicPacketBuilderInitialize(
_Inout_ QUIC_PACKET_BUILDER* Builder,
_In_ QUIC_CONNECTION* Connection,
_In_ QUIC_PATH* Path
)
{
CXPLAT_DBG_ASSERT(Path->DestCid != NULL);
Builder->Connection = Connection;
Builder->Path = Path;
Builder->PacketBatchSent = FALSE;
Builder->PacketBatchRetransmittable = FALSE;
Builder->Metadata = &Builder->MetadataStorage.Metadata;
Builder->EncryptionOverhead = CXPLAT_ENCRYPTION_OVERHEAD;
Builder->TotalDatagramsLength = 0;
if (Connection->SourceCids.Next == NULL) {
QuicTraceLogConnWarning(
NoSrcCidAvailable,
Connection,
"No src CID to send with");
return FALSE;
}
Builder->SourceCid =
CXPLAT_CONTAINING_RECORD(
Connection->SourceCids.Next,
QUIC_CID_HASH_ENTRY,
Link);
uint64_t TimeNow = CxPlatTimeUs64();
uint64_t TimeSinceLastSend;
if (Connection->Send.LastFlushTimeValid) {
TimeSinceLastSend =
CxPlatTimeDiff64(Connection->Send.LastFlushTime, TimeNow);
} else {
TimeSinceLastSend = 0;
}
Builder->SendAllowance =
QuicCongestionControlGetSendAllowance(
&Connection->CongestionControl,
TimeSinceLastSend,
Connection->Send.LastFlushTimeValid);
if (Builder->SendAllowance > Path->Allowance) {
Builder->SendAllowance = Path->Allowance;
}
Connection->Send.LastFlushTime = TimeNow;
Connection->Send.LastFlushTimeValid = TRUE;
return TRUE;
}
This is the initialization function: it sets up a builder from a connection, where Path is the part of the connection that relates to the network path.
_IRQL_requires_max_(PASSIVE_LEVEL)
void
QuicPacketBuilderSendBatch(
_Inout_ QUIC_PACKET_BUILDER* Builder
)
{
QuicTraceLogConnVerbose(
PacketBuilderSendBatch,
Builder->Connection,
"Sending batch. %hu datagrams",
(uint16_t)Builder->TotalCountDatagrams);
QuicBindingSend(
Builder->Path->Binding,
&Builder->Path->Route,
Builder->SendData,
Builder->TotalDatagramsLength,
Builder->TotalCountDatagrams,
Builder->Connection->Worker->IdealProcessor);
Builder->PacketBatchSent = TRUE;
Builder->SendData = NULL;
Builder->TotalDatagramsLength = 0;
Builder->Metadata->FrameCount = 0;
}
This is the builder's send function. It calls QuicBindingSend, so let's look at that function:
_IRQL_requires_max_(DISPATCH_LEVEL)
QUIC_STATUS
QuicBindingSend(
_In_ QUIC_BINDING* Binding,
_In_ const CXPLAT_ROUTE* Route,
_In_ CXPLAT_SEND_DATA* SendData,
_In_ uint32_t BytesToSend,
_In_ uint32_t DatagramsToSend,
_In_ uint16_t IdealProcessor
)
{
QUIC_STATUS Status;
#if QUIC_TEST_DATAPATH_HOOKS_ENABLED
QUIC_TEST_DATAPATH_HOOKS* Hooks = MsQuicLib.TestDatapathHooks;
if (Hooks != NULL) {
CXPLAT_ROUTE RouteCopy = *Route;
BOOLEAN Drop =
Hooks->Send(
&RouteCopy.RemoteAddress,
&RouteCopy.LocalAddress,
SendData);
if (Drop) {
QuicTraceLogVerbose(
BindingSendTestDrop,
"[bind][%p] Test dropped packet",
Binding);
CxPlatSendDataFree(SendData);
Status = QUIC_STATUS_SUCCESS;
} else {
Status =
CxPlatSocketSend(
Binding->Socket,
&RouteCopy,
SendData,
IdealProcessor);
if (QUIC_FAILED(Status)) {
QuicTraceLogWarning(
BindingSendFailed,
"[bind][%p] Send failed, 0x%x",
Binding,
Status);
}
}
} else {
#endif
Status =
CxPlatSocketSend(
Binding->Socket,
Route,
SendData,
IdealProcessor);
if (QUIC_FAILED(Status)) {
QuicTraceLogWarning(
BindingSendFailed,
"[bind][%p] Send failed, 0x%x",
Binding,
Status);
}
#if QUIC_TEST_DATAPATH_HOOKS_ENABLED
}
#endif
QuicPerfCounterAdd(QUIC_PERF_COUNTER_UDP_SEND, DatagramsToSend);
QuicPerfCounterAdd(QUIC_PERF_COUNTER_UDP_SEND_BYTES, BytesToSend);
QuicPerfCounterIncrement(QUIC_PERF_COUNTER_UDP_SEND_CALLS);
return Status;
}
As the listing shows, the part that actually sends the data is:
Status =
CxPlatSocketSend(
Binding->Socket,
Route,
SendData,
IdealProcessor);
if (QUIC_FAILED(Status)) {
QuicTraceLogWarning(
BindingSendFailed,
"[bind][%p] Send failed, 0x%x",
Binding,
Status);
}
CxPlatSocketSend is a function whose implementation is platform-specific; it sends the data through a socket.
Author: NP094