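- Reliable
- Supports both WAN and LAN
- Large file transfer (GB and above)
- Connection migration: non-essential, not supported in VERSION = 0
- Security layer: non-essential, not supported in VERSION = 0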
Fixed header (1 byte):
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| version | type | Header (1 byte)
+------+------+------+------+------+------+------+------+
| remain | PCF | CF | Flag
+------+------+------+------+------+------+------+------+
| connection id(optional) |
+------+------+------+------+------+------+------+------+
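- version (bit 5~8): 3 bits, i.e. 2^3 = 8 versions. The current version is 0.
- Control flow types, 0x00 ~ 0x0F:
  - 00000 (0): BROAD, broadcast packet
    - Used for service discovery on a LAN
  - 00001 (1): MULTI, multicast packet
    - Used for multicast push scenarios; reliable delivery is not supported
  - 00010 (2): WSASK (Window Size ASK), window size probe
    - Probe sent when the receive window is 0, to avoid deadlock
  - 00011 (3): WSANS (Window Size ANSwer), window size reply
    - Reply to the probe sent when the receive window is 0, to avoid deadlock; it can also be sent unprompted to announce the window
- Data flow types, 0x10 ~ 0x1F:
  - 10000 (16): BEGIN, first packet
    - The first packet of a message that exceeds the MTU
  - 10001 (17): DOING, middle packet
    - A subsequent, non-final packet of a message that exceeds the MTU
  - 10010 (18): DONED, final packet
    - The last packet of a message that exceeds the MTU
  - 10011 (19): BDODO, begin-doing-doned packet
    - For small data the first packet is also the middle and the last packet: a message smaller than the MTU is sent in a single packet
  - 10100 (20): DTACK (DaTa ACKnowledge), data acknowledgement
    - Acknowledges every type of data transfer packet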
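- CF(2): connection id flag
  - 00: connection id disabled, i.e. 0 bits = 0 bytes
  - 01: connection id enabled, i.e. 8 bits = 1 byte
  - 10: connection id enabled, i.e. 32 bits = 4 bytes
  - 11: connection id enabled, i.e. 64 bits = 8 bytes
- PCF(2): Packet Ctrl Flag, control parameters specific to each packet type
  - Its meaning is defined per packet type
- remain(4): reserved bits
- CF = 00: the packet must not carry a connection id; on a LAN, use at your discretion
- CF = 01: 1-byte identifier used to keep the connection; on a LAN, use at your discretion
- CF = 10: 4-byte identifier used to keep the connection; on a LAN, use at your discretion
- CF = 11: 8-byte identifier used to keep the connection; not recommended on a LAN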
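The field layout above can be sketched as a small encoder and decoder. This is only an illustration under stated assumptions: the type byte is followed directly by the Flag byte as drawn, Flag is `PCF << 2 | CF` (which matches the Flag values listed in the per-type diagrams), the connection id is written big-endian, and the names `PacketType`, `CID_LEN`, `pack_header` and `parse_header` are hypothetical.

```python
from enum import IntEnum


class PacketType(IntEnum):
    BROAD = 0
    MULTI = 1
    WSASK = 2
    WSANS = 3
    BEGIN = 16
    DOING = 17
    DONED = 18
    BDODO = 19
    DTACK = 20


# CF value -> connection id length in bytes
CID_LEN = {0b00: 0, 0b01: 1, 0b10: 4, 0b11: 8}


def pack_header(version, ptype, pcf, cf, cid=0):
    """Build the type byte, the Flag byte and the optional connection id."""
    if not (0 <= version < 8 and 0 <= int(ptype) < 32):
        raise ValueError("version or type out of range")
    if not (0 <= pcf < 4 and 0 <= cf < 4):
        raise ValueError("PCF or CF out of range")
    first = (version << 5) | int(ptype)
    flag = (pcf << 2) | cf  # the 4 reserved high bits stay 0
    # Assumption: connection id in big-endian (network) byte order.
    return bytes([first, flag]) + cid.to_bytes(CID_LEN[cf], "big")


def parse_header(buf):
    """Return (version, type, pcf, cf, cid, header_length)."""
    version = buf[0] >> 5
    ptype = PacketType(buf[0] & 0x1F)
    pcf = (buf[1] >> 2) & 0b11
    cf = buf[1] & 0b11
    n = CID_LEN[cf]
    cid = int.from_bytes(buf[2:2 + n], "big")
    return version, ptype, pcf, cf, cid, 2 + n
```

With this layout a data flow type can be recognized by testing the top bit of the 5-bit type (`ptype & 0x10`).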
Control flow:
BROAD: broadcast packet
0 3 7
+------+-------------+---------------------------+
| kind | len | ~~~ |
+------+-------------+---------------------------+
MULTI: multicast packet
Reserved for now
WSASK: window probe control packet (Window Size ASK)
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
WSANS: window reply control packet (Window Size ANSwer)
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) |
+-------------------------------------+---------------------+
| delta(2) | wnd(4) |
+--------------+----------------------+
Data flow:
BDODO: small data packet (Begin Doing Doned)
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
BEGIN/DOING/DONED: data transfer packets
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
| payload(optional) |
+-----------------------------------------------------------+
DTACK: data transfer ack packet (DaTa ACKnowledge)
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
| delta(2) | wnd(4) | una(4) |
+--------------+---------------------+----------------------+
| sack([tn+seq] * m) |
+-----------------------------------------------------------+
- Constants:
  - MTU: 1370
  - MSS: 1370 - 20 = 1350
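- Sequence number:
  - tn + seq: 32 bits; by default tn: 5 bits, seq: 28 bits
  - tn: task number, identifies the different data blocks that one connection carries concurrently
    - 0: stands for the connection that all tasks belong to; mainly used for flow control and congestion control, and for window probes
    - 1 ~ 9: reserved for internal protocol use; must not be assigned to send tasks
      - 1: stream id reserved for the future authentication handshake
      - 2: stream used by BROAD
      - 3: stream used by MULTI
  - seq: sequence number; by default the initial value is chosen at random in 0 ~ 2^28, and it then increments with each packet to give the packet's order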
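A minimal sketch of packing tn and seq into the 4-byte field. It follows the byte layout drawn in the body diagrams, `tn(1)` and `seq(3)`, which is an assumption on my part since the bit widths quoted in the text differ; big-endian order and all function names are likewise hypothetical.

```python
import random

SEQ_BITS = 24            # assumption: 3-byte seq, as drawn in the body diagrams
SEQ_MOD = 1 << SEQ_BITS
RESERVED_TN = range(0, 10)  # 0 = whole connection, 1 ~ 9 = protocol internal


def pack_tn_seq(tn, seq):
    """Encode tn (1 byte) followed by seq (3 bytes, big-endian)."""
    if not 0 <= tn <= 0xFF:
        raise ValueError("tn does not fit in one byte")
    return bytes([tn]) + (seq % SEQ_MOD).to_bytes(3, "big")


def unpack_tn_seq(buf):
    """Decode the 4-byte field back into (tn, seq)."""
    return buf[0], int.from_bytes(buf[1:4], "big")


def initial_seq(rng=random):
    """Pick a random starting sequence number."""
    return rng.randrange(SEQ_MOD)


def next_seq(seq):
    """Increment seq, wrapping around at the field width."""
    return (seq + 1) % SEQ_MOD


def is_assignable_tn(tn):
    """True if a send task may use this task number."""
    return 0 <= tn <= 0xFF and tn not in RESERVED_TN
```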
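- Timestamp:
  - ts: timestamp of when the data was sent, in milliseconds since 00:00:00 of the current day; a 32-bit value can hold at most 2^32 ms ≈ 49.7 days
    - Purpose:
      - Stamps each packet with its send time, which the ack echoes back to the sender for RTT calculation
      - Distinguishes different packets that carry the same sequence number after seq wraps around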
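As an illustration of the ts definition, the sketch below derives the millisecond offset from local midnight and serializes it as 4 bytes. The big-endian byte order and the helper names `ms_since_midnight` and `pack_ts` are assumptions, not part of the protocol text.

```python
from datetime import datetime, timedelta

DAY_MS = 24 * 60 * 60 * 1000  # a day-relative ts never exceeds this


def ms_since_midnight(now):
    """Milliseconds elapsed since 00:00:00 of the day `now` falls on."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Integer division of timedeltas avoids floating-point rounding.
    return (now - midnight) // timedelta(milliseconds=1)


def pack_ts(now):
    """Serialize ts as a 4-byte big-endian field (byte order assumed)."""
    return ms_since_midnight(now).to_bytes(4, "big")
```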
- Window size:
  - wnd: window, the receiver's maximum receive window, counted in packets; the maximum number of data bytes is therefore wnd * MTU
- Data:
  - payload length: MSS - fixed header - variable header
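- ACK methods:
  - una: the largest unacknowledged sequence number
  - ack: the sequence number acknowledged by the current ACK
  - sack: selective acknowledgement, a list of (tn + seq) * n in which the tn byte is reused as a count; all entries in one sack belong to the same task, and mixing tasks is forbidden
  - nack: a seq that has not been acknowledged, equivalent to having been ACKed 0 times
  - dack: a seq that has been acknowledged repeatedly, equivalent to having been ACKed m times, m >= 1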
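One possible reading of the sack list, sketched under assumptions: each entry is 4 bytes, the first byte is the reused tn field interpreted as an ACK count, and the remaining 3 bytes are a big-endian seq. The names `decode_sack` and `classify` are hypothetical.

```python
def decode_sack(buf):
    """Split a sack area into (count, seq) pairs of 4 bytes each."""
    if len(buf) % 4:
        raise ValueError("sack area must be a multiple of 4 bytes")
    return [
        (buf[i], int.from_bytes(buf[i + 1:i + 4], "big"))
        for i in range(0, len(buf), 4)
    ]


def classify(entries):
    """Separate nack entries (count == 0) from dack entries (count >= 1)."""
    nack = [seq for count, seq in entries if count == 0]
    dack = [(seq, count) for count, seq in entries if count >= 1]
    return nack, dack
```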
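- delta:
  - The time the receiver spends between receiving the packet and sending its ack, in ms; 2 bytes, at most 65535 ms
  - This value can reveal processing bottlenecks on the receiver, such as memory or CPU pressure
  - If a cumulative acknowledgement mechanism is in use, this time cost must be excluded to avoid calculation errors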
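A hedged sketch of how an RTT sample might be derived from the echoed ts and delta. Subtracting delta from the elapsed time and the modulo used to survive a midnight rollover are my assumptions; `rtt_sample` is a hypothetical name.

```python
DAY_MS = 86_400_000  # ts is relative to midnight, so it wraps once a day


def rtt_sample(now_ms, echo_ts, delta):
    """Estimate RTT in ms from the sender's clock, the echoed ts and delta.

    now_ms and echo_ts are both milliseconds since midnight on the sender.
    """
    elapsed = (now_ms - echo_ts) % DAY_MS  # tolerate a midnight rollover
    # Remove the receiver's processing time so only network time remains.
    return max(elapsed - delta, 0)
```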
CF must be 0
header-0: PCF = 0b00 (default)
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 0 0 | Type: 0
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | 0 0 | Flag = 0
+------+------+------+------+------+------+------+------+
body-0: PCF = 0b00 (default)
0 1
+-------+
| tn(1) |
+-------+
header-1: PCF = 0b01 -> options supported
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 0 0 | Type: 0
+------+------+------+------+------+------+------+------+
| x x x x | 0 1 | 0 0 | Flag = 4
+------+------+------+------+------+------+------+------+
body-1: PCF = 0b01 -> options supported
0 1 3 ?
+-------+------+------+---------------------------+
| tn(1) | kind | len | value |
+-------+------+------+---------------------------+
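len: size of the value that follows, at most 255 bytes
kind: [Recommendation] broadcast packets only take effect on a LAN, so options should be designed around LAN scenarios
- kind: 0 ~ 63 (64 values): reserved for future protocol extensions; custom use is forbidden
- kind: 64 ~ 223 (160 values): available for custom use
- kind: 224 ~ 255 (32 values): for experimental use
With options added, a broadcast packet should stay within the MTU to reduce the chance of IP fragmentation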
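The option layout of the BROAD body can be sketched as a small parser. It assumes that kind and len are one byte each (the text caps len at 255) and that several kind/len/value triplets may follow one another, which the diagram does not state explicitly; `kind_range` and `parse_broad_options` are hypothetical names.

```python
def kind_range(kind):
    """Map an option kind to the range it falls in."""
    if not 0 <= kind <= 255:
        raise ValueError("kind must fit in one byte")
    if kind <= 63:
        return "reserved"      # future protocol extensions
    if kind <= 223:
        return "custom"        # application-defined options
    return "experimental"


def parse_broad_options(body):
    """Parse tn(1) followed by kind(1) | len(1) | value(len) triplets."""
    tn, pos, options = body[0], 1, []
    while pos < len(body):
        if pos + 2 > len(body):
            raise ValueError("truncated option header")
        kind, length = body[pos], body[pos + 1]
        value = body[pos + 2:pos + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options.append((kind, value))
        pos += 2 + length
    return tn, options
```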
- This scenario is ignored for now
header-0: PCF = 0b00 (default) single-task probe, for task-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 0 | Type: 2
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8(byte)
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
header-1: PCF = 0b01 connection probe: a connection-level window probe, for connection-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 0 | Type: 2
+------+------+------+------+------+------+------+------+
| x x x x | 0 1 | ? ? | Flag = 4 | 5 | 6 | 7
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-1:
0 4 5 8(byte)
+-----------------------------+-------+---------------------+
| ts(4) | 0 | 0 |
+-----------------------------+-------+---------------------+
header-0: PCF = 0b00 (default) single-task probe reply, for task-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 1 | Type: 3
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) |
+-------------------------------------+---------------------+
| delta(2) | wnd(4) |
+--------------+----------------------+
header-1: PCF = 0b01 connection probe reply: reply to a connection-level window probe, for connection-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 1 | Type: 3
+------+------+------+------+------+------+------+------+
| x x x x | 0 1 | ? ? | Flag = 4 | 5 | 6 | 7
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-1:
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | 0 | 0 |
+-------------------------------------+---------------------+
| delta(2) | wnd(4) |
+--------------+----------------------+
header-2: PCF = 0b10 single-task window notification, for task-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 1 | Type: 3
+------+------+------+------+------+------+------+------+
| x x x x | 1 0 | ? ? | Flag = 8 | 9 | 10 | 11
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-2:
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | 0 | 0 |
+-------------------------------------+---------------------+
| delta(2) | wnd(4) |
+--------------+----------------------+
header-3: PCF = 0b11 connection window notification: a connection-level window notification, for connection-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 1 | Type: 3
+------+------+------+------+------+------+------+------+
| x x x x | 1 1 | ? ? | Flag = 12 | 13 | 14 | 15
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-3:
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | 0 | 0 |
+-------------------------------------+---------------------+
| delta(2) | wnd(4) |
+--------------+----------------------+
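The WSANS body drawn above (echo ts, tn, seq, delta, wnd) can be serialized with Python's `struct` module. This is a sketch only: network byte order is assumed, and `WSANS_BODY`, `pack_wsans` and `unpack_wsans` are hypothetical names.

```python
import struct

# echo ts(4) | tn(1) | seq(3) | delta(2) | wnd(4), big-endian, no padding
WSANS_BODY = struct.Struct(">IB3sHI")


def pack_wsans(echo_ts, tn, seq, delta, wnd):
    """Encode a WSANS body; tn = 0 and seq = 0 address the whole connection."""
    return WSANS_BODY.pack(echo_ts, tn, seq.to_bytes(3, "big"), delta, wnd)


def unpack_wsans(buf):
    """Decode a WSANS body into (echo_ts, tn, seq, delta, wnd)."""
    echo_ts, tn, seq, delta, wnd = WSANS_BODY.unpack(buf)
    return echo_ts, tn, int.from_bytes(seq, "big"), delta, wnd
```

For a connection-level reply or notification the diagrams set both tn and seq to 0, so `pack_wsans(ts, 0, 0, delta, wnd)` covers those variants.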
No Flag
header-0: PCF = 0b00 (default) single-task probe reply, for task-level flow control and congestion control
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 0 0 0 1 1 | Type: 3
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | 0 0 | Flag = 0
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8(byte)
+-----------------------------+-------+---------------------+
| ts(4) | 0 | 0 |
+-----------------------------+-------+---------------------+
header-0:
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 0 0 0 | Type: 16
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
| payload(MSS - 1 - 4 - 4) |
+-----------------------------------------------------------+
header-0:
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 0 0 1 | Type: 17
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
| payload(MSS - 1 - 4 - 4) |
+-----------------------------------------------------------+
header-0:
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 0 1 0 | Type: 18
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
| payload(MSS - 1 - 4 - 4) |
+-----------------------------------------------------------+
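To illustrate how a message maps onto BEGIN, DOING, DONED and BDODO, here is a minimal splitting sketch. It uses the payload size shown in the diagrams (MSS - 1 - 4 - 4) and assumes no connection id is carried; `fragment` is a hypothetical helper.

```python
MSS = 1350
MAX_PAYLOAD = MSS - 1 - 4 - 4  # payload size from the body diagrams

BEGIN, DOING, DONED, BDODO = 16, 17, 18, 19


def fragment(message, max_payload=MAX_PAYLOAD):
    """Split a message into (packet_type, payload) pairs."""
    chunks = [
        message[i:i + max_payload]
        for i in range(0, len(message), max_payload)
    ] or [b""]
    if len(chunks) == 1:
        # Fits in one packet: first, middle and last at the same time.
        return [(BDODO, chunks[0])]
    types = [BEGIN] + [DOING] * (len(chunks) - 2) + [DONED]
    return list(zip(types, chunks))
```

The receiver can then reassemble by seq order and stop at the DONED packet, which is why no length field is needed for these types.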
header-0: PCF = 0b00 (default) no streaming: multiple BDD packets are not collected into one MSS-sized packet
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 0 1 1 | Type: 19
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0:
0 4 5 8
+-----------------------------+-------+---------------------+
| ts(4) | tn(1) | seq(3) |
+-----------------------------+-------+---------------------+
| payload(optional) | payload
+-----------------------------------------------------------+
header-1: PCF = 0b01 streaming: multiple BDD packets are collected into one MSS-sized packet, as far as this is possible without waiting too long
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 0 1 1 | Type: 19
+------+------+------+------+------+------+------+------+
| x x x x | 0 1 | ? ? | Flag = 4 | 5 | 6 | 7
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-1:
0 4
+-----------------------------+
| ts(4) |
+-----------------------------+--------------+----------+
| tn1(1) | seq1(3) | len1(2) | payload1 |
+--------+--------------------+--------------+----------+
| tn2(1) | seq2(3) | len2(2) | payload2 |
+--------+--------------------+--------------+----------+
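The batched BDODO body above (one shared ts followed by tn | seq | len | payload records) could be encoded and decoded as follows. Big-endian byte order and the names `pack_batch` and `unpack_batch` are assumptions made for this sketch.

```python
import struct


def pack_batch(ts, records):
    """Encode ts(4) followed by tn(1) | seq(3) | len(2) | payload records."""
    out = [struct.pack(">I", ts)]
    for tn, seq, payload in records:
        out.append(
            bytes([tn])
            + seq.to_bytes(3, "big")
            + struct.pack(">H", len(payload))
            + payload
        )
    return b"".join(out)


def unpack_batch(body):
    """Decode a batched body into (ts, [(tn, seq, payload), ...])."""
    (ts,) = struct.unpack_from(">I", body, 0)
    pos, records = 4, []
    while pos < len(body):
        tn = body[pos]
        seq = int.from_bytes(body[pos + 1:pos + 4], "big")
        (length,) = struct.unpack_from(">H", body, pos + 4)
        payload = body[pos + 6:pos + 6 + length]
        if len(payload) != length:
            raise ValueError("truncated payload")
        records.append((tn, seq, payload))
        pos += 6 + length
    return ts, records
```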
header-0: PCF = 0b00 -> ack
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 1 0 0 | Type: 20
+------+------+------+------+------+------+------+------+
| x x x x | 0 0 | ? ? | Flag = 0 | 1 | 2 | 3
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-0: PCF = 0b00 -> ack
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) | tn + seq = ack
+-----------------------------+-------+---------------------+
| delta(2) | wnd(4) |
+--------------+----------------------+
header-1: PCF = 0b01 -> ack + una (default)
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 1 0 0 | Type: 20
+------+------+------+------+------+------+------+------+
| x x x x | 0 1 | ? ? | Flag = 4 | 5 | 6 | 7
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-1: PCF = 0b01 -> ack + una (default)
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) | tn + seq = ack
+-----------------------------+-------+---------------------+
| delta(2) | wnd(4) | una(4) |
+--------------+----------------------+---------------------+
header-2: PCF = 0b10 -> ack + una + sack, must belong to the same connection; the same task is recommended, but different tasks are allowed
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 1 0 0 | Type: 20
+------+------+------+------+------+------+------+------+
| x x x x | 1 0 | ? ? | Flag = 8 | 9 | 10 | 11
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-2: PCF = 0b10 -> ack + una + sack, must belong to the same connection; the same task is recommended, but different tasks are allowed
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) | tn + seq = ack
+-----------------------------+-------+---------------------+
| delta(2) | wnd(4) | una(4) |
+--------------+--------------+-------+---------------------+
| tn1(1)| seq1(3) | tn2(1)| seq2(3) |
+-------+---------------------+-------+---------------------+
| tn3(1)| seq3(3) | tnn(1)| seqn(3) |
+-------+---------------------+-------+---------------------+
Maximum sack count m = (
MSS - 1 // 1 byte: fixed header
- 4 // 4 bytes: timestamp
- 4 // 4 bytes: tn + seq
- 4 // 4 bytes: receive window size
- 4 // 4 bytes: una
) / 4 = (MSS - 17) / 4 = (1350 - 17) / 4 = 333
header-3: PCF = 0b11 -> ack + sack, must belong to the same connection
8 7 6 5 4 3 2 1 0
+------+------+------+------+------+------+------+------+
| 0 0 0 | 1 0 1 0 0 | Type: 20
+------+------+------+------+------+------+------+------+
| x x x x | 1 1 | ? ? | Flag = 12 | 13 | 14 | 15
+------+------+------+------+------+------+------+------+
| connection id(optional) | cid
+------+------+------+------+------+------+------+------+
body-3: PCF = 0b11 -> ack + sack, must belong to the same connection; the same task is recommended, but different tasks are allowed
0 4 5 8
+-----------------------------+-------+---------------------+
| echo ts(4) | tn(1) | seq(3) | tn + seq = ack
+-----------------------------+-------+---------------------+
| delta(2) | wnd(4) | tn1 + seq1(4) |
+--------------+--------------+-------+---------------------+
| tn2(1)| seq2(3) | tn3(1)| seq3(3) |
+-------+---------------------+-------+---------------------+
| tn4(1)| seq4(3) | tnn(1)| seqn(3) |
+-------+---------------------+-------+---------------------+
Maximum sack count m = (
MSS - 1 // 1 byte: fixed header
- 4 // 4 bytes: timestamp
- 4 // 4 bytes: tn + seq
- 4 // 4 bytes: receive window size
) / 4 = (MSS - 13) / 4 = (1350 - 13) / 4 = 334
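The two sack limits can be checked with a few lines of arithmetic. The sketch mirrors the byte counts listed in the formulas exactly as written; `max_sack` is a hypothetical helper name.

```python
MSS = 1350


def max_sack(mss=MSS, with_una=True):
    """Number of 4-byte sack entries that fit after the listed fields."""
    overhead = 1 + 4 + 4 + 4      # fixed header, timestamp, tn + seq, window
    if with_una:
        overhead += 4             # una is only present when PCF = 0b10
    return (mss - overhead) // 4  # integer division: partial entries do not count
```

`max_sack(with_una=True)` gives 333 for the ack + una + sack form and `max_sack(with_una=False)` gives 334 for the ack + sack form.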
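- Are the broadcast and multicast packet types necessary?
  - LAN: broadcast is supported, for service discovery
  - LAN: multicast can be supported, for multi-device collaboration
  - WAN: broadcast is not supported
  - WAN: multicast is supported, but not in VERSION = 0
- The protocol has no len field marking the packet size. UDP is a datagram protocol, so whatever arrives is always a complete packet; everything after the required header and the fixed control fields is therefore data
  - A sufficiently large message has to be split at the application layer and reassembled in seq order on the receiver. With no length field, some mechanism must distinguish the beginning, middle and end, hence BEGIN, DOING and DONED
  - For BDD packets, see question 3
  - For messages larger than the MTU, the types are BEGIN, DOING and DONED and the data is necessarily larger than the MTU, so this case does not arise
- For BDD-type packets
  - When business scenarios are independent, any waiting done only to fill the MTU adds unnecessary delay to the send path
  - UDP is a datagram protocol, so whatever arrives is always a complete packet; coalescing data up to the MTU means packet boundaries have to be handled, which requires a len field marking each packet's length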
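- Should the protocol's packet types support broadcast and multicast packets separately?
  - Yes; broadcast is supported on the LAN
- Should ack packets and data packets share one structure?
  - No; PCF is added to select between the different ACK schemes
- Should data packets and control packets share one structure?
  - No; separate structures make it easier to tailor each packet type and optimize it specifically
- Do data packets need a timestamp field, and do all packet types need it?
  - Yes; every type except broadcast and multicast carries one, which gives a richer source of RTT samples
- Flow control design
  - Task-level flow control: designed around a sliding window advertised per task
  - Connection-level flow control: governs the virtual connection; the sum of all tasks' sliding windows is the connection's maximum send window
- Congestion control design
  - Flow control lives in its own module, so congestion control can be pluggable
- Keep the code layering as simple as possible
  - Modular
  - Pluggable
- Should connection migration be supported?
  - It is an optional feature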