Project Address#
Note: The developer's comments represent personal opinions only. The code is open source, and no additional explanation of the protocol is provided beyond it.
Project Attributes#
- Advantages
  - Integrates seamlessly with existing TCP proxy tools
  - Resistant to detection, reporting, and sniffing
  - Low performance overhead on relay devices in plaintext (unencrypted) mode
  - Open source; the code can be modified to use DPDK, eBPF, and similar technologies for high throughput
  - Targets the weakness of all current firewalls: tracking connections by TCP three-tuple or four-tuple
- Disadvantages
  - No resistance to binary pattern matching when unencrypted
  - Encrypted streams may draw more firewall attention to the traffic, resulting in higher latency or packet loss, although no port blocking occurred across 5 TB of unidirectional traffic
  - May cause DDoS mitigation mechanisms to misjudge the traffic and then block it or reset existing connections
- Toy
  This project is essentially the author's toy and a proof of concept; there is no intention to maintain it continuously.
  It is open-sourced only to offer some new ideas for the current environment.
Implementation#
Prelude#
Nowadays there are many circumvention protocols on the Internet, of varying implementation quality, each with its own problems. Shadowsocks, the leader among fully encrypted stream protocols, remains the preferred choice for many people now that it has solved the problems of active probing and of random traffic being accurately identified. The various TLS-based tricks developed by R have likewise managed to hide the tree that is personal traffic in a forest.
However, these protocols still have weaknesses in practice. As firewalls gradually delegate inspection, monitoring, and similar functions to edge devices, fully encrypted traffic leaving a personal device stands out like a bright light in a community area network. Ordinary users who rely on R's certificate theft function on relay devices also tend to forget about sniffing and monitoring by the Telecommunications Administration Bureau, and end up being reported and expelled.
Both design approaches also share a serious problem: encryption overhead. Client devices running proxy tools mostly use CPUs produced within the last five years or so and have the full performance of a physical machine, which easily covers the cost of running OpenSSL encryption suites. The server side, by contrast, is often a virtual host with strict performance limits. For higher throughput, the mainstream configuration of 1 core and 512 MB of memory has long been inadequate, and the memory overhead of Golang, the commonly used implementation language, makes things even worse on the server side.
Purpose#
Designing a protocol that lowers the encryption strength needed on the server side, based on how firewalls monitor traffic, became the project's primary goal. A review of various open source projects revealed a painful fact: the West Chamber Plan, proposed years ago, is so far the only project that bypasses the firewall at a theoretical level to reach the external network.
Today such vulnerabilities are almost impossible to find, so we must start from the principles of firewall traffic monitoring. Modern firewalls, commercial ones included, all rely on three-tuples, or on the more detailed four-tuples, to track traffic, namely:
(source IP, destination IP, destination PORT)
(source IP, source PORT, destination IP, destination PORT)
An idea came to mind: if we split the original three-tuple into two different three-tuples, can the firewall still associate them with each other or monitor the packets inside?
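To make the question concrete, here is a minimal sketch of a connection tracker keyed on the three-tuple. The `FlowTable` class, the addresses, and the byte counts are all hypothetical, and a real firewall is far more complex; ports 10000 and 20000 are borrowed from the sample configuration further down. It only shows that a tracker with this key sees a split connection as two unrelated one-directional flows.

```python
from collections import defaultdict


class FlowTable:
    """Toy tracker keyed on (source IP, destination IP, destination port)."""

    def __init__(self):
        # Payload bytes seen per three-tuple, split by direction.
        self.flows = defaultdict(lambda: {"up": 0, "down": 0})

    def observe(self, src_ip, dst_ip, dst_port, direction, nbytes):
        self.flows[(src_ip, dst_ip, dst_port)][direction] += nbytes


table = FlowTable()

# Conventional proxy: one three-tuple carries both directions.
table.observe("10.0.0.2", "203.0.113.7", 443, "up", 1_000)
table.observe("10.0.0.2", "203.0.113.7", 443, "down", 50_000)

# Split design: upstream and downstream use different destination ports,
# so the same traffic lands in two separate table entries.
table.observe("10.0.0.2", "203.0.113.7", 10000, "up", 1_000)
table.observe("10.0.0.2", "203.0.113.7", 20000, "down", 50_000)

for key, counters in table.flows.items():
    print(key, counters)
```

Whether a real firewall correlates such entries with each other is exactly the open question the project sets out to test.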
Design Approach#
We had long since worked out the firewall traffic exemption rules listed in usenix23, and in even more depth. We designed a handshake response that complies almost completely with these five rules: it consists entirely of printable bytes, is random at the character level, and has low entropy at the binary level.
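The following sketch illustrates only the two stated properties (printable bytes, lower byte-level entropy than fully random data). It is not the UDS handshake format, which this text does not describe; the alphabet and the helper names are assumptions chosen for illustration.

```python
import math
import secrets
import string
from collections import Counter

# Hypothetical alphabet: 62 printable ASCII characters.
ALPHABET = string.ascii_letters + string.digits


def random_printable(n):
    """Return n bytes drawn uniformly at random from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(n)).encode("ascii")


def shannon_entropy(data):
    """Empirical Shannon entropy of a byte string, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def printable_fraction(data):
    """Fraction of bytes in the printable ASCII range 0x20..0x7E."""
    return sum(0x20 <= b <= 0x7E for b in data) / len(data)


printable_sample = random_printable(4096)
random_sample = secrets.token_bytes(4096)

print("printable:", shannon_entropy(printable_sample), printable_fraction(printable_sample))
print("random:   ", shannon_entropy(random_sample), printable_fraction(random_sample))
```

With a 62-character alphabet the entropy cannot exceed log2(62), just under 6 bits per byte, whereas uniformly random bytes approach 8 bits per byte.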
Because a public IP address may not be available, all connections in this design are initiated by the client, and the data flows are merged once they reach the server.
Because some firewalls and rules used in certain provinces, cities, and special circumstances are capable of binary matching, the protocol also integrates optional encryption suites.
Shortcomings#
- After the plaintext handshake packet gets the TCP connection initially ignored by the firewall, we did not continuously measure how much latency and packet loss increased for the three-tuple, so we cannot confirm whether undetected probing mechanisms exist under heavy traffic.
- We ran large-volume unidirectional traffic tests, plus simple tests in Quanzhou, Xinjiang, and other regions, and confirmed connectivity, but we still cannot run in-depth tests in Iran and other regions.
- After about 48 hours of running, mobile networks block the connection outright, and it is unknown which mechanism was triggered. No abnormalities were observed on China Telecom or China Unicom.
- It is unclear which of the two is effective: the plaintext handshake packet or the splitting of the three-tuple. (Considering the possible existence of AES feature cracking... personally, I think both are effective.)
Configuration File#
Example#
[app]
alignment=4096
mode=client
ip=::
port=30000
inbound-ip=localhost
inbound-port=10000
outbound-ip=localhost
outbound-port=20000
turbo=true
backlog=511
fast-open=true
keep-alived=true
connect.timeout=10
handshake.timeout=5
protocol=tcp
For more configuration examples, please refer to the samples directory.
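As a rough illustration of the file's structure, the sample above can be read with Python's standard `configparser`. This is only a sketch: UDS has its own parser, which may treat edge cases differently, and the `load_app_config` helper and the shape of its return value are assumptions made here.

```python
import configparser

SAMPLE = """\
[app]
alignment=4096
mode=client
ip=::
port=30000
inbound-ip=localhost
inbound-port=10000
outbound-ip=localhost
outbound-port=20000
turbo=true
backlog=511
fast-open=true
keep-alived=true
connect.timeout=10
handshake.timeout=5
protocol=tcp
"""


def load_app_config(text):
    """Parse the [app] section into a small dictionary (hypothetical layout)."""
    # Only "=" is a delimiter, so the IPv6 value "::" is read as a plain value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    app = parser["app"]
    return {
        "mode": app["mode"],
        "listen": (app["ip"], app.getint("port")),
        "inbound": (app["inbound-ip"], app.getint("inbound-port")),
        "outbound": (app["outbound-ip"], app.getint("outbound-port")),
        "fast_open": app.getboolean("fast-open"),
        "connect_timeout": app.getint("connect.timeout"),
        "handshake_timeout": app.getint("handshake.timeout"),
    }


print(load_app_config(SAMPLE))
```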
Parameter Explanation#
- alignment
  Memory alignment parameter for alignment_malloc
- mode
  Running mode, one of [client, server]
- ip
  In client mode, the listening IP; in server mode, the original destination IP
- port
  In client mode, the listening PORT; in server mode, the original destination PORT
- inbound-ip
  IP used for the upstream link's TCP connection
- inbound-port
  PORT used for the upstream link's TCP connection
- outbound-ip
  IP used for the downstream link's TCP connection
- outbound-port
  PORT used for the downstream link's TCP connection
- turbo
  TCP link acceleration
- backlog
  TCP link backlog parameter
- fast-open
  Enable TCP Fast Open
- keep-alived
  Enable TCP Keep-alive
- connect.timeout
  TCP connection timeout
- handshake.timeout
  Protocol handshake timeout
- protocol
  Transport protocol; tcp means no encryption except for the handshake packet
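The mode-dependent meaning of `ip`/`port` and of the inbound/outbound pairs can be summarized in a small sketch. The `endpoint_roles` helper is hypothetical. It assumes, based on the sample configurations and on the statement that all connections are initiated by the client, that the client dials the inbound and outbound addresses while the server listens on them.

```python
def endpoint_roles(mode, ip, port, inbound, outbound):
    """Return which addresses are listened on and which are dialed.

    `inbound` and `outbound` are (ip, port) tuples taken from the
    inbound-* and outbound-* settings.
    """
    if mode == "client":
        # The client accepts local traffic on ip:port and opens the
        # upstream and downstream links toward the server.
        return {"listen": [(ip, port)], "dial": [inbound, outbound]}
    if mode == "server":
        # The server accepts both links and forwards the merged stream
        # to the original destination at ip:port.
        return {"listen": [inbound, outbound], "dial": [(ip, port)]}
    raise ValueError(f"unknown mode: {mode!r}")


print(endpoint_roles("client", "::", 30000, ("localhost", 10000), ("localhost", 20000)))
print(endpoint_roles("server", "127.0.0.1", 1080, ("::", 10000), ("::", 20000)))
```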
Use Cases#
warp+uds#
Cloudflare's warp is often deployed on virtual hosts with mediocre network quality as a way to improve international routing. Cloudflare provides package-manager installation for most Linux distributions.
- Install WARP
Please refer to the official Cloudflare tutorial for installation instructions.
- Switch modes
After successfully installing warp and registering the client, switch the client's running mode to proxy mode:
warp-cli set-mode proxy
- Change the port used by warp (optional)
In proxy mode, warp listens on port 1080 by default. To change it:
warp-cli set-proxy-port [PORT]
Warp only listens on localhost for the specified socks5 port, so there is no need to worry about the port being scanned.
- Start the UDS server
The configuration file is as follows:
[app]
alignment=4096
mode=server
ip=127.0.0.1
port=1080
inbound-ip=::
inbound-port=10000
outbound-ip=::
outbound-port=20000
turbo=true
backlog=511
fast-open=true
keep-alived=true
connect.timeout=10
handshake.timeout=5
protocol=tcp
Start the UDS server:
./udss --config=config.ini
- Start the UDS client
[app]
alignment=4096
mode=client
ip=127.0.0.1
port=1080
inbound-ip=[remote IP]
inbound-port=10000
outbound-ip=[remote IP]
outbound-port=20000
turbo=true
backlog=511
fast-open=true
keep-alived=true
connect.timeout=10
handshake.timeout=5
protocol=tcp
- Connect
Connect to socks5://127.0.0.1:1080 using your favorite tool.
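To check that the whole chain answers before pointing a real tool at it, the SOCKS5 method-selection exchange from RFC 1928 can be performed by hand. This is a minimal sketch: the helper names are hypothetical, and only the "no authentication" method (0x00) is offered.

```python
import socket


def build_greeting(methods=(0x00,)):
    """SOCKS5 client greeting: version 5, method count, method list."""
    return bytes([0x05, len(methods), *methods])


def parse_method_reply(reply):
    """Return the method chosen by the server, or raise on a malformed reply."""
    if len(reply) != 2 or reply[0] != 0x05:
        raise ValueError("not a SOCKS5 method-selection reply")
    return reply[1]


def probe(host="127.0.0.1", port=1080, timeout=5.0):
    """Connect to the local listener and return the selected auth method."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_greeting())
        reply = b""
        while len(reply) < 2:
            chunk = sock.recv(2 - len(reply))
            if not chunk:  # peer closed the connection early
                break
            reply += chunk
        return parse_method_reply(reply)
```

Calling `probe()` while the UDS client is running should return 0 if the SOCKS5 server at the far end accepts unauthenticated clients; a return value of 0xFF means none of the offered methods was accepted.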
Dante + UDS#
When warp is not needed to optimize international routing through Cloudflare's network and reduce the chance of interception, I personally prefer the lightweight Dante for providing socks5 connections.
The procedure is very similar to the warp one, so I won't repeat it here.
Existing Relay Access to UDS#
This setup was tested by relaying a pure vless data stream without encryption.
- Download udss on the landing device
- Set the server's target port to the listening port of the original proxy
- Download udsc on the relay machine
- Set the client's upstream and downstream connections to match the server
- Set the client's listening address and port
  For a seamless migration, the original port cannot be reused. The recommended approach is to stop the original service and then start UDS.
- Start the UDS client
Conclusion#
UDS is only a proof-of-concept implementation and is not recommended for large-scale use in production environments. It is also unsuitable for UDP-over-TCP scenarios; if you need UDP traffic, please modify the code to implement it.
All UDS code is released under the MIT license.