This tutorial is for people who want to use TCP-Linux in NS-2 simulations. For instructions on installing TCP-Linux into NS-2, see the TCP-Linux website. For general NS-2 tutorials, see the NS-2 website.
WARNING: You also need to set the "window_" option of the TCP agent to a value large enough to see the performance difference. "window_" is the upper bound of the congestion window in a TCP agent. It is 20 by default.
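To see why the default of 20 matters, the arithmetic can be sketched in C. This is only an illustration: the helper names are hypothetical, the 1000-byte packet size is an assumption, and the 20 ms round-trip time assumes a 10 ms one-way link delay with queueing and transmission time ignored.

```c
#include <assert.h>

/* Hypothetical helper: bandwidth-delay product in packets, i.e. how many
 * packets must be in flight to keep the link full. */
unsigned long long bdp_packets(unsigned long long link_bps,
                               unsigned rtt_ms, unsigned pkt_bytes)
{
    assert(pkt_bytes > 0);
    /* bits in flight = link_bps * rtt_ms / 1000; one packet = 8 * pkt_bytes bits */
    return link_bps * rtt_ms / (8ULL * 1000ULL * pkt_bytes);
}

/* Hypothetical helper: throughput ceiling in bits per second when the
 * congestion window cannot grow beyond window_pkts. */
unsigned long long window_limited_bps(unsigned window_pkts,
                                      unsigned rtt_ms, unsigned pkt_bytes)
{
    assert(rtt_ms > 0);
    /* one window of data is delivered per round trip */
    return 8ULL * 1000ULL * window_pkts * pkt_bytes / rtt_ms;
}
```

Under these assumptions a 100Mb link with a 20 ms round trip needs about 250 packets in flight, while a window capped at 20 packets tops out near 8 Mb/s, so any two congestion control algorithms would look identical.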
| A script for TCP-Sack1 | A script for TCP-Linux using Highspeed TCP (hstcp) |
| #Create a simulator object
set ns [new Simulator]
#Create two nodes and a link
set bs [$ns node]
set br [$ns node]
$ns duplex-link $bs $br 100Mb 10ms DropTail
#setup sender side
set tcp [new Agent/TCP/Sack1]
$tcp set timestamps_ true
$tcp set windowOption_ 8
$ns attach-agent $bs $tcp
#set up receiver side
set sink [new Agent/TCPSink/Sack1]
$sink set ts_echo_rfc1323_ true
$ns attach-agent $br $sink
#logical connection
$ns connect $tcp $sink
#Setup a FTP over TCP connection
set ftp [new Application/FTP]
$ftp attach-agent $tcp
$ftp set type_ FTP
#Schedule the life of the FTP
$ns at 0 "$ftp start"
$ns at 10 "$ftp stop"
#Schedule the stop of the simulation
$ns at 11 "exit 0"
#Start the simulation
$ns run |
#Create a simulator object
set ns [new Simulator]
#Create two nodes and a link
set bs [$ns node]
set br [$ns node]
$ns duplex-link $bs $br 100Mb 10ms DropTail
#setup sender side
set tcp [new Agent/TCP/Linux]
$tcp set timestamps_ true
$ns at 0 "$tcp select_ca highspeed"
$ns attach-agent $bs $tcp
#set up receiver side
set sink [new Agent/TCPSink/Sack1]
$sink set ts_echo_rfc1323_ true
$ns attach-agent $br $sink
#logical connection
$ns connect $tcp $sink
#Setup a FTP over TCP connection
set ftp [new Application/FTP]
$ftp attach-agent $tcp
$ftp set type_ FTP
#Schedule the life of the FTP
$ns at 0 "$ftp start"
$ns at 10 "$ftp stop"
#Schedule the stop of the simulation
$ns at 11 "exit 0"
#Start the simulation
$ns run |
If all the TCP/Linux flows in the simulation share the same parameter values, use the "set_ca_default_param" command to change the default parameters at any time. When any flow calls this command, all other flows see the change too.
The format of the command is:
<tcp instance> set_ca_default_param <algorithm name> <parameter name> <new value>
To print out the current default value of a parameter, use the command get_ca_default_param:
<tcp instance> get_ca_default_param <algorithm name> <parameter name>
If a particular flow has to use a different value for a parameter, use the "set_ca_param" command to change the local value of that parameter at any time. This command may slow down the simulation.
The format of the command is:
<tcp instance> set_ca_param <algorithm name> <parameter name> <new value>
To print out the current local value of a parameter, use the command get_ca_param:
<tcp instance> get_ca_param <algorithm name> <parameter name>
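The relationship between default and local values can be pictured with a small C model. This is a sketch of the semantics described above, not the TCP-Linux implementation: the struct and function names are hypothetical, and it assumes that a flow with a local value ignores later changes to the default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one default value shared by every flow ... */
struct ca_param_default {
    int value;
};

/* ... and an optional override owned by a single flow. */
struct ca_param_local {
    bool overridden;
    int value;
};

/* Models set_ca_default_param: every flow without an override sees it. */
void model_set_default(struct ca_param_default *d, int v)
{
    assert(d != NULL);
    d->value = v;
}

/* Models set_ca_param: only the calling flow is affected. */
void model_set_local(struct ca_param_local *l, int v)
{
    assert(l != NULL);
    l->overridden = true;
    l->value = v;
}

/* The value a flow actually uses: its override if set, else the default. */
int model_effective(const struct ca_param_default *d,
                    const struct ca_param_local *l)
{
    assert(d != NULL && l != NULL);
    return l->overridden ? l->value : d->value;
}
```

In this model, a default change made through one flow is visible to all flows that never called the local setter, which matches the behavior the commands above describe.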
The following table shows an example of changing parameters in TCP Vegas.
At time 3 sec, the Vegas parameters (both alpha and beta) are changed to 40. A value of 40 is equivalent to 20 packets because Vegas uses the last bit of each parameter as a fractional bit to preserve accuracy.
Note that this is a global change to the Vegas parameters. Every TCP-Linux flow that runs TCP-Vegas without per-connection parameters is affected by it. In the following example, only tcp(1) changes the default parameters, but tcp(2) sees the new values too.
At time 6 sec, tcp(3) changes its local alpha and beta to 20 (equivalent to 10 packets).
Because its alpha is smaller than that of the other flows, tcp(3) sees lower throughput from 6 sec onward.
At the bottleneck queue, the queue length is around 9 packets from 0 to 3 seconds, around 60 packets from 3 to 6 seconds (three flows at 20 packets each), and around 50 packets from 6 to 10 seconds (two flows at 20 packets plus one at 10).
| A script for TCP-Linux using TCP-Vegas (vegas) |
| #Create a simulator object
set ns [new Simulator]
#Create a bottleneck link.
set router_snd [$ns node]
set router_rcv [$ns node]
$ns duplex-link $router_snd $router_rcv 10Mb 10ms DropTail
$ns queue-limit $router_snd $router_rcv 10000
# Create three flows sharing the bottleneck link
for {set i 1} {$i <= 3} {incr i 1} {
    #Create the sending nodes, the receiving nodes.
    set bs($i) [$ns node]
    $ns duplex-link $bs($i) $router_snd 100Mb 1ms DropTail
    set br($i) [$ns node]
    $ns duplex-link $router_rcv $br($i) 100Mb 1ms DropTail
    #setup sender side
    set tcp($i) [new Agent/TCP/Linux]
    $tcp($i) set timestamps_ true
    $tcp($i) set window_ 100000
    $ns at 0 "$tcp($i) select_ca vegas"
    $ns attach-agent $bs($i) $tcp($i)
    #set up receiver side
    set sink($i) [new Agent/TCPSink/Sack1]
    $sink($i) set ts_echo_rfc1323_ true
    $ns attach-agent $br($i) $sink($i)
    #logical connection
    $ns connect $tcp($i) $sink($i)
    #Setup a FTP over TCP connection
    set ftp($i) [new Application/FTP]
    $ftp($i) attach-agent $tcp($i)
    $ftp($i) set type_ FTP
    #Schedule the life of the FTP
    $ns at 0 "$ftp($i) start"
    $ns at 10 "$ftp($i) stop"
}
#change default parameters, all TCP/Linux will see the changes!
$ns at 3 "$tcp(1) set_ca_default_param vegas alpha 40"
$ns at 3 "$tcp(1) set_ca_default_param vegas beta 40"
# confirm the changes by printing the parameter values (optional).
# Note that tcp(2) can see the change of the default value even though the change is made by tcp(1)
$ns at 3 "$tcp(2) get_ca_default_param vegas alpha"
$ns at 3 "$tcp(2) get_ca_default_param vegas beta"
# change local parameters, only tcp(3) is affected.
$ns at 6 "$tcp(3) set_ca_param vegas alpha 20"
$ns at 6 "$tcp(3) set_ca_param vegas beta 20"
# confirm the changes by printing the parameter values (optional)
$ns at 6 "$tcp(3) get_ca_param vegas alpha"
$ns at 6 "$tcp(3) get_ca_param vegas beta"
#Schedule the stop of the simulation
$ns at 11 "exit 0"
#Start the simulation
$ns run |
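The queue lengths quoted for the Vegas example follow from simple arithmetic, sketched here in C. The function names are hypothetical, and the sketch assumes that "the last bit" means one fractional bit (so packets = parameter / 2) and that each flow keeps roughly its target number of packets queued at the bottleneck.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the lowest bit of a Vegas parameter is a fractional bit,
 * so the packet count is the parameter shifted right by one. */
unsigned vegas_param_to_packets(unsigned param)
{
    return param >> 1;
}

/* Rough bottleneck backlog: each flow contributes its own target. */
unsigned expected_queue_packets(const unsigned *params, size_t nflows)
{
    unsigned total = 0;
    assert(params != NULL || nflows == 0);
    for (size_t i = 0; i < nflows; i++)
        total += vegas_param_to_packets(params[i]);
    return total;
}
```

With the values from the script, three flows at 40 give 60 packets, and two flows at 40 plus one at 20 give 50 packets, matching the figures in the text for the 3 to 6 second and 6 to 10 second intervals.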
The patch lets a simulation change Linux parameters (outside the congestion control modules) in the same way as the parameters of the congestion control modules.
The Linux system is regarded as a special module named "linux". Hence, get_ca_default_param, set_ca_default_param, get_ca_param, and set_ca_param
can also tune the Linux parameters. For example:
<tcp instance> set_ca_default_param linux sysctl_tcp_abc 0
turns off the ABC option in the Linux system.
All the Linux system variables are listed in tcp/linux/ns-linux-param.c. The following table summarizes the parameters currently available:
| Variable Name | Default value | Description |
| sysctl_tcp_abc | 1 | 0: Turn off Appropriate Byte Counting (ABC); 1: Turn on ABC. Turn on for faster cwnd growth in bulk transfer. |
| tcp_max_burst | 3 | The maximum number of packets that can be sent back-to-back during loss recovery. This parameter controls the maximum burst size. |
| debug_level | 1 | The verbosity level of debug messages. 0: print everything including INFO; 1: print ERROR and NOTICE; 2: print ERROR only |
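One common way a burst limit such as tcp_max_burst takes effect is by clamping the congestion window to the packets already in flight plus the burst allowance. The C sketch below illustrates that idea only. The function name is hypothetical, and whether TCP-Linux applies the limit in exactly this way is an assumption.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical illustration: never let the window allow more than
 * max_burst new packets beyond what is already in flight. */
unsigned moderate_cwnd(unsigned cwnd, unsigned in_flight, unsigned max_burst)
{
    unsigned limit;
    assert(in_flight <= UINT_MAX - max_burst); /* guard against overflow */
    limit = in_flight + max_burst;
    return cwnd < limit ? cwnd : limit;
}
```

For example, with 10 packets in flight and a burst limit of 3, a window of 100 would be clamped to 13, while a window of 5 would be left unchanged.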
You might encounter one of the following problems in the last step:
| Naive Reno (u32 in the code is equivalent to unsigned int) |
| /*
This is a very naive Reno implementation, shown as an example of how to
develop a new congestion control algorithm with TCP-Linux. */
/* This file itself should be copied to the tcp/linux/ directory. */
/* For the compiler to compile this file, an entry "tcp/linux/<NameOfThisFile>.o" should be added to the Makefile */
/* This definition lets the compiler know the name of this protocol */
#define NS_PROTOCOL "tcp_naive_reno.c"
/* These two header files link your implementation to TCP-Linux */
#include "ns-linux-c.h"
#include "ns-linux-util.h"
/* Define a parameter alpha for the AI parameter */
static int alpha = 1;
/* Declare alpha as a parameter */
module_param(alpha, int, 0644);
/* Declare the explanation for alpha */
MODULE_PARM_DESC(alpha, "AI increment size of window (in unit of pkt/round trip time)");
/* Define a parameter beta for the MD parameter */
static int beta = 2;
/* Declare beta as a parameter */
module_param(beta, int, 0644);
/* Declare the explanation for beta */
MODULE_PARM_DESC(beta, "MD decrement portion of window: every loss the window is reduced by a proportion of 1/beta");
/* This is equivalent to opencwnd in other NS-2 implementations. */
/* This function increases the congestion window for each acknowledgment */
void tcp_naive_reno_cong_avoid(struct tcp_sock *tp, u32 ack, u32 rtt, u32 in_flight, int flag)
{
    if (tp->snd_cwnd < tp->snd_ssthresh) {
        tp->snd_cwnd++;
    } else {
        if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
            tp->snd_cwnd += alpha;
            tp->snd_cwnd_cnt = 0;
            if (tp->snd_cwnd > tp->snd_cwnd_clamp)
                tp->snd_cwnd = tp->snd_cwnd_clamp;
        } else {
            tp->snd_cwnd_cnt++;
        }
    }
}
/* This function returns the slow-start threshold after a loss. */
/* With the default beta of 2, ssthresh is half of the congestion window after a loss */
u32 tcp_naive_reno_ssthresh(struct tcp_sock *tp)
{
    int reduction = tp->snd_cwnd / beta;
    return max(tp->snd_cwnd - reduction, 2U);
}
/* This function returns the congestion window after a loss -- it is called AFTER the function ssthresh (above) */
/* The congestion window should be equal to the slow-start threshold (after the slow-start threshold is set to half of the cwnd before the loss). */
u32 tcp_naive_reno_min_cwnd(struct tcp_sock *tp)
{
    return tp->snd_ssthresh;
}
/* a constant record for this congestion control algorithm */
static struct tcp_congestion_ops tcp_naive_reno = {
    .name = "naive_reno",
    .ssthresh = tcp_naive_reno_ssthresh,
    .cong_avoid = tcp_naive_reno_cong_avoid,
    .min_cwnd = tcp_naive_reno_min_cwnd
};
/* defines an initialization function */
int tcp_naive_reno_register(void)
{
    tcp_register_congestion_control(&tcp_naive_reno);
    return 0;
}
/* declare the initialization function */
module_init(tcp_naive_reno_register); |
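To trace what the naive Reno logic above does without building NS-2, the same window arithmetic can be restated as a standalone C sketch. Everything here is a stand-in for illustration: mini_sock is a hypothetical struct holding only the four fields the module touches, the function names are invented, and ALPHA and BETA are fixed at the module defaults.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int u32;

/* Hypothetical stand-in for struct tcp_sock: only the fields used here. */
struct mini_sock {
    u32 snd_cwnd;
    u32 snd_ssthresh;
    u32 snd_cwnd_cnt;
    u32 snd_cwnd_clamp;
};

enum { ALPHA = 1, BETA = 2 }; /* the module's default alpha and beta */

/* Same window growth as tcp_naive_reno_cong_avoid, per acknowledgment. */
void naive_reno_on_ack(struct mini_sock *tp)
{
    assert(tp != NULL);
    if (tp->snd_cwnd < tp->snd_ssthresh) {
        tp->snd_cwnd++;                       /* slow start */
    } else if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
        tp->snd_cwnd += ALPHA;                /* additive increase */
        tp->snd_cwnd_cnt = 0;
        if (tp->snd_cwnd > tp->snd_cwnd_clamp)
            tp->snd_cwnd = tp->snd_cwnd_clamp;
    } else {
        tp->snd_cwnd_cnt++;                   /* count acks toward next step */
    }
}

/* Same reduction as the ssthresh callback followed by min_cwnd. */
void naive_reno_on_loss(struct mini_sock *tp)
{
    u32 reduced;
    assert(tp != NULL);
    reduced = tp->snd_cwnd - tp->snd_cwnd / BETA;
    tp->snd_ssthresh = reduced > 2U ? reduced : 2U;
    tp->snd_cwnd = tp->snd_ssthresh;
}
```

Starting from a window of 2 with a threshold of 8, six acknowledgments bring the window to 8, nine more raise it to 9, and a loss then sets both the threshold and the window to 5.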
| Variable name | Type (32-bit by default) | Meaning | Equivalent in existing NS-2 TCP |
| snd_nxt | unsigned | The sequence number of the next byte that TCP is going to send. | t_seqno_*size_ |
| snd_una | unsigned | The sequence number of the next byte for which TCP is waiting for an acknowledgment. | (highest_ack_+1)*size_ |
| mss_cache | unsigned | The size of a packet. | size_ |
| srtt | unsigned | 8 times the smoothed RTT. | t_srtt_ |
| rx_opt.rcv_tsecr | unsigned | Value of the timestamp echoed by the last acknowledgment. | ts_echo_ |
| rx_opt.saw_tstamp | bool | Whether a timestamp was seen in the last acknowledgment. | !hdr_flags::access(pkt)->no_ts_ |
| snd_ssthresh | unsigned | Slow-start threshold. | ssthresh_ |
| snd_cwnd | unsigned | Congestion window. | trunc(cwnd_) |
| snd_cwnd_cnt | unsigned (16-bit) | Fraction of the congestion window that has not yet accumulated to 1. | trunc(cwnd_*cwnd_)%cwnd_ |
| snd_cwnd_clamp | unsigned (16-bit) | Upper bound of the congestion window. | wnd_ |
| snd_cwnd_stamp | unsigned | The last time the congestion window was changed (to detect idling and other situations). | n/a |
| bytes_acked | unsigned | The number of bytes acknowledged by the last acknowledgment (for ABC). | n/a |
| icsk_ca_state | unsigned (8-bit) | The current congestion control state, which can be one of the following: TCP_CA_Open: normal state; TCP_CA_Recovery: loss recovery after a fast retransmit; TCP_CA_Loss: loss recovery after a timeout. (The following two states are not effective in TCP-Linux but are effective in Linux.) TCP_CA_Disorder: duplicate packets detected but the threshold not yet reached, so TCP assumes that packet reordering is happening; TCP_CA_CWR: the congestion window is decreasing (after local congestion in the NIC, ECN, etc.). | n/a |
| icsk_ca_priv | unsigned[16] | Private data of the congestion control algorithm for this flow. | n/a |
| icsk_ca_ops | struct tcp_congestion_ops* | A pointer to the congestion control algorithm structure for this flow. | n/a |
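A few rows of the table above can be read as conversion formulas, sketched below in C. The two structs and the function names are hypothetical containers for illustration. Only the formulas come from the table, and the unit of srtt is deliberately left unspecified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the NS-2 side (names follow the table). */
struct ns2_state {
    unsigned t_seqno;   /* next packet number to send */
    int highest_ack;    /* highest packet number acknowledged */
    unsigned size;      /* packet size in bytes */
    double cwnd;        /* congestion window in packets, fractional */
};

/* Hypothetical snapshot of the Linux-style view. */
struct linux_view {
    unsigned snd_nxt;
    unsigned snd_una;
    unsigned mss_cache;
    unsigned snd_cwnd;
};

/* Apply the table's equivalences: packet numbers become byte sequence
 * numbers, and the fractional window is truncated. */
struct linux_view to_linux_view(const struct ns2_state *s)
{
    struct linux_view v;
    assert(s != NULL && s->highest_ack >= -1 && s->cwnd >= 0.0);
    v.snd_nxt = s->t_seqno * s->size;
    v.snd_una = (unsigned)(s->highest_ack + 1) * s->size;
    v.mss_cache = s->size;
    v.snd_cwnd = (unsigned)s->cwnd; /* trunc(cwnd_) */
    return v;
}

/* srtt holds 8 times the smoothed RTT, so shift right by 3 to recover it. */
unsigned smoothed_rtt_from_srtt(unsigned srtt)
{
    return srtt >> 3;
}
```

For instance, with 1000-byte packets, a next packet number of 10, and a highest acknowledged packet of 4, snd_nxt is 10000 and snd_una is 5000.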
| Function name | Explanation |
| cong_avoid | This function is called every time an acknowledgment is received and the congestion window can be increased. It is equivalent to opencwnd in tcp.cc. ack is the number of bytes acknowledged by the latest acknowledgment; rtt is the RTT measured by the latest acknowledgment; in_flight is the number of packets in flight before the latest acknowledgment; good_ack indicates whether the current situation is normal (no duplicate ACK, no loss, and no SACK): 1 for normal, 0 for dubious. |
| ssthresh | This function is called when the TCP flow detects a loss. It returns the slow-start threshold of the flow after a packet loss is detected. |
| min_cwnd | This function is called when the TCP flow detects a loss, after ssthresh has been called. It returns the congestion window of the flow after a packet loss is detected. For many algorithms this equals ssthresh, but some algorithms might set min_cwnd smaller than ssthresh; in that case, there will be a slow start after loss recovery. |
| undo_cwnd | This function returns the congestion window of a flow after a false loss detection (due to a false timeout or packet reordering) is confirmed. It is not effective in the current version of TCP-Linux. |
| rtt_sample | This function is called when a new RTT sample is obtained. It is mainly used by delay-based congestion control algorithms, which usually need accurate timestamps. usrtt is the RTT value in microseconds (us). |
| set_state | This function is called when the congestion state of the TCP flow changes. newstate is the state code of the state that TCP is about to enter. The possible states are listed in the data structure interface table. The call notifies the congestion control algorithm and is used by some algorithms that turn off their special control during loss recovery. |
| cwnd_event | This function is called when there is an event that might be of interest to the congestion control algorithm. ev is the congestion event code. The possible events are: CA_EVENT_FAST_ACK: an acknowledgment in sequence is received; CA_EVENT_SLOW_ACK: an acknowledgment not in sequence is received; CA_EVENT_TX_START: first transmission when no packet is in flight; CA_EVENT_CWND_RESTART: congestion window is restarted; CA_EVENT_COMPLETE_CWR: congestion window recovery is finished; CA_EVENT_FRTO: fast recovery timeout happens; CA_EVENT_LOSS: retransmission timeout happens. |
| pkts_acked | This function is called when an acknowledgment acknowledges some new packets. num_acked is the number of packets acknowledged by this acknowledgment. last is the time (in microseconds) when the latest acked packet was sent. A value of 0 means no timestamp measurement was collected for this acked packet. |
| init | This function is called after the first acknowledgment is received and before the congestion control algorithm is called for the first time. If the congestion control algorithm has private data, it should initialize its private data here. |
| release | This function is called when the flow finishes. If the congestion control algorithm has allocated additional memory beyond the 16 unsigned int of icsk_ca_priv, it should free the additional memory here to avoid a memory leak. |
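The init, rtt_sample, and icsk_ca_priv entries above fit together as in the following C sketch, which keeps a minimum-RTT estimate in the private area. It is an illustration under stated assumptions: mini_icsk is a hypothetical stand-in for the real connection structure, the callback signatures are simplified, and the my_ names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int u32;

/* Hypothetical stand-in for the connection structure: only the
 * 16-word private area described in the table. */
struct mini_icsk {
    u32 icsk_ca_priv[16];
};

/* Private per-flow data of a hypothetical delay-based algorithm. */
struct my_ca_priv {
    u32 min_rtt_us; /* smallest RTT sample seen so far, in microseconds */
    u32 samples;    /* number of RTT samples received */
};

/* Compile-time check that the private data fits in 16 unsigned ints. */
typedef char my_ca_priv_fits[
    (sizeof(struct my_ca_priv) <= 16 * sizeof(u32)) ? 1 : -1];

static struct my_ca_priv *my_priv(struct mini_icsk *sk)
{
    assert(sk != NULL);
    return (struct my_ca_priv *)sk->icsk_ca_priv;
}

/* Plays the role of init: reset private data before first use. */
void my_init(struct mini_icsk *sk)
{
    struct my_ca_priv *p = my_priv(sk);
    memset(sk->icsk_ca_priv, 0, sizeof sk->icsk_ca_priv);
    p->min_rtt_us = 0xffffffffU; /* "no sample yet" */
}

/* Plays the role of rtt_sample: usrtt is the RTT in microseconds. */
void my_rtt_sample(struct mini_icsk *sk, u32 usrtt)
{
    struct my_ca_priv *p = my_priv(sk);
    if (usrtt < p->min_rtt_us)
        p->min_rtt_us = usrtt;
    p->samples++;
}

u32 my_min_rtt(struct mini_icsk *sk)
{
    return my_priv(sk)->min_rtt_us;
}
```

Because this state fits inside the 16-word private area, no extra memory is allocated, so a release function would have nothing to free.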
Please check the known Linux bugs page to make sure the problem really lies in the algorithm and is not a bug in the Linux implementation.
This work is inspired and greatly helped by Prof. Pei Cao at Stanford and by Prof. Steven Low at Caltech. Many thanks to them!