F AST Background


Overview

The development of robust and stable ultrascale networking, at 100 Gbps and higher speeds in the wide area, is critical to support the new generation of ultrascale computing and Petabyte to Exabyte datasets that promise to drive discoveries in fundamental and applied sciences of the next decade. Continued advances in computing, communication, and storage technologies, combined with the development of national and global Grid systems, hold the promise of providing the required capacities and an effective environment for computing and science.

The key challenge we face, and intend to overcome, is that some of our current network control and resource sharing algorithms, that are at the foundation of these transformative trends in science and computer science, cannot scale to this regime.

Our goal is to develop theory, algorithms and prototypes to design, demonstrate, and deploy protocols that are scalable to arbitrary network capacity and size in order to fulfill the vision of ultrascale networking.

Ultrascale Network Protocols for Computing and Science in the 21st Century
J. J. Bunn, J. C. Doyle, S. H. Low, H. B. Newman and S. M. Yip, White paper to US Department of Energy's Ultrascale Simulation for Science (USS) initiative, September 2002 (pdf 100K)

Data Intensive Grids and Networks for High Energy and Nuclear Physics
H. B. Newman, InterAct Magazine, September, 2002 (pdf 276K)


Theory

Equilibrium and Dynamics of Current Protocols

Congestion control is a distributed algorithm to share network resources among competing users. It consists of two components: and a network algorithm that updates and feeds back, implicitly or explicitly, congestion information to sources, and a source algorithm that dynamically adjusts rate (or window size) in response to congestion in its path. In the current Internet, the source algorithm is carried out by TCP, and the link algorithm is carried out by (active) queue management (AQM) schemes such as DropTail or RED.

We can interpret TCP/AQM algorithms as distributed asynchronous primal-dual algorithms carried out over the Internet in real-time to solve a utility maximizaiton probelm. All equilibrium properties, such as throughput, loss and delay, or fairness, can be understood in terms of the underlying optimizaiton problem. Different algorithms, AIMD (Reno) or Vegas, differ in the utility functions they implicitly optimize.

A Duality Model of TCP and Queue Management Algorithms
S. H. Low, ITC Specialist Seminar on IP Traffic Measurement, Modeling and Management, September 18-20, 2000, Monterey, CA (USA) (ps 507K). To appear in IEEE/ACM Transactions on Networking, October 2003.


Understanding Vegas: A Duality Model
S. H. Low, Larry Peterson and Limin Wang, Journal of ACM, 49(2):207-235, March 2002 (pdf 793K)

Is the system equilibrium stable? It can be shown that, locally, the equilibrium can become unstable as delay increases, or more strikingly, as network capacity increases!

A Control Theoretic Analysis of RED
C. Hollot, V. Misra, D. Towsley and W. B. Gong, IEEE Inforcom, April 2001. (pdf 295K)

Dynamics of TCP/RED and a Scalable Control
S. H. Low, F. Paganini, J. Wang, S. Adlakha and J. C. Doyle, IEEE Infocom, June 2002 (pdf 435K)

Can we design TCP/AQM algorithms, as simple and decentralized as the current protocols, that maintain stability as networks scale up in capacity in size? To do this, we need to step back and understand the optimization problem TCP/AQM implicitly solves, and the delay and decentralization constriants under which it operates.

Scalable and Robust Protocols

Instead of first designing a TCP/AQM and then deduce the underlying optimization problem, as we did to understand existing protocols, we proceed in the opposite direction: start with an optimization problem and deduce the appropriate algorithms.

The first paper below formulates the TCP/AQM problem as a constrained convex optimization and proposes a `primal algorithm' to solve it. The second paper proposes a `dual algorithm' to solve its Lagrangian dual.

These algorithms are proved to be globally stable either in the absence of delay, or with a conservative bound on stepsize (responsiveness to congestion) in the presence of delay and other asynchronism.

Rate Control in Communication Networks: Shadow Prices, Proportional Fairness and stability
F. P. Kelly, A.K. Maulloo and D. K. H. Tan, Journal of the Operational Research Society 49 (1998), 237-252.

Optimization Flow Control, I: Basic Algorithm and Convergence
S. H. Low and D. E. Lapsley IEEE/ACM Transactions on Networking, 7(6):861-75, Dec. 1999 (pdf 270K)

The original primal algorithm can be enhanced to maintain local stability for arbitrary capacity, under a suitable bound on delay.

On the stability of networks operating TCP-like protocols
Glenn Vinnicombe, IFAC 2002. (pdf 73K)

Robust congestion control for the Internet
Glenn Vinnicombe, submitted for publication (pdf 219K)

The dual algorithm can be specialized to maintain local stability for arbitrary delay and capacity.

Scalable Laws for Stable Network Congestion Control
Fernando Paganini, J. C. Doyle and S. H. Low, IEEE CDC, Orlando, FL, December 2001. (ps 475K)

The primal and dual algorithms are complementary in many ways. The primal algorithms allow general utility functions, and hence arbitrary fairness in rate allocation, but give up tight control on utilization. The dual algorithm, on the other hand, can achieve very high utilization, but is restricted to a specific class of utility functions, and hence fairness in rate allocation.

By adding slow timescale dynamics at links (for primal algorithm) or at sources (for dual algorithm), it is possible to achieve both high utilization and arbitrary fairness.

Stable, Scalable, Fair Congestion Control and AQM Schemes that Achieve High Utilization in the Internet
S. Kunniyur and R. Srikant, Asian Control Conference, September 2002, Singapore (ps 186K)

A new TCP/AQM for stability and performance in fast networks
Fernando Paganini, Zhikui Wang, Steven H. Low, John C. Doyle, Proc. of 39th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, October 2002 (pdf 1.36M)

Stabilized Vegas
D. H. Choe and S. H. Low, Proc. of 39th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, October 2002

Our first FAST implementation will be based on Stabilized Vegas.


Towards an Internet Theory

An emerging theoretical foundation for the Internet is discussed at:

Robustness and Internet


FAST v1: Stabilized Vegas

TCP Vegas was proposed in 1994 by Lawrence Brakmo and Larry Peterson as an alternative to TCP Reno, in

TCP Vegas: End to End Congestion Avoidance on a Global Internet
L. Brakmo and L. Peterson, IEEE Journal on Selected Areas in Communication, 13(8):1465-1480, October 1995

In addition to many enhancements, the most important difference is a new congestion avoidance algorithm that uses queueing delay, as opposed to packet loss as Reno does, as a measure of congestion. This has important equilibrium and stability implications.

The Vegas algorithm implies that it has a log utility function and achieves proportional fairness in equilibrium, see, e.g.,

Understanding Vegas: A Duality Model
S. H. Low, Larry Peterson and Limin Wang, Journal of ACM, 49(2):207-235, March 2002

The implicit link algorithm in Vegas has the right scaling with respect to network capacity. This makes Vegas potentially scalable to high bandwidth, in stark contrast to Reno and its variants which becomes unstable as capacity increases. However, Vegas can become unstable if delay is large; but a simple modification can stabilize it, as explained in:

Stabilized Vegas
D. H. Choe and S. H. Low, Proc. of 39th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, October 2002

Our first implementation will be based on Stabilized Vegas, which works with DropTail routers that are predominate in the current Internet and which has a natural incremental deployment path where some routers are DropTail and others implement a new AQM (active queue management).

More Vegas References


References

Some Background

On the self-similar nature of Ethernet traffic W.E. Leland, M.S. Taqqu, W. Willinger, and D.V. Wilson, IEEE/ACM Transactions on Networking, 1994, vol. 2(1), pp. 1~15. (pdf 566K)

Self-similarity in World Wide Web traffic: Evidence and possible causes M.E. Crovella and A. Bestavros, IEEE/ACM Transactions on Networking, 1997, vol. 5(6), pp. 835~846. (ps 796K)

Scaling phenomena in the Internet: Critically examining criticality Walter Willinger, Ramesh Govindan, Sugih Jamin, Vern Paxson, and Scott Shenker, PNAS 2002 99 Suppl. 1: 2573-2580

Heavy Tails, Generalized Coding, and Optimal Web Layout
Xiaoyun Zhu, Jie Yu and John Doyle, IEEE Inforcom, April 2001 (ps 392K)

End-to-end congestion control: utility functions, random losses and ECN marks S. Kunniyur and R. Srikant, IEEE INFOCOM 2000, Tel Aviv, Israel (ps 2.70M)

End-to-end congestion control for the Internet: delays and stability R. Johari and D. Tan, IEEE/ACM Transactions on Networking 9:818-832, 2001 (ps 1.65M)

Analysis and Design of AQM for Stabilizing TCP
K. B. Kim and S. H. Low, Caltech Technical Report caltechCSTR:2002.009, March, 2002 (pdf 539K)

AIMD, Fairness and Fractal Scaling of TCP Traffic
Francois Baccelli and Dohy Hong, IEEE Infocom, June 2002, New York, NY. (pdf 811K)

The case for a new IP congestion control framework
Tom Kelly, Cambridge University Engineering Department Technical Report, July 2002 (pdf 142K)


Recent Surveys and Links

Control of Communication Networks
R. Srikant, in "Perspectives in Control Engineering: Technologies, Applications, New Directions," Tariq Samad (Ed.), IEEE Press, 2000, pp. 462-488. (ps 1.85M)

Internet Congestion Control
S. H. Low, F. Paganini and J. C. Doyle
IEEE Control Systems Magazine, 22(1):28-43, Feb. 2002 (ps 672K)

Theories and Models for Internet Quality of Service
Victor Firoiu, Jean-Yves Le Boudec, Don Towsley and Zhi-li Zhang, Proceedings of IEEE, May 2002 (pdf 320K)

John Doyle's Complex networks
Frank Kelly's Self-managed Internet
R. Srikant's Congestion Control and Pricing
Jim Kurose and Don Towsley's Computer Networks Research Group
Sally Floyd's Global optimization with end-to-end congestion control
Microsoft Research's Congestion control and differentiated services


Vegas References (Send us links to relevant work)

Vegas Homepage at The University of Arizona, Tucson. Larry Peterson's new homepage at Princeton.

Evaluation of TCP Vegas: emulation and experiment
J. S. Ahn, P. B. Danzig, Z. Liu, and L. Yan, Proceedings of SIGCOMM'95, 1995.

Analysis of TCP Vegas and TCP Reno
O. Ait-Hellal and E. Altman, Telecommunication Systems, 2000 (A short version appeared in IEEE International Conference on Communications (ICC'97) , Montreal, 8-12 june 1997)

Comparison of TCP Reno and TCP Vegas via fluid approximation
Thomas Bonald, Performance Evaulation, 36(37):307-332, 1999

Analysis and comparison of TCP Reno and Vegas
J. Mo, R. La, V. Anantharam, and J. Walrand, Proceedings of IEEE Infocom, March 1999.

A note on the fairness of TCP Vegas
C. Boutremans and J. Y. Le Boudec, Proceedings of International Zurich Seminar on Broadband Communications, pages 163--170, February 2000.

TCP Vegas revisited
U. Hengartner, J. Bolliger, and T. Gross, Proceedings of IEEE Infocom, March 2000.

Fair end-to-end window-based congestion control
Jeonghoon Mo and Jean Walrand. IEEE/ACM Transactions on Networking, 8(5):556--567, October 2000.

Accumulation-based Congestion Control
Yong Xia, David Harrison, Shivkumar Kalyanaraman, Kishore Ramachandran, Arvind Venkatesan, submitted, 2002

A case for TCP Vegas in High Performance Computational Grids
Eric Weigle and Wu-chun Feng, 9th International Symposium on High Performance Distributed Computing (HPDC'01), August, 2001

Neal Cardwell and Boris Bak's Linux implementation (kernel 2.3.x)