The Internet Quota System
This document is historical, and some parts of it are out of date. The broad principles of the model described below remain unchanged from their initial implementation in 1999. There have, however, been some changes (for instance, quotas are significantly higher than the 50MB mentioned below). For up-to-date information, see http://www.ru.ac.za/quota/.
Background
Measures used to manage Internet access apply mainly to World Wide Web traffic. Other basic services, such as electronic mail and newsgroup access, are not subject to restrictions. Note, however, that the delivery and receipt of email messages to and from off-campus sites can be seriously disrupted when our Internet access circuits are overloaded.
The measures are primarily intended to control access from the relatively small number of users of the service who consume the most bandwidth. As the mechanisms for implementing charge-back costing schemes do not exist at present, and the overhead of running such schemes is considered too high, there are no user-directed funding implications. The cost of providing an Internet service to the Rhodes community is funded on a central basis.
The management mechanisms rely on quotas, which are used to control web use on a per-PC and/or per-user basis.
The Quota Model
One of the inherent difficulties in implementing an access mechanism based on the content of web access (i.e. by restricting access to certain domains and/or certain types of traffic) is that it is hard to reach lasting agreement on which sites or content should be filtered. It also means ongoing "tinkering" with the details of the restrictions. The original quota model overcame these difficulties by allowing completely free access to almost any site and content, while allocating a quota of traffic to each device: the quota was tied to the PC being used rather than to an individual user.
This mechanism works well for so-called "accountable" PCs, that is, those used by one particular individual. In the case of laboratory PCs, which can be used by anyone, a quota based on the PC is less useful. To compensate for this shortcoming, additional restrictions had to be imposed on lab PCs to control the subtle and not-so-subtle abuses made possible by anonymity and the ability to hop from PC to PC. However, at the beginning of the 2000 academic year, a different quota model for PC laboratories is being implemented, based on per-user accounting. This will, hopefully, eliminate abuses and allow completely unrestricted access from labs configured to provide per-user accounting information.
All incoming web traffic, because it is forced through the web cache system, is logged. These logs contain the network number of the PC to which the pages are being delivered, and the number of bytes being delivered. Additionally, in the case of certain lab PCs, each log entry can contain the login id of the person using the PC. It has been relatively easy to develop daily summary procedures which examine the log files and produce statistics on the amount of web data retrieved by each PC or user. This summary includes all types of data (text, images, audio files, video files, etc.).
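The daily summary step can be sketched roughly as follows. This is only an illustration: the four-field log line used here (address, login id or "-", cache status, byte count) is a hypothetical simplification, not the actual format of the Rhodes cache logs, and the function name is likewise invented for the example.

```python
from collections import defaultdict

def summarise_log(lines):
    """Sum un-cached bytes per client from simplified log lines.

    Assumed (hypothetical) line format: "<address> <login or -> <HIT|MISS> <bytes>".
    Traffic is attributed to the login id when one is present, else to the PC address.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) != 4 or not fields[3].isdigit():
            continue  # skip malformed lines
        address, login, status, size = fields
        if status == "HIT":
            continue  # served from the cache, so not counted as incoming traffic
        key = login if login != "-" else address
        totals[key] += int(size)
    return dict(totals)

# Made-up sample entries (192.0.2.x is a documentation-only address range).
sample = [
    "192.0.2.5 - MISS 30000",
    "192.0.2.5 - HIT 99999",
    "192.0.2.7 alice MISS 12000",
    "192.0.2.5 - MISS 5000",
]
daily = summarise_log(sample)
```

Keying on the login id where one exists mirrors the mixed per-PC and per-user accounting described above.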
Detailed reports are produced from these logs. Each consists of the day's total incoming un-cached traffic per client PC, server or user. The report is then averaged with the corresponding reports for the previous several days (either one or two weeks' worth of data). The accumulated and averaged data is sorted, and the output is used to generate two lists of PC network addresses or user names. The first, or "heavy usage", list contains the network names of PCs or users whose accumulated weekly usage would exceed approximately 50 megabytes. The second, or "high usage", list contains the network names of PCs or users whose weekly usage would exceed about 20 megabytes.
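As a rough sketch of the averaging and classification step, the following assumes that "weekly usage" means the average daily total multiplied by seven, that a megabyte is 1,000,000 bytes, and that a client on the "heavy usage" list is not also placed on the "high usage" list. None of these details is stated in the original description, and the function and variable names are illustrative.

```python
MB = 1_000_000            # assumption: decimal megabytes
HEAVY_WEEKLY = 50 * MB    # "heavy usage" threshold from the text
HIGH_WEEKLY = 20 * MB     # "high usage" threshold from the text

def classify(daily_reports):
    """Return (heavy, high) client lists from a list of {client: bytes} daily reports."""
    days = len(daily_reports)
    if days == 0:
        return [], []
    totals = {}
    for report in daily_reports:
        for client, size in report.items():
            totals[client] = totals.get(client, 0) + size
    heavy, high = [], []
    # Sort by accumulated usage, largest first, as the report does.
    for client, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        weekly = total / days * 7  # projected weekly usage from the daily average
        if weekly > HEAVY_WEEKLY:
            heavy.append(client)
        elif weekly > HIGH_WEEKLY:
            high.append(client)
    return heavy, high

# Two days of made-up reports.
reports = [
    {"pc-a": 10 * MB, "pc-b": 4 * MB, "pc-c": 1 * MB},
    {"pc-a": 8 * MB, "pc-b": 3 * MB},
]
heavy, high = classify(reports)
```

Here pc-a averages 9 MB a day (63 MB a week) and lands on the heavy list, pc-b averages 3.5 MB a day (24.5 MB a week) and lands on the high list, and pc-c appears on neither.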
On their own, these lists would not be of much use. However, there are mechanisms on the Rhodes web cache/proxy server which make it possible to set up "delay pools" of PCs or login names. The web proxy system can thus control the effective bandwidth or rate at which web pages are delivered to PCs or users in a given delay pool. There are two pools associated with the quota model in use at Rhodes:
- The "heavy usage" class will deliver un-cached pages larger than 4 KBytes at a maximum rate of 1.2 KBytes/second to a PC, ensuring that the total bandwidth of all PCs or users in this class does not exceed 16 Kbits/second.
- The "high usage" class will deliver un-cached pages larger than 8 KBytes at a maximum rate of 2.4 KBytes/second to a PC, ensuring that the total bandwidth of all PCs or users in this class does not exceed 32 Kbits/second.
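To get a feel for what the two classes above mean for a single throttled PC, the per-PC limit can be modelled as a simple token bucket: an initial allowance (4 or 8 KBytes) is delivered at full speed, after which the allowance refills at the capped rate. This is one plausible reading of the figures, not a description of the proxy's actual internals. It assumes 1 KByte = 1000 bytes, steps in whole seconds, and ignores the class-wide aggregate cap; the function name is invented for the example.

```python
def seconds_to_deliver(size, restore, maximum):
    """Whole seconds of throttling delay to deliver `size` bytes to one PC.

    restore: bytes added to the bucket per second (the capped rate)
    maximum: bucket capacity in bytes (the size delivered before throttling bites)
    """
    if restore <= 0:
        raise ValueError("restore rate must be positive")
    level = maximum  # the bucket starts full
    sent = 0
    seconds = 0
    while sent < size:
        allowed = min(level, size - sent)
        level -= allowed
        sent += allowed
        if sent < size:
            # Wait one second while the bucket refills, up to its capacity.
            level = min(maximum, level + restore)
            seconds += 1
    return seconds

HEAVY = {"restore": 1200, "maximum": 4000}  # 1.2 KBytes/s after the first 4 KBytes
HIGH = {"restore": 2400, "maximum": 8000}   # 2.4 KBytes/s after the first 8 KBytes

small_page = seconds_to_deliver(3000, **HEAVY)
heavy_delay = seconds_to_deliver(10000, **HEAVY)
high_delay = seconds_to_deliver(10000, **HIGH)
```

Under these assumptions a 3 KByte page is not delayed at all, while a 10 KByte page is held back for 5 seconds in the "heavy usage" class and 1 second in the "high usage" class. The aggregate caps (16 and 32 Kbits/second, that is, 2000 and 4000 bytes/second) would slow delivery further whenever several PCs in the same class download at once.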
Apart from occasional fine tuning of the parameters associated with various thresholds, this quota control system runs completely automatically. There is no regular intervention required from IT systems staff, and there is no need to introduce administratively unwieldy mechanisms for implementing cost recovery from users.
Implementation
A client-based accounting system as described above has been running for most of 1999, and a mixed client- and user-based accounting system will be used in 2000. The vast majority of users will not be restricted by these measures - in fact, they will benefit from them. Internet response times for web users have improved over the last year - not as a result of spending more and more money on bandwidth, but as a result of large-volume users being restricted as described.
The lists of PCs affected are posted to the ru.stats Usenet newsgroup with a subject similar to "Quota - averaged WWW client stats". User-related information is posted in articles with a subject similar to "Quota - averaged WWW user stats".
The system described will be run on an ongoing basis, with continual monitoring and evaluation of its effect on network access and response time. If the lab-based user accounting is successful, some of the other access restrictions in place for the labs will be removed.
A complete description of other internet access restrictions currently in place (which are based on content, domains, timeslots, and physical location of the PC) is available here.


