Comparison of Squid 1.1.10 and 1.NOVM.10

Background

As you probably know, Squid (1.1.x) tends to use a lot of virtual memory (VM) to operate well. The two largest uses of VM are object metadata and in-transit object data. Squid uses about 120 bytes/object for the metadata. This includes the URL, timestamps, and the index to the disk file. As objects are retrieved from Web servers, Squid holds the object data (i.e. the "body") in VM until the transfer is complete. At that point, if the object is cachable, Squid begins writing the data to disk. Otherwise, the object is simply destroyed. This avoids any disk activity for uncachable objects.
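
To get a feel for the metadata cost, the 120 bytes/object figure can simply be multiplied out. The following is a rough sketch only; the object counts are hypothetical examples, not measurements from this experiment.

```python
# Rough estimate of the VM needed for object metadata.
# BYTES_PER_OBJECT comes from the text above; the object counts
# used below are hypothetical, chosen only for illustration.
BYTES_PER_OBJECT = 120


def metadata_bytes(num_objects):
    """Return the approximate metadata footprint in bytes."""
    return num_objects * BYTES_PER_OBJECT


def metadata_megabytes(num_objects):
    """Same estimate expressed in megabytes (2**20 bytes)."""
    return metadata_bytes(num_objects) / (1024 * 1024)


for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9} objects: about {metadata_megabytes(n):.1f} MB of metadata")
```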

The use of VM for in-transit objects is problematic for very busy caches, or for very large objects. There is a workaround for large objects: we can free the memory below the lowest offset that any client is still reading from. Because the beginning of the object is then no longer held in memory when the transfer completes, this also means that large objects are uncachable.
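
The large-object workaround can be pictured with a small buffer class. This is a hypothetical sketch of the idea, not Squid's actual data structure; the class and method names are invented for illustration.

```python
class InTransitObject:
    """In-memory buffer for an object being relayed to several clients.

    Hypothetical sketch: bytes below the lowest client read offset are
    discarded, so the complete object is never available to write to disk.
    """

    def __init__(self):
        self.base = 0              # absolute offset of buffer[0]
        self.buffer = bytearray()  # bytes still held in memory
        self.client_offsets = {}   # client id -> absolute read offset

    def add_client(self, client_id):
        # Simplification: a client starts at the first byte still held.
        self.client_offsets[client_id] = self.base

    def append(self, data):
        """Store more data as it arrives from the server."""
        self.buffer += data

    def read(self, client_id, max_bytes):
        """Return up to max_bytes for this client and advance its offset."""
        start = self.client_offsets[client_id] - self.base
        chunk = bytes(self.buffer[start:start + max_bytes])
        self.client_offsets[client_id] += len(chunk)
        return chunk

    def free_consumed(self):
        """Drop bytes that every client has already read; return the count."""
        if not self.client_offsets:
            return 0
        lowest = min(self.client_offsets.values())
        freed = lowest - self.base
        del self.buffer[:freed]
        self.base = lowest
        return freed
```

Memory use is then bounded by the gap between the newest data and the slowest reader, rather than by the object size.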

``NOVM'' Version

A few months ago, I began development of a branch version called ``NOVM.'' This version does not use VM for in-transit object data, but does still hold the entire cache metadata in VM. All objects are written to disk as they are retrieved from Web servers or neighbor caches. This version essentially trades off memory for file descriptors.

How do the two versions compare?

One might expect that the 1.NOVM.x version does not perform as well as 1.1.x, presumably because the object data can be accessed very quickly from virtual memory. In other words, the NOVM version might perform less well because it uses the disk for all transfers.

What is a good measure of performance? A number of things come to mind:

For this (simple) experiment, I have focused on the service-time as a measure of performance. This is simply the amount of elapsed time from client connection establishment to connection close. This value is logged as the second field in Squid's access.log.
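
A median service time can be pulled out of access.log with a few lines of code. This sketch assumes the second whitespace-separated field is the elapsed time as an integer number of milliseconds; the sample lines are made up for illustration and are not taken from the experiment.

```python
import statistics


def service_times(lines):
    """Yield the second field of each log line, converted to seconds.

    Assumes the field is an integer number of milliseconds.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or truncated lines
        try:
            yield int(fields[1]) / 1000.0
        except ValueError:
            continue  # skip lines whose second field is not numeric


def median_service_time(lines):
    """Return the median service time in seconds."""
    return statistics.median(service_times(lines))


# Hypothetical log lines, for illustration only.
sample_lines = [
    "884973943.123   3590 192.0.2.10 TCP_MISS/200 8192 GET http://example.com/a",
    "884973944.456   2160 192.0.2.10 TCP_HIT/200 4096 GET http://example.com/b",
    "884973945.789   2290 192.0.2.10 TCP_HIT/200 2048 GET http://example.com/c",
]
print(f"median service time: {median_service_time(sample_lines):.2f} seconds")
```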

The Setup

Three computers were used for this experiment: (1) a 75 MHz Pentium running Linux, (2) an SGI Indy running IRIX, and (3) a Sun Sparcstation 1+. These three systems were connected via a dedicated 10 Mb/s Ethernet segment (i.e. no other machines on the segment). The Sparcstation ran the HTTP client, the Pentium ran the HTTP server, and Squid ran on the SGI.

The simulated HTTP server was written to be a low-impact (non-forking) TCP server application. It accepts connections, reads requests (which are ignored), and then writes a simple HTTP reply followed by a random amount of bogus content. The object size is randomly chosen to match the real file size distributions we see on the NLANR caches.
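
A server of this kind might look roughly like the sketch below. It is not the program used in the experiment: the size table is a hypothetical stand-in for the NLANR size distribution (which is not given here), the reply headers are a guess at what a "simple HTTP reply" contains, and connections are handled one at a time rather than multiplexed.

```python
import random
import socket

# Hypothetical size table; the real server drew sizes from a
# distribution measured on the NLANR caches.
EXAMPLE_SIZES = [512, 2048, 4096, 8192, 65536]


def build_reply(size):
    """Return a minimal HTTP reply followed by `size` bytes of filler."""
    header = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {size}\r\n"
        "\r\n"
    ).encode("ascii")
    return header + b"x" * size


def serve(host="127.0.0.1", port=8080, max_requests=None):
    """Handle connections serially in a single process (no fork)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen(64)
        handled = 0
        while max_requests is None or handled < max_requests:
            conn, _addr = listener.accept()
            with conn:
                conn.recv(4096)  # read the request and ignore it
                conn.sendall(build_reply(random.choice(EXAMPLE_SIZES)))
            handled += 1
```

Calling serve() starts the listener; the host, port, and max_requests parameters are illustrative defaults.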

The HTTP client (tcp-banger2) was also written to be simple and low-impact. It reads URLs from stdin and generates Proxy HTTP requests. A command line parameter limits the number of simultaneous proxy connections.
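
The client's two jobs, formatting proxy requests and capping the number of simultaneous connections, can be sketched as follows. This is not the tcp-banger2 source: it uses a thread pool to bound concurrency instead of a select(2) loop, and the function names and the `fetch` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_proxy_request(url):
    """Format a proxy-style request: the request line carries the full URL."""
    return f"GET {url} HTTP/1.0\r\nAccept: */*\r\n\r\n".encode("ascii")


def run_client(urls, fetch, max_connections=63):
    """Call fetch(request_bytes) for each URL, at most max_connections at once.

    `urls` is any iterable of lines (for example sys.stdin); blank lines
    are skipped. `fetch` is supplied by the caller and is expected to
    send the request to the proxy and return whatever it wants recorded.
    """
    requests = [build_proxy_request(u.strip()) for u in urls if u.strip()]
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return list(pool.map(fetch, requests))
```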

The experiment was run four times, twice with 1.NOVM.10 and twice with 1.1.10.

  1. 1.NOVM.10 with an empty cache to measure cache misses (NOVM/MISS)
  2. 1.NOVM.10 with a full cache to measure cache hits (NOVM/HIT)
  3. 1.1.10 with an empty cache to measure cache misses (VM/MISS)
  4. 1.1.10 with a full cache to measure cache hits (VM/HIT)

    The tcp-banger2 client requested 10,000 unique URLs from Squid. The order of the URLs was randomized before each run. The tcp-banger2 client was limited to no more than 63 open connections at one time. The tcp-banger2 seemed to perform much worse above 64 connections, presumably due to the select(2) implementation.

    The same squid.conf file was used for all runs. Additionally, Squid was patched to log some statistics (page faults, VM usage from mallinfo(), and FD usage) once per second to cache.log.

    Results


    Number of Requests vs Time

    This graph shows how quickly each run was completed, and also the rate at which connections were handled. This table summarizes the graph:

    RUN             TOTAL TIME      REQUEST RATE
    --------------- --------------- ------------
    NOVM/MISS       688 seconds     14.5 req/sec
    NOVM/HIT        610 seconds     16.4 req/sec
    VM/MISS         648 seconds*    15.4 req/sec
    VM/HIT          593 seconds     16.9 req/sec
    

    *The final connection of the VM/MISS run took an unusually long time (as you can see on the graph below). 648 seconds is when the 9999th connection completed.

    Note that the two HIT cases are quite close to each other.
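
The request rates in the table follow directly from the 10,000 requests per run and the total times. A short check, using only figures from the table:

```python
TOTAL_REQUESTS = 10_000  # unique URLs requested in each run

# Total time in seconds for each run, from the table above.
TOTAL_TIME = {
    "NOVM/MISS": 688,
    "NOVM/HIT": 610,
    "VM/MISS": 648,  # time at which the 9999th connection completed
    "VM/HIT": 593,
}


def request_rate(seconds, requests=TOTAL_REQUESTS):
    """Average requests per second over a run."""
    return requests / seconds


for run, seconds in TOTAL_TIME.items():
    print(f"{run:<10} {request_rate(seconds):.1f} req/sec")
```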


    Service Times

    This graph shows cumulative distribution histograms of the service times (2nd field of access.log). Here the two HIT runs are similar to each other, and the two MISS runs are similar to each other as well. Interestingly, the NOVM/HIT run has a better median service time than the VM/HIT case, but they are quite close.

    RUN             MEDIAN
    --------------- ------------
    NOVM/MISS       3.59 seconds
    NOVM/HIT        2.16 seconds
    VM/MISS         3.46 seconds
    VM/HIT          2.29 seconds
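
A cumulative distribution like the one plotted can be computed from a list of service times as sketched below. The bin width and function name are arbitrary choices for illustration; the binning used for the original graph is not known.

```python
def cumulative_distribution(times, bin_width=0.5):
    """Return (upper_edge, fraction) pairs for service times in seconds.

    Each fraction is the share of requests whose service time is less
    than or equal to the bin's upper edge. The median is where the
    fraction first reaches 0.5.
    """
    if not times:
        return []
    ordered = sorted(times)
    total = len(ordered)
    points = []
    index = 0
    bin_number = 1
    while index < total:
        edge = bin_number * bin_width
        while index < total and ordered[index] <= edge:
            index += 1
        points.append((edge, index / total))
        bin_number += 1
    return points
```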
    


    Filedescriptor Usage

    This graph shows filedescriptor usage over time. The NOVM/MISS case peaks at about 260 FDs, both NOVM/HIT and VM/MISS peak at about 140 FDs, and the VM/HIT case peaks at around 110 FDs. Remember there are a maximum of 63 simultaneous client connections.


    VM Usage

    This graph shows the amount of memory used by the Squid process over time. The value was acquired from the mallinfo(3) function call. Clearly the NOVM cases use much less memory. On the VM/MISS plot, you can see the 8 MB 'cache_mem' pool being filled up by in-memory objects in two stages.

    Note that the two HIT cases do not begin at zero. This is because the HIT runs were made immediately following the MISS runs without killing and restarting the Squid process.


    Comments


    index.html,v 1.4 1998/01/16 18:05:43 kc Exp