The paper describes a method of extracting a string of random bits from the behaviour of the operating system scheduler. To reduce the load imposed on the system by the random number generator and to improve the properties of the random numbers generated, the random bits gathered from the scheduler are used to perturb a pseudo-random number generator.
The result is a stream of random numbers with a computational cost approaching that of a sequence of pseudo-random numbers, but with the properties of a sequence of real random numbers.
Other approaches which use hardware found in an existing system (Davis 1994) typically require modifications to the operating system or access to the hardware at a privileged level. The method proposed here operates at the user level.
The paper addresses a method of extracting a string of random bits from the behaviour of the operating system scheduler. To reduce the load imposed on the system by the random number generator and to improve the properties of the random numbers generated, the random bits gathered from the scheduler are used to perturb a pseudo-random number generator which has the property that a long series of values is required before it is possible to determine the next value (Wallace 1990).
The implementation and testing is being carried out on a number of Unix-based platforms.
There are several varieties of Unix and a number of different scheduling algorithms have been employed. Two common varieties are discussed.
The term priority is used in the following discussions in the normal sense of the word in English: the highest priority is the most important. This is distinct from the concept of priority level used in the UNIX operating system where the process with the smallest priority level value is the next process to run.
The scheduler of a pre-emptive multi-tasking operating system acts as a rapidly changing stream of events when viewed as a process operating on a machine. We exploit this property to produce a sequence of random numbers. Furthermore, we illustrate that the order of events is hard to predict in the short-term, and impossible to predict in the long-term, thus guaranteeing true random behaviour.
When two clocks are sampled, the result can be classified into three classes (illustrated in Figure 1): lagging, synchronised and leading. If the clocks are sampled infrequently enough and the clocks are derived from two different time sources, the result of the sampling process will have an indeterminate behaviour.
This principle can be used directly by locating two independent clocks within a computer system and performing sampling. Classifying the relationship between the two clocks into lagging, leading or synchronised can be used to provide an indeterminate pattern which can be represented as a bitstream. Two candidate clocks for this type of operation are the real-time clock - although many systems derive ticks from the system clock, the real-time is often corrected by periodic references to an independent source of wall-clock time - and the system clock. The requirement that sufficient time elapse between sampling the clocks makes the method impractical for generating large volumes of random data, although it ensures that long-term behaviour is truly random.
This approach is implemented using the scheduler and a set of identical processes. The processes are instructed to wakeup at the same time beginning the tournament. The processes perform some equivalent tasks and then perform an operation which employs mutual exclusion of the other process. By monitoring the result of the operation, the outcome of the tournament can be determined. On a multiprocessor, competitions run in parallel. On a uniprocessor, the competitions are run in sequence against the time quantum allocated to each process.
Figure 2 illustrates the operation of a competition employing two processes on a multiprocessor and on a uniprocessor system. In the illustration, two identical processes - pa and pb - are set in operation. Each process performs a computational task and, at the end of the task, each process attempts to perform a mutual exclusion operation and record its name. When a process ends, it releases the mutual exclusion, allowing the other process to record its name. In the multiprocessor system, the two processes can compete directly against each other. On a uniprocessor system, one of the processes is scheduled first. At the end of the first process's time quanta, another process is scheduled. After completing its computational task, one of the processes enters the mutual exclusion region and writes its name. The lagging process records its name before terminating. Two random elements exist in the uniprocessor example - the selection of the first process to run, and the competition to enter the mutual exclusion region before a process's time quanta ends.
This method never exposes the code to optimisation, allowing the idle operation to be performed exactly as the programmer specified.
A requirement that the code be easily portable across systems prevented the use of assembly code; assembly code is platform-specific and hence, not portable.
The approach employed exploited the C programming language's support for separate compilation. Separate compilation allows functions in other translation units (in C, a translation unit typically equates to a file) and libraries of precompiled code to be called. The linker is responsible for satisfying these external references at link time. To support separate compilation, a compiler must generate a call to a function in another translation unit. The optimiser cannot remove this call since the compiler is unaware of the action of code in other translation units on the process state. This ensures that at least the time taken by setting up, performing and returning from a function call is consumed by the idle operation code. Figure 4 illustrates code fragments used to implement the idle operation.
A system call is performed before the idle operation is commenced to increase the chance of synchronising the idle operations with a quanta boundary.
The beginning of a quanta is generated by one of two events: a process resuming after it has been pre-empted and returning from a system call which causes the scheduler to reschedule the process. The sleep and the select system calls can cause processes to be rescheduled. This implementation employs select and makes use of the sysconf system call - provided by POSIX - to return the maximum length of a quanta in microseconds.
The length of a quanta in microseconds is used as a goal time. The algorithm increases the number of times the idle operation is called while measuring the number of microseconds that elapse. When the elapsed time is greater than the goal time, the last increment is subtracted and the increment size halved. This process is repeated until the increment has a value of zero. Figure 5 shows a plot of the number of iterations per trial and the size of the increment against the trial number.
The parent conducts a tournament by forking a number of child processes, creating the result file, waiting until the child processes have completed and reading the results file for the tournament. The result is returned to the caller of the parent.
When a child process is started, it locates a unique number to use as its name by consulting a counter in the copied state of the parent process. Each child will see a different number since the state of the parent process is copied after the counter has been incremented. The child processes suspend themselves until a fixed period of time after a reference time initially determined by the parent. After the child process awakes, it attempts to gain an exclusive lock on the results file. If the process succeeds, it writes its name into the file and surrenders the remainder of its quanta. If the process fails, it performs a sufficient number of idle operations to nearly fill out its quanta, surrenders anything remaining of its quanta and then retries accessing the file. After a child process has written its name into the result file a specified number of times, the child terminates. The number of times a child writes its name into the result file is known as the number of repetitions.
The result of the parent process's operations and the child processes' operations is a pattern which is returned to the caller.
Figure 6 and Figure 7 show relevant code fragments to implement the method of generation of random numbers described above.
After a complete set of patterns is generated, all the duplicate patterns are removed and the patterns are numbered continuously from zero.
When a result is recovered from the random number generator, the list is searched and the number allocated to the pattern is returned. This allows a single integer to be returned to represent the pattern.
The cost of generating the list is fairly high; however, this cost is amortised over all random numbers generated with the same number of repetitions and number of child processes.
Four experiments are reported on each of the two platforms. The parameters for each experiment are listed in Table 1.
| Name | # Processes | # Repeats | # Results |
|---|---|---|---|
| 3.1.100 | 3 | 1 | 100 |
| 3.2.500 | 3 | 2 | 500 |
| 4.1.5000 | 4 | 1 | 5000 |
| 5.1.10000 | 5 | 1 | 10000 |
The # Processes column indicates the number of child processes created. The # Repeats column refers to the number of times a child process writes its name into the results file. The # Results column refers to the number of random numbers generated.
The number of results generated was increased for experiments with a larger number of potential outcomes. The relative number of results gives a simple measure of the coverage of the results generated over the set of possible results. A first-order estimate of the repetition of patterns is found by counting the number of times a given value appears in a row.
| Experiment | Platform | # Possible Patterns | # Different Patterns | # Different Lengths |
|---|---|---|---|---|
| 3.1.100 | FreeBSD | 6 | 6 | 6 |
| 3.2.500 | FreeBSD | 120 | 9 | 5 |
| 4.1.5000 | FreeBSD | 24 | 18 | 18 |
| 5.1.10000 | FreeBSD | 120 | 45 | 13 |
| 3.1.100 | Solaris | 6 | 6 | 5 |
| 3.2.500 | Solaris | 120 | 15 | 8 |
| 4.1.5000 | Solaris | 24 | 24 | 10 |
| 5.1.10000 | Solaris | 120 | 117 | 6 |
Table 2 and Figures 9, 10, 11 and 12 summarise the results generated in the experiments. The graphs of values show that some values are highly preferred by the random number generator; however, a wide spread of results are produced even if in only small numbers. The graphs of the length of repeating values are particularly encouraging as they tend toward short string lengths. This indicates that although some values are highly favoured, the favoured values tend not to occur in long continuous sequences.
An interesting result can be seen from the 3.2.500 series and the 5.1.10000 series. The graphs and other experiments indicate that increasing the number of repetitions appears to be less beneficial for improving the spread of results than increasing the number of processes.
The multiprocessor results are generally of higher quality, in that the number of patterns produced is higher and the lengths of repeating strings tend to be shorter. This result is expected as there is direct contention between processes for access to files, in addition to the indirect method available on uniprocessor systems.
The results presented showed that although the random numbers were not of high quality (the values tended to be poorly distributed), the behaviour was random.
Future work will focus on optimising the quality of the raw random number stream. Currently, the raw random number stream could be used to perturb a pseudo-random number generator yielding real random numbers with a uniform distribution. Using a perturbed random number generator would allow random numbers with a cost approaching that of pseudo-random numbers to be generated, but with the added advantage that the series produced would have a truly random behaviour.
Return to Conference Proceedings