Instructions

In this simulation, a group of students takes a test made up of 100 true/false questions. Each student knows the answers to a specific number of questions and guesses on the rest. For simplicity, it is assumed that there is no partial knowledge, so each guess has a 0.5 chance of being correct. Students get one point for each question they answer correctly and lose one point for each question they answer incorrectly.

Students take two versions of the test. It is assumed that a given student knows the answers to exactly the same number of questions on both tests. This number is called the student's "true score." The student's scores on the two tests will generally be different because the number of correct guesses will differ for the two tests.

Some test scores will be above the "true scores" and some will be below the true scores. If a student is correct on exactly half the questions he or she guesses on, the true score and the test score will be equal.
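The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated assumptions, not the simulation's actual code; the function name simulate_test_score and the seed are made up for the example.

```python
import random

N_QUESTIONS = 100  # test length stated in the instructions

def simulate_test_score(true_score, rng):
    """Return one observed score for a student who knows `true_score` answers."""
    n_guessed = N_QUESTIONS - true_score
    # Each guess is correct with probability 0.5 (no partial knowledge).
    n_correct_guesses = sum(rng.random() < 0.5 for _ in range(n_guessed))
    n_wrong_guesses = n_guessed - n_correct_guesses
    # Known answers score +1 each; guesses score +1 if right, -1 if wrong.
    return true_score + n_correct_guesses - n_wrong_guesses

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
test1 = simulate_test_score(60, rng)
test2 = simulate_test_score(60, rng)
print(test1, test2)  # same true score, usually different test scores
```

A student who guesses exactly half of the 40 guessed questions correctly gains 20 points and loses 20 points, so the test score equals the true score of 60.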

The two graphs shown when the simulation begins plot observed scores as a function of true scores for the two tests. In the upper figure, test scores that are above the true score are shown in blue, test scores below the true score are shown in green, and scores that equal the true score are shown in magenta. The colors in the lower figure are based on scores on the first test. Therefore, if someone scored above their true score on the first test and below their true score on the second test, the data point for that person would be blue in both figures. In the upper figure, the point would be above the black diagonal line; in the lower figure, the point would be below the line.

The default is for true scores to be sampled from a normal distribution with a mean of 60 and a standard deviation of 5. With a sample size of 1,000, the sample mean true score can be expected to be quite close to 60 (95 times out of 100 it should be between 59.7 and 60.3). The expected score for an item that is guessed at is 0, since a student gains one point for a correct guess and loses one point for a wrong guess, each with probability 0.5. Therefore, the mean test scores should also be close to 60.
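The interval quoted above follows from the standard error of the mean. The short calculation below reproduces it, assuming the usual normal-theory multiplier of 1.96 for a 95% interval; the variable names are chosen for the example.

```python
import math

mean_true, sd_true, n = 60, 5, 1000   # default settings from the text

se = sd_true / math.sqrt(n)           # standard error of the sample mean
half_width = 1.96 * se                # 95% of sample means fall within this
low, high = mean_true - half_width, mean_true + half_width
print(f"{low:.1f} to {high:.1f}")     # 59.7 to 60.3

# Expected score on one guessed item: +1 and -1 are equally likely.
expected_guess = 0.5 * (+1) + 0.5 * (-1)
print(expected_guess)                 # 0.0
```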

The simulation is designed to show the characteristics of a group selected on the basis of test scores. Drag the black horizontal bar at the bottom of the first figure upwards to set the criterion for selecting students. Only students whose Test 1 scores are at or above the bar are selected.

The upper graph uses colors to show which students:

  1. Had true scores above the cutoff, had good luck, and were selected (dark blue).
  2. Had true scores below the cutoff, had good luck, and were selected (light blue).
  3. Had true scores above the cutoff, had bad luck, and were selected (green).
  4. Had true scores above the cutoff, had bad luck, and were not selected (red).
  5. Had true scores below the cutoff, had good luck, and were not selected (purple).
  6. Had true scores below the cutoff, had bad luck, and were not selected (orange).
  7. Had true scores equal to their test scores (magenta).
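The seven categories above can be expressed as a small decision rule. The sketch below is one possible reading, not the simulation's own code: the function name classify is hypothetical, and it assumes that a true score exactly at the cutoff counts as "above" it, which the instructions do not specify.

```python
def classify(true_score, test1_score, cutoff):
    """Return the color category for one student in the upper graph."""
    if test1_score == true_score:
        return "magenta"                      # category 7: no net luck
    lucky = test1_score > true_score          # good luck on Test 1
    selected = test1_score >= cutoff          # at or above the bar
    true_above = true_score >= cutoff         # assumption: ties count as above
    if lucky:
        if true_above:
            return "dark blue"                # category 1 (always selected)
        return "light blue" if selected else "purple"   # categories 2 and 5
    if not true_above:
        return "orange"                       # category 6 (never selected)
    return "green" if selected else "red"     # categories 3 and 4

print(classify(60, 66, 65))  # light blue
```

Two combinations cannot occur: a student with a true score above the cutoff and good luck is always selected, and a student with a true score below the cutoff and bad luck is never selected.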

The lower graph plots the data for the same students on the second test. The color coding is carried over from the upper graph. For example, all the blue points in the upper graph are above the black line, whereas in the lower graph the blue points are about equally distributed on both sides of the line.

The panel on the right side of the screen shows the true scores, Test 1 scores, and Test 2 scores separately for all students and for students selected based on their Test 1 scores. Students are further divided into subgroups on the basis of the relationship between their true scores and their Test 1 scores. The percent of selected students whose Test 1 score exceeded their true score is shown and labeled "Percent lucky."
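The quantities in the panel can be approximated with a short stand-alone script. This is a rough sketch under stated assumptions rather than the simulation's implementation: true scores are rounded to whole numbers and clamped to the range 0 to 100, the cutoff of 70 and the seed are arbitrary, and all function and variable names are invented for the example.

```python
import random
from statistics import mean

N_QUESTIONS = 100

def observed_score(true_score, rng):
    """True score plus correct guesses minus wrong guesses."""
    n_guessed = N_QUESTIONS - true_score
    n_right = sum(rng.random() < 0.5 for _ in range(n_guessed))
    return true_score + n_right - (n_guessed - n_right)

def draw_true_score(rng, mu=60, sigma=5):
    # Assumption: a whole number of known answers, clamped to 0..100.
    return min(N_QUESTIONS, max(0, round(rng.gauss(mu, sigma))))

rng = random.Random(2024)  # arbitrary seed, only for repeatability
students = []
for _ in range(1000):
    t = draw_true_score(rng)
    # Each tuple is (true score, Test 1 score, Test 2 score).
    students.append((t, observed_score(t, rng), observed_score(t, rng)))

cutoff = 70  # hypothetical position of the selection bar
selected = [s for s in students if s[1] >= cutoff]

mean_true_sel = mean(s[0] for s in selected)
mean_test1_sel = mean(s[1] for s in selected)
mean_test2_sel = mean(s[2] for s in selected)
percent_lucky = 100 * sum(s[1] > s[0] for s in selected) / len(selected)

print(f"selected: {len(selected)} students")
print(f"mean true {mean_true_sel:.1f}, Test 1 {mean_test1_sel:.1f}, "
      f"Test 2 {mean_test2_sel:.1f}, percent lucky {percent_lucky:.0f}")
```

With a cutoff well above the mean, the selected group's Test 1 mean tends to be noticeably higher than both its mean true score and its Test 2 mean, because selection on Test 1 favors students whose guesses happened to go well on that test.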

The sample standard deviations and the correlations among the true scores, Test 1 scores, and Test 2 scores are shown at the bottom of the panel.
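For the default settings, rough expected values of these statistics can be worked out from the model. The calculation below is an approximation, not output from the simulation: it assumes an average of 40 guessed questions per student and ignores any rounding of true scores, and the sample values shown in the panel will vary around these figures.

```python
import math

var_true = 5 ** 2            # default SD of true scores is 5
mean_guessed = 100 - 60      # questions guessed by a student at the mean

# Each guess scores +1 or -1 with equal probability: mean 0, variance 1.
var_error = mean_guessed * 1
var_test = var_true + var_error          # variance of an observed test score

sd_test = math.sqrt(var_test)
r_true_test = math.sqrt(var_true / var_test)   # true score vs. either test
r_test1_test2 = var_true / var_test            # Test 1 vs. Test 2

print(f"{sd_test:.2f} {r_true_test:.2f} {r_test1_test2:.2f}")  # 8.06 0.62 0.38
```

The two tests share only the true score, so their correlation with each other is lower than the correlation of either test with the true score.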

You can set the mean and standard deviation of the true scores and the sample size.

Credits

The simulations were developed as part of a grant from NSF to David Lane of Rice University. Partial support for this work was provided by the National Science Foundation's Division of Undergraduate Education through grant DUE 9751307. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.