680 F.2d 965
30 Fair Empl.Prac.Cas. 906,
29 Empl. Prac. Dec. P 32,720
EQUAL EMPLOYMENT OPPORTUNITY COMMISSION, Appellant,
v.
AMERICAN NATIONAL BANK, Appellee.
United States Court of Appeals,
Fourth Circuit.
Submitted Aug. 4, 1981.
Decided May 18, 1982.
On Suggestion for Rehearing En Banc.
Constance L. Dupre, Acting Gen. Counsel, Philip B. Sklover, Acting Associate Gen. Counsel, Vella M. Fink, Asst. Gen. Counsel, William H. Ng, Washington, D. C., for appellant.
Paul M. Thompson, Thomas J. Manley, Gregory B. Robertson, Hunton & Williams, Richmond, Va., for appellee.
The appellee's petition for rehearing and suggestion for rehearing en banc has been considered by the court.
The panel considered the petition for rehearing and decided, Judge Russell dissenting, that it should be and it is denied.
On a poll of the court on the suggestion for rehearing en banc, the court was evenly divided. Because a majority of the judges in regular active service did not vote in favor of rehearing en banc, that suggestion in motion is denied. From this denial, Judge Russell, Judge Widener, Judge Hall, Judge Ervin and Judge Chapman dissent, and Judge Widener has filed a dissenting opinion in which Judge Russell joins.
Entered at the direction of Judge Phillips.
WIDENER, Circuit Judge, dissenting:
I would respectfully add a word of dissent.
I dissent because there are significant errors in both the district court and panel opinions concerning the appropriate use of statistical evidence in employment discrimination cases. These errors, I think, may only, at best, confuse the law of employment discrimination in this circuit.
The errors in using the statistics are two-fold. First, the district court failed to employ a proper statistical analysis in arriving at its conclusion that the employment statistics demonstrated a prima facie case of employment discrimination. The panel majority compounded this error when it adopted the district court's finding. Second, this majority erred in its use of statistical rebuttal evidence. Correct statistical analysis shows that the EEOC never demonstrated a prima facie case here (especially for Suffolk), and assuming, arguendo, that a prima facie case was presented, the defendant provided sufficient rebuttal evidence.1
I. Prima Facie Case
The Supreme Court has reasoned that statistical evidence may alone establish a prima facie case of racial discrimination. See, e.g., Hazelwood School District v. United States, 433 U.S. 299, 307-08, 97 S.Ct. 2736, 2741, 53 L.Ed.2d 768 (1977); Teamsters v. United States, 431 U.S. 324, 339, 97 S.Ct. 1843, 1856, 52 L.Ed.2d 396 (1977). Nevertheless, it is essential that a court's utilization of statistics be mathematically correct in order to prevent an invalid conclusion. In the instant case, the district court found a prima facie case of discrimination exclusively on the basis of employment statistics.2 The court compared the percentage of blacks in the defendant's office/clerical work force to the percentage of blacks in comparable jobs in the respective geographic areas. It concluded:
that there has been consistent underrepresentation of blacks on the work force of defendant between 1969 and 1975... The fact remains ... that the percentage of blacks in defendant's total work force has remained at least 15 percent below the percentage of blacks in the available work force of any of the communities comprising the two relevant labor market areas.
EEOC v. American National Bank, 21 E.P.D. (CCH) P 30,369 at 13,057 (E.D.Va., June 25, 1979). The opinion continued "The Court finds, therefore, that the statistical evidence presented here by the EEOC is prima facie statistical proof of a pattern or practice of discrimination." Id. at 13,058.
The panel majority followed the district court and adopted the finding of a prima facie case, concluding:
These statistical disparities are substantial, in some cases reaching the 'inexorable zero' point. Teamsters, 431 U.S. at 342 n. 23 (97 S.Ct. at 1858 n. 23). They show that blacks were consistently underrepresented in the office and clerical categories in branches in both cities and underrepresented in the officials and managers categories in Suffolk all years and Portsmouth for four of seven years. The district court's conclusion, that considered alone, they establish a prima facie case is firmly supported by the record. (Emphasis added).
EEOC v. American National Bank, 652 F.2d 1176, 1190 (4th Cir. 1981).
In my opinion the conclusions of the panel majority and that of the district court are equally incorrect.
In reaching their conclusions using straight percentage comparisons of actual and expected percentages of minority employees, the courts contravened both common statistical principles and Supreme Court precedent. Statisticians do not simply look at two figures, in this case the actual percentage of black employees compared with the percentage available, and make a subjective conclusion, as did the courts here, that the figures are significantly different. Rather, statisticians compare figures through an objective process known as hypothesis testing. C. Hicks, Fundamental Concepts in the Design of Experiments, 15 (1973) (hereinafter Hicks); F. Mosteller, R. Rourke & G. Thomas, Probability With Statistical Applications 302-05 (2d ed. 1970) (hereinafter Mosteller, Rourke & Thomas); R. Winkler & W. Hays, Statistics: Probability, Inference, and Decision, 402-03 (2d ed. 1975) (hereinafter Winkler & Hays). The process of hypothesis testing is readily adapted to employment discrimination statistics as they follow regular statistical formulae. See W. Connolly & D. Peterson, Use of Statistics in Equal Employment Opportunity Litigation, 74-83 (1980) (hereinafter Connolly & Peterson).
In hypothesis testing, the statistician sets a null hypothesis, which he subsequently seeks to disprove through the use of various objective mathematical formulae. Hicks, supra at 15-16; Mosteller, Rourke & Thomas, supra at 303-04. If the hypothesis is disproven, the statistician can state, to a given precision, that the two statistics are in fact different. Hicks, supra at 16; Mosteller, Rourke & Thomas, supra at 305-06. In the situation of racial employment discrimination statistics, the null hypothesis is that the actual and expected percentages of minority employees are equal. If the null hypothesis is rejected, then one can say, with a given precision, that the difference between the two percentages is not the result of random chance.
The process of hypothesis testing is not an academic exercise for substituting objective mathematical criteria for the judgment of a court of law. Rather, it is a means of according mathematical credibility to statistical evidence used in the litigation process. It is axiomatic in statistical analysis that the precision and dependability of statistics is related to the size of the sample being tested. K. Hammond & J. Householder, Introduction to the Statistical Method, 299 (1962) (hereinafter Hammond & Householder); Winkler & Hays, supra at 444-47.3 Without the use of hypothesis testing, a court may accord credibility in law to statistics which are not deserving of mathematical credibility. Such a situation often arises when the employment situation under consideration involves a small number of people, such as in the instant case.4
The Supreme Court has recognized the danger of making comparisons between the percentages of minority population without further mathematical analysis. In Mayor v. Educational Equality League, 415 U.S. 605, 94 S.Ct. 1323, 39 L.Ed.2d 630 (1974), the Court reversed a court of appeals finding of discrimination in the make-up of a commission to nominate members of the Philadelphia school board. The court of appeals had found it significant that while 60% of the school population was black and 34% of the city's population was black, only 15% of the 13 member nominating board was black. The Supreme Court said: "(T)he District Court's concern for the smallness of the sample represented by the 13 member panel was well founded. The Court of Appeals erred in failing to recognize the importance of this flaw in straight percentage comparisons." Id. at 621, 94 S.Ct. at 1333 (emphasis added).
In the instant case, both the district court and the panel majority, in finding a prima facie case had been proven by statistics, committed the same error of drawing conclusions from straight percentage comparisons without taking into account sample size at all.5 That majority justified its finding of a prima facie case through the use of straight percentage comparisons by saying:
As frequently observed by the Supreme Court, and as recognized by the district court, gross statistical disparities in the static work force during the relevant period may alone constitute prima facie proof of the discriminatory practice. Hazelwood, 433 U.S. at 307-08 (97 S.Ct. at 2741); Teamsters, 431 U.S. at 335 n. 15, 339 n. 20 (97 S.Ct. at 1854 n. 15, 1856 n. 20); Arlington Heights v. Metropolitan Housing Development Corp., 429 U.S. 252, 265-66 (97 S.Ct. 555, 563, 55 L.Ed.2d 450) (1976).
652 F.2d at 1189. The panel majority and the district court before it, however, did not consider a footnote in Hazelwood where the Supreme Court explained the meaning of statistical disparity. The Supreme Court said, "A precise method of measuring such statistical disparities was explained in Castaneda v. Partida, 430 U.S. 482, 496-497 n. 17, 97 S.Ct. 1272, 1281 n. 17, 51 L.Ed.2d 498. It involves the calculation of the 'standard deviation' as a measure of the predicted fluctuations from the sample." 433 U.S. at 309, n. 14, 97 S.Ct. at 2742, n. 14. Clearly then the term "gross statistical disparity" in the Hazelwood opinion does not mean a seemingly large difference in straight percentage comparisons; it means disparities must be found at the conclusion of statistical analysis using standard deviations.
The methodology for computing the standard deviations of employment discrimination statistics is well documented elsewhere and need not be reviewed in detail here. See EEOC v. United Virginia Bank/Seaboard National, 615 F.2d 147, 151 (4th Cir. 1980); Connolly & Peterson, supra at 74-88. I have noted in the margin, however, the way such formulae account for small samples as are present here.6 It is important to emphasize that such formulae are of no value, and thus statistical evidence is also of no value, unless a court employs them at the proper point in the judicial process and that point is before the court finds a prima facie case on the basis of statistical evidence.7
In the instant case, the district court prepared a standard deviation analysis of the defendant's static employment figures as part of its consideration of rebuttal evidence. I have prepared a further analysis using slightly different assumptions and those statistics appear in the margin.8 Concerning the defendant's Suffolk branches, both the district court's and my analyses show that the difference between the actual and expected percentages of black employees is less than two standard deviations for almost all years regardless of whether the statistics are compared to Suffolk, Nansemond County or the combined populations. In the Castaneda opinion, the Supreme Court noted that "if the difference between the expected value and the observed number is greater than two or three standard deviations, then the hypothesis ... that the difference was due to chance would be suspect to a social scientist." 430 U.S. at 497 n. 17, 97 S.Ct. at 1281 n. 17.9 Here, under Castaneda and Hazelwood, since the number of standard deviations is less than that figure, one may not conclude that the difference between actual and expected black employment is statistically significant or that the figures demonstrate a prima facie case of employment discrimination.
It has just been shown that a proper consideration of the statistics leaves no way for a prima facie case to have been proven at Suffolk. Portsmouth, however, is another matter. Having found a prima facie case proven by straight percentage comparisons, the district court then applied the Castaneda-Hazelwood analysis, considered the applicant flow data, and concluded that that analysis lent no support to allegations of discrimination. It concluded that such evidence neutralized the EEOC's statistical evidence. Had the district court done what it should have done and applied the Castaneda-Hazelwood analysis initially to the static work force statistics, and after that considered Portsmouth alone, that could have justified its finding of a prima facie case. Since it considered both Portsmouth and the Norfolk SMSA, and since it held the Norfolk SMSA "more reliable," 21 E.P.D. at 13,066, the finding of a prima facie case under standard statistical principles as well as following the teaching of Castaneda-Hazelwood would be doubtful at best. In the nearest year to the trial, 1975, the standard deviation was only 1.38; for the four years before that, it had not exceeded 2.58; and for the three years before that had not exceeded 3.81. In addition, the standard deviation analysis indicates for applicant flow figures for Portsmouth for the only year for which figures were available, a standard deviation of only 2.16, hardly a sufficient variation to be of statistical significance and dislodge the null hypothesis of chance. While it is arguable that a prima facie case has been statistically proven for Portsmouth, the district court's consideration of the matter under the wrong mathematical and legal standard at the very best would make necessary a remand rather than an outright reversal.
II. Rebuttal Evidence
After finding the establishment of a prima facie case of employment discrimination, both the district court and the panel majority considered whether further statistical evidence rebutted the prima facie case. Both courts first considered whether a standard deviation analysis of the defendant's static employment figures rebutted the prima facie case. The district court held that it did, while the panel majority held that it did not. 652 F.2d at 1197. As discussed in the first part of this dissent, such an analysis should have been completed both by the district court and by this court as part of the analysis of the prima facie case. If such an analysis had been done properly, it would have shown that there was no prima facie case.
Assuming, arguendo, that the EEOC had established a prima facie case, it would be possible for the employer to rebut that evidence with evidence of hiring patterns since passage of Title VII. The Supreme Court has explained this process:
The burden then shifts to the employer to defeat the prima facie showing of a pattern or practice by demonstrating that the Government's proof is either inaccurate or insignificant. An employer might show, for example, that the claimed discriminatory pattern is a product of pre-Act hiring rather than unlawful post-Act discrimination, or that during the period it is alleged to have pursued a discriminatory policy it made too few employment decisions to justify the inference that it had engaged in a regular practice of discrimination.
Teamsters, 431 U.S. at 360, 97 S.Ct. at 1867 (footnote omitted); see Hazelwood, 433 U.S. at 309-10, 97 S.Ct. at 2742-43.
In the present case, the district court found that the evidence of the defendant's hiring practices since passage of Title VII (i.e. the applicant flow data) provided further rebuttal of the prima facie case. The district court first looked at the percentage of black employees hired by the defendant compared to the percentage of blacks among job applicants.10 The court concluded that these straight percentage comparisons as well as the small number of hiring decisions rebutted the allegations that the defendant had engaged in a practice of discrimination. As discussed in the first part of this dissent, such a straight percentage comparison is statistically meaningless without a standard deviation analysis. The district court completed a standard deviation analysis, but compared the defendant's hiring to the percentage of black employees in the regional work force rather than to the percentage of blacks in the applicant pool. Such a comparison is statistically valid, although a comparison to the applicant pool is preferable. Connolly & Peterson, supra at 83-86.11 In any case, the analysis confirmed the district court's conclusion that the defendant's hiring practices did not evidence discrimination.
The appeals court disagreed as to whether the applicant flow data rebutted the prima facie case. The panel majority first concluded that there were sufficient hiring decisions on which to base a decision concerning hiring practices for office/clerical employees. 652 F.2d at 1194. The court then turned to the meaning of the data, and, after expressing doubts as to the dependability of the data, concluded that it did not rebut the prima facie case. Id. at 1197. The panel majority rejected the district court's standard deviation analysis because it was based on the percentage of black employees in the work force rather than in the applicant pool. Id. at 1197, n. 17. The panel majority termed this comparison "manifestly incorrect," concluding, "Therefore, the district court's particular use of standard deviation analysis was without legal or factual basis, and we disregard it entirely." Id. The court declined to do a standard deviation analysis of the applicant flow figures, saying:
"Because of the demonstrated unreliability of these figures there is no need to attempt a correct reanalysis of the statistical significance of these disparities." Id.
The statement quoted just above is diametrically opposite to a statement in the text accompanying the above quote. There the panel majority said:
The Suffolk samples for each year are concededly small, but the overall results simply confirm rather than dispel the prima facie case based upon the static work force statistics: over these years, during which blacks made up 24.8% of the qualified applicant pool, only 14.2% of those hired in these categories were black. In Portsmouth the limited data for 1975 ... not only fails to dispel the prima facie case but reinforces it.
Id. at 1197. The panel is thus saying in the footnote that the statistics are too unreliable to accord any significance, but in the text it uses these very statistics to make the point that the employer did in fact practice racial discrimination.
I have completed the standard deviation analysis that the panel majority claimed was unnecessary. It shows the following:
APPLICANT FLOW FIGURES (Office & Clerical Only)

Suffolk
                    % Black       % Black    Std.
Year                Applicants    Hired      Deviations
----------------    ----------    -------    ----------
1969                24.1          20.0         .21
1970                13.0           0.0         .54
1971                25.3          50.0        -.80
1972                26.7          20.0         .34
1973                24.1           0.0        1.49
1974                36.1          14.3        1.20
1975                13.0          14.3        -.10
----------------    ----------    -------    ----------
1969-75 Combined    24.8          14.2        1.45

Portsmouth
                    % Black       % Black    Std.
Year                Applicants    Hired      Deviations
----------------    ----------    -------    ----------
1975                23.1           4.3        2.16
These computations show to be without statistical foundation the panel's conclusion that the defendant's hiring statistics support an inference of racial discrimination. The panel found it significant that while the applicant pool for Suffolk over the seven year period was 24.8% black, there were only 14.2% black hires. It turns out, however, that this amounts to a standard deviation of only 1.45, well below the Castaneda standard of two or three. The difference between actual and expected hiring in Portsmouth for the one available year also does not exceed the Castaneda standard.
I will consider separately the majority's remarkable statement concerning the legal significance and interpretation of standard deviation computations. While the majority made the statement in the context of rebuttal evidence of a prima facie case, the statement necessarily affects the use of all statistical evidence. The majority said:
(A)uthority can be found for the proposition that most social scientists, applying laboratory rigor to rule out chance as even a theoretical possibility rather than the law's rougher gauge of the 'preponderance of the evidence,' are prepared to discard chance as a hypothesis when its probability level is no more than 5%, i.e., at approximately two standard deviations. (W. Hays & R. Winkler, Statistics: Probability, Inference and Decision... 394 (1st ed. 1971) ) ... (W)e conclude that courts of law should be extremely cautious in drawing any conclusions from standard deviations in the range of one to three.
652 F.2d at 1192. In the present case, the panel majority thus would not consider the applicant flow evidence as a rebuttal to the prima facie case because some of it is beyond one standard deviation.
Try as I may, I have been unable to find any support for the majority's conclusions (1) that law should be treated differently than other disciplines using statistical information and (2) that there is something suspect about the range of one to three standard deviations.
The majority has erred in equating the statistical significance of evidence solely to the legal concept of preponderance. The latter addresses the relative weight of the evidence introduced by the parties. The significance level goes primarily to the probative value of the evidence, and only incidentally to its weight. There is no basis for finding that some undisclosed significance level other than 95 or 99% is appropriate for the law.12 Numerous disciplines use such a standard. See, e.g., R. Beals, Statistics for Economists, 190-91 (1972); M. Hagood, Statistics for Sociologists, 447 (1941); Hammond & Householder, supra at 298 (psychology); J. Roscoe, Fundamental Research Statistics for the Behavioral Sciences, 182 (2d ed. 1975). Connolly & Peterson, in their treatise on statistics in employment discrimination litigation, discuss the meaning of 95% and higher significance levels in such litigation but make no mention of an appropriate level lower than 95%. See Connolly & Peterson, supra at 80 & n. 69. There is simply no basis in law or mathematics I can find for the panel majority's statement.
The panel majority has raised an interesting issue, though, as to the meaning of statistics that differ by less than two or three standard deviations. As explained in the first part of this dissent, when two statistics differ by more than two or three standard deviations, a statistician would reject the hypothesis that the difference between actual and expected values is due to chance. A valid question is whether the statistician would accept the hypothesis that the difference is due to chance when the statistics differ by less than two or three standard deviations. The panel majority answers this by saying that courts should be "extremely cautious in drawing any conclusions from standard deviations in the range of one to three." 652 F.2d at 1192. I have been unable to find anywhere authority for the statement that sets apart one to three standard deviations as a zone for special treatment. Rather, the authorities state that while you "accept" a hypothesis that is not rejected, this is a term of art and, while not ordinarily, may be permissibly, considered proof that the difference is due only to chance. See Mosteller, Rourke & Thomas, supra at 305-08; Roscoe, supra at 172.
Because there is no firm mathematical authority on the subject, the decision whether to accept rebuttal evidence which shows that the employer's hiring differs by less than two or three standard deviations from the expected hiring becomes subjective. To absolutely reject such evidence works an unfair hardship on any litigant in rebutting inferences raised by statistical evidence, not to mention any litigant depending on statistical evidence. Because of the importance of an employer's hiring practices in Title VII suits such as the instant one, I am inclined to encourage the admission of properly constructed statistical evidence of such practices. See United Virginia Bank, 615 F.2d at 154. A court should be mindful that evidence showing a difference of less than two or three standard deviations is not necessarily conclusive that the employer did not practice discrimination; but it is evidence, and, absent any authoritative statement that it is meaningless, I object to its in effect flat rejection by the majority. Indeed, the mathematicians are agreed that unless a standard deviation analysis discloses a variation of more than two or three standard deviations, a prima facie statistical case of discrimination has not been proved. Accord Hazelwood, 433 U.S. at 309 n. 14, 97 S.Ct. at 2742 n. 14.13 Thus, the majority's effective rejection of all statistical evidence between one and three standard deviations is in flat contradiction of the teaching of Hazelwood that more than two or three standard deviations deserves credence.
Much of this dissent employs terms and propositions less than familiar to many. In order that there be no mistake as to the gist of my thinking, it is as follows.
If statistics are used as evidence in cases such as this, they must be used in accordance with correct statistical principles in order to have any rational meaning at all.
The district court and the majority of the panel have used statistical principles in the establishment of plaintiff's prima facie case which are contrary to every mathematical authority on the subject.
If that were not enough, the majority then rejects standard hypothesis testing, again contrary to all the mathematical authorities.14
This decision expressly holds that a plaintiff's prima facie case is properly established by the use of incorrect statistical principles, but that correct statistical principles may not be employed by a defendant either to undo the prima facie case or in rebuttal.
I think such a holding not only offends reason, it is contrary to law and notions of common justice.
I have sought in this dissent to emphasize the methodology for interpretation of statistics. Unless this court commences to follow valid statistical procedures, it will continue, as here, to draw inferences which are unwarranted. The result is bad law which may only confuse future litigants, not to mention the courts in this circuit.
In this connection, I should point out that both the district court and the panel majority have apparently treated evidence offered by the defendant when its turn came as evidence only tending to rebut a prima facie case which had been made. While this is, of course, an appropriate use of evidence offered in rebuttal, an equally appropriate use and one just as often made of evidence is that which tends to show that the basic facts on which a plaintiff's case is made do not exist, thereby decrying the existence of a prima facie case. As the foregoing applies to this case, applicant flow statistics are admissible as evidence in chief by a plaintiff as well as when offered in rebuttal by a defendant. Hazelwood School Dist. v. United States, 433 U.S. 299, esp. n. 13, p. 308, 97 S.Ct. 2736, esp. n. 13, p. 2742, 53 L.Ed.2d 768 (1977); Hester v. Southern Railway Co., 497 F.2d 1374 (5th Cir. 1974). And, when in evidence, they should be used for whatever their probative value is, either to make a case, to rebut a prima facie case made, or to show that a prima facie case does not exist.
The Supreme Court has noted that prima facie employment discrimination cases can rest both on statistical and non-statistical evidence. Teamsters v. United States, 431 U.S. 324, 339, 97 S.Ct. 1843, 1856, 52 L.Ed.2d 396 (1977); see W. Connolly & D. Peterson, Use of Statistics in Equal Employment Opportunity Litigation, 15-16 (1980). In the instant case, however, the district court found that the non-statistical evidence presented by the EEOC, which included evidence of certain personnel practices and the experience of 31 black applicants, did not demonstrate racial discrimination. EEOC v. American National Bank, 21 E.P.D. P 30,369 at 13,098-13,099 (E.D.Va., June 25, 1979). Thus, its only basis for establishing a prima facie case was statistical evidence. Id. at 13,058.
The precision of statistics increases as the size of the sample increases. For instance, if a sample of four items is drawn from a large group of items and one of those items has a characteristic which sets it apart from the others, one must say that 25% of the sample has that characteristic. However, one would hesitate to state that 25% of the full group has that characteristic because it is realized that if two items with that characteristic had been drawn, it would have led to a percentage of 50%; if no items with that characteristic had been drawn, it would have required one to state that 0% of the large group had that characteristic. It is apparent that drawing conclusions on the basis of a sample of four is very imprecise. On the other hand, a sample of 25 items would be much more precise because a difference of one item with a given characteristic would lead to a change of only 4% in the final statistic.
Hypothesis testing accounts for small sample sizes by using mathematical formulae which vary with the size of the sample. The effect of sample size in the formula used for testing racial statistics is included later, although a brief example here will help illustrate the effect of sample size. Assume there is a very large population that is 20% black. A sample of this population would also be expected to be 20% black, but suppose a certain sample is only 12% black. A statistician would state that this difference was significant (i.e. that it was not due to chance) with certain degrees of precision depending on the size of the sample. For instance, if there were 1000 people in the sample, a statistician would say with a 99.9% degree of precision that the difference between actual and expected black population was significant. If 100 people were in the sample, the precision falls to 95%; with 50 in the sample, the precision is 84%; with 25 it is 68%; with 10 it is 47%; and with 5 it is 14%.
Statisticians do not usually state their conclusions in terms of the degree of precision. Rather, they state whether the difference between actual and expected values is statistically significant at a given confidence level. Statisticians usually use 95% or 99% confidence levels. Winkler & Hays, supra at 413. Thus, in the foregoing sample, only when the sample is 100 or larger would a statistician find the difference between 20% and 12% statistically significant.
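The relationship between sample size and precision described in this example can be sketched numerically. The sketch below is an illustration only: it assumes a two-sided test under the normal approximation to the binomial, and the function name `significance_confidence` is hypothetical. The opinion does not say how its own figures were computed, so agreement should be expected only approximately, and should not be assumed for the smallest samples.

```python
import math


def significance_confidence(expected, observed, n):
    """Two-sided confidence that the gap between expected and observed
    proportions is not due to chance (normal approximation; an assumption)."""
    # Standard error of a sample proportion drawn from a large population.
    standard_error = math.sqrt(expected * (1.0 - expected) / n)
    # Number of standard deviations separating the two proportions.
    z = abs(expected - observed) / standard_error
    # Probability mass of a standard normal within +/- z.
    return math.erf(z / math.sqrt(2.0))


# Population 20% black, sample 12% black, as in the example above.
for n in (1000, 100, 50, 25, 10):
    confidence = significance_confidence(0.20, 0.12, n)
    print(f"sample size {n:5d}: confidence {confidence:.3f}")
```

Under these assumptions the same eight-point gap clears the conventional 95% level only once the sample reaches about 100, which is the point the example makes.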
Population samples of less than 30 to 40 are generally considered to be small samples and require special treatment in certain statistical analyses. P. Hoel & R. Jessen, Basic Statistics for Business and Economics, 190 (1977); Winkler & Hays, supra at 366. In the instant case, the defendant employed between 12 and 28 office/clerical employees at its Suffolk branches and between 92 and 132 office/clerical employees at its Portsmouth branches.
As is well documented, the standard deviation for a binomial distribution is
$$\sigma = \sqrt{n p (1 - p)}$$
where n is the size of the selected sample and p is the proportion of individuals in the total population possessing the characteristic which is being investigated. Connolly & Peterson, supra at 80. Sample size is thus an important element of this computation. The figure derived by the above formula can be converted to a figure which gives the standard deviation in terms of percentages rather than numbers. This figure is known as the standard error and equals
$$\sigma_{\hat{p}} = \sqrt{\frac{p (1 - p)}{n}}$$
Inspection of this latter computation shows that as the sample size gets smaller, the standard error gets larger, thus demonstrating that small sample sizes cause larger allowed deviations and are thus less reliable. See Winkler & Hays, p. 308.
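The effect of sample size on these two quantities can be illustrated with a short sketch. It assumes the usual textbook forms, a binomial standard deviation of sqrt(n p (1 - p)) and a standard error of sqrt(p (1 - p) / n); the function names and the 20% proportion are hypothetical choices made for illustration.

```python
import math


def binomial_sd(n, p):
    """Standard deviation of a binomial count: sqrt(n * p * (1 - p))."""
    return math.sqrt(n * p * (1.0 - p))


def standard_error(n, p):
    """Standard deviation expressed as a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1.0 - p) / n)


# Illustrative proportion of 20%; the sample sizes are arbitrary.
for n in (1000, 100, 25, 12):
    print(f"n={n:5d}  sd={binomial_sd(n, 0.20):6.2f}  "
          f"standard error={standard_error(n, 0.20):.3f}")
```

On these assumptions the standard error rises from about 0.013 at n = 1000 to about 0.115 at n = 12, so a much larger percentage gap is needed before a small sample departs from expectation by two standard deviations.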
Another method for compensating for small sample sizes is known as the "student's t" distribution. In the Castaneda opinion, the Supreme Court noted that with large sample sizes statistics that differed by more than two or three standard deviations would lead a social scientist to reject the hypothesis that the difference between the actual and expected numbers was due to chance. 430 U.S. at 497 n. 17, 97 S.Ct. at 1281 n. 17. Two to three standard deviations correspond to a 95% to 99.9% significance level. Winkler & Hays, supra at 355-58. The "student's t" distribution teaches that when there is a sample size of less than thirty, the number of standard deviations must be increased in order to achieve the same significance level. See Winkler & Hays, supra, at 366; Mosteller, Rourke & Thomas, supra at 437-38. For example, tables for use with the "student's t" distribution show that with a sample size of 15, one needs to use 2.1 standard deviations for a 95% confidence interval and 3.8 standard deviations for a 99.9% confidence interval. For a sample size of 5, the standard deviations become 2.8 and 7.1 respectively. Winkler & Hays, supra at XV.
This court previously employed a standard deviation analysis on employment statistics in EEOC v. United Virginia Bank/Seaboard National, 615 F.2d 147 (4th Cir. 1980). In that case, which concerned a competitor of the defendant here, this court found that the EEOC had failed to demonstrate a prima facie case of employment discrimination. Id. at 153. This conclusion was based on the fact that a standard deviation analysis of the bank's work force statistics did not support a prima facie case, even when supplemented by the other statistical comparisons, which were also found wanting.
The computations I have made derive from the statistical figures which appear in the district court's opinion. The principal difference between my computations and those of the district court is that I have computed a specific standard deviation rather than a range. In addition, I have computed the standard deviation for the Nansemond County/City of Suffolk combined populations. Black employees make up 18.6% of the office/clerical work force in the combined region. Labor Market Information for Affirmative Action Programs: State of Virginia, 1970 Census Data, 187. Such a combined statistic is appropriate because Nansemond County merged into the City of Suffolk during the period under consideration.
STATIC EMPLOYMENT STANDARD DEVIATIONS-Office/Clerical Personnel

Suffolk
-------
        % Black     Total          Standard Deviations
Year    Employees   Employed   Suffolk   Nanse.   Comb.
----    ---------   --------   -------   ------   -----
1968       0.0         28        1.80     2.88     2.54
1969       0.0         18        1.45     2.29     2.04
1970       5.3         19         .72     1.81     1.49
1971      11.1         18        -.11     1.16      .82
1972      10.0         20         .04     1.34      .99
1973       4.8         21         .83     1.94     1.64
1974       0.0         18        1.45     2.29     2.04
1975       0.0         12        1.18     1.88     1.66

Portsmouth
----------
        % Black     Total          Standard Deviations
Year    Employees   Employed   Portsmouth   Norfolk SMSA *
----    ---------   --------   ----------   --------------
1968       1.1         92         4.76           3.54
1969       0.9        108         5.16           3.94
1970       3.2         95         4.32           3.00
1971       4.9        103         4.08           2.63
1972       6.5        123         4.04           2.37
1973       5.9        118         4.14           2.52
1974       5.3        132         4.53           2.85
1975       9.3        108         3.08           1.38

* Norfolk SMSA includes Norfolk, Portsmouth, Virginia Beach, Chesapeake and
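The combined-population column for Suffolk can be approximately reproduced from the binomial formula. The Python sketch below is an illustration only and makes two assumptions that do not appear in the record: the number of black employees in each year is inferred by rounding the reported percentage of the total employed, and the number of standard deviations is taken as the expected count minus the observed count, divided by the binomial standard deviation. The function and variable names are hypothetical.

```python
import math

# Combined Nansemond County/City of Suffolk proportion stated in the opinion.
P_COMBINED = 0.186

# (year, black employees, total employed). The counts of black employees
# are an assumption: they are inferred from the reported percentages
# (for example, 5.3% of 19 is taken to be 1 employee).
SUFFOLK = [
    (1968, 0, 28), (1969, 0, 18), (1970, 1, 19), (1971, 2, 18),
    (1972, 2, 20), (1973, 1, 21), (1974, 0, 18), (1975, 0, 12),
]

def deviations_below_expected(observed, n, p):
    """Number of binomial standard deviations by which the observed
    count falls short of the expected count n * p."""
    expected = n * p
    sd = math.sqrt(n * p * (1 - p))
    return (expected - observed) / sd

results = {
    year: deviations_below_expected(observed, n, P_COMBINED)
    for year, observed, n in SUFFOLK
}

for year, z in results.items():
    print(year, round(z, 2))
```

Under these assumptions the computed values fall within about 0.02 of the figures in the Comb. column; the small differences presumably reflect rounding in the underlying percentages.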
As explained in footnote 6, small sample sizes require that a greater number of standard deviations be used in order to achieve the same precision. In the instant case, since the Suffolk statistics were within the Castaneda standard, it is unnecessary to consider the exact effect of the small sample size on precision, for that would only make the conclusion even more favorable to the defendant.
There apparently was some question at trial as to whether the applicant flow data actually represented the complete number of applications received by the appellee. The district court was satisfied, however, that the information was sufficiently complete to accord it probative weight. I am unwilling to conclude, as the panel majority did, that the data was in fact unreliable. Its finding that the district court's holding was clearly erroneous is, I think, simply not supported by the record and is a clear violation of FRCP 52(a).
The district court felt that Castaneda and Hazelwood mandated that standard deviation comparisons be made with the percentage of blacks in the work force rather than the applicant pool. I must agree with the panel majority that there is nothing in either opinion to so require. 652 F.2d at 1197 n. 17. Nevertheless, I affirmatively disagree with the panel majority's conclusion that there is no legal or factual basis for the district court's comparison. See id. The district court's comparison is perfectly valid and would be especially appropriate if information on the racial composition of the applicant pool was unavailable. See Connolly & Peterson, supra at 83-84.
The Winkler & Hays text citation that the court gives in its opinion does not support an assertion that statistics in law should be treated differently than in other disciplines. It supports only the proposition that statisticians usually employ either 1% or 5% significance levels.
In note 17 of Hazelwood, the Supreme Court recited that its statistical analysis in that note was not intended to suggest that precise calculations of statistical significance are necessary in employing statistical proof. I do not think that recitation is an indication that the use of hypothesis testing is unnecessary in a case, as here, in which the sole evidence of a prima facie case of discrimination is statistical.
The opinion is plainly contrary to the teaching of Castaneda, Hazelwood and Mayor v. Educational Equality League, which cases acknowledged and properly used mathematical statistical principles.