Response by NCBE to the NYSBA Task Force Report

White Paper

Executive Summary

Introduction

The National Conference of Bar Examiners (NCBE) offers this response to the Report of the New York State Bar Association Task Force on the New York Bar Examination (Task Force Report), which was released in March 2020. The Task Force Report includes numerous criticisms of the Uniform Bar Exam (UBE) and of NCBE, many of which are based upon errors and incorrect assumptions regarding psychometric methods and practices. Our response addresses these criticisms and errors while providing context and clarification regarding the UBE’s uniformity, value, and fairness; how the UBE is scored; and the study of New York’s adoption of the UBE that was conducted by NCBE at New York’s request.

Psychometric Expertise Supporting NCBE’s Tests

The UBE, like all NCBE test products, is scored and equated by NCBE’s research/psychometric staff. The basic scoring and equating methods used for the UBE were established by internationally renowned psychometricians, each with decades of experience in high-stakes testing and educational measurement. NCBE’s research/psychometric staff members all have advanced degrees in psychometrics or closely related fields—most have PhDs in psychometrics—and have been nationally recognized for their technical expertise by peers in the profession. NCBE also receives input on psychometric and technical issues from a Technical Advisory Panel made up of some of the world’s leading psychometricians, and frequently engages with the Center for Advanced Studies in Measurement and Assessment (CASMA) at the University of Iowa.

Uniformity, Value, and Fairness of the UBE

The purpose of the UBE, like that of any bar exam or other licensure exam, is to help protect the public by offering a consistent assessment of whether examinees can demonstrate that they possess essential knowledge, skills, and abilities.

The UBE offers clear benefits for bar applicants, would-be clients, employers, and law schools via increased mobility and marketability, as well as increased consistency in the subjects tested on the bar exam across jurisdictions. The UBE includes the same questions, which are given the same weights and graded using the same grading materials with support provided by NCBE, in every jurisdiction. It tests generally accepted fundamental principles, an understanding of which, combined with the legal skills and abilities also assessed by the UBE, provides the foundation needed to practice competently in any jurisdiction.

The UBE assesses essential knowledge, skills, and abilities in a manner that is fair to all examinees. Research has shown that similarly prepared examinees perform similarly on the bar exam regardless of race, ethnicity, or gender. Deeply rooted social inequities have left some examinees, particularly those from historically underrepresented populations, without the resources and opportunities to be as well prepared to pass the bar exam as those from majority groups, but there is no evidence that the UBE creates or worsens a disparate impact. Rather, any performance disparities on the UBE reflect the culmination of a lifetime of inequities in the larger social environment.

NCBE takes seriously the need to work to eliminate any aspects of its exams that could contribute to performance disparities among groups. We maintain high standards in developing our test questions through the work of our diverse drafting committees and by conducting a rigorous process of external review, bias review, pretesting, and differential item functioning (DIF) analysis to ensure fairness. We conduct or facilitate studies of predictive bias and conduct research with jurisdictions, as in the New York study just completed.

NCBE’s Equating Method

The MBE, the multiple-choice component of the UBE, uses a statistical procedure known as equating to adjust for potential differences in difficulty between exams. Equating makes it possible to report scaled scores with consistent score interpretations. The Task Force Report criticizes NCBE’s equating method based on an oversimplified example that illustrates a different kind of equating than the type NCBE uses. The example used by the Task Force does not provide a fully accurate or fair representation of the actual process used to score NCBE exams.
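To make the general idea of adjusting for form difficulty concrete, the sketch below shows the simplest textbook form of linear equating, in which a score keeps the same standardized position on both forms. It is deliberately simplified and is not a representation of NCBE's actual procedure, which this summary does not detail. The function name, its parameters, and all numbers are hypothetical.

```python
def linear_equate(raw_x, mean_x, sd_x, mean_y, sd_y):
    """Map a raw score on form X onto the scale of reference form Y.

    Textbook linear equating: a score keeps the same standardized
    position (z-score) on both forms. Illustration only.
    """
    if sd_x <= 0 or sd_y <= 0:
        raise ValueError("standard deviations must be positive")
    z = (raw_x - mean_x) / sd_x
    return mean_y + z * sd_y


# Hypothetical numbers: form X turned out harder than reference form Y,
# so comparable groups averaged 120 on X versus 125 on Y.
equated = linear_equate(raw_x=120, mean_x=120.0, sd_x=15.0, mean_y=125.0, sd_y=15.0)
print(equated)
```

In this hypothetical case a raw score of 120 on the harder form is reported as 125, so examinees are not penalized for having received the more difficult form.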

Impact of Reducing the Number of Scored Items on the MBE

Beginning in February 2017, the number of scored items on the MBE was reduced from 190 to 175 in order to increase the number of unscored items being pretested for future use. The number of equator items remained the same. The Task Force Report claims that this change had a negative impact on the exam. In fact, however, the change had a negligible effect. In particular, the Report claims that the change caused the reliability of the exam (a measure of the precision of scores) to decline. However, the reduction in the number of scored items was offset by an improved ability to select items that distinguish well between different levels of examinee proficiency, and the reliability of scores has in fact increased with almost every administration since February 2017.
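The relationship between test length, item quality, and reliability can be illustrated with two standard formulas: the Spearman-Brown formula and standardized coefficient alpha. The sketch below is a rough illustration only. The starting reliability of 0.92 and the mean inter-item correlations are hypothetical values, not MBE statistics, and using the mean inter-item correlation as a stand-in for how well items distinguish between proficiency levels is a simplifying assumption.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is multiplied by length_ratio,
    assuming removed or added items are of the same average quality."""
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized coefficient alpha from the item count and the
    mean inter-item correlation."""
    return (n_items * mean_inter_item_r) / (1 + (n_items - 1) * mean_inter_item_r)


# Hypothetical starting reliability of 0.92 for a 190-item test,
# shortened to 175 items of the same average quality.
shortened = spearman_brown(0.92, 175 / 190)

# Hypothetical mean inter-item correlations: 0.057 before the change,
# 0.063 after better-discriminating items are selected.
before = standardized_alpha(190, 0.057)
after_same_quality = standardized_alpha(175, 0.057)
after_better_items = standardized_alpha(175, 0.063)

print(round(shortened, 4))
print(round(before, 4), round(after_same_quality, 4), round(after_better_items, 4))
```

Under these assumed values, dropping 15 items of average quality lowers reliability by less than 0.01, and a modest improvement in item quality more than makes up the difference.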

Impact of the Changing Proficiency of Examinees over Time

The Task Force Report questions the comparability of scores given differences in examinee populations from one exam administration to another. However, the purpose of equating is precisely to ensure that scores have the same meaning over time, regardless of differences in examinee proficiency or in the difficulty of the exam.

Relative Grading

The Task Force Report, in criticizing the relative grading method recommended by NCBE for jurisdiction graders of the written portions of the UBE, appears to rely on an inaccurate description of relative grading. Relative grading is a means of providing uniformity to grading practices across different essays, graders, and jurisdictions. Graders should go through a calibration process before beginning their grading, and while grading they are asked to assign rank-ordered grades based on the merit of the answers, while using as much of the score scale as possible in order to limit the effect of grader bias. A relative grading approach that uses rank ordering is one step in a process that also includes scaling the written score to the MBE.

Scaling the Written Scores to the MBE

The Task Force Report claims that the UBE is vulnerable to “forum shopping,” in which examinees intentionally try to take the exam in a jurisdiction where they believe they will have a better chance of passing due to differences in examinee populations and grader variability. However, the scaling formula used by NCBE helps compensate for such differences and for variations among graders, which are an unavoidable part of any grading process for essay and performance test components.

The Task Force also expresses concern that scaled written scores are not reliable enough to produce a reliable total exam score. However, this is not the case. While the reliability of written scores is somewhat lower than the reliability of the MBE, the combined score has a reliability well above the required minimum for a high-stakes exam.
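The point about the combined score can be illustrated with the standard formula for the reliability of a weighted composite. The sketch below assumes two components on a common scale (equal standard deviations), equal weights, and uncorrelated measurement errors. The component reliabilities and the correlation are hypothetical values chosen for illustration, not NCBE figures.

```python
def composite_reliability(rel_a, rel_b, corr_ab, weight_a=0.5, weight_b=0.5):
    """Reliability of weight_a*A + weight_b*B, assuming A and B have equal
    standard deviations and uncorrelated measurement errors."""
    # Error variance of the composite (component variances set to 1).
    error_var = weight_a**2 * (1 - rel_a) + weight_b**2 * (1 - rel_b)
    # Total variance of the composite, including the covariance term.
    total_var = weight_a**2 + weight_b**2 + 2 * weight_a * weight_b * corr_ab
    return 1 - error_var / total_var


# Hypothetical inputs: a highly reliable multiple-choice score (0.92),
# a less reliable written score (0.75), correlated at 0.65.
combined = composite_reliability(rel_a=0.92, rel_b=0.75, corr_ab=0.65)
print(round(combined, 3))
```

With these assumed inputs the combined score is considerably more reliable than the written score alone, because the two components are positively correlated and the more reliable component carries half the weight.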

Correlations Between MBE and Written Scores

The Task Force Report erroneously states that correlations between MBE scores and written scores are low and uses this claim to argue that written scores should not be scaled to MBE scores. In fact, MBE scores and written scores are strongly correlated. Therefore, it is appropriate to scale the written score to the MBE score.
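As a rough illustration of what scaling one score to another involves, the sketch below applies a linear transformation that gives a group's written scores the same mean and standard deviation as that group's MBE scaled scores. This summary does not spell out the scaling formula, so the mean-and-standard-deviation approach shown here is an assumption, and the function name and all data are hypothetical.

```python
from statistics import mean, pstdev


def scale_written_to_mbe(written_raw, mbe_scaled):
    """Linearly transform written raw scores so that, for this group of
    examinees, they have the same mean and standard deviation as the
    group's MBE scaled scores. Rank order is unchanged."""
    w_mean, w_sd = mean(written_raw), pstdev(written_raw)
    m_mean, m_sd = mean(mbe_scaled), pstdev(mbe_scaled)
    if w_sd == 0:
        raise ValueError("written scores have no spread to scale")
    return [m_mean + (w - w_mean) / w_sd * m_sd for w in written_raw]


# Hypothetical data for five examinees in one jurisdiction.
written_raw = [18, 24, 30, 33, 45]
mbe_scaled = [121.0, 133.5, 140.2, 138.9, 156.4]
written_scaled = scale_written_to_mbe(written_raw, mbe_scaled)
print([round(s, 1) for s in written_scaled])
```

Because the transformation is linear, each examinee keeps the same relative standing on the written component. Only the units change, so that written scores are expressed on the same scale as the equated MBE scores.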

Equal Weighting of the MBE and the Written Component

Contrary to the Task Force Report’s claim that the written component of the UBE is not given significant weight in UBE scoring, the written component in fact accounts for 50% of the total UBE score.

New York UBE Study Included Appropriate and Sufficient Data for Analyses

The Task Force Report criticizes the UBE study that NCBE conducted for New York for including data from a limited number of data collection points (exam administrations). However, the number of examinees within each administration was large, providing sufficient data for analysis, including analysis of subgroups.

NCBE’s Objectivity in Conducting New York UBE Study

The Task Force Report calls into question NCBE’s objectivity in conducting New York’s UBE study. NCBE undertook the study at the request of the New York State Board of Law Examiners (BOLE) as part of its mission as a nonprofit corporation. The New York Court of Appeals, in collaboration with the BOLE, approved the design of the study, and the BOLE provided the data. NCBE’s role was to offer advice on study design, analyze the data, and prepare the report. A neutral, objective perspective was maintained throughout the report, which included as much detail as possible about the analysis that was performed so that anyone with questions about the results could examine the data themselves.
