Chapter Summaries
- Preface
- Chapter 1: Why Measurement Is Fundamental
This chapter argues that psychological and human science research adopts a much softer definition of measurement than does physical science. This is because we often misunderstand what scientific measurement actually requires and how difficult the maintenance of physical measures can often be. We argue that Rasch measurement can get us much closer to the sort of fundamental measurement on which progress in the human sciences will ultimately depend.
- Chapter 2: Important Principles of Measurement Made Explicit
In this chapter we take a typical developmental data set and work through it step by step (with few statistics involved) to reveal the properties that should be incorporated in a good measurement model. The basis of this chapter was presented as an invited lecture to the Graduate School of Education at Harvard (January 1996) as a successful introduction to understanding Rasch modeling. The example illustrates the following points:
- probabilistic models as more realistic for assessing human abilities;
- monitoring construct validity;
- monitoring order; and,
- the dialectical nature of the theoretical/empirical interface.
- Chapter 3: Basic Principles of the Rasch Model
This chapter introduces the developmental 'pathway' analogy. Since it is often very difficult to grasp abstruse theoretical concepts (especially when the topic has a mathematical basis), a carefully thought-out and systematically implemented analogy is used to facilitate the construction of these measurement concepts in concrete terms. We use this 'developmental pathway' analogy to develop the crucial Rasch concepts of:
- unidimensionality;
- fit;
- difficulty/ability estimation and error;
- locations for item difficulties; and,
- locations for person abilities.
- Chapter 4: Building a Set of Items for Measurement
By using a substantive theoretical basis and the Rasch model, we demonstrate with actual (dichotomous) data the structure of a multiple choice test. The example helps the reader to identify a sufficiently unidimensional developmental pathway coupled with a range of interpretations for determining acceptable degrees of fit to this pathway. The establishment of a reasonable scale for a pathway then allows us to address specific questions about the samples of items and questions we have chosen. These include:
- What can be said about the abilities of the persons in the sample?
- Is the test well-matched to the sample?
- What can be inferred about the substantive theory?
- Chapter 5: Test Equating: Can Two Tests Measure the Same Ability?
Here we address the problem of maintaining construct validity across tests. By analyzing a short answer test, we reveal more features of the Rasch model. First, we conduct a straightforward examination of the concurrent validity question by providing a single analysis of two different tests given to the same sample. A more rigorous test of the concurrent validity question is then provided by implementing the principles of 'common person equating'. Common person equating allows us to detect both items and persons that do not fit within our definition of the common underlying pathway or scale. Finally, the principles of 'common person equating' are extended to 'common test equating' to compare abilities of groups of persons on the same test.
- Chapter 6: Measurement Using Likert Scales
This chapter illustrates how the dichotomous Rasch model can be extended to accommodate Likert scale data. This is accomplished by analyzing an existing attitude test using the Rating Scale version of the Rasch model. The test, designed to measure 'computer anxiety', is analyzed to reveal underlying problems with the concept being measured. Moreover, the analysis reveals how a careful examination of these problems can provide information useful for further test development. Finally, the example is used to illustrate how the standard analyses used with Likert scale data can be routinely misleading. We take care to illustrate the importance of matching the sample with the testing/evaluation instrument.
- Chapter 7: The Partial Credit Rasch Model
The Rasch model can be extended to the analysis of tests where one or more intermediate levels of success might exist between 'complete failure' and 'complete success', i.e., partly correct answers. The Partial Credit model is highly applicable in educational testing situations where 'part marks' are awarded for partial success. This chapter uses qualitative interview data (e.g., from Piaget's méthode clinique) to illustrate the properties of the Partial Credit model. Although this type of data has long been regarded as unquantifiable, the techniques demonstrated in this chapter show how Piagetian interviews (in particular) as well as qualitative interview data (in general) might be meaningfully quantified. Interpretation of the analysis emphasizes how empirical findings can inform refinement of theoretical concepts.
- Chapter 8: Measuring Facets Beyond Ability and Difficulty
This chapter focuses on measurement situations in which other aspects of the testing situation routinely interpose themselves between the ability of the candidates and the difficulty of the test (e.g., when judges are used to evaluate test performances in terms of performance criteria). The many-facets Rasch model is used to examine rater severity and other aspects of the writing ability assessment in a high-stakes essay writing test. While choice of essay topic has little effect on outcome, the luck of the draw with examiners, even highly capable ones, could be crucial to the result. We explain how the Rasch model can be used to monitor and measure this often-ignored test variable.
- Chapter 9: Revealing Stage-Based Development
The measurement of 'stage' based development is explored in this chapter by using the Rasch model to demonstrate patterns of clusters and gaps in item locations that are typical of results in some developmental investigations. By using both quantitative and qualitative criteria, we then discuss how these results might be seen as supportive of 'stage'-wise development in the analysis of a new developmental test. We then apply a conventional statistical technique, the t test, to the basic Rasch analysis results as a means for quantifying some properties of stage.
We then introduce readers to the key concepts incorporated in the SALTUS model, a Rasch-based innovation in the assessment of developmental change. In this chapter, we also report the use of Rasch analysis to estimate development in both short-term (five years) and long-term (twelve years) longitudinal studies.
- Chapter 10: Rasch Model Applied Across the Human Sciences
Outside the restricted fields of educational and psychological testing there remains many a fertile field for the application of scientific measurement principles. Can Rasch modeling inform judges' decisions at the Olympics? What are the measurement bases of computer adaptive testing? What are the health science benefits of Rasch measurement?
- Chapter 11: Rasch Modeling Applied to Rating Scale Design
This chapter expands on the analysis of rating scale data presented in Chapter 6 by discussing guidelines for investigating empirically the utility of rating scales in the development of high-quality measures. The key message is that assumptions about both the quality of the measures and the utility of the rating scale in facilitating interpretable measures should be tested empirically, and that such investigations must give explicit consideration to the influence of both the number and the labeling of categories in this process. We address these issues to demonstrate specifically how the design of rating scales has a large impact on the quality of the responses elicited, and to show how the Rasch model provides an appropriate framework for carrying out such investigations.
- Chapter 12: The Question of Model Fit
The most contentious issue in Rasch measurement circles remains that of fit. In this chapter we explain why we insist that our key task is to produce data that fit the Rasch model's specification rather than talk of fit in the usual way (i.e., that the model should fit the data). Of course, the concept of fit must be considered hand-in-hand with that of unidimensionality. In Rasch measurement, we use fit statistics to help us detect the discrepancies between the Rasch model prescriptions and the data we have collected in practice. This chapter explains the concept of residuals and how residuals are aggregated to calculate indices of misfit. Unresolved issues concerning fit estimation are raised.
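To give a concrete sense of how residuals can be aggregated into indices of misfit, here is a minimal Python sketch for a single dichotomous item. It assumes the widely used infit and outfit mean-square statistics; this summary does not say which indices the chapter itself presents, and the function names, person abilities and response patterns below are hypothetical, chosen only for illustration.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model (logit units)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one item.

    responses: observed 0/1 scores, one per person
    abilities: person ability estimates, same order as responses
    """
    squared_residuals = []
    variances = []
    for x, ability in zip(responses, abilities):
        p = rasch_probability(ability, difficulty)
        squared_residuals.append((x - p) ** 2)  # squared (observed - expected)
        variances.append(p * (1.0 - p))         # model variance of the response

    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical data: five persons, one item of difficulty 0 logits.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]
expected_pattern = [0, 0, 1, 1, 1]  # less able fail, more able succeed
reversed_pattern = [1, 1, 0, 0, 0]  # the opposite of what the model predicts

infit_ok, outfit_ok = item_fit(expected_pattern, abilities, 0.0)
infit_bad, outfit_bad = item_fit(reversed_pattern, abilities, 0.0)
print(infit_ok, outfit_ok)
print(infit_bad, outfit_bad)
```

In this toy data the orderly pattern yields mean squares below 1 and the reversed pattern yields values well above 1, which is the kind of discrepancy between model prescription and collected data that fit statistics are meant to flag.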
- Chapter 13: A Synthetic Overview
The final chapter provides a synthetic overview of the nature of measurement in investigations in the human sciences and the extent to which Rasch analysis can be seen to address the inherent problems satisfactorily. Discussion centers on philosophical issues with regard to latent traits (Rasch analysis is a latent trait model) and their behavioral indices. Central to this discussion is the role of inference in the measurement process. The possibility of instantiating fundamental measurement in the human sciences is canvassed and distinctions are made among IRT models. We raise and discuss the problems inherent in matching real-life data with theoretical models and, hence, reflect on the Rasch concepts of unidimensionality and fit.
- Appendix A: Technical Aspects of the Rasch Model
Clearly explained development of the important Rasch modeling equations for:
- the dichotomous model;
- extensions to the Rating Scale and Partial Credit models; and
- the multifaceted Rasch model.
Perspectives on the estimation of fit and unidimensionality, as well as technical and theoretical aspects of Rasch analysis.
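As a point of reference for the first item in the list above, the dichotomous Rasch model is commonly written as follows. The notation is an assumption made here and may differ from the appendix: $B_n$ is the ability of person $n$, $D_i$ is the difficulty of item $i$, and $x_{ni}$ is the scored response (1 for success, 0 for failure).

```latex
P(x_{ni} = 1 \mid B_n, D_i) = \frac{e^{\,B_n - D_i}}{1 + e^{\,B_n - D_i}}
\qquad\Longleftrightarrow\qquad
\ln\!\left(\frac{P(x_{ni} = 1)}{1 - P(x_{ni} = 1)}\right) = B_n - D_i
```

The second form shows that the log-odds of success depend only on the difference between person ability and item difficulty, both expressed in logits.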
- Appendix B: Rasch Modeling Resources
Key Rasch analysis software applications (WINSTEPS, RUMM and QUEST) are described and discussed. URLs are provided so readers can download trial copies of these programs as well as samples of the data files used in the book chapters.
Details concerning the availability of software and addresses from which it can be obtained are given. The control files (both for WINSTEPS and for QUEST) for each and every analysis in this text are given as working exemplars and guides.
A comprehensive list of computer applications for Rasch analysis is provided along with details of Rasch measurement group and email discussion lists.
- Glossary