Nonparametric Models and Methods for Social Sciences Data

  • Akritas, Michael G. (PI)
  • Osgood, D. Wayne (CoPI)

Project: Research project

Project Details


Social scientists often collect and analyze data that are longitudinal. Some longitudinal studies have long durations, which result in a large number of observations per subject and give rise to what is called functional data. The analysis of such data is often complicated for two reasons in addition to the lack of independence. First, for many kinds of social science data, there are weak theories about functional forms explaining the effects of factors and weak measurement procedures (e.g., with attitudes) that produce ordinal or only somewhat stronger (but typically not interval) scales. Thus, there is a need for flexible modeling and test procedures for non-normal and often heteroscedastic data. Second, missing observations are common whenever subjects are followed for long periods. Typically, 'missingness' is not completely at random, which has led to parametric-type remedies. In this project, the investigators propose a fully nonparametric approach to some inference questions regarding such data. They will consider inference procedures for a) multi-way heteroscedastic analysis of variance designs when some (or all) of the factors have many levels but a small number of replications per cell, b) the effects of factors which adjust for the presence of covariates, c) the covariate effect and its interaction with categorical factors, and d) designs with missing data. Procedures that use weighted averages of (mid-) ranks and that are known to maintain a high level of efficiency for a wide variety of data types will also be developed for the above problems. Facets of the project are closely connected to the classical problem of lack-of-fit testing, and some methods that will be developed will also be relevant in this area. This research builds upon prior results by the investigators, many of which were obtained using previous grants.
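As a small illustration of the (mid-) ranks mentioned above, the following sketch computes mid-ranks for a handful of ordinal responses: tied observations share the average of the ranks they would otherwise occupy. This is not the investigators' procedure. The function name `midranks` and the sample responses are hypothetical, and the weighting scheme used in the proposed test statistics is not described in this abstract, so it is not shown.

```python
def midranks(values):
    """Return mid-ranks: tied observations share the average of the ranks they span."""
    # Indices of the observations, ordered by their value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the block [start, end] over all observations tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Ranks are 1-based, so the block covers ranks start+1 .. end+1.
        shared = (start + 1 + end + 1) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


# Hypothetical ordinal responses (e.g., answers on a 1-to-4 attitude scale).
responses = [2, 3, 3, 1, 4, 3]
print(midranks(responses))  # [2.0, 4.0, 4.0, 1.0, 6.0, 4.0]
```

Because mid-ranks depend only on the ordering of the observations, they remain meaningful for ordinal scales where differences between values are not interpretable.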

The flexible modeling provided by the nonparametric approach and the efficient test procedures afforded by (mid-) rank test statistics are the key thrusts of this project. To ascertain the effect of several categorical factors and continuous covariates on a response of interest, researchers typically use parametric or semiparametric event history modeling, including linear models, generalized linear models, frailty models, marginal proportional hazards models, and random coefficient models. These models depend on assumptions that may or may not be satisfied for any given application. This can have unpleasant practical consequences, as documented in several case studies. Moreover, missing observations require imputations that are done under parametric assumptions. In fact, it is widely believed that nonparametric procedures cannot be used when data are missing at random (as opposed to missing completely at random). Programs for implementing the nonparametric procedures will be developed and applied to a number of social science studies including a) questions regarding routine activities and deviant behavior, b) examination of the effects of various life circumstances on criminal offending, and c) examination of incarcerated boys recently released from correctional institutions. This award is jointly supported by the Division of Mathematical Sciences and the Directorate for Social, Behavioral, and Economic Sciences as part of the Mathematical Sciences Priority Area.

Effective start/end date: 9/1/03 to 8/31/07


  • National Science Foundation: $234,216.00

