
Stommel et al.

Center for Epidemiologic Studies Depression Scale (CES-D)

Name/ Reference

Stommel M, Given BA, Given CW, Kalaian HA, Schulz R, McCorkle R. Gender Bias in the Measurement Properties of the Center for Epidemiologic Studies Depression Scale (CES-D). Psychiatry Research. (1993) 49:239-250.

Source contact info

Dr. M. Stommel, Michigan State University, College of Nursing, A230 Life Sciences Bldg., East Lansing, MI 48824-1317, USA.

Availability (private or public)

public

Conceptual framework

Epidemiological studies of depression have shown a greater prevalence of depressive symptoms among women than among men. The authors argue that this disparity may reflect true differences or a measurement artifact. Confirmatory factor-analytic models are used to examine gender bias in individual items of the Center for Epidemiologic Studies Depression (CES-D) Scale.

Purpose of measure & application (clinical, research, survey, screening)

Self-report instrument to measure current depressive symptomatology in nonpsychiatric populations

 

Sample characteristics

708 cancer patients: This sample combines cases from three different home-care studies based on different populations: cancer patients from lower Michigan primarily residing in small towns and rural areas (n = 240) (Given and Given, 1987a), cancer patients from suburban Pittsburgh (n = 258) (Schulz, 1990), and cancer patients from urban Philadelphia (n = 210) (McCorkle, 1990). Subjects in the three subsamples differ significantly with respect to age, marital status, race/ethnicity, cancer diagnosis, and functional limitations in activities of daily living. Only the ratio of men to women is substantially similar across studies.

504 caregivers from a noncancer population (used to confirm the findings on a second, independent sample), selected from three Michigan home-care studies: caregivers of physically impaired elderly (n = 100) (Given and Given, 1986), caregivers of Alzheimer’s patients (n = 170) (Given and Given, 1987b), and caregivers of patients with diverse chronic problems (n = 234) (Given and Given, 1989), all evenly split by gender. An examination of sociodemographic characteristics showed no significant differences across the three subgroups. Most of the 504 caregivers were married (78.2%), White (92.5%), Protestant (73.2%), middle class (mean household income: $32,500), and between 18 and 88 years of age (mean = 63.4). The proportion of male caregivers in the original studies ranged from 16.2% to 27.6%. To avoid gender-study interactions, all male caregivers with complete CES-D responses were selected and supplemented with an equal number of randomly selected female caregivers from each study.

Comparing the combined caregiver sample with the combined cancer patient sample yielded significant differences across all mentioned sociodemographic variables.

Recruitment methods

708 cancer patients, a combination of cases from three different home-care studies based on different populations: cancer patients from lower Michigan primarily residing in small towns and rural areas (see Given and Given, 1987a below), cancer patients from suburban Pittsburgh (see Schulz, 1990 below), and cancer patients from urban Philadelphia (see McCorkle, 1990 below).

504 caregivers from a noncancer population (used to confirm the findings on a second, independent sample), selected from three Michigan home-care studies: caregivers of physically impaired elderly (see Given and Given, 1986), caregivers of Alzheimer’s patients (Given and Given, 1987b), and caregivers of patients with diverse chronic problems (Given and Given, 1989), all evenly split by gender. The proportion of male caregivers in the original studies ranged from 16.2% to 27.6%. To avoid gender-study interactions, all male caregivers with complete CES-D responses were selected and supplemented with an equal number of randomly selected female caregivers from each study.
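The gender-balancing step described above could be sketched as follows for a single study. This is a minimal illustration, not the authors' procedure: the function name, the dictionary keys `gender` and `complete`, and the requirement that sampled women also have complete responses are all assumptions.

```python
import random


def balanced_by_gender(caregivers, seed=0):
    """Keep all men with complete data and an equal number of random women.

    `caregivers` is a list of dicts with the hypothetical keys
    "gender" ("male"/"female") and "complete" (True/False).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    men = [c for c in caregivers if c["gender"] == "male" and c["complete"]]
    women = [c for c in caregivers if c["gender"] == "female" and c["complete"]]
    # rng.sample raises ValueError if there are fewer women than men.
    return men + rng.sample(women, len(men))


# Made-up study roster: 2 eligible men, 1 ineligible man, 7 eligible women.
study = (
    [{"gender": "male", "complete": True}] * 2
    + [{"gender": "male", "complete": False}]
    + [{"gender": "female", "complete": True}] * 7
)
selected = balanced_by_gender(study)
print(len(selected))
```

Applying the function separately to each of the three studies and pooling the results would keep the gender split even within every study.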

Data collection method

See references above

Response rate

Not provided

Format & design (readability, # of items, time to complete, response categories)

20 items addressing depressive symptoms. Time frame: within the last week. Response format: “rarely or none of the time” (0), “some or a little of the time” (1), “occasionally or a moderate amount of time” (2), and “most or all of the time” (3). In most studies, researchers use a total scale score that sums the responses to all 20 four-point items (theoretical range: 0-60). The total score tends to be positively skewed in nonpsychiatric populations, with most respondents scoring in the lower ranges and mean scale scores not exceeding 10 in the general population.
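The total-score rule above can be sketched as follows. This is an illustration only: the function name `cesd_total` and the `reverse_items` parameter are hypothetical, and the sketch assumes that any positively worded items are reverse-coded before summing, a step this summary does not spell out.

```python
def cesd_total(responses, reverse_items=()):
    """Sum 20 item responses coded 0-3 into a total score (range 0-60).

    `reverse_items` holds 0-based positions of items to reverse-code
    (3 - response) before summing; this parameter is an assumption.
    """
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be coded 0, 1, 2, or 3")
    return sum(3 - r if i in reverse_items else r
               for i, r in enumerate(responses))


# A respondent answering "some or a little of the time" (1) to every item.
print(cesd_total([1] * 20))  # 20
```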

Type of measurement (nominal, ordinal, interval, ratio)

ordinal

Scoring (range, direction, rules, missing data)

See above

Availability of translations & source

Not provided

Psychometric Properties:

Scale construction

“Analyses of the internal structure of the CES-D scale usually yield a four-factor model which includes a seven-item “depressive affect” or “mood” subscale, a four-item “positive affect” or “well-being” subscale, a seven-item “somatic and retarded activity” subscale, and a two-item “interpersonal” subscale (Clark et al., 1981; Berkman et al., 1986; Ensel, 1986a; Hertzog et al., 1990). Not all items seem to fit well into this four-factor model. For various reasons, researchers have sometimes excluded a few items from the scale (Radloff, 1977; Ensel, 1986; Liang et al., 1989). While some researchers have found the subscale dimensions to be sufficiently independent to investigate their relations to predictor variables separately (Krause, 1986; Gatz and Hurwicz, 1990), others have argued that there is not enough empirical differentiation to warrant partitioning the CES-D scale into multiple subscales (Hertzog et al., 1990)” (p. 240)

Basic summary statistics

Mean of 13.2 on the total CES-D scale for this sample. Among cancer patients: women average 13.8 and men 12.6. Among caregivers: women average 15.8 and men 13.8.

Variability

The skewed response pattern of the interpersonal items resulted from the fact that more than 90% of both male and female respondents in the cancer patient sample indicated that they “rarely or none of the time” thought people unfriendly or felt disliked.

Test-retest reliability

Not provided

Interrater reliability

Not provided

Internal consistency

Cronbach’s alpha = 0.89 for the 20-item scale and 0.88 for the 15-item scale
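For readers unfamiliar with the statistic, Cronbach's alpha for k items is k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. A minimal sketch, with a hypothetical function name and made-up toy responses (not study data):

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: four respondents, three items coded 0-3.
toy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]
print(round(cronbach_alpha(toy), 2))
```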

Content validity

Not provided

Construct validity

Not provided

Concurrent validity

Not provided

Predictive validity

Not provided

Sensitivity to change

Not provided

Differential Item Functioning (DIF)

Variable studied (e.g., groups)

Gender (female: n = 361; male: n = 347)

Sample size

sample of 708 cancer patients (and a sample of 504 caregivers of chronically ill elderly for confirmation of results)

DIF method used (e.g., MH, IRT, logistic regression, MIMIC, other factor analysis)

Confirmatory factor-analytic models. Formal tests of DIF were not performed.

The examination of the degree to which the CES-D scale is “factorially invariant” across groups of male and female cancer patients involves the imposition of several nested factor models on both groups of respondents. “These nested models are compared to a baseline model requiring only that the same indicator items load on the same subscale factors for male and female patients. The baseline model is derived from the usual four-factor model with CES-D items grouped into subscales. On the opposite end of the continuum of nested models, a highly restrictive model is proposed that incorporates the hypotheses (a) that all unstandardized factor loadings are equal for male and female respondents, (b) that all error variances are equal across the comparison groups, and also (c) that all covariances among the subscale factors are equal across the groups. If consistent with the data, this model would entail the absence of any gender bias since all free parameters are constrained to be equal in both gender groups, with the implication that the underlying factor model is identical in both subgroups. To test for possible deviations from this strict model, a series of Lagrange Multiplier tests are implemented.  This test helps evaluate which constraint(s) should be relaxed to yield improvements in the overall goodness-of-fit of the model. Since the Lagrange Multiplier test is used in an exploratory manner to discover which items produce gender-biased responses, the model is retested on a second, independent sample to avoid taking advantage of sampling chance.” (p. 241)
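The nested models described in the passage above can be written compactly. The notation below is a conventional one for multi-group factor models and is an assumption, not taken from the paper: g indexes the group (m = male, f = female), Λ holds the unstandardized factor loadings, Φ the factor covariances, and Θ the (diagonal) error variances.

```latex
\begin{align}
\Sigma^{(g)} &= \Lambda^{(g)} \Phi^{(g)} \Lambda^{(g)\top} + \Theta^{(g)},
  \qquad g \in \{m, f\}
  && \text{model-implied covariance in group } g \\
\Lambda^{(m)} &= \Lambda^{(f)}
  && \text{(a) equal unstandardized loadings} \\
\Theta^{(m)} &= \Theta^{(f)}
  && \text{(b) equal error variances} \\
\phi^{(m)}_{jk} &= \phi^{(f)}_{jk}, \qquad j \neq k
  && \text{(c) equal factor covariances}
\end{align}
```

The baseline model imposes only the same pattern of zero and nonzero entries in Λ for both groups; the fully constrained model adds (a) through (c).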

The interpersonal items were excluded from the scale because of their lack of face validity, lack of contribution to scale variance, and undesirable psychometric attributes (skewness).

Test of model assumptions

“The attempt to fit the bias-free model to the data resulted in acceptable values for the overall goodness-of-fit indices: NFI = 0.976 and CFI = 0.989. However, this model clearly does not fit as well as the constraint-free null model: the χ² difference test yields a highly significant (p < 0.000) χ² value of 105.34 (df = 39). After relaxation of only five equality constraints identified through the Lagrange Multiplier tests, however, an alternative model was found that fit the data as well as the null model (χ² = 38.58, df = 34, p = 0.270). This alternative model no longer required equality across gender for two factor loadings, two error variances, and one correlation among the latent subscales. Confirming the fit of this same model on the caregiver sample also resulted in a nonsignificant χ² difference of 29.41 (df = 34, p = 0.517). In the context of our discussion of gender bias, it is important to examine which population parameters differ between male and female respondents in the new well-fitting factor model. The five constraints that had to be relaxed involve factor loadings and error variances of the items: (1) “thought life a failure,” (2) “talked less than usual,” and (3) “had crying spells.” In addition, the strength of the correlation between the somatic symptoms factor and the depressive mood factor differed among male and female respondents.” (p. 243)
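The chi-square difference tests quoted above compare a constrained model with the null model by referring the difference in chi-square values to a chi-square distribution whose degrees of freedom equal the number of added constraints. The sketch below computes the upper-tail p-value with the standard library only; the function name `chi2_sf` is hypothetical, and the series expansion is one of several ways to evaluate the distribution.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Evaluates the regularized lower incomplete gamma function by its
    power series and returns one minus that value.
    """
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:  # stop once further terms are negligible
        n += 1
        term *= h / (a + n)
        total += term
    lower = math.exp(-h + a * math.log(h) - math.lgamma(a + 1.0)) * total
    return max(0.0, 1.0 - lower)


# Chi-square differences and degrees of freedom from the quoted passage.
for label, x2, df in [("fully constrained vs. null", 105.34, 39),
                      ("five constraints relaxed vs. null", 38.58, 34)]:
    print(f"{label}: p = {chi2_sf(x2, df):.3f}")
```

A small p-value indicates that the added equality constraints significantly worsen fit; a large one indicates that the constrained model fits about as well as the null model.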

Purification

Not performed

Evidence of uniform DIF

18 CES-D items were examined for possible gender bias.

“Without the interpersonal factor, the remaining items are grouped into a three factor model, including (1) depressive mood, (2) well-being, and (3) somatic symptoms factors. When these three factors are used, a gender-bias free model can be constructed that requires the same factor loadings, the same error variances, and the same interfactor covariances for both male and female respondents.

To the extent that these cross-group constraints are inconsistent with the data, they can be relaxed until a model is found that fits the data as well as the null model, which puts no cross-group equality constraints on any structural parameters.” (p. 243)

 

“The test of gender difference in the response patterns of the “talked less” and “crying” items involves a multivariate regression with the “talked less” and “crying” items regressed on a dummy variable for gender (1 = female, 0 = male) and the remaining 15 CES-D items. Responses to both of these items depend on the gender of the respondent even after controlling for respondents’ general levels of depressive symptomatology as represented by the 15 unbiased CES-D items. Men who otherwise have the same level of depressive symptoms as women are less likely to have “crying spells,” a fact that marks this item as a gender-biased indicator of depression. The gender bias in the response to the “talked less” item is in the opposite direction: depressed men are more likely to reduce their verbal communication compared with equally depressed women” (p. 245)
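The regression check described in the passage above can be illustrated with ordinary least squares: regress one suspect item on a gender dummy plus the remaining items and inspect the gender coefficient. The sketch below is a simplified single-outcome version (the paper describes a multivariate regression with two outcomes); the function names are hypothetical and the toy data are made up, not study data.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def gender_effect(item, gender, other_items):
    """Coefficient of the gender dummy (1 = female, 0 = male) for one item,
    controlling for the respondent's scores on the other items."""
    X = [[1.0, float(g), *map(float, others)]
         for g, others in zip(gender, other_items)]
    return ols(X, [float(v) for v in item])[1]


# Toy data: six respondents, one control item.
gender = [0, 0, 0, 1, 1, 1]
controls = [[0], [1], [2], [0], [1], [3]]
crying = [0, 1, 2, 0.5, 1.5, 3.5]  # built as 0.5 * gender + control
print(round(gender_effect(crying, gender, controls), 2))  # 0.5
```

A gender coefficient that differs from zero after controlling for the other items is what marks an item as gender-biased in this approach.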

“In the caregiver sample, the “crying” and “talked less” items produced similar gender-specific responses, with statistically significant biases of -0.13 and +0.23, respectively.” (p. 246)

Summary: Two CES-D items show different response patterns among men and women. Three additional CES-D items were excluded because of their poor psychometric qualities, leaving a subset of 15 gender-bias free scale items.

Evidence of non-uniform DIF

Not performed

Magnitude of DIF

Not provided

Impact of DIF

“Employing the original 20-item scale in a two-way analysis of variance with the combined 1212 cases stratified by gender and subject group (cancer patients vs. caregivers) yields the following results: among cancer patients, women average 13.8 and men 12.6 on the total CES-D scale; among caregivers, the means are 15.8 (women) and 13.8 (men). These values represent significant differences by gender (p < 0.004) and subject group (p < 0.004), but there is no interaction (p > 0.514). After removal of the two gender-biased items as well as the “failure” and the two interpersonal items, the following scale means obtain for the reduced 15-item CES-D scale: among cancer patients, 12.1 (women) and 11.2 (men); among caregivers, 13.9 (women) and 12.2 (men). As in the case of the total CES-D scores based on all 20 items, the gender (p < 0.005) and subject group (p < 0.004) effects remain significant, but the reduction in the gender difference in CES-D scores from 1.6 to 1.3 (for the combined sample of 1212) is itself statistically significant. Despite the narrowing of the gender difference, the reduced 15-item scale correlates very highly with the original 20-item scale (0.98). In addition, shortening the CES-D scale by the five selected items barely affects its overall reliability: the Cronbach’s alpha of 0.89 for the 20-item scale changes to 0.88 for the 15-item scale.”

“While the reduced 15-item CES-D scale no longer exaggerates gender differences in depressive symptomatology, it retains almost all the information of the original 20-item scale as demonstrated by the very high correlation between the original 20-item and the shortened 15-item version of the CES-D.”
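The reported gender differences of 1.6 (20 items) and 1.3 (15 items) can be reproduced from the four cell means quoted above by averaging the two subject groups with equal weight. Whether the authors computed the figures this way is an assumption; the sketch only shows that this reading is consistent with the numbers given.

```python
# Cell means quoted above, as (women, men) for each subject group.
means_20 = {"cancer patients": (13.8, 12.6), "caregivers": (15.8, 13.8)}
means_15 = {"cancer patients": (12.1, 11.2), "caregivers": (13.9, 12.2)}


def gender_gap(cell_means):
    """Women's minus men's mean, averaging subject groups with equal weight."""
    women = sum(w for w, _ in cell_means.values()) / len(cell_means)
    men = sum(m for _, m in cell_means.values()) / len(cell_means)
    return women - men


print(round(gender_gap(means_20), 1))  # 1.6
print(round(gender_gap(means_15), 1))  # 1.3
```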

 Reviewer comments:

1. Subjects in the three aggregated subsamples differ significantly with respect to demographic variables and ADL functioning. These characteristics can themselves be associated with depression and with gender, and they were not adjusted for in the analyses.

Key references:

Berkman, L.F.; Berkman, C.S.; Kasl, S.; Freeman, D.H.; Leo, L.; Ostfeld, A.M.; Cornoni-Huntley, J.; and Brody, J.A. Depressive symptoms in relation to physical health and functioning in the elderly. American Journal of Epidemiology, 124:372-388, 1986.

Clark, V.A.; Aneshensel, C.S.; Frerichs, R.R.; and Morgan, T.M. Analysis of the effects of sex and age in response to items on the CES-D scale. Psychiatry Research, 5:171-181, 1981.

Devins, G.M., and Orme, C.M. Center for Epidemiologic Studies Depression Scale. In: Keyser, D.J., and Sweetland, R.C., eds. Test Critiques. Vol. 2. Kansas City: Test Corporation of America, Inc., 1986. pp. 144-160.

Ensel, W.M. Measuring depression: The CES-D scale. In: Lin, N.; Dean, A.; and Ensel, W.M., eds. Social Support, Life Events, and Depression. New York: Academic Press, 1986a. pp. 51-70.

Ensel, W.M. Sex, marital status, and depression: The role of life events and social support. In: Lin, N.; Dean, A.; and Ensel, W.M., eds. Social Support, Life Events, and Depression. New York: Academic Press, 1986b. pp. 231-247.

Ensel, W.M., and Lin, N. The life stress paradigm and psychological distress. Journal of Health and Social Behavior, 32:321-341, 1991.

Gatz, M., and Hurwicz, M. Are old people more depressed? Cross-sectional data on Center for Epidemiologic Studies Depression Scale factors. Psychology and Aging, 5:284-290, 1990.

Given, B., and Given, C.W. Family homecare for cancer-A community-based model (Grant #1 R01 NR01915). Funded by National Center for Nursing Research, 1987a.

Given, C.W., and Given, B. Caregiver responses to managing elderly patients at home (Grant #1 R01 AG06584). Funded by the National Institute on Aging, 1986.

Given, C.W., and Given, B. Impact of Alzheimer’s disease on family caregivers (Grant #1 R01 MH41766). Funded by the National Institute of Mental Health, 1987b.

Given, C.W., and Given, B. Caregiver responses to managing elderly patients at home (Grant #2 R01 AG06584). Funded by the National Institute on Aging, 1989.

Hertzog, C.; Van Alstine, J.; Usala, P.D.; Hultsch, D.F.; and Dixon, R. Measurement properties of the Center for Epidemiologic Studies Depression Scale (CES-D) in older populations. Psychological Assessment, 2:64-72, 1990.

Krause, N. Stress and sex differences in depressive symptoms among older adults. Journal of Gerontology, 41:727-731, 1986.

Liang, J.; Van Tran, T.; Krause, N.; and Markides, K.S. Generational differences in the structure of the CES-D scale in Mexican Americans. Journal of Gerontology, 44:S120-130, 1989.

McCorkle, R. Evaluation of home care for cancer patients (Grant #R01 NR01914). Funded by the National Center for Nursing Research, 1990.

Radloff, L.S. The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1:385-401, 1977.

Radloff, L.S., and Locke, B.Z. The community mental health assessment survey and the CES-D scale. In: Weissman, M.M.; Myers, J.K.; and Ross, C.E., eds. Community Surveys of Psychiatric Disorders. New Brunswick, NJ: Rutgers University Press, 1986. pp. 177-189.

Roberts, R.E. Reliability of the CES-D scale in different ethnic contexts. Psychiatry Research, 2:125-134, 1980.

Roberts, R.E.; Andrews, J.A.; Lewinsohn, P.M.; and Hops, H. Assessment of depression in adolescents using the Center for Epidemiologic Studies Depression Scale. Psychological Assessment, 2:122-128, 1990.

Schulz, R. Living with homecare: Cancer patients and caregivers (Grant #R01 CA48635). Funded by the National Cancer Institute, 1990.
