Gender and the recognition of vertebral fractures
Osteoporosis is a generalized skeletal disorder characterized by a decrease in bone quantity, bone quality, or both, leading to an increased risk of fragility (that is, low-trauma or low-energy) fractures. With the delineation of this disease, the recognition that it may affect both sexes, and the demonstration by Fuller Albright that treatment with estrogens could reverse the negative calcium balance that developed in women after menopause or oophorectomy, effective treatments for osteoporosis began to emerge in the late 20th century (1).
From a radiological (“imaging”) perspective, peripheral fracture recognition in osteoporosis is rarely a challenge. More recently, techniques have been developed to measure bone density or mass. Not least, small-aperture high-resolution computed tomography and magnetic resonance imaging methods are yielding insights into bone micro-architecture and “bone quality” (2). All the more perplexing, therefore, is the fact that, despite these advances, there had been only limited evidence available for use in effectively diagnosing osteoporotic vertebral fractures (OVFs) from plain radiographs, especially as these are the most widely used tool in this context. Inevitably, there had consequently remained great uncertainty about the evolution of OVFs and confusing data about their location and incidence as a function of gender.
The tangled web we weave
At this distance it is not clear by which criteria early radiologists made a diagnosis of vertebral fracturing, hampered as they were by the lack of a “gold standard” for diagnosis. In 1960 Simon, in discussing changes in bone shape, referred only briefly to osteoporosis, and then in the context of the collapse of a single vertebra (3). Apart from obvious damage to a vertebral body, one early concept appears to have been that of wedge fracturing: the observation that the anterior height of a vertebra might on occasion be recognizably less than the posterior, or vice versa, a finding that was apparently considered to be abnormal and attributed to fracturing. Indeed, such wedge deformities are rarely seen in the spines of children. At a time when osteoporosis still had limited therapeutic implications, Fletcher, writing in 1947, analysed the implications of “wedge fracturing” by showing that, in military servicemen, the ratio of the anterior-to-posterior height of vertebrae in the thoraco-lumbar spine was similar to any biological measurement in having a distribution about a mean. The range of that distribution extended to the degree of wedging that had come to be attributed to a potential fracture (4). Fletcher thus concluded that the very concept of a “wedge fracture” is suspect.
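Fletcher's argument can be illustrated with a little arithmetic. The following is only a minimal sketch, assuming that the anterior-to-posterior height ratio is normally distributed in an unfractured population; the function names, the mean, the standard deviation and the cutoff are all hypothetical values chosen for illustration and are not taken from Fletcher's report (4) or from any diagnostic scheme.

```python
from statistics import NormalDist


def wedge_ratio(anterior_mm: float, posterior_mm: float) -> float:
    """Return the anterior-to-posterior vertebral height ratio."""
    if anterior_mm <= 0 or posterior_mm <= 0:
        raise ValueError("vertebral heights must be positive")
    return anterior_mm / posterior_mm


def fraction_below(cutoff: float, mean_ratio: float, sd_ratio: float) -> float:
    """Share of a normally distributed population with a ratio below `cutoff`.

    Normality is an assumption made for this sketch only.
    """
    return NormalDist(mu=mean_ratio, sigma=sd_ratio).cdf(cutoff)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from any study.
    MEAN_RATIO = 0.92
    SD_RATIO = 0.06
    CUTOFF = 0.80  # hypothetical ratio below which a vertebra is called "wedged"

    print(f"ratio for one vertebra: {wedge_ratio(22.0, 26.0):.2f}")
    share = fraction_below(CUTOFF, MEAN_RATIO, SD_RATIO)
    print(f"unfractured vertebrae below the cutoff: {share:.1%}")
```

Whatever values are chosen, any fixed cutoff that lies within the normal range will label some proportion of unfractured vertebrae as “wedged”, which is the substance of Fletcher's objection.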
Hurxthal similarly explored vertebral measurements of wedging and biconcavity without the benefit of correlation with bone mineral density (BMD) (5).
However, morphometry (the application of measurements of vertebral asymmetry to diagnosis) came to dominate thinking in this context, such that very many approaches to morphometry have been proposed (6-13). It is a particular irony that an important step in the evolution of morphometry was the use by Barnett and Nordin of a measurement to quantitate vertebral end-plate deformation. In doing so these investigators made no reference to, and showed no interest in, “wedging” (6).
No proposed morphometric diagnostic tool has been subjected to more than proof of concept, and none to the testing that might justify its use in clinical practice (6-13). Indeed, many authors have cautioned that morphometry is a tool of potential application in population studies but not one to be applied without reservation to clinical care (7,12).
The use of morphometry for fracture diagnosis culminated in the Genant Semi-quantitative (GSQ) diagnostic scheme (14), the most widely used tool in this context. Two aspects of this tool deserve particular mention. First, the “semi-quantitative” qualifier was used because the method involved no actual measurement but was, rather, a visual estimate by the interpreting radiologist or clinician. Second, Genant et al. emphasized in their report the importance of recognizing end-plate damage as characteristic of OVF but did not make this a part of the iconic diagnostic visual tool, which is usually used alone (14). Thus, this advice has more often been ignored in pedagogy and practice (15,16). Validation of the tool never amounted to more than establishing that its use was reproducible among its proponents. The GSQ tool is, in at least one context, the currently recommended clinical technique of choice for diagnosing OVF (17), despite the reservations noted above (7,12).
After examining several tools for diagnosis, Jiang et al. proposed the use of a structured morphologic or qualitative approach to fracture diagnosis: the Algorithm-Based Qualitative tool (ABQ) (18). This focuses on the recognition of vertebral end-plate damage while excluding confounders, such as Schmorl nodes. In a previous report my colleagues and I described modifying this tool (mABQ) to include anterior cortical fractures, which are visually self-evident and widely accepted as fractures (19).
Several recent analyses have identified the diagnostic uncertainty posed by OVF (20,21), although this has a long history (22). More recently some degree of clarity has emerged in the shape of evidence from large prospective studies. Three distinct studies, from Hong Kong (23,24), Rotterdam, The Netherlands (25) and Canada (19), have been reported involving a combined total of more than 17,725 men and women. All three groups of investigators showed that qualitative evidence of vertebral fracture aligns better with lower bone densities and with the risk of future vertebral and non-vertebral fractures than do attempts to diagnose OVF by morphometric means. These groups of workers developed their protocols in isolation rather than collaboratively and used differing terminology (Table 1) such that, while their findings lend themselves to broad conclusions, uncertainties remain in the application of detail (13).

In addition to the classifications of vertebral fracture noted above and in Table 1, other classifications of spinal fractures also exist (26) but many relate to high-energy (typically motor vehicle accident-induced) trauma. These are often “burst” fractures and commonly associated with long-tract neurological signs. Such characteristics are very rarely seen in the low-energy trauma context typical of osteoporosis.
Next steps
While some degree of clarity has emerged in respect of OVF diagnosis, as noted above, the Hong Kong and Canadian study groups have continued to analyze their data, particularly in respect of what we know of the natural history of OVF and potential differences relating to gender and ethnicity. Previous studies had been handicapped by the uncertainties of, and differences between, the diagnostic tools used. Thus Leidig-Bruckner et al. noted that, in a population study, there are sex differences in the validity of using morphometric criteria as an index of prevalent OVF, with the rates of quantitative vertebral deformities and other such findings ranging up to 19.2% in women and 16.6% in men (27). Previously Szulc had commented that epidemiological studies suggest that vertebral deformities in men do not increase steeply with ageing, a finding that would not be typical of osteoporosis as it is observed clinically (28). Szulc further suggested that many of the morphometric deformities observed might be unrelated to osteoporosis (28). BMD also did not differ between those with and without morphometric vertebral deformities, although in evaluating the vertebrae these workers did slightly increase the GSQ Grade 1 threshold for such a deformity at T6–9 (28). Szulc went so far as to suggest that the majority of morphometric vertebral deformities in elderly men might be due to previous high-energy trauma.
Lauridsen measured vertebrae from T8 to L3 in a random sample of 164 spinal radiographs and found “anterior wedging” of vertebrae to be much commoner in men than in women (29). The degree of wedging was unrelated to age and no attempt was made to relate these findings to symptomatology. Matsumoto et al., in a study using spinal magnetic resonance images, made similar observations in both sexes, noting that the “ratio of anterior vertical height to posterior vertical height of the vertebral body in males, thinner subjects, smokers, and subjects with abnormalities of the endplates such as a Schmorl node [were] significantly smaller … than [in] females, fatter subjects, non-smokers, and those without endplate abnormalities”. Despite such observations these workers appeared to accept that vertebral wedging might be osteoporotic in provenance (30). Irrespective of fracture classification, Kherad, in the MrOS Sweden study, found “prevalent vertebral fractures to be of low clinical relevance” in men (31).
Of particular interest, Jiang, in describing her ABQ tool, noted that there was a counterintuitive difference in the segmental distribution in the spine between prevalent and incident fractures as diagnosed using morphometric but not morphologic criteria (18). This finding was reproduced in the Canadian (19) and Rotterdam (25) studies but not remarked upon in the comparable Hong Kong report (23).
In the index publication Wáng et al. (32) have more recently found in the Hong Kong MrOS population that “elderly males, with or without existing osteoporotic vertebral fracture, had much lower future vertebral fracture risk than elderly females”. This is not surprising but is different from the Canadian experience. The CaMos group had reported, in an abstract presented at the 2018 meeting of the American Society for Bone and Mineral Research, that although prevalent vertebral morphometric deformities (without morphologic abnormalities) were common, incident morphologic vertebral fractures occurred much more frequently than incident morphometric deformities (without morphologic abnormalities) and better predicted further incident fractures. This difference in the natural history of the vertebral abnormalities was interpreted as further evidence that radiographic vertebral morphometric deformities and morphologic vertebral fractures are essentially different (33). Furthermore, since osteoporosis is characterized by reduced bone quantity and/or quality with an increased predisposition to fractures, we observed, as might be expected, increased loss of total hip BMD over time in participants with incident mABQ vertebral fracture. GSQ deformities were insufficiently frequent to assess any such association (33). That our findings were somewhat different from those of Wáng et al. (32) was perhaps because of ethnic differences; in addition, follow-up extended over a decade in the CaMos report whereas it was only 4 years in the Wáng et al. study. In these ways the full understanding of vertebral fracture recognition still eludes us.
In the absence of a “gold standard” it is perhaps useful to summarize the criteria which have been consistently used in the analyses of vertebral fracture diagnosis (13):
- Inverse correlation with BMD [MrOS Hong Kong (23); The Rotterdam Study, The Netherlands (25); CaMos, Canada (19)].
- Correlation with vertebral fracture outcomes [MrOS Hong Kong (23,24); The Rotterdam Study, The Netherlands (25); CaMos, Canada (19)].
- Correlation with non-vertebral fracture outcomes [MrOS Hong Kong (23); The Rotterdam Study, The Netherlands (25); CaMos, Canada (19)].
- Insights from biomechanical studies in general.
- Correlation with mortality (no published data).
However, a consensus has yet to be reached on the following:
- Correlation of natural history outcomes consistent with what is known about the effects of age and gender on osteoporosis [MrOS Hong Kong (32); CaMos, Canada (33)].
- Potential role of ethnicity which might be reflected in the radiological signs of fracture.
- Prevalent and incident fracture distribution by segment, as it correlates with biomechanical studies [The Rotterdam Study, The Netherlands (25); CaMos, Canada (19) vs. MrOS Hong Kong (32)].
To understand the niceties of the existing data is not easy. One possibility is that cultural differences exist in the way in which anomalous findings, for example in the thoracic spine, are interpreted with respect to fracturing, degenerative change and their role in scoliosis (Figure 1). The reported mismatch in the segmental distribution of vertebral fractures referred to above (18,19,25) could be interpreted as an explanation for the growing suspicion of the over-diagnosis of vertebral fractures in osteoporosis. This possibility has been recognized (28,34,35) and more forcefully stated by Wáng et al. (36), and is implicit in the fact that reports of vertebral fracturing increasingly disregard fractures while studies not using this strategy can produce very counter-intuitive results (37).

Those groups lately working in this context and with enrollments large enough to be statistically credible [chiefly the MrOS Hong Kong group (23,24); the Rotterdam group (25); the CaMos group in Canada (19)] have done so in isolation. The methods used have been comparable but not identical and, not least, the descriptive terminology used has been different (Table 1). Even normal vertebrae may be labeled as characterized by “short vertebral height” (wedged but to a smaller degree than the later GSQ classification) (38).
Jiang et al. have already found that there are differences in the frequency of morphologic signs of fracture depending, for example, on the use of the ABQ paradigm, which has very stringent criteria (18). The complexity of, and the uncertainties (both biological and statistical) surrounding, the radiographic recognition of OVF are such that resolving them was never going to be a simple task.
The Hong Kong group have already clarified the radiological signs of fracturing (39,40), proposed a classification of vertebral fractures and suggested a potential nomenclature (41). Perhaps it is time to build upon this, especially given the increasing agreement that radiological over-diagnosis of low-grade vertebral fractures is a persisting clinical problem that threatens the credibility of the use of radiology in risk assessment (28,34-36,42).
These differences are highlighted in the contrast between the more recent Hong Kong and CaMos data.
What remains to be determined
- In the general sense of the word, a taxonomy of vertebral fractures/deformities remains to be agreed. The varied terminology now used to describe and define vertebral fractures (Table 1) does not serve clarity, particularly for those seeking guidance free from arcane terminology and for use in day-to-day radiologic practice.
- The diagnostic strategies used in the pivotal studies described above were similar but not identical and the derivation of an evidence-based consensus concerning diagnostic criteria is desirable.
- By some means a consensus has to be arrived at by which some terms that are widely used such as “wedge fracture” are clarified as to their meaning, if any.
- The relevance, if any, of the use of the descriptor “short vertebral height” (38), if it is to be used as proposed to describe a normal vertebra with some “non-threshold” degree of wedging.
- The natural history throughout life of morphometric vertebral deformities.
- Given the above, further examinations of the impact of gender and ethnicity on the natural history of fracturing are not only desirable but necessary.
