ISSN: 2074-8132
Introduction. There are no fewer than two hundred algorithms for sex estimation based on cranial morphology, relying on statistical analysis of non-metric, linear, and angular traits and their combinations. Nevertheless, many physical anthropologists prefer to rely on visual observation. This study explores potential reasons for the preference for a visual approach and compares the effectiveness of visual and statistical methods for sex estimation.
Materials and methods. The study is based on an analysis of publications from the past 70 years on methods of sex estimation from cranial traits. Accuracy estimates were compared using non-parametric tests, taking into account differences in statistical methods, validation approaches (no validation, cross-validation, independent test), and variable types (non-metric traits, craniometry, geometric morphometrics).
Results. General reasons for skepticism towards algorithms include unrealistic expectations regarding their capabilities, lower tolerance for errors made by models than for errors made by humans, and lack of control over classification. However, algorithms generally surpass experts in predicting the target variable. The average accuracy of visual sex estimation based on cranial traits is slightly lower than that of statistical models and varies noticeably. The accuracy of estimations made by experienced anthropologists is comparable to the average performance of models. Nevertheless, the effectiveness of algorithms diminishes significantly when they are applied to datasets originating from sources other than the training set, particularly when dealing with craniometric traits. In a substantial portion of studies, the training datasets are too small for a reliable assessment of model effectiveness, and the sex distribution is skewed towards male skulls, leading to some inflation of the accuracy estimates. Model effectiveness can also decline due to errors in the evaluation of non-metric traits, and the assessment of inter-researcher discrepancies does not allow for an evaluation of their impact on model accuracy.
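One way a skewed sex distribution can inflate a reported accuracy is sketched below. The sketch assumes, purely for illustration, that a model classifies male skulls more accurately than female skulls; the sample sizes, per-sex accuracies, and function names are hypothetical and are not taken from the reviewed studies.

```python
def overall_accuracy(n_male, n_female, acc_male, acc_female):
    """Share of correctly classified skulls in a sample of the given composition."""
    correct = n_male * acc_male + n_female * acc_female
    return correct / (n_male + n_female)


def balanced_accuracy(acc_male, acc_female):
    """Mean of the per-sex accuracies, independent of sample composition."""
    return (acc_male + acc_female) / 2


# Hypothetical per-sex accuracies (assumption, not data from the study).
ACC_MALE, ACC_FEMALE = 0.90, 0.80

# Sample skewed towards male skulls: 70 male, 30 female (hypothetical).
skewed = overall_accuracy(70, 30, ACC_MALE, ACC_FEMALE)

# Accuracy expected with equal numbers of male and female skulls.
balanced = balanced_accuracy(ACC_MALE, ACC_FEMALE)

print(f"skewed sample: {skewed:.2f}, balanced: {balanced:.2f}")
```

Under these assumed figures the skewed sample reports a higher overall accuracy than the same model would achieve on a sample with equal numbers of male and female skulls, which is why per-sex or balanced accuracy may be a more comparable summary across studies.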
Conclusion. Despite an extensive bibliography, there remains a lack of data on both the accuracy of the visual approach to sex estimation and the reliability of models with claimed high effectiveness. The adoption of flexible methodologies enabling researchers to independently control both variable selection and the composition of the training set will help overcome algorithm aversion and enhance the quality of estimates. © 2023. This work is licensed under a CC BY 4.0 license.
Introduction. There are several standardized methods for estimating the age of a skull. Most of these methods are based on the analysis of suture obliteration and on tooth wear scoring. However, many anthropologists prefer a more subjective approach, relying on general impressions without using a set of standardized criteria. This study aimed to assess the effectiveness of a visual method for age estimation and for reconstruction of the age-at-death structure of a skeletal sample.
Materials and methods. The study was based on a series of 116 skulls from the early 20th century collected by the Peter the Great Museum of Anthropology and Ethnography (Kunstkamera). These specimens had documented sex and age information. Two researchers independently assessed the age of the skull specimens and recorded the degree of suture fusion on the cranial vault as well as the level of tooth wear on the occlusal surfaces. The correlation between age and estimated scores was calculated using Spearman’s rank correlation coefficient. The discrepancy between estimated and actual ages was measured by the mean absolute error (MAE) and by the systematic error (SE), defined as the average difference between documented and estimated ages, both for the entire sample and for each age group. Intraclass correlation coefficients were used to assess the consistency of the authors’ estimates.
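The two discrepancy measures described above can be sketched as follows. This is a minimal illustration, not the authors' code: the sign convention for SE (documented minus estimated) is an assumption, the function names are hypothetical, and the ages are invented for the example.

```python
from statistics import mean


def mean_absolute_error(documented, estimated):
    """Average magnitude of the difference between documented and estimated ages."""
    return mean(abs(d - e) for d, e in zip(documented, estimated))


def systematic_error(documented, estimated):
    """Average signed difference, documented minus estimated (assumed convention).

    A positive value means ages were underestimated on average.
    """
    return mean(d - e for d, e in zip(documented, estimated))


# Hypothetical ages in years, invented for illustration only.
documented_ages = [22, 35, 48, 61, 74]
estimated_ages = [30, 38, 45, 55, 60]

mae = mean_absolute_error(documented_ages, estimated_ages)
se = systematic_error(documented_ages, estimated_ages)

print(f"MAE = {mae:.1f} years, SE = {se:.1f} years")
```

In this invented sample the youngest individuals are estimated as older and the oldest as younger than documented, so the signed errors partly cancel and SE is smaller in magnitude than MAE. Computing both measures per age group, as the abstract describes, would expose that pattern directly.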
Results. The authors' estimates showed moderately high agreement with each other and a moderate positive correlation with actual age. The accuracy of the visual assessments was comparable with that of more formalized methods based on the degree of suture obliteration. The estimates also exhibited regression to the mean: the ages of individuals in younger cohorts were systematically overestimated and those of individuals in older cohorts were underestimated. The accuracy of determining the age-at-death distribution depends to some extent on the actual structure of the sample. Averaging estimates from different authors, or several estimates made by the same author at widely separated times, brings the estimates closer to the documented data.
Conclusion. Interobserver agreement of age estimates can be increased by recording traits on the same point scales and by widening age intervals. The accuracy of estimates can be improved by repeated age estimation and by the “wisdom of the crowd” effect. © 2024. This work is licensed under a CC BY 4.0 license.