Abstract:

Cervical spondylosis is a degenerative condition of the cervical spine that affects more than 85% of people over 60 and nearly 25% of those under 40. Although X-ray is the most accessible imaging modality, manual interpretation achieves only 68.3% accuracy, so costly CT or MRI scans are often needed for confirmation. To address this limitation, we developed CervNet, a multimodal deep learning model that integrates X-ray images with quantitative spinal parameters. Vertebrae C2–C7 were localized using a Single Shot Detector, and 77 clinically relevant spinal parameters were computed from the detections. An attention block extracted diagnostic patterns such as disc space narrowing from these parameters, and its output was fused with visual features derived from EfficientNetB7, which capture findings such as osteophytes and endplate sclerosis. CervNet achieved 99.09% accuracy, surpassing prior models. By mimicking the real radiological workflow, CervNet improves both diagnostic accuracy and interpretability, offering a practical solution for early detection of cervical spondylosis in resource-limited settings.
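
The fusion step described above can be illustrated with a minimal sketch. It is not the CervNet implementation: the attention form (a softmax over one score per parameter), the fusion by concatenation, the feature width of 2560 (the pooled output width of a standard EfficientNetB7 backbone), and all function and variable names are assumptions made for illustration. Random values stand in for real parameters, learned weights, and image features.

```python
import math
import random

NUM_PARAMS = 77   # number of spinal parameters stated in the abstract
IMAGE_DIM = 2560  # assumed: pooled feature width of a standard EfficientNetB7


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(params, score_weights):
    """Reweight each parameter by a softmax attention coefficient.

    score_weights is a hypothetical stand-in for learned weights.
    """
    scores = [w * x for w, x in zip(score_weights, params)]
    alphas = softmax(scores)
    attended = [a * x for a, x in zip(alphas, params)]
    return attended, alphas


def fuse(image_features, attended_params):
    """Fuse by concatenation (assumed); a classifier head would follow."""
    return list(image_features) + list(attended_params)


# Placeholder inputs: random values, not real measurements or features.
random.seed(0)
spinal_params = [random.random() for _ in range(NUM_PARAMS)]
score_weights = [random.uniform(-1.0, 1.0) for _ in range(NUM_PARAMS)]
image_features = [random.random() for _ in range(IMAGE_DIM)]

attended, alphas = attend(spinal_params, score_weights)
fused = fuse(image_features, attended)
print(len(fused))
```

In this sketch the attention coefficients sum to one, so they can be read as the relative weight each spinal parameter receives, which is one way such a block could support interpretability.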