Assessment of terminological density in scientific publications on physical culture

Authors

Sergii Iermakov, Georgiy Korobeynikov, David Curby

DOI:

https://doi.org/10.15561/physcult.2025.0203

Keywords:

terminological density, sport and exercise sciences, scientific publications, thematic dictionaries, Medical Subject Headings (MeSH), Web of Science, bibliographic analysis, quantitative evaluation, scientific style

Abstract

Background and Study Aim. Scientific publications in the field of physical culture demonstrate considerable diversity in terminological usage and structural organization. With increasing standards for the quality of academic writing, the need for an objective and quantitative evaluation of terminological density has become more pressing. The aim of this study was to develop and apply a method for automated assessment of terminological density in scientific articles on physical culture using adapted thematic dictionaries.

Material and Methods. The study was based on articles retrieved from the Web of Science (WoS) database. A total of 16 593 bibliographic records related to physical culture, covering the past five years, were extracted. Two dictionaries were employed for analysis: the official Medical Subject Headings (MeSH) in XML format and a thematic dictionary constructed from the WoS document corpus. The analysis included full-text PDF articles from 12 scientific journals, of which 6 were categorized as Q3, 1 as Q4, 3 were indexed in DOAJ, and 2 were not indexed. Terminological density was calculated in Python using the pandas library and evaluated on a scale ranging from very low to high.
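The calculation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so density is assumed here to be the number of dictionary term occurrences divided by the total word count. The inline XML is a tiny stand-in that follows the layout of the MeSH descriptor file, the function names are hypothetical, and the standard library is used instead of pandas to keep the example self-contained.

```python
import re
import xml.etree.ElementTree as ET

# Tiny stand-in for the MeSH descriptor XML file (the real file is far larger).
MESH_SAMPLE = """<DescriptorRecordSet>
  <DescriptorRecord>
    <DescriptorName><String>Exercise</String></DescriptorName>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorName><String>Physical Fitness</String></DescriptorName>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorName><String>Muscle Strength</String></DescriptorName>
  </DescriptorRecord>
</DescriptorRecordSet>"""


def load_terms(xml_text):
    """Collect lower-cased descriptor names from MeSH-style XML."""
    root = ET.fromstring(xml_text)
    path = "./DescriptorRecord/DescriptorName/String"
    return {node.text.strip().lower() for node in root.iterfind(path) if node.text}


def tokenize(text):
    """Split text into lower-cased word tokens (hyphenated words stay whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def terminological_density(text, terms):
    """Assumed definition: dictionary term occurrences / total word count."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    # Try longer multi-word terms first so "physical fitness" counts once.
    patterns = sorted({tuple(tokenize(t)) for t in terms} - {()}, key=len, reverse=True)
    hits, i = 0, 0
    while i < len(tokens):
        for pattern in patterns:
            if tuple(tokens[i:i + len(pattern)]) == pattern:
                hits += 1
                i += len(pattern)
                break
        else:
            i += 1  # no dictionary term starts at this position
    return hits / len(tokens)


terms = load_terms(MESH_SAMPLE)
sample = "Regular exercise improves physical fitness and muscle strength in adults."
density = terminological_density(sample, terms)
```

A single sentence packed with terms yields a density far above the values reported for whole articles; over a full text, most tokens are ordinary words and the ratio drops accordingly.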

Results. The assessment covered 12 journals in the field of physical culture. An optimal density level (0.010–0.019) was identified in 2 journals (16.7%), corresponding to a “balanced use of scientific terminology.” Three journals (25.0%) demonstrated low density (<0.010), characterized as “insufficient elaboration of the topic in scientific language.” In 7 journals (58.3%), a higher density (0.020–0.039) was observed, interpreted as either an “attempt to enhance scientific rigor” or an “excessive terminological load.”
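The bands and shares reported above can be reproduced with a short sketch. The thresholds are those stated in the abstract, treated here as half-open intervals, which is an assumption. The label for values of 0.040 and above is a placeholder, since the abstract does not describe that range, and the function and variable names are hypothetical.

```python
def classify_density(density):
    """Map a density value to the bands named in the abstract."""
    if density < 0.010:
        return "low"
    if density < 0.020:
        return "optimal"
    if density < 0.040:
        return "elevated"
    return "above reported bands"  # placeholder: range not described in the abstract


# Journal counts per band, as reported in the Results paragraph.
REPORTED_COUNTS = {"optimal": 2, "low": 3, "elevated": 7}


def shares(counts):
    """Convert per-band counts to percentages of the total, one decimal place."""
    total = sum(counts.values())
    return {band: round(100 * n / total, 1) for band, n in counts.items()}


band_shares = shares(REPORTED_COUNTS)
```

With 12 journals in total, the computed shares match the percentages given in the text (16.7%, 25.0%, and 58.3%).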

Conclusions. The evaluation of terminological density provides an objective measure of the scientific style of publications in the field of physical culture. The differences identified across journals highlight variability in approaches to presenting scientific material. The integration of specialized dictionaries and the application of relative indicators offer a robust basis for ongoing monitoring and optimization of scientific discourse.

Author Biographies

Sergii Iermakov, Kharkiv State Academy of Design and Arts

sportart@gmail.com; Department of Methodologies of Cross-Cultural Practices; Kharkiv, Ukraine.

Georgiy Korobeynikov, Uzbek State University of Physical Education and Sports

k.george.65.w@gmail.com; Department of Theory and Methodology of International Wrestling (Tashkent region, Chirchik, Uzbekistan); Institute of Psychology, German Sport University Cologne (Cologne, Germany); Department of Combat Sports and Power Sports, National University of Physical Education and Sport (Kyiv, Ukraine).

David Curby, International Network of Wrestling Researchers

davcurb@gmail.com; Chicago, USA.


Published

2025-12-30

How to Cite

Iermakov S, Korobeynikov G, Curby D. Assessment of terminological density in scientific publications on physical culture. Physical Culture, Recreation and Rehabilitation. 2025;4(2):74-89. https://doi.org/10.15561/physcult.2025.0203