International Union of Speleology
Informatics Commission - Dictionary Sub-commission

UIS Cave and Karst Glossary Project

Leader: Peter Matthews, Australia.
Assisted by: Mike Lake, Australia.

Updated 2023-05-03

Rationale

The UIS Caver's Multi-lingual Dictionary has been established for over 24 years now (since 1998), and it has always been our intention to attach multi-lingual definitions to its bare terms. That is one of the major aims of this project: to process an existing comprehensive glossary of cave and karst terms into a digital version that we can link to the equivalent terms in the existing Dictionary, first in English and then in other languages as they become available, with other specialist glossaries to be incorporated later. The glossary is also to be a term-addressable, stand-alone and downloadable web page for easy public access, and in addition is to be incorporated into the UIS KarstLink cave and karst Ontology to allow Linked Data machine access to it across the Internet. Stage 1 of this plan has now been achieved: we have published our existing* pilot glossary, with English definitions and many non-English synonym terms, and linked the UIS Dictionary to it.

Of course many speleological glossaries have been produced in the past, but our chosen seed glossary has given us, with the help of the author, both a relatively easy conversion and tailoring job, and a comprehensive and established professional glossary as a starting point. Another benefit of it is that it incorporates, acknowledges and references terms that have come from earlier cave and karst glossaries. Later stages of the Glossary will begin to diverge from this pilot original* as we add new terms and possibly improve some of its original definitions.

Later Glossary stages will include among other things internal cross-referencing, integration of the non-English terms and synonyms with the UIS Multi-lingual Dictionary, and the proper presentation of scientific equations. We will of course be inviting improvements to its terms, additional terms, the incorporation of other specialist glossaries, and expansion into glossaries in other languages. The Glossary will also be integrated into the UIS KarstLink Ontology for Linked Data.

Although our initial version has 2700+ definitions it will need many more, so we hope that this new facility will encourage cavers, scientists and managers to expand it with their own specialist terms, and in further languages.

* Field, Malcolm (U.S. EPA. 2nd Edition, 2002):
A Lexicon of Cave and Karst Terminology with Special Reference to Environmental Karst Hydrology.
https://cfpub.epa.gov/ncea/risk/recordisplay.cfm?deid=54964 (PDF 2MB)

Vision

The overall vision for the UIS Dictionary Sub-commission is to provide the go‑to one‑stop‑shop for cave and karst terms and their definitions, in multiple languages, readily accessible to both people and machines, and easy to expand with new peer-reviewed terms. This Glossary Project is an integral part of that vision.

Goals

Status

Progress so far:

Action Plan

This Glossary Action Plan may of course change as we go along. Although it includes some interactions with the Multi-lingual Dictionary, the development of the Dictionary itself will need further steps beyond those listed below, for example conversion of the current Dictionary system to a modern web-based system. Those steps are a separate project outside the scope of this Glossary Project.

Status key: steps are marked (done) or (in progress).

  1. Select an existing comprehensive, credible and easily convertible glossary and get permission to use it. (done)
  2. Set up the project infrastructure: this master web page, a discussion forum, and a repository for the developmental and production digital files of the project. (done)
  3. Publicise the project and invite participation. (done)
  4. Convert the chosen glossary into a digitally processable format and publish an initial Draft 1 term-addressable web page from it. (done)
  5. Clean up any presentational problems with this initial web page and publish a credible web-page version of the original PDF glossary as Stage 1. (done)
  6. Analyse the correlations between Dictionary concepts and Glossary terms, then republish the Dictionary with its concepts linked to their respective definitions in the Glossary. (done)
  7. Invite definitions, preferably existing ones, for concepts that are in the Dictionary but do not yet have corresponding terms in the Glossary.
  8. Identify the different components that appear in various places in the source glossary, for example, language of a term, multiple definitions for a term, source references, and so on. (done)
  9. Consult with the KarstLink Team to see if they have any requirements to make it easier to later incorporate the definitions into the UIS KarstLink Linked Data Ontology. (done)
  10. Analyse and discuss the pilot glossary to see what issues need attention and what new features could or should be added to the Glossary web page. This will include the handling of the many non-English terms and synonyms currently present, and their possible relationship to the terms in the Caver's Multi-lingual Dictionary. (in progress)
  11. Parse the separate components of the digital glossary into a data store, probably in JSON-LD format for easier programming and Linked Data processing.
  12. Manually analyse and tabulate all correlations between Glossary terms and Dictionary concepts (the reverse of the above correlations). Add the Dictionary Concept Numbers into their respective Glossary terms in the Glossary data store.
  13. Consider whether any changes should be made to the Dictionary suggested by the contents of the Glossary.
  14. Invite feedback on suggested improvements to the Dictionary-Glossary system.
  15. Investigate whether there are other existing specialist glossaries that should be invited for inclusion into the Glossary data store, then process and add them.
  16. Write a program to generate from the Glossary data store a term-addressable and dictionary-concept-number-addressable web page of the definitions.
  17. Invite definitions in languages other than English and build these into the data store and ontology. These could be existing non-English glossaries, or newly translated versions of our English glossary, or a mixture of both.
  18. Assist the KarstLink Team to integrate the definitions into the UIS cave and karst Ontology.
  19. Invite further contributions to the glossary/ontology from cavers, scientists and managers after setting up the infrastructure to allow peer-reviewed additions and updates to be done easily and safely.
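As a rough illustration of step 11 above, one glossary entry could be held in the data store as a JSON-LD record along the following lines. This is only a sketch under stated assumptions: the use of the SKOS vocabulary, the example.org base address, the `ex:dictionaryConcept` property and the function name are all hypothetical choices made for illustration, not decisions of the project or of the KarstLink Team.

```python
import json

# SKOS is a real W3C vocabulary; choosing it here is an assumption.
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Placeholder base address, NOT a real UIS location.
BASE = "http://example.org/glossary/"


def term_to_jsonld(number, term, definition, lang="en", concept_numbers=()):
    """Build one JSON-LD record for a glossary term (hypothetical layout)."""
    entry = {
        "@context": {"skos": SKOS, "ex": BASE + "vocab#"},
        "@id": f"{BASE}term/{number}",
        "@type": "skos:Concept",
        "skos:prefLabel": {"@value": term, "@language": lang},
        "skos:definition": {"@value": definition, "@language": lang},
    }
    # Dictionary Concept Numbers (step 12) kept under a made-up property.
    if concept_numbers:
        entry["ex:dictionaryConcept"] = list(concept_numbers)
    return entry


if __name__ == "__main__":
    # Invented sample values, not taken from the Glossary or Dictionary.
    record = term_to_jsonld(1, "example term", "A made-up definition.",
                            concept_numbers=[42])
    print(json.dumps(record, indent=2, ensure_ascii=False))
```

Keeping the language tag on each label and definition would let definitions in other languages (step 17) sit beside the English ones for the same term.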

Invitation

Everyone interested is invited to follow or contribute to the project. To actively contribute ideas or participate in discussions, please open our project forum and register by clicking on the link near the top right-hand corner. If you want to be notified when there is new activity, you can "subscribe" to the project forum, or just to a topic (bottom-left corner once you are registered). Your comments and ideas on any aspect of the project are welcome. The forum also includes a "Help wanted" Topic where any specific tasks that we are looking for help on will be listed.

Notes on the file conversion

For those who want more detail of the conversion work from Field's published US-EPA PDF to the UIS web page:

Stage 1 of the glossary was produced from a Unicode UTF-16 text file exported from the WordPerfect 2-column original of Malcolm Field's Lexicon. Mike Lake then carried out some heavy editing to get the text into a consistent format and wrote a program to parse this file and add term numbers, resulting in a UTF-8 delimited text file in the style of comma-separated values (CSV), with three fields: term number, term, and description. The field separators are vertical bars rather than commas, to allow for the commas and quotes in the description texts. The web page glossary table was then programmatically generated from this file. Later stages will include more detailed parsing into a data store, which will facilitate adding internal links, such as to references and see-also entries. All files and programs are accessible via GitHub.
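For readers who want a feel for this last step, here is a minimal sketch of reading a bar-separated file of that shape and turning it into term-addressable table rows. It is not the project's actual program (that is in the repository); the function names, the `t<number>` anchor scheme and the sample line are assumptions made up for illustration.

```python
import csv
import html
import io

# Invented sample in the three-field layout: number | term | description.
SAMPLE = '1|example term|A made-up description, with commas and "quotes".\n'


def read_terms(text):
    """Parse bar-separated lines into (number, term, description) tuples."""
    # QUOTE_NONE treats quote characters in descriptions as ordinary text.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter="|", quoting=csv.QUOTE_NONE)
    terms = []
    for row in reader:
        if len(row) != 3:  # skip blank or malformed lines
            continue
        number, term, description = row
        terms.append((int(number), term.strip(), description.strip()))
    return terms


def to_html_rows(terms):
    """Render each term as a table row whose id makes it addressable."""
    rows = []
    for number, term, description in terms:
        rows.append(
            f'<tr id="t{number}"><td>{number}</td>'
            f"<td>{html.escape(term)}</td>"
            f"<td>{html.escape(description)}</td></tr>"
        )
    return "\n".join(rows)


if __name__ == "__main__":
    print(to_html_rows(read_terms(SAMPLE)))
```

With an anchor on each row, a link ending in `#t1` would jump straight to that term, which is the sense of "term-addressable" assumed here.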

The changes made to the original Lexicon and to our Draft 1 in this Stage 1 release were basically cosmetic, and were mainly to restore presentation lost in the original export. The terms and definitions themselves have not been materially changed. If you notice anything else that still needs fixing, please do email us with the details.

Summary of changes made:

Current Glossary: http://uisic.uis-speleo.org/uisglossary-en.html
Forum: http://uisic.uis-speleo.org/forum/viewforum.php?f=19
Field's original: https://cfpub.epa.gov/ncea/risk/recordisplay.cfm?deid=54964
Repository: https://github.com/speleolinux/uisic
Dictionary: http://uisic.uis-speleo.org/lexintro.html
Update history: http://uisic.uis-speleo.org/lexhist.html
KarstLink: http://uisic.uis-speleo.org/exchange/karstlink/index-en.html
KL Ontology: http://ontology.uis-speleo.org
This page: http://uisic.uis-speleo.org/lexgloss.html
UISIC: http://uisic.uis-speleo.org
UIS: http://uis-speleo.org

Contact

Peter Matthews (Leader).   Mike Lake (Assisting).