Vocabulary Systems
Scope
This content is intended to provide guidance on the use of vocabulary systems that support editing and validation of vocabularies.
Audience
This module is primarily targeted to managers and editors of established vocabularies. It is assumed that learners have some experience with data management in practice.
Outcome
Learners should be able to use systems to edit, validate, transform and version vocabularies.
💡 Identifies troubleshooting tips, common errors and potential issues.
🚧 Exercises
🎬 Videos
Introduction
Vocabularies are data, and data management inevitably requires application support. Vocabularies are no different! In this module we will step you through some vocabulary systems developed by KurrawongAI that will make the task of managing vocabularies easier.
If you've been working through Introduction to Vocabularies, Advanced Vocabulary Editing or Vocabulary Reuse, you will have already encountered a key tool in any vocabulary manager's toolkit: an editor. In this module we will look again at VocEdit, but draw your attention to a VocEdit > GitHub integration feature that helps you manage vocabulary review and versioning. We'll also introduce you to VocExcel, a spreadsheet-based tool for editing simple vocabularies, tools that help you validate a vocabulary, and a tool that converts a vocabulary from one RDF format to another.
VocEdit + GitHub
Throughout these Vocabulary modules we have presented exercises that direct you to save vocabulary files to your desktop - you can close VocEdit, open it again and just return to that same file to continue editing. But what if a file needs to be reviewed, edited or validated outside of the VocEdit tool? And you may want to store your vocabulary where others can access it. VocEdit is integrated with GitHub to achieve these goals.
🚧 Save a vocabulary to GitHub
💡 This exercise assumes you have a vocabulary saved in a GitHub repository that you have permission to configure. Don't have a vocabulary? You can upload the file from the first exercise to your GitHub repository.
- Go to the KurrawongAI VocEdit in the Chrome browser.
- Select Integrations > GitHub > GitHub App Configuration
- Select a GitHub account from the list
- Select Only select repositories (the usual choice for training and first-time users; select All repositories if needed)
- Select repositories (assuming you selected Only select repositories option)
- Select the repository where you have a vocabulary file
- Install & Authorise
💡 If you do not have permission to install in the repository, the button in the last step will read Authorise & Request. In this case you will need to request access before proceeding with GitHub integration.
- Save
- Project > Open > GitHub File - you will be presented with the repository that you configured > Select > Next
- Either create a new branch or select an existing branch > Next
- Select a vocabulary file
- Edit the vocabulary - any changes that you like
- Save
- Project > Save
- Go to your GitHub repository - note the change!
Close your project in VocEdit. The next time you return to this project:
- Project > Open > GitHub File
Now your vocabulary is available in your GitHub repository where it can be reviewed, edited, accessed over the web and even deployed.
VocExcel
VocExcel is a vocabulary editor that is essentially an MS Excel template. Using VocExcel, you will be creating vocabulary data in tabular form. Don't worry, VocExcel comes with a transformation service that will convert the Excel file into a range of RDF formats.
💡 VocExcel is more suitable for simple or 'flat' vocabularies, such as a vocabulary where all skos:Concepts are top concepts of a skos:ConceptScheme. VocExcel can technically handle a vocabulary of any hierarchy depth, but anything beyond a two-level vocabulary (top concepts with one level of skos:narrower concepts) is not as well supported. For deeper, multi-level vocabularies we recommend VocEdit.
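The two-level rule of thumb in the tip above can be sketched in a few lines of Python. This is only an illustration: the function names, the `ex:` concept identifiers and the dictionary layout are assumptions made for the example and are not part of VocExcel or VocEdit.

```python
def hierarchy_depth(top_concepts, narrower):
    """Count the levels in a vocabulary, with top concepts as level 1.

    `narrower` maps a concept to the list of its skos:narrower concepts.
    """
    def depth(concept, seen):
        if concept in seen:  # guard against accidental cycles
            return 0
        children = narrower.get(concept, [])
        return 1 + max((depth(c, seen | {concept}) for c in children), default=0)

    return max((depth(t, frozenset()) for t in top_concepts), default=0)


def suggested_tool(top_concepts, narrower):
    # Rule of thumb from the tip: up to two levels suits VocExcel.
    return "VocExcel" if hierarchy_depth(top_concepts, narrower) <= 2 else "VocEdit"


# Hypothetical concepts, for illustration only
tops = ["ex:air", "ex:sea"]
flat = {}
two_level = {"ex:air": ["ex:wind"]}
deep = {"ex:air": ["ex:wind"], "ex:wind": ["ex:cyclone"]}

print(suggested_tool(tops, flat))       # VocExcel
print(suggested_tool(tops, two_level))  # VocExcel
print(suggested_tool(tops, deep))       # VocEdit
```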
To walk you through VocExcel features, we'll create a vocabulary from scratch.
🚧 Create a vocabulary in VocExcel
💡 You will need access to MS Excel to do this exercise.
- Go to the KurrawongAI VocExcel in any browser.
- Get VocExcel Template
- Open the downloaded template (it may have opened automatically)
- Save As a new file, e.g. "VocExcel-newVocab.xlsx"
VocExcel will open at the Introduction tab. Note also the Documentation tab, which gives you a rundown of the properties that you may be editing.
- Open the Concept Scheme tab
On the Concept Scheme tab you can create a new Concept Scheme for the vocabulary. Note that VocExcel conforms to the VocPub Specification, and fields marked with an asterisk (*) are mandatory - they are:
- Vocabulary IRI
- Preferred Label
- Definition
- Creation Date (yyyy-mm-dd)
- Modified Date (yyyy-mm-dd) - repeat the Creation Date if this is the first edit
- Creator
- Publisher
- History Note
- Add data for all mandatory fields in the Concept Scheme
- Open the Concept tab
- Add data for all mandatory fields for a Concept
- Save the template
- Go to VocExcel in any browser.
- Upload an Excel or RDF file > choose the MS Excel file you just saved > Upload
You will be presented with a Result. From here you can view the Concept Scheme or any Concept in the file, or the Full RDF Turtle result.
💡 Why not try downloading the RDF, uploading it to a GitHub repository, and continuing on with the VocEdit + GitHub steps above? Both VocEdit and VocExcel generate vocabulary data in the same format - you can switch between the tools if you like!
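If you keep Concept Scheme details outside the spreadsheet, a quick pre-check of the mandatory fields listed in the exercise can catch gaps before upload. The following is a minimal sketch under stated assumptions: the function name, the dictionary keys (copied from the field labels above) and the sample values are illustrative, and this is not how VocExcel itself checks a file.

```python
import re
from datetime import date

# Mandatory Concept Scheme fields as listed in this module
MANDATORY_FIELDS = [
    "Vocabulary IRI", "Preferred Label", "Definition", "Creation Date",
    "Modified Date", "Creator", "Publisher", "History Note",
]
DATE_FIELDS = ["Creation Date", "Modified Date"]


def check_concept_scheme(row):
    """Return a list of problems found in a dict of Concept Scheme values."""
    problems = []
    for field in MANDATORY_FIELDS:
        if not str(row.get(field, "")).strip():
            problems.append(f"{field}: missing")
    for field in DATE_FIELDS:
        value = str(row.get(field, "")).strip()
        if not value:
            continue  # already reported as missing
        try:
            # Require the yyyy-mm-dd shape, then confirm it is a real date
            if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
                raise ValueError(value)
            date.fromisoformat(value)
        except ValueError:
            problems.append(f"{field}: not a yyyy-mm-dd date")
    return problems


# Hypothetical values, for illustration only
row = {
    "Vocabulary IRI": "http://example.com/pestRiskPath",
    "Preferred Label": "Pest risk path",
    "Definition": "A hypothetical definition.",
    "Creation Date": "2025-03-05",
    "Modified Date": "05/03/2025",  # wrong format
    "Creator": "Example Org",
    "Publisher": "Example Org",
    # "History Note" deliberately left out
}
print(check_concept_scheme(row))
```

Here the sketch reports the missing History Note and the badly formatted Modified Date.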
SHACL Validator
You can check that your vocabulary is valid by using a SHACL validator. What does valid mean, and what is SHACL?
A vocabulary validator will typically check for two things:
- the RDF syntax is valid - e.g. statements end with the correct terminators; all prefixes are defined;
- the vocabulary conforms to some vocabulary profile - a set of rules or quality measures that state what classes and properties feature in the vocabulary, and that certain datatypes are used to express property values.
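To make the first kind of check more concrete, here is a rough sketch of one syntax rule named above: every prefix that is used must also be defined. It is a deliberately simple, regex-based approximation written for this module (the function name and sample data are assumptions); it ignores the empty `:` prefix and is no substitute for a real Turtle parser or the validator itself.

```python
import re


def undefined_prefixes(turtle_text):
    """Return prefixes that are used but never declared (approximate)."""
    text = re.sub(r"<[^<>\s]*>", " ", turtle_text)   # drop IRIs in angle brackets
    text = re.sub(r'"(?:[^"\\]|\\.)*"', " ", text)   # drop quoted string literals
    text = re.sub(r"#.*", " ", text)                 # drop comments
    declared = set(re.findall(r"(?i)@?prefix\s+([A-Za-z][\w-]*):", text))
    used = set(re.findall(r"(?<![\w-])([A-Za-z][\w-]*):", text))
    return sorted(used - declared)


# Hypothetical Turtle snippet: dcterms and xsd are used but not declared
sample = """
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://example.com/pestRiskPath> a skos:ConceptScheme ;
    skos:prefLabel "Pest risk path"@en ;
    dcterms:created "2025-03-05"^^xsd:date .
"""
print(undefined_prefixes(sample))  # ['dcterms', 'xsd']
```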
SHACL (W3C, 2017), the Shapes Constraint Language, is a way of encoding a vocabulary profile in machine-readable form so that a tool can check a vocabulary's conformance with that profile.
A SHACL file can be used in various validation tools and services - here we'll demonstrate the KurrawongAI SHACL Validator, with a special focus on the VocPub profile, through an exercise.
🎬 You can find out about the SHACL Validator on the KurrawongAI YouTube channel.
🚧 Validate a vocabulary with VocPub SHACL Validator
- Go to the KurrawongAI SHACL Validator in any browser
- Data to validate > Upload > select pestRiskPath_training.ttl (don't have the file? See the first exercise in Introduction to Vocabularies)
- Data to validate form > Upload
- Scroll down to SHACL Shapes form
- Use Validators > expand VocPub > Add the most recent version > Close
- Validate
A page of Validation Results will launch. This report lets you know where your vocabulary does not conform to the VocPub profile. The messages are colour coded and indicate issues in your vocabulary that are:
- a 🔴 Violation - these issues MUST be addressed for the vocabulary to conform with VocPub specification;
- a 🟡 Warning - it is recommended that these issues are addressed, but your vocabulary is still valid if you do not; and
- Informational - these are optional improvements to the vocabulary
💡 Below these results, a Full Validation Report is also available.
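If you copy the messages out of a report, the three severity levels described above suggest a simple triage: count the messages per level and flag whether any Violation remains. The sketch below assumes you have already paired each message with its severity by hand; the function name, the data layout and the two warning messages are hypothetical.

```python
from collections import Counter

SEVERITIES = ("Violation", "Warning", "Informational")


def summarise(results):
    """Count messages per severity and flag whether any Violation remains."""
    counts = Counter(severity for severity, _message in results)
    summary = {s: counts.get(s, 0) for s in SEVERITIES}
    # Per the list above, only Violations MUST be addressed
    summary["must_fix"] = summary["Violation"] > 0
    return summary


results = [
    ("Violation", "For http://example.com/pestRiskPath: Requirement 2.15 - modified date - violated"),
    ("Warning", "a hypothetical warning message"),
    ("Warning", "another hypothetical warning message"),
]
print(summarise(results))
```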
Note that each message includes a reference to the VocPub specification, such as:
For http://example.com/pestRiskPath: Requirement 2.15 - modified date - violated
The Requirement referred to here can be looked up in the VocPub Specification in the Vocabulary section (as this violation refers to a skos:ConceptScheme) at Requirement 2.15.
💡 Tip: Looking up the requirement in the VocPub Specification gives you more information about the message given in the Validation Results.
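When a report contains many messages, pulling out the requirement number makes the lookup quicker. The pattern below is inferred from the single example message shown above, so treat it as an assumption: other messages from the validator may be worded differently, and the function and group names are made up for this sketch.

```python
import re

# Pattern inferred from one example message; other messages may differ.
MESSAGE = re.compile(
    r"For (?P<focus>\S+): Requirement (?P<number>\d+(?:\.\d+)*) - (?P<name>.+?) - violated"
)


def parse_message(text):
    """Split a message into the subject IRI, requirement number and name."""
    match = MESSAGE.fullmatch(text.strip())
    return match.groupdict() if match else None


msg = "For http://example.com/pestRiskPath: Requirement 2.15 - modified date - violated"
print(parse_message(msg))
# {'focus': 'http://example.com/pestRiskPath', 'number': '2.15', 'name': 'modified date'}
```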
RDF Converter
This and other modules have featured exercises using Turtle (.ttl) files. Turtle is just one RDF serialisation - there are other standard formats that vocabulary data can be managed in.
What if you have a vocabulary file that needs to be in a different format? KurrawongAI have developed an RDF Converter. This converter can be used to transform RDF files (not just SKOS vocabularies) into different formats.
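To see why conversion between formats is possible at all, it helps to remember that every RDF serialisation writes down the same underlying triples. The toy sketch below writes two hand-made triples first as N-Triples and then as RDF/XML, using only the Python standard library. It is not how the RDF Converter works: the function names, the tuple layout and the label value are assumptions for illustration, and for real files a dedicated RDF library or the converter itself is the better choice.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical triples: (subject IRI, predicate IRI, object, object_is_literal)
TRIPLES = [
    ("http://example.com/pestRiskPath", RDF_NS + "type", SKOS_NS + "ConceptScheme", False),
    ("http://example.com/pestRiskPath", SKOS_NS + "prefLabel", "Pest risk path", True),
]


def to_ntriples(triples):
    """Write one triple per line in N-Triples syntax."""
    lines = []
    for s, p, o, is_literal in triples:
        if is_literal:
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def to_rdfxml(triples):
    """Write the same triples as RDF/XML, one rdf:Description per triple."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("skos", SKOS_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for s, p, o, is_literal in triples:
        desc = ET.SubElement(root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": s})
        # Split the predicate IRI into namespace and local name
        cut = max(p.rfind("#"), p.rfind("/")) + 1
        prop = ET.SubElement(desc, f"{{{p[:cut]}}}{p[cut:]}")
        if is_literal:
            prop.text = o
        else:
            prop.set(f"{{{RDF_NS}}}resource", o)
    return ET.tostring(root, encoding="unicode")


print(to_ntriples(TRIPLES))
print(to_rdfxml(TRIPLES))
```

The two outputs look very different, but both describe the same Concept Scheme and its preferred label.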
🚧 Convert a Turtle file to XML
- Go to the KurrawongAI RDF Converter in any browser
- RDF Data > Upload > select pestRiskPath_training.ttl (don't have the file? See the first exercise in Introduction to Vocabularies)
- Next to the Convert button, select XML > Convert
- Scroll down to RDF Output form
- Copy or download results
💡 Tip: XML is a W3C Recommendation (W3C, 2008)
References and Further Reading
- AGLDWG (2025). VocPub profile specification. https://agldwg.github.io/vocpub-profile/specification
- W3C (n.d.). QSKOS. Retrieved March 5, 2025. https://www.w3.org/2001/sw/wiki/QSKOS
- W3C (2008). Extensible Markup Language (XML) 1.0 (W3C Recommendation). https://www.w3.org/TR/xml/
- W3C (2009). SKOS Simple Knowledge Organization System Reference (W3C Recommendation). https://www.w3.org/TR/skos-reference/
- W3C (2014). Turtle: Terse RDF triple language (W3C Recommendation). https://www.w3.org/TR/turtle/
- W3C (2017). Shapes Constraint Language (SHACL) (W3C Recommendation). https://www.w3.org/TR/shacl
- W3C (2020). JSON-LD 1.1: A JSON-based serialization for Linked Data (W3C Recommendation). https://www.w3.org/TR/json-ld11/