Compare commits

6 Commits

Author SHA1 Message Date
9951d6598d
README.md: update TODO 2022-01-30 21:09:51 +03:00
bca3d25bbd
README-dev.md: massive update on workflows 2022-01-30 20:56:29 +03:00
852e793c3d
README.md: update TODO
The `util/create-rdf.py` script is slightly out of date with the
latest changes in the repository. The paths to CSV and TTL files
need to be updated, as well as the namespace for the project.
2022-01-30 20:38:30 +03:00
0ee5f5218b
README-dev.md: adjust Python steps
On Linux and macOS we can run these scripts directly because they
have executable permissions and the shebang line points to a sane
Python, but on Windows only God can help us. Better to write the
slightly lamer invocation here with Python directly (even though
the slash style will be different)...
2022-01-30 20:06:05 +03:00
6216144070
util/generate-hugo-content.py: controlled vocabularies
We are planning to remove the controlled vocabularies from the CSV
files so we should not expect that this column will exist. Instead,
check if there is a controlled vocabulary in the data directory.

The controlled vocabularies were already exported once using the
util/export-controlled-vocabularies.py script so we don't actually
need them in the CSVs anymore.
2022-01-30 20:04:28 +03:00
812c9241bd
README-dev.md: start working on docs 2022-01-30 19:41:45 +03:00
3 changed files with 52 additions and 11 deletions


@@ -8,14 +8,14 @@ The ISEAL Core Metadata Set is maintained primarily in CSV format. This decision
- The ISEAL Core Metadata Set, which lives in `data/iseal-core.csv`
- The FSC<sup>®</sup> extension, which lives in `data/fsc.csv`
From the CSV we use a series of Python scripts to create the RDF ([TTL](https://en.wikipedia.org/wiki/Turtle_(syntax))) representations of the schema as well as the HTML documentation site. All of this is automated using GitHub Actions (see `.github/workflows`) whenever there is a new commit in the repository. Everything should Just Work<sup>™</sup> so you should only need to follow the documentation here if you want to work on the workflow locally or make larger changes. In that case, continue reading...
From the CSV we use a series of Python scripts to create the RDF ([TTL](https://en.wikipedia.org/wiki/Turtle_(syntax))) representations of the schema as well as the HTML documentation site. All of this is automated using GitHub Actions (see `.github/workflows`) whenever there is a new commit in the repository. You should only need to follow the documentation here if you want to work on the workflow locally or make larger changes. In that case, continue reading...
## General Requirements
## Technical Requirements
- Python 3.8+
- Node.js 12+ and NPM
- Python 3.8+ — to parse the CSV schemas, generate the RDF files, and populate the documentation site content
- Node.js 12+ and NPM — to generate the documentation site HTML
## Python Setup
### Python Setup
Create a Python virtual environment and install the requirements:
```console
@@ -24,18 +24,57 @@ $ source virtualenv/bin/activate
$ pip install -r requirements.txt
```
Then run the utility scripts to parse the schemas:
Once you have the Python environment set up you will be able to use the utility scripts:
- `./util/generate-hugo-content.py` — to parse the CSV schemas and controlled vocabularies, then populate the documentation site content
- `./util/create-rdf.py` — to parse the CSV schemas and create the RDF (TTL) files
If you have made modifications to the CSV schemas (adding elements, changing descriptions, etc.) and you want to test them locally before pushing to GitHub, you will need to re-run the utility scripts:
```console
$ ./util/generate-hugo-content.py -i ./data/iseal-core.csv --clean -d
$ ./util/generate-hugo-content.py -i data/fsc.csv -d
$ python ./util/generate-hugo-content.py -i ./data/iseal-core.csv --clean -d
$ python ./util/generate-hugo-content.py -i ./data/fsc.csv -d
$ python ./util/create-rdf.py
```
## Node.js Setup
To generate the HTML documentation site:
Assuming these scripts ran without errors, you can run `git status` to see whether anything was updated, then proceed to regenerate the documentation site HTML.
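The three invocations above can also be driven from Python itself, which avoids relying on the shell or on a particular path separator across Linux, macOS, and Windows. The following is only a sketch and not part of the repository: `build_commands` and `run_all` are hypothetical helper names, and it assumes it is run from the repository root with the virtual environment active.

```python
import subprocess
import sys
from pathlib import Path

# Directory names taken from the commands in the README; adjust if they move.
UTIL_DIR = Path("util")
DATA_DIR = Path("data")


def build_commands(python=sys.executable):
    """Return the three README invocations as argv lists (hypothetical helper)."""
    generate = str(UTIL_DIR / "generate-hugo-content.py")
    return [
        [python, generate, "-i", str(DATA_DIR / "iseal-core.csv"), "--clean", "-d"],
        [python, generate, "-i", str(DATA_DIR / "fsc.csv"), "-d"],
        [python, str(UTIL_DIR / "create-rdf.py")],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order; check=True raises on the first non-zero exit."""
    for argv in commands:
        runner(argv, check=True)
```

Calling `run_all(build_commands())` would execute the scripts in the same order as the console block. Passing argument lists instead of a single command string means no shell quoting is involved, and `sys.executable` keeps the scripts on the interpreter of the active virtual environment.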
### Node.js Setup
Install the web tooling and dependencies required to build the site:
```console
$ cd site
$ npm install
```
The Python scripts above only populated the *content* for the documentation site. To regenerate the actual HTML for the documentation site you must run the `npm run build` command:
```console
$ npm run build
```
Alternatively, you can view the site locally using the `npm run server` command:
```console
$ npm run server
```
The site will be built in memory and available at: http://localhost:1313/iseal-core/
## Workflows
These are some common, basic workflows:
- Add new metadata element(s) → re-run Python scripts and regenerate documentation site
- Update metadata descriptions → re-run Python scripts and regenerate documentation site
- Update controlled vocabularies → re-run Python scripts and regenerate documentation site
These are advanced workflows:
- Change documentation site layout
  - Requires editing templates in `site/layouts` and regenerating documentation site
- Change documentation site style
  - Requires editing styles in `site/source/scss` and regenerating documentation site
- Add a new schema extension
  - Requires editing Python utility scripts
  - Requires editing styles in `site/source/scss` and regenerating documentation site
  - Requires editing templates in `site/layouts` and regenerating documentation site


@@ -22,9 +22,11 @@ Consult [`README-dev.md`](README-dev.md) for technical information about making
- Repository
  - Add more information and instructions to README.md
  - Update GitHub Actions once `util/create-rdf.py` is fixed
- Schema
  - Remove combined "latLong" fields (they can be inferred from the separate fields)
  - Remove controlled vocabularies from the schema CSVs
  - Update `util/create-rdf.py`
- Site
  - Change "Suggested element" to "DSpace mapping"?


@@ -90,7 +90,7 @@ def parseSchema(schema_df):
cardinality = row["element options"].capitalize()
prop_type = row["element type"].capitalize()
if row["element controlled values or terms"]:
if os.path.isfile(f"data/controlled-vocabularies/{element_name_safe}.txt"):
controlled_vocab = True
controlled_vocabulary_src = (
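
The changed condition in this hunk can be illustrated in isolation. This is a minimal sketch and not the script's actual code: the helper name `has_controlled_vocabulary` and its `vocab_dir` parameter are assumptions introduced for illustration; only the `data/controlled-vocabularies/{element_name_safe}.txt` path pattern comes from the diff.

```python
import os


def has_controlled_vocabulary(element_name_safe, vocab_dir="data/controlled-vocabularies"):
    """Return True if a vocabulary file named after the element exists on disk.

    Hypothetical helper: it mirrors the new check, which looks for a
    per-element .txt file instead of reading a column from the CSV row.
    """
    vocab_path = os.path.join(vocab_dir, f"{element_name_safe}.txt")
    return os.path.isfile(vocab_path)
```

With a check like this, an element counts as having a controlled vocabulary only when its exported file is present, so the CSV no longer needs the "element controlled values or terms" column.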