mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2017-04-19
@ -251,3 +251,27 @@ $ bundle binstubs puma --path ./sbin
- So it seems he did it in the crosswalk!
- Keep working on Ansible stuff for deploying the CKM REST API
- We can use systemd's `Environment` stuff to pass the database parameters to Rails
- Abenet noticed that the "Workflow Statistics" option is missing now, but we have screenshots from a presentation in 2016 when it was there
- I filed a ticket with Atmire
- Looking at 933 CIAT records from Sisay, he's having problems creating a SAF bundle to import to DSpace Test
- I started by looking at his CSV in OpenRefine, and I see there are a _bunch_ of fields with whitespace issues that I cleaned up:

```
value.replace(" ||","||").replace("|| ","||").replace(" || ","||")
```

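For comparison, the three chained replacements above could be collapsed into a single regular expression. This is only a minimal Python sketch, assuming `||` is the multi-value separator as in the CSV; the function name `normalize_separators` and the sample value are hypothetical. Unlike the GREL version, it also strips runs of more than one whitespace character around the separator.

```python
import re


def normalize_separators(value: str) -> str:
    """Strip whitespace around each "||" multi-value separator."""
    # \s* on both sides covers " ||", "|| " and " || " in one pass
    return re.sub(r"\s*\|\|\s*", "||", value)


# Hypothetical sample value, for illustration only
print(normalize_separators("Kenya || Uganda|| Tanzania ||Rwanda"))
```
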
- Also, all the filenames have spaces and URL encoded characters in them, so I decoded them from URL encoding:

```
unescape(value,"url")
```

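The same decoding step could be reproduced outside OpenRefine, for example to spot-check a few values. A rough Python sketch using the standard library; the helper name `decode_url` and the example URL are made up for illustration. Note that `unquote` leaves `+` untouched, while `urllib.parse.unquote_plus` would turn it into a space.

```python
from urllib.parse import unquote


def decode_url(value: str) -> str:
    """Decode percent-encoded characters, e.g. %20 becomes a space."""
    return unquote(value)


# Hypothetical URL, for illustration only
print(decode_url("https://example.org/docs/Annual%20Report%202016.pdf"))
```
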
- Then I created the filename column from the URL using the following transform:

```
value.split('/')[-1].replace(/#.*$/,"")
```

- The `replace` part is because some URLs have an anchor like `#page=14` which we obviously don't want on the filename
- Also, we need to attach the PDF only to the item corresponding to page 1, so we don't end up with literally hundreds of duplicate PDFs
- Alternatively, I could export each page to a standalone PDF...
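
The filename transform and the "page 1 only" rule could be sketched together in Python as follows. This is an illustration under assumptions, not what was actually run: the helper names and sample URLs are hypothetical, and treating a URL with no anchor as the first page is a guess about the data.

```python
def split_fragment(url):
    """Split a URL into (base, fragment) at the first '#'."""
    base, _, fragment = url.partition("#")
    return base, fragment


def filename_from_url(url):
    """Return the last path segment of the URL, without any #anchor."""
    base, _ = split_fragment(url)
    return base.rsplit("/", 1)[-1]


def is_first_page(url):
    """True when the URL has no anchor or points at page 1 (assumption)."""
    _, fragment = split_fragment(url)
    return fragment in ("", "page=1")


# Hypothetical sample URLs, for illustration only
urls = [
    "https://example.org/docs/report.pdf#page=1",
    "https://example.org/docs/report.pdf#page=14",
    "https://example.org/docs/other.pdf",
]

for url in urls:
    # Only the page 1 record gets a filename; the others stay empty
    filename = filename_from_url(url) if is_first_page(url) else ""
    print(repr(filename))
```

Leaving the filename empty for the other pages keeps a single copy of each PDF in the bundle.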