142 Commits

Author SHA1 Message Date
a1243cf54a
poetry.lock: Run poetry update 2021-03-17 10:09:14 +02:00
99cb76568f
pyproject.toml: Use csv-metadata-quality v0.4.7
This includes some minor optimizations and the ability to check for
duplicate items.

See: https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.7
2021-03-17 10:08:06 +02:00
ca116284ca
Use "unsafe" in quotes on frontpage
This was more me being cautious when I was writing the original tool
than a warning about it being actually unsafe. Now that this web
frontend will be used by less-technical users I should tone down the
language.
2021-03-16 13:04:33 +02:00
2fcfc76ea5
csv_metadata_quality_web/main.py: Remove check for __main__
This is only needed if we are running directly in Python.
2021-03-14 22:05:04 +02:00
78f58b459c
Create application for gunicorn
This is apparently what gunicorn looks for.
2021-03-14 22:03:15 +02:00
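When gunicorn is given a module name without an explicit variable, it looks for a WSGI callable named `application`. A minimal sketch of such an entry point follows, using only the standard WSGI interface. The response body is a placeholder, and the real project presumably exposes a framework application object rather than a bare function.

```python
def application(environ, start_response):
    """Bare WSGI callable; the name matches what gunicorn looks up by default."""
    # Placeholder body, not the real application's output.
    body = b"csv-metadata-quality-web placeholder\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```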
863a540225
Move csv_metadata_quality_web to a package
Eventually I will want to refactor so this will be necessary.
2021-03-14 22:01:45 +02:00
cc203b2842
Regenerate requirements
Generated using poetry:

  $ poetry export --without-hashes -f requirements.txt > requirements.txt
  $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
2021-03-14 21:10:33 +02:00
de0703ceb2
poetry.lock: Run poetry update 2021-03-14 21:09:58 +02:00
2a852c9ed3
pyproject.toml: Pin new version of csv-metadata-quality
This version doesn't bother checking invalid multi-value separators.
Instead it just fixes them.
2021-03-14 21:09:07 +02:00
edd651317b
templates/index.html: Change default selections
Enable AGROVOC lookup on dcterms.subject as well as the "unsafe"
fixes. For the AGROVOC lookup I just think that it might not be
obvious to non-technical users that you have to check the box AND
enter a field name, despite the placeholder value. In any case, it
doesn't hurt to enable AGROVOC lookup by default because it won't
fail if the default dcterms.subject field is not present in the
user's CSV.
2021-03-14 20:54:50 +02:00
6f396f392f
Revert "Move style.min.css to css/v1/style.min.css"
This reverts commit 8f6d337d2d4611c35b4998c8acf8ceaba3bb89aa.

We are using Heroku now so we don't need this phony version.
2021-03-14 20:40:33 +02:00
f292a4902f
.drone.yml: Install git for some pip deps
2021-03-14 19:32:14 +02:00
b43b995a90
Add drone.yml for Drone CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-03-14 18:33:42 +02:00
aa862620d7
README.md: Try a relative link for screenshot.png 2021-03-14 16:15:21 +02:00
f42a83c7ab
Remove Google App Engine config 2021-03-14 16:07:26 +02:00
1d46d490cb
README.md: Add build badge from GitHub Actions 2021-03-14 15:57:59 +02:00
4f48af7f24
Add GitHub workflow to build
For now only builds, as I'm not sure how to test the web application
yet.
2021-03-14 15:55:58 +02:00
e8dc08bcac
README.md: Update text 2021-03-14 13:54:03 +02:00
f4ddf4a7b5
README.md: Update intro text 2021-03-14 13:11:39 +02:00
d37654206f
README.md: Fix screenshot link 2021-03-14 13:08:50 +02:00
69501cbacb
Add README.md with screenshot and license 2021-03-14 13:07:24 +02:00
2a9ec1c3f3
LICENSE.txt: Use GPLv3 instead of AGPLv3
I would rather have the source publishing requirements be triggered
on distribution than on web hosting.
2021-03-14 13:05:32 +02:00
be9143204c
Add Procfile for Heroku
See: https://devcenter.heroku.com/articles/python-gunicorn
2021-03-14 12:29:28 +02:00
55815cf4c0
runtime.txt: Use Python 3.9.2
Actually it seems they do have Python 3.9.2.

See: https://devcenter.heroku.com/articles/python-support
2021-03-14 12:25:49 +02:00
73a13145b6
Add runtime.txt
Apparently to deploy on Heroku we need this. And they only support
Python 3.7? Damn...
2021-03-14 12:24:03 +02:00
8f6d337d2d
Move style.min.css to css/v1/style.min.css
Trying to break Google App Engine's aggressive caching.
2021-03-14 12:04:28 +02:00
aabb783d99
app.yaml: Set Cache-Control header to private for CSS
Google App Engine aggressively caches stuff. They are currently serving
a 24-hour old version of my CSS after multiple updates and re-deploys.
Ughhh. From their docs:

> After a file is transmitted with a given expiration time, there is
> generally no way to clear it out of web-proxy caches, even if the user
> clears their own browser cache. Re-deploying a new version of the app
> will not reset any caches. Therefore, if you ever plan to modify a
> static file, it should have a short (less than one hour) expiration
> time. In most cases, the default 10-minute expiration time is
> appropriate.

The only way to break this for now is to change the CSS *directory*.
In the future I think we have to be sure to set the private cache
control header, which lets browsers cache it, but not public CDNs.

See: https://cloud.google.com/appengine/docs/standard/python3/how-requests-are-handled
2021-03-14 11:58:57 +02:00
bd31ac912e
Regenerate static assets 2021-03-14 11:42:07 +02:00
61040ea4a5
Use an ILRI theme 2021-03-14 11:41:48 +02:00
d483f7fc0b
.gitignore: Ignore sqlite requests response cache 2021-03-14 11:36:46 +02:00
c07716abcb
.gcloudignore: Ignore sqlite requests response cache 2021-03-14 11:36:09 +02:00
36b43a06b9
templates/index.html: Add link to test.csv
Offer users this file to test with.
2021-03-14 11:34:55 +02:00
25aa74ba14
app.yaml: Try to serve static CSS directly
Google App Engine is currently caching an old version of my CSS, so
I am trying to get it to use the correct version. Let's try serving
it directly from the filesystem.

See: https://cloud.google.com/appengine/docs/standard/php/getting-started/serving-static-files
2021-03-14 10:36:57 +02:00
a2ace2d331
.gcloudignore: Don't upload source dir
We only need this during build.
2021-03-14 10:31:46 +02:00
94cdd7fb50
Regenerate requirements
Generated using poetry:

  $ poetry export --without-hashes -f requirements.txt > requirements.txt
  $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
2021-03-14 10:09:18 +02:00
74bde23567
poetry.lock: Run poetry update 2021-03-14 10:08:44 +02:00
4e52d1bcc9
Add configurable requests cache directory
As I expected, on Google App Engine we can't write the cache file
to the current working directory. I modified csv-metadata-quality
CLI to check for the REQUESTS_CACHE_DIR environment variable so we
don't really have to do anything different other than setting the
variable.
2021-03-14 10:06:39 +02:00
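The environment variable lookup described in commit 4e52d1bcc9 could look roughly like the sketch below. The function name `cache_path` is hypothetical, and the fallback to the current working directory is an assumption based on the behaviour described in the AGROVOC commit. It is not the actual csv-metadata-quality code.

```python
import os
from pathlib import Path


def cache_path(filename="agrovoc-response-cache.sqlite"):
    """Return where the requests cache file should live (hypothetical helper)."""
    cache_dir = os.environ.get("REQUESTS_CACHE_DIR")
    if cache_dir:
        # Variable is set: put the cache in the configured directory.
        return Path(cache_dir) / filename
    # Variable is unset or empty: fall back to the current working directory.
    return Path(filename)
```

On a platform with a read-only working directory, setting `REQUESTS_CACHE_DIR=/tmp` would then be enough to move the cache file.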
e7dd8d1421
Add AGROVOC lookup support
This works locally, but I don't think it will work on App Engine
because csv-metadata-quality uses requests-cache and creates the
agrovoc-response-cache.sqlite file in the current working directory.
2021-03-13 23:49:24 +02:00
9aab2ae83f
templates/index.html: Remove basic and experimental labels 2021-03-13 23:34:34 +02:00
6c3804d55b
Add support for skipping fields ("-x") 2021-03-13 23:34:11 +02:00
122d9fd53c
Add support for experimental checks ("-e") 2021-03-13 23:01:11 +02:00
198acdb1a7
Major refactor
Re-work upload and file processing so they are in the same Python
function. Now I will start exposing other command line options in
the form, like unsafe fixes, excluding fields, etc. Now I see that
it is easier to save the POSTed file and process it in the same
function so I don't have to pass around the other POSTed form
values as URL query parameters.

Now, as a result of changing the flow above, I also had to make a
change to the way I show the results page. Instead of processing
the file and returning the rendered results to the user directly,
I process the file, save the rendered results to /tmp, and return
a redirect to the user to the results page.
2021-03-13 22:11:26 +02:00
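The flow described in commit 198acdb1a7 (save the upload, process it in the same function, write the rendered results to /tmp, then redirect) can be sketched without any web framework. Every name here is hypothetical: `handle_upload`, the `process` callback, the `/result/<id>` URL shape, and the file naming are assumptions for illustration, not the project's actual code.

```python
import tempfile
import uuid
from pathlib import Path


def handle_upload(file_bytes, form, process, tmp_dir=None):
    """Save the POSTed file, process it, store the result, return a redirect URL."""
    tmp = Path(tmp_dir or tempfile.gettempdir())
    job_id = uuid.uuid4().hex  # random name so uploads do not collide

    # 1. Save the uploaded CSV.
    upload_path = tmp / f"{job_id}.csv"
    upload_path.write_bytes(file_bytes)

    # 2. Process it here, with the other form values still at hand,
    #    so nothing has to travel as URL query parameters.
    rendered = process(upload_path, form)

    # 3. Save the rendered results next to the upload.
    (tmp / f"{job_id}.html").write_text(rendered, encoding="utf-8")

    # 4. The caller would answer with a redirect to this location.
    return f"/result/{job_id}"
```

A results handler would then only need the identifier from the URL to read the saved HTML back from the temporary directory.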
adc2d06094
main.py: Add note about using /tmp
I originally wanted to use an "uploads" directory or something, but
it seems we can only write to /tmp on Google App Engine. They really
want you to buy storage or database services! This is memory mapped
so it disappears when you re-deploy.
2021-03-13 22:09:17 +02:00
bc256b242d
static/css/style.min.css: Regenerate 2021-03-13 22:07:53 +02:00
f82cb6ce05
source/scss/main.scss: Increase container width
We need more space for the log on the results page and we actually
don't even need to worry about people running this on a phone.
2021-03-13 22:06:43 +02:00
4bdec3b889
app.yaml: Use Python 3.9
Python 3.9 is apparently now generally available.
2021-03-13 14:12:14 +02:00
f79be86361
templates/index.html: Use Bootstrap form components 2021-03-13 13:50:46 +02:00
3715c5e976
Use new commit for csv-metadata-quality
This one doesn't treat the fixing of invalid multi-value separators
as "unsafe".
2021-03-13 13:02:47 +02:00
8603ec4bca
main.py: Rework command line args
Turns out we only needed to use sys.argv when we were trying to run
the csv-metadata-quality module directly in Python by importing it.
2021-03-13 12:52:22 +02:00
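When the tool runs as a subprocess, its options travel in the argument list passed to the child process, so the parent's own sys.argv never needs to change. A rough sketch of assembling such a list follows. The `-e` and `-x` flags appear elsewhere in this log, but the executable name and the `-i`/`-o` flags are assumptions, and `build_command` is a hypothetical helper.

```python
def build_command(input_path, output_path, experimental=False, exclude_fields=""):
    """Build an argument list suitable for subprocess.run (hypothetical helper)."""
    # Executable name and -i/-o flags are assumed, not taken from this log.
    cmd = ["csv-metadata-quality", "-i", str(input_path), "-o", str(output_path)]
    if experimental:
        cmd.append("-e")  # experimental checks
    if exclude_fields:
        cmd += ["-x", exclude_fields]  # fields to skip
    return cmd
```

The resulting list could be handed to `subprocess.run(cmd, capture_output=True, text=True)` to collect the tool's log output.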
0471820f3a
main.py: Actually use sys.argv
I set this but never actually passed it to the subprocess. Now I'm
wondering if I actually need it, or if that was just when I was
trying to import the csv-metadata-quality module?
2021-03-13 12:49:01 +02:00