Commit Graph

146 Commits

Author SHA1 Message Date
4bae262a97 Regenerate requirements
All checks were successful
continuous-integration/drone/push Build is passing
Generated using poetry:

  $ poetry export --without-hashes -f requirements.txt > requirements.txt
  $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
2021-03-19 12:00:09 +02:00
c5138f8065 Update csv-metadata-quality version for Mojibake support 2021-03-19 11:59:21 +02:00
36a072b1fd pyproject.toml: Bump version to 0.0.2
All checks were successful
continuous-integration/drone/push Build is passing
2021-03-17 10:10:01 +02:00
f6726ef210 Regenerate requirements
Generated using poetry:

  $ poetry export --without-hashes -f requirements.txt > requirements.txt
  $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
2021-03-17 10:09:39 +02:00
a1243cf54a poetry.lock: Run poetry update 2021-03-17 10:09:14 +02:00
99cb76568f pyproject.toml: Use csv-metadata-quality v0.4.7
This includes some minor optimizations and the ability to check for
duplicate items.

See: https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.7
2021-03-17 10:08:06 +02:00
ca116284ca Use "unsafe" in quotes on frontpage
All checks were successful
continuous-integration/drone/push Build is passing
This was more me being cautious when I was writing the original tool
than a warning about it being actually unsafe. Now that this web
frontend will be used by less-technical users I should tone down the
language.
2021-03-16 13:04:33 +02:00
2fcfc76ea5 csv_metadata_quality_web/main.py: Remove check for __main__
All checks were successful
continuous-integration/drone/push Build is passing
This is only needed if we are running directly in Python.
2021-03-14 22:05:04 +02:00
78f58b459c Create application for gunicorn
This is apparently what gunicorn looks for.
2021-03-14 22:03:15 +02:00
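
A minimal sketch of what 78f58b459c relies on, assuming a plain WSGI callable rather than the project's real app object: when Gunicorn is given only a module name, it looks for a module-level callable named `application`. The function body below is hypothetical and only illustrates the naming convention.

```python
def application(environ, start_response):
    """Hypothetical WSGI callable; the real project exposes its own app object."""
    body = b"ok\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an iterable.
    start_response("200 OK", headers)
    return [body]
```

With a module laid out like this, something like `gunicorn csv_metadata_quality_web.main` should find the callable without an explicit `:variable` suffix.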
863a540225 Move csv_metadata_quality_web to a package
Eventually I will want to refactor so this will be necessary.
2021-03-14 22:01:45 +02:00
cc203b2842 Regenerate requirements
Generated using poetry:

  $ poetry export --without-hashes -f requirements.txt > requirements.txt
  $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
2021-03-14 21:10:33 +02:00
de0703ceb2 poetry.lock: Run poetry update 2021-03-14 21:09:58 +02:00
2a852c9ed3 pyproject.toml: Pin new version of csv-metadata-quality
This version doesn't bother checking invalid multi-value separators.
Instead it just fixes them.
2021-03-14 21:09:07 +02:00
edd651317b templates/index.html: Change default selections
Enable AGROVOC lookup on dcterms.subject as well as the "unsafe"
fixes. For the AGROVOC lookup I just think that it might not be
obvious to non-technical users that you have to check the box AND
enter a field name, despite the placeholder value. In any case, it
doesn't hurt to enable AGROVOC lookup by default because it won't
fail if the default dcterms.subject field is not present in the
user's CSV.
2021-03-14 20:54:50 +02:00
6f396f392f Revert "Move style.min.css to css/v1/style.min.css"
This reverts commit 8f6d337d2d4611c35b4998c8acf8ceaba3bb89aa.

We are using Heroku now so we don't need this phony version.
2021-03-14 20:40:33 +02:00
f292a4902f .drone.yml: Install git for some pip deps
All checks were successful
continuous-integration/drone/push Build is passing
2021-03-14 19:32:14 +02:00
b43b995a90 Add drone.yml for Drone CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-03-14 18:33:42 +02:00
aa862620d7 README.md: Try a relative link for screenshot.png 2021-03-14 16:15:21 +02:00
f42a83c7ab Remove Google App Engine config 2021-03-14 16:07:26 +02:00
1d46d490cb README.md: Add build badge from GitHub Actions 2021-03-14 15:57:59 +02:00
4f48af7f24 Add GitHub workflow to build
For now only builds, as I'm not sure how to test the web application
yet.
2021-03-14 15:55:58 +02:00
e8dc08bcac README.md: Update text 2021-03-14 13:54:03 +02:00
f4ddf4a7b5 README.md: Update intro text 2021-03-14 13:11:39 +02:00
d37654206f README.md: Fix screenshot link 2021-03-14 13:08:50 +02:00
69501cbacb Add README.md with screenshot and license 2021-03-14 13:07:24 +02:00
2a9ec1c3f3 LICENSE.txt: Use GPLv3 instead of AGPLv3
I would rather have the source publishing requirements be triggered
on distribution than on web hosting.
2021-03-14 13:05:32 +02:00
be9143204c Add Procfile for Heroku
See: https://devcenter.heroku.com/articles/python-gunicorn
2021-03-14 12:29:28 +02:00
55815cf4c0 runtime.txt: Use Python 3.9.2
Actually it seems they do have Python 3.9.2.

See: https://devcenter.heroku.com/articles/python-support
2021-03-14 12:25:49 +02:00
73a13145b6 Add runtime.txt
Apparently to deploy on Heroku we need this. And they only support
Python 3.7? Damn...
2021-03-14 12:24:03 +02:00
8f6d337d2d Move style.min.css to css/v1/style.min.css
Trying to break Google App Engine's aggressive caching.
2021-03-14 12:04:28 +02:00
aabb783d99 app.yaml: Set Cache-Control header to private for CSS
Google App Engine aggressively caches stuff. They are currently serving
a 24-hour old version of my CSS after multiple updates and re-deploys.
Ughhh. From their docs:

> After a file is transmitted with a given expiration time, there is
> generally no way to clear it out of web-proxy caches, even if the user
> clears their own browser cache. Re-deploying a new version of the app
> will not reset any caches. Therefore, if you ever plan to modify a
> static file, it should have a short (less than one hour) expiration
> time. In most cases, the default 10-minute expiration time is
> appropriate.

The only way to break this for now is to change the CSS *directory*.
In the future I think we have to be sure to set the private cache
control header, which lets browsers cache it, but not public CDNs.

See: https://cloud.google.com/appengine/docs/standard/python3/how-requests-are-handled
2021-03-14 11:58:57 +02:00
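
The actual fix in aabb783d99 lives in app.yaml, but the idea can be sketched in Python. This assumes a hypothetical helper, `cache_headers`, and a 600-second `max_age` chosen to match the 10-minute default quoted above; neither is taken from the project.

```python
def cache_headers(path: str, max_age: int = 600) -> dict:
    """Return Cache-Control headers for a static path (hypothetical helper)."""
    if path.endswith(".css"):
        # "private" lets the user's browser cache the file but tells
        # shared caches (proxies, CDNs) not to store it.
        return {"Cache-Control": f"private, max-age={max_age}"}
    # Other paths get no extra caching headers in this sketch.
    return {}
```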
bd31ac912e Regenerate static assets 2021-03-14 11:42:07 +02:00
61040ea4a5 Use an ILRI theme 2021-03-14 11:41:48 +02:00
d483f7fc0b .gitignore: Ignore sqlite requests response cache 2021-03-14 11:36:46 +02:00
c07716abcb .gcloudignore: Ignore sqlite requests response cache 2021-03-14 11:36:09 +02:00
36b43a06b9 templates/index.html: Add link to test.csv
Let users test the tool with this file.
2021-03-14 11:34:55 +02:00
25aa74ba14 app.yaml: Try to serve static CSS directly
Google App Engine is currently caching an old version of my CSS, so
I am trying to get it to use the correct version. Let's try serving
it directly from the filesystem.

See: https://cloud.google.com/appengine/docs/standard/php/getting-started/serving-static-files
2021-03-14 10:36:57 +02:00
a2ace2d331 .gcloudignore: Don't upload source dir
We only need this during build.
2021-03-14 10:31:46 +02:00
94cdd7fb50 Regenerate requirements
Generated using poetry:

  $ poetry export --without-hashes -f requirements.txt > requirements.txt
  $ poetry export --without-hashes --dev -f requirements.txt > requirements-dev.txt
2021-03-14 10:09:18 +02:00
74bde23567 poetry.lock: Run poetry update 2021-03-14 10:08:44 +02:00
4e52d1bcc9 Add configurable requests cache directory
As I expected, on Google App Engine we can't write the cache file
to the current working directory. I modified csv-metadata-quality
CLI to check for the REQUESTS_CACHE_DIR environment variable so we
don't really have to do anything different other than setting the
variable.
2021-03-14 10:06:39 +02:00
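
A rough sketch of the lookup described in 4e52d1bcc9, assuming the variable simply names a directory and that the fallback is the current working directory. The helper name `cache_path` is hypothetical and this is not the csv-metadata-quality implementation.

```python
import os
from pathlib import Path


def cache_path(filename: str = "agrovoc-response-cache.sqlite") -> Path:
    """Return where the cache file would live, honoring REQUESTS_CACHE_DIR."""
    # Fall back to the current working directory when the variable is unset.
    cache_dir = os.environ.get("REQUESTS_CACHE_DIR", ".")
    return Path(cache_dir) / filename
```

On a platform where only /tmp is writable, setting `REQUESTS_CACHE_DIR=/tmp` in the environment would then be the only change needed.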
e7dd8d1421 Add AGROVOC lookup support
This works locally, but I don't think it will work on App Engine
because csv-metadata-quality uses requests-cache and creates the
agrovoc-response-cache.sqlite file in the current working directory.
2021-03-13 23:49:24 +02:00
9aab2ae83f templates/index.html: Remove basic and experimental labels 2021-03-13 23:34:34 +02:00
6c3804d55b Add support for skipping fields ("-x") 2021-03-13 23:34:11 +02:00
122d9fd53c Add support for experimental checks ("-e") 2021-03-13 23:01:11 +02:00
198acdb1a7 Major refactor
Re-work upload and file processing so they are in the same Python
function. Now I will start exposing other command line options in
the form, like unsafe fixes, excluding fields, etc. Now I see that
it is easier to save the POSTed file and process it in the same
function so I don't have to pass around the other POSTed form
values as URL query parameters.

Now, as a result of changing the flow above, I also had to make a
change to the way I show the results page. Instead of processing
the file and returning the rendered results to the user directly,
I process the file, save the rendered results to /tmp, and redirect
the user to the results page.
2021-03-13 22:11:26 +02:00
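
The flow described in 198acdb1a7 can be sketched without any web framework. Everything named here is an assumption for illustration: `handle_upload`, the `process` callback, the `/result/<id>` URL shape, and the use of `tempfile.gettempdir()` in place of a hard-coded /tmp.

```python
import tempfile
import uuid
from pathlib import Path


def handle_upload(data: bytes, process, tmp_dir: str = tempfile.gettempdir()) -> str:
    """Save the upload, process it, store the rendered results, and
    return the URL the client should be redirected to."""
    job_id = uuid.uuid4().hex

    # Save the POSTed file so the processing step can read it from disk.
    upload_path = Path(tmp_dir) / f"{job_id}.csv"
    upload_path.write_bytes(data)

    # `process` stands in for whatever renders the results page as text.
    rendered = process(upload_path)

    # Store the rendered results next to the upload.
    result_path = Path(tmp_dir) / f"{job_id}.html"
    result_path.write_text(rendered, encoding="utf-8")

    # The caller would answer the POST with a redirect to this URL.
    return f"/result/{job_id}"
```

Keeping upload and processing in one function means the other form values are available as ordinary arguments instead of being threaded through query parameters.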
adc2d06094 main.py: Add note about using /tmp
I originally wanted to use an "uploads" directory or something, but
it seems we can only write to /tmp on Google App Engine. They really
want you to buy storage or database services! This is memory mapped
so it disappears when you re-deploy.
2021-03-13 22:09:17 +02:00
bc256b242d static/css/style.min.css: Regenerate 2021-03-13 22:07:53 +02:00
f82cb6ce05 source/scss/main.scss: Increase container width
We need more space for the log on the results page and we actually
don't even need to worry about people running this on a phone.
2021-03-13 22:06:43 +02:00
4bdec3b889 app.yaml: Use Python 3.9
Python 3.9 is apparently now generally available.
2021-03-13 14:12:14 +02:00