1
0
mirror of https://github.com/ilri/csv-metadata-quality.git synced 2025-05-16 17:43:54 +02:00

Improve exclude function

When a user explicitly requests that a field be excluded with -x we
skip that field in most checks. Up until now that did not include
the item-based checks using a transposed dataframe because we don't
know the metadata field names (labels) until we iterate over them.

Now the excludes are respected for item-based checks.
This commit is contained in:
2022-09-02 15:59:22 +03:00
parent 1f76247353
commit 040e56fc76
6 changed files with 54 additions and 24 deletions

@ -293,7 +293,7 @@ def mojibake(field, field_name):
return field
def countries_match_regions(row):
def countries_match_regions(row, exclude):
"""Check for the scenario where an item has country coverage metadata, but
does not have the corresponding region metadata. For example, an item that
has country coverage "Kenya" should also have region "Eastern Africa" acc-
@ -337,6 +337,12 @@ def countries_match_regions(row):
if match is not None:
title_column_name = label
# Make sure the user has not asked to exclude any metadata fields. If so, we
# should return immediately.
column_names = [country_column_name, region_column_name, title_column_name]
if any(field in column_names for field in exclude):
return row
# Make sure we found the country and region columns
if country_column_name != "" and region_column_name != "":
# If we don't have any countries then we should return early before