Merge branch 'release/1.8.0'
s-emerson committed Sep 13, 2021
2 parents 17cabf5 + d1162fc commit 5806b77
Showing 28 changed files with 1,467 additions and 388 deletions.
2 changes: 1 addition & 1 deletion AUTHORS
@@ -25,6 +25,7 @@ Contributors at the University of Florida
* Naomi Braun
* Philip Chase
* Samantha Emerson
* Melissa Moreno
* Kevin S. Hanson
* Matthew McConnell
* Ajantha Ramineni
@@ -35,4 +36,3 @@ Contributors at the University of Florida
Other Contributors

* L. D. Nicolas May <ldnicolasmay@gmail.com>

12 changes: 12 additions & 0 deletions CHANGELOG
@@ -1,5 +1,17 @@
Changelog
=========
## [1.8.0] - 2021-09-13
### Summary

### Updated
* Add Z1X processing to LBD short version
* Update UDS Z1X to include handling for optional LBD short version fields
* Make C2T optional for telephone follow-ups to reflect NACC's DED

### Added
* Add new CV covid module


## [1.7.1] - 2021-02-03
### Summary
This release updates the telephone follow-up packet (TFP) module to include
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,4 +1,4 @@
Copyright (c) 2016–2020, University of Florida.
Copyright (c) 2016–2021, University of Florida.
All rights reserved.

Redistribution and use in source and binary forms, with or without
17 changes: 13 additions & 4 deletions README.md
@@ -3,8 +3,15 @@ NACCulator

[![DOI](https://zenodo.org/badge/20501/ctsit/nacculator.svg)](https://zenodo.org/badge/latestdoi/20501/ctsit/nacculator)

Converts a CSV data file exported from REDCap into the NACC's UDS3 fixed-width
format.
NACCulator is a Python 3-based data converter that changes REDCap .csv exported
data to NACC’s fixed-width .txt format. It is configured for UDS3 forms,
including FTLD and LBD (versions 3.0 and 3.1). It will perform basic data
integrity checks during a run: verifying that each field is the correct type
and length, verifying that there are no illegal characters in the Char fields,
verifying that Num fields are within the acceptable range as defined in NACC's
Data Element Dictionary for each form, and checking that no blanking rules have
been violated. NACCulator outputs a .txt file that is immediately ready to
submit to NACC's database.

_Note:_ NACCulator _**requires Python 3.**_
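
The integrity checks described in the new README text (type, length, illegal characters, allowable range) can be illustrated with a small sketch. This is not NACCulator's implementation: `FieldSpec`, `check_field`, the field names in the usage note, and the contents of `ILLEGAL_CHARS` are all assumptions made for illustration. The authoritative rules are in NACC's Data Element Dictionary.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption: an illustrative set of characters rejected in Char fields.
ILLEGAL_CHARS = set("'\"&%")


@dataclass
class FieldSpec:
    """Hypothetical description of one fixed-width field."""
    name: str
    kind: str                                   # "Num" or "Char"
    length: int                                 # maximum width in characters
    allowed: Optional[Tuple[int, int]] = None   # inclusive range for Num


def check_field(spec: FieldSpec, value: str) -> List[str]:
    """Return a list of human-readable problems found in `value`."""
    errors: List[str] = []
    if value == "":
        # Blank values are left to the blanking-rule checks.
        return errors
    if len(value) > spec.length:
        errors.append(f"{spec.name}: longer than {spec.length} characters")
    if spec.kind == "Num":
        try:
            number = int(value)
        except ValueError:
            errors.append(f"{spec.name}: '{value}' is not a number")
        else:
            if spec.allowed is not None:
                low, high = spec.allowed
                if not low <= number <= high:
                    errors.append(
                        f"{spec.name}: {number} outside {low}-{high}")
    elif spec.kind == "Char":
        bad = sorted(set(value) & ILLEGAL_CHARS)
        if bad:
            errors.append(
                f"{spec.name}: illegal characters {''.join(bad)}")
    return errors
```

For example, `check_field(FieldSpec("SEX", "Num", 1, (1, 2)), "3")` reports a single range problem, while the value `"1"` yields an empty list.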

@@ -30,7 +37,8 @@ REDCap visits (denoted by `redcap_event_name`) contain certain keywords:
"follow" for all followups,
"milestone" for milestone packets,
"neuropath" for neuropathology packets,
"telephone" for telephone followup packets
"telephone" for telephone followup packets,
"covid" for covid-related survey packets

NACCulator collects data from the Z1X form first and uses that to determine the
presence of other forms in the packet. The Z1X form for that record must be
@@ -47,7 +55,7 @@ the `-file` flag._

$ redcap2nacc -h
usage: redcap2nacc [-h]
[-fvp | -ivp | -tfp | -np | -m | -csf | -f {cleanPtid,replaceDrugId,fixHeaders,fillDefault,updateField,removePtid,removeDateRecord,getPtid}]
[-fvp | -ivp | -tfp | -np | -m | -cv | -csf | -f {cleanPtid,replaceDrugId,fixHeaders,fillDefault,updateField,removePtid,removeDateRecord,getPtid}]
[-lbd | -ftld] [-file FILE] [-meta FILTER_META] [-ptid PTID]
[-vnum VNUM] [-vtype VTYPE]

@@ -61,6 +69,7 @@ the `-file` flag._
-tfp3 Set this flag to process as TFP v3.0 (pre-2020) data
-np Set this flag to process as Neuropathology data
-m Set this flag to process as Milestone data
-cv Set this flag to process as COVID data
-csf Set this flag to process as NACC BIDSS CSF data

-f {cleanPtid,replaceDrugId,fixHeaders,fillDefault,updateField,removePtid,removeDateRecord,getPtid}, --filter {cleanPtid,replaceDrugId,fixHeaders,fillDefault,updateField,removePtid,removeDateRecord,getPtid}
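
The README hunk above lists keywords that a `redcap_event_name` must contain for each packet type. A minimal sketch of that kind of lookup follows. The function name `classify_event`, the packet labels, the case-insensitive match, and the choice to test "telephone" before "follow" are assumptions for illustration, not NACCulator's actual code. Only the keywords visible in this hunk are included.

```python
from typing import List, Optional, Tuple

# Keyword -> packet label. More specific keywords come first so that an
# event name containing both "telephone" and "follow" is treated as a
# telephone follow-up (an assumption of this sketch).
KEYWORDS: List[Tuple[str, str]] = [
    ("telephone", "telephone followup"),
    ("covid", "covid"),
    ("milestone", "milestone"),
    ("neuropath", "neuropathology"),
    ("follow", "followup"),
]


def classify_event(redcap_event_name: str) -> Optional[str]:
    """Return the packet label for an event name, or None if no keyword fits."""
    name = redcap_event_name.lower()
    for keyword, packet in KEYWORDS:
        if keyword in name:
            return packet
    return None
```

Ordering matters here: a plain substring test would otherwise classify a hypothetical event such as `telephone_followup_arm_1` as an ordinary follow-up.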
Empty file added nacc/cv/__init__.py
Empty file.
153 changes: 153 additions & 0 deletions nacc/cv/blanks.py
@@ -0,0 +1,153 @@
###############################################################################
# Copyright 2015-2021 University of Florida. All rights reserved.
# This file is part of UF CTS-IT's NACCulator project.
# Use of this source code is governed by the license found in the LICENSE file.
###############################################################################

import csv
import os
import re
import sys
from typing import Callable


def convert_rule_to_python(name: str, rule: str) -> Callable:
    """
    Converts the text `rule` into a python function.

    The returned function accepts one argument of type `Packet` and returns
    True if the field should be blank according to the rule.

    Example:
        packet["FOO"] = "I should be blank!"
        packet["BAR"] = 0
        r = convert_rule_to_python("FOO", "Blank if Question 1 BAR = 0 (No)")
        if packet["FOO"] != "" and r(packet):
            raise RuleError("FOO should be blank, but is not!")

    :param name: Canonical name of the field
    :param rule: Blanking rule text
    """

    special_cases = {

    }

    single_value = re.compile(
        r"Blank if( Question(s?))? *\w+ (?P<key>\w+)"
        r" *(?P<eq>=|ne|is|not =|!=) (?P<value>\d+)([^-]|$)")
    range_values = re.compile(
        r"Blank if( Question(s?))? *\w+ (?P<key>\w+)"
        r" *(?P<eq>=|ne|is|not =|!=) (?P<start>\d+)-(?P<stop>\d+)( |$)")
    blank_value = re.compile(
        r"Blank if( Question(s?))? *\w+ (?P<key>\w+) *(?P<eq>=|ne|is|not =) blank")
    not_answered = re.compile(
        r"Blank if question not answered")

    # First, check to see if the rule is a "Special Case"
    if name in special_cases:
        return special_cases[name](rule)

    # Then, check to see if the rule is of the within-range type
    m = range_values.match(rule)
    if m:
        return _blanking_rule_check_within_range(
            m.group('key'), m.group('eq'), m.group('start'), m.group('stop'))

    # Next, check to see if the rule is of the single-value type
    m = single_value.match(rule)
    if m:
        return _blanking_rule_check_single_value(
            m.group('key'), m.group('eq'), m.group('value'))

    # Next, check to see if the rule is of the "blank if _ = blank" type
    m = blank_value.match(rule)
    if m:
        return _blanking_rule_check_blank_value(
            m.group('key'), m.group('eq'))

    # For the FTLD forms, we need to also check to see if
    # "Blank if question not answered" is included in the blanking rules
    m = not_answered.match(rule)
    if m:
        return lambda packet: False

    # Finally, raise an error since we do not know how to handle the rule
    raise Exception("Could not parse Blanking rule: " + name)


def extract_blanks(csvfile):
    with open(csvfile) as fp:
        reader = csv.DictReader(fp)
        blanks_fieldnames = [f for f in reader.fieldnames if 'BLANKS' in f]
        for row in reader:
            rules = '\t'.join([row[f] for f in blanks_fieldnames]).strip()
            if rules:
                yield "%s:\t%s" % (row['Data Element'], rules)


def _blanking_rule_check_single_value(key, eq, value):
    def should_be_blank(packet):
        """ Returns True if the value should be blank according to the rule """
        if '=' == eq or 'is' == eq:
            return packet[key] == value
        elif 'ne' == eq or 'not =' == eq or '!=' == eq:
            return packet[key] != value
        else:
            raise ValueError(
                "'eq' must be '=', 'is', 'ne', 'not =' or '!=', not '%s'." % eq)

    return should_be_blank


def _blanking_rule_check_within_range(key, eq, start, stop):
    def should_be_blank(packet):
        """ Returns True if the value should be blank according to the rule """
        first = int(start)
        last = int(stop)+1
        # Accept every operator that the range_values regex can capture
        if eq in ('=', 'is'):
            return packet[key] in range(first, last)
        elif eq in ('ne', 'not =', '!='):
            return packet[key] not in range(first, last)
        else:
            raise ValueError(
                "'eq' must be '=', 'is', 'ne', 'not =' or '!=', not '%s'." % eq)

    return should_be_blank


def _blanking_rule_check_blank_value(key, eq, value=None):
    def should_be_blank(packet):
        """ Returns True if the value should be blank according to the rule """
        # Accept every operator that the blank_value regex can capture
        if eq in ('=', 'is'):
            return packet[key] == value
        elif eq in ('ne', 'not ='):
            return packet[key] != value
        else:
            raise ValueError(
                "'eq' must be '=', 'is', 'ne' or 'not =', not '%s'." % eq)

    return should_be_blank


def _blanking_rule_dummy(rule):
    return lambda packet: False


def main():
    """
    Extracts all blanking rules from all DED files in a specified directory.

    Usage:
        python blanks.py ./ded_ivp

    Note: this module is more useful as an imported module; see
    `convert_rule_to_python`.
    """
    data_dict_path = './ded_ivp'
    if len(sys.argv) > 1:
        data_dict_path = sys.argv[1]

    deds = [f for f in os.listdir(data_dict_path) if f.endswith('.csv')]
    for ded in deds:
        for rule in extract_blanks(os.path.join(data_dict_path, ded)):
            print(rule)


if __name__ == '__main__':
    main()
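
To see how the single-value pattern in `convert_rule_to_python` takes a rule apart, here is a self-contained sketch. The regular expression is copied from the diff above. The helpers `parse_single_value` and `make_check`, and the use of a plain `dict` in place of the project's `Packet` class, are assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Pattern copied from the single_value rule in the diff above.
SINGLE_VALUE = re.compile(
    r"Blank if( Question(s?))? *\w+ (?P<key>\w+)"
    r" *(?P<eq>=|ne|is|not =|!=) (?P<value>\d+)([^-]|$)")


def parse_single_value(rule: str) -> Optional[Tuple[str, str, str]]:
    """Return (key, operator, value) for a single-value rule, else None."""
    m = SINGLE_VALUE.match(rule)
    if m is None:
        return None
    return m.group("key"), m.group("eq"), m.group("value")


def make_check(key: str, eq: str, value: str) -> Callable[[Dict[str, str]], bool]:
    """Build a predicate that is True when the rule says 'should be blank'."""
    def should_be_blank(packet: Dict[str, str]) -> bool:
        if eq in ("=", "is"):
            return packet[key] == value
        return packet[key] != value
    return should_be_blank


parsed = parse_single_value("Blank if Question 1 BAR = 0 (No)")
check = make_check(*parsed)
```

Here `parsed` is `("BAR", "=", "0")`. The captured value is the string `"0"`, so in this sketch `check({"BAR": "0"})` is true while `check({"BAR": 0})` is false. A rule with a range such as `= 1-3` does not match this pattern because of the trailing `([^-]|$)` group, which is why the range pattern exists separately.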