Climate Change on Health
Soon
Soon
conda env create -f arboseer.yml
conda activate arboseer
The dataset used to train the LSTM model is composed of:
- CNES - Dataset of Health Units
- SINAN - Dataset of Dengue Cases
- INMET - Dataset of weather station measurements
- LST - Dataset of GOES-16 Land Surface Temperature
- RRQPE - Dataset of GOES-16 Rainfall Rate
The full script for building the dataset can be found in build_dataset.sh (Linux) or build_dataset.bat (Windows).
Note that raw files aren't removed after processing.
In this step, we'll download the latest CNES data, convert it to parquet and add the lat/lon values for the addresses.
The first script downloads the data.
python src/utils/download_cnes_file.py FILETYPE UF DATE DEST_PATH/FILENAME.dbc
The next one converts it from dbc to parquet.
python src/utils/dbc_to_parquet.py INPUT_PATH OUTPUT_PATH
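For reference, the conversion can be done in a few lines with pandas and pyreaddbc (the low-level DATASUS reader the PySUS ecosystem uses). A minimal sketch; the file names and encoding are assumptions:

import pandas as pd
from pyreaddbc import read_dbc  # reads compressed DATASUS .dbc files

# DATASUS files typically use a latin-1 style encoding
df = read_dbc("data/STRJ2301.dbc", encoding="iso-8859-1")
df.to_parquet("data/STRJ2301.parquet", index=False)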
The final script adds the lat/lon values and trims the dataset fields.
python src/process_cnes_dataset.py INPUT_PATH OUTPUT_PATH
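At its core, the lat/lon step is address geocoding. A hypothetical sketch using geopy's Nominatim geocoder; both the column names and the geocoding provider are assumptions, not necessarily what the script uses:

import pandas as pd
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

df = pd.read_parquet("cnes.parquet")

geolocator = Nominatim(user_agent="arboseer")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)  # stay within usage limits

# "address" is an illustrative column; the real CNES field names differ
locations = df["address"].apply(geocode)
df["lat"] = locations.apply(lambda loc: loc.latitude if loc else None)
df["lon"] = locations.apply(lambda loc: loc.longitude if loc else None)
df.to_parquet("cnes_geocoded.parquet", index=False)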
Upon finishing this step, we should have the final CNES parquet.
In this step, we'll download the SINAN data using PySUS and extract the cases. Note that PySUS only works on Linux; the files can also be acquired from the SINAN website.
First, we'll download the file for every disease and year.
python src/utils/download_sinan_file.py FILETYPE YEAR OUTPUT_PATH
Then, we'll merge the files into a single dataset. As of now, the dataset has a fixed name: concat.parquet
python src/unify_sinan.py INPUT_PATH OUTPUT_PATH
Finally, we'll extract the cases we want, trimming the dataset fields in the process. Here we have the option to filter by UF (with --cod_uf: RJ is 33, SP is 35, and so on) or by CNES ID (with --cnes_id). We can also fill the dataset, inserting rows with 0 cases for the dates that aren't present in the data (see the sketch after the command below).
python src/extract_sinan_cases.py INPUT_PATH OUTPUT_PATH --cod_uf COD_UF --filled --start_date YYYY-MM-DD --end_date YYYY-MM-DD
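The --filled behavior boils down to counting cases per day and reindexing over the full date range so missing days become explicit zero rows. A rough pandas equivalent; the file name and date range are placeholders:

import pandas as pd

df = pd.read_parquet("concat.parquet")
df["DT_NOTIFIC"] = pd.to_datetime(df["DT_NOTIFIC"])  # SINAN notification date

# count cases per day, then reinsert the missing days as zeros
daily = df.groupby("DT_NOTIFIC").size()
full_range = pd.date_range("2022-01-01", "2022-12-31", freq="D")
daily = daily.reindex(full_range, fill_value=0).rename_axis("date").reset_index(name="cases")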
In this step, we'll download and unify the INMET data. You'll need an INMET API token to make requests.
python src/utils/download_inmet_data.py -s STATION -b YYYY -e YYYY -o OUTPUT_PATH --api_token INMET_API_TOKEN
With the INMET data downloaded, we'll unify all the files into a single dataset. We can also use --aggregated True to aggregate the values from an hourly basis to a daily basis.
python src/unify_inmet.py INPUT_PATH OUTPUT_PATH --aggregated True
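The --aggregated option is essentially an hourly-to-daily resample per station. A sketch of the idea; the column names and aggregation rules are assumptions, the real ones live in unify_inmet.py:

import pandas as pd

df = pd.read_parquet("inmet_hourly.parquet")
df["datetime"] = pd.to_datetime(df["datetime"])

daily = (
    df.set_index("datetime")
      .groupby("station")
      .resample("D")                                          # hourly -> daily buckets
      .agg({"temperature": "mean", "precipitation": "sum"})   # illustrative columns
      .reset_index()
)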
In this step, we'll be downloading LST data and converting it into a single dataset. Right now we're using a fixed extent around Rio de Janeiro and a fixed output file name, lst.parquet. In this process, we also aggregate measurements from hourly to daily, creating the MIN, MAX, and AVG of each temperature.
python src/calculate_min_max_avg_lst.py YYYYMMDD YYYYMMDD DOWNLOAD_PATH OUTPUT_PATH
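Conceptually, the script stacks one day's hourly LST grids over the fixed extent and reduces them along the time axis. A simplified numpy sketch, with random stand-in data in place of the real cropped GOES-16 grids:

import numpy as np

# stand-in for 24 hourly LST grids already cropped to the extent (NaN = cloud/no data)
hourly_grids = [np.random.rand(100, 100) for _ in range(24)]

stack = np.stack(hourly_grids)        # shape: (hours, rows, cols)
lst_min = np.nanmin(stack, axis=0)    # per-pixel daily MIN
lst_max = np.nanmax(stack, axis=0)    # per-pixel daily MAX
lst_avg = np.nanmean(stack, axis=0)   # per-pixel daily AVG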
In this step, we'll be downloading RRQPE data and converting it into a single dataset. Right now we're using a fixed extent around Rio de Janeiro and a fixed output file name, rrqpe.parquet. In this process, we also aggregate measurements from hourly to daily, creating the SUM of the hourly rainfall rates.
python src/calculate_accumulated_rrqpe.py YYYYMMDD YYYYMMDD DOWNLOAD_PATH OUTPUT_PATH
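The accumulation is the same stacking idea with a sum: RRQPE is an hourly rainfall rate (mm/h), so summing the 24 grids yields the daily accumulated rainfall. A sketch with stand-in data:

import numpy as np

hourly_rain = [np.random.rand(100, 100) for _ in range(24)]  # stand-in for cropped RRQPE grids
daily_total = np.nansum(np.stack(hourly_rain), axis=0)       # per-pixel accumulated mm/day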
Finally, we'll combine all the data into a single dataset.
python src/build_dataset.py DENG_DATASET CNES_DATASET INMET_DATASET LST_DATASET RRQPE_DATASET OUTPUT_DATASET --start_date YYYY-MM-DD --end_date YYYY-MM-DD
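The build step is, in essence, a sequence of joins of the daily sources onto the case series (CNES metadata joins in similarly). A hypothetical sketch of that shape; the join keys and file names are assumptions, the actual ones are defined in build_dataset.py:

import pandas as pd

deng = pd.read_parquet("deng.parquet")    # daily case counts
inmet = pd.read_parquet("inmet.parquet")  # daily weather aggregates
lst = pd.read_parquet("lst.parquet")      # daily LST MIN/MAX/AVG
rrqpe = pd.read_parquet("rrqpe.parquet")  # daily accumulated rainfall

dataset = (
    deng.merge(inmet, on="date", how="left")
        .merge(lst, on="date", how="left")
        .merge(rrqpe, on="date", how="left")
)
dataset.to_parquet("dataset.parquet", index=False)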
To train the LSTM model, we simply call the train script, passing the path to the built dataset, the output path (for figures, etc.), and the date that will be used to split the data into train/test sets.
python src/train_lstm.py INPUT_PATH OUTPUT_PATH YYYY-MM-DD
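To make the split-by-date idea concrete, here's a minimal PyTorch-style sketch; the window size, features, and architecture are assumptions, the real ones are defined in train_lstm.py:

import pandas as pd
import torch
from torch import nn

df = pd.read_parquet("dataset.parquet").sort_values("date")
split = pd.Timestamp("2023-01-01")                   # the YYYY-MM-DD argument
train = df[df["date"] < split]["cases"].tolist()     # train on data before the split date

def make_windows(series, window=14):
    # sliding windows of past values -> next-day target
    xs = [series[i:i + window] for i in range(len(series) - window)]
    ys = [series[i + window] for i in range(len(series) - window)]
    x = torch.tensor(xs, dtype=torch.float32).unsqueeze(-1)  # (samples, window, 1)
    y = torch.tensor(ys, dtype=torch.float32).unsqueeze(-1)  # (samples, 1)
    return x, y

class LSTMRegressor(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])                 # predict from the last time step

x_train, y_train = make_windows(train)
model = LSTMRegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(50):
    opt.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    opt.step()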