Developed in the Laboratory of Atmospheric Physics of Thessaloniki, Greece.
To process the data from broadband instruments of LAP.
Some plots and reports should be found here.
This is partially used in operational procedures (github.com/thanasisn/CS_id is still in use).
Name | Rows | Vars | Values | Size | Fill | Bytes/Value |
---|---|---|---|---|---|---|
BBDB | 16847272 | 81 | 533786332 | 2.6 GiB | 39.12% | 5.33 |
BBDB meta | 11702 | 82 | 505108 | 2.2 MiB | 52.64% | 4.52 |
TrackerDB | 8770526 | 23 | 78860710 | 169.4 MiB | 39.09% | 2.25 |
TrackerDB meta | 3208 | 9 | 28120 | 368.0 KiB | 97.4% | 13.4 |
Raw files hashes | 812419 | 4 | 3249676 | 4.1 MiB | 100% | 1.32 |
Total | 26445127 | 199 | 616429946 | 2.8 GiB | NA% | 4.91 |
Table: Dataset sizes on 2025-01-12
- Digest raw data
  - Signal from CHP-1
  - Tracker "async"
  - CHP-1 internal temperature from thermistor
- Bad data ranges flagging
  - From manually set execution ranges
  - From acquisition signal physical limits
- Converts signal to radiation
  - Computes temperature correction when possible
- Plots
  - Overview of Clean/Dirty signal
  - Daily signal with and without dark
  - Overview of Direct radiation measurements
  - Daily Direct radiation measurements
- Digest raw data
  - Signal from CHP-1
- Bad data ranges flagging
  - From manually set execution ranges
  - From acquisition signal physical limits
- Converts signal to radiation
- Plots
  - Overview of Clean/Dirty signal
  - Daily signal with and without dark
- Quality Check of radiation data (QCRad)
  - Flags data using mainly the algorithm of C. N. Long and Y. Shi (2006)
- Imports data from github.com/thanasisn/TSI
  - `Sun_Dist_Astropy`: Sun - LAP distance
  - `TSI_TOA`: TSI at TOA at LAP
  - `TSI_1au`: TSI
  - `TSI_source`: TSI data source
- Imports atmospheric pressure data from proxies
  - `Pressure`: Atmospheric pressure at LAP
  - `Pressure_source`: Data source
- Keeps an `md5sum` of all input files to check for bit rot and other data corruption.
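The checksum bookkeeping mentioned above can be sketched roughly as follows. This is only an illustration in Python (the project's own scripts are R), and the function names `md5_of_file` and `find_changed_files`, as well as the `known_hashes` mapping, are hypothetical and not taken from the project. The idea is to recompute the MD5 digest of each input file and compare it with a previously stored value: an unknown path is a new file, while a different digest for a known path suggests the file was modified or corrupted.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed_files(paths, known_hashes):
    """Compare current digests with stored ones.

    known_hashes is a hypothetical mapping {path string: md5 hex digest}.
    Returns (new_files, changed_files) as two lists of path strings.
    """
    new_files, changed_files = [], []
    for path in map(str, paths):
        current = md5_of_file(path)
        stored = known_hashes.get(path)
        if stored is None:
            # Path never seen before: a candidate for import.
            new_files.append(path)
        elif stored != current:
            # Same path, different content: modified or corrupted.
            changed_files.append(path)
    return new_files, changed_files
```

A digest mismatch alone cannot tell a legitimate edit from bit rot, so in practice it would probably be combined with other information such as the file modification time.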
- `inspect_days_DB.R`: interactive plot of some data in the DB
- `inspect_days_Lap.R`: interactive plot of some data from source files
- `inspect_days_Lap_sirena.R`: interactive plot of some data from source files
- Fully port all to duckdb
- Replace and compare processes from "CM_21_GLB"
  - All the major stages have been replaced
  - Secondary processes are to be ported
- Process more instruments
- Import libRadtran data
- May import CSid
- Import other references
Some aspects of the implementation of this project.
- We use a dataset of `parquet` files as a database for all measurements and additional data.
- We are migrating the original parquet dataset scheme to `Duckdb` to improve overall efficiency.
- The `parquet` dataset uses one file for each month, which facilitates:
  - Syncing of the data between different computers.
  - Partial processing when needed without using the dataset function.
- It should be easy to migrate to a pure database like `duckdb` or `sqlite`.
- There are some files with extra metadata for the data in the database and the analysis performed.
- We use features of the `arrow` library, and also `data.table` when it is more suitable or clearer to code.
- The analysis should be able to run with under 8 GB of RAM, but this is not assured.
- There is a trade-off with disk usage/wear, especially when starting from scratch.
- New data should be easy to add on a daily basis at all levels.
- New processes and analyses should be easy to add for all data.
- The goal is to become a framework for the analysis and manipulation of data from all broadband instruments.
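To illustrate why a one-file-per-month layout helps with partial processing, here is a minimal sketch in Python (the project itself is written in R). The directory name, the file prefix, the naming pattern, and both function names are assumptions made up for this example, not the project's actual scheme. Given a date range, it lists only the monthly files that cover it, so a reprocessing step can open those files directly instead of scanning the whole dataset.

```python
from datetime import date


def month_file(day, root="dataset", prefix="BBDB"):
    """Hypothetical naming scheme: one parquet file per calendar month."""
    return f"{root}/{prefix}_{day.year:04d}-{day.month:02d}.parquet"


def files_for_range(start, end, **kwargs):
    """List the monthly files covering the inclusive date range [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    files = []
    year, month = start.year, start.month
    # Walk month by month until the month of `end` has been included.
    while (year, month) <= (end.year, end.month):
        files.append(month_file(date(year, month, 1), **kwargs))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return files


if __name__ == "__main__":
    for name in files_for_range(date(2024, 11, 20), date(2025, 1, 12)):
        print(name)
```

The same property is what makes syncing cheap: adding new daily data only changes the file of the current month, so only that file needs to be transferred.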
There is no centralized documentation for the project, although you can refer to:
- `Readme.md` or other markdown files for a relevant overview
- Summary notes at the start of each script
- Comments inside each script
- Compiled reports from each script