- bugfix: silero segmenter assigned file duration values
- added nan check for imported features
- added LOGO result output
- added manual seed to torch models
- fixed bugs in plot
- added import_files_append=False
- added a safety to remove nan values after mapping
- added error message and hint for data.target_tables_append
- fixed bug in dataset loading
- ccc in plots now configurable
- bugfix in plot
- changed class_label in plots to actual target
- made explore module more robust
- integrated pyannote for speaker prediction for predict module
- added some output to automatic speaker id
- added a speaker plot to pyannote results
- added first version of automatic speaker prediction for segment module
- some additions for robustness
- making lint work by excluding constants from check
- minor refactoring in ensemble module
- fixed duration display in segmentation
- added possibility to use original segmentations (without max. duration)
- added plot format for multidb
- refactorings and documentations
- added probability output to finetuning classification models
- switched path to prob. output from "store" to "results"
- Add balancing for finetune and update data README
- augmentation can now be done without target
- random splicing params configurable
- made kde the default for continuous/categorical plots
- fix shap value calculation
- print and save result of feature importance
- added ROC plots and classification report on debug
- added n_jobs for sklearn processing
- renamed num_workers to n_jobs
- removed hack in Praat script
- SVM C val defaults to 1
- fixed agender_agender naming bug
- added performance_weighted ensemble
- some cosmetics
- added use_splits for multidb
- added test speaker assign
- add a unique name to the uncertainty plot
- fix error in speaker embedding (still need speechbrain < 1.0)
- add get_target_name function in util
- added more ensemble methods, e.g. based on uncertainty
- fixed bug in false uncertainty estimation
- changed demo live recording
- changed combine speaker results to show speakers not samples
- added obligatory scatter plot for regression
- added ensemble late fusion and AST features
- added class probability output and uncertainty analysis
- handle single feature sets as strings in the config
- now handles audformat tables where the target is in a file index
- now best (not last) result is shown at end
- fix audio path detection in data csv import
- add finetuning to the demo module
- bugfix: nan in finetuned model and double saving
- import features now get multiindex automatically
- now plots epoch progression for finetuned models
- functionality to push to hub
- fixed bug that prevented wavlm finetuning
- added regression to finetuning
- added other transformer models to finetuning
- added output of the train/dev feature sets actually used by the model
- added data, and automatic task label detection
- fixed bug in model_finetuned that label_num was constant 2
- first version with finetuning wav2vec2 layers
- made resample independent of config file
- added SHAP analysis
- started with finetuning
- fixed a naming error in trill features that prevented storage of experiment
- added default cuda if present and not stated
- add test module to nkuluflag
- test module now prints out reports
- fixed bug in wavlm
- fixed another audformat peculiarity to interpret time values as nanoseconds
- fixed audformat peculiarity that dataframes can have only one column
- Add more tests for GC action
- added nkuluflag module
- bugfixes
- added whisper feature extractor
- updated documentation
- updated crema-d
- updated tests
- added sex=gender for speaker mappings
- fixed bug in demo module
- removed [MODEL] save
- added confidence intervals to result reporting
- added a parselmouth.Praat error if pitch out of range
- changed file path for demo_predictor
- fixed bugs in demo module
- made kernel for SVM/SVR configurable
- added test selection to test module
- added test-file folder to demo file lists
- made sounddevice use optional as Portaudio library causes difficulties
- fixed bug that caused clash with GPU/CPU use
- added support for string value in import_features
- added support for multiple extra training databases when doing multi-db experiments
- fixed bug no feature import
- add support for multiple import feature files
- fixed bug on demo without in- or output
- fixed bug that demo with DL feature extractors did not work
- added functionality in demo for regression
- fixed bug that test module did not work
- fixed bug that demo module did not work for ANNs
- added csv output for demo mode and file lists
- fixed bug and report number of epochs for early stopping
- root directory does not have to end with /
- added extra_train for multidb experiment
- added transformer layer selection for wav2vec2
- removed best_model and epoch progression for non-DL models
- added evaluation loss
- added 3-d scatter plots
- removed epoch-plots if epoch_num=1
- fixed bug preventing bin scaling from working
- added bins scaler
- fixed bug with scatter plots for numeric targets
- made type of numeric target distributions selectable, default "hist"
- added simple target distribution plots
- show the best and not the last result for multidb
- added results text for multidb
- added caption to multidb heatmap
- renamed datasets to databases in multidb
- added multidb module
- added functions to call modules with config file path directly
- fixed augmentation bug for Python version 3.10
- made traditional augmentations (audiomentations module) configurable
- added augment and train interface
- added models for features importance computation
- added permutation algorithm to compute feature importance
- shifted util.py to utils
- added more latex report output
- got splitutils from a package
- added possibility to aggregate feature importance models
- added max val for reversing
- added xgb for feature importance
- added standard Wav2vec2 model
- added praat feature extractor for one sample
- fixed bug combining augmentations
- audiomentations interface changed
- combined augmentation methods
- fixed various bugs with augmentation
- added patience (early stopping)
- added MAE loss and measure
- added reverse and scale arguments to target variable
- also, the data store can now be csv
- worked over explore value counts section
- added bin_reals for all columns
- automatic epoch reset if not ANN
- scatter plots now show a regression line
- enabled scatter plots for all variables
- enabled scatter plots for continuous labels
- made a wav2vec default
- renamed praat features, omitting spaces
- fixed plot distribution bugs
- added feature plots for continuous targets
- added explore visuals
- all columns from databases should now be usable
- added imb_learn balancing of training set
- added CNN model and melspec extractor
- bugfix: got_gender was incorrectly set
- Feinberg Praat scripts ignore error and log filename
- column names in datasets are now configurable
- added error message on file to praat extraction
- added stratification framework for split balancing
- added first version of spotlight integration
- small changes related to github worker
- fixed bug that prevented Praat features from being selected
- removed torch from automatic install. depends on cpu/gpu machine
- Removed print statements from feats_wav2vec2
- Version that should install without requiring opensmile, which seems not to be supported by all Apple processors (ARM CPU, Apple M1)
- forgot init.py in reporting module
- minor changes to experiment class
- minor cosmetics
- Latex report now with images
- Pypi version mixup
- made path to PDF output relative to experiment root
- enabled data-patches with quotes
- enabled missing category labels
- used tqdm for progress display
- start on the latex report framework
- added speechbrain speakerID embeddings
- added a filter that ensures that the labels have the same size as the features
- changed default behaviour of resampler to "keep original files"
- more databases and force wav while resampling
- minor catch for seaborn in plots
- added fill_na in plot effect size
- added datasets to distribution
- changes in wav2vec2
- various bugfixes
- fixed bug in dataset.csv that prevented correct paths for relative files
- fixed bug in export module concerning new file directory
- small enhancements with transformer features
- introduced export module
- added num_speakers for reloaded data
- re-formatted all with black
- added number of speakers shown after data load
- added init.py for submodules
- fix error on csv
- added bin_reals
- added statistics for effect size and correlation to plots
- fixed bug in split selection
- Introduced data.audio_path
- re-introduced min and max_length for silero segmentation
- fixed bug in resample
- added wavlm model
- added error on filename for models
- added min and max_length for silero segmentation
- fixed segment silero bug
- added all Wav2vec2 models
- added resampler module
- added error on file for embeddings
- added HUBERT embeddings
- some bugfixes
- new package structure
- fixed wav2vec2 bugs
- removed "cross_data" strategy
- bugfix, after fresh install, it seems some libraries have changed
- added no_warnings
- changed print() to util.debug()
- added progress to opensmile extract
- introduced SQUIM features
- added SDR predict
- added STOI predict
- added dominance predict
- added MOS predict
- added PESQ predict
- renamed autopredict to predict
- added arousal autopredict
- added valence autopredict
- added autopredict module
- added snr as feature extractor
- added gender autopredict
- added age autopredict
- added snr autopredict
- changed error message in plot class
- added segmentation module
- added audeering public age and gender model embeddings and age and gender predictions
- added file checks: size in bytes and voice activity detection with silero
- bugfix: min/max duration_of_sample was not working
- added flexible value distribution plots
- added datafilter
- added caller information for debug and error messages in Util
- removed loso and added pre-selected logo (leave-one-group-out), aka folds
- bugfix: samples selection for augmentation didn't work
- added random-splicing
- bugfix: database object was not loaded when dataframe was reused
- enabled specific feature selection for praat and opensmile features
- enabled feature storage format csv for opensmile features
- added praat speech rate features
- added warnings for non-existent parameters
- added sample selection for scatter plotting
- added version attribute to setup.cfg
- added version attribute
- bugfix: sample_selection in EXPL was required wrongly
- added sample_selection for sample distribution plots
- fixed dataframe.append bug
- added auddim as features
- added FEATS store_format
- added device use to feat_audmodel
- bugfixes
- added scatter functions: tsne, pca, umap
- added clap features
- small bugs
- because of difficulties with numba and audiomentations, audiomentations is now imported only when augmenting
- added error when experiment type and predictor don't match
- fixed further bugs and added augmentation to the test runs
- fixed a bug when running continuous variable as classification problem
- fixed test_runs
- added augmentation module based on audiomentations
- age labels should now be detected in databases
- added feature tree plot
- fixed a bug: additional test database was not label encoded
- added EXPL section and first functionality
- added test module (for test databases)
- added feature distribution plots
- added plot format
- added demo mode with list argument
- fixed a bug concerned with "no_reuse" evaluation
- demo mode with file argument
- fixed demo mode
- mainly replaced pd.append with pd.concat
- fixed bug preventing praat feature extraction from working
- fixed bug: csv import was not detecting multiindex
- published as a pypi module
- added entry nkululeko.py script
- fixed bug that prevented scaling (normalization)
- smaller bug fixed concerning the loss_string
- smaller bug fixes and tried Soft_f1 loss
- smaller bug fixes and debug outputs
- added GMM as a model type
- added audmodel embeddings as features
- added models: tree and tree_reg
- added models: bayes, knn and knn_reg
- fixed hello world example
- bug fix for 0.29
- added a new FeatureExtractor class to import external data
- removed some Pandas warnings
- added no_reuse function to database.load()
- with database.value_counts show only the data that is actually used
- made "label_data" configuration automatic and added "label_result"
- added "label_data" configuration to label data with trained model (so now there can be train, dev and test set)
- Fixed some bugs caused by the multitude of feature sets
- Added possibility to distinguish between absolute and relative paths in csv datasets
- added the rename_speakers functionality to prevent identical speaker names in datasets
- fixed bug that no features were chosen if not selected
- made selectable features universal for feature sets
- added multiple feature sets (will simply be concatenated)
- added selectable features for Praat interface
- added David R. Feinberg's Praat features, praise also to parselmouth
- Revoked 0.20.0
- Added support for only_test = True, to enable later testing of trained models with new test data
- implemented reuse of trained and saved models
- added "max_duration_of_sample" for datasets
- added support for learning and dropout rate as argument
- added support for epoch number as argument
- added support for ANN layers as arguments
- added reuse of test and train file sets
- added parameter to scale continuous target values: target_divide_by
- added preference of local dataset specs to global ones
- added regression value display for confusion matrices
- added leave one speaker group out
- fixed scaler, added robust
- Added minimum duration for test samples
- Added possibility to combine predictions per speaker (with mean or mode function)
- Added minimal sample length for databases
- Added k-fold-cross-validation for linear classifiers
- Added leave-one-speaker-out for linear classifiers
- Added random sample splits