DRAFT: Add CFAIR Experiments #62

Open
wants to merge 51 commits into main

Commits (51)
fdd7098
Add CFAIR split
fariedabuzaid Sep 29, 2024
7ed6c8b
Define CFAIR experiment
fariedabuzaid Sep 29, 2024
e575efd
Add L^{\inf}-radial MNIST experiment configuration
fariedabuzaid Oct 6, 2024
34bd806
Add import that is required when running the MNIST training file as is.
YalcinerMustafa Oct 7, 2024
169f9ae
update this keyword to match with the latest code-base.
YalcinerMustafa Oct 7, 2024
4b4a9ba
Fix configuration file and add gpu version to train on dgx.
YalcinerMustafa Oct 7, 2024
635de70
tryout a fix for loading tensors on the wrong device.
YalcinerMustafa Oct 7, 2024
ad655fd
Revert "tryout a fix for loading tensors on the wrong device."
YalcinerMustafa Oct 7, 2024
f0a487b
don't use lu layers, there seems to be a bug there.
YalcinerMustafa Oct 7, 2024
40ee80c
update the config file with l_inf norm.
YalcinerMustafa Oct 7, 2024
661b993
update config file to include lu layers.
YalcinerMustafa Oct 7, 2024
975bebb
revert the use of lu layers.
YalcinerMustafa Oct 7, 2024
6a8b63d
Add configuration entry for radial-exponential.
YalcinerMustafa Oct 8, 2024
478f6e0
add/fix config files.
YalcinerMustafa Oct 8, 2024
e32ec29
Add comment.
YalcinerMustafa Oct 8, 2024
bef50ba
Initialize the dequantizer superclass to smooth the cifar dataset, as…
YalcinerMustafa Oct 9, 2024
f0dbd1c
Add configuration for training with lognormal distribution with vario…
YalcinerMustafa Oct 9, 2024
1f70b8b
Add config file for ablation with lognormal.
YalcinerMustafa Oct 10, 2024
082a0cc
update config file.
YalcinerMustafa Oct 10, 2024
6232a9b
Make the models significantly smaller.
YalcinerMustafa Oct 11, 2024
862a2e5
Change experiment name to not confuse folder names.
YalcinerMustafa Oct 11, 2024
3fb1a27
Add config entry for learning slightly larger models on mnist.
YalcinerMustafa Oct 14, 2024
d4f7133
add new experiment config.
YalcinerMustafa Oct 15, 2024
ed440d2
Add method for showing samples from the dataset. Rescaling to 14*14 s…
YalcinerMustafa Oct 17, 2024
3f857a9
allow for rescaling mnist to 14*14 (instead of only 10*10).
YalcinerMustafa Oct 17, 2024
3a8fdb2
Update config file for training on mnist 14*14.
YalcinerMustafa Oct 17, 2024
65c9d7b
change experiment name and minor param change.
YalcinerMustafa Oct 17, 2024
9457ff9
improve config file for training mnist on 14*14.
YalcinerMustafa Oct 18, 2024
5a29004
update config file to try diverse hyperparameters - training keeps term…
YalcinerMustafa Oct 18, 2024
75efe8d
Fix config file.
YalcinerMustafa Oct 18, 2024
00c41cf
Increase patience.
YalcinerMustafa Oct 21, 2024
de211d0
Narrow down the parameters to the ones that worked best.
YalcinerMustafa Oct 22, 2024
6adf28d
Add config file for training on all mnist digits rescaled to 14 times …
YalcinerMustafa Oct 23, 2024
6558f7c
Slightly reduce the number of hyperopt samples to run on all digits.
YalcinerMustafa Oct 23, 2024
0809203
Add a fixed and simplified configuration entry for learning multiple …
YalcinerMustafa Oct 28, 2024
29f901d
Fix config file to actually train different digits (don't use the ove…
YalcinerMustafa Oct 28, 2024
d29d9f0
Add config entry to test multiple location params.
YalcinerMustafa Oct 28, 2024
419fda1
Add new training config for all digits that does not use the override…
YalcinerMustafa Oct 29, 2024
c02f54c
rename config file.
YalcinerMustafa Oct 29, 2024
af228ac
add config entry for training with lu layers.
YalcinerMustafa Oct 30, 2024
610af31
Search for best loc param under use of LU layers.
YalcinerMustafa Oct 30, 2024
8e99bfb
Turn on lu layers.
YalcinerMustafa Oct 31, 2024
6460ddc
Add config file for testing smaller networks with lu layer.
YalcinerMustafa Nov 2, 2024
52b6e75
Reduce number of coupling nn layers to avoid the model getting too big.
YalcinerMustafa Nov 2, 2024
0dca0bc
Add training for larger models.
YalcinerMustafa Nov 6, 2024
136eeac
Increase patience and allow for even lower learning rates to avoid ea…
YalcinerMustafa Nov 7, 2024
ccb6ca1
Change loc param and adjust lr.
YalcinerMustafa Nov 8, 2024
9d4829b
update dependencies (ray version no longer available)
Nov 10, 2024
4853111
Minor adaptations for CFAIR
Nov 10, 2024
a07bd78
Correct dimension
Nov 10, 2024
4d84c7f
update config
Nov 10, 2024
94 changes: 94 additions & 0 deletions experiments/cfair/cfair.yaml
@@ -0,0 +1,94 @@
---
__object__: src.explib.base.ExperimentCollection
name: cfair_ablation
experiments:
- &exp_rad_logN
__object__: src.explib.hyperopt.HyperoptExperiment
name: cfair_full_radial_logN
device: cuda
scheduler: &scheduler
__object__: ray.tune.schedulers.ASHAScheduler
max_t: 1000000
grace_period: 1000000
reduction_factor: 2
num_hyperopt_samples: &num_hyperopt_samples 4
gpus_per_trial: &gpus_per_trial 1
cpus_per_trial: &cpus_per_trial 1
tuner_params: &tuner_params
metric: val_loss
mode: min
trial_config:
logging:
images: true
"image_shape": [32, 32, 3]
dataset: &dataset
__object__: src.explib.datasets.Cifar10Split
label: 0
device: cuda
epochs: &epochs 200000
patience: &patience 2
batch_size: &batch_size
__eval__: tune.choice([32])
optim_cfg: &optim
optimizer:
__class__: torch.optim.Adam
params:
lr:
__eval__: 1e-4
weight_decay: 0.0

model_cfg:
type:
__class__: &model src.veriflow.flows.NiceFlow
params:
soft_training: true
training_noise_prior:
__object__: pyro.distributions.Uniform
low:
__eval__: 1e-20
high: 0.01
prior_scale: 1.0
coupling_layers: 5
coupling_nn_layers: [300, 300, 300]
nonlinearity: &nonlinearity
__eval__: tune.choice([torch.nn.ReLU()])
split_dim: 392
base_distribution:
__object__: src.veriflow.distributions.RadialDistribution
device: cuda
p: 1.0
loc:
__eval__: torch.zeros(3072).to("cuda")
norm_distribution:
__object__: pyro.distributions.LogNormal
loc:
__eval__: torch.zeros(7).to("cuda")
scale:
__eval__: (.5 * torch.ones(1)).to("cuda")
use_lu: true
- &exp_laplace
__overwrites__: *exp_rad_logN
name: cfair_full_laplace
trial_config:
model_cfg:
params:
base_distribution:
__exact__:
__object__: pyro.distributions.Laplace
loc:
__eval__: torch.zeros(784).to("cuda")
scale:
__eval__: torch.ones(784).to("cuda")
- &exp_normal
__overwrites__: *exp_rad_logN
name: cfair_full_normal
trial_config:
model_cfg:
params:
base_distribution:
__exact__:
__object__: pyro.distributions.Normal
loc:
__eval__: torch.zeros(784).to("cuda")
scale:
__eval__: torch.ones(784).to("cuda")
66 changes: 66 additions & 0 deletions experiments/cfair/cfair_cpu.yaml
@@ -0,0 +1,66 @@
---
__object__: src.explib.base.ExperimentCollection
name: cfair_ablation
experiments:
- &exp_rad_logN
__object__: src.explib.hyperopt.HyperoptExperiment
name: cfair_full_radial_logN
scheduler: &scheduler
__object__: ray.tune.schedulers.ASHAScheduler
max_t: 1000000
grace_period: 1000000
reduction_factor: 2
num_hyperopt_samples: &num_hyperopt_samples 1
gpus_per_trial: &gpus_per_trial 0
cpus_per_trial: &cpus_per_trial 1
tuner_params: &tuner_params
metric: val_loss
mode: min
trial_config:
logging:
images: true
"image_shape": [32, 32, 3]
dataset: &dataset
__object__: src.explib.datasets.Cifar10Split
dataloc: "/home/mustafa/local/dataset/cifar/"
label: 0
epochs: &epochs 2
patience: &patience 1
batch_size: &batch_size
__eval__: tune.choice([32])
optim_cfg: &optim
optimizer:
__class__: torch.optim.Adam
params:
lr:
__eval__: 1e-4
weight_decay: 0.0

model_cfg:
type:
__class__: &model src.veriflow.flows.NiceFlow
params:
soft_training: true
training_noise_prior:
__object__: pyro.distributions.Uniform
low:
__eval__: 1e-20
high: 0.01
prior_scale: 1.0
coupling_layers: 10
coupling_nn_layers: [300, 300, 300]
nonlinearity: &nonlinearity
__eval__: tune.choice([torch.nn.ReLU()])
split_dim: 392
base_distribution:
__object__: src.veriflow.distributions.RadialDistribution
device: cpu
p:
__eval__: math.inf
loc:
__eval__: torch.zeros(784).to("cpu")
norm_distribution:
__object__: pyro.distributions.Exponential
rate:
__eval__: 1 * torch.ones(1).to("cpu")
use_lu: false
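
Compared to cfair.yaml, this CPU variant sets p to math.inf and uses an Exponential norm distribution, i.e. an L-infinity-radial base. As a mental model (one common construction of such a distribution; the repo's RadialDistribution may differ in detail), sampling can be pictured as drawing a direction uniformly on the unit L-infinity sphere and scaling it by a radius drawn from the norm distribution:

import torch

def sample_linf_radial(n, dim, rate=1.0, device="cpu"):
    # Sketch under the assumption that norm_distribution gives the law of the
    # L-inf norm of a sample: x = r * u with r ~ Exponential(rate) and u uniform
    # on the unit L-inf sphere (max_i |u_i| == 1). Not necessarily identical to
    # src.veriflow.distributions.RadialDistribution.
    u = torch.empty(n, dim, device=device).uniform_(-1.0, 1.0)
    u = u / u.abs().max(dim=1, keepdim=True).values  # project onto the L-inf sphere
    r = torch.distributions.Exponential(rate).sample((n, 1)).to(device)
    return r * u

samples = sample_linf_radial(4, 32 * 32 * 3)  # four flattened CIFAR-sized samples
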
73 changes: 73 additions & 0 deletions experiments/mnist/mnist_0_scaled_14_linf_lognormal_cpu.yaml
@@ -0,0 +1,73 @@
---
__object__: src.explib.base.ExperimentCollection
name: mnist_logNormal_linf_medium_better
experiments:
- &mnist_logNormal_linf_loc_1_scale_05_medium_sized
__object__: src.explib.hyperopt.HyperoptExperiment
name: mnist_logNormal_linf_loc_1_scale_05_medium_sized
scheduler: &scheduler
__object__: ray.tune.schedulers.ASHAScheduler
max_t: 1000000
grace_period: 1000000
reduction_factor: 2
num_hyperopt_samples: &num_hyperopt_samples 1
gpus_per_trial: &gpus_per_trial 0
cpus_per_trial: &cpus_per_trial 1
tuner_params: &tuner_params
metric: val_loss
mode: min
device: &device cpu
trial_config:
logging:
images: true
"image_shape": [14, 14]
dataset: &dataset
__object__: src.explib.datasets.MnistSplit
scale: true
digit: 0
device: *device
scale_factor: 2
epochs: &epochs 2
patience: &patience 1
batch_size: &batch_size
__eval__: tune.choice([16, 32, 64])
optim_cfg: &optim
optimizer:
__class__: torch.optim.Adam
params:
lr:
__eval__: tune.loguniform(1e-7, 1e-4)
weight_decay: 0.0
model_cfg:
type:
__class__: &model src.veriflow.flows.NiceFlow
params:
soft_training: true
training_noise_prior:
__object__: pyro.distributions.Uniform
low:
__eval__: 1e-30 * torch.ones(1).to("cpu") #1e-20
high:
__eval__: 0.001 * torch.ones(1).to("cpu") #0.01
prior_scale: 5.0
coupling_layers: &coupling_layers
__eval__: tune.choice([i for i in range(3, 4)])
coupling_nn_layers: &coupling_nn_layers
__eval__: "tune.choice([[w] * l for l in [1, 2, 3] for w in [196, 392]])" # tune.choice([[c*32, c*16, c*8, c*16, c*32] for c in [1, 2, 3, 4]] + [[c*64, c*32, c*64] for c in range(1,5)] + [[c*128] * 2 for c in range(1,5)] + [[c*256] for c in range(1,5)])
nonlinearity: &nonlinearity
__eval__: tune.choice([torch.nn.ReLU()])
split_dim: 50
base_distribution:
__object__: src.veriflow.distributions.RadialDistribution
device: *device
p:
__eval__: math.inf
loc:
__eval__: torch.zeros(196).to("cpu")
norm_distribution:
__object__: pyro.distributions.LogNormal
loc:
__eval__: (1.2 * torch.ones(1)).to("cpu")
scale:
__eval__: (0.5 * torch.ones(1)).to("cpu")
use_lu: false
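
The __eval__ strings are presumably evaluated with ray.tune, torch and math in scope, so the search space defined above is small and easy to enumerate:

from ray import tune

batch_size = tune.choice([16, 32, 64])
lr = tune.loguniform(1e-7, 1e-4)
coupling_layers = tune.choice([i for i in range(3, 4)])  # only one choice: 3
coupling_nn_layers = tune.choice([[w] * l for l in [1, 2, 3] for w in [196, 392]])
# i.e. tune.choice([[196], [392], [196, 196], [392, 392],
#                   [196, 196, 196], [392, 392, 392]])
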
73 changes: 73 additions & 0 deletions experiments/mnist/mnist_0_scaled_14_linf_lognormal_gpu.yaml
@@ -0,0 +1,73 @@
---
__object__: src.explib.base.ExperimentCollection
name: mnist_logNormal_linf_loc_1_scale_05_mnist_14
experiments:
- &mnist_logNormal_linf_loc_1_scale_05_mnist_14
__object__: src.explib.hyperopt.HyperoptExperiment
name: mnist_logNormal_linf_loc_1_scale_05_mnist_14
scheduler: &scheduler
__object__: ray.tune.schedulers.ASHAScheduler
max_t: 1000000
grace_period: 1000000
reduction_factor: 2
num_hyperopt_samples: &num_hyperopt_samples 8
gpus_per_trial: &gpus_per_trial 2
cpus_per_trial: &cpus_per_trial 0
tuner_params: &tuner_params
metric: val_loss
mode: min
device: &device cuda
trial_config:
logging:
images: true
"image_shape": [14, 14]
dataset: &dataset
__object__: src.explib.datasets.MnistSplit
scale: true
digit: 0
device: *device
scale_factor: 2
epochs: &epochs 200000
patience: &patience 150
batch_size: &batch_size
__eval__: tune.choice([16, 32])
optim_cfg: &optim
optimizer:
__class__: torch.optim.Adam
params:
lr:
__eval__: tune.loguniform(1e-4, 1e-2)
weight_decay: 0.0
model_cfg:
type:
__class__: &model src.veriflow.flows.NiceFlow
params:
soft_training: true
training_noise_prior:
__object__: pyro.distributions.Uniform
low:
__eval__: 1e-30 * torch.ones(1).to("cuda") #1e-20
high:
__eval__: 0.001 * torch.ones(1).to("cuda") #0.01
prior_scale: 5.0
coupling_layers: &coupling_layers
__eval__: tune.choice([i for i in range(3, 4)])
coupling_nn_layers: &coupling_nn_layers
__eval__: "tune.choice([[w] * l for l in [1] for w in [294, 400]])"
nonlinearity: &nonlinearity
__eval__: tune.choice([torch.nn.ReLU()])
split_dim: 98
base_distribution:
__object__: src.veriflow.distributions.RadialDistribution
device: *device
p:
__eval__: math.inf
loc:
__eval__: torch.zeros(196).to("cuda")
norm_distribution:
__object__: pyro.distributions.LogNormal
loc:
__eval__: torch.ones(1).to("cuda")
scale:
__eval__: (0.5 * torch.ones(1)).to("cuda")
use_lu: false
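
For context on how the scheduler, tuner_params and the per-trial resources fit together, here is a rough sketch of an equivalent setup in plain Ray Tune (the repo's HyperoptExperiment presumably wires this up internally; train_fn below is a placeholder, not part of the repo). Note that grace_period equals max_t in all of these configs, which effectively disables ASHA's early stopping:

from ray import tune
from ray.tune.schedulers import ASHAScheduler

def train_fn(config):
    # Placeholder trainable: build the NiceFlow from the trial config, train it,
    # and return the final validation loss so the tuner can minimize it.
    return {"val_loss": 0.0}

scheduler = ASHAScheduler(max_t=1000000, grace_period=1000000, reduction_factor=2)

tuner = tune.Tuner(
    tune.with_resources(train_fn, {"cpu": 0, "gpu": 2}),  # cpus_per_trial / gpus_per_trial
    tune_config=tune.TuneConfig(
        metric="val_loss",
        mode="min",
        scheduler=scheduler,
        num_samples=8,  # num_hyperopt_samples
    ),
    param_space={
        "batch_size": tune.choice([16, 32]),
        "lr": tune.loguniform(1e-4, 1e-2),
    },
)
results = tuner.fit()
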