Failing test in new installation #14

Open
erikedlund opened this issue Dec 3, 2024 · 7 comments

erikedlund commented Dec 3, 2024

Thank you for publishing such a cool tool! I'm running into some issues installing and using the scoring function.

First, I'm seeing tests failing in a fresh installation:

FAILED tests/test_biowrappers.py::TestBioWrappers::test_depth_res - Failed: Unexpected success
=================================================== 1 failed, 21 passed, 3 xfailed, 144223 warnings in 347.77s (0:05:47) ================================================

DeepRank-GNN-esm_test.txt

When I attempt to use the model, an assertion error is thrown suggesting the embeddings are invalid:

$ deeprank-gnn-esm-predict ./053_5.pdb C A ../paper_pretrained_models/scoring_of_docking_models/gnn_esm/treg_yfnat_b64_e20_lr0.001_foldall_esm.pth.tar
2024-12-03 20:51:26,609 predict:60 INFO - Setting up workspace - /share/data/public/edlunde/git/DeepRank-GNN-esm/test/053_5-gnn_esm_pred_C_A
2024-12-03 20:51:26,616 predict:69 INFO - Renumbering PDB file.
2024-12-03 20:51:26,685 predict:112 INFO - Reading sequence of PDB 053_5.pdb
2024-12-03 20:51:26,716 predict:138 INFO - Generating embedding for protein sequence.
2024-12-03 20:51:26,716 predict:139 INFO - ################################################################################
2024-12-03 20:51:33,577 predict:154 INFO - Read /share/data/public/edlunde/git/DeepRank-GNN-esm/test/053_5-gnn_esm_pred_C_A/all.fasta with 2 sequences
2024-12-03 20:51:33,586 predict:164 INFO - Processing 1 of 1 batches (2 sequences)
2024-12-03 20:51:43,586 predict:206 INFO - ################################################################################
2024-12-03 20:51:43,669 predict:211 INFO - Generating graph, using 31 processors
Graphs added to the HDF5 file
Traceback (most recent call last):
  File "~/conda/envs/deeprank-gnn-esm-cpu-env/bin/deeprank-gnn-esm-predict", line 8, in <module>
    sys.exit(main())
  File "~/conda/envs/deeprank-gnn-esm-cpu-env/lib/python3.9/site-packages/deeprank_gnn/predict.py", line 335, in main
    graph = create_graph(pdb_path=pdb_file.parent, workspace_path=workspace_path)
  File "~/conda/envs/deeprank-gnn-esm-cpu-env/lib/python3.9/site-packages/deeprank_gnn/predict.py", line 216, in create_graph
    GraphHDF5(
  File "~/conda/envs/deeprank-gnn-esm-cpu-env/lib/python3.9/site-packages/deeprank_gnn/GraphGenMP.py", line 141, in __init__
    self._add_embedding(outfile=outfile, pdbs=pdbs, embedding_path=embedding_path)
  File "~/conda/envs/deeprank-gnn-esm-cpu-env/lib/python3.9/site-packages/deeprank_gnn/GraphGenMP.py", line 247, in _add_embedding
    assert not torch.all(torch.eq(embedding_tersor, 0))
AssertionError

I am running in CPU mode, for what it's worth, but the test failure seems unrelated.

@erikedlund

I've discovered a workaround for the crash. My .pdb files are named with a unique convention. Renaming the files to a pattern matching the RCSB convention (e.g. 1XXX.pdb) allows DeepRank-GNN-esm to proceed.
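For anyone with many files to rename, the workaround above could be scripted roughly as follows. This is only a sketch: it assumes a four-character stem (as in 1XXX.pdb) is all that is needed, which is inferred from this thread rather than from the DeepRank-GNN-esm documentation. The function name `copy_with_short_names` and the zero-padded numbering scheme are hypothetical.

```python
import shutil
from pathlib import Path


def copy_with_short_names(src_dir, dst_dir):
    """Copy every *.pdb file in src_dir to dst_dir under a 4-character stem.

    Returns a dict mapping each new file name back to the original name,
    so scores can be matched to the original models afterwards.
    Assumption: a 4-character stem such as '0000' is acceptable to the tool.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    mapping = {}
    for index, pdb in enumerate(sorted(src_dir.glob("*.pdb"))):
        if index > 9999:
            raise ValueError("more than 10000 files; 4 digits are not enough")
        short = dst_dir / f"{index:04d}.pdb"  # hypothetical naming scheme
        shutil.copyfile(pdb, short)           # originals are left untouched
        mapping[short.name] = pdb.name
    return mapping
```

Copying instead of renaming in place keeps the original files intact, and the returned mapping records which short name belongs to which model.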


ntxxt commented Dec 5, 2024

Thank you for your interest in our tool!
Good work!

@ntxxt ntxxt closed this as completed Dec 18, 2024
@erikedlund

For the record, the first issue I reported here remains unsolved (the failing test).

@ntxxt ntxxt reopened this Dec 19, 2024

ntxxt commented Dec 19, 2024

Hi there,

In which part is the code failing? The assert check ensures that the embedding is added correctly to the graph.

If you rename your PDB file to xxxx.pdb, the code should run fine. Also, try using the pdb_tidy command from pdb-tools to clean the PDB file if it is not in the standard format.
I can also look at your input PDB file if you cannot identify the issue.

Best,
Xiaotong

@erikedlund

This is the pytest output from the test step in the installation instructions. I filed this as a single issue because I wondered whether the test failure pointed to the source of the problem with the file names, but it seems the assumption about their length is intended.

FAILED tests/test_biowrappers.py::TestBioWrappers::test_depth_res - Failed: Unexpected success
=================================================== 1 failed, 21 passed, 3 xfailed, 144223 warnings in 347.77s (0:05:47) ================================================


erikedlund commented Dec 20, 2024

Here's the complete output of a test failure from a fresh installation on a new Ubuntu machine with a 4090.
DeepRank-GNN-esm-test-failures.txt

The main failure seems to be in TestBioWrappers.test_depth_res.

Let me know if there are any other logs I can provide.


ntxxt commented Jan 8, 2025

Hi there, sorry for the late response,

I just took a look, and the error is caused by this line, which involves matrix calculation based on the predicted results:

self.FDR = false_positive / (true_positive + false_positive)

The issue arises because, in this case, the test PDB and targets are 'fake' data. As a result, both the true positive and false positive values predicted by the model are 0, leading to a division by zero error.

If all the other tests pass, your environment is set up correctly and ready to use. This error is unlikely to occur in real-world scenarios, so it shouldn't be a concern moving forward.
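To illustrate the division by zero described above, here is a minimal standalone sketch of a guarded false discovery rate. It is not the code in DeepRank-GNN-esm; the function name `false_discovery_rate` and the choice to return NaN when there are no positive predictions are assumptions made for illustration.

```python
def false_discovery_rate(true_positive, false_positive):
    """Return FP / (TP + FP), or NaN when nothing was predicted positive.

    Hypothetical helper: mirrors the formula quoted above, but avoids
    dividing by zero when both counts are 0 (as with the 'fake' test data).
    """
    predicted_positive = true_positive + false_positive
    if predicted_positive == 0:
        return float("nan")  # FDR is undefined without positive predictions
    return false_positive / predicted_positive
```

With 3 true positives and 1 false positive this gives 0.25; with both counts at 0 it returns NaN instead of raising.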
