update enrichment analysis #15

enryH · 2024-11-25T10:06:43Z

No description provided.

- line length restriction to 100 characters

maybe the dataset is not the best one to test enrichment analysis (few differentially regulated protein groups)

How many proteins/genes need to be rejected for a result to be considered valid? Before, it was at least 2 genes.

- will be tested by building docs (maybe add a unittest later)

not sure if the example is the best.

enryH · 2024-11-25T10:07:40Z

acore/enrichment_analysis.py

@@ -308,7 +388,8 @@ def run_enrichment(
            num_background = num_background[0]
        else:
            num_background = 0
-        if method == "fisher" and num_foreground > 1:


To discuss: should we mistrust enrichments that are based on single hits?

I am not sure they would be significant, but filtering them would reduce the number of tests, which may be good for performance.

…ting

- maybe it was there for compatibility, but then we could add kwargs to the function to catch any other arguments (without using them)

- NA can lead to float columns instead of 0. To investigate.
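A small sketch of the kwargs idea from the first point: the function accepts extra keyword arguments so that older call sites keep working, but ignores them. The function name and parameters are hypothetical and only illustrate the pattern.

```python
def count_significant(pvalues, alpha=0.05, **kwargs):
    """Count p-values at or below alpha.

    Extra keyword arguments are accepted for backwards compatibility
    and deliberately not used.
    """
    return sum(1 for p in pvalues if p <= alpha)


# A call that still passes a removed (made-up) parameter does not raise.
n_significant = count_significant([0.01, 0.2, 0.04], alpha=0.05, legacy_option=True)
```

The downside of this design is that mistyped parameter names are silently swallowed as well.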

enryH · 2024-11-26T14:02:51Z

acore/multiple_testing.py

+    _rejected, _pvals_corrected, _, _ = multitest.multipletests(p[mask], alpha, method)
+    pval_corrected[mask] = _pvals_corrected
+    rejected = np.full(p.shape, np.nan)
+    rejected[mask] = _rejected


ToDo: add a test that ensures that 1.0 and 0.0 are interpreted as bool.
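A possible starting point for that test, as a sketch only. It mirrors the masking pattern from the diff but replaces `multitest.multipletests` with a small hand-written Benjamini-Hochberg adjustment so the example depends only on NumPy; the function names are assumptions.

```python
import numpy as np


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values for a 1-D array without NaN."""
    n = pvals.size
    order = np.argsort(pvals)
    ranked = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n, dtype=float)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


def correct_with_missing(p, alpha=0.05):
    """Correct only the finite p-values; missing entries stay NaN."""
    p = np.asarray(p, dtype=float)
    mask = np.isfinite(p)
    pval_corrected = np.full(p.shape, np.nan)
    rejected = np.full(p.shape, np.nan)  # float array, so True/False become 1.0/0.0
    if mask.any():
        adjusted = bh_adjust(p[mask])
        pval_corrected[mask] = adjusted
        rejected[mask] = adjusted <= alpha
    return rejected, pval_corrected


p = [0.01, np.nan, 0.04, 0.5]
rejected, padj = correct_with_missing(p)
valid = np.isfinite(rejected)
# Cast only the non-missing entries: 1.0 -> True, 0.0 -> False.
as_bool = rejected[valid].astype(bool)
```

Because NaN is truthy in Python, the cast to bool is applied only after masking out the missing entries.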

acore/enrichment_analysis.py

docs/api_examples/enrichment_analysis.py

resolve things discussed with Alberto.

enryH · 2024-12-03T15:32:34Z

double check docstrings of module (for rendering)

- prepare to define a common return type by specifying message TYPE_COLS_MSG

sphinx deals differently with parsing the strings...

…orrectly otherwise numpy notation should work fine

…imes - default parameter is rejected, so it is easier to not mistype parameter name

works locally. see: https://stackoverflow.com/a/14414965/9684872

- output should be tsv due to how pandas return is constructed for now

…t analysis

- enrichment only separately for up- and down-regulated protein groups - keep hint on to-do for PTM dataset
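A minimal sketch of how the foreground could be split before running the enrichment twice, assuming a mapping from protein group to log2 fold change and a flag for rejected hypotheses. The function name, the `cutoff` default and the identifiers are made up for illustration.

```python
def split_by_regulation(log2fc, rejected, cutoff=1.0):
    """Return (up, down) lists of rejected protein groups beyond the log2FC cutoff."""
    up = [pg for pg, fc in log2fc.items() if rejected.get(pg, False) and fc >= cutoff]
    down = [pg for pg, fc in log2fc.items() if rejected.get(pg, False) and fc <= -cutoff]
    return up, down


# Made-up example values
log2fc = {"PG1": 2.1, "PG2": -0.8, "PG3": -1.6, "PG4": 3.0}
rejected = {"PG1": True, "PG2": True, "PG3": True, "PG4": False}

up, down = split_by_regulation(log2fc, rejected, cutoff=1.0)
up_low, down_low = split_by_regulation(log2fc, rejected, cutoff=0.5)
```

Lowering the cutoff to 0.5 moves `PG2` into the down-regulated foreground, which is the effect a lower log2FC cutoff has on the example data. Each list would then be used as its own foreground.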

albsantosdel · 2024-12-18T14:37:24Z

Hi, I checked the documentation notebook and the table showing the functional PCA loadings shows multiple GOs per row when there should be only one GO per row. I wonder if you are annotating each protein with all the associated terms (separated by ;)?

GO term annotations had several GO terms concatenated using `;`, which are now split up. Performance can be improved.
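For illustration, the split can be sketched in plain Python as below; this is not the code from the commit, and the accessions and terms are made-up example values. With pandas, the same result should be achievable with `Series.str.split` followed by `DataFrame.explode`.

```python
def split_annotations(annotations, sep=";"):
    """Turn {identifier: "term1; term2"} into one (identifier, term) row per term."""
    rows = []
    for identifier, terms in annotations.items():
        for term in terms.split(sep):
            term = term.strip()
            if term:  # skip empty pieces, e.g. from a trailing separator
                rows.append((identifier, term))
    return rows


# Made-up example annotations
annotations = {
    "P12345": "GO:0005737; GO:0005634",
    "Q67890": "GO:0005737;",
}
rows = split_annotations(annotations)
```

Each row now carries exactly one GO term, so a table built from `rows` shows one GO per row.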

annotation field contains identifier again

enryH added 6 commits November 20, 2024 11:50

🎨 update docstrings and module string

6861fe3

- line length restriction to 100 characters

🚧 test enrichment analysis 🎨 type annotations

64d3414

maybe the dataset is not the best one to test enrichment analysis (few differentially regulated protein groups)

🎨🐛 Minimum number of genes rejected to be eligible

d37dc5e

How many proteins/genes need to be rejected for a result to be considered valid? Before, it was at least 2 genes.

✅ add minimal test for enrichment analysis

faf7158

✨ add fetching information from Uniprot KB

c1c66f9

- will be tested by building docs (maybe add a unittest later)

🎨 fetch annotation from uniprot in enrichment example

0b3cd6e

not sure if the example is the best.

enryH commented Nov 25, 2024


enryH added 8 commits November 25, 2024 12:12

🐛 make decomposition a module

545299b

🎨✅ add fdr alpha parameters as option, optimize and test multiple testing

18121c9

⚡ just work with numpy array, not lists

3015045

🐛 remove unused parameter

13f45e9

- maybe it was there for compatibility, but then we could add kwargs to the function to catch any other arguments (without using them)

🐛 correct name of testing function (to fct tested)

3961d84

🚧 add enrichment example to docs

7f51e02

🐛✨ cast rejected to bool, annotate features using pandas functionality

e63f937

🐛 newer versions of numpy need explicit type

4adc624

- NA can lead to float columns instead of 0. to investigate

enryH requested a review from albsantosdel November 26, 2024 13:59

enryH commented Nov 26, 2024


🚧 annotate and use some more formatting option

6a7db73

enryH changed the title ~~Gsea~~ update enrichment analysis Nov 28, 2024

enryH commented Nov 28, 2024

acore/enrichment_analysis.py (outdated, resolved)

enryH commented Nov 28, 2024

acore/enrichment_analysis.py (outdated, resolved)

enryH commented Nov 28, 2024

acore/enrichment_analysis.py (outdated, resolved)

enryH commented Nov 28, 2024

docs/api_examples/enrichment_analysis.py (resolved)

enryH force-pushed the gsea branch from b1643a9 to 6a7db73 Compare December 2, 2024 11:30

enryH added 3 commits December 2, 2024 22:47

🎨 pass on parameter for min set, docstrings

142b778

resolve things discussed with Alberto.

🎨 improve visibility of data and inspect annotations

27c6e38

🐛 make it Python 3.9 compatible

8ed860c

enryH added 2 commits December 3, 2024 16:47

🎨 type annotations and docstring updates

ec10c71

- prepare to define a common return type by specifying message TYPE_COLS_MSG

🎨 docstrings fmt

9c96af7

enryH added 17 commits December 4, 2024 14:29

🎨 more docstring updates

a6f21c9

🐛 regex in parameter name vs interpreted in docstring

4d98dd5

sphinx deals differently with parsing the strings...

🎨 docstrings: test if listing works with blank lines, write Example correctly otherwise numpy notation should work fine

9f4d50d

🎨 rename reject_col to rejected_col after typing it wrongly several times - default parameter is rejected, so it is easier to not mistype parameter name

bc5ab2c

🎨 try to set hyperlink

834b54e

🚚✨ move uniprot code to module, creating user-facing function

794c1af

🐛 test line break in docstring

9d7c5f2

works locally. see: https://stackoverflow.com/a/14414965/9684872

🐛 specify fields as parameter

dba2000

- output should be tsv due to how pandas return is constructed for now

🚧 use fetch_annotations from pkg, add ssgsea

ddac13c

🎨 type hints, formatting, add pca example to api example of enrichment analysis

5373372

🎨 run CI only on PR

52ce9c5

🎨 format tests

47bcdbd

🎨 add up and down regulated example, clean-up function and annotate

7acf4b3

🎨 document log fold change, data used and improve docstrings

436128f

📝 extend docs with pca plot from vuecore and ks example

5bdbfb8

🎨 add changes based on discussion with Alberto

ce99038

- enrichment only separately for up- and down-regulated protein groups - keep hint on to-do for PTM dataset

🎨 lower log2fc cutoff to include downregulated protein groups

45c2f5e

enryH mentioned this pull request Dec 18, 2024

add PTM example for enrichtment analysis #18

Open

enryH added 2 commits December 19, 2024 15:09

🐛 split up GO term annotations

b71acc8

GO term annotations had several GO terms concatenated using `;`, which are now split up. Performance can be improved.

🐛 allow also single protein groups results

e685440

enryH marked this pull request as ready for review December 27, 2024 12:48

🎨 do not have accession as a column ( "Entry")

8a2fac8

annotation field contains identifier again
