update enrichment analysis #15
base: main
- line length restriction to 100 characters
maybe the dataset is not the best one to test enrichment analysis (few differentially regulated protein groups)
how many proteins/genes need to be rejected for an enrichment to be considered valid? Before, it was at least 2 genes.
- will be tested by building docs (maybe add a unittest later)
not sure if the example is the best.
@@ -308,7 +388,8 @@ def run_enrichment(
num_background = num_background[0]
else:
num_background = 0
if method == "fisher" and num_foreground > 1:
To discuss: should we mistrust enrichments that are based on single hits?
I am not sure they would be significant, but filtering them would reduce the number of tests, which may be good for performance.
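To make the single-hit question concrete, here is a minimal sketch of a one-sided Fisher exact test written with the standard library only. It is not the code in `run_enrichment`; the function names `fisher_upper_tail` and `worth_testing`, the `min_hits` parameter, and the assumption that foreground and background are disjoint sets are all hypothetical choices for illustration.

```python
from math import comb


def fisher_upper_tail(fg_hits, fg_size, bg_hits, bg_size):
    """One-sided Fisher exact p-value for over-representation.

    Assumes (hypothetically) that foreground and background are disjoint,
    so the population is their union. Returns P(X >= fg_hits) for a
    hypergeometric X.
    """
    total = fg_size + bg_size        # population size
    annotated = fg_hits + bg_hits    # items carrying the annotation
    denom = comb(total, fg_size)
    upper = min(fg_size, annotated)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing
    numer = sum(
        comb(annotated, i) * comb(total - annotated, fg_size - i)
        for i in range(fg_hits, upper + 1)
    )
    return numer / denom


def worth_testing(fg_hits, min_hits=2):
    """Mirror of the `num_foreground > 1` guard: skip single-hit terms."""
    return fg_hits >= min_hits


# A term with a single foreground hit is skipped before any test is run.
for hits in (1, 3):
    if worth_testing(hits):
        print(hits, fisher_upper_tail(hits, 4, 1, 4))
    else:
        print(hits, "skipped (single hit)")
```

Skipping single-hit terms before testing also shrinks the family of hypotheses that the multiple testing correction has to account for.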
- maybe it was there for compatibility, but then we could add kwargs to the function to catch any other arguments (without using them)
- NA can lead to float columns instead of 0. to investigate
_rejected, _pvals_corrected, _, _ = multitest.multipletests(p[mask], alpha, method)
pval_corrected[mask] = _pvals_corrected
rejected = np.full(p.shape, np.nan)
rejected[mask] = _rejected
ToDo: add a test that ensures that 1.0 and 0.0 are interpreted as bool
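A possible starting point for that test, as a sketch only: the masked correction from the hunk is reproduced with NumPy, and a hand-written Benjamini-Hochberg step stands in for `multitest.multipletests` so the example has no further dependencies. The names `benjamini_hochberg`, `correct_with_nan`, and `as_bool` are hypothetical and not part of the module.

```python
import numpy as np


def benjamini_hochberg(pvals, alpha=0.05):
    """Hand-written BH adjustment (stand-in for multitest.multipletests)."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    ranked = pvals[order] * m / np.arange(1, m + 1)
    # enforce monotonicity, starting from the largest p-value
    adjusted_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted_sorted = np.clip(adjusted_sorted, 0.0, 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted <= alpha, adjusted


def correct_with_nan(p, alpha=0.05):
    """Correct only the non-NaN p-values; untested entries stay NaN."""
    p = np.asarray(p, dtype=float)
    mask = np.isfinite(p)
    pval_corrected = np.full(p.shape, np.nan)
    rejected = np.full(p.shape, np.nan)
    if mask.any():
        _rejected, _pvals_corrected = benjamini_hochberg(p[mask], alpha)
        pval_corrected[mask] = _pvals_corrected
        rejected[mask] = _rejected  # booleans are stored as 1.0 / 0.0
    return rejected, pval_corrected


def as_bool(rejected):
    """Read 1.0 as True and 0.0 as False; NaN (untested) becomes False.

    A plain astype(bool) would turn NaN into True, hence the explicit check.
    """
    return np.where(np.isnan(rejected), False, rejected == 1.0)


rejected, pval_corrected = correct_with_nan([0.01, np.nan, 0.04, 0.5])
print(rejected, pval_corrected, as_bool(rejected))
```

Because `rejected` is created with `np.full(p.shape, np.nan)`, it is a float array, which is why the boolean results show up as `1.0` and `0.0`.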
resolve things discussed with Alberto.
double check docstrings of module (for rendering)
- prepare to define a common return type by specifying message TYPE_COLS_MSG
sphinx deals differently with parsing the strings...
…orrectly otherwise numpy notation should work fine
…imes - default parameter is rejected, so it is easier to not mistype parameter name
works locally. see: https://stackoverflow.com/a/14414965/9684872
- output should be tsv due to how pandas return is constructed for now
- enrichment is run separately only for up- and down-regulated protein groups - keep a hint on the to-do for the PTM dataset
GO term annotations had several GO terms concatenated using `;`, which are now split up. Performance can be improved.
annotation field contains identifier again
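For reference, the splitting of `;`-concatenated GO terms mentioned above could be done in a vectorized way with pandas. This is a sketch under assumptions, not the code in this PR: the column names `protein_group` and `go_terms`, the helper `split_annotations`, and the GO identifiers are placeholders.

```python
import pandas as pd

# Placeholder data: one row per protein group, GO terms joined by ";"
annotations = pd.DataFrame(
    {
        "protein_group": ["P1", "P2", "P3"],
        "go_terms": ["GO:0000001;GO:0000002", "GO:0000002", None],
    }
)


def split_annotations(df, column="go_terms", sep=";"):
    """Return one row per (protein group, GO term) pair."""
    out = df.dropna(subset=[column]).copy()   # drop groups without annotation
    out[column] = out[column].str.split(sep)  # string -> list of terms
    out = out.explode(column)                 # one list element per row
    out[column] = out[column].str.strip()     # tolerate "a; b" style input
    return out.reset_index(drop=True)


long_format = split_annotations(annotations)
print(long_format)
```

Using `str.split` followed by `explode` avoids a Python-level loop over rows, which may help with the performance concern noted in the commit message.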
No description provided.