Variant to disease/phenotype predicates #1545

Open
EvanDietzMorris opened this issue Jan 10, 2025 · 1 comment
Comments

@EvanDietzMorris
Collaborator

The model is generally lacking good ways to represent sequence variant to disease relationships. Gene associated with condition exists, but as far as I can tell there is no way to provide more granularity or specifics. Variant to disease association exists, but I don't think there are predicates for those associations yet.

This is a known issue, but it has become timely, as we (ROBOKOP) are currently working with colleagues at ClinGen to bring these kinds of edges into ROBOKOP (and for Translator), so I wanted to get the ball rolling again.

I personally think it would be nice to have two ways to represent these edges:

  • gene to disease/phenotype edges with qualifiers providing more granularity
  • variant to disease/phenotype edges

In the short term though, we are specifically interested in predicates like "is pathogenic for" for variant to disease edges.

An example would be:
CAID:CA115937 is pathogenic for MONDO:0016419
https://erepo.clinicalgenome.org/evrepo/ui/classification/4b6c7f5f-b13d-435d-bba1-0d501ef69489
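
To make the shape of that edge concrete, here is a minimal sketch in Python. It is illustrative only: the `Edge` dataclass is a hypothetical stand-in rather than any Translator or ROBOKOP API, and `biolink:is_pathogenic_for` is the predicate being requested in this issue, not one assumed to exist in the model today.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Edge:
    """Hypothetical minimal edge record, for illustration only."""
    subject: str
    predicate: str
    object: str
    category: str


# The example from this issue. The predicate name is the one being
# proposed here; it is an assumption, not an existing model predicate.
edge = Edge(
    subject="CAID:CA115937",
    predicate="biolink:is_pathogenic_for",
    object="MONDO:0016419",
    category="biolink:VariantToDiseaseAssociation",
)

print(asdict(edge))
```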

Predicates like "is likely pathogenic for" or "may be pathogenic for" would also be helpful, representing the various Clinical Validity Classifications in ClinGen. I'm not sure if we'd also want ones for Benign associations, or if negation would be better for those.

@bpow @Vibhorgupta31

@bpow

bpow commented Jan 14, 2025

For additional context, the VariantToDiseaseAssociation class has, in its documentation page, an example predicate of "is pathogenic for", so at least someone else at some point thought that it was a good idea.

For the specific nomenclature of variant-to-disease/phenotype, I think that "pathogenic" is good terminology to use. It's the long-standing terminology that multiple professional groups (American College of Medical Genetics and Genomics, Association for Molecular Pathology, ClinVar, ClinGen, etc.) use. We could potentially also have an expressly negated predicate is_benign_for to indicate that, not only is there not sufficient evidence for pathogenicity, but there is expressly evidence to refute pathogenicity. Alternatively, we could potentially address this with appropriate qualifiers. Current ACMG/AMP recommendations have a 5-valued set of categories (Benign, Likely Benign, Variant of Uncertain Significance, Likely Pathogenic, Pathogenic), but if that's too many additional predicates then we could probably address that sort of thing in qualifiers (but I'd be interested in suggested ways to map this specific terminology, which has a rather formal, domain-specific definition, to more-general qualifiers).
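
As a rough sketch of the two options (dedicated predicates versus a qualifier), something like the following Python could be a starting point for discussion. Everything here is an assumption: the predicate names are only the ones floated in this thread, `pathogenicity_qualifier` is a made-up slot name, and `biolink:related_to` is just a placeholder for whatever general predicate would actually be chosen.

```python
from enum import Enum


class AcmgClass(Enum):
    """The five ACMG/AMP categories listed above."""
    BENIGN = "Benign"
    LIKELY_BENIGN = "Likely Benign"
    VUS = "Variant of Uncertain Significance"
    LIKELY_PATHOGENIC = "Likely Pathogenic"
    PATHOGENIC = "Pathogenic"


# Option A: one dedicated predicate per category.
# Only names floated in this thread are listed; all are hypothetical.
DEDICATED_PREDICATES = {
    AcmgClass.PATHOGENIC: "biolink:is_pathogenic_for",
    AcmgClass.LIKELY_PATHOGENIC: "biolink:is_likely_pathogenic_for",
    AcmgClass.BENIGN: "biolink:is_benign_for",
}


def edge_with_predicate(variant, disease, classification):
    """Build an edge whose predicate itself carries the classification."""
    if classification not in DEDICATED_PREDICATES:
        raise ValueError(f"no dedicated predicate for {classification.value}")
    return {
        "subject": variant,
        "predicate": DEDICATED_PREDICATES[classification],
        "object": disease,
    }


def edge_with_qualifier(variant, disease, classification):
    """Option B: a general predicate plus a qualifier for the classification."""
    return {
        "subject": variant,
        "predicate": "biolink:related_to",  # placeholder general predicate
        "object": disease,
        "pathogenicity_qualifier": classification.value,  # hypothetical slot
    }
```

Option A keeps the domain terminology visible in the predicate but needs a new predicate for each category. Option B covers all five categories with one edge shape, at the cost of pushing the formal ACMG/AMP meaning into a qualifier value.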

For gene-to-disease / gene-to-phenotype edges, I'd advocate for being careful about how we represent the predicates. "Gene is associated with disease" is often a true statement, but I often hear people colloquially say that a "gene causes a disease" or a "gene causes a phenotype", where in general it is rather a variant allele of a gene which, through inactivation, reduced/increased activity, or novel effect, can be said to be the cause of a disease or phenotype.
