Commit

Update the docs.
yjcyxky committed Oct 29, 2023
1 parent 74a7854 commit b32bee6
Showing 19 changed files with 221 additions and 12 deletions.
Binary file added docs/assets/kge/kge-1.png
Binary file added docs/assets/kge/kge-10.png
Binary file added docs/assets/kge/kge-2.png
Binary file added docs/assets/kge/kge-3.png
Binary file added docs/assets/kge/kge-4a.png
Binary file added docs/assets/kge/kge-4b.png
Binary file added docs/assets/kge/kge-4c.png
Binary file added docs/assets/kge/kge-5.png
Binary file added docs/assets/kge/kge-6a.png
Binary file added docs/assets/kge/kge-6b.png
Binary file added docs/assets/kge/kge-6c.png
Binary file added docs/assets/kge/kge-6d.png
Binary file added docs/assets/kge/kge-6e.png
Binary file added docs/assets/kge/kge-7.png
Binary file added docs/assets/kge/kge-8.png
Binary file added docs/assets/kge/kge-9.png
33 changes: 21 additions & 12 deletions docs/index.md
@@ -1,4 +1,4 @@
## Knowledge Curation Tutorial
# Knowledge Curation Tutorial

Taking Long COVID & ME/CFS as an example

@@ -7,13 +7,13 @@ Taking Long COVID & ME/CFS as an example

If you have any questions, please contact the administrator. We are working to improve the label studio and the curation process, and we will keep this tutorial up to date. Thank you for your support and understanding. We also look forward to your feedback.


### Log in
## Log in

Log in to the Prophet Label Studio at https://prophet-studio.3steps.cn/user/login/. Ask the administrator for an account, password, and organization to log in with, or follow the steps on our homepage.

![img](./assets/figure01.png)

## Components

### Main Interface

@@ -61,6 +61,8 @@ Each tag represents an author, and filters can be used underneath the author tag
![img](./assets/figure11.png)


## Labeling

### Labeling Interface

After selecting the specific article, you may enter the main working interface. The details are as follows.
@@ -76,59 +78,66 @@ Each author's annotations on this article will be displayed below the respective
![img](./assets/figure15.png)


#### Colorful Entity Label
### Colorful Entity Label

Each entity category is marked with its own color, so take care not to confuse one category with another. Details can be found in another document. If you want to add new entity categories, please contact the administrator.

![img](./assets/figure16.png)


#### Annotation
### Annotation

This area shows content that has been marked up by the author and can be categorized by time or entity category.

![img](./assets/figure17.png)

#### Judgement
### Judgement

You can determine whether this article is relevant to your research topic by selecting Yes, No or Unknown.

![img](./assets/figure18.png)

#### Scoring
### Scoring

The five stars form a rating system: score the literature and knowledge against criteria set by you individually or by your team, with one star being the lowest and five stars the highest.

![img](./assets/figure19.png)

#### Relations
### Relations

Each labeled entity should have at least one relation to another labeled entity. If you want to add new relation types, please contact the administrator. Currently, the relation types are not standardized, so you can add any relation type you need. We will collect feedback and standardize the relation types in the future.

![img](./assets/figure20.png)

#### Submit & Update
### Submit & Update

To save the annotation, click the Submit or Update button. Save your progress often, and submit or update once you have finished curating a piece of literature, so that you do not lose your work.

![img](./assets/figure21.png)

#### Knowledge Graph Editor - Mapping Findings with Knowledge Graph
### Knowledge Graph Editor - Mapping Findings with Knowledge Graph

The mapping tool maps labeled content to the knowledge graph: you select the labeled content and the corresponding knowledge graph node. When you open the component, it syncs all the relations you labeled into the table. Click the "Edit" button to complete the mapping for each relation. You can search for the standardized source ID and target ID by source name and target name. After choosing the source ID, target ID, and key sentence, click the "Update" button to save the mapping. To delete a mapping, click the "Delete" button.

NOTE: Currently, the knowledge graph does not support Protein entities, so change the source type and target type to "Gene" first and then search for the source ID and target ID.

![img](./assets/figure22.png)

#### Full Text Needed
More details about the mapping tool can be found in the [Knowledge Graph Editor Tutorial](./kge.md).

### Full Text Needed

Select Yes or No to indicate whether the full text is needed to explore the knowledge further. NOTE: Full-text labeling is not currently supported, but we still need to know whether the full text is needed.

![img](./assets/figure23.png)

#### Note
### Note

The "Content" box at the bottom allows you to write down your notes or ideas.

![img](./assets/figure24.png)


## Notice

More details about the notice can be found in the [Notice](./notice.md).
69 changes: 69 additions & 0 deletions docs/kge.md
@@ -0,0 +1,69 @@
## Step 1

Once you have labeled the entities and made the connections between them, you are ready to start editing the knowledge graph (Figure 1). Please remember to save and update before you match the knowledge.

![Figure 1](./assets/kge/kge-1.png)

## Step 2

Press the circle-shaped chart icon in the lower right corner (Figure 2). You will be taken to the Knowledge Graph Editor screen, shown in Figure 3.

![Figure 2](./assets/kge/kge-2.png)

![Figure 3](./assets/kge/kge-3.png)

## Step 3

Now you are ready to start matching entities and their relationships in the editor. As shown in the figure, Source Type, Source ID, Target ID, Relation Type, and Key Sentences are available for selection (Figure 4a, 4b, 4c). To see more information about an option, you can hover the mouse cursor over the option to view it (Figure 4c).

![Figure 4a](./assets/kge/kge-4a.png)

![Figure 4b](./assets/kge/kge-4b.png)

![Figure 4c](./assets/kge/kge-4c.png)

## Step 4

We generally change the Source ID, Target ID, Relation Type, and Key Sentences and leave the Source Type unchanged, except for the Protein category: in the Knowledge Graph Editor, Protein entities must be aligned to Gene entities (Figure 5). Details can be found in the Knowledge Mining Guidelines for Collaborative Development.

![Figure 5](./assets/kge/kge-5.png)

- A. Source ID: The Source ID must correspond to the Source Name. The table of databases to use for each entity class can be found in the Collaboration document (Figure 6a).

![Figure 6a](./assets/kge/kge-6a.png)

- B. Target ID: The Target ID must correspond to the Target Name. The table of databases to use for each entity class can be found in the Collaboration document (Figure 6b).

![Figure 6b](./assets/kge/kge-6b.png)

- C. Relation Type: The Relation Type should reflect the association described in the literature. Details can be found in the Collaboration document (Figure 6c).

![Figure 6c](./assets/kge/kge-6c.png)

- D. Key Sentence: Key sentences are generated from what you labeled in the Prophet Label Studio, and there may be several to choose from (Figure 6d).

![Figure 6d](./assets/kge/kge-6d.png)

- E. Unknown: If there is an entity that is not in the editor or you are not sure about it, you can edit it as Unknown and wait for further refinement (Figure 6e).

![Figure 6e](./assets/kge/kge-6e.png)

## Step 5

After editing, if you want to save/update, you can press the Update button. If you made a mistake in editing or don't need the content of the entry, you can press the Delete button. And if you want to re-edit, you can press the Edit button (Figure 7).

![Figure 7](./assets/kge/kge-7.png)

## Step 6

When you start marking and end marking, you may need to use the two buttons in the upper right corner (Figure 8). The details are as follows.

![Figure 8](./assets/kge/kge-8.png)

- F. Clean Cache: To reduce duplicate searching, the search function displays the search history. If the history grows long enough to get in the way of searching, press Clean Cache to clear it (Figure 9).

![Figure 9](./assets/kge/kge-9.png)

- G. Update Table: If you need to match the knowledge of another article, click the Update Table button to switch the editor to that article (Figure 10).

![Figure 10](./assets/kge/kge-10.png)
131 changes: 131 additions & 0 deletions docs/notice.md
@@ -0,0 +1,131 @@
## Knowledge curation management

Please save your progress frequently, and save or update your files after you finish the knowledge mining for a piece of literature, so that you do not lose your work.

## Share your results

- Each user configures their own Prophet Label Studio account through the invitation link and sets their own account name and password;

- Confirm your project on the Prophet Label Studio login page;

- The final results will be presented as an individual knowledge map and a team knowledge map.

Visit [OpenProphetDB](https://prophetdb.org) for the currently ongoing projects, and contact the relevant project group members to be invited to the corresponding Organization.

## Communication

A meeting is held every two weeks to track the progress of literature knowledge mining and discuss related issues. You can join or watch it through the links below.

Meeting link: Ask the administrator for the link.

Video link of the previous conference: Ask the administrator for the link.

## Improve your labeling situation

On the annotation page, use the tabs at the top to view your team members' annotations or to add your own.

Meanwhile, please improve your annotation status as follows:

- The Yes, No, and Unknown options below the article title indicate whether the article is related to your research objectives: choose Yes if it is relevant, No if it is not, and Unknown if you are unsure or need the full text to judge;

- The five stars on the right side of the title of the article are the scoring system, which scores the literature and knowledge according to the standards set by you or your team. One star is the lowest and five stars are the highest;

- The Yes and No options at the bottom of the article indicate whether the full text is needed to explore the knowledge further; enter your notes or ideas in the "Content" box below.

## Preliminary consideration

- Set the focus of knowledge mining according to the intended final results of each project. For example:

1. The ME / CFS group focuses on molecular mechanisms (such as genes, proteins, and other promising drug targets);

2. Each mechanism should be described together with its subcellular localization and the cell type / tissue / organ involved.

- Each entity and its relations should appear in at least one article with a corresponding PMID;

- The required articles can be retrieved by keyword search in PubMed.

## Substance

### Color schemes

- gene: red, 2
- protein: orange, 3
- disease: brown, 4
- symptom: deep purple, 5
- metabolite: green, 6
- pathway: dark pink, 7
- anatomy: gray, 8
- microbe: yellow, 9
- chemical: black, 0
- biological_process: light purple, q
- cellular_component: light pink, w
- molecular_function: brown, e
- Key Sentence: blue, t

At present, there is no standard color scheme, and each label and color scheme can be modified and improved according to the requirements.

The existing labels and colors were chosen to be easy to distinguish and hard to confuse. The character (number or letter) after each color is a shortcut key: press it on the keyboard to select that label quickly while marking.
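
The label list above can also be written down as a lookup table, which makes it easy to check that no two shortcut keys point at the same label. The following is a minimal Python sketch; the names `SHORTCUT_KEYS`, `label_for_key`, and `duplicated_labels`, and the lower-case label spellings, are illustrative assumptions and not part of Prophet Label Studio.

```python
# Shortcut key -> label, transcribed from the color scheme list above.
# All names here are hypothetical; the tool's own configuration format
# is not described in this tutorial.
SHORTCUT_KEYS = {
    "2": "gene",
    "3": "protein",
    "4": "disease",
    "5": "symptom",
    "6": "metabolite",
    "7": "pathway",
    "8": "anatomy",
    "9": "microbe",
    "0": "chemical",
    "q": "biological_process",
    "w": "cellular_component",
    "e": "molecular_function",
    "t": "key_sentence",
}


def label_for_key(key):
    """Return the label bound to a shortcut key, or None if the key is unbound."""
    return SHORTCUT_KEYS.get(key.lower())


def duplicated_labels(mapping):
    """Return the labels that are bound to more than one key (ideally none)."""
    seen, duplicates = set(), set()
    for label in mapping.values():
        if label in seen:
            duplicates.add(label)
        seen.add(label)
    return sorted(duplicates)
```

If the label set changes, rerunning `duplicated_labels` on the updated table is a quick way to catch an accidental clash.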

### ID identifiers

To implement an automatic annotation of the biochemical entities, consider:

- **Gene and RNA identifier**: match the most appropriate term from ENTREZ, preferentially using ENTREZ:ID.

- **Protein identifier**: convert proteins into genes through NCBI and other websites, match the most appropriate terms from ENTREZ, and preferentially use ENTREZ:ID for annotation.

- **Disease and disease state identifier**: match the most appropriate term from MONDO, preferentially with the corresponding MONDO:ID.

- **Symptom Identifier**: match the most appropriate term from SYMP, preferentially with SYMP:ID.

- **Metabolite identifier**: match the most appropriate term from the HMDB with the corresponding HMDB:ID.

- **Pathway identifier**: match the most appropriate term from KEGG or React, preferentially use the corresponding KEGG:ID, followed by React:ID.

- **Anatomical structure identifier**: match the most appropriate term from MESH, preferentially annotate with the corresponding MESH:ID.

- **Drug and chemical identifier**: match the most appropriate term from DrugBank, with the corresponding DrugBank:ID.

- **Cellular component / biological process / molecular function identifier**: match the most appropriate term from the Gene Ontology (GO), preferentially annotated using GO:ID.
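
The preferences above can be summarized as a table from entity label to identifier namespaces in priority order. The sketch below, in Python, is one hedged way to check whether a `PREFIX:ID` string uses a preferred namespace for its label. The names `PREFERRED_NAMESPACES`, `split_curie`, and `namespace_rank` are hypothetical, the `anatomy` entry assumes that the MESH rule applies to anatomical structures, and `microbe` is omitted because no database is listed for it.

```python
# Entity label -> identifier namespaces, most preferred first.
# Transcribed from the list above; names and layout are illustrative only.
PREFERRED_NAMESPACES = {
    "gene": ["ENTREZ"],
    "protein": ["ENTREZ"],  # proteins are converted to genes first
    "disease": ["MONDO"],
    "symptom": ["SYMP"],
    "metabolite": ["HMDB"],
    "pathway": ["KEGG", "React"],  # KEGG first, then React
    "anatomy": ["MESH"],  # assumption: the MESH rule covers anatomy
    "chemical": ["DrugBank"],
    "biological_process": ["GO"],
    "cellular_component": ["GO"],
    "molecular_function": ["GO"],
}


def split_curie(curie):
    """Split 'PREFIX:ID' into (prefix, local_id); raise ValueError if malformed."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"expected PREFIX:ID, got {curie!r}")
    return prefix, local_id


def namespace_rank(label, curie):
    """Return 0 for the most preferred namespace, 1 for the next, and so on.

    Returns None when the namespace is not listed for the label.
    """
    prefix, _ = split_curie(curie)
    namespaces = PREFERRED_NAMESPACES.get(label, [])
    return namespaces.index(prefix) if prefix in namespaces else None
```

For example, `namespace_rank("disease", "MONDO:0100233")` evaluates to `0`, while the same identifier checked against the `gene` label gives `None`.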

## Relation

- Choose the relation only after the entities have been standardized;

- Prefer the relation that matches the content of the original article, and try to use a description that is consistent with the original article or that expresses the connection between the selected entities;

- **If the text does not state a specific relation, or the relation you need is not available in the Label Studio, choose the general relation `associated_with`.**
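
The fallback rule above can be expressed as a small function. This is a sketch in Python under stated assumptions: `choose_relation` is a hypothetical helper, and the set of available relations is whatever your project has configured (the relation name `treats` in the usage note is only a placeholder). Only `associated_with` comes from this guide.

```python
FALLBACK_RELATION = "associated_with"  # the general relation named in this guide


def choose_relation(stated_relation, available_relations):
    """Pick the relation to record for a pair of entities.

    stated_relation: the relation the article states, or None/"" if it
        states no specific relation.
    available_relations: the relation types offered in the labeling tool.
    """
    if stated_relation and stated_relation in available_relations:
        return stated_relation
    # No specific relation in the text, or it is not offered in the tool.
    return FALLBACK_RELATION
```

With `available = {"treats", "associated_with"}`, calling `choose_relation("treats", available)` keeps `treats`, whereas `choose_relation(None, available)` and `choose_relation("inhibits", available)` both fall back to `associated_with`.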

## Common questions

Q1: Proteins cannot be aligned in the Knowledge Graph Editor, such as interleukin-8 / IL-8.

A1: IL-8 is interleukin-8, a cytokine produced by a variety of cells, and is a protein entity. Because proteins are aligned to gene entities in the Knowledge Graph Editor, look the protein up in NCBI: the gene symbol for IL-8 is CXCL8. Searching for CXCL8 in the editor brings up the corresponding entity so that you can complete the editing.

Q2: Different names for the same entity, such as Post-Acute Sequelae of SARS-CoV-2 infection (PASC) or Long-haul COVID or Post-COVID-19 Syndrome or Post-Acute COVID-19 Syndrome

A2: Align all of these names to Long COVID-19 | MONDO:0100233.

Q3: Can I find entities by their IDs when aligning in the Knowledge Graph Editor?

A3: Of course you can! If you enter MONDO:0100233, the entity Long COVID-19 will be found.

Q4: How should I mark a ratio or multiple side-by-side entities in the markup content?

A4: Mark each entity and indicate the ratio by key phrases in the alignment editor.

Q5: Under which label should cells such as B cells, T cells, human lung cells, and cardiomyocytes be marked?

A5: Immune cells can be labeled as B/T cell activation under Biological Processes, and lung cells and cardiomyocytes can be labeled as Lung and Heart under Anatomical Structures.
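
The synonym handling in Q2 and the ID lookup in Q3 can be sketched as a small alias table. The Python below is an illustration only: `CANONICAL_LONG_COVID`, `ALIASES`, and `normalize_entity` are hypothetical names, and the table contains just the names listed in Q2 and the ID from A2.

```python
# Canonical (name, ID) pair taken from A2 above.
CANONICAL_LONG_COVID = ("Long COVID-19", "MONDO:0100233")

# Lower-cased alternative names from Q2, plus the ID itself (see Q3).
ALIASES = {
    "post-acute sequelae of sars-cov-2 infection": CANONICAL_LONG_COVID,
    "pasc": CANONICAL_LONG_COVID,
    "long-haul covid": CANONICAL_LONG_COVID,
    "post-covid-19 syndrome": CANONICAL_LONG_COVID,
    "post-acute covid-19 syndrome": CANONICAL_LONG_COVID,
    "mondo:0100233": CANONICAL_LONG_COVID,
}


def normalize_entity(name):
    """Return the canonical (name, ID) pair for a known alias, else None.

    Matching ignores case and collapses repeated whitespace.
    """
    key = " ".join(name.lower().split())
    return ALIASES.get(key)
```

A lookup that returns `None` simply means the name is not in this small table, not that the entity is absent from the knowledge graph.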

## Attention

- It is best to use Google Chrome to open the Prophet Label Studio website.

- Avoid including stray spaces when marking. For example, when highlighting Long COVID, exclude the spaces before and after it so that only the entity Long COVID is marked.

- An entity does not need to be marked every time it appears (mark the first occurrence of the ENTITY), but it must be marked explicitly again within the KEY SENTENCES.

- Every article needs at least one key sentence (preferably one containing two entities).

- Mark the full name of the entity plus the abbreviation, including the content in parentheses. Common entities may not need to be marked with the full name, e.g., myalgic encephalomyelitis / chronic fatigue syndrome (ME/CFS), while rare abbreviations must be marked with the full name to facilitate subsequent review.
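
The rule about stray spaces can be checked mechanically if annotations are available as character offsets. The following Python sketch assumes a span is a `(start, end)` pair of offsets into the article text, with `end` exclusive; `trim_span` is a hypothetical helper, not a Prophet Label Studio function.

```python
def trim_span(text, start, end):
    """Shrink a (start, end) span so it has no leading or trailing whitespace.

    `end` is exclusive. A span that is entirely whitespace collapses to an
    empty span (start == end).
    """
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end


# Usage: a highlight that accidentally includes the surrounding spaces.
sentence = "patients with Long COVID reported fatigue"
start = sentence.index(" Long COVID ")
end = start + len(" Long COVID ")
clean_start, clean_end = trim_span(sentence, start, end)
```

After trimming, `sentence[clean_start:clean_end]` contains only the entity text without the surrounding spaces.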
