Since the winter of 2021, I have been doing some simple NLP research with the help of a teacher from Nanjing University, to whom I am very grateful. Here is some of my work from the past three months. If you are interested in it, please leave a message and we can discuss it further.
The project mainly focuses on calculating text similarity with different models. The data we use is the submission information of academic papers, which is an open dataset for NLP.
-
KnowledgeGruph_P1 : Data Analysis. Based on the given data on papers in different categories, we compute some statistics about them, such as the top categories in different years.
-
KnowledgeGruph_P2 : Further Data Analysis. Now we focus on the categories with the five largest numbers of papers from 2014 to 2020, finding their daily
-
KnowledgeGruph_P3 : Transform the statistics
-
KnowledgeGruph_P4 : Calculate Text Similarities using the method from a paper
-
KnowledgeGruph_P5 : Data Abstraction
-
KnowledgeGruph_P6 : Calculate Text-Similarity by TF-IDF
-
KnowledgeGruph_P7 : EDA of the results of P6
-
KnowledgeGruph_P8 : Transform the processed data
-
KnowledgeGruph_P9 + P10 : Calculate Text-Similarities by LDA Model
-
KnowledgeGruph_P11 : Calculate Text-Similarities by Doc2Vec Model
-
KnowledgeGruph_P12 : Triple Extraction by Spacy_1
-
KnowledgeGruph_P13 : Triple Extraction by Spacy_2
-
KnowledgeGruph_P14 : Triple Extraction by Open Information Extraction (OIE)
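
To give a rough idea of what the TF-IDF similarity step (P6) involves, here is a minimal sketch in plain Python. It is not the code from the notebooks: the function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, the smoothed IDF formula, and the three sample abstracts are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up sample texts, not taken from the dataset.
abstracts = [
    "graph neural networks for citation analysis",
    "neural networks for image classification",
    "bayesian inference in population genetics",
]
vecs = tfidf_vectors(abstracts)
sim_01 = cosine(vecs[0], vecs[1])  # the two texts share several terms
sim_02 = cosine(vecs[0], vecs[2])  # the two texts share no terms
```

In this sketch, texts that share terms get a positive score and texts with no terms in common get a score of zero. A real run on paper abstracts would likely need proper tokenization and stop-word removal first.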