MediMineR

This is a project to automatically detect fields for extraction by comparing similar phrases across documents.

The initial output is a list of common terms which should be used as boundaries for extraction

The outputted file needs to be manually manipulated by the user to ensure that replacement is meaningful and also because the find and replace dictionary may need domain specific knowledge which is beyond the scope of the programme.

Once the find and replace dictionary is satisfactory the replacement can then be done and then automatically the values associated with the key will be extracted and cleaned for the user to use.

It also extracts from tables

The end result is a large HashMap with all values from each document which can then be used for any further analysis

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.settings		.settings
src		src
.classpath		.classpath
.gitignore		.gitignore
.project		.project
FindAndReplace.txt		FindAndReplace.txt
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MediMineR

About

Releases

Packages

Languages

sebastiz/MediMineR

Folders and files

Latest commit

History

Repository files navigation

MediMineR

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages