wasimblogs/queryAnswer

queryAnswer

Design choices that affect the query-answering capabilities of a QA system

  • Lemmatizer / Stemmer : Which provides better results, and why?
  • Stopword Filter : As comprehensive as possible without removing words that carry meaning
  • Spelling errors and spell checking
  • Distance metric
  • TF-IDF design : Among the variants of TF-IDF, which is the most suitable for query answering?
  • Vocabulary of corpus
  • Unigram / Bigram : Does bigram vocabulary help?
  • Handover to human correspondent
  • Syntactic parsing of sentences to uncover relations between words
  • Named entity recognition and Noun Phrase extraction
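
Several of the choices above (stopword filter, unigram vs. bigram vocabulary, TF-IDF variant, distance metric, handover to a human) can be tried out in a few lines. The following is a minimal sketch, not the code in this repository: it uses only the Python standard library, a smoothed log IDF, and cosine similarity, and every function name, the `threshold` value, and the sample questions are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text, use_bigrams=False, stopwords=frozenset()):
    """Lowercase, strip punctuation, drop stopwords, optionally add bigrams."""
    words = [w.strip(".,?!;:\"'()") for w in text.lower().split()]
    words = [w for w in words if w and w not in stopwords]
    if use_bigrams:
        # Bigrams are joined with "_" so they live in the same vocabulary.
        words = words + [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words


def build_idf(docs_tokens):
    """Smoothed IDF (one of several TF-IDF variants): log((1+N)/(1+df)) + 1."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """Raw term count times IDF; out-of-vocabulary tokens are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, questions, use_bigrams=False, stopwords=frozenset()):
    """Return (closest stored question, similarity score) for a query."""
    docs = [tokenize(q, use_bigrams, stopwords) for q in questions]
    idf = build_idf(docs)
    vectors = [tfidf_vector(d, idf) for d in docs]
    qv = tfidf_vector(tokenize(query, use_bigrams, stopwords), idf)
    scores = [cosine_similarity(qv, v) for v in vectors]
    best = max(range(len(questions)), key=scores.__getitem__)
    return questions[best], scores[best]


def answer_or_handover(query, questions, threshold=0.3, **kwargs):
    """Return the matched question, or None to signal a human handover."""
    match, score = best_match(query, questions, **kwargs)
    return match if score >= threshold else None


# Hypothetical sample data, not taken from the Quora set.
QUESTIONS = [
    "How do I learn Python quickly?",
    "What is the capital of France?",
    "How can I lose weight?",
]
STOPWORDS = frozenset({"what", "is", "the", "to", "how", "do", "i", "can", "of"})
```

Calling `best_match("What is the best way to learn Python?", QUESTIONS)` without a stopword set matches the France question, because "what", "is", and "the" outweigh "learn" and "python"; passing `stopwords=STOPWORDS` corrects this, which is one way to see why the stopword filter is listed as a design choice.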

Dataset: the Quora question pairs set can be downloaded from http://qim.ec.quoracdn.net/quora_duplicate_questions.tsv
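
As a rough sketch of how the file might be read once downloaded: the snippet below assumes the TSV has a header row with the columns `question1`, `question2`, and `is_duplicate`; check the header of your copy before relying on these names. The function name `read_question_pairs` and the in-memory `SAMPLE` rows are made up for illustration.

```python
import csv
import io


def read_question_pairs(handle):
    """Read (question1, question2, is_duplicate) tuples from a TSV file handle.

    Column names are an assumption about the dataset header.
    Rows with a missing question are skipped.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    pairs = []
    for row in reader:
        q1 = (row.get("question1") or "").strip()
        q2 = (row.get("question2") or "").strip()
        if not q1 or not q2:
            continue
        pairs.append((q1, q2, row.get("is_duplicate") == "1"))
    return pairs


# Tiny in-memory stand-in for the real file (hypothetical rows).
SAMPLE = (
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t3\t4\tWhat is NLTK?\t\t0\n"
)
sample_pairs = read_question_pairs(io.StringIO(SAMPLE))
```

For the downloaded file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` sample.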

Installation:

  • Requires NLTK only
