
NLP-based legal document summarizer. Takes large documents full of complex legal jargon and produces a summary that a layperson can understand.


SwethaMagesh/sankshepika-mlpro


Sankshepika

Problem statement

Legal texts often contain complex language, specialized terms, and long sections, which can overwhelm readers without legal training. A summarizer condenses these documents into clear, straightforward summaries, extracting the key points so that non-experts can understand the vital details without working through the complexities. Legal summarizers also save time and reduce the need for costly legal consultations. This promotes access to legal information and supports a fair and just legal system.


Challenges addressed

  • The vocabulary of the pretrained model is predominantly US-based. For example, “US attorney office” appears in summaries of the Indian dataset.
  • LLMs that accept prompts have token limits, so they cannot take in legal documents of this size in a single prompt.
  • Extractive and other prior methods produce summaries that are harder for lay readers to understand (though experts prefer them).
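
The token-limit challenge above is commonly handled by splitting a long document into overlapping chunks before summarizing. The following is a minimal sketch of that idea, not this project's code: `chunk_document`, `max_tokens`, and `overlap` are hypothetical names, and whitespace-separated words stand in for real model tokens.

```python
def chunk_document(text, max_tokens=512, overlap=50):
    """Split text into overlapping chunks of at most max_tokens words.

    Assumption: whitespace-separated words approximate model tokens.
    A real tokenizer usually yields more tokens than words.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    tokens = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        # Stop once the current window has reached the end of the document.
        if start + max_tokens >= len(tokens):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of summarizing some text twice.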

Proposed system

[Three images illustrating the proposed system]


Results

  • Obtained an average ROUGE score of 0.45.
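
The README does not say which ROUGE variant or which implementation produced this score. As an illustration only, the sketch below computes ROUGE-1 (unigram overlap) precision, recall, and F1 for one summary against one reference. The function name `rouge1` and the lowercase, whitespace-based tokenization are assumptions made for this example.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Return ROUGE-1 precision, recall, and F1 for a candidate summary.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```

An average score like the one reported would then be the mean of the chosen metric over all document and reference pairs in the evaluation set.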

Conclusion

  • Pretrained models produce incorrect or hallucinated information, especially content drawn from US law.
  • Fine-tuning a pretrained transformer takes a lot of time and resources.
  • The abstractive and extractive approaches each have their own pros and cons.
  • Combining the summaries from all methods gives an LLM with a token limit a short enough input to produce a reasonable output that draws on every method.
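
One way to picture the last point is a prompt builder that gives each method's summary an equal share of a token budget. This is a hedged sketch under stated assumptions, not the project's pipeline: `build_combined_prompt`, the instruction text, and the equal-share truncation are all hypothetical, and words again approximate tokens.

```python
def build_combined_prompt(summaries, max_tokens=1024,
                          instruction="Combine these summaries into one plain-language summary:"):
    """Build one prompt from several summaries within a word budget.

    summaries maps a method name (e.g. "extractive") to its summary text.
    Assumption: whitespace-separated words approximate model tokens.
    """
    if not summaries:
        raise ValueError("at least one summary is required")

    budget = max_tokens - len(instruction.split())
    if budget <= 0:
        raise ValueError("max_tokens is too small for the instruction")

    per_summary = budget // len(summaries)  # equal share per method
    parts = [instruction]
    for name, summary in summaries.items():
        header_tokens = f"[{name}]".split()
        # Truncate each summary so header plus body fits its share.
        room = max(per_summary - len(header_tokens), 0)
        body_tokens = summary.split()[:room]
        parts.append(" ".join(header_tokens + body_tokens))
    return "\n".join(parts)
```

Simple truncation is the crudest option; a real system might instead shorten each summary again or weight the shares by method.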
