Law-OMNI-BERT-Project

This is the repository for our "Law and Artificial Intelligence" project at Northwestern University. The team members for the project are Noah Caldwell-Gatsos (@ncaldwell17), Rhett D'souza (@rhettdsouza13), and Lukas Justen (@Lukas-Justen).

Problem

Directly applying transfer-learning advances from BERT yields poor accuracy in domain-specific areas like law because of the word distribution shift from general-domain corpora to domain-specific corpora. In this project, we demonstrate how the pre-trained language model BERT can be adapted to additional domains, such as contract law or court judgments.

Goal

We did not create and train a model ourselves; that requires resources beyond the scope of this project. Instead, we propose a framework for creating a domain-specific BERT, using legal contracts as a case study. The framework covers why domain adaptation is necessary, what kind of data is needed, how the model is trained, and how its performance can be evaluated.
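To make the "how the model is trained" step concrete, here is a minimal sketch of domain-adaptive pre-training with the HuggingFace transformers library: continuing BERT's masked language modeling objective on in-domain text, the same recipe BioBERT used for the biomedical domain. This is not the project's actual training code; the file path contracts.txt and all hyperparameters are illustrative assumptions.

```python
# Sketch: continue BERT's masked language modeling (MLM) pre-training on a
# legal corpus. File path and hyperparameters are illustrative assumptions.
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical corpus file: one contract sentence or clause per line.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="contracts.txt",
    block_size=128,
)

# Randomly mask 15% of tokens, the standard BERT pre-training setup.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="law-bert",
        num_train_epochs=1,
        per_device_train_batch_size=8,
    ),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("law-bert")
```

Starting from the general-domain checkpoint rather than training from scratch is what keeps this feasible: the model only has to shift its word distributions toward legal usage, not relearn English.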

Finally, we built a small frontend that lets you visualize the complexity of a corpus. We hope this helps others gain insight into their datasets and decide whether it makes sense to apply BERT to their domain.
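As a rough illustration of what "corpus complexity" can mean here (a sketch, not the frontend's actual metric), one simple signal is how aggressively BERT's WordPiece tokenizer fragments a corpus's words: domain terms missing from the general-domain vocabulary get split into many subword pieces.

```python
# Sketch of one possible corpus-complexity signal (an assumption, not the
# project's actual metric): the average number of WordPiece subwords per
# word. Heavy fragmentation suggests many domain terms fall outside
# BERT's general-domain vocabulary.
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def fragmentation_ratio(texts):
    """Average WordPiece subwords per whitespace-separated word."""
    words, subwords = 0, 0
    for text in texts:
        for word in text.split():
            words += 1
            subwords += len(tokenizer.tokenize(word))
    return subwords / max(words, 1)

# Hypothetical comparison: legal boilerplate vs. everyday English.
legal = ["The indemnitor shall indemnify the indemnitee against all claims."]
general = ["The driver will protect the passenger from all harm."]
print(fragmentation_ratio(legal))    # typically higher for domain text
print(fragmentation_ratio(general))
```

If the ratio on your corpus sits well above what you measure on general English, domain-adaptive pre-training like the sketch above is more likely to pay off.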

Repository Structure

- Case Study
- Frontend
