Harvard uploads 6.5 million court cases to train legal AI

Harvard Law School undertook the task as part of its Caselaw Access Project, which aimed to digitize every US case since the 1600s

2 Nov

Harvard Law School's Library Innovation Lab has revealed that it has scanned and digitized more than 40 million legal documents covering every reported US state and federal legal case from the 1600s up until summer 2018.

A big factor behind the move was to help train legal AI systems. AI models require copious amounts of good-quality, preferably unbiased data for training. The Caselaw Access Project, which is free and accessible to everyone, gives any developer attempting to build a legal AI access to millions of cases, saving the time and labor costs of building their own databases.




The project puts 360 years of searchable caselaw at the fingertips of any AI developer. Adam Ziegler, managing director at the Library Innovation Lab, told The MIT Technology Review in 2017, "I think there will be a lot more experimentation, and the progress will accelerate."

"It’s really hard to build a smart interface if you can’t get to the basic data," he added.
