The justice system in the USA is very different from the one in Poland. Taking legal action is much more common there than in our country, so you’ve probably heard the phrase “I’ll sue you” many times in American movies. In reality, a lot of cases that involve corporate matters look like this: tons of papers, printed emails, documents, etc., are thrown into one room. Then a team of interns starts reading every document to find those that are actually relevant to their particular case.
Usually, out of tens or hundreds of thousands of documents, they find a few hundred or a few thousand that may be useful. Then more experienced lawyers come in and work with the relevant material only. The whole process is extremely time- and energy-consuming. However, it’s also a perfect candidate for automation.
The process described above is called discovery, because its goal is to discover evidence. The computerized version is called e-Discovery and, in order to understand how it works, I talked with a software engineer from Relativity. The whole process is fairly simple from a US lawyer’s standpoint, but not so much for an average Polish IT specialist.
Relativity is a US-based company that develops and offers an e-Discovery SaaS solution. Recently the Krakow branch was awarded a Top Tech Employer certificate, recognizing it as a great place for IT professionals to work. As we found out during the certification process, Relativity’s employees are highly motivated: 95% of the surveyed employees feel that their work matters and are satisfied with their roles on their teams. A lot of that comes from product development itself, as the process provides pretty interesting challenges.
How does it all work?
The first step is to collect all documents that may contain evidence. This means emails, text messages, Word documents, PDFs, Excel spreadsheets, recordings and photos. Depending on the company’s strategy, collection might be ad hoc and specific to the case, or an automatic process that archives data on a daily basis. If data is collected every day, as you might expect, you end up with terabytes of it.
At this point the data is unstructured and it’s hard to make any sense of it. That’s why the next step is crucial. Data needs to be indexed and structured in order to be searchable. In theory this step could be parallelized almost indefinitely, but since there is a lot of duplication within the data, it’s better to skip the processing of duplicated content and parallelize the processing of independent pieces of information.
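To make the idea of skipping duplicates and parallelizing the rest more concrete, here is a minimal sketch. It is not Relativity’s implementation: the function names, the use of a SHA-256 fingerprint to detect identical content, and the trivial "processing" step are all assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def content_key(raw: bytes) -> str:
    """Fingerprint a document by its bytes, so identical copies share one key."""
    return hashlib.sha256(raw).hexdigest()


def extract_terms(raw: bytes) -> list[str]:
    """Hypothetical stand-in for real processing (text extraction, OCR, etc.)."""
    return raw.decode("utf-8", errors="ignore").lower().split()


def process_unique(
    documents: dict[str, bytes], workers: int = 4
) -> tuple[dict[str, str], dict[str, list[str]]]:
    """Process each distinct content once, in parallel.

    Returns a mapping from document name to content key, and a mapping
    from content key to the processing result.
    """
    name_to_key: dict[str, str] = {}
    unique: dict[str, bytes] = {}
    for name, raw in documents.items():
        key = content_key(raw)
        name_to_key[name] = key
        unique.setdefault(key, raw)  # keep only the first copy of each content

    keys = list(unique)
    # Each unique document is independent, so the work can be spread out.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(extract_terms, (unique[k] for k in keys)))
    return name_to_key, dict(zip(keys, results))


documents = {
    "a.txt": b"Quarterly report",
    "b.txt": b"Quarterly report",  # exact duplicate of a.txt
    "c.txt": b"Meeting notes",
}
name_to_key, processed = process_unique(documents)
```

Three documents come in, but only two are processed, and both copies of the report point to the same result. A thread pool keeps the sketch short; for CPU-heavy processing in Python, a process pool or a distributed queue would be the more realistic choice.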
Depending on the scenario, either processed or unprocessed data has to be copied to the Relativity platform. In both cases data has to be moved reliably and in a timely manner. This is the Krakow team’s responsibility. They use different methods for different cases. When there is an excellent connection between the company and Relativity, UDP-based communication can be used, which is similar in spirit to HTTP/3 (it also runs over UDP, via QUIC). When a company sends data over WiFi, good old HTTP will actually be comparable in speed and more reliable. When it comes to the cloud, the Azure Storage Data Movement Library is preferred, as it offers really high throughput rates. Relativity and Microsoft cooperate in this area.
After the data is processed and transferred to the Relativity platform, it’s ready for the evidence discovery phase. This is where lawyers take over and search for documents containing certain phrases. They can also view each document in the built-in document viewer, visualize whole email threads and sort documents according to their relevance to the case. Relativity also offers another component, which uses machine learning (ML) to find irregularities in the documents, such as unusual clauses in agreements or meetings that don’t match regular patterns.
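Searching for documents that contain a certain phrase is typically built on an index rather than on scanning every file. The sketch below shows one common approach, a positional inverted index; it is an illustration only, not a description of how Relativity’s search works, and the document IDs and function names are made up.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map each term to the documents it occurs in and its positions there."""
    index: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term][doc_id].append(pos)
    return index


def search_phrase(index: dict[str, dict[str, list[int]]], phrase: str) -> list[str]:
    """Return IDs of documents where the terms of `phrase` appear consecutively."""
    terms = tokenize(phrase)
    if not terms or any(t not in index for t in terms):
        return []
    hits = []
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # The i-th term of the phrase must sit exactly i positions later.
            if all(start + i in index[t].get(doc_id, ()) for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


docs = {
    "email-1": "Please shred the merger agreement today.",
    "email-2": "The agreement on the merger is signed.",
    "memo-3": "Merger agreement draft attached.",
}
index = build_index(docs)
matches = search_phrase(index, "merger agreement")
```

Here `matches` contains `email-1` and `memo-3`, but not `email-2`, where both words appear but not next to each other. The index is built once during processing, which is why the earlier indexing step makes the lawyers’ searches fast.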
In the end this multistep process saves a ton of work and produces better results than the traditional approach. This makes it easier for the legal team to find the truth in their data and pursue justice in a courtroom. On top of that, the whole process is modular and some parts can be applied to other areas - like information governance. As you can see, it’s all about the data - collecting, processing, transferring and gathering insight from terabytes of data, of which only a small portion is actually relevant.
Along the way, there are technical challenges and problems that Relativity employees get to face and solve on a daily basis.
Impact on the real world
e-Discovery - and legaltech as a larger domain - is barely recognized in Poland, but its impact on the real world is clear. Most major corporations and law firms have to use it in this day and age. It can also help people seek justice in civil cases against companies and governments. e-Discovery was used to uncover several big scandals in recent years. So it’s not only about making users’ lives easier, but also about adding value to the output of their work.