Unlock enhanced legal document reviews with Lexbe and Amazon Bedrock


This post is co-authored with Karsten Weber and Rosary Wang of Lexbe.

Legal experts are frequently tasked with sifting through vast numbers of documents to identify important evidence in litigation. This process can be time-consuming, error-prone, and expensive, especially when tight deadlines are approaching. Lexbe, a leader in legal document review software, tackled these challenges head-on with Amazon Bedrock. By integrating Amazon's advanced AI and machine learning services, Lexbe streamlined the document review process, increasing both efficiency and accuracy. In this blog post, we explore how Lexbe uses Amazon Bedrock and other AWS services to overcome business challenges and provide scalable, high-performance solutions for legal document analysis.

Business challenges and why they are important

Legal experts routinely face the challenging task of managing and analyzing large sets of case documents, ranging from 100,000 to over 1 million. Quickly identifying relevant information within these large datasets is essential to building strong cases and preventing costly oversights. Lexbe addresses this challenge by using Amazon Bedrock in a custom application: Lexbe Pilot.

Lexbe Pilot is an AI-powered Q&A assistant integrated into the Lexbe eDiscovery platform. It lets legal teams use generative AI to query the full set of documents in a case and extract insights from it, reducing the need for time-consuming manual investigation and analysis. Using Amazon Bedrock Knowledge Bases, users can query the entire dataset and retrieve grounded, contextually relevant results. This approach helps legal teams uncover critical or "smoking gun" documents that could otherwise remain hidden. As legal cases grow, keyword searches that previously returned a handful of documents can return hundreds or even thousands. Lexbe Pilot distills these large result sets into simple and meaningful answers.

Failure to address these challenges can lead to missed evidence and, potentially, adverse outcomes. With Amazon Bedrock and its associated services, Lexbe offers scalable, high-performance solutions that enable legal professionals to navigate the growing landscape of electronic discovery efficiently and accurately.

Solution Overview: Amazon Bedrock as a Foundation

Lexbe transformed the document review process by integrating Amazon Bedrock, a powerful suite of AI and machine learning (ML) services. With deep integration into the AWS ecosystem, Amazon Bedrock offers the performance and scalability needed to meet the stringent demands of Lexbe clients in the legal industry.

Key AWS services used:

  • Amazon Bedrock. A fully managed service that provides high-performing foundation models (FMs) for large-scale language tasks. Using these models, Lexbe can quickly analyze large volumes of legal documents with high accuracy.
  • Amazon Bedrock Knowledge Bases. Provides fully managed support for end-to-end Retrieval Augmented Generation (RAG) workflows, allowing Lexbe to ingest documents, perform semantic searches, and retrieve contextually relevant information.
  • Amazon OpenSearch. Indexes all document text and the corresponding metadata in both vector and text modes. This allows Lexbe to quickly find specific documents or key information across a large dataset by vector or keyword search.
  • AWS Fargate. Orchestrates the analysis and processing of large workloads in a serverless container environment, allowing Lexbe to scale horizontally without managing the underlying server infrastructure.

Amazon Bedrock Knowledge Bases Architecture and Workflow

The following architecture diagram shows how Amazon Bedrock is integrated into the Lexbe platform. The architecture is designed to handle both large-scale ingestion and retrieval of legal documents.

  1. User access: Users access the frontend application through a web browser.
  2. Request routing: The request is routed through Amazon CloudFront, which connects to the backend through an Application Load Balancer.
  3. Backend processing: Backend services running on Fargate process the request and interact with the other system components.
  4. Document processing: Legal documents are stored in an Amazon Simple Storage Service (Amazon S3) bucket, and Apache Tika extracts text from them. The extracted text is saved as individual text files in a separate S3 bucket, which serves as the source repository for Amazon Bedrock.
  5. Embedding creation: The extracted text is processed with Amazon Titan Text Embeddings V2 to generate embeddings. Lexbe experimented with multiple embedding models, including Amazon Titan and Cohere, and tested configurations with different token sizes (for example, 512 versus 1,024 tokens).
  6. Embedding storage: The generated embeddings are stored in a vector database for fast retrieval.
  7. Query execution: Amazon Bedrock Knowledge Bases retrieves the relevant data from the vector database for a given query.
  8. LLM integration: Anthropic's Claude 3.5 Sonnet large language model (LLM) on Amazon Bedrock processes the retrieved data to generate coherent, accurate responses.
  9. Response delivery: The final response is returned to the user through the frontend application via CloudFront.
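
Steps 7 and 8 of the workflow above correspond to a retrieve-then-generate call against a knowledge base. The following is a minimal Python sketch of such a call, not Lexbe's actual code. It assumes the boto3 `bedrock-agent-runtime` client and its `retrieve_and_generate` operation; the knowledge base ID, model ARN, and number of results are placeholders chosen for illustration.

```python
from typing import Any, Dict, List

# Placeholder values: assumptions for illustration, not taken from the post.
KNOWLEDGE_BASE_ID = "EXAMPLEKBID"
MODEL_ARN = (
    "arn:aws:bedrock:us-east-1::foundation-model/"
    "anthropic.claude-3-5-sonnet-20240620-v1:0"
)


def build_request(question: str, top_k: int = 20) -> Dict[str, Any]:
    """Build the keyword arguments for a RetrieveAndGenerate call."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": KNOWLEDGE_BASE_ID,
                "modelArn": MODEL_ARN,
                "retrievalConfiguration": {
                    # How many chunks to fetch from the vector store.
                    "vectorSearchConfiguration": {"numberOfResults": top_k}
                },
            },
        },
    }


def ask(client: Any, question: str, top_k: int = 20) -> Dict[str, Any]:
    """Send the question; return the answer text and the cited S3 URIs."""
    response = client.retrieve_and_generate(**build_request(question, top_k))
    sources: List[str] = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in sources:  # de-duplicate, keep first-seen order
                sources.append(uri)
    return {"answer": response["output"]["text"], "sources": sources}
```

With AWS credentials configured, `client` would be `boto3.client("bedrock-agent-runtime")`. Passing the client in as an argument keeps the sketch easy to exercise with a stub object that exposes a compatible `retrieve_and_generate` method.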

Amazon and Lexbe collaboration

Over eight months, Lexbe worked with the Amazon Bedrock Knowledge Bases team to improve the performance and accuracy of its Pilot feature. The collaboration included weekly strategy meetings between senior teams from both organizations, which allowed for rapid iteration. From the beginning, Lexbe established clear acceptance criteria focused on achieving a specific recall rate. These metrics served as the benchmark for deciding when the feature was ready for production. As shown in the following diagram, the system's performance passed through five important milestones, each marking a significant step toward production.

We focused on recall because identifying the right documents is essential to producing a correct response. Unlike some Retrieval Augmented Generation (RAG) use cases, where users ask specific questions that can be answered from a few documents, we are generating finding-of-fact reports that draw on a large number of source documents. A high recall rate ensures that Amazon Bedrock Knowledge Bases does not leave out important information.
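
Recall, as used here, is the share of all relevant documents that the retrieval step actually returns. A minimal sketch of that calculation follows; the document IDs are hypothetical and are not Lexbe data.

```python
from typing import Iterable, Set


def recall(retrieved: Iterable[str], relevant: Iterable[str]) -> float:
    """Fraction of the relevant documents that appear in the retrieved set."""
    relevant_set: Set[str] = set(relevant)
    if not relevant_set:
        raise ValueError("recall is undefined when there are no relevant documents")
    hits = relevant_set & set(retrieved)
    return len(hits) / len(relevant_set)


# Hypothetical evaluation: 10 documents are known to matter for a question,
# and the retriever returns 9 of them plus one irrelevant document.
relevant_docs = [f"doc-{i}" for i in range(10)]
retrieved_docs = [f"doc-{i}" for i in range(9)] + ["doc-99"]

print(f"recall = {recall(retrieved_docs, relevant_docs):.0%}")  # recall = 90%
```

Note that the irrelevant document does not lower recall. It would lower precision, which matters less for a report that must not miss evidence.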

First iteration: January 2024. The initial system achieved a recall rate of only 5%, showing that considerable work remained before it could reach production.

Second iteration: April 2024. Amazon Bedrock Knowledge Bases added new features that improved accuracy, raising the recall rate to 36%.

Third iteration: June 2024. Performance improved further through parameter adjustments, particularly to the token size, bringing the recall rate to 60%.
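
Token size in this context is the amount of text placed in each chunk before it is embedded. The sketch below illustrates fixed-size chunking and compares 512 and 1,024 tokens per chunk. It is an illustration under stated assumptions: tokens are approximated by whitespace-separated words, and the `overlap` parameter is hypothetical, whereas a managed knowledge base applies its own tokenizer and chunking settings.

```python
from typing import List


def chunk_tokens(tokens: List[str], max_tokens: int, overlap: int = 0) -> List[List[str]]:
    """Split a token list into chunks of at most max_tokens tokens.

    Consecutive chunks share `overlap` tokens (0 means no overlap).
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    step = max_tokens - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks


# A stand-in document of 3,000 "tokens" (whitespace-separated words).
tokens = ("word " * 3000).split()

for size in (512, 1024):
    print(size, len(chunk_tokens(tokens, size)))
# 512 6
# 1024 3
```

Smaller chunks give more, narrower embeddings per document, while larger chunks keep more context together. Which setting retrieves better is an empirical question, which is why it is tuned against a recall target.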

Fourth iteration: August 2024. A recall rate of 66% was achieved using the Amazon Titan Text Embeddings V2 model.

Fifth iteration: December 2024. The introduction of reranker technology proved invaluable, raising the recall rate to as high as 90%.
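
A reranker re-scores an initial list of retrieved candidates with a more precise relevance model and keeps the best ones. The sketch below shows only that general two-stage pattern. The word-overlap scoring function is a stand-in assumption for a real reranking model, and nothing here reflects the specific reranker used in Lexbe Pilot.

```python
from typing import Callable, List


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[str]:
    """Re-score first-stage candidates and keep the top_n highest scoring."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    # Python's sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


# Hypothetical first-stage results, in the order the retriever returned them.
candidates = [
    "Quarterly budget forecast for the plant",
    "Incident report on the forklift accident at the warehouse",
    "Email about the warehouse accident investigation",
]

top = rerank("warehouse accident report", candidates, overlap_score, top_n=2)
print(top)
```

In this toy run the budget forecast shares no words with the query and is dropped, while the incident report moves ahead of the email.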

The final results are impressive:

  • Broad, human-style reporting. In an industrial accident matter, Pilot was asked to conduct a full findings-of-fact analysis. It produced a polished five-page report with clear section headings and hyperlinks back to every source document, whether those documents were in English, Spanish, or other languages.
  • Deep, automated inference. In a case with tens of thousands of documents, Pilot was asked, “Who is Bob's son?” There was no explicit mention of his children anywhere. However, Pilot focused on an email that opened with “Dear Femme” and closed with “Love, Mama/Linda,” and that included the child's first and last name in the metadata. By connecting these dots, Pilot accurately identified Bob's son and cited the exact emails on which the inference was based.

Traditional eDiscovery techniques cannot do either of these. With Pilot, legal teams can:

  • Generate actionable reports that attorneys can quickly iterate on for deeper analysis.
  • Streamline eDiscovery by surfacing important connections that go far beyond simple text matches.
  • Unlock strategic insights in moments, even from multilingual data.

Whether you want comprehensive, human-readable reports or laser-focused intelligence on the relationships lurking in your data, Lexbe Pilot with Amazon Bedrock Knowledge Bases gives you the exact information you need.

Benefits of Integrating Amazon Bedrock with AWS Services

By integrating Amazon Bedrock with other AWS services, Lexbe has gained several strategic advantages in the document review process.

Scalability. With Amazon Elastic Container Service (Amazon ECS) and AWS Fargate, Lexbe can dynamically scale its processing infrastructure.

Cost efficiency. Processing on the Amazon ECS Linux Spot Market offers a significant cost advantage.

Security. The robust AWS security framework, including encryption and role-based access control, protects sensitive legal documents. This is important for Lexbe clients, who must adhere to strict confidentiality requirements.

Conclusion: A scalable, accurate, cost-effective solution

By integrating Amazon Bedrock, Lexbe has transformed its document review platform into a highly efficient, scalable, and accurate solution. The combination of Amazon Bedrock, Amazon OpenSearch, and AWS Fargate has significantly improved both search accuracy and processing speed while also lowering costs. Lexbe's success demonstrates the power of AWS AI/ML services to tackle complex, real-world challenges. By using flexible, scalable, and cost-effective AWS products, Lexbe is equipped to meet the evolving needs of the legal industry, both today and in the future.

If your organization is facing complex challenges that could benefit from AI/ML-driven solutions, take the next step with AWS. Start by working closely with an AWS Solutions Architect to design a strategy tailored to your unique needs. Work with your AWS product team to explore cutting-edge services and make sure your solutions are scalable, secure, and ready for the future. Together, we can help you innovate faster, reduce costs, and deliver transformative results.


About the authors

Wei Chen is a Senior Solutions Architect at Amazon Web Services, based in Austin, Texas. With over 20 years of experience, he specializes in helping clients design and implement solutions for complex technical challenges. In his role at AWS, Wei partners with organizations to modernize their applications and make full use of cloud capabilities to achieve strategic business goals. His specialties are AI/ML and AWS security services.

Gopikrishnan Anilkumar is a Principal Technical Product Manager at Amazon. He has over 10 years of product management experience across a variety of domains and is passionate about AI/ML.

Sandeep Singh is a Senior Generative AI Data Scientist at Amazon Web Services, helping businesses innovate with generative AI. He specializes in generative AI, machine learning, and system design. He has delivered cutting-edge AI/ML-driven solutions to complex business problems across diverse industries, optimizing for efficiency and scalability.

Karsten Weber is the CTO and co-founder of Lexbe, an eDiscovery provider based in Austin, Texas. Lexbe offers Lexbe Online™, a cloud-based application for eDiscovery, litigation, legal document processing, production, review, and case management. Under Karsten's leadership, Lexbe has developed a robust platform and comprehensive eDiscovery services that help law firms and organizations efficiently manage large ESI data sets for legal review and production. Over the past 19 years, Karsten's expertise in technology and innovation has been crucial to Lexbe's success.

Rosary Wang is a Sr. Software Engineer at Lexbe, an Austin, Texas-based software and service provider.


