Generative AI has become a pivotal focus in the tech landscape since late 2022. As a multitude of new large language models (LLMs) emerge, the community has recognized the need for enhanced features and functionalities to utilize these models safely and effectively within enterprise settings. Challenges such as hallucination, limited context window lengths, prompt engineering, and associated costs have made it difficult for organizations to swiftly adopt generative AI. Many enterprises lack the necessary expertise and frameworks to evaluate various models, implement deployment strategies, and establish guardrails for monitoring LLMs. Consequently, businesses require a comprehensive tool that addresses these issues, while also being budget-friendly and requiring minimal AI expertise.
DataRobot, an AWS Partner and seller on AWS Marketplace, presents a solution to these challenges. As a full AI lifecycle platform, DataRobot merges both predictive and generative AI capabilities through a low-code, no-code (LCNC) design, enabling organizations to harness AI for business value and innovation at an accelerated pace. DataRobot has earned AWS Competencies in machine learning, data and analytics, and financial services, along with the AWS Service Ready Specialization for Amazon SageMaker. The platform supports various deployment modes, including multi-tenant software as a service (SaaS) built on AWS, single-tenant SaaS, and Amazon Virtual Private Cloud (Amazon VPC) deployment, catering to diverse industry requirements.
This article will outline the architecture of the DataRobot AI Platform and illustrate how to construct a GenAI application, encompassing LLM selection, prompt testing, model evaluation, Retrieval Augmented Generation (RAG), LLM monitoring, and guardrails. All elements are integrated into a unified DataRobot AI platform that offers a seamless user experience.
Solution Overview
DataRobot’s SaaS platform, built entirely on AWS with a modern Kubernetes design, provides customers with powerful, scalable, and reliable AI solutions. The AI project lifecycle begins with data ingestion, where DataRobot offers secure integrations to various data sources, including Amazon Simple Storage Service (Amazon S3), Amazon Athena, and Amazon Redshift, as well as data stores from other providers such as Snowflake.
For predictive AI, DataRobot enables users to store, visualize, transform, and analyze data with minimal data science knowledge in just a few clicks, backed by a robust data processing engine. Following data preparation, DataRobot leverages its AutoML expertise to run parallel ML training jobs across a selection of high-quality ML models, automatically handling all necessary feature processing. Trained models are ranked on an integrated leaderboard, which provides detailed insights into each model’s accuracy, feature importance, receiver operating characteristic (ROC) curve, processing recipes, and prediction explanations. Users can deploy their models within DataRobot or transfer them to other platforms, such as Amazon SageMaker or Snowflake, in a matter of minutes.
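For teams that prefer code to clicks, DataRobot also provides a Python client. As a minimal sketch of inspecting a trained project’s leaderboard (this assumes the datarobot package is installed, an API token in the DATAROBOT_API_TOKEN environment variable, and a placeholder project ID):

```python
import os
import datarobot as dr

# Authenticate against the DataRobot SaaS endpoint.
dr.Client(
    endpoint="https://app.datarobot.com/api/v2",
    token=os.environ["DATAROBOT_API_TOKEN"],
)

# Fetch an existing project by ID (placeholder value) and list its leaderboard models.
project = dr.Project.get("5f1e2d3c4b5a697887766554")  # hypothetical project ID
for model in project.get_models():
    validation_score = model.metrics[project.metric]["validation"]
    print(f"{model.model_type}: {project.metric} = {validation_score}")
```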
In the realm of generative AI, DataRobot features an LLM playground that allows users to create and interact with LLM blueprints using various leading LLMs, including Anthropic’s Claude in Amazon Bedrock and Amazon Titan models. The playground includes preselected vector databases, facilitating the development of chat or RAG applications without requiring extensive ML expertise. Users can create and compare different LLM blueprints side by side for prompt testing, and can choose from preselected and custom evaluation metrics to monitor LLM performance, a critical factor for LLM adoption in enterprises. Once the LLM blueprint and metrics are finalized, customers can proceed to production deployment.
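Outside of the playground, the same Claude models can be invoked directly in Amazon Bedrock with boto3. A minimal sketch using the Converse API follows; the model ID, region, and prompt are assumptions, so confirm model access in your own account:

```python
import boto3

# The Bedrock runtime client invokes foundation models such as Anthropic's Claude.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model ID; verify in your account
    messages=[{
        "role": "user",
        "content": [{"text": "Draft a two-sentence retention email for a telco customer."}],
    }],
    inferenceConfig={"maxTokens": 256, "temperature": 0.3},
)

print(response["output"]["message"]["content"][0]["text"])
```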
For both predictive and generative AI, DataRobot equips models with monitoring tools that are easy to set up. The platform can track numerous preselected metrics along with custom-defined metrics, including service health, latency, token size, error rate, and cost. For generative AI specifically, DataRobot implements guardrails such as prompt injection detection, sentiment and toxicity classification, and personally identifiable information (PII) detection, among other safeguards, to mitigate risk.
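To illustrate what a guardrail check can look like, here is a simplified PII screen applied to a prompt before it reaches the LLM. This is an illustrative sketch, not DataRobot’s implementation; production guardrails typically rely on trained models rather than regular expressions:

```python
import re

# Illustrative-only patterns; a production guardrail would use a trained PII detector.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def pii_guardrail(prompt: str) -> tuple[bool, list[str]]:
    """Return (allowed, list of PII types detected in the prompt)."""
    hits = [name for name, pattern in PII_PATTERNS.items() if pattern.search(prompt)]
    return (len(hits) == 0, hits)

allowed, hits = pii_guardrail("My SSN is 123-45-6789, what plan should I pick?")
if not allowed:
    print(f"Prompt blocked; detected PII: {hits}")
```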
In the subsequent sections, we will guide you through the key steps to effortlessly build a predictive and generative AI application using DataRobot.
In this walkthrough, you will learn how to:
- Connect to data in Amazon S3 and create a data wrangling recipe for data preparation.
- Develop a predictive model within DataRobot and deploy it to Amazon SageMaker.
- Create an LLM blueprint that integrates Anthropic’s Claude in Amazon Bedrock with grounding data.
- Evaluate and optimize the blueprint against metrics such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE), faithfulness, and confidence (see the ROUGE sketch after this list).
- Package the LLM blueprint with guardrails to ensure safety and performance.
- Launch an AI application that predicts the next best offer (NBO) and automatically generates email outreach.
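To make the ROUGE metric in the evaluation step above concrete, here is a minimal sketch using the open-source rouge-score package; the reference and candidate texts are made-up examples, not walkthrough data:

```python
from rouge_score import rouge_scorer  # pip install rouge-score

# Compare a generated email snippet against a reference answer by n-gram overlap.
reference = "We are offering you a 20 percent discount on the premium plan."
candidate = "You qualify for a 20 percent discount on our premium plan."

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)

for name, score in scores.items():
    print(f"{name}: precision={score.precision:.2f}, recall={score.recall:.2f}, f1={score.fmeasure:.2f}")
```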
Prerequisites
Before you begin, ensure you have the following prerequisites:
- An AWS account with access to Amazon Bedrock models.
- A DataRobot account. DataRobot is available for purchase through AWS Marketplace, and you can request a free trial on the DataRobot site.
Solution Walkthrough: Build a Predictive and Generative AI Application Powered by Anthropic’s Claude Models in DataRobot
In this use case, we will create an NBO email campaign aimed at proactively engaging customers likely to churn with offers tailored to retain them. We will utilize a predictive model to recommend the NBO for each customer and generative AI to customize the email accordingly.
Step 1: Connect the Data Source in Amazon S3
Within the DataRobot Workbench UI, we will initiate a new use case and incorporate NBO data by linking to an Amazon S3 bucket. The NBO data is prepared in-house, based on the public IBM telco customer churn dataset. We will adjust this dataset by adding a text column labeled “customer plan” and changing the prediction target from Churn Value to Next Best Offer, which turns the prediction into a multi-class classification problem derived from applying commonly accepted business rules to the reason codes in the original data. Furthermore, we will create new columns that track customer interactions with customer service, which will serve as a key feature for our prediction. Using DataRobot’s data wrangler, we will generate a wrangling recipe on a sample of the data and apply it to the entire dataset.
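As a rough sketch of the kind of transformations such a wrangling recipe encodes (the column names and business rules below are illustrative placeholders, not the exact in-house logic):

```python
import pandas as pd

# Hypothetical local copy of the prepared NBO data; the real data lives in Amazon S3.
df = pd.read_csv("telco_churn.csv")

# Illustrative business rules mapping churn reason codes to a multi-class target.
def next_best_offer(reason_code: str) -> str:
    if reason_code == "price":
        return "Discount"
    if reason_code == "network":
        return "Plan Upgrade"
    return "No Offer"

df["Next Best Offer"] = df["churn_reason"].map(next_best_offer)

# Add a free-text "customer plan" column and a customer-service interaction feature.
df["customer plan"] = df["plan_tier"] + " plan on a " + df["contract_type"] + " contract"
df["service_contacts_90d"] = df["support_tickets_90d"].fillna(0).astype(int)

df.to_csv("nbo_wrangled.csv", index=False)
```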
Step 2: Create a Predictive Model for the NBO
Once data preparation is completed, we can train a predictive model to determine the optimal offer for each customer to prevent churn. This involves the following steps:
- In the DataRobot Workbench UI, initiate an experiment for model training.
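If you prefer to script this step rather than use the Workbench UI, a minimal sketch with the DataRobot Python client looks roughly like the following; the dataset file name carries over from the wrangling sketch above, and quick autopilot mode is an assumption:

```python
import os
import datarobot as dr

dr.Client(
    endpoint="https://app.datarobot.com/api/v2",
    token=os.environ["DATAROBOT_API_TOKEN"],
)

# Create a project from the wrangled NBO dataset and start autopilot training.
project = dr.Project.create(sourcedata="nbo_wrangled.csv", project_name="NBO Email Campaign")
project.set_target(target="Next Best Offer", mode=dr.AUTOPILOT_MODE.QUICK)
project.wait_for_autopilot()  # blocks until model training completes
```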