Generative AI with Python
The Developer’s Guide to Pretrained LLMs, Vector Databases, Retrieval-Augmented Generation, and Agentic Systems
Book Information
- Publisher: SAP PRESS
- Author: Bert Gollnick
- Year: 2025
- Edition: 1
- Pages: 392
- Language: English
Description
Your guide to generative AI with Python is here! Start with an introduction to generative AI, NLP models, LLMs, and LMMs—and then dive into pretrained models with Hugging Face. Work with LLMs in Python using tools like OpenAI and LangChain. Get step-by-step instructions for working with vector databases and using retrieval-augmented generation. With information on agentic systems and AI application deployment, this guide gives you all you need to become an AI master!
- Work with pretrained LLMs and NLP models from Hugging Face and LangChain
- Create vector databases and implement retrieval-augmented generation
- Add an agentic system using frameworks such as CrewAI and AG2
Key Highlights
- Natural language processing (NLP) models
- Large language models (LLMs)
- Pretrained models
- Prompt engineering
- Vector databases
- Retrieval-augmented generation (RAG)
- Agentic systems
- OpenAI
- LangChain
- Hugging Face
- CrewAI
- AG2
You'll learn about
- Large Language Models:
Set up LLMs and then learn how to apply your models using Python. Walk through the available tools: OpenAI, Meta’s Llama model family, Groq, and open-source LLMs. Work with prompt templates, chains, and more (see the first sketch after this list).
- Vector Databases:
Create and use vector databases to store and query large collections of documents. Master all aspects of the pipeline: loading a raw document, processing it, and storing it in your vector database (see the ingestion sketch after this list).
- Retrieval-Augmented Generation:
Leverage large-scale pretrained language models and external knowledge sources with retrieval-augmented generation. Retrieve relevant information from large corpora, integrate it into the generation process, and evaluate the quality and diversity of the generated texts (see the RAG sketch after this list).
- Agentic Systems:
Use AI models to build agents that act autonomously to achieve their goals. Discover the different frameworks for this task: LangGraph, AG2, CrewAI, OpenAI Agents, and Pydantic AI (see the agent sketch after this list).
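To give a flavor of the LLM chapter, here is a minimal sketch of a prompt template and chain built with LangChain and OpenAI. It is an illustration under assumptions, not code from the book: it assumes the langchain-openai package is installed, an OPENAI_API_KEY environment variable is set, and gpt-4o-mini is only an example model name.

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Prompt template with a system role and a user placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise technical assistant."),
    ("user", "Explain {topic} in two sentences."),
])

# Chat model; temperature controls how deterministic the output is
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)

# Chain the template and the model, then invoke with the template variables
chain = prompt | llm
response = chain.invoke({"topic": "retrieval-augmented generation"})
print(response.content)
```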
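The ingestion pipeline described for vector databases (load, split, embed, store, query) can be sketched in a few lines. This is an illustrative example, not the book's code; it assumes the langchain-community, langchain-text-splitters, langchain-chroma, and langchain-openai packages are installed, OPENAI_API_KEY is set, and notes.txt is a placeholder file name.

```python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# 1. Load the raw document
docs = TextLoader("notes.txt", encoding="utf-8").load()

# 2. Split it into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 3. Embed the chunks and persist them in a local Chroma database
db = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="./chroma_db")

# 4. Query the store with a similarity search
for doc in db.similarity_search("What is the main topic of the document?", k=3):
    print(doc.page_content[:100])
```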
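Retrieval-augmented generation then wires such a vector store into the prompt. The sketch below reuses the hypothetical Chroma database from the previous example and follows the retrieve, augment, generate pattern; the function name rag, the model name, and the paths are all illustrative.

```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate

# Reopen the persisted vector store and set up the model and prompt
db = Chroma(persist_directory="./chroma_db", embedding_function=OpenAIEmbeddings())
llm = ChatOpenAI(model="gpt-4o-mini")
prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)
chain = prompt | llm

def rag(question: str) -> str:
    # Retrieval: fetch the most similar chunks from the vector database
    chunks = db.similarity_search(question, k=4)
    # Augmentation: insert the retrieved text into the prompt
    context = "\n\n".join(doc.page_content for doc in chunks)
    # Generation: the LLM answers grounded in the retrieved context
    return chain.invoke({"context": context, "question": question}).content

print(rag("What is the main topic of the document?"))
```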
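Finally, the core idea behind agentic systems, a model that decides on its own whether to call a tool before answering, can be shown without any of the frameworks named above. The loop below uses plain LangChain tool binding only as a stand-in; the tool get_word_count is a made-up example, and the book's chapters build such agents with LangGraph, AG2, CrewAI, OpenAI Agents, and Pydantic AI instead.

```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, ToolMessage

@tool
def get_word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

# Let the model decide when to use the tool
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([get_word_count])
messages = [HumanMessage("How many words are in 'to be or not to be'?")]

# Minimal agent loop: keep executing requested tool calls until the model
# returns a plain answer
while True:
    ai_msg = llm.invoke(messages)
    messages.append(ai_msg)
    if not ai_msg.tool_calls:  # no tool requested -> final answer
        print(ai_msg.content)
        break
    for call in ai_msg.tool_calls:  # run each tool call and report the result
        result = get_word_count.invoke(call["args"])
        messages.append(ToolMessage(content=str(result), tool_call_id=call["id"]))
```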
Table of Contents
- Preface
- Objective of This Book
- Target Audience
- Prerequisites: What You Should Already Know
- Structure of This Book
- How to Use This Book Effectively
- Downloadable Code and Additional Materials
- System Setup
- Python Installation
- IDE Installation
- Git Installation
- Getting the Source Material
- Setting Up Your Local Environment
- Acknowledgments
- Conventions Used in This Book
- 1 Introduction to Generative AI
- 1.1 Introduction to Artificial Intelligence
- 1.2 Pillars of Generative AI Advancement
- 1.2.1 Computational Power
- 1.2.2 Model and Data Size
- 1.2.3 Investments
- 1.2.4 Algorithmic Improvements
- 1.3 Deep Learning
- 1.4 Narrow AI and General AI
- 1.5 Natural Language Processing Models
- 1.5.1 NLP Tasks
- 1.5.2 Architectures
- 1.6 Large Language Models
- 1.6.1 Training
- 1.6.2 Use Cases
- 1.6.3 Limitations
- 1.7 Large Multimodal Models
- 1.8 Generative AI Applications
- 1.8.1 Consumer
- 1.8.2 Business
- 1.8.3 Prosumer
- 1.9 Summary
- 2 Pretrained Models
- 2.1 Hugging Face
- 2.2 Coding: Text Summarization
- 2.3 Exercise: Translation
- 2.3.1 Task
- 2.3.2 Solution
- 2.4 Coding: Zero-Shot Classification
- 2.5 Coding: Fill-Mask
- 2.6 Coding: Question Answering
- 2.7 Coding: Named Entity Recognition
- 2.8 Coding: Text-to-Image
- 2.9 Exercise: Text-to-Audio
- 2.9.1 Task
- 2.9.2 Solution
- 2.10 Capstone Project: Customer Feedback Analysis
- 2.11 Summary
- 3 Large Language Models
- 3.1 Brief History of Language Models
- 3.2 Simple Use of LLMs via Python
- 3.2.1 Coding: Using OpenAI
- 3.2.2 Coding: Using Groq
- 3.2.3 Coding: Large Multimodal Models
- 3.2.4 Coding: Running Local LLMs
- 3.3 Model Parameters
- 3.3.1 Model Temperature
- 3.3.2 Top-p and Top-k
- 3.4 Model Selection
- 3.4.1 Performance
- 3.4.2 Knowledge Cutoff Date
- 3.4.3 On-Premise versus Cloud-Based Hosting
- 3.4.4 Open-Source, Open-Weight, and Proprietary Models
- 3.4.5 Price
- 3.4.6 Context Window
- 3.4.7 Latency
- 3.5 Messages
- 3.5.1 User
- 3.5.2 System
- 3.5.3 Assistant
- 3.6 Prompt Templates
- 3.6.1 Coding: ChatPromptTemplates
- 3.6.2 Coding: Improve Prompts with LangChain Hub
- 3.7 Chains
- 3.7.1 Coding: Simple Sequential Chain
- 3.7.2 Coding: Parallel Chain
- 3.7.3 Coding: Router Chain
- 3.7.4 Coding: Chain with Memory
- 3.8 Safety and Security
- 3.8.1 Security
- 3.8.2 Safety
- 3.8.3 Coding: Implementing LLM Safety and Security
- 3.9 Model Improvements
- 3.10 New Trends
- 3.10.1 Reasoning Models
- 3.10.2 Small Language Models
- 3.10.3 Test-Time Computation
- 3.11 Summary
- 4 Prompt Engineering
- 4.1 Prompt Basics
- 4.1.1 Prompt Process
- 4.1.2 Prompt Components
- 4.1.3 Basic Principles
- 4.2 Coding: Few-Shot Prompting
- 4.3 Coding: Chain-of-Thought
- 4.4 Coding: Self-Consistency Chain-of-Thought
- 4.5 Coding: Prompt Chaining
- 4.6 Coding: Self-Feedback
- 4.7 Summary
- 5 Vector Databases
- 5.1 Introduction
- 5.2 Data Ingestion Process
- 5.3 Loading Documents
- 5.3.1 High-Level Overview
- 5.3.2 Coding: Load a Single Text File
- 5.3.3 Coding: Load Multiple Text Files
- 5.3.4 Exercise: Load Multiple Wikipedia Articles
- 5.3.5 Exercise: Loading Project Gutenberg Book
- 5.4 Splitting Documents
- 5.4.1 Coding: Fixed-Size Chunking
- 5.4.2 Coding: Structure-Based Chunking
- 5.4.3 Coding: Semantic Chunking
- 5.4.4 Coding: Custom Chunking
- 5.5 Embeddings
- 5.5.1 Overview
- 5.5.2 Coding: Word Embeddings
- 5.5.3 Coding: Sentence Embeddings
- 5.5.4 Coding: Create Embeddings with LangChain
- 5.6 Storing Data
- 5.6.1 Selection of a Vector Database
- 5.6.2 Coding: File-Based Storage with a Chroma Database
- 5.6.3 Coding: Web-Based Storage with Pinecone
- 5.7 Retrieving Data
- 5.7.1 Similarity Calculation
- 5.7.2 Coding: Retrieve Data from Chroma Database
- 5.7.3 Coding: Retrieve Data from Pinecone
- 5.8 Capstone Project
- 5.8.1 Features
- 5.8.2 Dataset
- 5.8.3 Preparing the Vector Database
- 5.8.4 Exercise: Get All Genres from the Vector Database
- 5.8.5 App Development
- 5.9 Summary
- 6 Retrieval-Augmented Generation
- 6.1 Introduction
- 6.2 Coding: Simple Retrieval-Augmented Generation
- 6.2.1 Knowledge Source Setup
- 6.2.2 Retrieval
- 6.2.3 Augmentation
- 6.2.4 Generation
- 6.2.5 RAG Function Creation
- 6.3 Advanced Techniques
- 6.3.1 Advanced Preretrieval Techniques
- 6.3.2 Advanced Retrieval Techniques
- 6.3.3 Advanced Postretrieval Techniques
- 6.4 Coding: Prompt Caching
- 6.5 Evaluation
- 6.5.1 Challenges in RAG Evaluation
- 6.5.2 Metrics
- 6.5.3 Coding: Metrics
- 6.6 Summary
- 7 Agentic Systems
- 7.1 Introduction to AI Agents
- 7.2 Available Frameworks
- 7.3 Simple Agent
- 7.3.1 Agentic RAG
- 7.3.2 ReAct
- 7.4 Agentic Framework: LangGraph
- 7.4.1 Simple Graph: Assistant
- 7.4.2 Router Graph
- 7.4.3 Graph with Tools
- 7.5 Agentic Framework: AG2
- 7.5.1 Two Agent Conversations
- 7.5.2 Human in the Loop
- 7.5.3 Agents Using Tools
- 7.6 Agentic Framework: CrewAI
- 7.6.1 Introduction
- 7.6.2 First Crew: News Analysis Crew
- 7.6.3 Exercise: AI Security Crew
- 7.7 Agentic Framework: OpenAI Agents
- 7.7.1 Getting Started with a Single Agent
- 7.7.2 Working with Multiple Agents
- 7.7.3 Agent with Search and Retrieval Functionality
- 7.8 Agentic Framework: Pydantic AI
- 7.9 Monitoring Agentic Systems
- 7.9.1 AgentOps
- 7.9.2 Logfire
- 7.10 Summary
- 8 Deployment
- 8.1 Deployment Architecture
- 8.2 Deployment Strategy
- 8.2.1 REST API Development
- 8.2.2 Deployment Priorities
- 8.2.3 Coding: Local Deployment
- 8.3 Self-Contained App Development
- 8.4 Deployment to Heroku
- 8.4.1 Create a New App
- 8.4.2 Download and Configure CLI
- 8.4.3 Create app.py File
- 8.4.4 Procfile Setup
- 8.4.5 Environment Variables
- 8.4.6 Python Environment
- 8.4.7 Check the Result Locally
- 8.4.8 Deployment to Heroku
- 8.4.9 Stop Your App
- 8.5 Deployment to Streamlit
- 8.5.1 GitHub Repository
- 8.5.2 Creating a New App
- 8.6 Deployment with Render
- 8.7 Summary
- 9 Outlook
- 9.1 Advances in Model Architecture
- 9.2 Limitations and Issues of LLMs
- 9.2.1 Hallucinations
- 9.2.2 Biases
- 9.2.3 Misinformation
- 9.2.4 Intellectual Property
- 9.2.5 Interpretability and Transparency
- 9.2.6 Jailbreaking LLMs
- 9.3 Regulatory Developments
- 9.4 Artificial General Intelligence and Artificial Superintelligence
- 9.5 AI Systems in the Near Term
- 9.6 Useful Resources
- 9.7 Summary
Disclaimer
SAP, other SAP products and services mentioned herein, as well as their respective logos, are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Our Company is not affiliated with SAP SE or any of its affiliated companies, including but not limited to: Sybase, Business Objects, Hybris, Ariba, and SuccessFactors. All other names, brands, logos, etc. are registered trademarks or service marks of their respective owners.