AI & Automation · 14 min read

LangChain: Build Intelligent AI Agents Using Python

Master LangChain for production Python AI agents. Learn agent types, tool integration, memory systems, and RAG implementation with real code examples and best practices.

How LangChain Helps You Build Production-Ready AI Agents with Python

This article is part of our 5-part series on AI Agent & Workflow Development Tools where we explore the leading platforms and frameworks for building production-ready AI solutions.

📚 Series: Tools We Use for AI Development

  1. Azure AI Foundry - How Azure AI Foundry helps you build secure enterprise AI solutions
  2. LangChain (this article) - How LangChain helps you build production-ready AI agents with Python
  3. Semantic Kernel - How Semantic Kernel helps you build multi-agent AI systems in .NET
  4. n8n - How n8n democratizes AI automation with low-code workflows
  5. Microsoft Agent Framework - How Microsoft Agent Framework enables scalable multi-agent workflows

What is LangChain?

LangChain is the most popular open-source Python framework for building AI applications powered by large language models (LLMs). It transforms simple LLM API calls into sophisticated AI agents capable of reasoning, using tools, maintaining memory, and executing complex workflows.

LangChain solves the critical challenge of LLM orchestration: connecting language models to external data sources, APIs, and tools while managing context, memory, and error handling. Instead of writing custom prompt engineering logic and tool calling code, LangChain provides battle-tested abstractions that handle the complexity for you.

The framework is designed for production-grade AI systems, not just prototypes. With over 100k GitHub stars and adoption by companies like Robinhood, Notion, and Zapier, LangChain has become the de facto standard for Python AI development.

Why LangChain for AI Agents?

Traditional LLM applications are stateless and reactive—they respond to prompts but can’t plan, remember, or interact with external systems. AI Agents built with LangChain overcome these limitations:

  • Autonomous reasoning: Agents decide which actions to take based on context
  • Tool usage: Connect to databases, APIs, search engines, and custom functions
  • Memory systems: Maintain conversation history and long-term knowledge
  • Error recovery: Retry failed operations and handle exceptions gracefully
  • Multi-step workflows: Break complex tasks into manageable steps

LangChain is particularly powerful for:

  • Retrieval-Augmented Generation (RAG): Ground LLM responses in your data
  • Conversational AI: Build chatbots with context and memory
  • Data analysis agents: Query databases and visualize results
  • Automation workflows: Replace manual tasks with intelligent agents

Core LangChain Architecture

LangChain is organized into modular components that you compose together. Understanding this architecture is essential for building robust agents.

The Component Hierarchy

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# 1. Model: The LLM (OpenAI, Anthropic, local models, etc.)
model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.7,
    api_key="your-api-key"
)

# 2. Prompt: Template for LLM input
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant specialized in {domain}."),
    ("user", "{question}")
])

# 3. Output Parser: Structure the LLM response
parser = StrOutputParser()

# 4. Chain: Connect components with LCEL (LangChain Expression Language)
chain = prompt | model | parser

# Execute the chain
result = chain.invoke({
    "domain": "Python development",
    "question": "How do I optimize database queries?"
})

Key Concepts:

  • Runnables: Every component implements the Runnable interface (.invoke(), .stream(), .batch())
  • LCEL (LangChain Expression Language): The | operator chains components together
  • Type safety: Pydantic models ensure data validation at runtime
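
Because every component implements the Runnable interface, the chain built above can be invoked, streamed, or batched through the same methods. A quick illustration (the questions are arbitrary):

# Single call
result = chain.invoke({"domain": "Python development", "question": "What is a generator?"})

# Token-by-token streaming (StrOutputParser yields string chunks)
for token in chain.stream({"domain": "Python development", "question": "What is asyncio?"}):
    print(token, end="", flush=True)

# Parallel execution over multiple inputs
results = chain.batch([
    {"domain": "Python development", "question": "What is a decorator?"},
    {"domain": "Python development", "question": "What is a context manager?"},
])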

LangChain vs LangGraph

LangChain provides linear chains (step-by-step execution), while LangGraph enables cyclic workflows (loops, conditionals, human-in-the-loop). Use LangGraph for:

  • Multi-agent collaboration
  • Iterative refinement (agent tries, evaluates, retries)
  • Complex state machines

We’ll cover both in this guide.

Building Your First LangChain Agent

Agents are autonomous systems that use LLMs to decide which tools to call. Unlike chains (predefined steps), agents reason about the best action dynamically.

Agent Types in LangChain

| Agent Type | Best For | Tools | Memory |
|---|---|---|---|
| ReAct | General-purpose reasoning | Any | Optional |
| OpenAI Functions | Structured tool calling | OpenAI function schema | Built-in |
| Conversational | Chatbots with history | Any | Required |
| Plan-and-Execute | Multi-step tasks | Any | Task list |

Creating a ReAct Agent with Tools

The ReAct pattern (Reasoning + Acting) is the most versatile agent architecture. The agent alternates between thinking and tool usage.

from langchain.agents import create_react_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.prompts import PromptTemplate
from langchain.tools import Tool

# 1. Define tools the agent can use
search = DuckDuckGoSearchRun()

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        # eval with no builtins and no names limits what can run, but it is
        # still not fully safe for untrusted input - prefer a math parser in production
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Search the internet for current information. Input should be a search query."
    ),
    Tool(
        name="Calculate",
        func=calculate,
        description="Perform mathematical calculations. Input should be a valid Python expression (e.g., '2 + 2', '10 * 5')."
    )
]

# 2. Create the agent with a ReAct prompt
prompt = PromptTemplate.from_template("""
You are an intelligent agent capable of reasoning and using tools.

Tools available:
{tools}

Tool names: {tool_names}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Question: {input}
Thought: {agent_scratchpad}
""")

llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_react_agent(llm, tools, prompt)

# 3. Create executor (handles tool calling logic)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,  # Print reasoning steps
    max_iterations=10,  # Prevent infinite loops
    handle_parsing_errors=True  # Graceful error handling
)

# 4. Execute the agent
response = agent_executor.invoke({
    "input": "What is the current price of Bitcoin multiplied by 100?"
})

print(response["output"])

What happens under the hood:

  1. Agent receives the question
  2. Thought: “I need to search for Bitcoin’s current price”
  3. Action: Calls the Search tool with “current Bitcoin price”
  4. Observation: Gets the search result (e.g., “$45,000”)
  5. Thought: “Now I need to multiply by 100”
  6. Action: Calls the Calculate tool with “45000 * 100”
  7. Observation: Gets “4,500,000”
  8. Final Answer: Returns the result to the user

LangChain Tools: Connecting Agents to the Real World

Tools are functions that agents call to interact with external systems. LangChain provides hundreds of pre-built tools and makes it easy to create custom ones.

Using Pre-Built Tools

from langchain_community.tools import WikipediaQueryRun, ShellTool
from langchain_community.tools.file_management import ReadFileTool
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_experimental.tools import PythonREPLTool  # pip install langchain-experimental

# Wikipedia search
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

# Execute Python code (use with caution!)
python_repl = PythonREPLTool()

# Shell commands (production: restrict to safe commands)
shell = ShellTool()

# File operations
file_reader = ReadFileTool()

tools = [wikipedia, python_repl, shell, file_reader]

Production Warning: PythonREPLTool and ShellTool execute arbitrary code. Use them only in sandboxed environments or with strict input validation.
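
One mitigation is to wrap command execution in your own allow-listed tool rather than exposing the raw ShellTool. A minimal sketch (the allow-list and tool name are illustrative, not part of LangChain):

import shlex
import subprocess
from langchain.tools import Tool

# Hypothetical allow-list: adjust to the commands your agent actually needs
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

def safe_shell(command: str) -> str:
    """Run a shell command only if its executable is on the allow-list."""
    parts = shlex.split(command)
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        return "Error: command not permitted"
    try:
        completed = subprocess.run(parts, capture_output=True, text=True, timeout=10)
        return completed.stdout or completed.stderr
    except Exception as e:
        return f"Error: {e}"

restricted_shell = Tool(
    name="RestrictedShell",
    func=safe_shell,
    description="Run a read-only shell command. Only ls, cat, and grep are allowed."
)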

Creating Custom Tools

For production systems, you’ll need custom tools that integrate with your business logic.

from langchain.tools import StructuredTool
from pydantic import BaseModel, Field
from typing import List
import requests

# 1. Define input schema with Pydantic
class CustomerLookup(BaseModel):
    customer_id: str = Field(description="The unique customer ID")
    include_orders: bool = Field(
        default=False,
        description="Whether to include order history"
    )

# 2. Implement the tool function
def lookup_customer(customer_id: str, include_orders: bool = False) -> dict:
    """
    Query customer database and return customer details.
    Production: Replace with actual database call.
    """
    # Simulated API call
    response = requests.get(
        f"https://api.example.com/customers/{customer_id}",
        params={"include_orders": include_orders}
    )

    if response.status_code == 200:
        return response.json()
    else:
        return {"error": f"Customer {customer_id} not found"}

# 3. Create the tool with structured schema
customer_tool = StructuredTool.from_function(
    func=lookup_customer,
    name="CustomerLookup",
    description="Retrieve customer information from the CRM system. Use this when you need details about a specific customer.",
    args_schema=CustomerLookup
)

# 4. Use in an agent
tools = [customer_tool]
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

response = agent_executor.invoke({
    "input": "Find details for customer ID 12345 including their order history"
})

Best Practices:

  • Descriptive names: Help the LLM understand when to use the tool
  • Clear descriptions: Explain what the tool does and when to use it
  • Type safety: Use Pydantic schemas for complex inputs
  • Error handling: Return meaningful error messages, not exceptions

LangChain Memory: Building Stateful Agents

LLMs are stateless—they don’t remember previous interactions. Memory systems solve this by storing and retrieving conversation history.

Memory Types

| Memory Type | Use Case | Retention | Storage |
|---|---|---|---|
| ConversationBufferMemory | Short chats | All messages | In-memory |
| ConversationBufferWindowMemory | Limit context | Last N messages | In-memory |
| ConversationSummaryMemory | Long conversations | Summarized | LLM-compressed |
| VectorStoreMemory | Semantic retrieval | Relevant context | Vector DB |
| EntityMemory | Track facts about entities | Structured facts | Dictionary |

Implementing Conversation Memory

from langchain.memory import ConversationBufferMemory
from langchain.agents import initialize_agent, AgentType

# 1. Create memory that stores chat history
memory = ConversationBufferMemory(
    memory_key="chat_history",  # Key for prompt template
    return_messages=True  # Return as ChatMessage objects
)

# 2. Initialize agent with memory
agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(model="gpt-4o", temperature=0),
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True
)

# 3. Conversation with context
agent.invoke({"input": "My name is Alice and I work at TechCorp."})
agent.invoke({"input": "What's my name?"})  # Agent remembers: "Alice"
agent.invoke({"input": "Where do I work?"})  # Agent remembers: "TechCorp"

Window Memory for Long Conversations

To prevent exceeding context limits, use sliding window memory:

from langchain.memory import ConversationBufferWindowMemory

# Only keep last 5 message pairs (10 messages total)
memory = ConversationBufferWindowMemory(
    k=5,  # Number of exchanges to remember
    memory_key="chat_history",
    return_messages=True
)

Summary Memory for Token Efficiency

For very long conversations, summarize old messages to save tokens:

from langchain.memory import ConversationSummaryMemory

memory = ConversationSummaryMemory(
    llm=ChatOpenAI(model="gpt-4o-mini"),  # Use cheaper model for summaries
    memory_key="chat_history",
    return_messages=True
)

# As conversation grows, old messages are summarized:
# "User discussed Q4 sales targets and marketing budget constraints."

Retrieval-Augmented Generation (RAG) with LangChain

RAG grounds LLM responses in your proprietary data. Instead of relying on the model’s training data, you retrieve relevant documents and inject them into the prompt.

RAG Architecture

User Query → Embedding → Vector Search → Retrieve Docs → LLM + Context → Response

Building a Production RAG System

from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import DirectoryLoader, TextLoader

# 1. Load documents
loader = DirectoryLoader(
    "./docs",
    glob="**/*.md",
    loader_cls=TextLoader
)
documents = loader.load()

# 2. Split into chunks (critical for retrieval quality)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # Characters per chunk
    chunk_overlap=200,  # Overlap to preserve context
    separators=["\n\n", "\n", " ", ""]  # Split on paragraphs, then sentences
)
chunks = text_splitter.split_documents(documents)

# 3. Create embeddings and vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db"  # Persist to disk
)

# 4. Create retrieval chain
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4}  # Retrieve top 4 chunks
)

qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o", temperature=0),
    chain_type="stuff",  # "stuff" = inject all docs into prompt
    retriever=retriever,
    return_source_documents=True  # Include sources in response
)

# 5. Query the knowledge base
result = qa_chain.invoke({"query": "How do I configure authentication?"})

print(result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print(f"- {doc.metadata['source']}")

Advanced RAG: Multi-Query Retrieval

Generate multiple query variations to improve recall:

from langchain.retrievers import MultiQueryRetriever

# Automatically generates 3 variations of the user query
retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=ChatOpenAI(model="gpt-4o-mini")
)

# User asks: "How do I deploy?"
# LLM generates:
# 1. "What are the deployment steps?"
# 2. "How to configure production deployment?"
# 3. "Deployment guide and instructions"
# → Retrieves results for all 3, deduplicates

RAG with Re-Ranking

Improve relevance by re-scoring retrieved documents:

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

# 1. Initial retrieval (fast, may include irrelevant docs)
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# 2. Re-rank with LLM (slow, but accurate)
compressor = LLMChainExtractor.from_llm(ChatOpenAI(model="gpt-4o-mini"))
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=base_retriever
)

# Retrieves 10 chunks, filters to most relevant 3-4

LangGraph: Building Multi-Agent Systems

LangGraph is LangChain’s framework for building stateful, cyclic workflows. Unlike linear chains, LangGraph supports loops, conditionals, and multi-agent collaboration.

LangGraph Core Concepts

  • Nodes: Functions that process state
  • Edges: Transitions between nodes
  • State: Shared data passed through the graph
  • Conditional edges: Dynamic routing based on state

Creating a Research Agent with LangGraph

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
import operator

# 1. Define state (shared across all nodes)
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]  # reducer appends new messages
    research_results: str
    should_continue: bool

# 2. Define nodes (agent actions) - each returns a partial state update
def researcher(state: AgentState) -> dict:
    """Research the topic using search tools."""
    query = state["messages"][-1].content
    # Reuse the search tool and llm defined in the earlier sections
    search_agent = initialize_agent(
        tools=[search_tool], llm=llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
    )
    result = search_agent.invoke({"input": f"Research: {query}"})

    return {"research_results": result["output"], "should_continue": True}

def writer(state: AgentState) -> dict:
    """Write a report based on research."""
    research = state["research_results"]

    prompt = f"Write a comprehensive report based on this research:\n\n{research}"
    response = llm.invoke(prompt)

    return {"messages": [response], "should_continue": False}

def reviewer(state: AgentState) -> dict:
    """Review the report quality."""
    report = state["messages"][-1].content

    prompt = (
        f"Review this report for accuracy and completeness:\n\n{report}\n\n"
        "Is it ready to publish? Reply 'APPROVED' or 'NEEDS_REVISION'"
    )
    review = llm.invoke(prompt).content

    if "APPROVED" in review:
        return {"should_continue": False}
    return {
        "messages": [AIMessage(content=f"Revision needed: {review}")],
        "should_continue": True,
    }

# 3. Build the graph
workflow = StateGraph(AgentState)

# Add nodes
workflow.add_node("researcher", researcher)
workflow.add_node("writer", writer)
workflow.add_node("reviewer", reviewer)

# Add edges
workflow.set_entry_point("researcher")
workflow.add_edge("researcher", "writer")
workflow.add_edge("writer", "reviewer")

# Conditional edge: loop if revision needed
workflow.add_conditional_edges(
    "reviewer",
    lambda state: "writer" if state["should_continue"] else END
)

# 4. Compile and execute
app = workflow.compile()

result = app.invoke({
    "messages": [HumanMessage(content="Research the impact of AI on healthcare")],
    "research_results": "",
    "should_continue": True
})

print(result["messages"][-1].content)

Flow:

  1. Researcher gathers information
  2. Writer creates a report
  3. Reviewer checks quality
  4. If approved → END
  5. If needs revision → loop back to Writer

Human-in-the-Loop with LangGraph

Add manual approval steps:

from langgraph.checkpoint.memory import MemorySaver

# Add checkpointing so graph state survives between runs
memory = MemorySaver()

# Pause execution just before the reviewer node runs
app = workflow.compile(checkpointer=memory, interrupt_before=["reviewer"])

# Each thread_id keeps its own checkpointed conversation state
config = {"configurable": {"thread_id": "1"}}

input_data = {
    "messages": [HumanMessage(content="Research the impact of AI on healthcare")],
    "research_results": "",
    "should_continue": True,
}

# Run until the interrupt fires, then ask a human before resuming
for output in app.stream(input_data, config):
    print(output)

if input("Approve? (yes/no): ").lower() == "yes":
    # Passing None resumes the graph from the saved checkpoint
    for output in app.stream(None, config):
        print(output)

Production LangChain: Best Practices

1. Error Handling and Retries

# Fallback to a cheaper model if the primary model fails
primary_chain = prompt | ChatOpenAI(model="gpt-4o")
fallback_chain = prompt | ChatOpenAI(model="gpt-4o-mini")

chain_with_fallback = primary_chain.with_fallbacks([fallback_chain])

# Automatic retry with exponential backoff
chain_with_retry = chain.with_retry(
    stop_after_attempt=3,
    wait_exponential_jitter=True
)

2. Streaming Responses

For better UX, stream LLM outputs token-by-token:

for chunk in chain.stream({"question": "Explain quantum computing"}):
    print(chunk, end="", flush=True)

3. Batch Processing

Process multiple inputs efficiently:

questions = [
    {"question": "What is Python?"},
    {"question": "What is Java?"},
    {"question": "What is JavaScript?"}
]

# Parallel execution
results = chain.batch(questions)

4. Observability with LangSmith

import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"

# All chains automatically log to LangSmith
# View traces at: https://smith.langchain.com

5. Prompt Management

from langchain.prompts import load_prompt

# Store prompts in JSON/YAML files
prompt = load_prompt("prompts/customer_support.json")

# Version control your prompts
# Track performance of different prompt versions in LangSmith
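
For example, you can define a prompt once, save it to disk, and load it back anywhere (the path and template here are illustrative):

from langchain_core.prompts import PromptTemplate

# Define a prompt once and persist it as JSON
support_prompt = PromptTemplate.from_template(
    "You are a support agent for {product}. Answer the question: {question}"
)
support_prompt.save("prompts/customer_support.json")

# Later (or in another service), load it back
reloaded = load_prompt("prompts/customer_support.json")
print(reloaded.format(product="Acme CRM", question="How do I reset my password?"))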

Conclusion: Building Production AI with LangChain

LangChain has evolved from a simple prompt wrapper to a comprehensive ecosystem for building production AI systems. Key takeaways:

  1. Start with chains, graduate to agents: Use simple chains for predictable workflows, agents for autonomous tasks
  2. Tools are critical: The value of agents comes from tool integration—invest in building robust custom tools
  3. Memory matters: Conversational agents need memory; choose the right type for your use case
  4. RAG is essential: For enterprise AI, RAG grounds responses in your data and reduces hallucinations
  5. Use LangGraph for complexity: Multi-step reasoning, human-in-the-loop, and multi-agent systems require LangGraph
  6. Production patterns:
    • Streaming for UX
    • Fallbacks for reliability
    • LangSmith for observability
    • Structured outputs with Pydantic
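
The last point, structured outputs, is worth a quick illustration. A minimal sketch using with_structured_output (the SupportTicket schema is hypothetical):

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

# Hypothetical schema: the model's reply is parsed and validated into this object
class SupportTicket(BaseModel):
    summary: str = Field(description="One-sentence summary of the issue")
    severity: str = Field(description="low, medium, or high")
    needs_human: bool = Field(description="Whether a human agent should follow up")

llm = ChatOpenAI(model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(SupportTicket)

ticket = structured_llm.invoke("My invoice export has been failing since yesterday.")
print(ticket.severity, ticket.needs_human)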

The future of LangChain includes:

  • LangGraph Studio: Visual graph builder
  • LangServe: Deploy chains as REST APIs
  • Deeper integrations: More pre-built tools and vector stores

LangChain is the Python equivalent of Semantic Kernel (.NET) and provides the most mature tooling for AI agents in the Python ecosystem.


Frequently Asked Questions (FAQ)

What is LangChain used for?

LangChain is used to build AI agents and applications powered by large language models (LLMs). It provides tools for prompt engineering, tool calling, memory management, RAG (Retrieval-Augmented Generation), and multi-agent workflows in Python.

Is LangChain free to use?

Yes, LangChain is open-source and free under the MIT license. However, you’ll need API keys for LLM providers (OpenAI, Anthropic, etc.) which have their own pricing. You can also use free local models with LangChain.

What’s the difference between LangChain and LangGraph?

LangChain provides linear chains and basic agents. LangGraph enables cyclic workflows with loops, conditionals, and multi-agent collaboration. Use LangGraph for complex, stateful systems that need iterative refinement.

Can I use LangChain with local LLMs?

Yes! LangChain supports Ollama, Hugging Face models, LlamaCpp, and other local LLM providers. You’re not locked into paid APIs like OpenAI.
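
For example, a local model served by Ollama can be swapped in wherever ChatOpenAI is used (this sketch assumes a running Ollama server with the llama3 model pulled):

from langchain_community.chat_models import ChatOllama

# Drop-in replacement for ChatOpenAI, pointing at a local Ollama server
local_llm = ChatOllama(model="llama3", temperature=0)
print(local_llm.invoke("Summarize what LangChain does in one sentence.").content)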

How does LangChain RAG work?

LangChain RAG:

  1. Splits documents into chunks
  2. Converts chunks to vector embeddings
  3. Stores in a vector database (Chroma, Pinecone, Weaviate)
  4. At query time, retrieves relevant chunks
  5. Injects chunks into the LLM prompt as context

What is the difference between LangChain and Semantic Kernel?

LangChain is Python-first with a massive ecosystem of integrations. Semantic Kernel is .NET-focused with strong typing and enterprise patterns. LangChain has more community tools; Semantic Kernel has better Azure integration.

How do I debug LangChain agents?

Enable verbose mode (verbose=True) to see agent reasoning steps. Use LangSmith for detailed tracing, including token usage, latency, and errors. Add logging to custom tools.
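
For the last point, a lightweight pattern is to log inputs and outputs inside the tool function itself (the names here are illustrative):

import logging
from langchain.tools import Tool

logger = logging.getLogger("agent.tools")

def lookup_order(order_id: str) -> str:
    """Illustrative tool that logs its inputs and results."""
    logger.info("lookup_order called with order_id=%s", order_id)
    result = f"Order {order_id}: shipped"  # replace with a real lookup
    logger.info("lookup_order returning: %s", result)
    return result

order_tool = Tool(
    name="OrderLookup",
    func=lookup_order,
    description="Look up the status of an order by its ID."
)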


Next Steps: Master LangChain

Coming Next in the Tools We Use Series:

  • AutoGen: Microsoft’s Multi-Agent Framework
  • CrewAI: Role-Based Multi-Agent Systems
  • LlamaIndex: Advanced RAG and Knowledge Graphs