
Engineering An AI Agent To Navigate Large-scale Event Data - Part 2

December 11, 2025
Ayesha Imran

Part 2: From Query Patterns to Intelligent Tools & Agent Design

A simple search application can take in keywords, find exact matches, and return results. It cannot, however, reliably and accurately decipher natural language queries that require semantic understanding of the user's intent. The benefit of an agentic search application is that the AI agent can fully decipher user queries, breaking them down into multiple parts (“Which talks from 2024 discuss LLM deployment in prod?” → filter: 2024, semantic search: LLM deployment in prod) and using one or more of the tools at its disposal to fetch relevant data, reason over it, and answer the query. It follows that an AI agent is only as good as the tools it wields. Part 1 of this series established the data foundation: a graph schema designed for LLM query decomposition, multimodal embeddings connected to source entities, and query patterns that translate natural language into database operations. That foundation was built with a specific goal - enabling an AI agent to autonomously navigate complex user queries about MLOps conference content, which is fairly representative of how users bring a mix of modalities to their everyday search use cases.

This post completes the journey. The query patterns tested in Part 1 now become parameterized tools, and a ReAct agent learns to select and compose them based on user intent. The focus here is on tool design philosophy, the consolidation of dozens of query patterns into seven comprehensive tools, and the few-shot prompt engineering that teaches the agent effective tool usage. The full application code, including the FastAPI backend and Next.js frontend, is available in the GitHub repository. You can also try out the agent yourself and share feedback with us.

Dataset: MLOps Conference Talks from 2022-2024 (.csv)
Embedding model: EmbeddingGemma by Google
Frameworks: Python FastAPI for backend, Next.js for frontend
Platform: Netlify for frontend deployment, Render for backend deployment
Metadata, text and vector storage: ApertureDB
Event agent: try out the agent
GitHub Repository: Repo 

The Application Flow

The diagram shows the execution flow of the agentic application. The user prompts the AI agent with a natural language query. The agent then iterates in a ReAct (reasoning + acting) loop: it reasons, or “thinks”, about which tool best matches the query requirements and which parameters the tool should be called with. Once the agent has reached a decision, it calls one or more tools (multiple tools can be called in parallel thanks to LangGraph). The tools execute ApertureDB queries under the hood, parse the raw results to strip away unnecessary details (such as raw ApertureDB logs), and return the cleaned results to the agent. The agent uses the retrieved results to decide the next course of action - either calling a tool again or generating the final answer. The last tool call’s top 10 retrieved results (if they contain Talk items) are also returned and displayed in the UI so the user can navigate to them directly. One detail worth emphasizing is the sanitization of returned DB results: pass back to the LLM only the information it needs rather than overloading its context window with unnecessary material. This kind of context engineering requires a lot of nuance and curation, but it heavily affects the quality of agentic applications.
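As an illustration of that sanitization step, a tool can whittle a raw ApertureDB response down to a handful of fields before handing it back to the LLM. The sketch below is a minimal example; the field names mirror the Part 1 schema, but the exact subset the deployed tools keep may differ.

def sanitize_talk_results(raw_entities, keep=("talk_title", "speaker_name",
                                              "yt_views", "category_primary",
                                              "yt_published_at")):
    """Strip raw ApertureDB entity results down to the fields the LLM needs.

    Field names follow the Part 1 schema; the subset kept by the deployed
    tools may differ slightly.
    """
    return [{k: e[k] for k in keep if k in e} for e in raw_entities]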

The Philosophy of Tool Design

The critical insight that shaped this implementation is that tool design matters more than model selection. A powerful LLM with poorly designed tools produces unreliable results; a well-designed toolset enables even modest models to deliver consistent, high-quality outputs. Drawing on best practices in context engineering for AI agents outlined by Anthropic, LangChain, and Manus, we followed five design principles during tool development.

First, tools must be self-contained. Each tool handles a complete workflow - query construction, database execution, result formatting - without requiring the agent to coordinate multiple steps. This dramatically reduces the reasoning burden on the LLM and minimizes failure modes.

Second, tools must be parameterized. Rather than creating separate tools for "filter by date" and "filter by views," a single tool accepts multiple optional parameters that the agent can combine as needed. Pydantic schemas provide both type validation and documentation that the LLM reads to understand parameter usage.

Third, tools must be well-documented. The docstrings serve dual purposes: they explain functionality to developers and provide instructions to the LLM during tool selection. Comprehensive descriptions of when to use each tool and what each parameter means directly improve agent accuracy.

Fourth, tools must produce structured outputs. Consistent JSON response formats reduce parsing errors and enable the agent to reason about results predictably. Every tool returns a success flag, meaningful error messages on failure, and structured data that maps cleanly to natural language responses.

Fifth, tools must degrade gracefully. Database queries can fail, speakers might not exist, searches might return no results. Each tool handles these cases explicitly, returning informative messages rather than crashing. The agent can then communicate failures naturally to users or retry another tool or the same tool with different parameters (tried and tested!).
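To make the last three principles concrete, here is a minimal sketch of the kind of response envelope and error handling each tool follows. The find_talks_for_speaker helper is a hypothetical stand-in for the actual database call, and the repo's exact keys may differ slightly.

def tool_result(success, data=None, error=None):
    """Uniform, structured envelope every tool returns to the agent."""
    return {"success": success, "results": data or [], "error": error}


def analyze_speaker_safely(speaker_name):
    try:
        # find_talks_for_speaker is a hypothetical helper wrapping the DB query
        talks = find_talks_for_speaker(speaker_name)
        if not talks:
            return tool_result(False, error=f"No speaker named '{speaker_name}' was found.")
        return tool_result(True, data=talks)
    except Exception as exc:
        return tool_result(False, error=f"Database query failed: {exc}")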

From Query Patterns to Curated Tools

Part 1 tested dozens of ApertureDB query patterns: metadata filtering with sorting, semantic search across multiple descriptor sets, graph traversals from Person to Talk entities, constrained semantic search that filters before vector similarity, and aggregation queries for analytics. The observation that drove tool consolidation: many of these queries share common structures with variable parameters.

The filtering queries - by date, by views, by category, by speaker - all reduce to a single FindEntity operation with dynamic constraints. The semantic searches across transcripts, abstracts, and speaker bios all follow the same FindDescriptor → FindEntity pattern with different descriptor sets. During this consolidation process, we identified seven tools that cover the full query space, each handling multiple related use cases through parameterization. A few of the core tools are covered briefly in the remainder of this article; you can find the complete code for all the tools in the GitHub repo.

Here’s a refresher for the schema that we curated and discussed in Part 1 of the series:

  • Entities:
    • Talk (core): typed properties for filters (views:int, date, tech_level, company, track, keywords, abstract, outcomes, youtube_id). Deterministic UUIDs.
    • Person: normalized speakers.
    • TranscriptChunk: timestamped 10-segment chunks for semantic search.
  • Connections:
    • TalkHasSpeaker (speaker lookup + multi-hop reasoning).
    • TalkHasTranscriptChunk (constrained semantic search).
    • TalkHasMeta (talk-level embedding link).
  • Descriptor Sets:
    • Transcript chunks (~17k vectors), talk metadata (~280), speaker bios (~263).
  • Key design principles: typed fields for agent filters, graph modeling for traversal logic, chunk embeddings for precise semantic search, deterministic ingestion for reproducibility.

Our database schema curation and query testing laid the foundation for these comprehensive tools, and ApertureDB's versatile query language is what made that schema curation straightforward in the first place.

Tool 1: search_talks_by_filters is heavily metadata-based, handling all queries that don't require semantic understanding. The Pydantic schema illustrates how parameters become self-documenting:

from typing import Literal, Optional

from pydantic import BaseModel, Field

class SearchTalksByFiltersInput(BaseModel):
    date_from: Optional[str] = Field(
        None, 
        description="Filter talks published from this date (format: YYYY-MM-DD, YYYY-MM, or YYYY)"
    )
    min_views: Optional[int] = Field(
        None,
        description="Minimum YouTube view count required. Example: 1000 for talks with at least 1K views"
    )
    company_name: Optional[str] = Field(
        None,
        description="Filter by speaker's company name. Example: 'Google', 'Microsoft', 'OpenAI'"
    )
    sort_by: Optional[Literal["date", "views", "title", "tech_level"]] = Field(
        "date",
        description="Sort results by: 'date', 'views', 'title', 'tech_level'"
    )
    limit: Optional[int] = Field(10, description="Maximum number of results to return")

The description fields are read by the LLM during tool selection - they're not just documentation, they're instructions. When a user asks "Show me talks from Google with over 1000 views," the agent reads these descriptions and constructs the appropriate parameter combination. The tool internally builds ApertureDB constraints dynamically based on which parameters are provided.
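As a rough sketch of that dynamic construction (property names follow the FindEntity examples later in this post; the repo's implementation may normalize partial dates and handle more parameters):

def build_constraints(params: SearchTalksByFiltersInput) -> dict:
    """Translate only the parameters the agent supplied into ApertureDB constraints."""
    constraints = {}
    if params.date_from:
        # Partial dates like "2024" would need expanding to a full date first
        constraints["yt_published_at"] = [">=", {"_date": params.date_from}]
    if params.min_views is not None:
        constraints["yt_views"] = [">=", params.min_views]
    if params.company_name:
        constraints["company"] = ["==", params.company_name]
    return constraints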

Tool 2: search_talks_semantically handles natural language queries by searching across multiple embedding spaces. The key design decision: search all three descriptor sets (transcripts, abstracts, speaker bios) and merge results ranked by similarity. This ensures comprehensive coverage regardless of where relevant content appears.

sets_to_search = [
    (SET_TRANSCRIPT, "TalkHasTranscriptChunk", "transcript"),
    (SET_META, "TalkHasMeta", "abstract/metadata"),
    (SET_BIO, "TalkHasSpeakerBio", "speaker bio")
]

The tool also supports constrained semantic search - a pattern that exemplifies ApertureDB's unified architecture advantage. When a user asks "Find talks about RAG from 2024," the tool first filters Talk entities by date, then searches only within those talks' connected embeddings. This happens in a single atomic query:

q = [
    # Step 1: filter Talk entities by publication date and hold a reference to them
    {"FindEntity": {
        "_ref": 1, "with_class": "Talk",
        "constraints": {"yt_published_at": [">=", {"_date": "2024-01-01"}]}
    }},
    # Step 2: vector search restricted to transcript chunks connected to those talks
    {"FindDescriptor": {
        "set": "ds_transcript_chunks_v1",
        "is_connected_to": {"ref": 1, "connection_class": "TalkHasTranscriptChunk"},
        "k_neighbors": 10, "distances": True
    }}
]

With separate vector and relational databases, this constrained search pattern requires complex orchestration - multiple round trips, application-level joins, and consistency challenges to manage. ApertureDB's connected embeddings make it a single query.

Tool 3: analyze_speaker_activity leverages the graph structure established in Part 1. The separate Person entities connected via TalkHasSpeaker edges enable powerful speaker-centric analytics. For individual speaker analysis, the tool traverses from Person to all connected Talks:

q = [
    {"FindEntity": {
        "_ref": 1, "with_class": "Person",
        "constraints": {"name": ["==", speaker_name]}
    }},
    {"FindEntity": {
        "with_class": "Talk",
        "is_connected_to": {"ref": 1, "direction": "in", "connection_class": "TalkHasSpeaker"},
        "results": {"list": ["talk_title", "yt_views", "category_primary", "event_name"]}
    }}
]

For dataset-wide analysis, the tool aggregates all talks by speaker, calculates statistics (talk counts, total views, categories covered), and identifies repeat speakers. For users unfamiliar with the graph data model, the schema design from Part 1 might have seemed like overhead, but it is what efficiently supports queries that would otherwise require expensive string operations in a flat table structure (learn more about the value of graphs in this blog).
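The dataset-wide aggregation itself is plain Python over the returned Talk rows. A minimal sketch, assuming each returned row also carries the speaker's name alongside the properties listed in the query above:

from collections import defaultdict

def aggregate_speaker_stats(talk_rows):
    """Roll up per-speaker talk counts, total views, and categories covered."""
    stats = defaultdict(lambda: {"talk_count": 0, "total_views": 0, "categories": set()})
    for row in talk_rows:
        entry = stats[row["speaker_name"]]
        entry["talk_count"] += 1
        entry["total_views"] += row.get("yt_views", 0)
        if row.get("category_primary"):
            entry["categories"].add(row["category_primary"])
    # Repeat speakers are simply the entries with more than one talk
    return sorted(stats.items(), key=lambda kv: kv[1]["talk_count"], reverse=True)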

Tools 4-7 complete the suite. get_talk_details provides deep-dive capability with optional transcript chunks filtered by time range and semantically similar talks. find_similar_content handles recommendations using content-based, speaker-based, or topic-based similarity. analyze_topics_and_trends extracts tool mentions, technology references, and keyword frequencies using pattern matching against curated lists. get_unique_values serves as a discovery tool, answering "What events exist?" or "What categories are available?" - essential for agents to understand the data landscape before filtering.
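For reference, each of the seven tools is exposed to the agent as a LangChain tool whose argument schema is the Pydantic model shown earlier. The sketch below uses the @tool decorator; the repo may wire things up slightly differently, and run_filter_query is a hypothetical helper that builds the constraints, executes the FindEntity query, and wraps the results in the structured envelope described above.

from langchain_core.tools import tool

@tool(args_schema=SearchTalksByFiltersInput)
def search_talks_by_filters(date_from=None, min_views=None, company_name=None,
                            sort_by="date", limit=10):
    """Filter and sort talks by metadata (dates, view counts, companies).
    Use for queries that do not require semantic understanding."""
    # run_filter_query is a hypothetical helper: it builds ApertureDB
    # constraints, runs FindEntity, and returns the success/error envelope.
    return run_filter_query(date_from, min_views, company_name, sort_by, limit)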

The Few-Shot Prompt

The system prompt spans over 500 lines and represents the accumulated wisdom from iterative testing. Compared to initial tests with a simpler system prompt, the agent's performance improved dramatically once detailed few-shot examples showing parameter selection and tool-chaining patterns were added. The prompt structure includes four components:

  1. Database schema explanation
  2. Tool descriptions with selection guidelines
  3. Fourteen few-shot examples
  4. Response formatting guidelines

The tool selection guidelines provide explicit mappings from query types to tools:

Filtering & Sorting → search_talks_by_filters
Topic-Based Search → search_talks_semantically  
Speaker Analysis → analyze_speaker_activity
Detailed Information → get_talk_details
Recommendations → find_similar_content
Trend Analysis → analyze_topics_and_trends
Discovery → get_unique_values

The few-shot examples demonstrate not just which tool to use, but how to construct parameters. Consider this example for expert finding:

### Example 4: Expert Finding (Tool 2)
**User Query**: "Find experts who talk about vector databases and RAG"
**Tool Call**:
search_talks_semantically.invoke({
    "query": "vector databases RAG retrieval augmented generation embeddings",
    "search_type": "bio",
    "limit": 8
})

The example teaches the agent that searching speaker bios (search_type: "bio") is the appropriate strategy for finding experts, and that expanding the query with related terms improves recall. Without such examples, agents often make suboptimal parameter choices - searching transcripts when bios would be more relevant, or using overly narrow queries. This comprehensive, well-curated prompt with sufficient few-shot examples of tool usage is what makes the AI agent's performance so reliable.

The ReAct Agent

The agent implementation uses LangChain's prebuilt LangGraph-based create_agent function with Gemini 2.5 Pro as the reasoning model. It follows the ReAct (Reasoning + Acting) pattern, which fits database exploration tasks naturally: the agent thinks about what information is needed, selects and executes a tool, observes results, and reasons about whether additional tools are required.

from langchain.agents import create_agent
from langchain_google_genai import ChatGoogleGenerativeAI


model = ChatGoogleGenerativeAI(
    model="gemini-2.5-pro",
    temperature=0.7,
    google_api_key=GOOGLE_API_KEY
)


tools = [
    search_talks_by_filters,
    search_talks_semantically,
    analyze_speaker_activity,
    get_talk_details,
    find_similar_content,
    analyze_topics_and_trends,
    get_unique_values
]


agent = create_agent(model=model, tools=tools, system_prompt=system_prompt)
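Invoking the compiled agent follows the standard LangGraph messages interface. A minimal sketch (the query string here is just an example):

result = agent.invoke({
    "messages": [{"role": "user", "content": "Which 2024 talks discuss LLM deployment in prod?"}]
})
print(result["messages"][-1].content)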

The Agent in Action

We have deployed our agent application as a FastAPI backend with Server-Sent Events (SSE) for real-time streaming, paired with a Next.js frontend that visualizes the agent's reasoning process. The screenshots demonstrate the agent handling real queries with full transparency into its decision-making.
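A minimal sketch of the streaming endpoint, assuming the agent object from above; the route name and payload shape are illustrative rather than the repo's exact API:

import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/chat")
async def chat(query: str):
    async def event_stream():
        # Each LangGraph update (reasoning step, tool call, tool output)
        # is forwarded to the browser as a Server-Sent Event
        async for update in agent.astream(
            {"messages": [{"role": "user", "content": query}]},
            stream_mode="updates",
        ):
            yield f"data: {json.dumps(update, default=str)}\n\n"

    return StreamingResponse(event_stream(), media_type="text/event-stream")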

For a simple semantic search - "Which talks discuss AI agents with memory?" - the agent selects search_talks_semantically with an appropriate query, receives 10 results, and synthesizes a response explaining why each talk is relevant. The UI visualizes the chain of thought: "Agent is thinking... Decided to call the tool: search_talks_semantically(...) Tool output received: Found 10 results... Processing..." The retrieved content panel displays talk cards with thumbnails, speaker names, categories, view counts, and similarity scores, giving users both the agent's interpretation and direct access to source material.

More interesting is multi-tool chaining. When asked "Find talks on Model development, training, and architecture and also give info on the top speakers for these talks," the agent autonomously executes multiple tools in sequence: search_talks_by_filters to find relevant talks, analyze_speaker_activity to analyze the speakers presenting in this domain, and get_unique_values to discover available filtering options. The execution trace shows each tool call with its parameters, the success status, and processing indicators. The final response combines talk recommendations with speaker analytics identifying top contributors.

This autonomous tool composition is the payoff from careful tool design. Because each tool is self-contained and returns structured results, the agent can chain them without confusion. Because the few-shot examples demonstrate multi-tool patterns, the agent knows this composition is possible and appropriate.

Lessons Learned

Building this agent has led to several insights that generalize beyond this specific application:

  • Tool design determines agent capability more than model selection. The seven tools, with their comprehensive parameters and structured outputs, define what the agent can accomplish. Upgrading the LLM might improve response quality, but the tools define the possibility space. Time invested in tool refinement - better parameter descriptions, more informative error messages, richer structured outputs - pays dividends in agent reliability.
  • ApertureDB's unified architecture eliminated an entire category of complexity. Graph traversals, vector similarity search, and metadata filtering happen in single atomic queries. The constrained semantic search pattern - filter first, then search embeddings - otherwise requires significant orchestration with separate systems. For AI agent applications that need to combine structured queries with semantic understanding, having graph, vector, and blob storage in one transactional system is a significant advantage. The connected embeddings pattern, where each vector links back to its source entity, enables query patterns that simply aren't practical with separate databases.
  • Few-shot examples are not optional. The improvement from adding detailed examples to the system prompt was substantial and immediate. Examples teach the agent not just tool selection but parameter construction, query expansion strategies, and multi-tool patterns. The fourteen examples in the system prompt cover the full range of query types users might ask - from simple filtering to compound analytical requests.
  • Pydantic schemas serve as LLM documentation. The Field(description=...) annotations are read by the model during tool selection. Well-written descriptions directly improve parameter accuracy. This dual-purpose design - validation for developers, documentation for LLMs - is worth adopting broadly.

The completed system demonstrates what becomes possible when data architecture, tool design, and agent orchestration are aligned toward a common goal. The graph schema from Part 1 enables the traversals that power speaker analytics. The connected embeddings enable constrained semantic search. The parameterized tools expose these capabilities to an LLM that can reason about user intent and compose operations autonomously. Each layer builds on the previous, and the result is an intelligent search interface that handles complex, multi-faceted queries about MLOps conference content. Not only that, but a similar series of steps can easily be adapted for other datasets and domains.

The best part: you can test the application yourself and use the steps outlined, or the code, to build your own. The full source code is available on GitHub. ApertureDB documentation and API reference can be found at docs.aperturedata.io.

I would like to acknowledge the valuable feedback from Sonam Gupta (Telnyx) and Vishakha Gupta in creating this agent and blog.


Part 1 of this series


Ayesha Imran | Website | LinkedIn | Github

Software Engineer experienced in scalable RAG and Agentic AI systems, LLMOps, backend and cloud computing. Interested and involved in AI & NLP research.

