
Building AI Agents with Multimodal Models
NVIDIA Deep Learning Institute (DLI) · NVIDIA · Updated March 2026
Platform rating: 4.6/5
Champ rating: 8.6/10
Duration: 8 hours, instructor-led
Classes: 24
Learn to build powerful AI agents using multimodal models that combine text, image, and video understanding for complex reasoning tasks.
Prerequisites & pricing
Prerequisites: Python and deep learning experience
Pricing: Contact for pricing
Certification: Certificate
Alternatives to Building AI Agents with Multimodal Models

AI Agents Course
Hugging Face · Hugging Face
Free interactive course on agent fundamentals, frameworks, real-world assignments, and benchmark challenges with optional certification.

Model Context Protocol (MCP) Course
Hugging Face · Hugging Face
Free MCP course, offered in collaboration with Anthropic, covering protocol architecture, SDKs, end-to-end apps, and deployment-oriented use cases.

Level Up Your AI Agent Skills
Databricks Academy · Databricks
Free 90-minute AI agent fundamentals training with four videos, industry use cases, and badge-based assessment.