OpenAI's flagship multimodal model, with strong reasoning, tool use, and support for agentic workflows.
Specifications
- Provider: OpenAI
- Type: Vendor / proprietary
- Modality: Text + Vision
- Category: Multimodal model
- Context window: 256K
- License: Proprietary (API)
- Released: August 7, 2025
What it was trained for
It was built as a general-purpose multimodal system that handles text and images and adapts its reasoning depth to the difficulty of the task.
Best for
- Complex multi-step reasoning and analysis
- Agentic workflows and tool orchestration
- Software engineering and code generation
- Document and image understanding
- Drafting and long-form writing
- General-purpose assistant and chat applications
Capabilities
- Vision input
- Function calling / tools
- Structured outputs
- Adjustable reasoning effort
- Long context
Performance & positioning
Positioned as OpenAI's most capable general model: strong across reasoning, coding, and multimodal understanding, with a tunable speed-versus-depth tradeoff.