Step 3.7 Flash on NVIDIA GPUs using enterprise-ready multimodal AI

AI applications are moving beyond text generation to multimodal systems that can recognize, search, and reason about images, documents, videos, and language in real time, turning fragmented information into actionable insights.

The latest version of StepFun, Step 3.7 Flash, brings these capabilities to production and enterprise scale and is available on NVIDIA accelerated infrastructure. It is a 198B parameter expert mixed vision language model with approximately 11 billion enabling parameters per forward pass, optimized for agent workflows that combine perception, search, and multi-step inference at production scale.

With native image and video inputs, three configurable inference levels (low, medium, and high), and a 256k context window, it is designed for enterprise use cases such as financial analysis, concurrent coding agents, and other high-throughput multimodal use cases. Developers can use StepFun’s NVFP4 quantized checkpoints, available through Hugging Face, to enhance inference with reduced memory bandwidth and storage requirements.

model	Step 3.7 Flash
total parameters	198B
visual encoder parameters	1.8B
active parameters	11B
context length	256K
expert	288 (active 8)

Table 1. Summary of flash specifications for major step 3.7 (number of parameters, context length, MoE configuration, etc.)

Step 3.7 Flash is deployed using open source frameworks such as SGLang, NVIDIA TensorRT-LLM, and vLLM to take advantage of kernels optimized for NVIDIA hardware.

Build with NVIDIA endpoints

Najlepszy kod polecajacy Binance commented on Insights from Nabil Batawi, Group CHRO, Alkhorayef Group, KSA, ETHRWorldME: Your point of view caught my eye and was very inte
Parker Robinson commented on AI platform Hugging Face says hackers have stolen authentication tokens from Spaces: Bitcoin Mining for Passive Income in 2026 https://
100 USDT commented on How to Make AI Work for You, at Work: Thanks for sharing. I read many of your blog posts
创建Binance账户 commented on AI jobs in financial services: $350k for junior hires: Your article helped me a lot, is there any more re
1win commented on Do AI apps really need a GPU or NPU?: Saved as a favorite, I really like your website!

Step 3.7 Flash on NVIDIA GPUs using enterprise-ready multimodal AI

Build with NVIDIA endpoints

Production-ready deployment with NVIDIA NIM

Day 0 fine-tuning using the NVIDIA NeMo framework

RECENT POSTS

RSI is the new AGI, but it’s just as difficult to identify

Are you using AI in the pit? Here are five safe apps for your toolbelt.

You can now ask AI to create your video feed on YouTube

Build with NVIDIA endpoints

Production-ready deployment with NVIDIA NIM

Day 0 fine-tuning using the NVIDIA NeMo framework

Related Posts