Try NVIDIA NIM APIs

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

nvidia parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

asr streaming speech-to-text multilingual nvidia nim nvidia

black-forest-labs FLUX.1-dev

FLUX.1 is a state-of-the-art suite of image generation models

image generation run on rtx text-to-image black-forest-labs

nvidia llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

math advanced reasoning instruction following function calling nvidia

meta llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual mixture-of-experts (MoE) model with 128 experts and 17B active parameters.

language generation image-to-text vision assistant visual question answering meta

meta llama-4-scout-17b-16e-instruct

A multimodal, multilingual mixture-of-experts (MoE) model with 16 experts and 17B active parameters.

language generation image-to-text vision assistant visual question answering meta

qwen qwq-32b

Powerful reasoning model whose thinking and reasoning capabilities significantly enhance performance in downstream tasks, especially hard problems.

coding chat math advanced reasoning qwen

nvidia cosmos-predict1-7b

Generates physics-aware video world states from text and image prompts for physical AI development.

synthetic data generation physical ai robotics text-to-world image-to-world nvidia

nvidia cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt for physical AI development.

synthetic data generation physical ai policy evaluation robotics video-to-world nvidia

nvidia sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles bev av stack automotive nvidia

nvidia bevformer

Advanced transformer for multi-frame bird's-eye-view 3D perception in autonomous driving.

autonomous vehicles bev automotive perception nvidia

nvidia llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

math advanced reasoning instruction following function calling nvidia

nvidia llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

math advanced reasoning instruction following function calling nvidia

nvidia magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

tts text-to-speech nvidia nim nvidia riva multilingual nvidia

nvidia nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

nemo retriever embedding retrieval augmented generation nvidia

deepseek-ai deepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

distillation coding run on rtx reasoning math deepseek-ai

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

colabfold msa-search

Generates a multiple sequence alignment from a query sequence and a protein sequence database search.

nim bionemo biology drug discovery protein folding colabfold

openfold openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

biology nim bionemo drug discovery protein folding openfold

google gemma-3-27b-it

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

vision assistant visual question answering language generation image-to-text google

google gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications.

translation chat text-to-text language generation google

nvidia nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

optical character recognition nemo retriever data ingestion table extraction supported language - english nvidia

deepseek-ai deepseek-r1-distill-qwen-32b

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding distillation reasoning math deepseek-ai

deepseek-ai deepseek-r1-distill-qwen-14b

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding distillation reasoning math deepseek-ai

deepseek-ai deepseek-r1-distill-qwen-7b

Distilled version of Qwen 2.5 7B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding distillation reasoning math deepseek-ai

microsoft phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.

code generation chat text-to-text language generation microsoft

microsoft phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

speech recognition visual qa language generation image-to-text chart and table understanding microsoft

arc evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

dna generation nim bionemo biology drug discovery arc

openai whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

asr ast speech-to-text batch whisper openai multilingual nvidia nim nvidia riva openai

nvidia canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr ast streaming speech-to-text batch spanish multilingual nvidia nim nvidia riva nvidia

nvidia canary-0.6b-turbo-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr ast fast speech-to-text batch multilingual nvidia nim nvidia riva nvidia

mistralai mistral-small-24b-instruct

Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.

code reasoning agent-centric multilingual mistralai

deepseek-ai deepseek-r1

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

chat math advanced reasoning deepseek-ai

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

dialogue safety llm safety guard model content safety nvidia

nvidia nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

llm security jailbreak detection prompt injection nvidia nim nvidia

nvidia llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

llm safety content moderation guard model content safety nvidia

igenius colosseum_355b_instruct_16k

Multilingual LLM trained on NVIDIA DGX Cloud, designed for mission-critical use cases in regulated industries including financial services, government, and heavy industry.

heavy industry government highly regulated use case support financial services igenius

tiiuae falcon3-7b-instruct

Instruction tuned LLM achieving SoTA performance on reasoning, math and general knowledge capabilities

coding code generation language generation improved reasoning math scientific knowledge tiiuae

igenius italia_10b_instruct_16k

Multilingual LLM with emphasis on European languages, supporting regulated use cases including financial services, government, and heavy industry.

heavy industry government highly regulated use case support financial services igenius

qwen qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and related tasks.

chinese language generation chat text-to-text large language models qwen

nvidia genmol

Fragment-Based Molecular Generation by Discrete Diffusion.

chemistry nim bionemo molecule generation drug discovery nvidia

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and creates informative responses.

vlm vision language model image caption image to text nvidia

qwen qwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

code completion code generation text-to-code qwen

qwen qwen2.5-coder-7b-instruct

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

code completion code generation text-to-code qwen

meta sam2

SAM 2 is a segmentation model that enables fast, precise selection of any object in any video or image.

meta computer vision segmentation video meta

writer palmyra-creative-122b

Powerful LLM designed for creative thinking and writing.

content generation chat text-to-text writer

nvidia llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

nemo retriever run on rtx embedding retrieval augmented generation text-to-embedding nvidia

nvidia llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

nemo retriever retrieval augmented generation reranking nvidia

nvidia usdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

openusd synthetic data generation digital twin code generation chat nvidia nim nvidia

meta llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

reasoning code generation text-to-text instruction following math meta

university-at-buffalo cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever chart element detection image-to-text university-at-buffalo

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection data ingestion chart detection nemo retriever table detection run on rtx extraction nvidia

baidu paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

optical character recognition table extraction optical character detection nemo retriever run on rtx data ingestion extraction baidu

nvidia audio2face-3d

Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.

speech-to-animation digital humans audio-to-face nvidia nim nvidia

nvidia conformer-ctc-asr

Automatic speech recognition model that transcribes speech in lower case English with record-setting accuracy and performance

asr streaming speech-to-text spanish nvidia nim nvidia riva nvidia

nvidia corrdiff

Generative downscaling model for generating high resolution regional scale weather fields.

ai weather prediction weather simulation earth-2 nvidia

nvidia fourcastnet

FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

weather simulation ai weather prediction climate science earth-2 nvidia

hive deepfake-image-detection

Advanced AI model that detects faces and identifies deepfake images.

computer vision ai safety deep fake detection content moderation hive

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

indic chat text-to-text language generation nvidia

ibm granite-guardian-3.0-8b

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior

guardrail text-to-text ibm

ibm granite-3.0-8b-instruct

Advanced Small Language Model supporting RAG, summarization, classification, code, and agentic AI

small language model chat text-to-text ibm

ibm granite-3.0-3b-a800m-instruct

Highly efficient Mixture of Experts model for RAG, summarization, entity extraction, and classification

small language model moe language generation text-to-text ibm

nvidia llama-3.1-nemotron-70b-instruct

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA in order to improve the helpfulness of LLM generated responses.

code generation chat text-to-text language generation nvidia

zyphra zamba2-7b-instruct

Efficient hybrid state-space model designed for conversational and reasoning tasks.

chat language generation text-to-text zyphra

institute-of-science-tokyo llama-3.1-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

sovereign ai large language model chat regional language generation institute-of-science-tokyo

institute-of-science-tokyo llama-3.1-swallow-8b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

sovereign ai large language model chat regional language generation institute-of-science-tokyo

nvidia studiovoice

Enhance speech by correcting common audio degradations to create studio quality speech output.

run on rtx nvidia maxine speech-to-speech digital human speech enhancement nvidia

nvidia mistral-nemo-minitron-8b-8k-instruct

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

small language model code generation chat text-to-text language generation nvidia

nvidia llama-3.1-nemotron-70b-reward

Leaderboard topping reward model supporting RLHF for better alignment with human preferences.

text-to-text reward model rlhf nvidia

meta llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

code generation chat text-to-text language generation meta

meta llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

image-text retrieval visual qa image-to-text image captioning visual grounding meta

meta llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

image-text retrieval visual qa image captioning image-to-text visual grounding meta

meta llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

code generation chat text-to-text language generation meta

nvidia llama-3.1-nemotron-51b-instruct

Unique language model that delivers an unmatched balance of accuracy and efficiency.

language generation chat text-to-text nvidia

qwen qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and related tasks.

chinese language generation chat text-to-text large language models qwen

abacusai dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

code generation text-to-text abacusai

deepmind alphafold2-multimer

Predicts the 3D structure of a protein from its amino acid sequence.

nim bionemo biology protein folding drug discovery deepmind

nvidia consistory

Generates consistent characters across a series of images without requiring additional training.

image generation text-to-image nvidia

nvidia vila

Multi-modal vision-language model that understands text, images, and video and creates informative responses.

vlm vision language model image caption image to text nvidia

hive ai-generated-image-detection

Robust image classification model for detecting and managing AI-generated content.

image classification computer vision ai safety content moderation hive

meta esm2-650m

Generates embeddings of proteins from their amino acid sequences.

nim protein embedding bionemo biology drug discovery meta

deepmind alphafold2

Predicts the 3D structure of a protein from its amino acid sequence.

nim bionemo biology protein folding drug discovery deepmind

yentinglin llama-3-taiwan-70b-instruct

Sovereign AI model finetuned on Traditional Mandarin and English data using the Llama-3 architecture.

regional language generation chat code generation large language models yentinglin

tokyotech-llm llama-3-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

large language model chat regional language generation tokyotech-llm

microsoft phi-3.5-vision-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

vision assistant visual question answering language generation image-to-text microsoft

ai21labs jamba-1.5-mini-instruct

Cutting-edge MOE based LLM designed to excel in a wide array of generative AI tasks.

chat language generation text-to-text ai21labs

ai21labs jamba-1.5-large-instruct

Cutting-edge MOE based LLM designed to excel in a wide array of generative AI tasks.

chat language generation text-to-text ai21labs

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat text-to-text language generation nvidia

nvidia mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

language generation text-to-text chat small language model nvidia

microsoft phi-3.5-moe-instruct

Advanced LLM based on a Mixture of Experts architecture to deliver compute-efficient content generation.

moe code generation chat text-to-text language generation microsoft

microsoft phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.

code generation chat text-to-text language generation large language models microsoft

nvidia nv-dinov2

NV-DINOv2 is a visual foundation model that generates vector embeddings for the input image.

image-to-embedding computer vision deepstream nvidia nim object classification nvidia

rakuten rakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat text-to-text language generation large language models rakuten

rakuten rakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat text-to-text language generation large language models rakuten

nvidia nv-grounding-dino

Grounding DINO is an open-vocabulary, zero-shot object detection model.

object detection computer vision deepstream nvidia nim nvidia

briaai BRIA-2.3

An enterprise-grade text-to-image model, trained on a compliant dataset, that produces high-quality images.

image generation text-to-image briaai

nvidia radtts-hifigan-tts

Natural, high-fidelity, English voices for personalizing text-to-speech services and voiceovers

text-to-speech nvidia nim nvidia

nvidia megatron-1b-nmt

Enable smooth global interactions in 36 languages.

text translation neural machine translation nvidia nim nvidia

nvidia fastpitch-hifigan-tts

Expressive and engaging English voices for Q&A assistants, brand ambassadors, and service robots

text-to-speech nvidia nim nvidia

nvidia parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

asr streaming english speech-to-text batch nvidia nim nvidia

nvidia parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

asr streaming english batch run on rtx speech-to-text fast nvidia nim nvidia

ipd proteinmpnn

ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.

biology nim bionemo drug discovery protein generation ipd

microsoft florence-2

Vision foundation model capable of performing diverse computer vision and vision language tasks.

image classification image object detection cv multimodal vision assistant vlm visual question answering computer vision language generation image-to-text text-to-image microsoft

writer palmyra-fin-70b-32k

Specialized LLM for financial analysis, reporting, and data processing

finance text-to-text writer

google shieldgemma-9b

Guardrail model to ensure that responses from LLMs are appropriate and safe

guardrail text-to-text google

google gemma-2-2b-it

Advanced small language generative AI model for edge applications

code generation chat text-to-text language generation google

nvidia usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

openusd synthetic data generation digital twin usd text-to-3d nvidia nim nvidia

GettyImages edify-image

Getty Images’ API service for 4K image generation. Trained on NVIDIA Edify using Getty Images' commercially safe creative libraries.

outpaint image generation replace image modification inpaint gettyimages

nvidia eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

telepresence nvidia maxine digital human nvidia

nvidia audio2face-2d

Create facial animations using a portrait photo and synchronize mouth movement with audio.

speech-to-animation telepresence nvidia maxine digital human nvidia

nvidia usdvalidate

Verify compatibility of OpenUSD assets with instant RTX render and rule-based validation.

validation openusd synthetic data generation digital twin usd visualization 3d nvidia

thudm chatglm3-6b

Supports Chinese and English languages to handle tasks including chatbot, content generation, coding, and translation.

chat text-to-text regional language generation thudm

mistralai mamba-codestral-7b-v0.1

Model for writing and interacting with code across a wide range of programming languages and tasks.

code completion code generation mistralai

baichuan-inc baichuan2-13b-chat

Supports Chinese and English chat, coding, math, instruction following, and quiz solving.

chinese language generation text translation chat text-to-text baichuan-inc

meta llama-3.1-405b-instruct

Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.

synthetic data generation chat code generation meta

meta llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

code generation chat text-to-text language generation meta

meta llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

run on rtx code generation chat text-to-text language generation meta

nv-mistralai mistral-nemo-12b-instruct

Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU.

run on rtx code generation chat language generation text-to-text nv-mistralai

nvidia nv-rerankqa-mistral-4b-v3

Multilingual text reranking model.

nemo retriever reranking retrieval augmented generation nvidia

nvidia nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

embedding retrieval augmented generation nemo retriever text-to-embedding nvidia

nvidia nv-embedqa-mistral-7b-v2

Multilingual text question-answering retrieval, transforming textual information into dense vector representations.

nemo retriever embedding retrieval augmented generation nvidia

nvidia maisi

MAISI is a pre-trained volumetric (3D) CT Latent Diffusion Generative Model.

image generation medical imaging nvidia nim nvidia

microsoft phi-3-medium-128k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

code generation chat text-to-text language generation large language models microsoft

bigcode starcoder2-7b

Advanced programming model for code completion, summarization, and generation

code completion code generation bigcode

bigcode starcoder2-15b

Advanced programming model for code completion, summarization, and generation

code completion code generation bigcode

google gemma-2-27b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

code generation chat text-to-text language generation google

google gemma-2-9b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat code generation text-to-text language generation google

nvidia llama3-chatqa-1.5-70b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text non-commercial use only chat nvidia

nvidia llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text non-commercial use only chat nvidia

nvidia nemotron-4-340b-reward

Grades responses on five attributes: helpfulness, correctness, coherence, complexity, and verbosity.

synthetic data generation text-to-text reward model nvidia

01-ai yi-large

Powerful model trained on English and Chinese for diverse tasks including chatbot and creative writing.

code generation chat text-to-text multilingual 01-ai

nvidia nemotron-4-340b-instruct

Creates diverse synthetic data that mimics the characteristics of real-world data.

synthetic data generation chat text-to-text nvidia

mistralai mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat text-to-text language generation mistralai

nvidia nvclip

NV-CLIP is a multimodal embeddings model for image and text.

computer vision multimodal embeddings text and image run on rtx nvidia nim nvidia

stabilityai stable-diffusion-3-medium

Advanced text-to-image model for generating high quality images

image generation text-to-image stabilityai

nvidia ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition respectively.

optical character recognition image optical character detection cv vlm computer vision tao toolkit video nvidia

writer palmyra-med-70b-32k

Leading LLM for accurate, contextually relevant responses in the medical domain.

text-to-text healthcare writer

writer palmyra-med-70b

Leading LLM for accurate, contextually relevant responses in the medical domain.

text-to-text healthcare writer

nvidia nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

non-commercial use only retrieval augmented generation text-to-embedding nvidia

upstage solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

non-commercial use only chat text-to-text language generation large language models upstage

baai bge-m3

Embedding model for text retrieval tasks, excelling in dense, multi-vector, and sparse retrieval.

embeddings retrieval augmented generation text-to-embedding baai

mediatek breeze-7b-instruct

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

chat text-to-text regional language generation mediatek

nvidia visual-changenet

Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask.

image image generation cv image segmentation vlm computer vision tao toolkit video nvidia nim nvidia

google codegemma-1.1-7b

Advanced programming model for code generation, completion, reasoning, and instruction following.

code generation code completion google

ibm granite-34b-code-instruct

Software programming LLM for code generation, completion, explanation, and multi-turn conversion.

code generation chat large language models text-to-code ibm

ibm granite-8b-code-instruct

Software programming LLM for code generation, completion, explanation, and multi-turn conversion.

code generation chat large language models text-to-code ibm

nvidia retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

object detection image cv vlm computer vision tao toolkit video nvidia nim nvidia

ipd rfdiffusion

A generative model of protein backbones for protein binder design.

biology nim bionemo drug discovery protein generation ipd

microsoft phi-3-small-8k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

code generation chat text-to-text language generation large language models microsoft

microsoft phi-3-small-128k-instruct

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

code generation chat text-to-text language generation large language models microsoft

microsoft phi-3-medium-4k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

code generation chat text-to-text language generation large language models microsoft

microsoft phi-3-vision-128k-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

image cv vision assistant vlm visual question answering computer vision language generation image-to-text video microsoft

google paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image cv vision assistant vlm visual question answering computer vision language generation image-to-text video google

aisingapore sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

chat text-to-text regional language generation large language models aisingapore

microsoft phi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

code generation chat text-to-text language generation large language models microsoft

databricks dbrx-instruct

A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG.

chat text-to-text language generation large language models databricks

snowflake arctic-embed-l

Optimized community model for text embedding.

nemo retriever embedding retrieval augmented generation text-to-embedding snowflake

microsoft phi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

code generation chat text-to-text language generation large language models microsoft

mistralai mixtral-8x22b-instruct-v0.1

A mixture-of-experts (MoE) LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning code generation chat text-to-text large language models mistralai

meta llama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

large language models code generation chat text-to-text language generation meta

meta llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

code generation chat text-to-text language generation large language models meta

google recurrentgemma-2b

Language model based on a novel recurrent architecture for faster inference when generating long sequences.

code generation chat text-to-text language generation google

google codegemma-7b

Cutting-edge model built on Google's Gemma-7B specialized for code generation and code completion.

code generation chat language generation text-to-code google

google gemma-2b

Lightweight language model deployable on a laptop, desktop, or the cloud for summarization and reasoning.

code generation chat text-to-text language generation google

nvidia embed-qa-4

GPU-accelerated generation of text embeddings used for question-answering retrieval.

embeddings retrieval augmented generation text-to-embedding nvidia

nvidia rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

ranking retrieval augmented generation nvidia

stabilityai stable-diffusion-xl

Generate images and stunning visuals with realistic aesthetics.

image generation text-to-image stabilityai

microsoft kosmos-2

Groundbreaking multimodal model designed to understand and reason about visual elements in images.

image cv multimodal vlm visual question answering computer vision image understanding image-to-text video microsoft

google deplot

Translate images of plots into tables with one-shot visual language understanding.

nemo retriever multimodal data ingestion image-to-text google

nvidia neva-22b

Multi-modal vision-language model that understands text and images and generates informative responses.

image cv vision assistant non-commercial use only vlm visual question answering computer vision image-to-text video nvidia

adept fuyu-8b

Multi-modal model for a wide range of tasks, including image understanding and language generation.

image cv multimodal vlm computer vision image understanding language generation image-to-text video adept

nvidia vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

interactive annotation image segmentation non-commercial use only medical imaging nvidia

google gemma-7b

Cutting-edge text generation model for text understanding, transformation, and code generation.

code generation chat text-to-text language generation google

mistralai mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

text-to-text language generation nvidia nim mistralai

nvidia fq2bam

Generate BAM output given one or more pairs of FASTQ files, by running BWA-MEM & GATK best practices.

parabricks genomics dna sequencing nvidia

nvidia deepvariant

Run Google's DeepVariant optimized for GPU. Switch models for high accuracy on all major sequencers.

parabricks genomics dna sequencing nvidia

stabilityai stable-video-diffusion

Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.

image generation text-to-image stabilityai

stabilityai sdxl-turbo

A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.

image generation text-to-image stabilityai

nvidia molmim

MolMIM performs controlled generation, finding molecules with the right properties.

chemistry nim bionemo molecule generation drug discovery nvidia

meta esmfold

Predicts the 3D structure of a protein from its amino acid sequence.

biology nim bionemo protein folding drug discovery meta

mit diffdock

Predicts the 3D structure of how a molecule interacts with a protein.

chemistry nim bionemo docking drug discovery mit

mistralai mixtral-8x7b-instruct-v0.1

A mixture-of-experts (MoE) LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning code generation chat text-to-text large language models mistralai

nvidia cuopt

World-record accuracy and performance for complex route optimization.

route optimization nvidia