OpenRouter
© 2026 OpenRouter, Inc

Perceptron

Browse models provided by Perceptron (Terms of Service)

1 model

Tokens processed on OpenRouter

  • Perceptron: Perceptron Mk1

    Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding responses, either structured or in natural language. It excels at video understanding tasks like video QA, summarization, and event detection. On image inputs, it advances point-by-example grounding from multimodal prompts, OCR and document parsing on messy real-world inputs, open-vocabulary object detection and counting, and hand pose estimation. Reasoning can be enabled per request to trade latency for deeper analysis on harder tasks. Structured annotations are emitted inline with text only when explicitly requested via the `annotation_format` parameter (pass `"point"`, `"box"`, or `"polygon"` for spatial localization on images, or `"clip"` (start/end timestamps) for temporal segments in video). Without `annotation_format`, the model returns natural-language text only.

    by perceptron · May 12, 2026 · 33K context · $0.15/M input tokens · $1.50/M output tokens
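As a rough sketch of how the `annotation_format` parameter described above might be supplied, the snippet below builds an OpenRouter-style chat-completions request body with an image input and a bounding-box annotation request. The model slug, the example image URL, and the placement of `annotation_format` as a top-level request field are assumptions for illustration, not confirmed API details.

```python
import json

# Hypothetical request body for Perceptron Mk1 via OpenRouter's
# chat-completions API. The model slug and the top-level placement
# of `annotation_format` are assumptions for illustration.
payload = {
    "model": "perceptron/perceptron-mk1",  # assumed slug
    "messages": [
        {
            "role": "user",
            "content": [
                # Image input as a multimodal content part
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/scene.jpg"},
                },
                # Natural-language query paired with the image
                {
                    "type": "text",
                    "text": "Detect and count every traffic cone.",
                },
            ],
        }
    ],
    # One of "point", "box", "polygon" (images) or "clip" (video),
    # per the description above; omit for natural-language-only output.
    "annotation_format": "box",
}

body = json.dumps(payload)
```

The serialized `body` would then be POSTed to the chat-completions endpoint with the usual `Authorization: Bearer <key>` header; omitting `annotation_format` should yield plain text responses instead of inline structured annotations.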