Z AI (Zhipu AI)

Z AI (formerly Zhipu AI) offers the groundbreaking GLM-4.5 series, featuring hybrid reasoning capabilities and agentic AI design. Released in July 2025, these models excel in unified reasoning, coding, and intelligent agent applications while maintaining open-source accessibility under MIT license.

Website: https://z.ai/model-api (International) | https://open.bigmodel.cn/ (China)

Getting an API Key

International Users

Sign Up/Sign In: Go to https://z.ai/model-api. Create an account or sign in.
Navigate to API Keys: Access your account dashboard and find the API keys section.
Create a Key: Generate a new API key for your application.
Copy the Key: Copy the API key immediately and store it securely.

China Mainland Users

Sign Up/Sign In: Go to https://open.bigmodel.cn/. Create an account or sign in.
Navigate to API Keys: Access your account dashboard and find the API keys section.
Create a Key: Generate a new API key for your application.
Copy the Key: Copy the API key immediately and store it securely.

Supported Models

Z AI provides different model catalogs based on your selected region:

GLM-4.5 Series

GLM-4.5 - Flagship model with 355B total parameters, 32B active parameters
GLM-4.5-Air - Compact model with 106B total parameters, 12B active parameters

GLM-4.5 Hybrid Reasoning Models

GLM-4.5 (Thinking Mode) - Advanced reasoning with step-by-step analysis
GLM-4.5-Air (Thinking Mode) - Efficient reasoning for mainstream hardware

All models feature:

128,000 token context window for extensive document processing
Mixture of Experts (MoE) architecture for optimal performance
Agent-native design integrating reasoning, coding, and tool usage
Open-source availability under MIT license

Configuration in Careti

Open Careti Settings: Click the settings icon (⚙️) in the Careti panel.
Select Provider: Choose "Z AI" from the "API Provider" dropdown.
Select Region: Choose your region:
- "International" for global access
- "China" for mainland China access
Enter API Key: Paste your Z AI API key into the "Z AI API Key" field.
Select Model: Choose your desired model from the "Model" dropdown.

Z AI's Hybrid Intelligence

Z AI's GLM-4.5 series introduces revolutionary capabilities that set it apart from conventional language models:

Hybrid Reasoning Architecture

GLM-4.5 operates in two distinct modes:

Thinking Mode: Designed for complex reasoning tasks and tool usage, engaging in deeper analytical processes
Non-Thinking Mode: Provides immediate responses for straightforward queries, optimizing efficiency

This dual-mode architecture represents an "agent-native" design philosophy that adapts processing intensity based on query complexity.

Exceptional Performance

GLM-4.5 achieves a comprehensive score of 63.2 across 12 benchmarks spanning agentic tasks, reasoning, and coding challenges, securing 3rd place among all proprietary and open-source models. GLM-4.5-Air maintains competitive performance with a score of 59.8 while delivering superior efficiency.

Mixture of Experts Excellence

The sophisticated MoE architecture optimizes performance while maintaining computational efficiency:

GLM-4.5: 355B total parameters with 32B active parameters
GLM-4.5-Air: 106B total parameters with 12B active parameters

Extended Context Capabilities

The 128,000-token context window enables comprehensive understanding of lengthy documents and codebases, with real-world testing confirming effective processing of nearly 2,000-line codebases while maintaining remarkable performance.

Open-Source Leadership

Released under MIT license, GLM-4.5 provides researchers and developers with access to state-of-the-art capabilities without proprietary restrictions, including base models, hybrid reasoning versions, and optimized FP8 variants.

Regional Optimization

API Endpoints

International: Uses https://api.z.ai/api/paas/v4
China: Uses https://open.bigmodel.cn/api/paas/v4

Model Availability

The region setting determines both API endpoint and available models, with automatic filtering to ensure compatibility with your selected region.

Special Features

Agentic Capabilities

GLM-4.5's unified architecture makes it particularly suitable for complex intelligent agent applications requiring integrated reasoning, coding, and tool utilization capabilities.

Comprehensive Benchmarking

Performance evaluation encompasses:

3 agentic task benchmarks
7 reasoning benchmarks
2 coding benchmarks

This comprehensive assessment demonstrates versatility across diverse AI applications.

Developer Integration

Models support integration through multiple frameworks:

transformers
vLLM
SGLang

Complete with dedicated model code, tool parser, and reasoning parser implementations.

Performance Comparisons

vs Claude 4 Sonnet

GLM-4.5 shows competitive performance in agentic coding and reasoning tasks, though Claude Sonnet 4 maintains advantages in coding success rates and autonomous multi-feature application development.

vs GPT-4.5

GLM-4.5 ranks competitively in reasoning and agent benchmarks, with GPT-4.5 generally leading in raw task accuracy on professional benchmarks like MMLU and AIME.

Tips and Notes

Region Selection: Choose the appropriate region for optimal performance and compliance with local regulations.
Model Selection: GLM-4.5 for maximum performance, GLM-4.5-Air for efficiency and mainstream hardware compatibility.
Context Advantage: Large 128K context window enables processing of substantial codebases and documents.
Open Source Benefits: MIT license enables both commercial use and secondary development.
Agentic Applications: Particularly strong for applications requiring reasoning, coding, and tool usage integration.
Hybrid Reasoning: Use Thinking Mode for complex problems, Non-Thinking Mode for simple queries.
API Compatibility: OpenAI-compatible API provides streaming responses and usage reporting.
Framework Support: Multiple integration options available for different deployment scenarios.

Getting an API Key​

International Users​

China Mainland Users​

Supported Models​

GLM-4.5 Series​

GLM-4.5 Hybrid Reasoning Models​

Configuration in Careti​

Z AI's Hybrid Intelligence​

Hybrid Reasoning Architecture​

Exceptional Performance​

Mixture of Experts Excellence​

Extended Context Capabilities​

Open-Source Leadership​

Regional Optimization​

API Endpoints​

Model Availability​

Special Features​

Agentic Capabilities​

Comprehensive Benchmarking​

Developer Integration​

Performance Comparisons​

vs Claude 4 Sonnet​

vs GPT-4.5​

Tips and Notes​