---
title: GLM3 API Usage Guide
createTime: 2026/04/02 20:26:36
---
## FastAPI docs
The API [documentation page](http://yuany3721.site:6017/docs) is auto-generated by FastAPI from the docstrings in the [API server code](https://github.com/THUDM/ChatGLM3/blob/main/openai_api_demo/api_server.py), which follows the [OpenAI API](https://platform.openai.com/docs/api-reference/chat) format.
API endpoints:
- "/health": reports the API's running status; an HTTP 200 response means the server is running normally
- "/v1/models": lists the available models (here, ChatGLM3-6B)
- "/v1/chat/completions": handles chat completion requests, with optional streaming output
- "/v1/embeddings": handles embedding requests for a list of input texts
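As a minimal sketch of the request shapes these endpoints expect (the base URL below is a placeholder; replace it with your own deployment's address):

```python
import json

# Placeholder base URL -- replace with your own deployment's address.
BASE_URL = "http://localhost:8000"

# GET /health -- an HTTP 200 response means the server is up.
health_url = f"{BASE_URL}/health"

# POST /v1/chat/completions -- OpenAI-style chat request body.
chat_payload = {
    "model": "chatglm3-6b",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,  # set True to receive a streamed response
}

# POST /v1/embeddings -- embeds a list of input texts in one request.
embedding_payload = {
    "model": "chatglm3-6b",
    "input": ["first text", "second text"],
}

print(json.dumps(chat_payload))
```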
Further notes from the server script's docstring:
> This script implements an API for the ChatGLM3-6B model,
> formatted similarly to OpenAI's API (https://platform.openai.com/docs/api-reference/chat).
> It's designed to be run as a web server using FastAPI and uvicorn,
> making the ChatGLM3-6B model accessible through OpenAI Client.
>
> Key Components and Features:
>
> - Model and Tokenizer Setup: Configures the model and tokenizer paths and loads them.
> - FastAPI Configuration: Sets up a FastAPI application with CORS middleware for handling cross-origin requests.
> - API Endpoints:
> - "/v1/models": Lists the available models, specifically ChatGLM3-6B.
> - "/v1/chat/completions": Processes chat completion requests with options for streaming and regular responses.
> - "/v1/embeddings": Processes embedding requests for a list of text inputs.
> - Token Limit Caution: In the OpenAI API, 'max_tokens' is equivalent to HuggingFace's 'max_new_tokens', not 'max_length'.
> For instance, setting 'max_tokens' to 8192 for a 6b model would result in an error due to the model's inability to output
> that many tokens after accounting for the history and prompt tokens.
> - Stream Handling and Custom Functions: Manages streaming responses and custom function calls within chat responses.
> - Pydantic Models: Defines structured models for requests and responses, enhancing API documentation and type safety.
> - Main Execution: Initializes the model and tokenizer, and starts the FastAPI app on the designated host and port.
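The token-limit caution above can be made concrete with a small calculation (the numbers are hypothetical):

```python
# 'max_tokens' (OpenAI API) corresponds to HuggingFace's 'max_new_tokens':
# it counts only newly generated tokens, not the whole context window.
context_window = 8192   # hypothetical total context length of the model
prompt_tokens = 1500    # tokens already used by history + current prompt

# The most new tokens the model can still emit in this request:
max_new_tokens = context_window - prompt_tokens
print(max_new_tokens)
```

Requesting `max_tokens=8192` in this situation would fail, because after the 1500 history and prompt tokens the model cannot generate that many new tokens.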
## API Call Example
See the [api-demo](https://github.com/THUDM/ChatGLM3/blob/main/openai_api_demo/openai_api_request.py).
Change `base_url` on line 15 to the target URL.
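As a dependency-free sketch of the same call, the standard library can issue the request once `base_url` points at your server (the URL below is a placeholder; the demo script uses the `openai` client instead):

```python
import json
import urllib.request

# Placeholder -- set this to your deployment's address, as in the demo script.
base_url = "http://localhost:8000/v1"

body = json.dumps({
    "model": "chatglm3-6b",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode("utf-8")

req = urllib.request.Request(
    f"{base_url}/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request; it is omitted here so
# the sketch runs without a live server.
print(req.full_url)
```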