手把手教你用FastAPI给DeepSeek-OCR模型做个Web界面，还能兼容OpenAI的API格式

张开发

• 2026/6/3 1:08:50 • 15 分钟阅读

分享文章

手把手教你用FastAPI给DeepSeek-OCR模型做个Web界面，还能兼容OpenAI的API格式

从零构建兼容OpenAI的DeepSeek-OCR Web服务实战指南当我们需要将本地AI模型快速转化为可交互的Web服务时FastAPI无疑是最佳选择之一。本文将带你完整实现一个支持图片上传、文本识别的OCR服务并使其API格式与OpenAI完全兼容让现有OpenAI生态工具能够无缝接入。1. 环境准备与项目初始化在开始编码前我们需要配置好开发环境。推荐使用Python 3.12版本通过conda或venv创建隔离环境conda create -n deepseekocr python3.12.9 conda activate deepseekocr pip install torch2.6.0 transformers4.46.3 fastapi uvicorn[standard] python-multipart Pillow项目目录结构建议如下project/ ├─ app.py # 后端主服务 ├─ static/ │ └─ ui.html # 单页前端界面 └─ README.md # 项目说明2. 核心功能设计与实现2.1 FastAPI后端架构我们的后端需要实现几个关键端点/v1/chat/completions兼容OpenAI的API格式/parserToText直接图片转文本的简化接口/ui前端页面快捷入口首先创建FastAPI应用实例并配置CORSfrom fastapi import FastAPI from fastapi.middleware.cors import CORSMiddleware app FastAPI(titleDeepSeek-OCR服务) app.add_middleware( CORSMiddleware, allow_origins[*], allow_methods[*], allow_headers[*], )2.2 模型加载与推理DeepSeek-OCR模型的加载需要特别注意硬件适配from transformers import AutoModel, AutoTokenizer MODEL_NAME deepseek-ai/DeepSeek-OCR tokenizer AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_codeTrue) model AutoModel.from_pretrained(MODEL_NAME, trust_remote_codeTrue) # 自动适配硬件 if torch.cuda.is_available(): device torch.device(cuda:0) model model.eval().to(device) try: model model.to(torch.bfloat16) except: model model.to(torch.float16) else: device torch.device(cpu) model model.eval().to(device)2.3 图片处理模块支持三种图片输入方式Base64编码的data URI本地文件路径远程HTTP(S) URLdef handle_image_input(image_url: str) - str: if image_url.startswith(data:): # 处理Base64图片 header, b64 image_url.split(,, 1) raw base64.b64decode(b64) return save_to_temp(raw) elif image_url.startswith((http://, https://)): # 下载远程图片 resp requests.get(image_url, timeout30) return save_to_temp(resp.content) else: # 本地文件处理 with open(image_url, rb) as f: return save_to_temp(f.read())3. OpenAI兼容API实现3.1 /v1/chat/completions接口这是最核心的接口需要完全匹配OpenAI的请求响应格式app.post(/v1/chat/completions) async def chat_completions(request: Request): payload await request.json() messages payload.get(messages) # 解析消息内容 prompt_text, image_path parse_messages(messages) # 执行OCR推理 ocr_result run_ocr(prompt_text, image_path) return { id: chatcmpl-123, object: chat.completion, created: int(time.time()), model: deepseek-ocr, choices: [{ index: 0, message: {role: assistant, content: ocr_result}, finish_reason: stop }], usage: { prompt_tokens: len(prompt_text), completion_tokens: len(ocr_result), total_tokens: len(prompt_text) len(ocr_result) } }3.2 消息解析逻辑OpenAI格式的messages数组可能包含混合的文本和图片内容def parse_messages(messages: List[dict]) - Tuple[str, Optional[str]]: texts [] image_url None for msg in messages: content msg.get(content) if isinstance(content, str): texts.append(content) elif isinstance(content, list): for item in content: if item.get(type) text: texts.append(item.get(text, )) elif item.get(type) image_url and not image_url: image_url item.get(image_url, {}).get(url) return \n.join(texts), image_url4. 前端交互实现4.1 单页Web UI设计我们使用纯HTMLJS实现一个简洁的前端主要功能包括图片上传与预览预设指令选择自定义提示输入结果展示原始文本和Markdown渲染!doctype html html head titleDeepSeek-OCR Web UI/title style /* 简约的暗色主题样式 */ body { background: #0f172a; color: #e2e8f0; } .card { background: #1e293b; border-radius: 0.5rem; } button { background: #3b82f6; color: white; } /style /head body div classcontainer h1DeepSeek-OCR/h1 div classcard input typefile idimageUpload acceptimage/* img idimagePreview stylemax-width: 300px; /div div classcard textarea idprompt placeholder输入指令.../textarea button idsubmit识别/button /div div classcard div idresult/div /div /div script // 前端交互逻辑 document.getElementById(submit).addEventListener(click, async () { const file document.getElementById(imageUpload).files[0]; const prompt document.getElementById(prompt).value; // 将图片转为Base64 const reader new FileReader(); reader.onload async () { const response await fetch(/v1/chat/completions, { method: POST, headers: { Content-Type: application/json }, body: JSON.stringify({ model: deepseek-ocr, messages: [{ role: user, content: [ { type: text, text: prompt }, { type: image_url, image_url: { url: reader.result } } ] }] }) }); const result await response.json(); document.getElementById(result).innerText result.choices[0].message.content; }; reader.readAsDataURL(file); }); /script /body /html4.2 图片处理与API调用前端关键是将用户上传的图片转换为Base64格式并通过API发送async function processImage(file) { return new Promise((resolve) { const reader new FileReader(); reader.onload () resolve(reader.result); reader.readAsDataURL(file); }); } async function callOCRAPI(imageData, prompt) { const response await fetch(/v1/chat/completions, { method: POST, headers: { Content-Type: application/json }, body: JSON.stringify({ model: deepseek-ocr, messages: [{ role: user, content: [ { type: text, text: prompt }, { type: image_url, image_url: { url: imageData } } ] }] }) }); return await response.json(); }5. 部署与优化建议5.1 服务启动与配置使用uvicorn运行服务uvicorn app:app --host 0.0.0.0 --port 8000对于生产环境建议添加Gunicorn作为WSGI服务器Nginx反向代理进程管理工具如Supervisor5.2 性能优化技巧模型量化将模型转换为FP16或INT8减少内存占用model model.half() # 转换为FP16批处理支持修改API支持同时处理多张图片缓存机制对相同图片的重复请求返回缓存结果异步处理长时间任务使用Celery等队列系统app.post(/async-ocr) async def async_ocr_task(image: UploadFile File(...)): task_id str(uuid.uuid4()) # 将任务放入队列 celery.send_task(process_ocr, args[await image.read()], task_idtask_id) return {task_id: task_id}5.3 安全增强措施添加API密钥认证API_KEYS {your-secret-key: True} app.middleware(http) async def auth_middleware(request: Request, call_next): if request.url.path.startswith(/v1/): if request.headers.get(Authorization) not in API_KEYS: return JSONResponse({error: Unauthorized}, status_code401) return await call_next(request)限制文件上传大小app FastAPI( max_upload_size10 * 1024 * 1024 # 10MB )添加速率限制from fastapi import FastAPI, Request from fastapi.middleware import Middleware from slowapi import Limiter from slowapi.util import get_remote_address limiter Limiter(key_funcget_remote_address) app FastAPI(middleware[Middleware(limiter)])6. 客户端调用示例6.1 使用OpenAI官方SDK由于我们兼容OpenAI API格式可以直接使用openai包from openai import OpenAI client OpenAI(base_urlhttp://localhost:8000/v1, api_keysk-xxx) response client.chat.completions.create( modeldeepseek-ocr, messages[{ role: user, content: [ {type: text, text: 提取图片中的文字}, {type: image_url, image_url: {url: path/to/image.png}} ] }] ) print(response.choices[0].message.content)6.2 直接HTTP请求示例import requests import base64 with open(receipt.jpg, rb) as image_file: base64_image base64.b64encode(image_file.read()).decode(utf-8) response requests.post( http://localhost:8000/v1/chat/completions, headers{Content-Type: application/json}, json{ model: deepseek-ocr, messages: [{ role: user, content: [ {type: text, text: 提取发票信息}, {type: image_url, image_url: {url: fdata:image/jpeg;base64,{base64_image}}} ] }] } ) print(response.json()[choices][0][message][content])这套解决方案不仅实现了OCR核心功能还通过OpenAI兼容接口大大扩展了应用场景。开发者可以将其集成到现有支持OpenAI的系统中或者基于Web界面快速构建业务应用。

更多文章

前端开发 2026/5/19 10:29:47

3个理由告诉你为什么专业设计师都爱用Bebas Neue字体

3个理由告诉你为什么专业设计师都爱用Bebas Neue字体【免费下载链接】Bebas-Neue Bebas Neue font 项目地址: https://gitcode.com/gh_mirrors/be/Bebas-Neue 你是否曾为寻找一款既专业又免费的标题字体而烦恼？商业字体价格昂贵，免费字体又缺乏设…

【百例RUST - 008】枚举第一章基础用法第01节类似C语言的定义方式定义格式 enum 枚举名称{枚举值1,枚举值2, }案例代码 fn main(){// 类似于 C语言的定义方式#[derive(Debug)] // 这里的宏, 只是为了方便输出看结果enum IpAddKind{V4,V6,}#[derive(Debug)] // 这里的…

张开发

前端开发 2026/5/19 12:46:26

CentOS 7.6服务器上，5分钟搞定向日葵命令行版（SunloginClient Shell）的安装与绑定

CentOS 7.6服务器快速部署向日葵命令行版全指南在纯命令行环境下管理Linux服务器时，我们常常会遇到需要图形化辅助的场景。想象一下凌晨三点，服务器突然出现异常，而SSH命令行排查又难以定位问题根源——这时候如果能够快速建立图形化连接&a…

张开发

手把手教你用FastAPI给DeepSeek-OCR模型做个Web界面，还能兼容OpenAI的API格式

最新文章

2025最权威的六大降重复率助手实测分析

零成本构建移动服务器：基于Termux的安卓Web服务实战

别再只用默认指标了！用通达信APP自定义一个‘分时T+0’盯盘助手，保姆级配置指南

告别“一锤子买卖”：给你的Xilinx FPGA设计加上Multiboot双镜像冗余备份

苹果15年来首次换帅，新CEO能否带领苹果打赢AI硬件之战？

从‘联网盒子’到‘数据枢纽’：T-BOX的十年演进与未来猜想（附：独立硬件 vs 融入域控的深度分析）

推荐文章

相关文章

分享文章

更多文章

3个理由告诉你为什么专业设计师都爱用Bebas Neue字体

Excel公式美化终极指南：让复杂公式一目了然的免费工具

PyTorch 模型部署：TorchScript vs ONNX 深度对比

Lettuce 6.x在Jakarta EE 10/CDI 4.0环境下的依赖注入实战指南

什么是 Token？2026 年主流大模型计费规则、价格与性能全面对比

Path of Building终极指南：5步掌握流放之路最强Build规划

终极指南：5分钟用游戏手柄控制Windows电脑的完整教程

深度探索VRC Gesture Manager：解锁虚拟形象动画调试的高效实战指南

避开这3个坑，你的ESP32音乐频谱灯效果才能更流畅（FFT采样与灯效优化心得）

eNSP实战部署：从零搭建网络仿真环境的避坑指南

【百例RUST - 008】枚举

CentOS 7.6服务器上，5分钟搞定向日葵命令行版（SunloginClient Shell）的安装与绑定