""" 测试 05: 性能测试 测量 LLM 推理延迟、TTS 合成速度等指标 """ import asyncio import json import sys import time from pathlib import Path import websockets import numpy as np sys.path.insert(0, str(Path(__file__).parent)) from test_config import SERVER_URL, AUTH_TOKEN, DEVICE_ID, TEST_CASES async def test_performance(): """性能测试""" print("=" * 60) print(" 测试 05: 性能基准测试") print("=" * 60) print() results = [] # 测试不同长度的文本 test_cases = [ ("短文本", "你好"), ("中文闲聊", "今天天气怎么样?"), ("飞控指令-简单", "起飞"), ("飞控指令-复杂", "起飞然后在前方十米悬停"), ("长文本", "你好,我想知道今天的天气情况,还有能不能给我讲个笑话听?"), ] for name, text in test_cases: result = await measure_turn(name, text) results.append(result) print() # 汇总统计 print("=" * 60) print(" 性能统计") print("=" * 60) print(f"\n {'测试项':<20} {'LLM(ms)':<10} {'TTS(ms)':<10} {'总耗时(s)':<10} {'实时率':<10}") print(f" {'-'*60}") for r in results: llm = r.get("llm_ms", 0) or 0 tts = r.get("tts_ms", 0) or 0 total = r.get("total_time", 0) rtf = r.get("realtime_factor", 0) print(f" {r['name']:<18} {llm:<10} {tts:<10} {total:<10.2f} {rtf:<10.2f}") # 计算平均值 avg_llm = np.mean([r.get("llm_ms", 0) or 0 for r in results]) avg_tts = np.mean([r.get("tts_ms", 0) or 0 for r in results]) avg_total = np.mean([r.get("total_time", 0) for r in results]) avg_rtf = np.mean([r.get("realtime_factor", 0) for r in results]) print(f" {'-'*60}") print(f" {'平均值':<18} {avg_llm:<10.0f} {avg_tts:<10.0f} {avg_total:<10.2f} {avg_rtf:<10.2f}") print() print("=" * 60) print(f" ✅ 性能测试完成") print(f" 平均 LLM 延迟: {avg_llm:.0f}ms ({avg_llm/1000:.2f}s)") print(f" 平均 TTS 延迟: {avg_tts:.0f}ms ({avg_tts/1000:.2f}s)") print(f" 平均实时率: {avg_rtf:.2f}x") print("=" * 60) return True async def measure_turn(name: str, text: str) -> dict: """测量单轮性能""" print(f"\n 测试: {name}") print(f" 文本: {text}") session_id = f"test-perf-{int(time.time())}" turn_id = f"turn-{int(time.time())}" try: async with websockets.connect(SERVER_URL) as ws: # 建立会话 session_start = { "type": "session.start", "proto_version": "1.0", "transport_profile": "text_uplink", "session_id": session_id, "auth_token": AUTH_TOKEN, "client": { "device_id": DEVICE_ID, "locale": "zh-CN", "capabilities": { "playback_sample_rate_hz": 24000, "prefer_tts_codec": "pcm_s16le" }, "protocol": {"dialog_result": "cloud_voice_dialog_v1"}, } } await ws.send(json.dumps(session_start, ensure_ascii=False)) await ws.recv() # session.ready # 发送文本 turn_text = { "type": "turn.text", "proto_version": "1.0", "transport_profile": "text_uplink", "turn_id": turn_id, "text": text, "is_final": True, "source": "device_stt" } t_start = time.time() await ws.send(json.dumps(turn_text, ensure_ascii=False)) # 接收响应 audio_chunks = [] metrics = {} dialog_result = None while True: msg = await asyncio.wait_for(ws.recv(), timeout=60) if isinstance(msg, bytes): audio_chunks.append(msg) else: data = json.loads(msg) msg_type = data.get("type") if msg_type == "dialog_result": dialog_result = data elif msg_type == "turn.complete": metrics = data.get("metrics", {}) break t_end = time.time() total_time = t_end - t_start # 计算音频长度 audio_length_s = 0 if audio_chunks: full_pcm = b"".join(audio_chunks) audio_data = np.frombuffer(full_pcm, dtype=np.int16) audio_length_s = len(audio_data) / 24000 llm_ms = metrics.get("llm_ms", 0) or 0 tts_ms = metrics.get("tts_first_byte_ms", 0) or 0 realtime_factor = audio_length_s / total_time if total_time > 0 else 0 print(f" LLM: {llm_ms}ms, TTS: {tts_ms}ms, 总耗时: {total_time:.2f}s, 实时率: {realtime_factor:.2f}x") # 结束会话 await ws.send(json.dumps({ "type": "session.end", "proto_version": "1.0", "session_id": session_id })) return { "name": name, "llm_ms": llm_ms, "tts_ms": tts_ms, "total_time": total_time, "audio_length_s": audio_length_s, "realtime_factor": realtime_factor, "routing": dialog_result.get("routing") if dialog_result else None, } except Exception as e: print(f" ❌ 失败: {e}") return { "name": name, "llm_ms": 0, "tts_ms": 0, "total_time": 0, "audio_length_s": 0, "realtime_factor": 0, "error": str(e) } async def main(): success = await test_performance() sys.exit(0 if success else 1) if __name__ == "__main__": asyncio.run(main())