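I ran some stress tests on my application today and found that my server's response time degrades badly with just a few users. With one user the response time is about 6 seconds; with three users it is close to 20 seconds, and with ten users it exceeds 30 seconds. I started adding logging to the backend to try to find the bottleneck. I'm tracking each request's start and end time, and whether I test with one user or ten, my logs show a total computation time of about 5 seconds. My frontend, however, shows the latency growing as the number of users increases.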
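For reference, the client-side numbers above can be reproduced with something like the following. This is only a minimal sketch, not the exact script used for the tests: `measure_latencies` is a hypothetical helper, and `call` stands for whatever function sends one request to the endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(call, users):
    """Run `call` once per simulated user, all concurrently,
    and return each call's wall-clock latency in seconds."""

    def timed_call(_):
        start = time.perf_counter()
        call()  # e.g. one HTTP POST to the endpoint under test
        return time.perf_counter() - start

    # One thread per simulated user so the calls overlap in time.
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(timed_call, range(users)))
```

With `requests`, `call` could be something like `lambda: requests.post(url, json=payload, cookies=cookies)`, where `url`, `payload`, and `cookies` are placeholders. Comparing the returned latencies for 1, 3, and 10 users against the per-request times in the backend logs shows how much of the delay happens outside the handler.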
Here are the logs from my server:
[62d44229] Response ready at: 7.769s
Nov 1 07:05:52 PM [62d44229] Response sent at: 7.769s
Nov 1 07:05:52 PM INFO: 69.180.179.235:0 - "POST /generate_search_ideas HTTP/1.1" 200 OK
Nov 1 07:05:52 PM [POST] 200 torsera-dev.onrender.com/generate_search_ideas clientIP="69.180.179.235" requestID="9b03e306-0a41-454b" responseTimeMS=16051 responseBytes=843 userAgent="python-requests/2.32.3"
The 7.769s figure comes from my backend logs, while the [POST] line, whose responseTimeMS shows about 16 seconds, is the server's request log.
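My server's memory and CPU usage are both relatively low, so I'm wondering what is causing the delay.
It seems it must be related to request or response handling, but I'm not sure how to test that.
I'd really appreciate any help getting to the bottom of this.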
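Edit: Below is the function I'm testing. I don't think the actual function matters much, but you can see the method I'm using to track its total run time.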
@app.post('/generate_search_ideas')
async def generate_search_ideas(request: Request, background_tasks: BackgroundTasks):
    start = time.time()
    request_id = str(uuid.uuid4())[:8]
    print(f"[{request_id}] Request received at: {start}")

    # Retrieve the session cookie from the request
    session_cookie = request.cookies.get('session')

    # If no session cookie is present, raise an Unauthorized error
    if not session_cookie:
        raise HTTPException(status_code=401, detail='Unauthorized')

    try:
        # Verify the session cookie and check if it has been revoked
        decoded_claims = auth.verify_session_cookie(session_cookie, check_revoked=True)
        # Extract the user_id from the decoded claims
        user_id = decoded_claims['user_id']
    except auth.InvalidSessionCookieError:
        # If the session cookie is invalid, raise an Unauthorized error
        raise HTTPException(status_code=401, detail='Unauthorized')

    try:
        data = await request.json()
        print(f"[{request_id}] Request parsed at: {time.time() - start:.3f}s")
        userInput = data['userInput']
        systemPrompt = "You are helpful and assist by generating related ideas for brainstorming."

        # Time the LLM calls
        llm_start = time.time()
        isFiction = await isFictionRelated(userInput)
        print(f"[{request_id}] Fiction check completed at: {time.time() - start:.3f}s")

        modifiedUserInput = userInput + " Unconventional ideas please." if isFiction else userInput
        initialIdeas = await fetchIdeas(modifiedUserInput, systemPrompt)
        print(f"[{request_id}] Initial ideas fetched at: {time.time() - start:.3f}s")
        print(f"""
        Time waiting for LLM: {time.time() - llm_start}
        Total request time: {time.time() - start}
        """)

        validInitialIdeas = list(filter(filterShortIdeas, initialIdeas))
        targetIdeaCount = 6

        if len(validInitialIdeas) < targetIdeaCount:
            print("rrunning")
            additionalIdeas = await generateAdditionalIdeas(modifiedUserInput, validInitialIdeas, targetIdeaCount - len(validInitialIdeas))
            validAdditionalIdeas = list(filter(filterShortIdeas, additionalIdeas))
            allIdeas = validInitialIdeas + validAdditionalIdeas
            return JSONResponse(content=allIdeas, status_code=200)
        else:
            print(f"[{request_id}] Response ready at: {time.time() - start:.3f}s")
            response = JSONResponse(content=validInitialIdeas, status_code=200)
            print(f"[{request_id}] Response sent at: {time.time() - start:.3f}s")
            return response
    except Exception as e:
        logger.error(f"Error generating search ideas: {str(e)}")
        raise HTTPException(status_code=500, detail=f"An error occurred while generating search ideas: {str(e)}")
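Just before the data is returned, my final print statement, print(f"[{request_id}] Response sent at: {time.time() - start:.3f}s"), indicates a total run time of about 7 seconds, but my frontend shows it waited nearly 16 seconds.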
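One possible way to narrow this down is to time the request at the ASGI layer rather than inside the handler, so the measurement covers everything from the moment the app receives the request until the last response byte is handed to the server. The following is only a sketch under that assumption: `TimingMiddleware` is a hypothetical name, not something from my code base, and I have not confirmed that it pinpoints the delay.

```python
import asyncio
import time


class TimingMiddleware:
    """Pure ASGI middleware (hypothetical helper): logs the time from the
    app receiving a request until the final response body chunk is sent."""

    def __init__(self, app, log=print):
        self.app = app
        self.log = log

    async def __call__(self, scope, receive, send):
        # Only time HTTP requests; pass lifespan/websocket events through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()

        async def timed_send(message):
            await send(message)
            # The last body chunk has more_body False (or absent).
            if (message["type"] == "http.response.body"
                    and not message.get("more_body", False)):
                elapsed = time.perf_counter() - start
                self.log(f"{scope['path']} fully sent after {elapsed:.3f}s")

        await self.app(scope, receive, timed_send)
```

With FastAPI this could be registered via `app.add_middleware(TimingMiddleware)`. If this number also stays around 7 seconds while the request log reports 16 seconds, the extra time is presumably being spent before the app sees the request or after it hands off the response, for example while the request waits in a queue.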