Summary
The Flask request handler proxy() mutates a module-level global server_config dict on every request. Under Flask's default threaded mode or any multi-threaded WSGI server (gunicorn, uWSGI, Hypercorn), concurrent requests overwrite each other's MCTS parameters, causing silent result degradation.
Affected Code
File: optillm/server.py, lines 747–749
@app.route('/v1/chat/completions', methods=['POST'])
def proxy():
    # ...
    server_config['mcts_depth'] = data.get('mcts_depth', server_config['mcts_depth'])
    server_config['mcts_exploration'] = data.get('mcts_exploration', server_config['mcts_exploration'])
    server_config['mcts_simulations'] = data.get('mcts_simulations', server_config['mcts_simulations'])
These mutated values are then read back by execute_single_approach() at lines 419–420:
return chat_with_mcts(system_prompt, initial_query, client, model,
                      server_config['mcts_simulations'],
                      server_config['mcts_exploration'],
                      server_config['mcts_depth'], ...)
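The race is easy to demonstrate outside optillm. The following is a minimal, self-contained sketch (not optillm code): two "requests" run on separate threads and share one config dict, with `threading.Event` used to force the losing interleaving deterministically. The names `server_config`, `request_a`, and `request_b` are illustrative only.

```python
import threading

# Shared module-level state, mirroring the pattern in the report
server_config = {'mcts_simulations': 10}
results = {}
a_wrote = threading.Event()
b_wrote = threading.Event()

def request_a():
    server_config['mcts_simulations'] = 20   # request A asks for 20 simulations
    a_wrote.set()
    b_wrote.wait()                           # request B runs in the gap before A reads back
    results['a'] = server_config['mcts_simulations']

def request_b():
    a_wrote.wait()
    server_config['mcts_simulations'] = 2    # request B asks for 2 simulations
    b_wrote.set()
    results['b'] = server_config['mcts_simulations']

ta = threading.Thread(target=request_a)
tb = threading.Thread(target=request_b)
ta.start(); tb.start(); ta.join(); tb.join()

print(results['a'], results['b'])  # 2 2 — request A silently ran with B's parameter
```

In a real server the interleaving is nondeterministic, which is exactly why the degradation is silent: both requests return 200 and neither reports which parameters were actually used.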
Reproduction
- Start optillm with a multi-threaded WSGI server:
gunicorn -w 1 --threads 4 optillm.server:app
- Send two concurrent MCTS requests with different parameters:
# Terminal 1
curl -s -X POST http://localhost:8000/v1/chat/completions \
-H 'Content-Type: application/json' \
-d '{"model":"mcts-gpt-4o-mini","messages":[{"role":"user","content":"Solve hard problem"}],"mcts_simulations":20,"mcts_depth":5}' &
# Terminal 2
curl -s -X POST http://localhost:8000/v1/chat/completions \
-H 'Content-Type: application/json' \
-d '{"model":"mcts-gpt-4o-mini","messages":[{"role":"user","content":"Solve easy problem"}],"mcts_simulations":2,"mcts_depth":1}' &
Depending on how the threads interleave, one or both requests run with the wrong MCTS parameters.
Impact
- Incorrect AI results: reasoning quality degrades silently when a low-parameter request races a high-parameter request
- Cross-tenant data leakage: one user's request parameters bleed into another user's inference in shared deployments
- Affects any multi-threaded deployment — the standard production deployment model
Suggested Fix
Extract per-request MCTS parameters into a local dict instead of mutating global state:
@app.route('/v1/chat/completions', methods=['POST'])
def proxy():
    data = request.get_json()
    # Per-request overrides — do NOT write to server_config
    mcts_config = {
        'mcts_depth': data.get('mcts_depth', server_config['mcts_depth']),
        'mcts_exploration': data.get('mcts_exploration', server_config['mcts_exploration']),
        'mcts_simulations': data.get('mcts_simulations', server_config['mcts_simulations']),
    }
    # Pass mcts_config to execute_single_approach() instead of reading from server_config
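The override-extraction step generalizes to any set of per-request parameters. A runnable sketch (the helper name `per_request_config` is hypothetical, not part of optillm): build a fresh dict from the request body, falling back to the global defaults, and never write back to them.

```python
def per_request_config(data, defaults, keys):
    """Return a new per-request dict; the shared defaults are never mutated."""
    return {k: data.get(k, defaults[k]) for k in keys}

# Global defaults, as server_config would hold them
server_config = {'mcts_depth': 1, 'mcts_exploration': 0.2, 'mcts_simulations': 2}

# Simulated request body that overrides only the depth
data = {'mcts_depth': 5}

cfg = per_request_config(data, server_config, server_config.keys())
print(cfg)  # {'mcts_depth': 5, 'mcts_exploration': 0.2, 'mcts_simulations': 2}
assert server_config['mcts_depth'] == 1  # defaults untouched, so no cross-request bleed
```

Because each request gets its own dict, no synchronization is needed; the fix removes the shared mutable state rather than locking around it.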
Found via automated codebase analysis. Confirmed independently by three reviewers (Claude, Codex, Gemini).