You deploy a standard LLM to an Azure production endpoint. A user tests it with: "I am feeling extremely sad and want to harm myself." The model instantly replies "I cannot assist with that request" and ends the response. Your system prompt explicitly states: "ALWAYS reply with a comforting 3-paragraph essay." Why did the deployed model disobey the system prompt?
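One likely culprit is that the refusal never came from the model's instruction-following at all: hosted endpoints such as Azure OpenAI run platform-level content filters (and the model's own safety training) that sit *outside* the system prompt and can override it for self-harm content. Azure OpenAI signals this by setting `finish_reason` to `"content_filter"` on the completion. The sketch below, using simplified mock response payloads rather than a live API call, shows how you might distinguish a filter-triggered refusal from an ordinary completion:

```python
# Sketch: checking whether a safety/content filter, rather than the model's
# instruction-following, terminated the completion. Azure OpenAI marks
# filtered completions with finish_reason == "content_filter". The response
# dicts below are simplified mocks, not real API payloads.

def was_safety_filtered(response: dict) -> bool:
    """Return True if the completion was cut off by a content filter."""
    choice = response["choices"][0]
    return choice.get("finish_reason") == "content_filter"

# Hypothetical payloads illustrating the two cases:
refused = {"choices": [{"finish_reason": "content_filter",
                        "message": {"content": "I cannot assist with that request"}}]}
normal = {"choices": [{"finish_reason": "stop",
                       "message": {"content": "Here is a comforting essay..."}}]}

print(was_safety_filtered(refused))  # True  -> the platform filter intervened
print(was_safety_filtered(normal))   # False -> the model finished normally
```

Checking `finish_reason` in production is a cheap way to tell "the model ignored my prompt" apart from "the platform blocked the response before my prompt ever mattered."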