Markdown view
# Policy-first prompting Layer safety policy before task instructions to reduce risk. - Date: May 29, 2024 - Reading time: 11 min - Level: Beginner - Tags: Safety, Guardrails, Prompting ## Takeaways - Separate policy and task instructions. - Refuse unsafe requests with clear messaging. - Escalate high risk cases to humans. ## Add a policy layer Put policy constraints in the system prompt before task instructions. ## Red-team prompts Pressure test the policy with adversarial prompts and track failure rates. ## Escalation and audit Route sensitive outputs to human reviewers and log all decisions.