Policy-first prompting

Layer safety policy before task instructions to reduce risk.

Beginner11 min readMay 29, 2024

SafetyGuardrailsPrompting

Actions

Key takeaways

Add a policy layer

Put policy constraints in the system prompt before task instructions.

Pressure test the policy with adversarial prompts and track failure rates.

Route sensitive outputs to human reviewers and log all decisions.