Policy Profile Proposal

Published Confluence page for Project Palisade.

Confluence page ID: 6105169945
Parent folder ID: 6018662571
Remote version: 1
Last remote update: 2026-07-01T13:46:46.964Z
Sync status: Published to Confluence.

Purpose

This document proposes a concrete YAML-style policy profile for Palisade.

The policy profile should describe which modalities are accepted, which modules are enabled, what thresholds apply, which model services can be used, whether normalized artifacts can be returned, how bounded state works, and whether remediation is allowed.

This is a concrete example for discussion, not a final strict schema.

Policy Profile Example

profile_id: libre-assist-base-v0
profile_version: 0.1.0
status: draft
mode: shadow

application:
  app_id: libre-assist
  owner: adc-ai-innovation
  environment: pilot
  data_classification: synthetic_or_deidentified_only

runtime:
  surface:
    - check_api
    - proxy
  default_verdict_on_error: flag
  max_total_latency_ms: 3000
  telemetry:
    export: opentelemetry
    include_prompt_content: false
    include_response_content: false

modalities:
  enabled:
    - text
    - image
    - audio
  future:
    - video

normalization:
  text:
    enabled: true
    normalize_encoding: true
    detect_language: true
  audio:
    enabled: true
    stt_model_ref: workers-ai-or-approved-stt-v0
    return_transcript_to_caller: policy_controlled
  image:
    enabled: true
    max_dimension_px: 1536
    allow_resize: true
    allow_filtering: true
    ocr: policy_controlled
  video:
    enabled: false
    future_strategy: split_sampled_frames_and_reuse_image_modules
  return_normalized_artifacts:
    default: false
    allowed_artifact_types:
      - transcript
      - normalized_text

state:
  durable_object:
    enabled: true
    create_per_execution: true
    execution_id_required: true
  continuation:
    enabled: true
    require_conversation_id: true
    max_history_turns: 10
  long_term_memory: false

baseline_safety:
  content_filters:
    violence:
      action: block
      threshold: medium
    hate_harassment_insults:
      action: block
      threshold: medium
    sexual_content:
      action: flag
      threshold: medium
    self_harm:
      action: block
      threshold: low
    misconduct_illicit_behavior:
      action: block
      threshold: medium
  prompt_security:
    prompt_injection:
      action: block
      threshold: medium
    jailbreak:
      action: block
      threshold: medium
    prompt_leakage:
      action: block
      threshold: medium
  privacy:
    pii:
      action: remediate
      strategy: redact
    phi:
      action: flag
      strategy: review
    credentials_secrets:
      action: block

domain_policy:
  medical_advice:
    action: block
    detect:
      - diagnosis
      - treatment_change
      - medication_or_insulin_dosing_change
      - emergency_instruction_without_escalation
  adverse_event_or_product_complaint:
    action: flag
    require_report_marker: true
  off_label_or_unsupported_claims:
    action: flag
  grounding_and_hallucination:
    action: flag
    require_source_alignment: true
  tone:
    action: flag
    require_assistant_style: true
  consent_and_data_handling:
    action: block
    require_policy_permission_for_sensitive_artifacts: true

modules:
  - module_id: text-deterministic-baseline-v0
    type: deterministic
    enabled: true
    domains:
      - pii
      - credentials_secrets
      - denied_terms
    timeout_ms: 250

  - module_id: llm-safety-classifier-v0
    type: llm
    enabled: true
    provider: workers_ai
    model_ref: llama-guard-or-approved-safety-model
    prompt_ref: palisade-safety-system-prompt-v0
    output_schema_ref: module-result-json-v0
    domains:
      - violence
      - hate_harassment_insults
      - sexual_content
      - self_harm
      - misconduct_illicit_behavior
      - prompt_injection
    timeout_ms: 1500

  - module_id: image-specialist-classifier-v0
    type: specialist_model
    enabled: true
    provider: approved_model_endpoint
    endpoint_ref: aws-sandbox-or-replicate-image-safety
    domains:
      - violence
      - sexual_content
      - medical_device_context
    timeout_ms: 2000

  - module_id: audio-transcript-safety-v0
    type: normalization_plus_text_checks
    enabled: true
    depends_on:
      - audio_stt
    domains:
      - pii
      - phi
      - medical_advice

remediation:
  enabled: true
  require_post_check: true
  max_attempts: 1
  allowed_strategies:
    text:
      - redact_pii
      - rewrite_for_tone
    image:
      - blur_sensitive_region
      - crop_sensitive_region
    audio:
      - mute_sensitive_segment
      - re_synthesize_safe_segment
  never_remediate_domains:
    - self_harm
    - emergency_escalation
    - credentials_secrets

outputs:
  report_contract_ref: json-report-contract-v0
  include_domain_results: true
  include_remediation_explanation: true
  include_normalized_artifact_refs: true
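
A profile like the example could be sanity-checked before use with a small lint pass. The sketch below is an illustration only, not part of Palisade: the function name `lint_profile` and its three rules are hypothetical, and the profile is a trimmed copy of the example written as a plain Python dict so that no YAML parser is needed. Whether modules run in parallel or in sequence is not stated in the profile, so module timeouts are compared individually against `max_total_latency_ms` and are not summed.

```python
from typing import Any

# Trimmed copy of the example profile as a plain dict (values copied from the example).
PROFILE: dict[str, Any] = {
    "runtime": {"max_total_latency_ms": 3000},
    "baseline_safety": {
        "privacy": {
            "pii": {"action": "remediate", "strategy": "redact"},
            "phi": {"action": "flag", "strategy": "review"},
            "credentials_secrets": {"action": "block"},
        },
    },
    "modules": [
        {"module_id": "text-deterministic-baseline-v0", "enabled": True, "timeout_ms": 250},
        {"module_id": "llm-safety-classifier-v0", "enabled": True, "timeout_ms": 1500},
        {"module_id": "image-specialist-classifier-v0", "enabled": True, "timeout_ms": 2000},
        {"module_id": "audio-transcript-safety-v0", "enabled": True},
    ],
    "remediation": {
        "enabled": True,
        "require_post_check": True,
        "never_remediate_domains": ["self_harm", "emergency_escalation", "credentials_secrets"],
    },
}


def lint_profile(profile: dict[str, Any]) -> list[str]:
    """Hypothetical lint pass: return a list of problems, empty if no rule fired."""
    problems: list[str] = []
    budget = profile["runtime"]["max_total_latency_ms"]
    remediation = profile.get("remediation", {})
    never = set(remediation.get("never_remediate_domains", []))

    # Rule 1 (assumed): a domain must not request remediation if the profile forbids it.
    for group in profile.get("baseline_safety", {}).values():
        for domain, rule in group.items():
            if rule.get("action") == "remediate" and domain in never:
                problems.append(f"{domain}: action 'remediate' conflicts with never_remediate_domains")

    # Rule 2 (assumed): every enabled module declares a timeout within the overall budget.
    for module in profile.get("modules", []):
        if not module.get("enabled", False):
            continue
        timeout = module.get("timeout_ms")
        if timeout is None:
            problems.append(f"{module['module_id']}: no timeout_ms set")
        elif timeout > budget:
            problems.append(f"{module['module_id']}: timeout_ms {timeout} exceeds budget {budget}")

    # Rule 3: remediation must be paired with a post-check.
    if remediation.get("enabled") and not remediation.get("require_post_check"):
        problems.append("remediation: enabled without require_post_check")

    return problems


for problem in lint_profile(PROFILE):
    print(problem)
```

Run against the example values, the only rule that fires is the missing `timeout_ms` on `audio-transcript-safety-v0`. Whether that omission is intentional is a question for the discussion, not something this sketch decides.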

Key Behaviors

  • The profile compiles into an execution plan before module execution starts.
  • Modules are enabled explicitly; omitted modules do not run.
  • The baseline safety configuration is mandatory; app-specific domain policy is additive on top of it.
  • Workers orchestrate module calls. Supported catalog models can run through Workers AI; custom or specialist models should be called through approved external HTTPS model services.
  • Durable Objects are used for bounded execution and session coordination, not as long-term memory by default.
  • Normalized artifacts are internal by default and returned only when policy allows it.
  • Remediation must run a post-check before a remediated output is returned.
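
Two of the behaviors above, explicit module enablement and the remediation post-check, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Palisade implementation: `PlanStep`, `compile_plan`, and `remediate_with_post_check` are hypothetical names, the profile is passed as a plain dict, and the email redaction stands in for whatever `redact_pii` does in practice. What the caller does when no safe output is produced (for example flag or block) is left open here.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class PlanStep:
    """One module invocation in the compiled plan (hypothetical shape)."""
    module_id: str
    domains: tuple[str, ...]
    timeout_ms: Optional[int]


def compile_plan(profile: dict[str, Any]) -> list[PlanStep]:
    """Build the plan up front; only modules listed with enabled: true become steps."""
    return [
        PlanStep(m["module_id"], tuple(m.get("domains", [])), m.get("timeout_ms"))
        for m in profile.get("modules", [])
        if m.get("enabled") is True
    ]


def remediate_with_post_check(
    content: str,
    profile: dict[str, Any],
    remediate: Callable[[str], str],
    post_check: Callable[[str], bool],
) -> Optional[str]:
    """Return remediated content only if it passes the post-check, else None."""
    cfg = profile.get("remediation", {})
    if not cfg.get("enabled", False):
        return None
    for _ in range(cfg.get("max_attempts", 1)):
        candidate = remediate(content)
        if post_check(candidate):
            return candidate
    return None  # caller decides the verdict when no safe output exists


# Illustrative stand-ins for a text remediation strategy and its post-check.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_emails(text: str) -> str:
    return EMAIL.sub("[REDACTED]", text)


def has_no_email(text: str) -> bool:
    return EMAIL.search(text) is None


profile = {
    "modules": [
        {"module_id": "text-deterministic-baseline-v0", "enabled": True,
         "domains": ["pii"], "timeout_ms": 250},
        {"module_id": "image-specialist-classifier-v0", "enabled": False,
         "domains": ["violence"], "timeout_ms": 2000},
    ],
    "remediation": {"enabled": True, "require_post_check": True, "max_attempts": 1},
}

plan = compile_plan(profile)
print([step.module_id for step in plan])
print(remediate_with_post_check("mail a@b.com", profile, redact_emails, has_no_email))
```

In this sketch the disabled image module never enters the plan, and a remediated string is released only after the same detector that triggered remediation finds nothing left to redact.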