Policy Profile Proposal

Published Confluence page for Project Palisade.

Confluence page ID: 6105169945
Parent folder ID: 6018662571
Remote version: 1
Last remote update: 2026-07-01T13:46:46.964Z
Sync status: Published to Confluence.

Purpose

This document proposes a concrete YAML-style policy profile for Palisade.

The policy profile should describe which modalities are accepted, which modules are enabled, what thresholds apply, which model services can be used, whether normalized artifacts can be returned, how bounded state works, and whether remediation is allowed.

This is a concrete example for discussion, not a final strict schema.

Policy Profile Example

profile_id: libre-assist-base-v0
profile_version: 0.1.0
status: draft
mode: shadow

application:
  app_id: libre-assist
  owner: adc-ai-innovation
  environment: pilot
  data_classification: synthetic_or_deidentified_only

runtime:
  surface:
    - check_api
    - proxy
  default_verdict_on_error: flag
  max_total_latency_ms: 3000
  telemetry:
    export: opentelemetry
    include_prompt_content: false
    include_response_content: false

modalities:
  enabled:
    - text
    - image
    - audio
  future:
    - video

normalization:
  text:
    enabled: true
    normalize_encoding: true
    detect_language: true
  audio:
    enabled: true
    stt_model_ref: workers-ai-or-approved-stt-v0
    return_transcript_to_caller: policy_controlled
  image:
    enabled: true
    max_dimension_px: 1536
    allow_resize: true
    allow_filtering: true
    ocr: policy_controlled
  video:
    enabled: false
    future_strategy: split_sampled_frames_and_reuse_image_modules
  return_normalized_artifacts:
    default: false
    allowed_artifact_types:
      - transcript
      - normalized_text

state:
  durable_object:
    enabled: true
    create_per_execution: true
    execution_id_required: true
    continuation:
      enabled: true
      require_conversation_id: true
      max_history_turns: 10
      long_term_memory: false

baseline_safety:
  content_filters:
    violence:
      action: block
      threshold: medium
    hate_harassment_insults:
      action: block
      threshold: medium
    sexual_content:
      action: flag
      threshold: medium
    self_harm:
      action: block
      threshold: low
    misconduct_illicit_behavior:
      action: block
      threshold: medium
  prompt_security:
    prompt_injection:
      action: block
      threshold: medium
    jailbreak:
      action: block
      threshold: medium
    prompt_leakage:
      action: block
      threshold: medium
  privacy:
    pii:
      action: remediate
      strategy: redact
    phi:
      action: flag
      strategy: review
    credentials_secrets:
      action: block

domain_policy:
  medical_advice:
    action: block
    detect:
      - diagnosis
      - treatment_change
      - medication_or_insulin_dosing_change
      - emergency_instruction_without_escalation
  adverse_event_or_product_complaint:
    action: flag
    require_report_marker: true
  off_label_or_unsupported_claims:
    action: flag
  grounding_and_hallucination:
    action: flag
    require_source_alignment: true
  tone:
    action: flag
    require_assistant_style: true
  consent_and_data_handling:
    action: block
    require_policy_permission_for_sensitive_artifacts: true

modules:
  - module_id: text-deterministic-baseline-v0
    type: deterministic
    enabled: true
    domains:
      - pii
      - credentials_secrets
      - denied_terms
    timeout_ms: 250

  - module_id: llm-safety-classifier-v0
    type: llm
    enabled: true
    provider: workers_ai
    model_ref: llama-guard-or-approved-safety-model
    prompt_ref: palisade-safety-system-prompt-v0
    output_schema_ref: module-result-json-v0
    domains:
      - violence
      - hate_harassment_insults
      - sexual_content
      - self_harm
      - misconduct_illicit_behavior
      - prompt_injection
    timeout_ms: 1500

  - module_id: image-specialist-classifier-v0
    type: specialist_model
    enabled: true
    provider: approved_model_endpoint
    endpoint_ref: aws-sandbox-or-replicate-image-safety
    domains:
      - violence
      - sexual_content
      - medical_device_context
    timeout_ms: 2000

  - module_id: audio-transcript-safety-v0
    type: normalization_plus_text_checks
    enabled: true
    depends_on:
      - audio_stt
    domains:
      - pii
      - phi
      - medical_advice

remediation:
  enabled: true
  require_post_check: true
  max_attempts: 1
  allowed_strategies:
    text:
      - redact_pii
      - rewrite_for_tone
    image:
      - blur_sensitive_region
      - crop_sensitive_region
    audio:
      - mute_sensitive_segment
      - re_synthesize_safe_segment
  never_remediate_domains:
    - self_harm
    - emergency_escalation
    - credentials_secrets

outputs:
  report_contract_ref: json-report-contract-v0
  include_domain_results: true
  include_remediation_explanation: true
  include_normalized_artifact_refs: true
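
The remediation section of the profile above could be enforced roughly as follows. This is a minimal sketch, not Palisade's implementation: the function names `may_remediate` and `remediate`, the callback signatures, and the email-redaction helpers are hypothetical, and the dictionary is a hand-copied excerpt of the YAML rather than a parsed file. It assumes that a failed post-check means nothing is returned to the caller.

```python
from typing import Callable, Optional

# Excerpt of the `remediation` section, written as a dict for illustration.
REMEDIATION = {
    "enabled": True,
    "require_post_check": True,
    "max_attempts": 1,
    "allowed_strategies": {"text": ["redact_pii", "rewrite_for_tone"]},
    "never_remediate_domains": [
        "self_harm",
        "emergency_escalation",
        "credentials_secrets",
    ],
}


def may_remediate(policy: dict, domain: str, modality: str, strategy: str) -> bool:
    """Check the policy gates before any remediation is attempted."""
    if not policy.get("enabled", False):
        return False
    if domain in policy.get("never_remediate_domains", []):
        return False
    return strategy in policy.get("allowed_strategies", {}).get(modality, [])


def remediate(
    policy: dict,
    domain: str,
    modality: str,
    strategy: str,
    content: str,
    apply: Callable[[str], str],        # hypothetical remediation step
    post_check: Callable[[str], bool],  # hypothetical re-check of the output
) -> Optional[str]:
    """Return remediated content, or None if not allowed or the post-check fails."""
    if not may_remediate(policy, domain, modality, strategy):
        return None
    for _ in range(policy.get("max_attempts", 1)):
        candidate = apply(content)
        # Assumption: an output that fails the post-check is never returned.
        if not policy.get("require_post_check", True) or post_check(candidate):
            return candidate
    return None


# Hypothetical helpers standing in for a real redactor and a real re-check.
def redact_email(text: str) -> str:
    return text.replace("jane@example.com", "[REDACTED]")


def no_email(text: str) -> bool:
    return "@" not in text


result = remediate(
    REMEDIATION, "pii", "text", "redact_pii",
    "mail jane@example.com", redact_email, no_email,
)
```

With this excerpt, a request to remediate `self_harm` content returns `None` regardless of strategy, because the domain is listed under `never_remediate_domains`.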

Key Behaviors

The profile compiles into an execution plan before module execution starts.
Modules are enabled explicitly; omitted modules do not run.
The baseline safety configuration is mandatory; app-specific domain policy is additive.
Workers orchestrate module calls. Supported catalog models can run through Workers AI; custom or specialist models should be called through approved external HTTPS model services.
Durable Objects are used for bounded execution and session coordination, not as long-term memory by default.
Normalized artifacts are internal by default and returned only when policy allows it.
Remediation must run a post-check before a remediated output is returned.
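
The first two behaviors above (compile before execution, explicit enablement) can be sketched as a small compile step. This is an illustration under stated assumptions, not the Palisade compiler: `PlanStep` and `compile_plan` are hypothetical names, the profile is a hand-written excerpt, and the rule that a module without `timeout_ms` inherits `max_total_latency_ms` is an assumption made here because the example profile leaves that case open.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PlanStep:
    module_id: str
    timeout_ms: int
    depends_on: tuple[str, ...]


def compile_plan(profile: dict[str, Any]) -> list[PlanStep]:
    """Turn a parsed profile into an ordered list of module steps."""
    # Assumption: the global latency budget caps every per-module timeout
    # and is the fallback when a module declares no timeout_ms.
    budget = profile["runtime"]["max_total_latency_ms"]
    steps: list[PlanStep] = []
    for module in profile.get("modules", []):
        # Explicit opt-in: anything other than `enabled: true` is skipped,
        # and modules absent from the list never appear in the plan.
        if module.get("enabled") is not True:
            continue
        steps.append(
            PlanStep(
                module_id=module["module_id"],
                timeout_ms=min(module.get("timeout_ms", budget), budget),
                depends_on=tuple(module.get("depends_on", [])),
            )
        )
    return steps


# Hand-written excerpt; the image module is disabled here only to show
# that disabled modules are dropped from the plan.
profile = {
    "runtime": {"max_total_latency_ms": 3000},
    "modules": [
        {"module_id": "text-deterministic-baseline-v0", "enabled": True, "timeout_ms": 250},
        {"module_id": "image-specialist-classifier-v0", "enabled": False, "timeout_ms": 2000},
        {"module_id": "audio-transcript-safety-v0", "enabled": True, "depends_on": ["audio_stt"]},
    ],
}

plan = compile_plan(profile)
```

Compiling to an immutable plan up front means the orchestrating Worker only iterates over steps that the profile explicitly enabled, and any profile error surfaces before a module is called.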
