Roles

Roles are the core authorization mechanism in Thand Agent: they define which permissions users can request and under what conditions. Each role acts as a template that specifies the scope of access, the approval workflows, and the inheritance relationships that enable flexible permission management.

Quick Start

A basic role definition:

version: "1.0"
roles:
  aws-developer:
    name: AWS Developer Access
    description: Developer access to AWS resources
    enabled: true
    
    permissions:
      allow:
        - ec2:DescribeInstances
        - s3:GetObject
        - s3:ListAllMyBuckets
    
    scopes:
      groups:
        - developers

Core Concepts

What is a Role?

A Thand role is a configuration template that defines:

  • Permissions: What actions can be performed (allow/deny rules)
  • Resources: Which resources can be accessed (with allow/deny rules)
  • Inheritance: Which other roles this role builds upon
  • Providers: Which provider instances can be used with this role
  • Scopes: Who can request this role (users/groups)
  • Workflows: How access requests are processed and approved

Thand Roles vs Provider Roles

It’s important to distinguish between:

  • Thand Roles: Defined in your agent configuration (documented here)
  • Provider Roles: Native roles in external systems (AWS IAM roles, Azure roles, etc.)

Thand roles can inherit from other Thand roles and from provider roles, which lets them reuse existing cloud IAM configurations.

Intelligent Permission Merging

Thand Agent features intelligent permission merging that:

  • Consolidates condensed actions: k8s:pods:get,list + k8s:pods:create,update = k8s:pods:create,get,list,update
  • Preserves GCP-style permissions: Permissions with dots in the action (e.g., gcp:compute.instances.get) are treated atomically and not condensed
  • Resolves Allow/Deny conflicts: The parent role (the one doing the inheriting) takes precedence over the roles it inherits from - Parent Allow overrides Child Deny, Parent Deny overrides Child Allow
  • Filters by provider: Inherited permissions with provider prefixes are automatically filtered to match the role’s configured providers, and matching prefixes are stripped from the output
  • Handles complex inheritance: Multi-level role inheritance with proper conflict resolution
  • Supports provider-specific naming: AWS ARNs, GCP service accounts, Azure resource IDs with complex naming patterns

Table of Contents

  1. Role Structure
  2. Permissions
  3. Resources
  4. Inheritance
  5. Scopes & Access Control
  6. Provider Integration
  7. Workflow Integration
  8. Configuration Management
  9. Best Practices
  10. Troubleshooting

Role Structure

Basic Configuration

version: "1.0"
roles:
  role-name:
    name: Human Readable Name
    description: Description of what this role provides
    enabled: true                    # Optional, defaults to true
    
    # Core role definition
    permissions:     # What actions are allowed/denied
      allow: []
      deny: []
    resources:       # What resources can be accessed
      allow: []
      deny: []
    inherits: []     # What other roles to inherit from
    providers: []    # Which providers can be used
    
    # Access control
    scopes:          # Who can request this role
      users: []
      groups: []
    
    # Process control  
    workflows: []         # How requests are processed
    authenticators: []    # Which auth providers are valid

Complete Role Example

version: "1.0"
roles:
  aws-developer:
    name: AWS Developer Access
    description: Developer access to AWS resources with approval workflow
    enabled: true
    
    # Inheritance - build upon existing roles
    inherits:
      - aws-basic-user                    # Local role
      - aws-dev:arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess  # AWS managed policy
    
    # Explicit permissions with intelligent merging
    permissions:
      allow:
        - ec2:DescribeInstances,StartInstances,StopInstances  # Condensed actions
        - s3:GetObject,PutObject          # Multiple S3 actions
        - logs:DescribeLogGroups,DescribeLogStreams
      deny:
        - ec2:TerminateInstances          # Explicit denial
    
    # Resource restrictions
    resources:
      allow:
        - "arn:aws:ec2:us-east-1:123456789012:instance/*"
        - "arn:aws:s3:::dev-bucket/*"
      deny:
        - "arn:aws:s3:::prod-bucket/*"
    
    # Provider restrictions
    providers:
      - aws-dev
      - aws-staging
    
    # Who can request this role
    scopes:
      users:
        - developer@example.com
      groups:
        - developers
        - engineering
    
    # Approval process
    workflows:
      - manager-approval
      - security-review
    
    # Valid authentication methods
    authenticators:
      - google-oauth
      - saml-sso

Configuration Fields Reference

Field           Type     Required  Description
name            string   Yes       Human-readable role name
description     string   Yes       Description of role purpose
enabled         boolean  No        Whether role is active (default: true)
permissions     object   No        Allow/deny permission rules
resources       object   No        Allow/deny resource rules
inherits        array    No        List of roles to inherit from
providers       array    No        List of provider instances this role can use
scopes          object   No        User/group access restrictions
workflows       array    No        Approval workflows to execute
authenticators  array    No        Valid authentication providers

Permissions

Permissions define what actions can be performed when a role is activated. Thand Agent supports intelligent permission merging that handles condensed actions, inheritance conflicts, and provider-specific permission formats.

Basic Permission Structure

permissions:
  allow:    # List of allowed actions
    - action1
    - action2
  deny:     # List of explicitly denied actions  
    - action3
    - action4

Condensed Actions

Thand Agent intelligently handles condensed actions where multiple related actions are specified in a single permission string:

permissions:
  allow:
    # Condensed format - multiple actions in one string
    - "k8s:pods:get,list,watch,create,update,delete"
    - "s3:GetObject,PutObject,ListBucket"
    - "ec2:DescribeInstances,StartInstances,StopInstances"
    
    # Individual format - also supported
    - "logs:DescribeLogGroups"
    - "logs:DescribeLogStreams"

Intelligent Merging

When roles are inherited or merged, condensed actions are intelligently combined:

# Base role
base-role:
  permissions:
    allow:
      - "k8s:pods:get,list,watch"
      - "s3:GetObject,ListBucket"

# Inheriting role
child-role:
  inherits: [base-role]
  permissions:
    allow:
      - "k8s:pods:create,update,delete"  # Will merge with base
      - "s3:PutObject,DeleteObject"      # Will merge with base

# Resulting merged permissions:
# - "k8s:pods:create,delete,get,list,update,watch"  (merged and sorted)
# - "s3:DeleteObject,GetObject,ListBucket,PutObject" (merged and sorted)
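
The merge shown above can be sketched in a few lines of Python. This is a minimal illustration, not Thand Agent's implementation: the function name merge_condensed is hypothetical, and the sketch assumes every permission has the form prefix:action[,action...] (GCP-style atomic permissions and wildcards are not handled here).

```python
from collections import defaultdict


def merge_condensed(*permission_lists):
    """Expand condensed permissions, merge them by prefix, and re-condense."""
    actions_by_prefix = defaultdict(set)
    for permissions in permission_lists:
        for permission in permissions:
            # Split on the last colon: "k8s:pods:get,list" -> "k8s:pods", "get,list"
            prefix, _, actions = permission.rpartition(":")
            for action in actions.split(","):
                actions_by_prefix[prefix].add(action.strip())
    # Sort actions within each prefix, then sort the condensed strings.
    return sorted(
        f"{prefix}:{','.join(sorted(actions))}"
        for prefix, actions in actions_by_prefix.items()
    )


base = ["k8s:pods:get,list,watch", "s3:GetObject,ListBucket"]
child = ["k8s:pods:create,update,delete", "s3:PutObject,DeleteObject"]
merged = merge_condensed(base, child)
```

With the base and inheriting role from the example, merged holds the same two strings listed in the "Resulting merged permissions" comment.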

Cloud Provider Permission Patterns

AWS Permissions

permissions:
  allow:
    - "ec2:*"                          # All EC2 actions
    - "s3:GetObject,PutObject"         # Specific S3 actions
    - "iam:PassRole"                   # Pass an IAM role to an AWS service
    - "logs:DescribeLogGroups,DescribeLogStreams,CreateLogStream"
  deny:
    - "ec2:TerminateInstances"         # Explicit denial
    - "s3:DeleteBucket"                # Protect against deletion

Azure Permissions

permissions:
  allow:
    - "Microsoft.Compute/virtualMachines/read,start,restart"
    - "Microsoft.Storage/storageAccounts/read"
    - "Microsoft.Authorization/roleAssignments/read"
  deny:
    - "Microsoft.Compute/virtualMachines/delete"
    - "Microsoft.Storage/storageAccounts/delete"

GCP Permissions

GCP permissions are atomic: Unlike AWS or Kubernetes permissions, GCP permissions contain dots in the action portion (e.g., compute.instances.get). These are detected automatically and treated as atomic - they are never condensed with other permissions.

permissions:
  allow:
    # GCP permissions are NOT condensed - each is kept separate
    - "gcp-prod:compute.instances.get"
    - "gcp-prod:compute.instances.list"
    - "gcp-prod:compute.instances.start"
    - "gcp-prod:storage.buckets.list"
    - "gcp-prod:iam.serviceAccounts.get"
  deny:
    - "gcp-prod:compute.instances.delete"
    - "gcp-prod:storage.buckets.delete"

The system automatically detects GCP-style permissions by checking if the last segment (after the final colon) contains a dot. If it does, the permission is treated as atomic.
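
The detection rule described above is small enough to write out. The following Python sketch follows the stated rule literally; the function name is_atomic_permission is hypothetical and not taken from Thand Agent's code.

```python
def is_atomic_permission(permission):
    """Return True when the segment after the final colon contains a dot."""
    # rsplit with maxsplit=1 keeps the whole string when there is no colon.
    last_segment = permission.rsplit(":", 1)[-1]
    return "." in last_segment


gcp_result = is_atomic_permission("gcp-prod:compute.instances.get")
k8s_result = is_atomic_permission("k8s:pods:get,list")
```

Under this reading of the rule, a string such as compute.instances.start,stop would be classified as one atomic permission rather than expanded into two, which is why each GCP permission is best listed on its own line.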

Kubernetes Permissions

permissions:
  allow:
    - "k8s:pods:get,list,watch,create,update,patch"
    - "k8s:services:get,list,create,update,delete"
    - "k8s:configmaps:get,list,create,update,delete"
    - "k8s:secrets:get,list"  # Read-only for secrets
  deny:
    - "k8s:secrets:create,update,delete"  # No secret modifications
    - "k8s:pods:delete"                   # Cannot delete pods

Allow/Deny Conflict Resolution

When the same action appears in both allow and deny lists, the system resolves conflicts using clear precedence rules.

Single Role Conflicts

# Within a single role, deny takes precedence
role:
  permissions:
    allow:
      - "k8s:pods:get,list,create,update,delete"
    deny:
      - "k8s:pods:delete"  # Removes 'delete' from the allow list

# Resolves to:
# allow: ["k8s:pods:create,get,list,update"]
# deny: []  (deny removed since conflict was resolved)
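
As a rough illustration of the single-role rule, the Python sketch below expands both lists, removes every denied action from the allow set, and drops the deny entries that were consumed by a conflict. The helper names expand and resolve_single_role are hypothetical, and the result is left in expanded form rather than re-condensed.

```python
def expand(permission):
    """Expand "k8s:pods:get,list" into {"k8s:pods:get", "k8s:pods:list"}."""
    prefix, _, actions = permission.rpartition(":")
    return {f"{prefix}:{action.strip()}" for action in actions.split(",")}


def resolve_single_role(allow, deny):
    """Remove denied actions from allow; drop deny entries that resolved a conflict."""
    allowed = set().union(*(expand(p) for p in allow))
    denied = set().union(*(expand(p) for p in deny))
    conflicts = allowed & denied
    return sorted(allowed - conflicts), sorted(denied - conflicts)


allow, deny = resolve_single_role(
    ["k8s:pods:get,list,create,update,delete"],
    ["k8s:pods:delete"],
)
```

For the role above, allow ends up with the create, get, list, and update actions on k8s:pods, and deny ends up empty.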

Inheritance Conflicts (Parent Wins)

In an inheritance chain, "parent" means the role doing the inheriting and "child" means a role it inherits from (the reverse of the usual object-oriented usage). The parent takes precedence over its children:

  • Parent Allow overrides Child Deny: If parent allows an action that child denied, the action is allowed
  • Parent Deny overrides Child Allow: If parent denies an action that child allowed, the action is denied

# Child role (the inherited role)
child-role:
  permissions:
    allow: ["ec2:StartInstances", "ec2:DescribeInstances"]
    deny: ["ec2:TerminateInstances"]

# Parent role (the role doing the inheriting)
parent-role:
  inherits: [child-role]
  permissions:
    allow: ["ec2:TerminateInstances"]  # Overrides child's deny
    deny: ["ec2:StartInstances"]       # Overrides child's allow

# Final resolved permissions:
# allow: ["ec2:DescribeInstances", "ec2:TerminateInstances"]  
#        (child's allow minus parent's deny, plus parent's allow)
# deny: ["ec2:StartInstances"]  
#       (parent's deny, child's deny removed by parent's allow)

This allows you to build restrictive roles that inherit permissive ones, or permissive roles that override restrictions from inherited roles.
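
The parent-wins rule can be expressed with plain set arithmetic. The Python sketch below is an illustration under two assumptions: permissions are already expanded to single actions, and each role's own allow/deny conflicts were resolved beforehand. The function name resolve_inheritance is hypothetical.

```python
def resolve_inheritance(parent_allow, parent_deny, child_allow, child_deny):
    """Combine a child's permissions with the parent's, letting the parent win."""
    parent_allow, parent_deny = set(parent_allow), set(parent_deny)
    # Child allows survive unless the parent denies them; parent allows always apply.
    allow = (set(child_allow) - parent_deny) | parent_allow
    # Child denies survive unless the parent allows them; parent denies always apply.
    deny = (set(child_deny) - parent_allow) | parent_deny
    return sorted(allow), sorted(deny)


allow, deny = resolve_inheritance(
    parent_allow=["ec2:TerminateInstances"],
    parent_deny=["ec2:StartInstances"],
    child_allow=["ec2:StartInstances", "ec2:DescribeInstances"],
    child_deny=["ec2:TerminateInstances"],
)
```

Run against the two roles above, this yields the same allow and deny lists as the "Final resolved permissions" comment.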

Wildcard Permissions

Support for wildcard patterns varies by provider. Wildcards automatically subsume more specific permissions:

permissions:
  allow:
    # AWS wildcards
    - "ec2:*"                    # All EC2 actions
    - "s3:*Object*"              # All object-related S3 actions
    
    # Kubernetes wildcards  
    - "k8s:*:*"                  # All Kubernetes actions
    - "k8s:pods:*"               # All pod actions
    
    # Azure wildcards
    - "Microsoft.Compute/*"      # All compute actions

Wildcard Subsumption

When wildcards are present, more specific permissions under that wildcard are automatically removed (subsumed):

permissions:
  allow:
    - "ec2:*"                    # Wildcard
    - "ec2:DescribeInstances"    # Will be removed (subsumed by ec2:*)
    - "ec2:StartInstances"       # Will be removed (subsumed by ec2:*)
    - "s3:GetObject"             # Kept (not under ec2:*)

# After condensing, becomes:
# allow: ["ec2:*", "s3:GetObject"]

A wildcard does not subsume itself: ec2:* is kept even if the same wildcard appears more than once in the merged list. Only more specific permissions (like ec2:DescribeInstances) are subsumed.
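
Subsumption can be approximated with glob matching. The Python sketch below uses fnmatch.fnmatchcase from the standard library and is an assumption about the matching semantics, not a description of Thand Agent's matcher: here * matches any run of characters, including colons, and duplicate entries are collapsed. The function name subsume is hypothetical.

```python
from fnmatch import fnmatchcase


def subsume(permissions):
    """Drop permissions that are covered by a different wildcard in the same list."""
    wildcards = [p for p in permissions if "*" in p]
    kept = []
    for permission in permissions:
        # A wildcard never removes itself, only other entries it matches.
        covered = any(
            w != permission and fnmatchcase(permission, w) for w in wildcards
        )
        if not covered and permission not in kept:
            kept.append(permission)
    return kept


result = subsume(
    ["ec2:*", "ec2:DescribeInstances", "ec2:StartInstances", "s3:GetObject"]
)
```

For the list above, result keeps only ec2:* and s3:GetObject.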


Resources

Resources define which resources a role's permissions apply to. They provide fine-grained control over the specific cloud resources, files, databases, or other assets that can be accessed.

Basic Resource Structure

resources:
  allow:    # List of allowed resources
    - resource1
    - resource2
  deny:     # List of explicitly denied resources
    - resource3
    - resource4

Cloud Provider Resource Patterns

AWS Resources (ARNs)

resources:
  allow:
    # EC2 resources
    - "arn:aws:ec2:*:*:instance/*"           # All EC2 instances
    - "arn:aws:ec2:us-east-1:123456789012:instance/*"  # Specific region/account
    
    # S3 resources
    - "arn:aws:s3:::dev-bucket/*"            # Specific bucket contents
    - "arn:aws:s3:::app-*/*"                 # Pattern-based bucket access
    
    # IAM resources
    - "arn:aws:iam::123456789012:role/app-*" # Application roles only
    
    # RDS resources
    - "arn:aws:rds:*:*:db:dev-*"            # Development databases
  deny:
    - "arn:aws:s3:::prod-bucket/*"          # Sensitive production data
    - "arn:aws:iam::*:role/admin-*"         # Administrative roles

Azure Resources

resources:
  allow:
    # Virtual machines
    - "/subscriptions/*/resourceGroups/dev-*/providers/Microsoft.Compute/virtualMachines/*"
    - "/subscriptions/12345/resourceGroups/app-*/providers/Microsoft.Compute/*"
    
    # Storage accounts
    - "/subscriptions/*/resourceGroups/*/providers/Microsoft.Storage/storageAccounts/dev*"
    
    # Resource groups
    - "/subscriptions/*/resourceGroups/development-*"
    - "/subscriptions/*/resourceGroups/staging-*"
  deny:
    - "/subscriptions/*/resourceGroups/production-*"  # No production access
    - "/subscriptions/*/resourceGroups/*/providers/Microsoft.Storage/storageAccounts/prod*"

GCP Resources

resources:
  allow:
    # Compute instances
    - "projects/dev-project/zones/*/instances/*"
    - "projects/*/zones/us-central1-*/instances/app-*"
    
    # Storage buckets
    - "projects/*/global/buckets/dev-*"
    - "projects/my-project/global/buckets/staging-*"
    
    # Networks
    - "projects/*/global/networks/default"
    - "projects/*/regions/*/subnetworks/dev-*"
  deny:
    - "projects/prod-project/*"               # No production project access
    - "projects/*/global/buckets/sensitive-*" # Sensitive buckets

Kubernetes Resources

resources:
  allow:
    # Namespace-scoped resources
    - "namespace:development"
    - "namespace:staging"
    - "namespace:feature-*"
    
    # Specific resource types
    - "namespace:dev/pods/*"
    - "namespace:dev/services/*"
    - "namespace:*/configmaps/app-*"
  deny:
    - "namespace:production"           # No production namespace
    - "namespace:*/secrets/*"          # No secret access
    - "namespace:kube-system"          # No system namespace

Resource Inheritance and Merging

Resources from inherited roles are merged using the same intelligent system as permissions:

# Base role
base-role:
  resources:
    allow:
      - "arn:aws:s3:::app-bucket/*"
      - "arn:aws:ec2:*:*:instance/i-dev-*"
    deny:
      - "arn:aws:s3:::app-bucket/sensitive/*"

# Inheriting role
child-role:
  inherits: [base-role]
  resources:
    allow:
      - "arn:aws:s3:::logs-bucket/*"     # Additional resource
      - "arn:aws:ec2:*:*:instance/i-staging-*"
    deny:
      - "arn:aws:s3:::logs-bucket/audit/*"  # Additional restriction

# Merged result:
# allow: 
#   - "arn:aws:s3:::app-bucket/*"
#   - "arn:aws:s3:::logs-bucket/*"  
#   - "arn:aws:ec2:*:*:instance/i-dev-*"
#   - "arn:aws:ec2:*:*:instance/i-staging-*"
# deny:
#   - "arn:aws:s3:::app-bucket/sensitive/*"
#   - "arn:aws:s3:::logs-bucket/audit/*"

Resource Pattern Matching

Wildcards

resources:
  allow:
    - "arn:aws:s3:::*-dev/*"           # Any bucket ending with '-dev'
    - "arn:aws:ec2:*:*:instance/*"     # All instances in any region
    - "projects/*/zones/us-*/*"        # US zones only

Path-based Patterns

resources:
  allow:
    # Hierarchical access
    - "projects/my-project/zones/us-central1-a/*"
    - "/subscriptions/12345/resourceGroups/dev-*/providers/*"
    
    # File-system style paths
    - "/app/data/dev/*"
    - "/shared/logs/application-*"

Inheritance

Role inheritance is a powerful feature that allows roles to build upon each other, promoting reusability and consistent security patterns. Thand Agent features intelligent inheritance that properly handles complex permission merging, provider-specific role names, and conflict resolution.

How Inheritance Works

When a role inherits from other roles:

  1. Provider Filtering: Inherited permissions/resources/groups with provider prefixes are filtered to only include those matching the providers list of the parent role (the role doing the inheriting). Matching prefixes are stripped from the output.
  2. Permission Expansion: Condensed actions (e.g., k8s:pods:get,list) are expanded to individual permissions for merging
  3. GCP Permission Detection: Permissions with dots in the action (e.g., compute.instances.get) are detected and kept atomic
  4. Intelligent Merging: All allow and deny lists are combined with proper conflict resolution
  5. Conflict Resolution: Parent Allow overrides Child Deny, Parent Deny overrides Child Allow
  6. Action Condensing: Final condensable permissions are re-condensed for clean output (GCP-style permissions remain atomic)
  7. Scope Validation: Inherited roles must be applicable to the requesting identity

Inheritance Types

1. Local Role Inheritance

Inherit from other Thand roles:

roles:
  base-user:
    name: Base User
    permissions:
      allow:
        - "ec2:DescribeInstances,DescribeImages"
        - "s3:ListAllMyBuckets,GetBucketLocation"
  
  power-user:
    name: Power User
    inherits:
      - base-user  # Inherits base-user permissions
    permissions:
      allow:
        - "ec2:StartInstances,StopInstances,RebootInstances"  # Additional permissions
        - "s3:GetObject,PutObject"

# Resulting power-user permissions (intelligently merged):
# allow:
#   - "ec2:DescribeImages,DescribeInstances,RebootInstances,StartInstances,StopInstances"
#   - "s3:GetBucketLocation,GetObject,ListAllMyBuckets,PutObject"

2. Provider Role Inheritance

Inherit from cloud provider managed roles using provider-specific syntax:

roles:
  aws-admin:
    name: AWS Administrator
    inherits:
      # Direct AWS managed policy
      - "arn:aws:iam::aws:policy/AdministratorAccess"
      
      # Provider-scoped inheritance
      - "aws-prod:arn:aws:iam::aws:policy/ReadOnlyAccess"
    
  gcp-viewer:
    name: GCP Viewer
    inherits:
      # GCP predefined role
      - "roles/viewer"
      
      # Provider-scoped GCP role
      - "gcp-prod:roles/compute.viewer"
    permissions:
      allow:
        - "compute.instances.start"  # Additional specific permissions
        - "compute.instances.stop"   # (listed separately: GCP permissions are never condensed)

  azure-contributor:
    name: Azure Contributor
    inherits:
      # Azure built-in role
      - "Contributor"
      
      # Provider-scoped Azure role
      - "azure-prod:/subscriptions/12345/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"

3. Complex Provider-Specific Inheritance

Handle complex role names with multiple colons (AWS ARNs, service accounts):

roles:
  kubernetes-admin:
    name: Kubernetes Administrator
    inherits:
      # AWS ARN with multiple colons - uses first colon as delimiter
      - "aws-prod:arn:aws:iam::123456789012:role/KubernetesAdmin"
      
      # GCP service account with @ symbol
      - "gcp-prod:k8s-admin@my-project.iam.gserviceaccount.com"
      
      # Azure resource ID with multiple path segments
      - "azure-prod:/subscriptions/12345/resourceGroups/k8s/providers/Microsoft.ManagedIdentity/userAssignedIdentities/k8s-admin"

  multi-cloud-viewer:
    name: Multi-Cloud Viewer
    inherits:
      - local-base-viewer           # Local role
      - "aws-prod:arn:aws:iam::aws:policy/ReadOnlyAccess"
      - "gcp-prod:roles/viewer"
      - "azure-prod:Reader"
    permissions:
      allow:
        - "custom:audit,monitor"     # Additional custom permissions

4. Mixed Inheritance with Intelligent Merging

Combine local and provider roles with complex permission merging:

roles:
  base-k8s:
    name: Base Kubernetes
    permissions:
      allow:
        - "k8s:pods:get,list,watch"
        - "k8s:services:get,list"

  k8s-developer:
    name: Kubernetes Developer
    inherits: [base-k8s]
    permissions:
      allow:
        - "k8s:pods:create,update,patch"      # Merges with inherited get,list,watch
        - "k8s:services:create,update,delete" # Merges with inherited get,list
        - "k8s:configmaps:get,list,create,update,delete"
      deny:
        - "k8s:pods:delete"                   # Prevents pod deletion

  k8s-admin:
    name: Kubernetes Administrator  
    inherits: [k8s-developer]
    permissions:
      allow:
        - "k8s:pods:delete"                   # Overrides the deny inherited from k8s-developer
        - "k8s:secrets:get,list,create"
        - "k8s:*:*"                          # Admin access to all
      deny:
        - "k8s:secrets:delete"                # Even admins can't delete secrets

# Final k8s-admin permissions after intelligent merging:
# allow:
#   - "k8s:*:*"  (covers everything including specific permissions)
# deny:  
#   - "k8s:secrets:delete"  (explicit restriction even for admin)

Inheritance Resolution Process

The inheritance system resolves permissions in this order:

  1. Parse Inheritance: Extract provider prefixes and role names (first colon is the delimiter)
  2. Provider Filtering: Filter inherited items by the parent role’s providers list
  3. Scope Validation: Ensure each inherited role is applicable to the requesting identity
  4. Recursive Resolution: Resolve inheritance chains (A inherits B inherits C)
  5. Permission Expansion: Expand condensable actions to individual permissions (GCP-style permissions with dots are kept atomic)
  6. Intelligent Merging: Combine permissions from all inheritance levels
  7. Conflict Resolution: Apply Parent-over-Child conflict resolution (Parent Allow overrides Child Deny, Parent Deny overrides Child Allow)
  8. Action Condensing: Condense related actions back for clean output (GCP-style permissions remain atomic)
  9. Wildcard Subsumption: Remove permissions subsumed by wildcards (e.g., ec2:* subsumes ec2:DescribeInstances)

Provider-Specific Inheritance Syntax

When inheriting from provider roles, use the provider name as a prefix:

# Format: provider-name:role-identifier
inherits:
  - "aws-prod:arn:aws:iam::123456789012:role/MyRole"      # AWS role
  - "gcp-prod:roles/storage.admin"                         # GCP role  
  - "azure-prod:Storage Blob Data Contributor"             # Azure role
  - "k8s-prod:cluster-admin"                              # Kubernetes role

Parser Behavior:

  • Uses the first colon as the delimiter between provider and role
  • Everything before first colon = provider name
  • Everything after first colon = role identifier
  • Handles complex identifiers like AWS ARNs with multiple colons correctly
  • Provider name is validated against configured providers
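
The first-colon rule maps directly onto Python's str.partition. The sketch below is illustrative only: the function name parse_inherit is hypothetical, and treating an entry whose prefix is not a configured provider (a local role name, or a bare ARN) as an unprefixed identifier is an assumption about how validation interacts with parsing.

```python
from typing import Optional


def parse_inherit(entry: str, configured_providers: set) -> tuple:
    """Split an inherits entry into (provider, role identifier)."""
    # partition splits on the FIRST colon only, so ARNs stay intact.
    prefix, separator, identifier = entry.partition(":")
    if separator and prefix in configured_providers:
        return prefix, identifier
    # No recognised provider prefix: treat the whole entry as the identifier.
    provider: Optional[str] = None
    return provider, entry


providers = {"aws-prod", "gcp-prod"}
scoped = parse_inherit("aws-prod:arn:aws:iam::123456789012:role/MyRole", providers)
local = parse_inherit("base-user", providers)
```

Here scoped carries aws-prod as the provider and the full ARN as the identifier, while local has no provider and keeps base-user unchanged.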

Provider Filtering and Prefix Stripping

When permissions, resources, or groups have provider prefixes, they are automatically filtered based on the role’s providers list. Matching prefixes are stripped from the output:

# Base role with provider-prefixed permissions and resources
base-cloud-role:
  permissions:
    allow:
      - "aws-prod:ec2:DescribeInstances"
      - "gcp-prod:compute.instances.get"
      - "azure-prod:Microsoft.Compute/virtualMachines/read"
  resources:
    allow:
      - "aws:*"      # Provider engine type prefix
      - "gcp:*"      # Provider engine type prefix
      - "azure:*"    # Provider engine type prefix

# Role that only uses AWS providers
aws-only-role:
  inherits: [base-cloud-role]
  providers: [aws-prod, aws-dev]  # Only AWS providers
  # Resulting permissions (prefix stripped):
  #   - "ec2:DescribeInstances"  (was "aws-prod:ec2:DescribeInstances")
  # Resulting resources (prefix stripped):
  #   - "*"  (was "aws:*" - "aws" matches aws-prod's engine type)
  # GCP and Azure items are filtered out completely

# Role that uses multiple providers
multi-cloud-role:
  inherits: [base-cloud-role]
  providers: [aws-prod, gcp-prod]  # AWS and GCP
  # Resulting permissions (prefixes stripped):
  #   - "ec2:DescribeInstances"      (was "aws-prod:ec2:DescribeInstances")
  #   - "compute.instances.get"      (was "gcp-prod:compute.instances.get")
  # Resulting resources (prefixes stripped):
  #   - "*"  (from both "aws:*" and "gcp:*")
  # Azure items are filtered out completely

How prefix matching works:

  • Exact provider name match: aws-prod:permission matches providers list containing aws-prod
  • Engine type match: aws:* matches any provider with engine type aws (e.g., aws-prod, aws-dev)
  • When a prefix matches, it is removed from the output
  • Items without a provider prefix are always included as-is

Provider prefixes are stripped when they match, leaving clean permission/resource strings without provider annotations. This allows the same base role to be used across different provider configurations.
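
The filtering and stripping rules can be sketched as follows. This is a simplified Python model, not the actual implementation: the function name filter_by_provider and the engine_types mapping (every configured provider name mapped to its engine type) are hypothetical, and every provider in the role's list is assumed to be present in that mapping.

```python
def filter_by_provider(items, role_providers, engine_types):
    """Keep items usable by the role's providers and strip matching prefixes."""
    # Prefixes that match this role: provider names and their engine types.
    matching = set(role_providers) | {engine_types[p] for p in role_providers}
    # Every prefix that refers to some configured provider or engine.
    known = set(engine_types) | set(engine_types.values())
    result = []
    for item in items:
        prefix, separator, rest = item.partition(":")
        if separator and prefix in matching:
            candidate = rest      # matching prefix is stripped
        elif separator and prefix in known:
            continue              # belongs to a provider this role does not use
        else:
            candidate = item      # no provider prefix: included as-is
        if candidate not in result:
            result.append(candidate)
    return result


engine_types = {"aws-prod": "aws", "aws-dev": "aws", "gcp-prod": "gcp", "azure-prod": "azure"}
permissions = [
    "aws-prod:ec2:DescribeInstances",
    "gcp-prod:compute.instances.get",
    "azure-prod:Microsoft.Compute/virtualMachines/read",
]
aws_only = filter_by_provider(permissions, ["aws-prod", "aws-dev"], engine_types)
multi_cloud = filter_by_provider(permissions, ["aws-prod", "gcp-prod"], engine_types)
```

The two results correspond to the aws-only-role and multi-cloud-role outcomes in the example above.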

Inheritance Validation

The system validates inheritance chains:

Cyclic Inheritance Detection

# This will be detected and rejected
role-a:
  inherits: [role-b]
role-b:
  inherits: [role-c]  
role-c:
  inherits: [role-a]  # Cycle detected!
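
One common way to detect such a cycle is a depth-first walk that tracks the current path. The Python sketch below shows that general technique; it is not taken from Thand Agent, and the function name find_cycle and the plain dict input (role name mapped to its inherits list) are hypothetical.

```python
from typing import Optional


def find_cycle(inherits: dict) -> Optional[list]:
    """Return one inheritance cycle as a list of role names, or None."""
    path = []       # roles on the current depth-first path
    done = set()    # roles already proven cycle-free

    def visit(role):
        if role in done:
            return None
        if role in path:
            # Seen again on the same path: slice out the loop.
            return path[path.index(role):] + [role]
        path.append(role)
        for inherited in inherits.get(role, []):
            cycle = visit(inherited)
            if cycle:
                return cycle
        path.pop()
        done.add(role)
        return None

    for role in inherits:
        cycle = visit(role)
        if cycle:
            return cycle
    return None


cycle = find_cycle({"role-a": ["role-b"], "role-b": ["role-c"], "role-c": ["role-a"]})
```

For the three roles above, cycle traces the loop from role-a through role-b and role-c back to role-a.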

Missing Role Detection

# This will fail if 'nonexistent-role' doesn't exist
my-role:
  inherits: [nonexistent-role]  # Error: role not found

Scope Compatibility

# Inherited role must be applicable to the requesting user
admin-role:
  scopes:
    groups: [admins]
    
developer-role:
  inherits: [admin-role]  # Will fail if user is not in 'admins' group
  scopes:
    groups: [developers]

Inheritance Best Practices

1. Build Role Hierarchies

# Base roles with minimal permissions
readonly-base:
  permissions:
    allow: ["*:Describe*", "*:List*", "*:Get*"]

# Specialized roles building on base
ec2-readonly:
  inherits: [readonly-base]
  resources:
    allow: ["arn:aws:ec2:*:*:*"]

# Team-specific roles
dev-team-ec2:
  inherits: [ec2-readonly]
  scopes:
    groups: [developers]

2. Use Provider Managed Roles

# Leverage existing cloud roles
aws-power-user:
  inherits:
    - "aws-prod:arn:aws:iam::aws:policy/PowerUserAccess"
  # Add company-specific restrictions
  resources:
    deny:
      - "arn:aws:s3:::sensitive-*"

3. Layer Security Controls

restrictive-admin:
  name: Restrictive Admin
  inherits:
    - "aws-prod:arn:aws:iam::aws:policy/AdministratorAccess"
  # Add explicit denials even for admins
  permissions:
    deny:
      - "iam:DeleteUser"
      - "iam:DeleteRole"
      - "s3:DeleteBucket"
  resources:
    deny:
      - "arn:aws:s3:::critical-*"

Scopes & Access Control

Scopes control who can request a role. This enables role-based access control at the user/group level, ensuring that only authorized identities can request specific roles.

Scope Structure

scopes:
  users:    # Specific user identities
    - user1@example.com
    - user2@example.com
  groups:   # Group memberships
    - group1
    - group2

User Scopes

Grant access to specific users using various identity formats:

scopes:
  users:
    - alice@example.com           # Email address
    - bob.smith@company.com       # Full name email
    - service-account@project.iam.gserviceaccount.com  # Service account
    - "123456789"                 # User ID
    - "alice.smith"               # Username

Group Scopes

Grant access to groups (depends on identity provider):

scopes:
  groups:
    - developers                  # Simple group name
    - engineering                 # Department
    - on-call                     # Role-based group
    - team-alpha                  # Team designation
    - contractors                 # Employment type

Identity Provider Integration

Different identity providers may have different group formats:

scopes:
  groups:
    # Active Directory groups
    - "DOMAIN\\Domain Users"
    - "CORP\\Engineering"
    
    # OIDC/OAuth groups  
    - "developers"
    - "admin-users"
    
    # SAML groups (here as LDAP distinguished names)
    - "cn=developers,ou=groups,dc=company,dc=com"
    
    # GitHub teams
    - "my-org/developers"
    - "my-org/admin-team"

Mixed Scopes

Combine users and groups for flexible access control:

scopes:
  users:
    - emergency-admin@example.com     # Emergency access user
    - service-bot@example.com         # Automated service
  groups:
    - on-call                         # On-call team members
    - security-team                   # Security personnel
    - senior-engineers                # Senior staff

Public Roles

Omit scopes to allow any authenticated user to request the role:

roles:
  basic-viewer:
    name: Basic Viewer Access
    description: Read-only access available to all authenticated users
    # No 'scopes' field - available to all users
    permissions:
      allow:
        - "*:Describe*"
        - "*:List*"
        - "*:Get*"

Scope Inheritance

When roles inherit from other roles, scope checking is applied to each role in the inheritance chain:

roles:
  admin-base:
    name: Admin Base
    scopes:
      groups: [admins]
    permissions:
      allow: ["*:*"]
  
  senior-admin:
    name: Senior Admin
    inherits: [admin-base]          # User must be in 'admins' group
    scopes:
      groups: [senior-staff]        # AND in 'senior-staff' group
    permissions:
      allow: ["sensitive:*"]

# For senior-admin role to work, user must be in BOTH groups:
# - 'admins' (required by admin-base)
# - 'senior-staff' (required by senior-admin)

Scope Validation Examples

Successful Access

# Role definition
developer-role:
  scopes:
    groups: [developers, engineering]

# User identity
user:
  email: alice@example.com
  groups: [developers, qa-team]

# Result: ✅ Access granted (user in 'developers' group)

Failed Access

# Role definition  
admin-role:
  scopes:
    users: [admin@example.com]
    groups: [administrators]

# User identity
user:
  email: alice@example.com
  groups: [developers]

# Result: ❌ Access denied (user not in allowed users or groups)
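
The scope rules in this section can be summarised in a short Python sketch. It assumes, based on the examples above, that a user matches a role when listed under users or when belonging to any listed group, that a role without scopes is open to every authenticated user, and that every local role in the inheritance chain must also match. The function names in_scope and can_request and the dict layout are hypothetical, and the chain is assumed to be acyclic.

```python
def in_scope(scopes, email, groups):
    """Check one role's scopes against a user's email and group memberships."""
    if not scopes:
        return True  # no scopes: available to any authenticated user
    user_match = email in scopes.get("users", [])
    group_match = bool(set(groups) & set(scopes.get("groups", [])))
    return user_match or group_match


def can_request(role_name, roles, email, groups):
    """A role is requestable only if it and every inherited local role are in scope."""
    role = roles[role_name]
    if not in_scope(role.get("scopes"), email, groups):
        return False
    return all(
        can_request(inherited, roles, email, groups)
        for inherited in role.get("inherits", [])
        if inherited in roles  # provider roles are not checked in this sketch
    )


roles = {
    "admin-base": {"scopes": {"groups": ["admins"]}},
    "senior-admin": {"inherits": ["admin-base"], "scopes": {"groups": ["senior-staff"]}},
}
only_senior = can_request("senior-admin", roles, "alice@example.com", ["senior-staff"])
both_groups = can_request("senior-admin", roles, "alice@example.com", ["senior-staff", "admins"])
```

In this model only_senior is False and both_groups is True, matching the senior-admin example under Scope Inheritance.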

Provider Integration

Roles specify which provider instances can be used for role elevation. This enables multi-cloud and multi-environment access control.

Single Provider

Restrict a role to a specific provider instance:

roles:
  aws-dev-access:
    name: AWS Development Access
    providers:
      - aws-dev  # Only the aws-dev provider instance
    permissions:
      allow:
        - "ec2:*"
        - "s3:*"

Multi-Provider

Allow a role to work across multiple provider instances:

roles:
  multi-cloud-viewer:
    name: Multi-Cloud Viewer Access
    providers:
      - aws-prod
      - azure-prod  
      - gcp-prod
    permissions:
      allow:
        - "*:Describe*"
        - "*:List*"
        - "*:Get*"

Environment-Specific Providers

Organize providers by environment:

roles:
  development-admin:
    name: Development Administrator
    providers:
      - aws-dev
      - azure-dev
      - gcp-dev
      - k8s-dev
    permissions:
      allow: ["*:*"]
  
  production-readonly:
    name: Production Read-Only
    providers:
      - aws-prod
      - azure-prod
      - gcp-prod
    permissions:
      allow:
        - "*:Describe*"
        - "*:List*"
        - "*:Get*"

Provider Inheritance Compatibility

When inheriting from provider roles, ensure the provider supports the inherited role:

roles:
  aws-ec2-admin:
    name: EC2 Administrator
    providers:
      - aws-prod
    inherits:
      # This AWS managed policy must exist in the aws-prod provider
      - "aws-prod:arn:aws:iam::aws:policy/AmazonEC2FullAccess"
    
  gcp-compute-admin:
    name: GCP Compute Administrator  
    providers:
      - gcp-prod
    inherits:
      # This GCP role must be available in the gcp-prod provider
      - "gcp-prod:roles/compute.admin"

Provider Validation

The system validates provider compatibility:

# This will fail if aws-staging doesn't have the specified role
problematic-role:
  providers:
    - aws-staging
  inherits:
    - "aws-prod:arn:aws:iam::123456789012:role/CustomRole"  # Different provider!

Correct approach:

correct-role:
  providers:
    - aws-staging
  inherits:
    - "aws-staging:arn:aws:iam::123456789012:role/CustomRole"  # Same provider
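
A prefix mismatch like the one in problematic-role can be caught with a simple check before any provider is contacted. The Python sketch below only compares prefixes against the role's providers list; it does not verify that the role exists in the provider, and the function name mismatched_inherits is hypothetical.

```python
def mismatched_inherits(role_providers, inherits, configured_providers):
    """Return inherits entries whose provider prefix is not in the role's providers."""
    problems = []
    for entry in inherits:
        # The first colon separates the provider name from the role identifier.
        prefix, separator, _ = entry.partition(":")
        if separator and prefix in configured_providers and prefix not in role_providers:
            problems.append(entry)
    return problems


configured = {"aws-prod", "aws-staging"}
bad = mismatched_inherits(
    ["aws-staging"],
    ["aws-prod:arn:aws:iam::123456789012:role/CustomRole"],
    configured,
)
good = mismatched_inherits(
    ["aws-staging"],
    ["aws-staging:arn:aws:iam::123456789012:role/CustomRole"],
    configured,
)
```

Here bad contains the aws-prod entry from problematic-role, while good is empty for correct-role.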

Workflow Integration

Roles integrate with workflows to define approval processes, time limits, and other governance controls.

Basic Workflow Assignment

roles:
  sensitive-admin:
    name: Sensitive Admin Access
    workflows:
      - manager-approval     # Requires manager approval
      - security-review      # Additional security review
    permissions:
      allow: ["*:*"]

Multiple Workflows

Multiple workflows can be applied to a single role for different purposes:

Sequential Execution

Workflows are typically executed in sequence, with each workflow having specific responsibilities:

roles:
  production-access:
    name: Production Access
    workflows:
      - identity-verification    # Step 1: Verify identity
      - manager-approval         # Step 2: Manager approval  
      - security-approval        # Step 3: Security team approval
      - time-limit               # Step 4: Apply time limits
    permissions:
      allow: ["*:*"]

Scoped Workflows

Different workflows can be scoped to specific resources, users, teams, or permissions:

roles:
  multi-scoped-admin:
    name: Multi-Scoped Administrator
    workflows:
      # Base approval for all requests
      - manager-approval
      
      # Additional security review for sensitive resources
      - security-review          # Scoped to sensitive resources
      
      # Emergency bypass for on-call team
      - emergency-bypass         # Scoped to on-call users
      
      # Extended approval for high-privilege actions
      - ciso-approval           # Scoped to admin permissions
      
      # Audit logging for all actions
      - audit-trail             # Applied to all requests
    permissions:
      allow: ["*:*"]
    resources:
      allow: ["*"]

Resource-Scoped Workflow Example

roles:
  database-admin:
    name: Database Administrator
    workflows:
      # Standard approval for read operations
      - team-lead-approval      # Scoped to read-only permissions
      
      # DBA approval for schema changes
      - dba-approval           # Scoped to DDL operations
      
      # CISO approval for production databases
      - ciso-approval          # Scoped to production resources
      
      # Immediate notification for all access
      - security-notification  # Applied to all requests
    permissions:
      allow:
        - "rds:Describe*,List*"           # Read operations
        - "rds:CreateDBSnapshot"          # Backup operations
        - "rds:ModifyDBInstance"          # Configuration changes
        - "rds:CreateDBInstance"          # New instance creation
    resources:
      allow:
        - "arn:aws:rds:*:*:db:dev-*"     # Development databases
        - "arn:aws:rds:*:*:db:staging-*" # Staging databases  
        - "arn:aws:rds:*:*:db:prod-*"    # Production databases

Team-Scoped Workflow Example

roles:
  escalated-support:
    name: Escalated Support Access
    workflows:
      # Different approval chains for different teams
      - l2-approval            # Scoped to L2 support team
      - security-approval      # Scoped to security team members
      - manager-approval       # Scoped to engineering managers
      - emergency-access       # Scoped to on-call personnel
      
      # Universal workflows
      - access-logging         # Applied to all users
      - time-restriction       # Applied to all requests
    scopes:
      groups:
        - l2-support           # L2 support team
        - security-team        # Security personnel
        - engineering-managers # Engineering managers
        - on-call             # On-call rotation
    permissions:
      allow: ["*:*"]

Permission-Scoped Workflow Example

roles:
  cloud-engineer:
    name: Cloud Engineer
    workflows:
      # Light approval for read operations
      - self-approval          # Scoped to read-only permissions
      
      # Team approval for standard operations
      - peer-review           # Scoped to create/update operations
      
      # Management approval for destructive operations
      - manager-approval      # Scoped to delete operations
      
      # Security review for IAM operations
      - security-review       # Scoped to IAM/security permissions
      
      # Audit trail for all actions
      - comprehensive-audit   # Applied to all permissions
    permissions:
      allow:
        - "ec2:Describe*,List*,Get*"     # Read operations
        - "ec2:Start*,Stop*,Reboot*"     # Management operations
        - "ec2:Create*,Update*,Modify*"  # Create/modify operations
        - "ec2:Terminate*,Delete*"       # Destructive operations
        - "iam:List*,Get*,Describe*"     # IAM read operations
        - "iam:Create*,Update*,Delete*"  # IAM write operations

Conditional Workflows

Workflows can implement conditional logic based on context:

roles:
  adaptive-access:
    name: Adaptive Access Control
    workflows:
      # Risk-based routing workflow
      - risk-assessment        # Analyzes request context and routes accordingly
      
      # Conditional workflows based on risk assessment:
      # - Low risk: automatic approval
      # - Medium risk: manager approval + time limits
      # - High risk: security team + CISO approval + enhanced monitoring
      
      # Time-based workflows
      - business-hours-check   # Different approval paths for after-hours access
      
      # Location-based workflows  
      - geo-validation        # Additional verification for non-standard locations
      
      # Frequency-based workflows
      - usage-pattern-check   # Escalated approval for unusual access patterns
    permissions:
      allow: ["*:*"]

Dynamic Workflow Selection

Workflows can be dynamically selected based on request attributes:

roles:
  smart-admin:
    name: Smart Administrative Access
    workflows:
      # Base workflow engine that routes to appropriate sub-workflows
      - dynamic-router
      
      # Available sub-workflows (selected by dynamic-router):
      # For emergency situations:
      - emergency-fast-track    # Immediate approval with post-review
      
      # For business hours, standard requests:
      - standard-approval      # Manager approval within business hours
      
      # For after-hours requests:
      - extended-approval      # Manager + security team approval
      
      # For high-risk operations:
      - enhanced-security      # Multi-level approval + monitoring
      
      # For audit/compliance requests:
      - compliance-track       # Compliance team approval + audit trail
    permissions:
      allow: ["*:*"]

Workflow Context

Workflows receive rich context about the role request, enabling intelligent routing and scoped processing:

Request Context

  • Role name: Which role is being requested
  • User identity: Who is requesting access (user ID, email, groups)
  • Duration: How long access is requested for
  • Justification: User-provided reason for access
  • Requested resources: Specific resources if applicable
  • Provider instance: Which provider instance will be used
  • Time context: Request time, business hours, timezone
  • Location context: User’s IP address, geolocation
  • Risk factors: Unusual access patterns, privilege escalation

Permission Context

  • Requested permissions: Specific actions being requested
  • Permission risk level: Classification of permission sensitivity
  • Resource sensitivity: Classification of target resources
  • Blast radius: Potential impact of the requested access

Historical Context

  • Access history: User’s previous access patterns
  • Approval history: Past approval decisions for similar requests
  • Incident context: Recent security incidents or alerts
  • Compliance status: Current compliance posture

Example: Context-Aware Workflow

roles:
  context-aware-admin:
    name: Context-Aware Administrator
    workflows:
      # Main routing workflow that uses all available context
      - intelligent-router
      
      # Context-specific workflows:
      - first-time-access      # For users with no access history
      - repeat-access          # For users with established patterns
      - anomaly-detected       # For unusual access patterns
      - high-risk-resource     # For sensitive resource access
      - compliance-required    # For regulated environments
      - incident-response      # For emergency/incident scenarios
    permissions:
      allow: ["*:*"]

Integration Examples

Emergency Access with Multiple Safeguards

roles:
  break-glass:
    name: Emergency Break Glass Access
    workflows:
      # Immediate notification workflows
      - emergency-notification   # Immediately notify security team
      - incident-tracking        # Create incident ticket automatically
      
      # Approval workflows (can run in parallel)
      - on-call-approval         # On-call engineer approval (fastest)
      - security-notification    # Security team real-time notification
      
      # Monitoring and control workflows
      - enhanced-monitoring      # Real-time activity monitoring
      - time-enforcement         # Strict time limits (1-2 hours max)
      
      # Post-access workflows
      - post-incident-review     # Schedule mandatory follow-up review
      - access-report           # Generate detailed access report
    scopes:
      groups: [on-call, security-team, incident-commanders]
    permissions:
      allow: ["*:*"]

Development Access with Tiered Approval

roles:
  dev-access:
    name: Development Access
    workflows:
      # Automated workflows for low-risk operations
      - self-approval           # Automatic approval for dev environments
      - usage-tracking          # Track usage patterns and anomalies
      
      # Peer review for moderate-risk operations
      - peer-review            # Another developer reviews the request
      
      # Management approval for high-risk operations
      - tech-lead-approval     # Technical lead approval for prod access
      
      # Governance workflows
      - compliance-check       # Automated compliance validation
      - audit-logging          # Comprehensive audit trail
    scopes:
      groups: [developers, qa-engineers]
    permissions:
      allow: ["*:*"]
    resources:
      allow: 
        - "*dev*"              # Development resources (self-approval)
        - "*staging*"          # Staging resources (peer-review)
        - "*prod*"             # Production resources (tech-lead-approval)
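The comments in the example above pair each resource pattern with an approval workflow. How the agent actually performs that pairing is defined in the workflows themselves; the sketch below only illustrates the idea. The `TIERS` table, the `approval_for` helper, and the choice to check the strictest tier first are all assumptions, and Python's `fnmatch` stands in for whatever pattern matching the agent uses.

```python
from fnmatch import fnmatchcase

# Assumption: tiers are checked strictest-first, so a resource that matches
# several patterns gets the most demanding approval workflow.
TIERS = [
    ("*prod*", "tech-lead-approval"),
    ("*staging*", "peer-review"),
    ("*dev*", "self-approval"),
]


def approval_for(resource):
    """Return the workflow for the first tier pattern that matches."""
    for pattern, workflow in TIERS:
        if fnmatchcase(resource, pattern):
            return workflow
    return None  # no tier matches this resource


print(approval_for("arn:aws:s3:::dev-logs"))
print(approval_for("arn:aws:s3:::dev-products"))
```

Note that broad patterns such as `*prod*` also match names like `dev-products`, which is why the order of evaluation matters in a scheme like this.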

Audit Access with Enhanced Oversight

roles:
  audit-access:
    name: Audit Access
    workflows:
      # Pre-approval workflows
      - compliance-team-approval # Compliance team must approve
      - legal-review            # Legal team review for sensitive audits
      
      # Access control workflows
      - just-in-time           # Activate access only when needed
      - session-recording      # Record all activities during access
      
      # Oversight workflows
      - dual-control           # Require two auditors for sensitive operations
      - supervisor-monitoring  # Audit supervisor real-time monitoring
      
      # Post-access workflows  
      - access-summary         # Generate summary of all actions taken
      - evidence-preservation  # Preserve audit evidence securely
    scopes:
      groups: [auditors, compliance-team, legal-team]
    permissions:
      allow: ["*:List*", "*:Describe*", "*:Get*", "*:Read*"]
    resources:
      allow: ["*"]  # Auditors may need access to any resource

Service Account Access with Automation Controls

roles:
  ci-cd-deployment:
    name: CI/CD Deployment Access
    workflows:
      # Automated approval workflows
      - ci-validation          # Validate CI/CD context and credentials
      - deployment-window      # Check if within allowed deployment window
      
      # Safety workflows
      - canary-deployment      # Gradual rollout for production deployments
      - rollback-preparation   # Prepare automatic rollback mechanisms
      
      # Monitoring workflows
      - deployment-monitoring  # Monitor deployment health
      - security-scanning      # Real-time security scanning of deployments
      
      # Notification workflows
      - team-notification      # Notify relevant teams of deployments
      - stakeholder-update     # Update stakeholders on deployment status
    scopes:
      users:
        - ci-service@example.com
        - deployment-bot@example.com
    permissions:
      allow:
        - "ec2:*Instance*"
        - "s3:GetObject,PutObject"
        - "ecs:*Service*"
        - "lambda:UpdateFunctionCode"
      deny:
        - "*:Delete*"             # No deletion permissions for automation
        - "*:Create*User*"        # No user creation

Configuration Management

File Structure Options

Roles can be organized in multiple ways to suit different organizational needs:

Single File Approach

# roles.yaml
version: "1.0"
roles:
  aws-developer:
    name: AWS Developer
    permissions: { ... }
  gcp-admin:
    name: GCP Administrator  
    permissions: { ... }
  azure-viewer:
    name: Azure Viewer
    permissions: { ... }

Multiple Files by Provider

config/roles/
├── aws.yaml          # AWS-specific roles
├── azure.yaml        # Azure-specific roles
├── gcp.yaml          # GCP-specific roles
├── kubernetes.yaml   # Kubernetes-specific roles
└── common.yaml       # Cross-provider roles

aws.yaml:

version: "1.0"
roles:
  aws-ec2-admin:
    name: AWS EC2 Administrator
    providers: [aws-prod, aws-dev]
    permissions:
      allow: ["ec2:*"]
  
  aws-s3-readonly:
    name: AWS S3 Read-Only
    providers: [aws-prod]
    permissions:
      allow: ["s3:Get*", "s3:List*"]

Multiple Files by Team/Function

config/roles/
├── developers.yaml    # Developer roles
├── admins.yaml       # Administrative roles
├── security.yaml     # Security team roles
├── readonly.yaml     # Read-only access roles
└── emergency.yaml    # Break-glass access roles

developers.yaml:

version: "1.0"
roles:
  frontend-developer:
    name: Frontend Developer
    scopes:
      groups: [frontend-team]
    permissions:
      allow: ["s3:GetObject", "cloudfront:*"]
  
  backend-developer:
    name: Backend Developer
    scopes:
      groups: [backend-team]
    permissions:
      allow: ["ec2:*", "rds:Describe*"]

Loading Configuration

Configure role loading in the main agent configuration:

Directory-Based Loading

# Load all YAML files from directory
roles:
  path: "./config/roles"
  # Recursively loads all *.yaml and *.yml files
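To preview which files a directory-based configuration would pick up, the recursive `*.yaml` / `*.yml` rule can be approximated with a short Python sketch. `discover_role_files` is a hypothetical helper, and the sorted order is an assumption made for readable output, not documented agent behavior.

```python
from pathlib import Path


def discover_role_files(root):
    """Return every *.yaml and *.yml file under root, sorted for stable output."""
    root = Path(root)
    found = list(root.rglob("*.yaml")) + list(root.rglob("*.yml"))
    return sorted(found)


if __name__ == "__main__":
    # Prints nothing if the directory does not exist or holds no YAML files.
    for path in discover_role_files("./config/roles"):
        print(path)
```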

URL-Based Loading

# Load from remote URL
roles:
  url:
    uri: "https://config.company.com/roles.yaml"
    headers:
      Authorization: "Bearer ${VAULT_TOKEN}"
    refresh_interval: "5m"      # Refresh every 5 minutes

Vault Integration

# Load from HashiCorp Vault
roles:
  vault:
    path: "secret/agent/roles"
    key: "roles"              # Key within the secret
    refresh_interval: "10m"    # Refresh interval

Inline Definitions

# Define roles directly in main config
roles:
  admin:
    name: Administrator
    permissions:
      allow: ["*:*"]
  readonly:
    name: Read-Only User
    permissions:
      allow: ["*:Describe*", "*:List*", "*:Get*"]

Combined Loading

# Load from multiple sources
roles:
  sources:
    - path: "./config/roles/local"
    - url:
        uri: "https://config.company.com/shared-roles.yaml"
        headers:
          Authorization: "Bearer ${CONFIG_TOKEN}"
    - vault:
        path: "secret/team/roles"
        key: "definitions"

Configuration Validation

The agent validates role configurations on startup:

Syntax Validation

  • YAML syntax correctness
  • Required field presence
  • Data type validation
  • Reference integrity

Semantic Validation

  • Inheritance cycle detection
  • Provider compatibility
  • Permission format validation
  • Resource pattern validation
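Inheritance cycle detection is a standard graph check. A minimal sketch, assuming roles are available as a plain dictionary keyed by role name with an optional `inherits` list; `find_inheritance_cycle` is a hypothetical name, and references to provider roles or unknown names are simply not followed here.

```python
def find_inheritance_cycle(roles):
    """Return one inheritance cycle as a list of role names, or None."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNVISITED for name in roles}
    path = []

    def visit(name):
        state[name] = IN_PROGRESS
        path.append(name)
        for parent in roles[name].get("inherits", []):
            if parent not in roles:
                continue  # provider roles and unknown names are not followed
            if state[parent] == IN_PROGRESS:
                # The parent is already on the current path: a cycle.
                return path[path.index(parent):] + [parent]
            if state[parent] == UNVISITED:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        state[name] = DONE
        return None

    for name in roles:
        if state[name] == UNVISITED:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


roles_ok = {"base-user": {}, "admin-user": {"inherits": ["base-user"]}}
roles_bad = {
    "a": {"inherits": ["b"]},
    "b": {"inherits": ["c"]},
    "c": {"inherits": ["a"]},
}
print(find_inheritance_cycle(roles_ok))
print(find_inheritance_cycle(roles_bad))
```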

Runtime Validation

  • Provider role existence
  • User/group scope resolution
  • Workflow availability
  • Authentication provider integration

Hot Reloading

For certain loading methods, roles can be updated without restarting:

roles:
  path: "./config/roles"
  auto_reload: true           # Enable hot reloading
  reload_interval: "30s"      # Check for changes every 30 seconds

Supported for hot reloading:

  • File-based loading (path)
  • URL-based loading (url)
  • Vault-based loading (vault)

Not supported for hot reloading:

  • Inline definitions
  • Combined loading with inline components

Best Practices

1. Role Design Principles

Principle of Least Privilege

# ✅ Good - specific permissions
ec2-restart-role:
  name: EC2 Instance Restart
  permissions:
    allow:
      - "ec2:DescribeInstances,StartInstances,StopInstances,RebootInstances"
      - "ec2:DescribeInstanceStatus"
  resources:
    allow:
      - "arn:aws:ec2:*:*:instance/i-app-*"  # Only app instances

# ❌ Avoid - overly broad permissions  
ec2-admin-role:
  name: EC2 Admin
  permissions:
    allow: ["ec2:*"]  # Too broad

Time-Bounded Access

# Configure time limits in workflows, not roles
time-limited-admin:
  name: Time-Limited Admin
  workflows:
    - time-limited-approval  # Implements max 2-hour access
  permissions:
    allow: ["*:*"]

Clear Naming and Documentation

# ✅ Good - descriptive names and documentation
aws-rds-backup-operator:
  name: AWS RDS Backup Operator
  description: |
    Allows operators to manage RDS backups including:
    - Creating manual snapshots
    - Restoring from snapshots  
    - Managing automated backup settings
    - Read access to backup status and logs
    
    Does NOT allow:
    - Deleting production databases
    - Modifying database configurations
    - Creating new database instances
  
# ❌ Avoid - unclear names
role1:
  name: Some Database Access
  description: Database stuff

2. Inheritance Patterns

Build Logical Role Hierarchies

# Base roles with fundamental permissions
cloud-readonly-base:
  name: Cloud Read-Only Base
  permissions:
    allow: ["*:Describe*", "*:List*", "*:Get*"]

# Service-specific roles
aws-readonly:
  name: AWS Read-Only
  inherits: [cloud-readonly-base]
  providers: [aws-prod, aws-dev]
  
ec2-readonly:
  name: EC2 Read-Only
  inherits: [aws-readonly]
  permissions:
    allow: ["ec2:Describe*", "ec2:Get*"]  # Read-only actions only
  resources:
    allow: ["arn:aws:ec2:*:*:*"]

# Team-specific roles
dev-team-ec2:
  name: Development Team EC2 Access
  inherits: [ec2-readonly]
  scopes:
    groups: [developers]
  permissions:
    allow: ["ec2:StartInstances", "ec2:StopInstances"]
  resources:
    allow: ["arn:aws:ec2:*:*:instance/i-dev-*"]

Leverage Provider Managed Roles

# ✅ Good - use existing cloud roles as foundation
aws-power-user:
  name: AWS Power User
  inherits:
    - "aws-prod:arn:aws:iam::aws:policy/PowerUserAccess"
  # Add company-specific restrictions
  permissions:
    deny: ["iam:*User*", "iam:*Role*"]  # No user/role management
  resources:
    deny: ["arn:aws:s3:::sensitive-*"]   # No sensitive buckets

3. Security Patterns

Defense in Depth

production-admin:
  name: Production Administrator
  description: High-privilege production access with multiple security layers
  
  # Multiple approval layers
  workflows:
    - identity-verification
    - manager-approval
    - security-approval
    - time-restriction
  
  # Strict scope limitation
  scopes:
    users: [emergency-admin@example.com]
    groups: [senior-sre, security-team]
  
  # Explicit resource restrictions even for admin
  resources:
    allow: ["arn:aws:*:us-east-1:123456789012:*"]  # Single region only
    deny: 
      - "arn:aws:s3:::audit-*"                      # No audit data
      - "arn:aws:kms:*:*:key/*"                     # No key access
  
  permissions:
    allow: ["*:*"]
    deny:
      - "iam:DeleteUser"                            # No user deletion
      - "iam:DeleteRole"                            # No role deletion
      - "s3:DeleteBucket"                           # No bucket deletion

Explicit Denials for High-Risk Actions

developer-access:
  name: Developer Access
  permissions:
    allow:
      - "ec2:*"
      - "s3:*"
      - "rds:*"
    deny:
      # Explicit denials for dangerous actions
      - "ec2:TerminateInstances"
      - "s3:DeleteBucket"
      - "rds:DeleteDBInstance"
      - "iam:*"                                     # No IAM access at all
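The effect of pairing broad allows with explicit denials can be illustrated with a small evaluator. This is a sketch only: it assumes that, within a single role, a matching deny always wins over a matching allow (as the examples above imply), and it uses Python's `fnmatch` wildcards as a stand-in for the agent's own matching. `is_allowed` is a hypothetical helper, and the comparison is case-sensitive.

```python
from fnmatch import fnmatchcase


def is_allowed(action, allow, deny):
    """Deny patterns are checked first; an action must then match an allow."""
    if any(fnmatchcase(action, pattern) for pattern in deny):
        return False
    return any(fnmatchcase(action, pattern) for pattern in allow)


# Patterns copied from the developer-access role above.
ALLOW = ["ec2:*", "s3:*", "rds:*"]
DENY = ["ec2:TerminateInstances", "s3:DeleteBucket", "rds:DeleteDBInstance", "iam:*"]

for action in ["ec2:StartInstances", "ec2:TerminateInstances", "iam:ListUsers"]:
    print(action, is_allowed(action, ALLOW, DENY))
```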

4. Operational Patterns

Environment Separation

# ✅ Good - clear environment separation
development-admin:
  name: Development Administrator
  providers: [aws-dev, azure-dev, gcp-dev]
  workflows: [self-approval]                        # Minimal approval for dev
  
staging-admin:
  name: Staging Administrator  
  providers: [aws-staging, azure-staging, gcp-staging]
  workflows: [lead-approval]                        # Team lead approval
  
production-readonly:
  name: Production Read-Only
  providers: [aws-prod, azure-prod, gcp-prod]
  workflows: [manager-approval, audit-logging]      # Strict controls for prod
  permissions:
    allow: ["*:Describe*", "*:List*", "*:Get*"]

Emergency Access Patterns

break-glass-access:
  name: Emergency Break Glass Access
  description: |
    EMERGENCY USE ONLY
    This role provides unrestricted access for critical incidents.
    All usage is heavily audited and requires post-incident review.
  
  workflows:
    - emergency-notification     # Immediate alerts
    - break-glass-logging        # Enhanced audit logging
    - post-incident-review       # Mandatory follow-up
  
  scopes:
    groups: [on-call, incident-commanders]
  
  permissions:
    allow: ["*:*"]
    
  # Even emergency access has some limits
  resources:
    deny: 
      - "arn:aws:s3:::customer-data-*"             # Customer data protection
      - "arn:aws:kms:*:*:key/*"                     # Encryption key protection

Service Account Patterns

ci-cd-deployment:
  name: CI/CD Deployment Access
  description: Automated deployment service access
  
  scopes:
    users:
      - ci-service@example.com
      - deployment-bot@example.com
  
  workflows:
    - automated-approval        # No human approval needed
    - deployment-logging        # Track all deployments
  
  permissions:
    allow:
      - "ec2:*Instance*"
      - "s3:GetObject,PutObject"
      - "ecs:*Service*"
      - "lambda:UpdateFunctionCode"
    deny:
      - "*:Delete*"             # No deletion permissions for automation
      - "*:Create*User*"        # No user creation

5. Maintenance Patterns

Regular Permission Audits

# Record review metadata in the description field for audit trails
quarterly-access-review:
  name: Quarterly Access Review
  description: |
    Last reviewed: 2025-01-15
    Next review: 2025-04-15
    Approved by: Security Committee
    
    This role provides quarterly access review capabilities
    for compliance auditing purposes.

Version Control Integration

# Include metadata for tracking
developer-role:
  name: Developer Access
  description: |
    Version: 2.1.0
    Last modified: 2025-01-15
    Modified by: alice@example.com
    Change reason: Added S3 read access for new logging requirements
    
    Change log:
    - 2.1.0: Added S3 read permissions
    - 2.0.0: Migrated to intelligent permission merging
    - 1.0.0: Initial role definition

Troubleshooting

Common Issues and Solutions

1. Role Inheritance Errors

Error: role admin inherits from non-existent role user

Cause: Referenced role doesn’t exist or isn’t loaded yet

Solution:

# ✅ Ensure base roles are defined before child roles
base-user:
  name: Base User
  permissions:
    allow: ["*:Describe*", "*:List*"]

admin-user:
  name: Administrator
  inherits: [base-user]  # Now this will work
  permissions:
    allow: ["*:*"]

2. Provider Role Not Found

Error: role inherits from arn:aws:iam::aws:policy/NonexistentPolicy

Cause: Provider role ARN is incorrect or doesn’t exist in target account

Solutions:

# Verify AWS managed policies
aws iam list-policies --scope AWS --query 'Policies[?PolicyName==`PowerUserAccess`]'

# Verify custom policies  
aws iam get-policy --policy-arn arn:aws:iam::123456789012:policy/CustomPolicy

# Check GCP roles
gcloud iam roles list --filter="name:roles/compute.viewer"

# Check Azure roles
az role definition list --name "Virtual Machine Contributor"

3. Permission Validation Errors

Error: permission ec2:InvalidAction not found in provider

Cause: Permission name is incorrect or not supported by provider

Solutions:

# ✅ Use correct AWS permission names
permissions:
  allow:
    - "ec2:DescribeInstances"     # Correct
    # - "ec2:ListInstances"       # Incorrect - no such permission

# ✅ Check provider documentation for correct names
# AWS: https://docs.aws.amazon.com/service-authorization/
# Azure: https://docs.microsoft.com/en-us/azure/role-based-access-control/
# GCP: https://cloud.google.com/iam/docs/understanding-roles

4. Scope Resolution Issues

Error: user alice@example.com cannot access role admin

Cause: User not included in role scopes

Solutions:

# ✅ Check role scopes include the user
admin-role:
  scopes:
    users: [alice@example.com]  # Direct user access
    groups: [administrators]    # Or group membership

# ✅ Verify user's group memberships
# Check identity provider for user's group assignments
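When debugging scope problems, it can help to reason about the check as a simple predicate. A minimal sketch, assuming a user qualifies if they are listed under `users` or belong to at least one listed group; `can_request` is a hypothetical helper, not an agent API.

```python
def can_request(scopes, user_email, user_groups):
    """True if the user is listed directly or shares a group with the role."""
    if user_email in scopes.get("users", []):
        return True
    allowed_groups = scopes.get("groups", [])
    return any(group in allowed_groups for group in user_groups)


scopes = {"users": ["alice@example.com"], "groups": ["administrators"]}
print(can_request(scopes, "alice@example.com", []))
print(can_request(scopes, "bob@example.com", ["administrators"]))
print(can_request(scopes, "carol@example.com", ["developers"]))
```

If the last case describes your user, either add them to `scopes.users` or fix their group membership in the identity provider.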

5. Provider-Specific Inheritance Issues

Error: provider aws-prod does not support role arn:aws:iam::456:role/Role

Cause: Cross-account role inheritance without proper trust relationship

Solutions:

# ✅ Ensure role is in the correct account
aws-role:
  providers: [aws-prod]
  inherits:
    # Use role from same account as provider
    - "aws-prod:arn:aws:iam::123456789012:role/MyRole"  # Correct account

# ✅ Set up cross-account trust if needed
# In the target role's trust policy (the Principal below is the source account to trust):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

6. Condensed Action Parsing Issues

Error: invalid condensed action format: k8s:pods:get,list,

Cause: Trailing comma or empty action in condensed format

Solutions:

# ❌ Incorrect - trailing comma
permissions:
  allow:
    - "k8s:pods:get,list,"     # Trailing comma

# ✅ Correct format
permissions:
  allow:
    - "k8s:pods:get,list"      # No trailing comma
    - "k8s:services:create,delete,get,list,update"  # Properly formatted
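To check a condensed permission before loading it, the format rule can be sketched as a small parser. This is an illustration under stated assumptions: the action list is taken to be everything after the final colon, and `expand_condensed` is a hypothetical helper whose error text merely mirrors the message shown above.

```python
def expand_condensed(permission):
    """Split 'prefix:a,b' into ['prefix:a', 'prefix:b'], rejecting empty actions."""
    prefix, sep, actions = permission.rpartition(":")
    if not sep:
        raise ValueError(f"missing ':' in permission: {permission}")
    names = actions.split(",")
    # A trailing or doubled comma produces an empty name.
    if any(not name.strip() for name in names):
        raise ValueError(f"invalid condensed action format: {permission}")
    return [f"{prefix}:{name.strip()}" for name in names]


expanded = expand_condensed("k8s:pods:get,list")
print(expanded)

try:
    expand_condensed("k8s:pods:get,list,")
    trailing_comma_rejected = False
except ValueError as err:
    trailing_comma_rejected = True
    print(err)
```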

7. GCP Permissions Being Condensed

Error: GCP permissions appear to be merged incorrectly

Cause: GCP-style permissions (those with dots in the action) are treated atomically and are never condensed, so each one should remain a separate entry. If they appear merged or otherwise altered, check that every permission follows the format shown in the expected behavior.

Expected Behavior:

permissions:
  allow:
    # These GCP-style permissions are NEVER condensed
    - "gcp-prod:compute.instances.get"
    - "gcp-prod:compute.instances.list"
    - "gcp-prod:compute.instances.start"
    # They remain as separate entries, not merged like:
    # - "gcp-prod:compute.instances:get,list,start"  # This is NOT how GCP works

    # These AWS/K8s permissions CAN be condensed
    - "ec2:DescribeInstances,StartInstances"   # Condensable (no dots in action)
    - "k8s:pods:get,list,watch"                # Condensable (no dots in action)

Detection Rule: If the last segment (after the final colon) contains a dot, the permission is treated as atomic and never condensed.
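The detection rule, together with the consolidation example from the Intelligent Permission Merging section, can be sketched as follows. `is_atomic` and `merge_permissions` are hypothetical names; sorting the actions alphabetically and listing atomic permissions first are assumptions made to keep the output predictable.

```python
def is_atomic(permission):
    """True when the segment after the final colon contains a dot."""
    _, _, last_segment = permission.rpartition(":")
    return "." in last_segment


def merge_permissions(permissions):
    """Merge condensed actions per prefix; leave atomic permissions untouched."""
    atomic = []
    grouped = {}
    for perm in permissions:
        if is_atomic(perm):
            if perm not in atomic:
                atomic.append(perm)
            continue
        prefix, _, actions = perm.rpartition(":")
        grouped.setdefault(prefix, set()).update(actions.split(","))
    merged = [
        f"{prefix}:{','.join(sorted(actions))}"
        for prefix, actions in grouped.items()
    ]
    return atomic + merged


result = merge_permissions([
    "k8s:pods:get,list",
    "k8s:pods:create,update",
    "gcp-prod:compute.instances.get",
    "gcp-prod:compute.instances.list",
])
print(result)
```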

8. Provider Filtering Not Working

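Error: Inherited permissions are missing from the role, or are not filtered as expected

Cause: A provider prefix on an inherited permission does not exactly match an entry in the role’s providers list. Only exact matches are kept; permissions prefixed with any other provider name are filtered out.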

Solutions:

# ❌ Incorrect - provider prefix doesn't match providers list
my-role:
  providers: [aws-production]  # Note: "aws-production"
  inherits: [base-role]        # base-role has "aws-prod:ec2:*"
  # "aws-prod" != "aws-production", so permission is filtered out

# ✅ Correct - provider prefixes match
my-role:
  providers: [aws-prod]        # Matches the prefix
  inherits: [base-role]        # base-role has "aws-prod:ec2:*"
  # "aws-prod" == "aws-prod", so permission is included
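The exact-match behavior can be reproduced with a short sketch. It assumes the agent knows the full set of configured provider instance names (here `known_providers`), that a prefix is only treated as a provider when it appears in that set, and that permissions without a provider prefix pass through unchanged. `filter_by_provider` is a hypothetical helper.

```python
def filter_by_provider(inherited, role_providers, known_providers):
    """Keep permissions for the role's providers and strip the matching prefix."""
    kept = []
    for perm in inherited:
        prefix, sep, rest = perm.partition(":")
        if sep and prefix in known_providers:
            if prefix in role_providers:
                kept.append(rest)  # matching prefix is stripped
            # Prefixed for a provider the role does not list: dropped.
        else:
            kept.append(perm)  # no provider prefix: kept unchanged (assumption)
    return kept


known = {"aws-prod", "aws-production", "gcp-prod"}
inherited = ["aws-prod:ec2:*", "gcp-prod:compute.instances.get", "s3:GetObject"]

print(filter_by_provider(inherited, ["aws-prod"], known))
print(filter_by_provider(inherited, ["aws-production"], known))
```

With `aws-production` in the providers list, the `aws-prod:ec2:*` permission is dropped, which matches the incorrect case shown above.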

Debugging Tools and Techniques

Enable Debug Logging

# In main agent configuration
logging:
  level: debug
  components:
    - roles
    - inheritance
    - permissions

Use CLI Tools for Testing

# Test role resolution (hypothetical CLI commands)
thand roles list                                    # List all available roles
thand roles describe aws-developer                  # Show role details
thand roles test alice@example.com aws-developer    # Test user access
thand roles inheritance aws-developer               # Show inheritance chain

Validate Configuration

# Validate role configuration files
thand config validate --roles-only
thand config validate --file ./config/roles/aws.yaml

Test Inheritance Resolution

# Add temporary debug role to test inheritance
debug-inheritance:
  name: Debug Inheritance Test
  inherits: [problematic-role]
  permissions:
    allow: ["debug:test"]
  # This will show inheritance resolution issues

Getting Help

Enable Verbose Logging

logging:
  level: trace
  format: json
  outputs:
    - type: file
      path: /var/log/agent/roles.log
    - type: console

Check System Health

# Check provider connectivity
thand providers status

# Check identity provider integration  
thand providers auth status

# Check workflow system
thand workflows status


Examples

For practical examples and templates of role configurations, see the Role Examples page, which includes:

  • Basic Development Role - Simple developer access patterns
  • Inherited Admin Role - Multi-cloud administrative access using inheritance
  • Emergency Access Role - Break-glass access for incidents
  • Read-Only Auditor Role - Compliance and auditing access
  • Database Administrator Role - Specialized database permissions
  • DevOps Engineer Role - Infrastructure and deployment management
  • Security Analyst Role - Security monitoring and investigation
  • Temporary Contractor Role - Time-limited external access
  • Multi-Environment Role - Different access across environments
  • Application-Specific Role - Fine-grained application permissions

Each example includes complete YAML configurations with explanations of the patterns used.


Provider Prefix Syntax: When mixing multiple providers into a single role, you can use the provider name as a prefix to avoid ambiguity. For example, to inherit from an AWS role in the aws-prod provider instance, use aws-prod:arn:aws:iam::aws:policy/ReadOnlyAccess. The system uses the first colon as the delimiter between provider name and role identifier.

