Cloud documentation is always outdated. Manual updates take 2-3 days per environment and are obsolete within weeks. Solution: AI agents that automatically scan your live AWS infrastructure and generate complete documentation in 5-10 minutes.

What you get:

  • ✅ PlantUML architecture diagram with AWS icons
  • ✅ Technical runbook for ops teams
  • ✅ Executive summary for leadership
  • ✅ Developer onboarding guide

How it works:

  1. Scans AWS using the Resource Groups Tagging API + AWS Config
  2. Seven specialized AI agents analyze and document (Inspector → Analyst → Writers)
  3. Outputs comprehensive, multi-audience documentation automatically

An AI Crew for Automated Diagramming and Documentation

Our cloud documentation is almost always out of date. It's not because we're lazy; it's because the cloud moves too fast. A diagram drawn in a sprint planning meeting is obsolete by the time the code hits production. This documentation crisis, which every engineering team faces, is a massive, invisible tax. Nobody talks about it, but we all know that manual updates are expensive, error-prone, and outdated exactly when you need them most. The cost isn't just the 2-3 days of senior engineer time every quarter; it's the production incidents that could have been prevented, the security vulnerabilities you didn't know existed, and the new hires who take weeks to understand the system.

I was tired of this cycle. So I built a solution that uses AI agents to automatically scan live AWS environments and generate accurate, multi-audience documentation in minutes—not days. Here's how it works, what I learned, and why this approach unlocks something bigger than just better diagrams.

The Problem

:::tip Why Everything We've Tried Has Failed

:::

  • ❌ Manual Documentation

    The promise: "We'll keep the wiki updated" The reality: Updated once during setup, referenced never, trusted by no one The cost: 2-3 days of senior engineer time per environment, outdated within weeks

  • ❌ Diagrams-as-Code (Terraform/CloudFormation diagrams)

    The promise: "Our IaC is our documentation" The reality: Shows the intended state, not the actual state after three hotfixes and that manual console change on Friday night The gap: What you planned vs. what actually exists

  • ❌ Static Scanning Tools

    The promise: "We'll scan your infrastructure" The reality: Dumps 10,000 lines of JSON that tell you what exists but not why or how it's connected.

The Solution

:::tip AI Agents That Understand Infrastructure

:::

What we actually needed was a system that could perceive infrastructure like a scanner, understand it like a senior architect, and explain it like a technical writer, automatically. To achieve this, I created a "crew" of specialized AI agents, each with a specific job, just like a real engineering team.

Think of it like this:

  • The Inspector scans AWS (like a junior engineer running AWS CLI commands)
  • The Analyst understands relationships (like a senior architect reviewing configs)
  • The Draftsman creates diagrams (like a technical illustrator)
  • The Writers create documentation for different audiences:
    • Technical Writer → detailed runbook for ops teams
    • Executive Analyst → high-level summary for leadership
    • Developer Advocate → practical guide for developers

All working in parallel, all generating outputs from the same live data, all in minutes.

The Transformation

:::tip Before vs. After

:::

| Aspect | Before (Manual Process) | After (Automated with AI Agents) |
|----|----|----|
| ⏱️ Time | 2-3 days per environment | 5-10 minutes per environment |
| 👤 Who | Senior engineer (expensive) | Anyone with AWS access |
| 📄 Output | One diagram, maybe a doc | Diagram + 4 tailored documents |
| 🔄 Update Frequency | Quarterly if you're lucky | On-demand or automated (CI/CD) |
| 🎯 Accuracy | Outdated within weeks | Always reflects current state |
| 😰 Stress Level | High (always out of date) | Low (always accurate) |

Quick Start

The entire system is open source. You can have it running in 5 minutes:

```bash
# 1. Install the package
git clone https://github.com/kirPoNik/aws-architecture-diagrams-with-crewai.git
cd aws-architecture-diagrams-with-crewai
pip install -e .

# 2. Run it (that's it!)
aws-diagram-generator \
  --name "Production" \
  --region us-east-1 \
  --tags "Environment=prod" "App=myapp"

# 3. Check your output/ directory for complete documentation
```

Prerequisites:

  • Python 3.10+
  • AWS credentials
  • AWS Config enabled
  • AWS Bedrock access (Claude 3.5 Sonnet preferred)
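
Before the first run, it can help to confirm the prerequisites programmatically. Below is a minimal, hypothetical pre-flight check (not part of the project) that uses boto3 to verify AWS Config is recording and that the Bedrock API is reachable; the function name and checks are assumptions.

```python
import boto3

def preflight_check(region: str = "us-east-1") -> None:
    # AWS Config: at least one configuration recorder should be recording.
    config = boto3.client("config", region_name=region)
    statuses = config.describe_configuration_recorder_status().get(
        "ConfigurationRecordersStatus", []
    )
    if not any(s.get("recording") for s in statuses):
        raise RuntimeError("AWS Config is not recording in this region")

    # Bedrock: listing foundation models confirms the control plane is reachable
    # (invoke access to Claude 3.5 Sonnet still has to be granted separately).
    bedrock = boto3.client("bedrock", region_name=region)
    models = bedrock.list_foundation_models().get("modelSummaries", [])
    print(f"AWS Config OK; {len(models)} Bedrock models visible")

preflight_check()
```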

In under 10 minutes, you'll have:

  • ✅ PlantUML architecture diagram with AWS icons
  • ✅ Technical Runbook with every resource detail
  • ✅ Executive Summary in plain English
  • ✅ Developer Onboarding Guide with endpoints

How It Actually Works

Three Key Innovations:

  • Universal Discovery

    This works with ANY AWS service. The first breakthrough was realizing we don't need to hard-code describe_instances(), describe_db_instances(), and so on for every service. Instead, we use AWS's universal APIs:

```python
import boto3

# This one API call finds ANY tagged resource across ALL services
tagging_client = boto3.client('resourcegroupstaggingapi')
all_resource_mappings = []

paginator = tagging_client.get_paginator('get_resources')
for page in paginator.paginate(TagFilters=boto3_tag_filters):
    resources = page.get('ResourceTagMappingList', [])
    all_resource_mappings.extend(resources)
```

Why this matters:

  • Works with services that didn't exist when you wrote the code. No maintenance as AWS adds new services.

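For context, `boto3_tag_filters` in the snippet above is the standard `TagFilters` structure that `get_resources` expects. A minimal sketch of turning CLI-style `Key=Value` pairs (like the `--tags` arguments from the Quick Start) into that structure could look like this; the helper name is an assumption, not the project's actual code.

```python
from typing import Dict, List

def build_tag_filters(tag_args: List[str]) -> List[Dict]:
    # Convert CLI-style pairs like "Environment=prod" into the TagFilters
    # structure accepted by resourcegroupstaggingapi's get_resources().
    grouped: Dict[str, List[str]] = {}
    for pair in tag_args:
        key, _, value = pair.partition("=")
        grouped.setdefault(key, []).append(value)
    return [{"Key": key, "Values": values} for key, values in grouped.items()]

# build_tag_filters(["Environment=prod", "App=myapp"])
# -> [{'Key': 'Environment', 'Values': ['prod']},
#     {'Key': 'App', 'Values': ['myapp']}]
```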

  • Batch Processing

    The second breakthrough was batching AWS Config calls instead of fetching resources one-by-one:

```python
from typing import Dict, List

# Group resources by type
resources_by_type: Dict[str, List] = {}
for resource in resources:
    resource_type = extract_resource_type_from_arn(resource['ResourceARN'])
    resources_by_type.setdefault(resource_type, []).append(resource)

# Fetch up to 20 configurations at once
# (excerpt: resource_keys, error_code and safe_arn come from the surrounding scan loop)
response = config_client.batch_get_resource_config(
    resourceKeys=resource_keys  # batch of 20
)

# Automatic fallback for edge cases the batch API rejects
if error_code == 'ValidationException':
    config_client.select_resource_config(
        Expression=f"SELECT * WHERE configuration.arn = '{safe_arn}'"
    )
```

Why this matters:

  • Processes 100s of resources in seconds

  • Built-in retry logic for throttling (a minimal sketch follows below)

  • Automatic fallback when batch isn't supported

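The retry logic itself isn't shown above. As a rough illustration, a throttling-aware wrapper around the batch call could look like the following; the function name, delays, and error handling are assumptions rather than the project's actual implementation.

```python
import time

from botocore.exceptions import ClientError

def batch_get_with_retry(config_client, resource_keys, max_attempts=5):
    # Retry batch_get_resource_config with exponential backoff when
    # AWS Config throttles the request.
    delay = 1.0
    for attempt in range(max_attempts):
        try:
            return config_client.batch_get_resource_config(resourceKeys=resource_keys)
        except ClientError as err:
            code = err.response["Error"]["Code"]
            if code != "ThrottlingException" or attempt == max_attempts - 1:
                raise
            time.sleep(delay)
            delay *= 2  # back off: 1s, 2s, 4s, ...
```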

  • AI Understanding

    The third breakthrough was using specialized AI agents with personas:

```python
from crewai import Agent, Crew, Task

inspector = Agent(
    role='AWS Infrastructure Inspector',
    goal='Scan AWS and provide detailed JSON of resources',
    backstory='You use AWS APIs to discover cloud resources based on tags.',
    tools=[aws_scanner_tool],
    llm=llm
)

analyst = Agent(
    role='Cloud Architecture Analyst',
    goal='Understand architecture, components, and relationships',
    backstory='You interpret raw infrastructure data and structure it into a logical model.',
    llm=llm
)

draftsman = Agent(
    role='PlantUML Diagram Draftsman',
    goal='Generate PlantUML diagram scripts',
    backstory='You convert architectural information into PlantUML using AWS icons.',
    llm=llm
)

# Chain them together: Inspector → Analyst → Draftsman
task_inspect = Task(description='Scan AWS...', agent=inspector)
task_analyze = Task(description='Analyze...', agent=analyst, context=[task_inspect])
task_draw = Task(description='Create diagram...', agent=draftsman, context=[task_analyze])

crew = Crew(agents=[...], tasks=[...])
result = crew.kickoff()
```

Why this matters:

  • Each agent is an expert in its domain
  • Outputs are human-readable, not raw JSON
  • Same data → 4 different perspectives (technical, executive, developer, visual); a sketch of the writer agents follows below
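
The writer agents from the overview follow the same pattern. Here is a hedged sketch of how the three writers could fan out from the Analyst's output, reusing the `llm` and `task_analyze` objects from the snippet above; the role, goal, and task wording are illustrative assumptions, not copied from the repository.

```python
# Illustrative sketch: three writer agents consuming the same analysis.
tech_writer = Agent(
    role='Technical Writer',
    goal='Produce a detailed operations runbook',
    backstory='You document infrastructure for on-call engineers.',
    llm=llm
)
exec_analyst = Agent(
    role='Executive Analyst',
    goal='Summarize the architecture for leadership in plain English',
    backstory='You translate technical detail into business impact.',
    llm=llm
)
dev_advocate = Agent(
    role='Developer Advocate',
    goal='Write a practical onboarding guide with endpoints and connections',
    backstory='You help new developers become productive quickly.',
    llm=llm
)

# Each writer task reuses the Analyst's output as context, so all four
# documents are generated from the same live data.
task_runbook = Task(description='Write the runbook...', agent=tech_writer, context=[task_analyze])
task_summary = Task(description='Write the executive summary...', agent=exec_analyst, context=[task_analyze])
task_onboarding = Task(description='Write the onboarding guide...', agent=dev_advocate, context=[task_analyze])
```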

The Architecture

:::tip How It All Fits Together

:::

What You Actually Get

:::tip Here's what the final markdown file can look like

:::

```markdown
# AWS Architecture Documentation: Production Environment

## Table of Contents
1. Architecture Diagram
2. Technical Infrastructure Runbook
3. Executive Summary for Leadership
4. Developer Onboarding Guide

## Architecture Diagram

@startuml
!include <awslib/AWSCommon>
!include <awslib/Compute/EC2>
!include <awslib/Database/RDS>

rectangle "VPC: vpc-12345 (10.0.0.0/16)" {
  rectangle "Public Subnet: subnet-abc" {
    ElasticLoadBalancing(alb, "Application LB", "")
  }
  rectangle "Private Subnet: subnet-def" {
    EC2(web1, "Web Server 1", "t3.medium")
    EC2(web2, "Web Server 2", "t3.medium")
  }
  rectangle "DB Subnet: subnet-ghi" {
    RDS(db, "PostgreSQL", "db.t3.large")
  }
}

alb --> web1
alb --> web2
web1 --> db
web2 --> db
@enduml

## Technical Infrastructure Runbook

### Compute Resources

**EC2 Instance: i-0abc123** (Web Server 1)
- Instance Type: t3.medium
- Private IP: 10.0.1.10
- Security Groups: sg-web123 (allows 80/443 from ALB)
- IAM Role: web-server-role
- Tags: Environment=production, Tier=web

[... detailed configs for every resource ...]

## Executive Summary

This production environment hosts our customer-facing web application using a highly available, three-tier architecture. The system consists of:

- **Web Tier:** Redundant web servers behind a load balancer for high availability
- **Database Tier:** Managed PostgreSQL database with automated backups
- **Security:** Private subnets, restricted security groups, encrypted data

The architecture supports approximately 10,000 daily users with 99.9% uptime...

## Developer Onboarding Guide

### Quick Start

**Application URL:** https://my-app-prod-123.us-east-1.elb.amazonaws.com

**Database Connection:**
```

```bash
Host: mydb.cluster-abc.us-east-1.rds.amazonaws.com
Port: 5432
Database: productiondb
User: appuser
```

Environment Variables:

[… practical connection details …]

💭 Final Thoughts and Next Steps

This approach is powerful, but it's not magic. Here are the real-world considerations:

1. **Dependency:** The `AWS Config` discovery method is robust, but it relies on AWS Config being enabled and configured to record all the resource types you care about.
2. **Cost:** The system makes heavy use of a powerful LLM (such as Claude 3.5 Sonnet or GPT-4). Running it on demand is fine, but running it every 10 minutes on a massive environment could get expensive.
3. **API Rate Limits:** AWS Bedrock enforces strict rate limits, especially on Anthropic models (roughly 1-2 requests per minute). To work around this, we invoke the models through an inference profile, and a use-case submission is required for access.
4. **Non-Determinism:** LLMs are non-deterministic. The `Analyst` might occasionally misinterpret a relationship, or the `Draftsman` might make a syntax error. This requires prompt refinement and testing; a minimal output-validation sketch follows the use-case examples below.

Once you have AI agents that can perceive and understand your infrastructure, you unlock an entire category of use cases:

* **Cost Optimization**

```python
finops_analyst = Agent(
    role='FinOps Analyst',
    goal='Identify cost optimization opportunities',
    backstory='You find abandoned or over-provisioned resources.'
)

# Output: "Found 5 unattached EBS volumes costing $150/month"
#         "RDS instance at 12% CPU could be downsized, save $200/month"
```

* **Security Auditing**

```python
security_auditor = Agent(
    role='Security Auditor',
    goal='Identify security vulnerabilities',
    backstory='You audit cloud configurations for compliance.'
)

# Output: "Security group sg-123 allows 0.0.0.0/0 on port 22"
#         "S3 bucket 'backups' is not encrypted"
#         "RDS instance publicly accessible"
```

* **Compliance Verification**

```python
compliance_checker = Agent(
    role='Compliance Checker',
    goal='Verify HIPAA/PCI-DSS/SOC2 compliance'
)

# Output: "HIPAA Violation: Database not in private subnet"
#         "PCI-DSS: Encryption at rest not enabled"
```
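
As noted in the considerations above, here is a minimal guardrail for the non-determinism point: before saving the `Draftsman`'s output, check that the generated PlantUML at least has matching start/end markers and balanced grouping braces. This is a hypothetical sketch, not code from the repository; a failed check can simply trigger a retry of the diagram task.

```python
def validate_plantuml(script: str) -> str:
    # Basic sanity checks on LLM-generated PlantUML before writing it to disk.
    # Hypothetical helper; the repository may handle validation differently.
    text = script.strip()
    if not text.startswith("@startuml"):
        raise ValueError("Draftsman output does not start with @startuml")
    if not text.endswith("@enduml"):
        raise ValueError("Draftsman output does not end with @enduml")
    if text.count("{") != text.count("}"):
        raise ValueError("Unbalanced braces in generated diagram")
    return text
```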

📚 Resources

  • 📦 GitHub: aws-architecture-diagrams-with-crewai
  • 🛠️ Tools Used: CrewAI | AWS Config | PlantUML
  • 🎨 AWS Icons: aws-icons-for-plantuml
  • CrewAI GitHub Examples: https://github.com/crewAIInc/crewAI-examples

