This commit is contained in:
Dane Fetterman 2026-04-05 01:31:15 +00:00 committed by GitHub
commit 11a17cf346
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
18 changed files with 5077 additions and 0 deletions

deploy/aws-sam/.gitignore vendored Normal file

@@ -0,0 +1,77 @@
# SAM build artifacts
.aws-sam/
samconfig.toml.bak
# Environment files
.env
.env.local
.env.production
.env.staging
.env.development
.librechat-deploy-config*
# AWS credentials
.aws/
# Logs
*.log
logs/
# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
# IDE files
.vscode/
.idea/
*.swp
*.swo
*~
# Temporary files
*.tmp
*.temp
.cache/
donotcommit.txt
repo/
# Node modules (if any)
node_modules/
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Secrets and sensitive data
secrets.yaml
secrets.json
*.pem
*.key
*.crt
# Backup files
*.bak
*.backup

deploy/aws-sam/README.md Normal file

@@ -0,0 +1,724 @@
# LibreChat AWS SAM Deployment
This repository contains AWS SAM templates and scripts to deploy LibreChat on AWS with maximum scalability and high availability.
## What is LibreChat?
LibreChat is an enhanced, open-source ChatGPT clone that provides:
- **Multi-AI Provider Support**: OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, and more
- **Advanced Features**: Agents, function calling, file uploads, conversation search, code interpreter
- **Secure Multi-User**: Authentication, user management, conversation privacy
- **Extensible**: Plugin system, custom endpoints, RAG integration
- **Self-Hosted**: Complete control over your data and infrastructure
## Architecture Overview
This deployment creates a highly scalable, production-ready LibreChat environment optimized for enterprise use:
### Core Infrastructure (Scalability-First Design)
- **ECS Fargate**: Serverless container orchestration with auto-scaling (2-20 instances)
- **Application Load Balancer**: High availability with health checks and SSL termination
- **VPC**: Multi-AZ setup with public/private subnets and flexible internet connectivity options
- **Internet Connectivity**: Choose between NAT Gateways (standard AWS pattern) or Transit Gateway (existing infrastructure)
- **Auto Scaling**: CPU-based scaling with target tracking (70% CPU utilization)
### Data & Storage Layer
- **DocumentDB**: MongoDB-compatible database with multi-AZ deployment and automatic failover
- **ElastiCache Redis**: In-memory caching, session storage, and conversation search with failover
- **S3**: Encrypted file storage for user uploads, avatars, documents, and static assets
### Internet Connectivity Options
The deployment supports two network connectivity patterns:
**Option 1: NAT Gateway (Standard AWS Pattern)**
- **High Availability**: NAT Gateways in each AZ with automatic failover
- **Enterprise Performance**: Up to 45 Gbps bandwidth per gateway
- **Zero Maintenance**: Fully managed by AWS with 99.95% SLA
- **Cost**: ~$90/month for 2 NAT Gateways + data processing fees
- **Use Case**: New deployments or when maximum reliability is required
**Option 2: Transit Gateway (Existing Infrastructure)**
- **Cost Optimization**: No NAT Gateway costs (~$90/month savings)
- **Existing Infrastructure**: Leverages existing Transit Gateway setup
- **Controlled Routing**: Uses existing network policies and routing
- **Use Case**: Organizations with existing Transit Gateway infrastructure
### Security & Monitoring
- **Secrets Manager**: Secure storage for database passwords, JWT secrets, and API keys
- **CloudWatch**: Centralized logging, monitoring, and alerting
- **Security Groups**: Network-level security with least privilege access
- **IAM Roles**: Fine-grained permissions for ECS tasks and AWS service access
### Advanced Scalability Features
- **Fargate Spot Integration**: 80% Spot instances + 20% On-Demand for cost optimization
- **Multi-AZ High Availability**: Automatic failover across multiple availability zones
- **Horizontal Auto Scaling**: Scales from 2-20 instances based on CPU utilization
- **Load Balancing**: Intelligent traffic distribution across healthy instances
- **Container Health Checks**: Automatic replacement of unhealthy containers
- **Database Read Replicas**: DocumentDB supports read scaling for high-traffic scenarios
- **Redis Clustering**: ElastiCache supports cluster mode for memory scaling
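The 80/20 Spot split above is typically declared as an ECS capacity provider strategy on the service. A minimal sketch of what that looks like in CloudFormation/SAM (resource and parameter names here are illustrative, not necessarily those used in this repo's `template.yaml`):

```yaml
# Illustrative sketch only -- check template.yaml for the actual resources.
LibreChatService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref ECSCluster
    CapacityProviderStrategy:
      - CapacityProvider: FARGATE_SPOT   # interruptible capacity, large discount
        Weight: 80
      - CapacityProvider: FARGATE        # on-demand baseline for availability
        Weight: 20
```

With this strategy, ECS places roughly 4 of every 5 tasks on Spot capacity and falls back to On-Demand when Spot is interrupted.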
## Prerequisites
1. **AWS CLI** - [Installation Guide](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
2. **SAM CLI** - [Installation Guide](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/install-sam-cli.html)
3. **AWS Account** with appropriate permissions and network topology
4. **Domain & SSL Certificate** (for custom domain)
5. **AWS Cognito User Pool** (optional - for SSO authentication)
### SSO Prerequisites (Optional)
If you plan to use SSO authentication:
- **AWS Cognito User Pool** with configured identity providers
- **App Client** created in the Cognito User Pool with appropriate settings
- **Identity Provider** (SAML, OIDC, or social) configured in Cognito
- **Attribute mappings** configured in Cognito for user claims (name, email)
### Required AWS Permissions
Your AWS user/role needs permissions for:
- CloudFormation (full access)
- ECS (full access)
- EC2 (VPC, Security Groups, Load Balancers)
- DocumentDB (full access)
- ElastiCache (full access)
- S3 (bucket creation and management)
- IAM (role creation)
- Secrets Manager (secret creation)
- CloudWatch (log groups)
- STS (checking caller identity)
## Quick Start
### Interactive Deployment (Recommended)
1. **Clone and configure:**
```bash
git clone <this-repo>
cd librechat-aws-sam
# Configure AWS credentials
aws configure
```
2. **Run interactive deployment:**
```bash
./deploy-clean.sh
```
The script will interactively prompt for:
- Environment (dev/staging/prod)
- AWS region
- Stack name
- Internet connectivity option (NAT Gateway vs Transit Gateway)
- VPC ID (with helpful VPC listing)
- Public subnet IDs (for load balancer)
- Private subnet IDs (for ECS tasks and databases)
- AWS Bedrock credentials for AI model access
- Optional SSO configuration with AWS Cognito
- Optional domain name and SSL certificate
3. **Save configuration for future deployments:**
The script automatically offers to save your configuration to `.librechat-deploy-config`
4. **Redeploy with saved configuration:**
```bash
./deploy-clean.sh --load-config
```
5. **Update the YAML config only:**
To update the `librechat.yaml` config file and restart the containers without a full redeploy:
```bash
./deploy-clean.sh --update-config
```
## Deployment Options
### Interactive Deployment (Recommended)
```bash
# First-time deployment
./deploy-clean.sh
# Redeploy with saved configuration
./deploy-clean.sh --load-config
# Reset saved configuration
./deploy-clean.sh --reset-config
# Update the YAML config file and restart containers only
./deploy-clean.sh --update-config
```
The interactive deployment provides:
- **Guided Setup**: Step-by-step prompts for all parameters
- **AWS Resource Discovery**: Lists available VPCs and subnets
- **Validation**: Checks VPC and subnet accessibility
- **Configuration Persistence**: Saves settings for future deployments
- **Smart Defaults**: Remembers previous choices
## Configuration
### Deploy script configuration (`.librechat-deploy-config`)
The deploy script saves your choices to `.librechat-deploy-config` and reloads them with `--load-config`. You can also edit this file to set or change options without re-prompting.
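The file itself is plain shell-variable syntax (it is `source`d by the scripts). A hypothetical excerpt, using only keys that appear elsewhere in this repo's docs (`STACK_NAME` and `REGION` are referenced in `scripts/README`; `LIBRECHAT_IMAGE` is described below):

```bash
# Hypothetical excerpt of .librechat-deploy-config -- the deploy script
# writes the authoritative version with whatever keys it actually uses.
STACK_NAME="librechat"
REGION="us-east-1"
LIBRECHAT_IMAGE=""   # leave empty to use the template's default image
```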
**Optional: Custom container image (`LIBRECHAT_IMAGE`)**
By default, the stack uses the container image defined in the template (e.g. the official `librechat/librechat:latest` or a template default). To use a custom image (e.g. your own ECR build), set `LIBRECHAT_IMAGE` in your deploy config:
```bash
LIBRECHAT_IMAGE="<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>"
```
Then deploy with the config loaded so the parameter is applied:
```bash
./deploy-clean.sh --load-config
```
If `LIBRECHAT_IMAGE` is unset or empty, the template's default image is used.
### Environment Variables
The deployment automatically configures these environment variables for LibreChat:
**Core Application Settings:**
- `NODE_ENV`: Set to "production"
- `MONGO_URI`: DocumentDB connection string with SSL and authentication
- `REDIS_URI`: ElastiCache Redis connection string
- `NODE_TLS_REJECT_UNAUTHORIZED`: Set to "0" for DocumentDB SSL compatibility
- `ALLOW_REGISTRATION`: Set to "false" (configure SAML post-deployment)
**Security & Authentication:**
- `JWT_SECRET`: Auto-generated secure JWT secret (stored in Secrets Manager)
- `JWT_REFRESH_SECRET`: Auto-generated refresh token secret (stored in Secrets Manager)
- `CREDS_KEY`: Auto-generated credentials encryption key (stored in Secrets Manager)
- `CREDS_IV`: Auto-generated encryption IV (stored in Secrets Manager)
**SSO Authentication (Optional):**
- `ENABLE_SSO`: Set to "true" to enable SSO authentication
- `COGNITO_USER_POOL_ID`: AWS Cognito User Pool ID
- `OPENID_CLIENT_ID`: App Client ID from Cognito User Pool
- `OPENID_CLIENT_SECRET`: App Client Secret from Cognito User Pool
- `OPENID_SCOPE`: OpenID scope for authentication (default: `openid profile email`)
- `OPENID_BUTTON_LABEL`: Login button text (default: `Sign in with SSO`)
- `OPENID_NAME_CLAIM`: Name attribute mapping (default: `name`)
- `OPENID_EMAIL_CLAIM`: Email attribute mapping (default: `email`)
- `OPENID_SESSION_SECRET`: Auto-generated session secret (stored in Secrets Manager)
- `OPENID_ISSUER`: Auto-configured Cognito issuer URL
- `OPENID_CALLBACK_URL`: Auto-configured callback URL (`/oauth/openid/callback`)
**AWS Bedrock Configuration:**
- `AWS_REGION`: Deployment region for AWS services
- `BEDROCK_AWS_DEFAULT_REGION`: AWS region for Bedrock API calls
- `BEDROCK_AWS_ACCESS_KEY_ID`: AWS access key for Bedrock access (from deployment parameters)
- `BEDROCK_AWS_SECRET_ACCESS_KEY`: AWS secret key for Bedrock access (from deployment parameters)
- `BEDROCK_AWS_MODELS`: Pre-configured Bedrock models including:
- `us.anthropic.claude-3-7-sonnet-20250219-v1:0`
- `us.anthropic.claude-opus-4-20250514-v1:0`
- `us.anthropic.claude-sonnet-4-20250514-v1:0`
- `us.anthropic.claude-3-5-haiku-20241022-v1:0`
- `us.meta.llama3-3-70b-instruct-v1:0`
- `us.amazon.nova-pro-v1:0`
**Configuration Management:**
- `CONFIG_PATH`: Set to "/app/config/librechat.yaml" (mounted from EFS)
- `CACHE`: Set to "false" to disable prompt caching (avoids Bedrock caching issues)
### EFS Configuration System
The deployment includes an EFS-based configuration management system:
- **Real-time Updates**: Configuration changes without container rebuilds
- **S3 → EFS Pipeline**: Automated sync from S3 to EFS via Lambda
- **Container Mounting**: EFS volume mounted at `/app/config/librechat.yaml`, with the `CONFIG_PATH` environment variable set to match
- **Update Commands**: Use `./deploy-clean.sh --update-config` for config-only updates
### Scaling Configuration
Default scaling settings:
- **Min Capacity**: 2 instances
- **Max Capacity**: 20 instances
- **Target CPU**: 70% utilization
- **Scale Out Cooldown**: 5 minutes
- **Scale In Cooldown**: 5 minutes
To modify scaling, edit the `ECSAutoScalingTarget` and `ECSAutoScalingPolicy` resources in `template.yaml`.
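For orientation, a target-tracking setup matching the defaults above looks roughly like this. This is a sketch following CloudFormation's Application Auto Scaling resource types; the actual resources in this repo's `template.yaml` are authoritative and may differ:

```yaml
# Sketch only -- edit the real ECSAutoScalingTarget/ECSAutoScalingPolicy
# resources in template.yaml rather than copying this verbatim.
ECSAutoScalingTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    ServiceNamespace: ecs
    ScalableDimension: ecs:service:DesiredCount
    MinCapacity: 2
    MaxCapacity: 20

ECSAutoScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: cpu-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ECSAutoScalingTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
      TargetValue: 70.0
      ScaleOutCooldown: 300   # 5 minutes
      ScaleInCooldown: 300    # 5 minutes
```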
### Database Configuration
**DocumentDB (MongoDB-compatible):**
- Instance Class: `db.t3.medium` (2 instances)
- Backup Retention: 7 days
- Encryption: Enabled
- Multi-AZ: Yes
**ElastiCache Redis:**
- Node Type: `cache.t3.micro` (2 nodes)
- Engine Version: 7.0
- Encryption: At-rest and in-transit
- Multi-AZ: Yes with automatic failover
## LibreChat Dependencies & Features
### Core Dependencies Deployed
- **MongoDB/DocumentDB**: Primary database for conversations, users, and metadata
- **Redis/ElastiCache**: Session management, caching, and real-time features
- **S3**: File storage with support for multiple strategies:
- **Avatars**: User and agent profile images
- **Images**: Chat image uploads and generations
- **Documents**: PDF uploads, text files, and attachments
- **Static Assets**: CSS, JavaScript, and other static content
### Optional Components (Can Be Added)
- **Meilisearch**: Full-text search for conversation history with typo tolerance
- **Vector Database**: For RAG (Retrieval-Augmented Generation) functionality
- **CDN**: CloudFront integration for global content delivery
### File Storage Strategies
LibreChat supports multiple storage strategies that can be mixed:
- **S3**: Scalable cloud storage (configured in this deployment)
## Post-Deployment Setup
### 1. Access LibreChat
After deployment completes (15-20 minutes), access LibreChat using the Load Balancer URL:
```bash
# Get the application URL
aws cloudformation describe-stacks \
--stack-name librechat \
--query 'Stacks[0].Outputs[?OutputKey==`LoadBalancerURL`].OutputValue' \
--output text
```
The application will be available at: `http://your-load-balancer-url` (or `https://` if you configured SSL)
### 2. Initial Admin Setup
1. **First User Registration**: The first user to register becomes the admin
<!-- 2. **Admin Panel Access**: Navigate to `/admin` after logging in as admin
3. **User Management**: Control user registration and permissions -->
### 3. Configure SSO Authentication (Optional)
**Prerequisites:**
- AWS Cognito User Pool created and configured
- App Client created in the User Pool with appropriate settings
- Identity Provider configured in Cognito (SAML, OIDC, or social providers)
- Attribute mappings configured in Cognito
**SSO Configuration Options:**
The deployment supports optional SSO authentication through AWS Cognito with OpenID Connect:
**Required SSO Settings:**
- `ENABLE_SSO`: Set to "true" to enable SSO authentication
- `COGNITO_USER_POOL_ID`: Your AWS Cognito User Pool ID (e.g., `us-east-1_8o9DM3lHZ`)
- `OPENID_CLIENT_ID`: App Client ID from your Cognito User Pool
- `OPENID_CLIENT_SECRET`: App Client Secret from your Cognito User Pool
**Optional SSO Settings:**
- `OPENID_SCOPE`: OpenID scope for authentication (default: `openid profile email`)
- `OPENID_BUTTON_LABEL`: Login button text (default: `Sign in with SSO`)
- `OPENID_NAME_CLAIM`: Name attribute mapping (default: `name`)
- `OPENID_EMAIL_CLAIM`: Email attribute mapping (default: `email`)
**Automatic Configuration:**
The deployment automatically configures:
- `OPENID_ISSUER`: Cognito issuer URL (`https://cognito-idp.{region}.amazonaws.com/{user-pool-id}`)
- `OPENID_CALLBACK_URL`: OAuth callback URL (`/oauth/openid/callback`)
- `OPENID_SESSION_SECRET`: Secure session secret (auto-generated and stored in Secrets Manager)
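Concretely, the issuer URL is derived mechanically from the region and User Pool ID alone. A small sketch of that construction (the pool ID below is a placeholder, not a real pool):

```bash
# Build the Cognito issuer URL the same way the deployment derives it.
# USER_POOL_ID is a hypothetical example value.
REGION="us-east-1"
USER_POOL_ID="us-east-1_EXAMPLE1"
OPENID_ISSUER="https://cognito-idp.${REGION}.amazonaws.com/${USER_POOL_ID}"
echo "$OPENID_ISSUER"
```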
**Configuration Methods:**
1. **During Deployment**: The interactive deployment script will prompt for SSO settings
2. **Post-Deployment**: Update the CloudFormation stack with SSO parameters
3. **Environment Variables**: Configure directly in the ECS task definition
**SSO Setup Steps:**
1. **Create AWS Cognito User Pool**:
- Create a new User Pool in AWS Cognito
- Configure sign-in options (email, username, etc.)
- Set up password policies and MFA if desired
- Configure attribute mappings for name and email
2. **Create App Client**:
- Create an App Client in your User Pool
- Enable "Generate client secret"
- Configure OAuth 2.0 settings:
- Allowed OAuth Flows: Authorization code grant
- Allowed OAuth Scopes: openid, profile, email
- Callback URLs: `https://your-domain/oauth/openid/callback`
- Sign out URLs: `https://your-domain`
3. **Configure Identity Provider (Optional)**:
- Add SAML, OIDC, or social identity providers to Cognito
- Configure attribute mappings between IdP and Cognito
- Test the identity provider integration
4. **Deploy with SSO**:
```bash
./deploy-clean.sh
# Choose "y" when prompted for SSO configuration
# Provide the required Cognito User Pool ID, Client ID, and Client Secret
```
5. **Verify SSO Integration**:
- Access LibreChat URL
- Click the SSO login button (customizable label)
- Complete authentication flow through Cognito
- Verify user attributes are mapped correctly
**Important Notes:**
- SSO configuration is completely optional
- If SSO is not configured, LibreChat uses standard email/password authentication
- SSO settings can be added or modified after initial deployment
- Ensure Cognito User Pool and App Client configuration is complete before enabling SSO
- The callback URL must match exactly what's configured in your Cognito App Client
**Adding SSO After Initial Deployment:**
If you deployed without SSO initially, you can add it later:
1. **Update CloudFormation Stack**:
```bash
aws cloudformation update-stack \
--stack-name your-stack-name \
--use-previous-template \
--parameters ParameterKey=EnableSSO,ParameterValue="true" \
ParameterKey=CognitoUserPoolId,ParameterValue="your-user-pool-id" \
ParameterKey=OpenIdClientId,ParameterValue="your-client-id" \
ParameterKey=OpenIdClientSecret,ParameterValue="your-client-secret" \
--capabilities CAPABILITY_IAM
```
2. **Or Re-run Deployment Script**:
```bash
./deploy-clean.sh --load-config
# Choose "y" for SSO configuration when prompted
```
**Supported Identity Providers:**
Through AWS Cognito, you can integrate with:
- **SAML 2.0**: Enterprise identity providers (Active Directory, Okta, etc.)
- **OpenID Connect**: OIDC-compliant providers
- **Social Providers**: Google, Facebook, Amazon, Apple
- **Custom Providers**: Any OAuth 2.0 or SAML 2.0 compliant system
### 4. Set Up AI Provider API Keys
Configure your AI providers in the LibreChat interface:
**Supported Providers:**
- **OpenAI**: GPT-4, GPT-3.5, DALL-E, Whisper
- **Anthropic**: Claude 3.5 Sonnet, Claude 3 Opus/Haiku
- **Google**: Gemini Pro, Gemini Vision
- **Azure OpenAI**: Enterprise OpenAI models
- **AWS Bedrock**: Claude, Titan, Llama models
- **Groq**: Fast inference for Llama, Mixtral
- **OpenRouter**: Access to multiple model providers
- **Custom Endpoints**: Any OpenAI-compatible API
**Configuration Methods:**
- **Environment Variables**: Pre-configure in deployment (more secure)
- **YAML File**: Certain configuration options are set via `librechat.yaml`
<!-- ### 5. File Upload & Storage Configuration
The deployment automatically configures S3 for file storage:
- **Upload Limits**: Configure max file sizes in admin panel
- **Supported Formats**: PDFs, images, text files, code files
- **Storage Strategy**: S3 (configured automatically)
- **CDN Integration**: Ready for CloudFront if needed -->
### 5. Advanced Configuration Options
<!-- **Conversation Search (Optional):**
- Deploy Meilisearch for full-text conversation search
- Enables typo-tolerant search across chat history
- Can be added as additional ECS service
**RAG Integration (Optional):**
- Configure vector database for document Q&A
- Supports PDF uploads with semantic search
- Integrates with embedding providers
**Rate Limiting:**
- Configure per-user rate limits
- Set up token usage tracking
- Monitor costs across providers -->
### 6. Monitoring & Maintenance
**CloudWatch Dashboards:**
- ECS service metrics (CPU, memory, task count)
- Load balancer performance (response time, error rates)
- Database metrics (DocumentDB and Redis)
- Application logs and error tracking
**Automated Scaling:**
- Monitors CPU utilization (target: 70%)
- Scales from 2-20 instances automatically
- Uses 80% Spot instances for cost optimization
**Health Checks:**
- Application-level health checks
- Database connectivity monitoring
- Automatic unhealthy task replacement
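On the load balancer side, these checks can be sketched as a target group resource. The values below are illustrative assumptions (port 3080 is LibreChat's default, and `/health` is a common liveness path); verify both against the repo's `template.yaml`:

```yaml
# Sketch only -- check path, port, and thresholds against template.yaml.
LibreChatTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    TargetType: ip                 # required for Fargate tasks
    Port: 3080                     # LibreChat's default container port
    Protocol: HTTP
    VpcId: !Ref VpcId
    HealthCheckPath: /health
    HealthCheckIntervalSeconds: 30
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 3
```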
## Monitoring and Maintenance
### CloudWatch Logs
View application logs:
```bash
aws logs tail /ecs/librechat --follow
```
### ECS Service Status
Check service health:
```bash
aws ecs describe-services --cluster librechat-cluster --services librechat-service
```
### Database Monitoring
- DocumentDB metrics available in CloudWatch
- ElastiCache Redis metrics and performance insights
- Set up CloudWatch alarms for critical metrics
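As a starting point, a CPU alarm on the ECS service can be expressed as a CloudFormation resource like the following. The threshold, names, and the commented SNS topic are assumptions to adapt, not values from this repo:

```yaml
# Hypothetical alarm sketch; adjust names and threshold to your stack.
HighCPUAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: ECS service CPU sustained above 85%
    Namespace: AWS/ECS
    MetricName: CPUUtilization
    Dimensions:
      - Name: ClusterName
        Value: librechat-cluster
      - Name: ServiceName
        Value: librechat-service
    Statistic: Average
    Period: 300
    EvaluationPeriods: 2
    Threshold: 85
    ComparisonOperator: GreaterThanThreshold
    # AlarmActions: [ !Ref AlertsTopic ]   # hypothetical SNS topic
```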
### Cost Optimization
- Monitor Fargate Spot vs On-Demand usage
- Review DocumentDB and ElastiCache instance sizes
- Set up billing alerts
## Scaling Considerations
### Horizontal Scaling (Automatic)
The deployment automatically handles horizontal scaling:
**ECS Auto Scaling:**
- **Minimum**: 2 instances (high availability)
- **Maximum**: 20 instances (configurable)
- **Trigger**: 70% CPU utilization average
- **Scale Out**: Add instances when CPU > 70% for 5 minutes
- **Scale In**: Remove instances when CPU < 70% for 5 minutes
- **Cooldown**: 5-minute intervals between scaling actions
**Database Scaling:**
- **DocumentDB**: Supports up to 15 read replicas for read scaling
- **ElastiCache Redis**: Supports cluster mode for memory scaling
- **Connection Pooling**: Efficient database connection management
### Vertical Scaling (Manual)
For higher per-instance performance:
**ECS Task Scaling:**
```yaml
# In template.yaml, modify:
Cpu: 2048 # Double CPU (1024 -> 2048)
Memory: 4096 # Double memory (2048 -> 4096)
```
**Database Scaling:**
```yaml
# Upgrade DocumentDB instances:
DBInstanceClass: db.r5.large # From db.t3.medium
DBInstanceClass: db.r5.xlarge # For heavy workloads
# Upgrade Redis instances:
NodeType: cache.r6g.large # From cache.t3.micro
```
### Global Scaling (Multi-Region)
For worldwide deployment:
<!-- 1. **Deploy in Multiple Regions**:
```bash
./deploy.sh -r us-east-1 -s librechat-us-east
./deploy.sh -r eu-west-1 -s librechat-eu-west
./deploy.sh -r ap-southeast-1 -s librechat-asia
```
2. **Route 53 Setup**:
- Health checks for each region
- Latency-based routing
- Automatic failover
3. **Data Synchronization**:
- DocumentDB Global Clusters
- S3 Cross-Region Replication
- Redis Global Datastore -->
### Load Testing
Before production deployment, perform load testing:
```bash
# Example load test with Apache Bench
ab -n 10000 -c 100 http://your-load-balancer-url/
# Or use more sophisticated tools:
# - Artillery.io for API testing
# - JMeter for comprehensive testing
# - Locust for Python-based testing
```
### Capacity Planning
Plan for growth with these guidelines:
**User Scaling:**
- **Light Users**: 1 instance per 100 concurrent users
- **Medium Users**: 1 instance per 50 concurrent users
- **Heavy Users**: 1 instance per 25 concurrent users
**Database Scaling:**
- **DocumentDB**: 1000 connections per db.t3.medium
- **Redis**: 65,000 connections per cache.t3.micro
- **Storage**: Plan 1GB per 1000 conversations
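Those guidelines translate directly into a sizing estimate. A throwaway helper that ceiling-divides expected concurrent users by the users-per-instance ratio (the ratios are the rough guidelines above, not measured limits):

```bash
# Estimate ECS task count from concurrent users (ceiling division).
# Ratios come from the capacity-planning guidelines above.
estimate_instances() {
  users=$1
  per_instance=$2
  echo $(( (users + per_instance - 1) / per_instance ))
}

estimate_instances 300 100   # light users  -> 3 instances
estimate_instances 500 50    # medium users -> 10 instances
estimate_instances 500 25    # heavy users  -> 20 instances (the default max)
```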
## Security Best Practices
### Network Security
- All databases in private subnets
- Security groups with minimal required access
- Optional NAT gateways or Transit Gateway for outbound internet access
- Flexible internet connectivity based on existing infrastructure
### Data Security
- Encryption at rest for all data stores
- Encryption in transit for Redis
- S3 bucket encryption and versioning
- Secrets Manager for sensitive data
### Access Control
- IAM roles with least privilege
- ECS task roles for service-specific permissions
- No hardcoded credentials
## Troubleshooting
### Common Issues
**Deployment Fails:**
```bash
# Check CloudFormation events
aws cloudformation describe-stack-events --stack-name librechat
# Check SAM logs
sam logs -n ECSService --stack-name librechat
```
**Service Won't Start:**
```bash
# Check ECS task logs
aws ecs describe-tasks --cluster librechat-cluster --tasks <task-arn>
# Check CloudWatch logs
aws logs tail /ecs/librechat --follow
```
**Database Connection Issues:**
- Verify security group rules
- Check DocumentDB cluster status
- Validate connection strings in Secrets Manager
### Performance Issues
- Monitor ECS service CPU/memory utilization
- Check DocumentDB performance insights
- Review ElastiCache Redis metrics
- Analyze ALB target group health
## Cleanup
To remove all resources:
```bash
aws cloudformation delete-stack --stack-name librechat
```
**Note:** This will delete all data. Ensure you have backups if needed.
## Cost Optimization & Estimation
### Cost Optimization Features
This deployment is optimized for cost efficiency while maintaining high availability:
**Fargate Spot Integration:**
- **80% Spot Instances**: Up to 70% cost savings on compute
- **20% On-Demand**: Ensures availability during Spot interruptions
- **Automatic Failover**: Seamless transition between Spot and On-Demand
**Right-Sizing Strategy:**
- **Auto Scaling**: Only pay for resources you need (2-20 instances)
- **Efficient Instance Types**: Optimized CPU/memory ratios
- **Database Optimization**: DocumentDB and Redis sized for typical workloads
**Storage Optimization:**
- **S3 Intelligent Tiering**: Automatic cost optimization for file storage
- **Lifecycle Policies**: Automatic cleanup of incomplete uploads
- **Compression**: Efficient storage of conversation data
### Monthly Cost Estimation (US-East-1)
**Base Infrastructure (Minimum 2 instances):**
- **ECS Fargate (2 instances)**: ~$30-50/month
- 80% Spot pricing: ~$24-40/month
- 20% On-Demand: ~$6-10/month
- **DocumentDB (2x db.t3.medium)**: ~$100-120/month
- **ElastiCache Redis (2x cache.t3.micro)**: ~$30-40/month
- **Application Load Balancer**: ~$20/month
- **NAT Gateway (2 AZs) - Optional**: ~$90/month
- **Base cost**: $45/month per NAT Gateway × 2 = $90/month
- **Data processing**: $0.045 per GB processed
- **High availability**: Automatic failover between AZs
- **Performance**: Up to 45 Gbps bandwidth per gateway
- **S3 Storage**: ~$5-25/month (depending on usage)
- **Data Transfer**: ~$10-30/month (depending on traffic)
**Total Monthly Cost Ranges:**
**With NAT Gateways (Standard AWS Pattern):**
- **Light Usage (2-3 instances)**: ~$285-335/month
- **Medium Usage (5-8 instances)**: ~$380-480/month
- **Heavy Usage (10-20 instances)**: ~$530-830/month
**Without NAT Gateways (Transit Gateway Pattern):**
- **Light Usage (2-3 instances)**: ~$195-245/month
- **Medium Usage (5-8 instances)**: ~$290-390/month
- **Heavy Usage (10-20 instances)**: ~$440-740/month
**NAT Gateway vs Transit Gateway Comparison:**
- **NAT Gateway Benefits**: 99.95% SLA, zero maintenance, 45 Gbps performance, built-in DDoS protection
- **Transit Gateway Benefits**: ~$90/month cost savings, leverages existing infrastructure, centralized routing
- **Cost Difference**: ~$90/month for NAT Gateway option
- **Performance**: NAT Gateway typically faster for internet access, Transit Gateway may have additional latency
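The cost delta above can be sanity-checked with a quick calculator built from the per-gateway ($45/month) and per-GB ($0.045) figures quoted earlier; verify both constants against current AWS pricing before relying on them:

```bash
# NAT Gateway monthly cost: $45/month per gateway + $0.045 per GB processed.
# Constants are the figures quoted in this README, not live pricing.
nat_monthly_cost() {
  awk -v g="$1" -v gb="$2" 'BEGIN { printf "%.2f\n", g * 45 + gb * 0.045 }'
}

nat_monthly_cost 2 0      # 2 AZs, no traffic  -> 90.00
nat_monthly_cost 2 1000   # 2 AZs, 1 TB/month  -> 135.00
```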
**Cost Comparison:**
- **Traditional EC2**: 40-60% more expensive
- **Managed Services**: 70-80% more expensive than self-managed
- **Multi-Cloud**: This deployment is 50-70% cheaper than equivalent GCP/Azure deployments
### Cost Monitoring & Alerts
- **AWS Cost Explorer**: Track spending by service
- **Billing Alerts**: Set up budget notifications
- **Resource Tagging**: Track costs by environment/team
- **Spot Instance Savings**: Monitor Spot vs On-Demand usage
### Additional Cost Optimization Tips
1. **Use Reserved Instances**: For DocumentDB if usage is predictable
2. **Enable S3 Intelligent Tiering**: Automatic storage class optimization
3. **Monitor Data Transfer**: Optimize between AZs and regions
4. **Regular Cleanup**: Remove unused resources and old backups
5. **Right-Size Databases**: Monitor and adjust instance types based on usage
## Support
For issues related to:
- **LibreChat**: [LibreChat GitHub](https://github.com/danny-avila/LibreChat)
- **AWS SAM**: [AWS SAM Documentation](https://docs.aws.amazon.com/serverless-application-model/)
- **This deployment**: Create an issue in this repository
## License
This deployment template is provided under the MIT License. LibreChat itself is licensed under the MIT License.

File diff suppressed because it is too large


@@ -0,0 +1,108 @@
# Minimal LibreChat config for AWS SAM deploy
# Copy this file to librechat.yaml and customize for your deployment.
# For full options, see: https://www.librechat.ai/docs/configuration/librechat_yaml
# Configuration version (required)
version: 1.2.8
# Cache settings
cache: true
# File storage configuration
fileStrategy: "s3"
# Transaction settings
transactions:
  enabled: true

interface:
  mcpServers:
    placeholder: "Select MCP Servers"
    use: true
    create: true
    share: true
    trustCheckbox:
      label: "I trust this server"
      subLabel: "Only enable servers you trust"
  privacyPolicy:
    externalUrl: "https://example.com/privacy"
    openNewTab: true
  termsOfService:
    externalUrl: "https://example.com/terms"
    openNewTab: true
    modalAcceptance: true
    modalTitle: "Terms of Service"
    modalContent: |
      # Terms of Service
      ## Introduction
      Welcome to LibreChat!
  modelSelect: true
  parameters: true
  sidePanel: true
  presets: false
  prompts: false
  bookmarks: false
  multiConvo: true
  agents: true
  customWelcome: "Welcome to LibreChat!"
  runCode: true
  webSearch: true
  fileSearch: true
  fileCitations: true

# MCP Servers Configuration (customize or add your own)
# Use env var placeholders for secrets, e.g. ${MCP_SOME_TOKEN}
mcpServers:
  # Example: third-party MCP
  # Deepwiki:
  #   url: "https://mcp.deepwiki.com/mcp"
  #   name: "DeepWiki"
  #   description: "DeepWiki MCP Server..."
  #   type: "streamable-http"
  # Example: your own MCP (replace with your API URL and token env var)
  # MyMcp:
  #   name: "My MCP Server"
  #   description: "Description of the server"
  #   url: "https://YOUR_API_ID.execute-api.YOUR_REGION.amazonaws.com/dev/mcp/your_mcp"
  #   type: "streamable-http"
  #   headers:
  #     Authorization: "Bearer ${MCP_MY_TOKEN}"

# Registration (optional)
# registration:
#   socialLogins: ['saml', 'github', 'google', 'openid', ...]
registration:
  socialLogins:
    - "saml"
    - "openid"
  # allowedDomains:
  #   - "example.edu"
  #   - "*.example.edu"

# Balance settings (optional)
balance:
  enabled: true
  startBalance: 650000
  autoRefillEnabled: true
  refillIntervalValue: 1440
  refillIntervalUnit: "minutes"
  refillAmount: 250000

# Custom endpoints (e.g. Bedrock)
endpoints:
  # bedrock:
  #   cache: true
  #   promptCache: true
  #   titleModel: "us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# Model specs default model selection for new users
# modelSpecs:
#   prioritize: true
#   list:
#     - name: "my-default"
#       label: "My Default Model"
#       description: "Default model for new conversations"
#       default: true
#       preset:
#         endpoint: "bedrock"
#         model: "us.anthropic.claude-sonnet-4-5-20250929-v1:0"


@@ -0,0 +1,235 @@
# LibreChat Admin Scripts
This directory contains utility scripts for managing your LibreChat deployment.
## Managing Admin Users
### Grant Admin Permissions
To grant admin permissions to a user:
```bash
./scripts/make-admin.sh user@domain.edu
```
### Remove Admin Permissions
To remove admin permissions from a user (demote to regular user):
```bash
./scripts/make-admin.sh user@domain.edu --remove
```
### How It Works
The script:
1. Spins up a one-off ECS task using your existing task definition
2. Connects to MongoDB using the same credentials as your running application
3. Updates the user's role to ADMIN or USER
4. Waits for completion and reports success/failure
5. Automatically cleans up the task
The user will need to log out and log back in for changes to take effect.
## Managing User Balance
### Add Balance to a User
To add tokens to a user's balance:
```bash
./scripts/add-balance.sh user@domain.edu 1000
```
This will add 1000 tokens to the user's account.
### Requirements
- Balance must be enabled in `librechat.yaml`:
```yaml
balance:
  enabled: true
  startBalance: 600000
  autoRefillEnabled: true
  refillIntervalValue: 1440
  refillIntervalUnit: 'minutes'
  refillAmount: 100000
```
### How It Works
The script:
1. Validates that balance is enabled in your configuration
2. Finds the user by email
3. Creates a transaction record with the specified amount
4. Updates the user's balance
5. Reports the new balance
### Common Use Cases
```bash
# Give a new user initial credits
./scripts/add-balance.sh newuser@domain.edu 5000
# Top up a user who ran out
./scripts/add-balance.sh user@domain.edu 10000
# Grant bonus credits
./scripts/add-balance.sh poweruser@domain.edu 50000
```
## Manual AWS CLI Commands
If you prefer to run commands manually or need to troubleshoot:
### 1. Get your cluster and network configuration
```bash
# Load your deployment config
source .librechat-deploy-config
CLUSTER_NAME="${STACK_NAME}-cluster"
REGION="${REGION:-us-east-1}"
# Get network configuration from existing service
aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "${STACK_NAME}-service" \
--region "$REGION" \
--query 'services[0].networkConfiguration.awsvpcConfiguration'
```
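The helper scripts turn that JSON into the comma-separated lists `run-task` expects using two `jq` filters. The same transformation against a canned response (the subnet and security-group IDs are the placeholders used in step 2):

```bash
# Extract comma-separated subnet and security-group lists from the
# awsvpcConfiguration JSON returned in step 1 (canned example values).
SERVICE_INFO='{"subnets":["subnet-xxx","subnet-yyy"],"securityGroups":["sg-xxxxx"],"assignPublicIp":"DISABLED"}'
SUBNETS=$(echo "$SERVICE_INFO" | jq -r '.subnets | join(",")')
SECURITY_GROUPS=$(echo "$SERVICE_INFO" | jq -r '.securityGroups | join(",")')

echo "$SUBNETS"           # prints: subnet-xxx,subnet-yyy
echo "$SECURITY_GROUPS"   # prints: sg-xxxxx
```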
### 2. Run a one-off task to manage admin role
```bash
# Set the user email and action
USER_EMAIL="user@domain.edu"
TARGET_ROLE="ADMIN" # or "USER" to remove admin
# Get task definition
TASK_DEF=$(aws ecs describe-task-definition \
--task-definition "${STACK_NAME}-task" \
--region "$REGION" \
--query 'taskDefinition.taskDefinitionArn' \
--output text)
# Create the command
SHELL_CMD="cd /app/api && cat > manage-admin.js << 'EOFSCRIPT'
const path = require('path');
require('module-alias')({ base: path.resolve(__dirname) });
const mongoose = require('mongoose');
const { updateUser, findUser } = require('~/models');
(async () => {
try {
await mongoose.connect(process.env.MONGO_URI);
const user = await findUser({ email: '$USER_EMAIL' });
if (!user) {
console.error('User not found');
process.exit(1);
}
await updateUser(user._id, { role: '$TARGET_ROLE' });
console.log('User role updated to $TARGET_ROLE');
await mongoose.connection.close();
} catch (err) {
console.error('Error:', err.message);
process.exit(1);
}
})();
EOFSCRIPT
node manage-admin.js"
# Build JSON with jq
OVERRIDES=$(jq -n --arg cmd "$SHELL_CMD" '{
containerOverrides: [{
name: "librechat",
command: ["sh", "-c", $cmd]
}]
}')
# Run the task (replace SUBNETS and SECURITY_GROUPS with values from step 1)
aws ecs run-task \
--cluster "$CLUSTER_NAME" \
--task-definition "$TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-xxx,subnet-yyy],securityGroups=[sg-xxxxx],assignPublicIp=DISABLED}" \
--overrides "$OVERRIDES" \
--region "$REGION"
```
## Troubleshooting
### Task fails to start
- Check that your ECS service is running
- Verify network configuration (subnets, security groups)
- Check CloudWatch Logs: `/aws/ecs/${STACK_NAME}`
### User not found error
- Verify the email address is correct
- Check that the user has logged in at least once
- Email addresses are case-sensitive
### MongoDB connection fails
- Verify the MONGO_URI environment variable is set correctly in the task
- Check that the security group allows access to DocumentDB (port 27017)
- Ensure the task is running in the same VPC as DocumentDB
### Changes don't take effect
- User must log out and log back in for role changes to apply
- Check CloudWatch Logs to confirm the update was successful
- Verify the exit code was 0 (success)
### Balance not enabled error
- Ensure `balance.enabled: true` is set in `librechat.yaml`
- Restart your ECS service after updating the configuration
- Verify the config file is properly mounted in the container
### Invalid amount error
- Amount must be a positive integer
- Do not use decimals or negative numbers
- Example: `1000` not `1000.5` or `-1000`
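For reference, the amount check is a plain integer regex; a standalone sketch you can run locally:

```bash
# Same style of validation the scripts apply before running the ECS task:
# accept only unsigned integers (no signs, decimals, or letters).
validate_amount() {
  [[ "$1" =~ ^[0-9]+$ ]]
}

validate_amount 1000 && echo "1000 is valid"       # prints: 1000 is valid
validate_amount 1000.5 || echo "1000.5 rejected"   # prints: 1000.5 rejected
validate_amount -1000 || echo "-1000 rejected"     # prints: -1000 rejected
```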
## Security Notes
- These scripts use your existing task definition with all environment variables
- The MongoDB connection uses the same credentials as your running application
- Tasks run in your private subnets with no public IP
- All commands are logged to CloudWatch Logs
- One-off tasks automatically stop after completion
## Alternative: Use OpenID Groups (Recommended for Production)
Instead of manually managing admin users, consider using OpenID groups for automatic role assignment:
### Setup
1. **In AWS Cognito**, create a group called "admin"
2. **Add users** to that group through the Cognito console
3. **Configure LibreChat** (already done in `.env.local`):
```bash
OPENID_ADMIN_ROLE=admin
OPENID_ADMIN_ROLE_PARAMETER_PATH=cognito:groups
OPENID_ADMIN_ROLE_TOKEN_KIND=id_token
```
4. **Users automatically get admin permissions** on their next login
### Benefits
- No database access required
- Centralized user management in Cognito
- Automatic role assignment on login
- Easier to audit and manage at scale
- Role changes take effect immediately on next login
### When to Use the Script vs OpenID Groups
**Use the script when:**
- You need to quickly grant/revoke admin access
- You're troubleshooting or testing
- You have a one-time admin setup need
**Use OpenID groups when:**
- Managing multiple admins
- You want centralized access control
- You need audit trails through Cognito
- You want automatic role management


@@ -0,0 +1,208 @@
#!/bin/bash
# Script to add balance to a user by running a one-off ECS task
# Usage: ./scripts/add-balance.sh <user-email> <amount>
set -e
# Check if arguments are provided
if [ -z "$1" ] || [ -z "$2" ]; then
echo "Usage: $0 <user-email> <amount>"
echo ""
echo "Examples:"
echo " Add 1000 tokens: $0 user@domain.com 1000"
echo " Add 5000 tokens: $0 user@domain.com 5000"
echo ""
echo "Note: Balance must be enabled in librechat.yaml"
exit 1
fi
USER_EMAIL="$1"
AMOUNT="$2"
# Validate amount is a number
if ! [[ "$AMOUNT" =~ ^[0-9]+$ ]]; then
    echo "Error: Amount must be a positive integer (whole number, no sign or decimals)"
exit 1
fi
# Load configuration
if [ ! -f .librechat-deploy-config ]; then
echo "Error: .librechat-deploy-config not found"
exit 1
fi
source .librechat-deploy-config
# Set variables
CLUSTER_NAME="${STACK_NAME}-cluster"
TASK_FAMILY="${STACK_NAME}-task"
REGION="${REGION:-us-east-1}"
echo "=========================================="
echo "Adding balance to user: $USER_EMAIL"
echo "Amount: $AMOUNT tokens"
echo "Stack: $STACK_NAME"
echo "Region: $REGION"
echo "=========================================="
# Get VPC configuration from the existing service
echo "Getting network configuration from existing service..."
SERVICE_INFO=$(aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "${STACK_NAME}-service" \
--region "$REGION" \
--query 'services[0].networkConfiguration.awsvpcConfiguration' \
--output json)
SUBNETS=$(echo "$SERVICE_INFO" | jq -r '.subnets | join(",")')
SECURITY_GROUPS=$(echo "$SERVICE_INFO" | jq -r '.securityGroups | join(",")')
echo "Subnets: $SUBNETS"
echo "Security Groups: $SECURITY_GROUPS"
# Get the task definition
echo "Getting task definition..."
TASK_DEF=$(aws ecs describe-task-definition \
--task-definition "$TASK_FAMILY" \
--region "$REGION" \
--query 'taskDefinition.taskDefinitionArn' \
--output text)
echo "Task Definition: $TASK_DEF"
# Run the one-off task
echo "Starting ECS task to add balance..."
# Create a Node.js script that mimics the add-balance.js functionality
SHELL_CMD="cd /app/api && cat > add-balance-task.js << 'EOFSCRIPT'
// Setup module-alias like LibreChat does
const path = require('path');
require('module-alias')({ base: path.resolve(__dirname) });
const mongoose = require('mongoose');
const { getBalanceConfig } = require('@librechat/api');
const { User } = require('@librechat/data-schemas').createModels(mongoose);
const { createTransaction } = require('~/models/Transaction');
const { getAppConfig } = require('~/server/services/Config');
const email = '$USER_EMAIL';
const amount = $AMOUNT;
(async () => {
try {
// Connect to MongoDB
console.log('Connecting to MongoDB...');
await mongoose.connect(process.env.MONGO_URI);
console.log('Connected to MongoDB');
// Get app config and balance config
console.log('Loading configuration...');
const appConfig = await getAppConfig();
const balanceConfig = getBalanceConfig(appConfig);
if (!balanceConfig?.enabled) {
console.error('Error: Balance is not enabled. Use librechat.yaml to enable it');
await mongoose.connection.close();
process.exit(1);
}
// Find the user
console.log('Looking for user:', email);
const user = await User.findOne({ email }).lean();
if (!user) {
console.error('Error: No user with that email was found!');
await mongoose.connection.close();
process.exit(1);
}
console.log('Found user:', user.email);
// Create transaction and update balance
console.log('Creating transaction for', amount, 'tokens...');
const result = await createTransaction({
user: user._id,
tokenType: 'credits',
context: 'admin',
rawAmount: +amount,
balance: balanceConfig,
});
if (!result?.balance) {
console.error('Error: Something went wrong while updating the balance!');
await mongoose.connection.close();
process.exit(1);
}
// Success!
console.log('✅ Transaction created successfully!');
console.log('Amount added:', amount);
console.log('New balance:', result.balance);
await mongoose.connection.close();
process.exit(0);
} catch (err) {
console.error('Error:', err.message);
console.error(err.stack);
if (mongoose.connection.readyState === 1) {
await mongoose.connection.close();
}
process.exit(1);
}
})();
EOFSCRIPT
node add-balance-task.js"
# Build the overrides JSON using jq for proper escaping
OVERRIDES=$(jq -n \
--arg cmd "$SHELL_CMD" \
'{
containerOverrides: [{
name: "librechat",
command: ["sh", "-c", $cmd]
}]
}')
echo "Running command in container..."
TASK_ARN=$(aws ecs run-task \
--cluster "$CLUSTER_NAME" \
--task-definition "$TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SECURITY_GROUPS],assignPublicIp=DISABLED}" \
--overrides "$OVERRIDES" \
--region "$REGION" \
--query 'tasks[0].taskArn' \
--output text)
echo "Task started: $TASK_ARN"
echo ""
echo "Waiting for task to complete..."
echo "You can monitor the task with:"
echo " aws ecs describe-tasks --cluster $CLUSTER_NAME --tasks $TASK_ARN --region $REGION"
echo ""
echo "Or view logs in CloudWatch Logs:"
echo " Log Group: /aws/ecs/${STACK_NAME}"
echo ""
# Wait for task to complete
aws ecs wait tasks-stopped \
--cluster "$CLUSTER_NAME" \
--tasks "$TASK_ARN" \
--region "$REGION"
# Check task exit code
EXIT_CODE=$(aws ecs describe-tasks \
--cluster "$CLUSTER_NAME" \
--tasks "$TASK_ARN" \
--region "$REGION" \
--query 'tasks[0].containers[0].exitCode' \
--output text)
if [ "$EXIT_CODE" = "0" ]; then
echo "✅ Success! Added $AMOUNT tokens to $USER_EMAIL"
echo "Check CloudWatch Logs for the new balance."
else
echo "❌ Task failed with exit code: $EXIT_CODE"
echo "Check CloudWatch Logs for details."
exit 1
fi


@@ -0,0 +1,201 @@
#!/bin/bash
# Script to flush Redis cache by running a one-off ECS task
# Usage: ./scripts/flush-redis-cache.sh
set -e
# Load configuration
if [ ! -f .librechat-deploy-config ]; then
echo "Error: .librechat-deploy-config not found"
exit 1
fi
source .librechat-deploy-config
# Set variables
CLUSTER_NAME="${STACK_NAME}-cluster"
TASK_FAMILY="${STACK_NAME}-task"
REGION="${REGION:-us-east-1}"
echo "=========================================="
echo "Flushing Redis Cache"
echo "Stack: $STACK_NAME"
echo "Region: $REGION"
echo "=========================================="
# Get VPC configuration from the existing service
echo "Getting network configuration from existing service..."
SERVICE_INFO=$(aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "${STACK_NAME}-service" \
--region "$REGION" \
--query 'services[0].networkConfiguration.awsvpcConfiguration' \
--output json)
SUBNETS=$(echo "$SERVICE_INFO" | jq -r '.subnets | join(",")')
SECURITY_GROUPS=$(echo "$SERVICE_INFO" | jq -r '.securityGroups | join(",")')
echo "Subnets: $SUBNETS"
echo "Security Groups: $SECURITY_GROUPS"
# Get the task definition
echo "Getting task definition..."
TASK_DEF=$(aws ecs describe-task-definition \
--task-definition "$TASK_FAMILY" \
--region "$REGION" \
--query 'taskDefinition.taskDefinitionArn' \
--output text)
echo "Task Definition: $TASK_DEF"
# Run the one-off task
echo "Starting ECS task to flush Redis cache..."
# Inline Node.js script to flush Redis cache
FLUSH_SCRIPT='
const IoRedis = require("ioredis");
const isEnabled = (value) => value === "true" || value === true;
async function flushRedis() {
try {
console.log("🔍 Connecting to Redis...");
const urls = (process.env.REDIS_URI || "").split(",").map((uri) => new URL(uri));
const username = urls[0]?.username || process.env.REDIS_USERNAME;
const password = urls[0]?.password || process.env.REDIS_PASSWORD;
const redisOptions = {
username: username,
password: password,
connectTimeout: 10000,
maxRetriesPerRequest: 3,
enableOfflineQueue: true,
lazyConnect: false,
};
const useCluster = urls.length > 1 || isEnabled(process.env.USE_REDIS_CLUSTER);
let redis;
if (useCluster) {
const clusterOptions = {
redisOptions,
enableOfflineQueue: true,
};
if (isEnabled(process.env.REDIS_USE_ALTERNATIVE_DNS_LOOKUP)) {
clusterOptions.dnsLookup = (address, callback) => callback(null, address);
}
redis = new IoRedis.Cluster(
urls.map((url) => ({ host: url.hostname, port: parseInt(url.port, 10) || 6379 })),
clusterOptions,
);
} else {
redis = new IoRedis(process.env.REDIS_URI, redisOptions);
}
await new Promise((resolve, reject) => {
const timeout = setTimeout(() => reject(new Error("Connection timeout")), 10000);
redis.once("ready", () => { clearTimeout(timeout); resolve(); });
redis.once("error", (err) => { clearTimeout(timeout); reject(err); });
});
console.log("✅ Connected to Redis");
let keyCount = 0;
try {
if (useCluster) {
const nodes = redis.nodes("master");
for (const node of nodes) {
const keys = await node.keys("*");
keyCount += keys.length;
}
} else {
const keys = await redis.keys("*");
keyCount = keys.length;
}
} catch (_error) {}
if (useCluster) {
const nodes = redis.nodes("master");
await Promise.all(nodes.map((node) => node.flushdb()));
console.log(`✅ Redis cluster cache flushed successfully (${nodes.length} master nodes)`);
} else {
await redis.flushdb();
console.log("✅ Redis cache flushed successfully");
}
if (keyCount > 0) {
console.log(` Deleted ${keyCount} keys`);
}
await redis.disconnect();
console.log("⚠️ Note: All users will need to re-authenticate");
process.exit(0);
} catch (error) {
console.error("❌ Error flushing Redis cache:", error.message);
process.exit(1);
}
}
flushRedis();
'
SHELL_CMD="cd /app && node -e '$FLUSH_SCRIPT'"
# Build the overrides JSON using jq for proper escaping
OVERRIDES=$(jq -n \
--arg cmd "$SHELL_CMD" \
'{
containerOverrides: [{
name: "librechat",
command: ["sh", "-c", $cmd]
}]
}')
echo "Running command in container..."
TASK_ARN=$(aws ecs run-task \
--cluster "$CLUSTER_NAME" \
--task-definition "$TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SECURITY_GROUPS],assignPublicIp=DISABLED}" \
--overrides "$OVERRIDES" \
--region "$REGION" \
--query 'tasks[0].taskArn' \
--output text)
echo "Task started: $TASK_ARN"
echo ""
echo "Waiting for task to complete..."
echo "You can monitor the task with:"
echo " aws ecs describe-tasks --cluster $CLUSTER_NAME --tasks $TASK_ARN --region $REGION"
echo ""
echo "Or view logs in CloudWatch Logs:"
echo " Log Group: /ecs/${STACK_NAME}-task"
echo ""
# Wait for task to complete
aws ecs wait tasks-stopped \
--cluster "$CLUSTER_NAME" \
--tasks "$TASK_ARN" \
--region "$REGION"
# Check task exit code
EXIT_CODE=$(aws ecs describe-tasks \
--cluster "$CLUSTER_NAME" \
--tasks "$TASK_ARN" \
--region "$REGION" \
--query 'tasks[0].containers[0].exitCode' \
--output text)
if [ "$EXIT_CODE" = "0" ]; then
echo "✅ Success! Redis cache has been flushed."
echo ""
echo "⚠️ Note: All users will need to re-authenticate."
else
echo "❌ Task failed with exit code: $EXIT_CODE"
echo "Check CloudWatch Logs for details:"
echo " aws logs tail /ecs/${STACK_NAME}-task --follow --region $REGION"
exit 1
fi


@@ -0,0 +1,212 @@
#!/bin/bash
# Script to manage user admin role by running a one-off ECS task
# Usage: ./scripts/make-admin.sh <user-email> [--remove]
set -e
# Parse arguments
REMOVE_ADMIN=false
USER_EMAIL=""
while [[ $# -gt 0 ]]; do
case $1 in
--remove|-r)
REMOVE_ADMIN=true
shift
;;
*)
USER_EMAIL="$1"
shift
;;
esac
done
# Check if email is provided
if [ -z "$USER_EMAIL" ]; then
echo "Usage: $0 <user-email> [--remove]"
echo ""
echo "Examples:"
echo " Grant admin: $0 user@domain.com"
echo " Remove admin: $0 user@domain.com --remove"
exit 1
fi
# Load configuration
if [ ! -f .librechat-deploy-config ]; then
echo "Error: .librechat-deploy-config not found"
exit 1
fi
source .librechat-deploy-config
# Set variables
CLUSTER_NAME="${STACK_NAME}-cluster"
TASK_FAMILY="${STACK_NAME}-task"
REGION="${REGION:-us-east-1}"
if [ "$REMOVE_ADMIN" = true ]; then
ACTION="Removing admin role from"
TARGET_ROLE="USER"
else
ACTION="Granting admin role to"
TARGET_ROLE="ADMIN"
fi
echo "=========================================="
echo "$ACTION: $USER_EMAIL"
echo "Stack: $STACK_NAME"
echo "Region: $REGION"
echo "=========================================="
# Get VPC configuration from the existing service
echo "Getting network configuration from existing service..."
SERVICE_INFO=$(aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "${STACK_NAME}-service" \
--region "$REGION" \
--query 'services[0].networkConfiguration.awsvpcConfiguration' \
--output json)
SUBNETS=$(echo "$SERVICE_INFO" | jq -r '.subnets | join(",")')
SECURITY_GROUPS=$(echo "$SERVICE_INFO" | jq -r '.securityGroups | join(",")')
echo "Subnets: $SUBNETS"
echo "Security Groups: $SECURITY_GROUPS"
# Get the task definition
echo "Getting task definition..."
TASK_DEF=$(aws ecs describe-task-definition \
--task-definition "$TASK_FAMILY" \
--region "$REGION" \
--query 'taskDefinition.taskDefinitionArn' \
--output text)
echo "Task Definition: $TASK_DEF"
# Run the one-off task
echo "Starting ECS task to update user role..."
# Create a Node.js script that uses LibreChat's models with proper module-alias setup
SHELL_CMD="cd /app/api && cat > manage-admin.js << 'EOFSCRIPT'
// Setup module-alias like LibreChat does
const path = require('path');
require('module-alias')({ base: path.resolve(__dirname) });
const mongoose = require('mongoose');
const { updateUser, findUser } = require('~/models');
const targetRole = '$TARGET_ROLE';
(async () => {
try {
// Connect to MongoDB
console.log('Connecting to MongoDB...');
await mongoose.connect(process.env.MONGO_URI);
console.log('Connected to MongoDB');
// Find the user by email
console.log('Looking for user: $USER_EMAIL');
const user = await findUser({ email: '$USER_EMAIL' });
if (!user) {
console.error('User not found: $USER_EMAIL');
await mongoose.connection.close();
process.exit(1);
}
console.log('Found user:', user.email, 'Current role:', user.role);
// Check if already has target role
if (user.role === targetRole) {
console.log('User already has ' + targetRole + ' role');
await mongoose.connection.close();
process.exit(0);
}
// Update the user role
console.log('Updating user role to ' + targetRole + '...');
const result = await updateUser(user._id, { role: targetRole });
if (result) {
if (targetRole === 'ADMIN') {
console.log('✅ User $USER_EMAIL granted ADMIN role successfully');
} else {
console.log('✅ User $USER_EMAIL removed from ADMIN role successfully');
}
await mongoose.connection.close();
process.exit(0);
} else {
console.error('Failed to update user role');
await mongoose.connection.close();
process.exit(1);
}
} catch (err) {
console.error('Error:', err.message);
console.error(err.stack);
if (mongoose.connection.readyState === 1) {
await mongoose.connection.close();
}
process.exit(1);
}
})();
EOFSCRIPT
node manage-admin.js"
# Build the overrides JSON using jq for proper escaping
OVERRIDES=$(jq -n \
--arg cmd "$SHELL_CMD" \
'{
containerOverrides: [{
name: "librechat",
command: ["sh", "-c", $cmd]
}]
}')
echo "Running command in container..."
TASK_ARN=$(aws ecs run-task \
--cluster "$CLUSTER_NAME" \
--task-definition "$TASK_DEF" \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[$SUBNETS],securityGroups=[$SECURITY_GROUPS],assignPublicIp=DISABLED}" \
--overrides "$OVERRIDES" \
--region "$REGION" \
--query 'tasks[0].taskArn' \
--output text)
echo "Task started: $TASK_ARN"
echo ""
echo "Waiting for task to complete..."
echo "You can monitor the task with:"
echo " aws ecs describe-tasks --cluster $CLUSTER_NAME --tasks $TASK_ARN --region $REGION"
echo ""
echo "Or view logs in CloudWatch Logs:"
echo " Log Group: /aws/ecs/${STACK_NAME}"
echo ""
# Wait for task to complete
aws ecs wait tasks-stopped \
--cluster "$CLUSTER_NAME" \
--tasks "$TASK_ARN" \
--region "$REGION"
# Check task exit code
EXIT_CODE=$(aws ecs describe-tasks \
--cluster "$CLUSTER_NAME" \
--tasks "$TASK_ARN" \
--region "$REGION" \
--query 'tasks[0].containers[0].exitCode' \
--output text)
if [ "$EXIT_CODE" = "0" ]; then
if [ "$REMOVE_ADMIN" = true ]; then
echo "✅ Success! User $USER_EMAIL has been removed from admin role."
else
echo "✅ Success! User $USER_EMAIL has been granted admin permissions."
fi
echo "The user will need to log out and log back in for changes to take effect."
else
echo "❌ Task failed with exit code: $EXIT_CODE"
echo "Check CloudWatch Logs for details."
exit 1
fi


@@ -0,0 +1,153 @@
#!/bin/bash
# Script to manually scale LibreChat ECS service
# Usage: ./scale-service.sh [stack-name] [desired-count]
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Default values
STACK_NAME="${1:-librechat}"
DESIRED_COUNT="${2}"
REGION="${AWS_DEFAULT_REGION:-us-east-1}"
# Function to print colored output
print_status() {
echo -e "${BLUE}[INFO]${NC} $1"
}
print_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
print_warning() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
print_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Show usage if desired count not provided
if [[ -z "$DESIRED_COUNT" ]]; then
echo "Usage: $0 [stack-name] [desired-count]"
echo ""
echo "Examples:"
echo " $0 librechat 5 # Scale to 5 instances"
echo " $0 librechat-dev 1 # Scale dev environment to 1 instance"
exit 1
fi
# Validate desired count is a number
if ! [[ "$DESIRED_COUNT" =~ ^[0-9]+$ ]]; then
print_error "Desired count must be a number"
exit 1
fi
# Check if AWS CLI is available
if ! command -v aws &> /dev/null; then
print_error "AWS CLI is not installed"
exit 1
fi
# Check AWS credentials
if ! aws sts get-caller-identity &> /dev/null; then
print_error "AWS credentials not configured"
exit 1
fi
print_status "Scaling LibreChat service..."
print_status "Stack: $STACK_NAME"
print_status "Desired Count: $DESIRED_COUNT"
print_status "Region: $REGION"
# Get cluster and service names from CloudFormation
CLUSTER_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`ECSClusterName`].OutputValue' \
--output text)
SERVICE_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`ECSServiceName`].OutputValue' \
--output text)
if [[ -z "$CLUSTER_NAME" || -z "$SERVICE_NAME" ]]; then
print_error "Could not find ECS cluster or service in stack $STACK_NAME"
exit 1
fi
print_status "Cluster: $CLUSTER_NAME"
print_status "Service: $SERVICE_NAME"
# Get current service status
CURRENT_STATUS=$(aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "$SERVICE_NAME" \
--region "$REGION" \
--query 'services[0].{
RunningCount: runningCount,
PendingCount: pendingCount,
DesiredCount: desiredCount
}')
print_status "Current service status:"
echo "$CURRENT_STATUS" | jq .
CURRENT_DESIRED=$(echo "$CURRENT_STATUS" | jq -r '.DesiredCount')
if [[ "$CURRENT_DESIRED" == "$DESIRED_COUNT" ]]; then
print_warning "Service is already scaled to $DESIRED_COUNT instances"
exit 0
fi
# Update the service desired count
print_status "Scaling service from $CURRENT_DESIRED to $DESIRED_COUNT instances..."
aws ecs update-service \
--cluster "$CLUSTER_NAME" \
--service "$SERVICE_NAME" \
--desired-count "$DESIRED_COUNT" \
--region "$REGION" \
--query 'service.serviceName' \
--output text
# Wait for deployment to stabilize
print_status "Waiting for service to stabilize..."
aws ecs wait services-stable \
--cluster "$CLUSTER_NAME" \
--services "$SERVICE_NAME" \
--region "$REGION"
print_success "Service scaling completed successfully!"
# Show final service status
print_status "Final service status:"
aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "$SERVICE_NAME" \
--region "$REGION" \
--query 'services[0].{
ServiceName: serviceName,
Status: status,
RunningCount: runningCount,
PendingCount: pendingCount,
DesiredCount: desiredCount
}' \
--output table
# Show running tasks
print_status "Running tasks:"
aws ecs list-tasks \
--cluster "$CLUSTER_NAME" \
--service-name "$SERVICE_NAME" \
--region "$REGION" \
--query 'taskArns' \
--output table


@@ -0,0 +1,130 @@
#!/bin/bash
# Simple config upload script - replaces the complex Python approach
# Usage: ./simple-config-upload.sh <stack-name> [region] [config-file]
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
print_status() {
echo -e "${BLUE}[INFO]${NC} $1"
}
print_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
print_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Parameters
STACK_NAME="$1"
REGION="${2:-us-east-1}"
CONFIG_FILE="${3:-librechat.yaml}"
if [[ -z "$STACK_NAME" ]]; then
print_error "Usage: $0 <stack-name> [region] [config-file]"
exit 1
fi
print_status "Uploading config for stack: $STACK_NAME"
print_status "Region: $REGION"
print_status "Config file: $CONFIG_FILE"
# Get S3 bucket name from CloudFormation outputs
print_status "Getting S3 bucket name from stack outputs..."
BUCKET_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`S3BucketName`].OutputValue' \
--output text 2>/dev/null)
if [[ -z "$BUCKET_NAME" ]]; then
print_error "Could not find S3BucketName in stack outputs"
exit 1
fi
print_success "Found S3 bucket: $BUCKET_NAME"
# Upload config file to S3
if [[ -f "$CONFIG_FILE" ]]; then
print_status "Uploading $CONFIG_FILE to S3..."
    # Test the command directly: with `set -e`, a separate `$?` check is never reached on failure
    if aws s3 cp "$CONFIG_FILE" "s3://$BUCKET_NAME/configs/librechat.yaml" \
        --content-type "application/x-yaml" \
        --region "$REGION"; then
        print_success "Configuration uploaded to s3://$BUCKET_NAME/configs/librechat.yaml"
    else
        print_error "Failed to upload configuration to S3"
        exit 1
    fi
else
print_error "Config file not found: $CONFIG_FILE"
exit 1
fi
# Trigger Config Manager Lambda to copy S3 → EFS
LAMBDA_NAME="${STACK_NAME}-config-manager"
print_status "Triggering config manager Lambda: $LAMBDA_NAME"
# Test the command directly: with `set -e`, a separate `$?` check is never reached on failure.
# Note: AWS CLI v2 may additionally require --cli-binary-format raw-in-base64-out for --payload.
if aws lambda invoke \
    --function-name "$LAMBDA_NAME" \
    --region "$REGION" \
    --payload '{}' \
    /tmp/lambda-response.json >/dev/null 2>&1; then
    print_success "Config manager Lambda executed successfully"
    print_status "Configuration has been copied to EFS"
else
    print_error "Could not invoke config manager Lambda"
    exit 1
fi
# Force ECS service to restart containers
print_status "Getting ECS cluster and service information..."
CLUSTER_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`ECSClusterName`].OutputValue' \
--output text 2>/dev/null)
SERVICE_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`ECSServiceName`].OutputValue' \
--output text 2>/dev/null)
if [[ -n "$CLUSTER_NAME" && -n "$SERVICE_NAME" ]]; then
print_status "Restarting ECS containers to pick up new config..."
print_status "Cluster: $CLUSTER_NAME"
print_status "Service: $SERVICE_NAME"
    # Test the command directly: with `set -e`, a separate `$?` check is never reached on failure
    if aws ecs update-service \
        --cluster "$CLUSTER_NAME" \
        --service "$SERVICE_NAME" \
        --region "$REGION" \
        --force-new-deployment >/dev/null 2>&1; then
        print_success "ECS service restart initiated"
        print_status "Containers will restart with the new configuration"
        print_status "This may take a few minutes to complete"
    else
        print_error "Could not restart ECS service"
        exit 1
    fi
else
print_error "Could not find ECS cluster/service information"
exit 1
fi
print_success "Configuration update completed successfully!"


@@ -0,0 +1,142 @@
#!/bin/bash
# Script to update LibreChat ECS service with new image version
# Usage: ./update-service.sh [stack-name] [image-tag]
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Default values
STACK_NAME="${1:-librechat}"
IMAGE_TAG="${2:-latest}"
REGION="${AWS_DEFAULT_REGION:-us-east-1}"
# Function to print colored output
print_status() {
echo -e "${BLUE}[INFO]${NC} $1"
}
print_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
print_warning() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
print_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check if AWS CLI is available
if ! command -v aws &> /dev/null; then
print_error "AWS CLI is not installed"
exit 1
fi
# Check AWS credentials
if ! aws sts get-caller-identity &> /dev/null; then
print_error "AWS credentials not configured"
exit 1
fi
print_status "Updating LibreChat service..."
print_status "Stack: $STACK_NAME"
print_status "Image Tag: $IMAGE_TAG"
print_status "Region: $REGION"
# Get cluster and service names from CloudFormation
CLUSTER_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`ECSClusterName`].OutputValue' \
--output text)
SERVICE_NAME=$(aws cloudformation describe-stacks \
--stack-name "$STACK_NAME" \
--region "$REGION" \
--query 'Stacks[0].Outputs[?OutputKey==`ECSServiceName`].OutputValue' \
--output text)
if [[ -z "$CLUSTER_NAME" || -z "$SERVICE_NAME" ]]; then
print_error "Could not find ECS cluster or service in stack $STACK_NAME"
exit 1
fi
print_status "Cluster: $CLUSTER_NAME"
print_status "Service: $SERVICE_NAME"
# Get current task definition
TASK_DEF_ARN=$(aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "$SERVICE_NAME" \
--region "$REGION" \
--query 'services[0].taskDefinition' \
--output text)
print_status "Current task definition: $TASK_DEF_ARN"
# Get task definition details
TASK_DEF=$(aws ecs describe-task-definition \
--task-definition "$TASK_DEF_ARN" \
--region "$REGION" \
--query 'taskDefinition')
# Update the image in the task definition
NEW_IMAGE="ghcr.io/danny-avila/librechat:$IMAGE_TAG"
UPDATED_TASK_DEF=$(echo "$TASK_DEF" | jq --arg image "$NEW_IMAGE" '
.containerDefinitions[0].image = $image |
del(.taskDefinitionArn, .revision, .status, .requiresAttributes, .placementConstraints, .compatibilities, .registeredAt, .registeredBy)
')
print_status "Updating image to: $NEW_IMAGE"
# Register new task definition
NEW_TASK_DEF_ARN=$(echo "$UPDATED_TASK_DEF" | aws ecs register-task-definition \
--region "$REGION" \
--cli-input-json file:///dev/stdin \
--query 'taskDefinition.taskDefinitionArn' \
--output text)
print_status "New task definition: $NEW_TASK_DEF_ARN"
# Update the service
print_status "Updating ECS service..."
aws ecs update-service \
--cluster "$CLUSTER_NAME" \
--service "$SERVICE_NAME" \
--task-definition "$NEW_TASK_DEF_ARN" \
--region "$REGION" \
--query 'service.serviceName' \
--output text
# Wait for deployment to complete
print_status "Waiting for deployment to complete..."
aws ecs wait services-stable \
--cluster "$CLUSTER_NAME" \
--services "$SERVICE_NAME" \
--region "$REGION"
print_success "Service update completed successfully!"
# Show service status
print_status "Service status:"
aws ecs describe-services \
--cluster "$CLUSTER_NAME" \
--services "$SERVICE_NAME" \
--region "$REGION" \
--query 'services[0].{
ServiceName: serviceName,
Status: status,
RunningCount: runningCount,
PendingCount: pendingCount,
DesiredCount: desiredCount,
TaskDefinition: taskDefinition
}' \
--output table


@@ -0,0 +1,239 @@
import json
import logging
import os
import boto3
import urllib3
from botocore.exceptions import ClientError
# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Initialize AWS clients
s3_client = boto3.client('s3')
def lambda_handler(event, context):
"""
Lambda function to copy configuration files from S3 to EFS.
Handles both CloudFormation custom resource lifecycle events and direct invocations.
"""
logger.info(f"Received event: {json.dumps(event, default=str)}")
# Check if this is a CloudFormation custom resource call or direct invocation
is_cloudformation = 'RequestType' in event and 'ResourceProperties' in event
if is_cloudformation:
# CloudFormation custom resource call
request_type = event.get('RequestType')
resource_properties = event.get('ResourceProperties', {})
s3_bucket = resource_properties.get('S3Bucket')
s3_key = resource_properties.get('S3Key', 'configs/librechat.yaml')
else:
# Direct invocation - get parameters from environment or event
logger.info("Direct invocation detected - processing config update")
request_type = 'Update' # Treat direct calls as updates
s3_bucket = event.get('S3Bucket') or get_s3_bucket_from_environment()
s3_key = event.get('S3Key', 'configs/librechat.yaml')
# Configuration
efs_mount_path = os.environ.get('EFS_MOUNT_PATH', '/mnt/efs')
efs_file_path = os.path.join(efs_mount_path, 'librechat.yaml')
response_data = {}
try:
if request_type in ['Create', 'Update']:
logger.info(f"Processing {request_type} request")
# Validate required parameters
if not s3_bucket:
raise ValueError("S3Bucket is required - either in ResourceProperties or environment")
# Ensure EFS mount directory exists
os.makedirs(efs_mount_path, exist_ok=True)
logger.info(f"EFS mount path ready: {efs_mount_path}")
# Download file from S3
logger.info(f"Downloading s3://{s3_bucket}/{s3_key}")
used_default_config = False
try:
s3_response = s3_client.get_object(Bucket=s3_bucket, Key=s3_key)
file_content = s3_response['Body'].read()
logger.info(f"Successfully downloaded {len(file_content)} bytes from S3")
except ClientError as e:
error_code = e.response['Error']['Code']
if error_code == 'NoSuchKey':
logger.warning(f"Configuration file not found: s3://{s3_bucket}/{s3_key}")
logger.info("Creating default configuration file on EFS")
used_default_config = True
# Create a minimal default config if the file doesn't exist
file_content = b"""# Default LibreChat Configuration
# This file was created automatically because no custom config was found
version: 1.2.8
cache: false
interface:
customWelcome: ""
"""
elif error_code == 'NoSuchBucket':
logger.warning(f"S3 bucket not found: {s3_bucket}")
logger.info("Creating default configuration file on EFS")
used_default_config = True
# Create a minimal default config if the bucket doesn't exist
file_content = b"""# Default LibreChat Configuration
# This file was created automatically because S3 bucket was not accessible
version: 1.2.8
cache: false
interface:
customWelcome: "Welcome to LibreChat! (Using Default Config - S3 Bucket Not Found)"
"""
elif error_code == 'AccessDenied':
logger.warning(f"Access denied to S3: s3://{s3_bucket}/{s3_key}")
logger.info("Creating default configuration file on EFS")
used_default_config = True
# Create a minimal default config if access is denied
file_content = b"""# Default LibreChat Configuration
# This file was created automatically because S3 access was denied
version: 1.2.8
cache: false
interface:
customWelcome: "Welcome to LibreChat! (Using Default Config - S3 Access Denied)"
"""
else:
raise ValueError(f"Failed to download from S3: {str(e)}")
# Write file to EFS
logger.info(f"Writing file to EFS: {efs_file_path}")
with open(efs_file_path, 'wb') as f:
f.write(file_content)
# Set appropriate file permissions (readable by all, writable by owner)
os.chmod(efs_file_path, 0o644)
logger.info(f"Set file permissions to 644 for {efs_file_path}")
# Verify file was written correctly
if os.path.exists(efs_file_path):
file_size = os.path.getsize(efs_file_path)
logger.info(f"File successfully written to EFS: {file_size} bytes")
response_data['FileSize'] = file_size
response_data['EFSPath'] = efs_file_path
response_data['UsedDefaultConfig'] = used_default_config
# For direct invocations, return success immediately
if not is_cloudformation:
logger.info("Direct invocation completed successfully")
return {
'statusCode': 200,
'body': json.dumps({
'message': 'Configuration updated successfully',
'fileSize': file_size,
'efsPath': efs_file_path,
'usedDefaultConfig': used_default_config
})
}
else:
raise RuntimeError("File was not created on EFS")
elif request_type == 'Delete':
logger.info("Processing Delete request")
# For delete operations, we could optionally remove the file
# but it's safer to leave it in place for potential rollbacks
if os.path.exists(efs_file_path):
logger.info(f"Configuration file exists at {efs_file_path} (leaving in place)")
else:
logger.info("Configuration file not found (already removed or never created)")
# Send success response to CloudFormation (only for CF calls)
if is_cloudformation:
send_response(event, context, 'SUCCESS', response_data)
except Exception as e:
logger.error(f"Error processing request: {str(e)}", exc_info=True)
# Handle errors differently for CF vs direct calls
if is_cloudformation:
send_response(event, context, 'FAILED', {'Error': str(e)})
else:
# For direct invocations, return error response
return {
'statusCode': 500,
'body': json.dumps({
'error': str(e),
'message': 'Configuration update failed'
})
}
raise
def get_s3_bucket_from_environment():
"""
Try to determine the S3 bucket name from the Lambda function's environment.
This is used for direct invocations when the bucket isn't provided in the event.
Prefers S3_BUCKET_NAME (set by the template) to avoid needing CloudFormation permissions.
"""
# Prefer environment variable (set by CloudFormation template; no extra IAM needed)
bucket_name = os.environ.get('S3_BUCKET_NAME')
if bucket_name:
logger.info(f"Found S3 bucket from environment: {bucket_name}")
return bucket_name
# Fallback: try to get from CloudFormation stack outputs (requires cloudformation:DescribeStacks)
function_name = os.environ.get('AWS_LAMBDA_FUNCTION_NAME', '')
if function_name.endswith('-config-manager'):
stack_name = function_name[:-15] # Remove '-config-manager'
try:
cf_client = boto3.client('cloudformation')
response = cf_client.describe_stacks(StackName=stack_name)
outputs = response['Stacks'][0].get('Outputs', [])
for output in outputs:
if output['OutputKey'] == 'S3BucketName':
bucket_name = output['OutputValue']
logger.info(f"Found S3 bucket from CloudFormation: {bucket_name}")
return bucket_name
except Exception as e:
logger.warning(f"Could not get S3 bucket from CloudFormation: {str(e)}")
logger.warning("Could not determine S3 bucket name")
return None
def send_response(event, context, response_status, response_data):
"""
Send response to CloudFormation custom resource.
"""
response_url = event.get('ResponseURL')
if not response_url:
logger.warning("No ResponseURL provided - this may be a test invocation")
return
# Prepare response payload
response_body = {
'Status': response_status,
'Reason': f'See CloudWatch Log Stream: {context.log_stream_name}',
'PhysicalResourceId': event.get('LogicalResourceId', 'ConfigManagerResource'),
'StackId': event.get('StackId'),
'RequestId': event.get('RequestId'),
'LogicalResourceId': event.get('LogicalResourceId'),
'Data': response_data
}
json_response_body = json.dumps(response_body)
logger.info(f"Sending response to CloudFormation: {response_status}")
logger.debug(f"Response body: {json_response_body}")
try:
# Send HTTP PUT request to CloudFormation
http = urllib3.PoolManager()
response = http.request(
'PUT',
response_url,
body=json_response_body,
headers={
'Content-Type': 'application/json',
'Content-Length': str(len(json_response_body))
}
)
logger.info(f"CloudFormation response status: {response.status}")
except Exception as e:
logger.error(f"Failed to send response to CloudFormation: {str(e)}")
raise
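The handler's dual-mode dispatch hinges on a single shape check: CloudFormation custom resource events carry both `RequestType` and `ResourceProperties`, while direct invocations do not. That detection can be exercised standalone; the sample events below are illustrative:

```python
def detect_invocation(event: dict) -> str:
    """Mirror the handler's check: an event with both RequestType and
    ResourceProperties is a CloudFormation custom resource call;
    anything else is treated as a direct invocation."""
    if 'RequestType' in event and 'ResourceProperties' in event:
        return 'cloudformation'
    return 'direct'

cf_event = {
    "RequestType": "Create",
    "ResourceProperties": {
        "S3Bucket": "example-config-bucket",
        "S3Key": "configs/librechat.yaml",
    },
}
direct_event = {"S3Bucket": "example-config-bucket"}

print(detect_invocation(cf_event))      # cloudformation
print(detect_invocation(direct_event))  # direct
```

Because direct calls are mapped to `Update`, an operator can push a new `librechat.yaml` to S3 and invoke the function with just an `S3Bucket` key (or nothing at all, falling back to the `S3_BUCKET_NAME` environment variable) to refresh the file on EFS.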


@@ -0,0 +1,2 @@
boto3>=1.26.0
urllib3>=1.26.0


@@ -0,0 +1,135 @@
import json
import logging
import boto3
import urllib3
import time
from botocore.exceptions import ClientError
# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
# Initialize AWS clients
efs_client = boto3.client('efs')
def lambda_handler(event, context):
"""
Lambda function to wait for EFS mount targets to be available.
This ensures mount targets are ready before other resources try to use them.
"""
logger.info(f"Received event: {json.dumps(event, default=str)}")
# Extract CloudFormation custom resource properties
request_type = event.get('RequestType')
resource_properties = event.get('ResourceProperties', {})
# Configuration
file_system_id = resource_properties.get('FileSystemId')
response_data = {}
try:
if request_type in ['Create', 'Update']:
logger.info(f"Processing {request_type} request")
# Validate required parameters
if not file_system_id:
raise ValueError("FileSystemId is required in ResourceProperties")
# Wait for mount targets to be available
logger.info(f"Waiting for mount targets to be available for EFS: {file_system_id}")
max_wait_time = 300 # 5 minutes
start_time = time.time()
while time.time() - start_time < max_wait_time:
try:
# Get mount targets for the file system
response = efs_client.describe_mount_targets(FileSystemId=file_system_id)
mount_targets = response.get('MountTargets', [])
if not mount_targets:
logger.info("No mount targets found yet, waiting...")
time.sleep(10)
continue
# Check if all mount targets are available
all_available = True
for mt in mount_targets:
state = mt.get('LifeCycleState')
logger.info(f"Mount target {mt.get('MountTargetId')} state: {state}")
if state != 'available':
all_available = False
break
if all_available:
logger.info("All mount targets are available!")
response_data['MountTargetsReady'] = True
response_data['MountTargetCount'] = len(mount_targets)
break
else:
logger.info("Some mount targets are not ready yet, waiting...")
time.sleep(10)
except ClientError as e:
logger.warning(f"Error checking mount targets: {e}")
time.sleep(10)
else:
# Timeout reached
raise RuntimeError(f"Mount targets did not become available within {max_wait_time} seconds")
elif request_type == 'Delete':
logger.info("Processing Delete request - nothing to do")
response_data['Status'] = 'Deleted'
# Send success response to CloudFormation
send_response(event, context, 'SUCCESS', response_data)
except Exception as e:
logger.error(f"Error processing request: {str(e)}", exc_info=True)
# Send failure response to CloudFormation
send_response(event, context, 'FAILED', {'Error': str(e)})
raise
def send_response(event, context, response_status, response_data):
"""
Send response to CloudFormation custom resource.
"""
response_url = event.get('ResponseURL')
if not response_url:
logger.warning("No ResponseURL provided - this may be a test invocation")
return
# Prepare response payload
response_body = {
'Status': response_status,
'Reason': f'See CloudWatch Log Stream: {context.log_stream_name}',
'PhysicalResourceId': event.get('LogicalResourceId', 'MountTargetWaiterResource'),
'StackId': event.get('StackId'),
'RequestId': event.get('RequestId'),
'LogicalResourceId': event.get('LogicalResourceId'),
'Data': response_data
}
json_response_body = json.dumps(response_body)
logger.info(f"Sending response to CloudFormation: {response_status}")
logger.debug(f"Response body: {json_response_body}")
try:
# Send HTTP PUT request to CloudFormation
http = urllib3.PoolManager()
response = http.request(
'PUT',
response_url,
body=json_response_body,
headers={
'Content-Type': 'application/json',
'Content-Length': str(len(json_response_body))
}
)
logger.info(f"CloudFormation response status: {response.status}")
except Exception as e:
logger.error(f"Failed to send response to CloudFormation: {str(e)}")
raise
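The waiter's loop is a poll-until-timeout pattern: probe, sleep, retry, and fail once a deadline passes (the `while`/`else` raises only when the loop exhausts without a `break`). A condensed, dependency-free sketch with the probe injected, so it runs without AWS:

```python
import time

def wait_until(probe, max_wait=300, interval=10):
    """Poll probe() until it returns a truthy value or max_wait seconds
    elapse, mirroring the mount-target loop above."""
    start = time.time()
    while time.time() - start < max_wait:
        if probe():
            return True
        time.sleep(interval)
    raise RuntimeError(f"Condition not met within {max_wait} seconds")

# Simulate mount targets reaching 'available' on the third poll
states = iter([["creating"], ["creating", "available"], ["available", "available"]])
probe = lambda: all(s == "available" for s in next(states))
print(wait_until(probe, interval=0))  # True
```

Injecting the probe keeps the timeout logic testable; the real handler's probe is the `describe_mount_targets` call plus the all-`available` check.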


@@ -0,0 +1,2 @@
boto3>=1.26.0
urllib3>=1.26.0


@@ -0,0 +1,97 @@
"""
CloudFormation custom resource: add this stack's ECS security group to an existing
Secrets Manager VPC endpoint's security group so ECS tasks can pull secrets.
Runs during stack create/update (after ECSSecurityGroup exists, before ECS Service).
"""
import json
import logging
import urllib3
import boto3
from botocore.exceptions import ClientError
logger = logging.getLogger()
logger.setLevel(logging.INFO)
ec2 = boto3.client('ec2')
def lambda_handler(event, context):
request_type = event.get('RequestType')
props = event.get('ResourceProperties', {})
endpoint_sg_id = (props.get('EndpointSecurityGroupId') or '').strip()
ecs_sg_id = (props.get('EcsSecurityGroupId') or '').strip()
response_data = {}
try:
if request_type in ('Create', 'Update'):
if endpoint_sg_id and ecs_sg_id:
logger.info(
"Adding ingress to endpoint SG %s: TCP 443 from ECS SG %s",
endpoint_sg_id, ecs_sg_id
)
try:
ec2.authorize_security_group_ingress(
GroupId=endpoint_sg_id,
IpPermissions=[{
'IpProtocol': 'tcp',
'FromPort': 443,
'ToPort': 443,
'UserIdGroupPairs': [{'GroupId': ecs_sg_id}],
}],
)
response_data['RuleAdded'] = 'true'
except ClientError as e:
if e.response['Error']['Code'] == 'InvalidPermission.Duplicate':
logger.info("Rule already exists, no change")
response_data['RuleAdded'] = 'already_exists'
else:
raise
else:
logger.info(
"EndpointSecurityGroupId or EcsSecurityGroupId empty; skipping (no-op)"
)
elif request_type == 'Delete':
if endpoint_sg_id and ecs_sg_id:
try:
ec2.revoke_security_group_ingress(
GroupId=endpoint_sg_id,
IpPermissions=[{
'IpProtocol': 'tcp',
'FromPort': 443,
'ToPort': 443,
'UserIdGroupPairs': [{'GroupId': ecs_sg_id}],
}],
)
response_data['RuleRevoked'] = 'true'
except ClientError as e:
if e.response['Error']['Code'] in (
'InvalidPermission.NotFound', 'InvalidGroup.NotFound'
):
logger.info("Rule or group already gone, ignoring")
else:
logger.warning("Revoke failed (non-fatal): %s", e)
send_response(event, context, 'SUCCESS', response_data)
except Exception as e:
logger.error("Error: %s", e, exc_info=True)
send_response(event, context, 'FAILED', {'Error': str(e)})
raise
def send_response(event, context, response_status, response_data):
response_url = event.get('ResponseURL')
if not response_url:
return
body = {
'Status': response_status,
'Reason': f'See CloudWatch Log Stream: {context.log_stream_name}',
'PhysicalResourceId': event.get('LogicalResourceId', 'SecretsManagerEndpointEcsAccess'),
'StackId': event.get('StackId'),
'RequestId': event.get('RequestId'),
'LogicalResourceId': event.get('LogicalResourceId'),
'Data': response_data,
}
http = urllib3.PoolManager()
http.request(
'PUT', response_url,
body=json.dumps(body),
headers={'Content-Type': 'application/json'},
)
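The authorize/revoke calls above achieve idempotency through error-code dispatch: a duplicate-rule error on create counts as success, and a missing-rule error on delete is ignored. A runnable sketch of the create-side pattern, using a stand-in for botocore's `ClientError` so it needs no AWS access (all names here are illustrative):

```python
class FakeClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError, exposing the same
    e.response['Error']['Code'] shape the handler above inspects."""
    def __init__(self, code):
        super().__init__(code)
        self.response = {'Error': {'Code': code}}

def add_ingress_idempotently(authorize, group_id, peer_sg_id):
    """Call authorize() and treat a duplicate-rule error as success,
    mirroring the handler's Create/Update path."""
    try:
        authorize(
            GroupId=group_id,
            IpPermissions=[{
                'IpProtocol': 'tcp', 'FromPort': 443, 'ToPort': 443,
                'UserIdGroupPairs': [{'GroupId': peer_sg_id}],
            }],
        )
        return 'true'
    except FakeClientError as e:
        if e.response['Error']['Code'] == 'InvalidPermission.Duplicate':
            return 'already_exists'
        raise

def duplicate_rule(**kwargs):
    raise FakeClientError('InvalidPermission.Duplicate')

print(add_ingress_idempotently(duplicate_rule, 'sg-endpoint', 'sg-ecs'))      # already_exists
print(add_ingress_idempotently(lambda **kwargs: None, 'sg-endpoint', 'sg-ecs'))  # true
```

Swallowing only the specific duplicate/not-found codes, and re-raising everything else, is what makes repeated stack updates safe without masking genuine permission failures.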


@@ -0,0 +1,2 @@
boto3>=1.26.0
urllib3>=1.26.0

1401
deploy/aws-sam/template.yaml Normal file

File diff suppressed because it is too large