azure storage – change feed to replicate data
A high-availability solution for a web application that handles large volumes of data which must be accessible within a specific time frame. The solution uses Azure Cosmos DB as the primary data store and the Cosmos DB change feed to replicate data to low-cost secondary storage.
Updated
September 14, 2025
# Azure Minimal Storage Architecture with Change Feed
This Terraform configuration deploys a high-availability web application architecture on Azure that uses Cosmos DB change feed to replicate data to low-cost secondary storage. The solution provides automatic data cleanup, geo-replication, and global load balancing.
## Architecture Overview
The architecture implements a minimal storage pattern where:
- **Azure Cosmos DB** stores primary data with automatic TTL for data expiration
- **Change Feed** triggers Azure Functions to replicate data to Table Storage
- **Azure Functions** handle background processing and data cleanup
- **Azure Front Door** provides global load balancing and failover
- **Multi-region deployment** ensures high availability
## Components Deployed
### Core Infrastructure
- **Resource Groups**: Primary and secondary regions
- **Azure AD App Registration**: For authentication and authorization
- **Key Vault**: Secure storage for secrets and certificates
- **Log Analytics Workspace**: Centralized logging and monitoring
- **Application Insights**: Application performance monitoring
### Compute Services
- **App Service Plans**: Hosting for web applications and functions
- **Linux Web Apps**: Primary and secondary web applications
- **Linux Function Apps**: Change feed processing and data cleanup
### Data Services
- **Azure Cosmos DB**: Primary database with geo-replication
- **Storage Accounts**: Queue and Table storage for background tasks and replicated data
- **Azure Cache for Redis**: In-memory caching for performance
### Networking & Security
- **Azure Front Door**: Global CDN and load balancer
- **DNS Zone**: Custom domain management
- **Managed Identities**: Secure service-to-service authentication
## Prerequisites
- Azure CLI installed and authenticated
- Terraform installed (>= 1.2)
- Sufficient Azure subscription permissions to create resources
- A custom domain name (optional, for production use)
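The Terraform version requirement above could be enforced in `provider.tf` roughly as follows. This is a minimal sketch, not the repository's actual file: the `azurerm` version constraint is an assumption, and any backend or additional providers the real configuration uses are omitted.

```hcl
terraform {
  # Matches the ">= 1.2" prerequisite listed above
  required_version = ">= 1.2"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0" # assumed; pin to whatever version you have tested
    }
  }
}

provider "azurerm" {
  # The features block is required, even when empty
  features {}

  # azurerm 4.x also needs a subscription ID, set either here via
  # subscription_id or through the ARM_SUBSCRIPTION_ID environment variable.
}
```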
## Project Structure
```
├── provider.tf # Terraform providers and backend configuration
├── locals.tf # Local values and naming conventions
├── variables.tf # Variable declarations
├── terraform.tfvars # Variable values (customize for your environment)
├── main.tf # Main infrastructure resources
└── README.md # This documentation
```
## Quick Start
1. **Clone and customize the configuration:**
```bash
# Edit terraform.tfvars with your specific values
nano terraform.tfvars
```
2. **Initialize Terraform:**
```bash
terraform init
```
3. **Review the deployment plan:**
```bash
terraform plan -var-file="terraform.tfvars"
```
4. **Deploy the infrastructure:**
```bash
terraform apply -var-file="terraform.tfvars"
```
5. **Access your application:**
- Primary region: `https://[project-name]-[env]-webapp-[primary-location].azurewebsites.net`
- Front Door endpoint: `https://[project-name]-[env]-endpoint-[random].azurefd.net`
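The primary URL pattern above can be reproduced with a few locals. This is an illustrative sketch only: the local names are hypothetical, and the way the real `locals.tf` normalizes the region name (here, lowercased with spaces removed) is an assumption.

```hcl
locals {
  # Assumed inputs; in the real configuration these would come from var.*
  project_name     = "minimal-storage"
  environment      = "dev"
  primary_location = "East US"

  # Normalize the region for use in a hostname: "East US" becomes "eastus"
  primary_location_slug = lower(replace(local.primary_location, " ", ""))

  # [project-name]-[env]-webapp-[primary-location]
  primary_webapp_name = "${local.project_name}-${local.environment}-webapp-${local.primary_location_slug}"
  primary_webapp_url  = "https://${local.primary_webapp_name}.azurewebsites.net"
}

output "primary_webapp_url" {
  value = local.primary_webapp_url
}
```

With the defaults from this README, the output would be `https://minimal-storage-dev-webapp-eastus.azurewebsites.net`. The Front Door endpoint cannot be derived this way because Azure appends a generated suffix to it.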
## Configuration
### Key Variables to Customize
| Variable | Description | Default |
|----------|-------------|---------|
| `project_name` | Name prefix for all resources | `minimal-storage` |
| `environment` | Environment identifier | `dev` |
| `primary_location` | Primary Azure region | `East US` |
| `secondary_location` | Secondary Azure region | `West US 2` |
| `dns_zone_name` | Your custom domain | `yourdomain.com` |
| `data_retention_days` | Days to keep data in Cosmos DB | `30` |
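For orientation, the variables in the table could be declared in `variables.tf` along these lines. The names, descriptions, and defaults come from the table; the types and the validation rule are assumptions added for illustration.

```hcl
variable "project_name" {
  description = "Name prefix for all resources"
  type        = string
  default     = "minimal-storage"
}

variable "environment" {
  description = "Environment identifier"
  type        = string
  default     = "dev"
}

variable "primary_location" {
  description = "Primary Azure region"
  type        = string
  default     = "East US"
}

variable "secondary_location" {
  description = "Secondary Azure region"
  type        = string
  default     = "West US 2"
}

variable "dns_zone_name" {
  description = "Your custom domain"
  type        = string
  default     = "yourdomain.com"
}

variable "data_retention_days" {
  description = "Days to keep data in Cosmos DB"
  type        = number
  default     = 30

  # Assumed guard: reject zero, negative, or fractional retention periods
  validation {
    condition     = var.data_retention_days >= 1 && floor(var.data_retention_days) == var.data_retention_days
    error_message = "data_retention_days must be a whole number of days, at least 1."
  }
}
```

Values set in `terraform.tfvars` override these defaults.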
### Application Settings
The deployed applications will have the following connection strings and settings configured:
- **Cosmos DB**: Connection string and database/container names
- **Redis Cache**: Connection string for caching
- **Storage Account**: Connection strings for queues and tables
- **Azure AD**: Client ID, secret, and tenant ID for authentication
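One way such settings could be assembled is sketched below, with secrets passed as App Service Key Vault references rather than literal values. Every setting name, secret URI, and default here is hypothetical, and whether this repository uses Key Vault references or plain connection strings is not stated in this README.

```hcl
variable "key_vault_secret_uris" {
  description = "Hypothetical map of setting name to versionless Key Vault secret URI"
  type        = map(string)
  default = {
    COSMOS_CONNECTION_STRING  = "https://example-kv.vault.azure.net/secrets/cosmos-connection-string/"
    REDIS_CONNECTION_STRING   = "https://example-kv.vault.azure.net/secrets/redis-connection-string/"
    STORAGE_CONNECTION_STRING = "https://example-kv.vault.azure.net/secrets/storage-connection-string/"
    AZURE_AD_CLIENT_SECRET    = "https://example-kv.vault.azure.net/secrets/aad-client-secret/"
  }
}

locals {
  # Wrap each secret URI in the App Service Key Vault reference syntax
  secret_app_settings = {
    for name, uri in var.key_vault_secret_uris :
    name => "@Microsoft.KeyVault(SecretUri=${uri})"
  }

  # Non-secret settings can be passed as plain values (names are assumed)
  plain_app_settings = {
    COSMOS_DATABASE_NAME  = "appdb"
    COSMOS_CONTAINER_NAME = "items"
  }

  app_settings = merge(local.plain_app_settings, local.secret_app_settings)
}

output "app_settings" {
  value = local.app_settings
}
```

A map like `local.app_settings` could then be assigned to the `app_settings` argument of a web app or function app resource, provided the app's managed identity is allowed to read the referenced secrets.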
## Resource Connections
### Data Flow
1. **Web App** → **Cosmos DB**: Store primary application data
2. **Cosmos DB Change Feed** → **Function App**: Trigger data replication
3. **Function App** → **Table Storage**: Replicate data for long-term storage
4. **Function App** → **Cosmos DB**: Clean up expired data
5. **Web App** → **Redis Cache**: Cache frequently accessed data
6. **Web App** → **Queue Storage**: Queue background tasks
### Network Flow
1. **User** → **Azure Front Door**: Global entry point
2. **Front Door** → **Primary/Secondary Web App**: Load balancing and failover
3. **Web App** → **Azure AD**: User authentication
4. **Web App** → **Key Vault**: Retrieve secrets securely
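The failover behavior in step 2 could be expressed with a Front Door origin group in which the secondary web app has a lower priority than the primary. This is a sketch under assumptions: it targets Front Door Standard/Premium (`azurerm_cdn_frontdoor_*` resources), expects the `azurerm` provider to be configured elsewhere, and all variable names, resource names, and probe values are hypothetical.

```hcl
variable "front_door_profile_id" {
  description = "Resource ID of an existing Front Door Standard/Premium profile"
  type        = string
}

variable "primary_webapp_hostname" {
  description = "Default hostname of the primary web app"
  type        = string
}

variable "secondary_webapp_hostname" {
  description = "Default hostname of the secondary web app"
  type        = string
}

resource "azurerm_cdn_frontdoor_origin_group" "web" {
  name                     = "web-origins"
  cdn_frontdoor_profile_id = var.front_door_profile_id

  load_balancing {
    sample_size                 = 4
    successful_samples_required = 3
  }

  # Origins that fail this probe are taken out of rotation
  health_probe {
    path                = "/"
    protocol            = "Https"
    request_type        = "HEAD"
    interval_in_seconds = 60
  }
}

resource "azurerm_cdn_frontdoor_origin" "primary" {
  name                           = "primary-webapp"
  cdn_frontdoor_origin_group_id  = azurerm_cdn_frontdoor_origin_group.web.id
  enabled                        = true
  host_name                      = var.primary_webapp_hostname
  origin_host_header             = var.primary_webapp_hostname
  certificate_name_check_enabled = true
  priority                       = 1 # preferred origin
  weight                         = 1000
}

resource "azurerm_cdn_frontdoor_origin" "secondary" {
  name                           = "secondary-webapp"
  cdn_frontdoor_origin_group_id  = azurerm_cdn_frontdoor_origin_group.web.id
  enabled                        = true
  host_name                      = var.secondary_webapp_hostname
  origin_host_header             = var.secondary_webapp_hostname
  certificate_name_check_enabled = true
  priority                       = 2 # used only when the primary is unhealthy
  weight                         = 1000
}
```

Giving both origins the same priority would instead spread traffic across the two regions according to their weights.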
## Function Apps
The solution deploys Function Apps in both regions with the following responsibilities:
### Change Feed Processor Function
- Triggered by Cosmos DB change feed
- Replicates data changes to Table Storage
- Leaves replicated rows in place when source items are deleted or expire, preserving historical data (the default change feed mode does not surface deletes)
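The Azure Functions Cosmos DB trigger tracks its position in the change feed using a separate lease container. A sketch of how that container could be declared is shown below. It assumes azurerm 4.x (on 3.x the argument is `partition_key_path`, a single string), a provider configured elsewhere, and hypothetical variable and container names; the trigger can alternatively be configured to create the lease container itself.

```hcl
variable "resource_group_name" {
  description = "Resource group that holds the Cosmos DB account"
  type        = string
}

variable "cosmos_account_name" {
  description = "Name of the existing Cosmos DB account"
  type        = string
}

variable "cosmos_database_name" {
  description = "Name of the existing SQL database in that account"
  type        = string
}

# Lease container used by the change feed trigger to checkpoint its progress
resource "azurerm_cosmosdb_sql_container" "leases" {
  name                = "leases"
  resource_group_name = var.resource_group_name
  account_name        = var.cosmos_account_name
  database_name       = var.cosmos_database_name

  # The trigger expects the lease container to be partitioned on /id
  partition_key_paths = ["/id"]
}
```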
### Data Cleanup Function
- Scheduled function (timer trigger)
- Removes expired data from Cosmos DB based on TTL
- Configurable retention period via `data_retention_days`
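Cosmos DB expresses TTL in seconds, so a retention period given in days has to be converted. The sketch below shows how `data_retention_days` could map onto a container's `default_ttl`. It is a standalone illustration rather than this repository's actual resource: it assumes azurerm 4.x, and the container name, partition key, and connection variables are hypothetical.

```hcl
variable "data_retention_days" {
  description = "Days to keep data in Cosmos DB"
  type        = number
  default     = 30
}

variable "resource_group_name" {
  type = string
}

variable "cosmos_account_name" {
  type = string
}

variable "cosmos_database_name" {
  type = string
}

locals {
  # 30 days * 24 h * 60 min * 60 s = 2,592,000 seconds
  ttl_seconds = var.data_retention_days * 24 * 60 * 60
}

resource "azurerm_cosmosdb_sql_container" "data" {
  name                = "items"
  resource_group_name = var.resource_group_name
  account_name        = var.cosmos_account_name
  database_name       = var.cosmos_database_name
  partition_key_paths = ["/id"]

  # Items expire this many seconds after their last modification
  default_ttl = local.ttl_seconds
}
```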
## Monitoring and Logging
- **Application Insights**: Application performance monitoring
- **Log Analytics**: Centralized logging for all services
- **Metrics**: Built-in Azure metrics for all services
- **Alerts**: Configure custom alerts based on metrics and logs
## Security Features
- **Managed Identities**: Service-to-service authentication without storing credentials
- **Key Vault Integration**: Secure storage and retrieval of secrets
- **Azure AD Authentication**: Built-in authentication for web applications
- **HTTPS Only**: All traffic encrypted in transit
- **Network Security**: Front Door provides DDoS protection and WAF capabilities
## Cost Optimization
- **Serverless Cosmos DB**: Pay only for the request units (RUs) consumed and the storage used
- **Table Storage**: Low-cost storage for replicated data
- **Consumption Function Apps**: Pay per execution
- **TTL-based Data Cleanup**: Automatic cost reduction through data expiration
## Customization
### Adding Custom Domains
1. Update `dns_zone_name` and `custom_domain` variables
2. Configure DNS records to point to Front Door endpoint
3. Add custom domain to Front Door configuration
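Step 2 could look like the following CNAME record. This is a sketch with assumed names: the `www` subdomain, the variable names, and the TTL are illustrative, and the zone is expected to exist already in Azure DNS.

```hcl
variable "dns_zone_name" {
  description = "Your custom domain"
  type        = string
  default     = "yourdomain.com"
}

variable "dns_resource_group_name" {
  description = "Resource group that contains the DNS zone"
  type        = string
}

variable "front_door_endpoint_hostname" {
  description = "Hostname of the Front Door endpoint (the *.azurefd.net name)"
  type        = string
}

# www.<dns_zone_name> -> Front Door endpoint
resource "azurerm_dns_cname_record" "www" {
  name                = "www"
  zone_name           = var.dns_zone_name
  resource_group_name = var.dns_resource_group_name
  ttl                 = 3600
  record              = var.front_door_endpoint_hostname
}
```

Front Door Standard/Premium typically also requires proof of domain ownership (a `_dnsauth` TXT record) before it serves traffic for the custom domain.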
### Scaling Considerations
- **App Service Plans**: Upgrade SKU for higher performance
- **Cosmos DB**: Switch to provisioned throughput for predictable performance
- **Redis Cache**: Increase capacity for larger datasets
- **Function Apps**: Consider Premium plans for consistent performance
## Troubleshooting
### Common Issues
1. **Key Vault Access Denied**
- Ensure Managed Identity has proper access policies
- Check that Key Vault is in the same subscription
2. **Function App Not Triggering**
- Verify that the monitored container and the lease container exist (the change feed itself is always available on Cosmos DB containers)
- Check Function App connection strings
- Review Application Insights for errors
3. **Front Door Not Routing**
- Verify origin health status
- Check that web apps are running
- Review Front Door routing rules
### Monitoring Commands
```bash
# Check resource deployment status
terraform show
# View outputs
terraform output
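# Refresh state (`terraform refresh` is deprecated in favor of a refresh-only apply)
terraform apply -refresh-only -var-file="terraform.tfvars"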
# Destroy resources (caution!)
terraform destroy -var-file="terraform.tfvars"
```
## Production Considerations
- **Backup Strategy**: Configure automated backups for Cosmos DB
- **Disaster Recovery**: Test failover scenarios
- **Security**: Apply Microsoft Defender for Cloud (formerly Azure Security Center) recommendations
- **Performance**: Monitor and tune based on application metrics
- **Compliance**: Review data retention and privacy requirements
## Support
For issues related to:
- **Terraform**: Check Terraform documentation
- **Azure Services**: Review Azure documentation
- **Application Code**: Not covered here; this configuration only provisions the infrastructure that hosts your application
## License
This Terraform configuration is provided as-is for educational and development purposes.