Christian

Building Hybrid Identity at Scale: 1,000+ Users from Active Directory to Entra ID

Description: Built a production-scale hybrid identity lab with 1,158 users
synced from Active Directory to Microsoft Entra ID. Used Terraform
for infrastructure-as-code, persistent disks to save £65/month,
and automated PowerShell imports. Includes UPN transformation,
OU structure, and honest security trade-offs.
Total cost: <£5/month
series: AI-IAM-Engineering-Lab

Most hybrid identity tutorials show you how to sync 5 test users. That's like learning to drive in an empty parking lot, then wondering why you crash in rush hour traffic. The real problems—UPN transformations that don't make sense, sync conflicts at scale, performance issues with thousands of objects—only surface when you're actually operating at volume.

This post walks through building a hybrid identity environment with 1,158 users, complete with realistic department distributions, job titles, and organizational structures. We're deploying Active Directory on Azure using Terraform, automating the entire user import with PowerShell, and synchronizing everything to Microsoft Entra ID while dealing with the UPN transformation headaches.

Why Hybrid Identity Still Matters in 2026

Despite what cloud vendors want you to believe, 95% of enterprises still run hybrid identity. On-premises Active Directory isn't disappearing anytime soon. Legacy applications authenticate against it. Group policies manage thousands of workstations. And every single M365 deployment needs identity synchronization unless you're starting from absolute zero.

The gap between "cloud-only" marketing and operational reality creates a massive skill shortage. Identity engineers who can architect hybrid solutions at scale remain valuable precisely because this knowledge is necessary to keep enterprises running.

Architecture: The Full Picture

Complete hybrid identity architecture: AD syncing to Entra ID via Entra Connect with UPN transformation

On-Premises Layer (simulated in Azure):

  • Windows Server 2022 domain controller running aiiam.local
  • 1,158 users across six departments
  • Department-based OU structure
  • Users authenticate with @aiiam.local UPN

Synchronization Layer:

  • Microsoft Entra Connect with Password Hash Synchronization
  • 30-minute delta sync intervals
  • UPN suffix transformation from .local to .onmicrosoft.com
  • OU filtering for organizational users only

Cloud Layer:

  • Microsoft Entra ID tenant: acme.onmicrosoft.com
  • 1,158 synchronized users with hybrid identity attributes
  • Both UPNs work for seamless SSO

Active Directory uses sarah.mitchell@aiiam.local, but Entra ID needs a routable domain. Entra Connect handles this during sync: users keep the @aiiam.local UPN on-premises and get an @acme.onmicrosoft.com UPN for cloud services.

Domain controller IDENTITYDC running aiiam.local with all AD services operational

Infrastructure Deployment: Terraform for Repeatability

Manual setup works for one-off demos. For labs you'll tear down and rebuild? Infrastructure-as-Code saves hours.

The Terraform config handles everything: virtual network, domain controller VM, storage for user data, and—critically—a persistent OS disk that survives terraform destroy.

Here's the persistent disk configuration:

resource "azurerm_windows_virtual_machine" "ad_server" {
  name                  = "ad-dc-vm"
  computer_name         = "IDENTITYDC"
  size                  = "Standard_D2s_v3"
  # Excerpt: the other required arguments (resource group, location, admin
  # credentials, network interface, source image) are omitted here

  identity {
    type = "SystemAssigned"  # Managed identity for secure blob access
  }

  os_disk {
    name                 = "ad-dc-os-disk-persistent"
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
    disk_size_gb         = 128
  }
}

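The fixed disk name ad-dc-os-disk-persistent is what lets later deployments find the same disk. First run: fresh Windows Server image. After terraform destroy and reapply: the existing disk is reattached with your DC config intact. One caveat: the name alone doesn't protect the disk. By default the azurerm provider deletes a VM's OS disk together with the VM (the delete_os_disk_on_deletion setting in the provider's features block), so the disk has to be kept out of that destroy path for this to work.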

💡 Game Changer: This persistent disk strategy transformed my workflow. Initial setup: 2-3 hours (domain promotion, user imports, Entra Connect config). With persistent disks: terraform destroy when done (saves money), terraform apply when needed (back online in 10 minutes with everything intact). For labs that run intermittently, this is the difference between £90/month and £5/month.

Cost optimization for intermittent labs:

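resource "azurerm_dev_test_global_vm_shutdown_schedule" "ad_vm_shutdown" {
  virtual_machine_id    = azurerm_windows_virtual_machine.ad_server.id
  location              = azurerm_windows_virtual_machine.ad_server.location
  enabled               = true
  daily_recurrence_time = "1900"
  timezone              = "GMT Standard Time"
  # Required block; notifications are simply switched off for the lab
  notification_settings {
    enabled = false
  }
}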

Auto-shutdown at 7 PM GMT means you're only paying for compute when actively using the lab. VM stopped: £0.02/day for disk storage. Total monthly cost: under £5. This is also a fail-safe in case I forget to run terraform destroy after a demo day, so a forgotten VM doesn't keep running up charges.

Run terraform apply and watch Azure populate your resource group:

Terraform-deployed resources: VM, persistent disk, networking, and storage

Everything you need for hybrid identity lands in one resource group: persistent OS disk, Standard_D2s_v3 VM, storage account with the user JSON, network interface with static IP, security group, public IP, and virtual network. Tags track project phase for cost analysis across multiple lab environments.

User data JSON file uploaded to blob storage for VM initialization

Terraform uploads entra_users.json to blob storage, and the VM downloads it using a time-limited SAS token. This keeps sensitive data out of custom script extensions.
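
A minimal sketch of what the download step on the VM could look like, assuming the full SAS URL is passed in as a script parameter. The `$SasUrl` parameter and the URL shape in the comment are placeholders for illustration, not the lab's actual values.

```powershell
# Hypothetical parameter: blob URL with the SAS query string appended, e.g.
# https://<account>.blob.core.windows.net/<container>/entra_users.json?<sas-token>
param(
    [Parameter(Mandatory)]
    [string]$SasUrl
)

# Stop on the first error so a failed download or bad JSON halts the script
$ErrorActionPreference = 'Stop'

$targetDir  = "C:\IdentityLab"
$targetFile = Join-Path $targetDir "entra_users.json"

# Create the working folder if this is a fresh VM
New-Item -ItemType Directory -Path $targetDir -Force | Out-Null

# The SAS token in the query string authorizes the request; no storage key is needed
Invoke-WebRequest -Uri $SasUrl -OutFile $targetFile -UseBasicParsing

# Parse once to confirm the payload is valid JSON before the import relies on it
$null = Get-Content $targetFile -Raw | ConvertFrom-Json
Write-Host "Downloaded and parsed $targetFile"
```

Because the token expires, a rebuilt VM needs a freshly generated URL rather than one saved from an earlier run.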

Active Directory Setup: Getting the Domain Right

Domain name choice matters. I used aiiam.local because .local is conventional for on-premises AD.

Installing AD DS features and promoting to domain controller with aiiam.local domain

After promotion and the inevitable reboot:

AD Web Services, DNS, Netlogon, and NTDS running—domain controller is operational

Organizational Unit Structure: Preparing for Scale

OU structure isn't just tidiness—it impacts Entra Connect filtering, Group Policy, and administrative delegation. I created AIIAM-Users as the parent OU holding all departmental OUs:

DC=aiiam,DC=local
├── AIIAM-Users
│   ├── Engineering
│   ├── Operations
│   ├── Executive
│   ├── Finance
│   ├── People
│   └── Security

Department-based organizational units under AIIAM-Users parent container

AIIAM-Users OU structure populated with users distributed across departments

The parent OU becomes the sync boundary. Default containers stay on-premises.
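
The OU tree above can be created idempotently from PowerShell. This is a sketch rather than the script used in the lab: `New-OuIfMissing` is a hypothetical helper name, and it assumes the ActiveDirectory module is available on the domain controller.

```powershell
Import-Module ActiveDirectory

$domainDn    = "DC=aiiam,DC=local"
$parentName  = "AIIAM-Users"
$departments = "Engineering", "Operations", "Executive", "Finance", "People", "Security"

# Hypothetical helper: create an OU only if it does not already exist
function New-OuIfMissing {
    param([string]$Name, [string]$Path)

    $dn = "OU=$Name,$Path"
    $existing = Get-ADOrganizationalUnit -Filter "distinguishedName -eq '$dn'" -ErrorAction SilentlyContinue
    if (-not $existing) {
        New-ADOrganizationalUnit -Name $Name -Path $Path
        Write-Host "Created $dn"
    }
}

# Parent first, then one child OU per department
New-OuIfMissing -Name $parentName -Path $domainDn
foreach ($dept in $departments) {
    New-OuIfMissing -Name $dept -Path "OU=$parentName,$domainDn"
}
```

Checking before creating means the script can be re-run safely after a rebuild.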

Importing 1,158 Users: Automation at Work

The user data: 1,158 AI-generated identities in entra_users.json. Each has realistic attributes—display name, job title, department, office location. The JSON came from Entra ID's Graph API format, which means it needed transformation for AD import (because of course Microsoft's two identity systems use different attribute names for the same data).

The PowerShell script does a few things: reads the JSON, creates OUs if they're missing, sets the correct UPN suffix, and most importantly—shows progress every 100 users so you don't spend 8 minutes wondering if it crashed.

Here's the core import logic:

# Requires the ActiveDirectory module (installed with the AD DS role)
Import-Module ActiveDirectory

# Read and parse JSON (-Raw reads the file as one string)
$users = Get-Content "C:\IdentityLab\entra_users.json" -Raw | ConvertFrom-Json
Write-Host "Loaded $($users.Count) users from JSON" -ForegroundColor Cyan

# Department mapping from job titles.
# [ordered] keeps the match order fixed, so a title that contains two
# keywords always resolves to the first one listed.
$departmentMap = [ordered]@{
    "Engineer" = "Engineering"
    "Operations" = "Operations"
    "Executive" = "Executive"
    "Finance" = "Finance"
    "HR" = "People"
    "Director" = "Security"
}

$count = 0
foreach ($user in $users) {
    $count++
    # Everything before the @ becomes the username
    $username = ($user.userPrincipalName -split '@')[0]

    # Determine department from job title
    $department = "Operations"  # default when no keyword matches
    foreach ($key in $departmentMap.Keys) {
        if ($user.jobTitle -match $key) {
            $department = $departmentMap[$key]
            break
        }
    }

    # "redacted!" is a placeholder for the lab password
    New-ADUser -Name $user.displayName `
               -GivenName $user.givenName `
               -Surname $user.surname `
               -UserPrincipalName "$username@aiiam.local" `
               -SamAccountName $username `
               -Title $user.jobTitle `
               -Department $department `
               -Office $user.officeLocation `
               -Path "OU=$department,OU=AIIAM-Users,DC=aiiam,DC=local" `
               -AccountPassword (ConvertTo-SecureString "redacted!" -AsPlainText -Force) `
               -Enabled $true

    # Progress indicator every 100 users
    if ($count % 100 -eq 0) {
        Write-Host "Imported $count users..." -ForegroundColor Green
    }
}

Import script processing 1,100+ users with progress indicators

Import completed in 8 minutes. Progress indicators every 100 users prevented that "is this frozen?" anxiety.

1,157 users successfully imported in Active Directory across six departments

Department distribution:

Department     Users    Percentage
Operations     711      61.5%
Engineering    269      23.2%
Executive      69       6.0%
Finance        58       5.0%
People         48       4.1%
Security       2        0.2%

Operations dominated because the AI dataset included many operational roles: specialists, managers, coordinators. This mirrors real organizations where operational staff outnumber engineers.

The 2-Hour Gotcha That Almost Broke Everything

Here's the mistake that cost me two hours: my first import succeeded—1,157 users created in AD, everything looked perfect. Sync ran. Users appeared in Entra ID. Victory, right?

Wrong. The Department field was completely empty in Entra ID. Not null. Just... blank. For every single user.

I spent two hours thinking Entra Connect was broken. Checked attribute mappings. Re-ran sync. Read Microsoft docs. Questioned my career choices. Finally realized the source JSON didn't have a department attribute—it only had jobTitle. Entra Connect was working perfectly. It was faithfully syncing an empty field.

The fix: extract department from job titles using pattern matching in PowerShell during import. Worked for 99% of cases. Edge cases got dumped into Operations as the default. Lesson learned: validate your source data BEFORE building infrastructure around it.
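
A pre-flight check along these lines would have caught the missing attribute before the first import. It is a sketch only: `Get-MissingAttributeReport` is a hypothetical helper, and the inline sample records stand in for entra_users.json.

```powershell
# Hypothetical helper: count records where a required attribute is missing or blank
function Get-MissingAttributeReport {
    param(
        [object[]]$Users,
        [string[]]$RequiredAttributes
    )
    foreach ($attr in $RequiredAttributes) {
        $missing = @($Users | Where-Object {
            [string]::IsNullOrWhiteSpace([string]$_.$attr)
        }).Count
        [pscustomobject]@{ Attribute = $attr; Missing = $missing; Total = $Users.Count }
    }
}

# Illustrative sample records, not real lab data
$sample = @'
[
  { "displayName": "Sample User One", "userPrincipalName": "sample.one@example.com", "jobTitle": "Engineer" },
  { "displayName": "Sample User Two", "userPrincipalName": "sample.two@example.com", "jobTitle": "" }
]
'@ | ConvertFrom-Json

$report = Get-MissingAttributeReport -Users $sample -RequiredAttributes "displayName", "jobTitle", "department"
$report | Format-Table -AutoSize
```

Against the real file, swap the sample for `Get-Content "C:\IdentityLab\entra_users.json" -Raw | ConvertFrom-Json`. Any attribute whose Missing count equals Total is absent from the source entirely.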

Why does this matter? Try building conditional access policies based on department when every user's department is blank. Can't require MFA for Finance if you can't identify who's in Finance.

Entra Connect Configuration: Where Theory Meets Reality

Downloading Entra Connect installer directly to the domain controller

1. Authentication Method

Selecting Password Hash Synchronization for simplest hybrid auth

Three options: Password Hash Sync (PHS), Pass-through Authentication (PTA), or Federation. PHS wins for labs and many production scenarios:

  • Simplicity: No additional infrastructure
  • Resilience: Cloud auth works even if on-premises AD is offline
  • Security: One-way hash sync, not passwords
  • Seamless SSO: Transparent auth on domain-joined machines

PTA needs agents online. Federation demands ADFS infrastructure. PHS just works.

2. Service Account Permissions

Creating service account with required permissions for synchronization

Let the wizard create the service account. It grants precisely the needed permissions.

3. UPN Suffix Mapping: The Confusing Part

When Entra Connect analyzes your AD forest, it finds @aiiam.local. The wizard warns: "UPN suffix 'aiiam.local' is not verified in Entra ID and cannot be added."

You'll see: "Continue without matching all UPN suffixes to verified domains."

Most people panic here. "Continue without matching" sounds like you're about to break something critical. Here's what actually happens:

  1. User in AD: sarah.mitchell@aiiam.local
  2. Entra Connect reads this during sync
  3. Suffix aiiam.local doesn't exist in Entra ID (.local isn't routable)
  4. Entra Connect substitutes the default domain: acme.onmicrosoft.com
  5. User syncs to cloud as: sarah.mitchell@acme.onmicrosoft.com

Both UPNs work. On-premises: @aiiam.local. Cloud services: @acme.onmicrosoft.com. Seamless SSO makes it transparent.
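
The substitution in the steps above can be modelled in a few lines. This illustrates the described behaviour and is not Entra Connect's actual code; `Convert-ToCloudUpn` and its parameters are hypothetical names.

```powershell
# Hypothetical helper that mirrors the suffix substitution described above
function Convert-ToCloudUpn {
    param(
        [string]$OnPremUpn,
        [string[]]$VerifiedDomains,   # domains verified in the tenant
        [string]$DefaultDomain        # the tenant's .onmicrosoft.com domain
    )
    $prefix, $suffix = $OnPremUpn -split '@', 2
    if ($VerifiedDomains -contains $suffix) {
        return $OnPremUpn             # suffix is verified: the UPN is kept as-is
    }
    return "$prefix@$DefaultDomain"   # otherwise fall back to the default domain
}

$cloudUpn = Convert-ToCloudUpn -OnPremUpn "sarah.mitchell@aiiam.local" `
                               -VerifiedDomains @("acme.onmicrosoft.com") `
                               -DefaultDomain "acme.onmicrosoft.com"
Write-Host $cloudUpn
```

Running the same function over every UPN in the import file is a cheap way to preview what cloud sign-in names users will end up with before the first sync.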

⚠️ Critical: Don't panic when you see "Continue without matching all UPN suffixes." Check it and move on. It just means: proceed even though your AD UPN suffixes don't match verified Entra domains. For a lab with a non-routable on-premises domain, this is the correct choice. Microsoft's best practice for production is to add a routable UPN suffix to AD instead.

OU Filtering & Source Anchor

Configure sync to include only AIIAM-Users OU. Uncheck everything else. This keeps admin accounts, computer objects, and default containers on-premises.

Source anchor defaults to mS-DS-ConsistencyGuid—don't change it. This immutable identifier permanently links on-premises objects to cloud.

Write-Back Configuration

I enabled password write-back because I wanted to see how it works in practice. This lets users reset their passwords via M365 self-service, and the change automatically syncs back to on-premises AD, which is genuinely useful for production. I skipped device write-back, group write-back, and user write-back to avoid complexity. Password write-back is low-risk (passwords flow one way: cloud → on-prem during resets). The others? Add them when you have a clear use case.

After config, sync starts immediately. First sync takes longer. Subsequent delta syncs every 30 minutes process only changes.

Entra Connect sync scheduler showing 30-minute delta sync intervals

The 30-minute interval means AD changes appear in Entra ID within half an hour. For testing: Start-ADSyncSyncCycle -PolicyType Delta.
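
For reference, the scheduler can be inspected and triggered from the server where Entra Connect is installed. A short sketch, assuming the ADSync module that ships with Entra Connect is present:

```powershell
# Run on the Entra Connect server; the ADSync module is installed with it
Import-Module ADSync

# Show current scheduler settings, including the sync interval and next run time
Get-ADSyncScheduler

# Trigger a delta sync now instead of waiting for the next scheduled cycle
Start-ADSyncSyncCycle -PolicyType Delta

# A full cycle is normally only needed after configuration changes such as new OU filters
# Start-ADSyncSyncCycle -PolicyType Initial
```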

Verification: Confirming Hybrid Identity

After initial sync (about 30 minutes), check Entra ID portal with filter: On-premises sync enabled == Yes

Entra ID portal showing 1,158 users with on-premises sync enabled

Exactly 1,158 users—every AD import now exists in Entra ID. Click any user to see hybrid metadata:

Aaliyah Campbell's profile showing full hybrid identity attributes and sync details

Job Information (synced from AD):

  • Job title: Junior Software Engineer
  • Department: Engineering
  • Office location: London, UK

On-premises Sync Metadata (hybrid identity fields):

  • On-premises sync enabled: Yes
  • On-premises last sync date time: 13 Mar 2026, 21:28
  • On-premises distinguished name: CN=Aaliyah Campbell,OU=AIIAM-Users,DC=aiiam,DC=local
  • On-premises immutable ID: A+NNCUrXmEKcpV8o3zqjjA== (Base64-encoded mS-DS-ConsistencyGuid)
  • On-premises SAM account name: aaliyah.campbell
  • On-premises user principal name: aaliyah.campbell@aiiam.local
  • On-premises domain name: aiiam.local

The distinguished name confirms the OU structure synchronized correctly. The immutable ID links this cloud object permanently to the on-premises AD account. Mess with this and you break the sync relationship. The dual UPN setup is right there: one for on-premises (@aiiam.local), one for cloud (@acme.onmicrosoft.com).
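
Since the immutable ID is the Base64 form of a 16-byte GUID, it can be converted in both directions when matching a cloud object to its AD account. A small sketch using the value from the profile above; it relies only on .NET types and needs no AD connection.

```powershell
# Immutable ID as shown in the portal (taken from the profile above)
$immutableId = "A+NNCUrXmEKcpV8o3zqjjA=="

# Base64 -> 16 bytes -> GUID
$bytes = [Convert]::FromBase64String($immutableId)
$guid  = [Guid]$bytes
Write-Host "ConsistencyGuid: $guid"

# GUID -> Base64, the direction used when starting from the AD object
# (-ceq because Base64 is case-sensitive)
$roundTrip = [Convert]::ToBase64String($guid.ToByteArray())
Write-Host "Round trip matches: $($roundTrip -ceq $immutableId)"
```

The GUID form is the one to compare against the mS-DS-ConsistencyGuid value read from the on-premises user object.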

This metadata saves you when auth breaks at 2 AM. User can't log in? Check these fields first. They'll tell you immediately if sync failed, if the UPN got mangled, or if you're chasing a completely different problem.
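
The portal count can also be cross-checked from PowerShell. This is a sketch under assumptions: it presumes the ActiveDirectory module for the on-premises side, the Microsoft Graph PowerShell SDK (Microsoft.Graph.Users) for the cloud side, and an account allowed to consent to User.Read.All.

```powershell
# On-premises side: count users under the synced OU
Import-Module ActiveDirectory
$adCount = @(Get-ADUser -Filter * -SearchBase "OU=AIIAM-Users,DC=aiiam,DC=local").Count

# Cloud side: count users flagged as synced from on-premises
Connect-MgGraph -Scopes "User.Read.All"
$cloudUsers = Get-MgUser -Filter "onPremisesSyncEnabled eq true" -All `
                         -ConsistencyLevel eventual -CountVariable cloudCount

Write-Host "AD users in sync scope:   $adCount"
Write-Host "Synced users in Entra ID: $(@($cloudUsers).Count)"
```

If the two numbers differ, the sync error list in the Entra Connect console is the next place to look.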

Testing Dual UPN Authentication

Both UPNs work:

  • aaliyah.campbell@aiiam.local for domain-joined systems
  • aaliyah.campbell@acme.onmicrosoft.com for M365, Azure portal, and Entra ID apps

Seamless SSO means domain users authenticate automatically to cloud services. No credential re-entry. The UPN transformation is invisible—they just log in.

Key Learnings from Building at Scale

UPN mapping is counterintuitive but works perfectly. That warning about unmatched UPN suffixes? It's technically accurate but designed to terrify you. Microsoft's UI team made that checkbox sound like you're bypassing a critical safety check. You're not. "Continue without matching" isn't a workaround; it's the expected configuration when your on-premises domain is non-routable. For production, Microsoft recommends adding a routable UPN suffix to AD to avoid the split-identity problem.

Persistent disks are transformational. First build: 3 hours manual setup. Terraform deployment: 8 minutes. Persistent disk strategy: destroy compute to save costs, rebuild in 10 minutes when needed. For intermittent labs, this is game-changing.

Department fields matter. Seemed optional until I thought about conditional access. Need MFA for Finance users accessing sensitive apps? Department field makes that policy trivial.

Scale reveals hidden issues. With 5 test users, you don't notice that your import script has zero progress indicators. At 1,000 users, watching a silent PowerShell window for 8 minutes wondering if it crashed becomes maddening. You also discover fun things like LDAP query optimization, why bulk operations matter, and how network latency to your domain controller adds up when you're creating users one at a time.

Cost optimization needs planning. Standard_D2s_v3 running 24/7: £90/month. Auto-shutdown + manual startup: £5/month. Include the shutdown schedule in your Terraform config—not as an afterthought when you open your Azure bill and wonder why you're funding Microsoft's next datacenter expansion.

JSON transformation between systems requires attention. Entra ID's Graph API format differs from what New-ADUser expects. Job title maps to title not jobTitle. Office location maps to physicalDeliveryOfficeName not officeLocation. Small mismatches cause silent failures.
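
One way to make that mapping explicit is a lookup table from Graph property names to AD attribute names. The sketch below is illustrative: `ConvertTo-AdAttributes` is a hypothetical helper and the sample user is made up.

```powershell
# Graph property name -> LDAP attribute name in AD
$graphToAd = [ordered]@{
    displayName    = "displayName"
    givenName      = "givenName"
    surname        = "sn"
    jobTitle       = "title"
    department     = "department"
    officeLocation = "physicalDeliveryOfficeName"
}

# Hypothetical helper: build a hashtable keyed by AD attribute name, skipping blank values
function ConvertTo-AdAttributes {
    param([object]$GraphUser, [System.Collections.IDictionary]$Map)

    $result = @{}
    foreach ($graphName in $Map.Keys) {
        $value = $GraphUser.$graphName
        if (-not [string]::IsNullOrWhiteSpace([string]$value)) {
            $result[$Map[$graphName]] = $value
        }
    }
    return $result
}

# Illustrative record, not real lab data
$sampleUser = [pscustomobject]@{ displayName = "Sample User"; jobTitle = "Engineer"; officeLocation = "London, UK" }
$adAttributes = ConvertTo-AdAttributes -GraphUser $sampleUser -Map $graphToAd
$adAttributes
```

Skipping blank values matters here: an attribute missing from the result is easy to spot in a review, whereas an empty string written to AD syncs silently.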

Security Audit: Lab vs. Production Trade-offs

Let's be clear: this deployment would get you fired in production.

Domain controller with a public IP? RDP open to the entire internet? No MFA? Static passwords committed to version control? Any competent security team would shut this down before it hit production.

And that's fine. This is a lab, not a production environment. But let's be honest about exactly what corners we're cutting and why they'd be catastrophic in the real world.

Security Measures We Implemented

Infrastructure:

  • Managed identity assigned to the VM (blob access ended up using a SAS token instead)
  • Time-limited SAS tokens (1-hour expiry)
  • Private blob storage
  • Network Security Group rules
  • Auto-shutdown to reduce attack surface

Identity:

  • Password Hash Sync (more resilient than pass-through)
  • Service account least privilege
  • OU filtering (admin accounts stay on-premises)
  • Immutable ID anchoring

Security Concerns We Deliberately Ignored

Network Exposure:

  • RDP open to 0.0.0.0/0: Production should restrict RDP to known IP ranges or use Azure Bastion
  • Public IP on domain controller: Real deployments use site-to-site VPN or ExpressRoute
  • No network segmentation: Production needs separate subnets for DCs, apps, and management

Authentication & Access:

  • No MFA enforcement: Production Entra ID requires MFA for admins minimum, ideally all users
  • Static passwords in Terraform: Sensitive values should use Azure Key Vault
  • No conditional access policies: Production needs location-based, device compliance, and risk-based policies
  • No Privileged Identity Management: Admin access should be just-in-time, not permanent
  • UPN Transformation: One of the most common root causes of Microsoft 365 authentication incidents during hybrid migrations is keeping .local UPNs instead of aligning them before sync with Microsoft Entra Connect.

Monitoring & Compliance:

  • No audit logging: Production needs Azure Monitor, Log Analytics, and SIEM integration
  • No alerts on suspicious activity: Failed auth, unusual locations, privilege escalation should trigger alerts
  • No backup strategy: Domain controller state should be backed up
  • No disaster recovery plan: Production needs documented DR procedures and tested failover

Compliance & Governance:

  • No encryption at rest: Production should enforce Azure Disk Encryption
  • No data residency controls: Compliance may mandate specific Azure regions
  • No change management: Production changes need approval workflows, not direct terraform apply

Why This Is Acceptable for Labs

Labs prioritize demonstration/learning and cost over defence-in-depth. RDP from anywhere lets you connect without VPN complexity. Static passwords simplify Terraform demos. No MFA reduces testing friction.

The goal is understanding how hybrid identity works, not building a production fortress. You're experimenting, breaking things, fixing them. That's harder when every action needs MFA approval and change management tickets.

But: Understanding what you're skipping is critical. Every ignored control is a production requirement. The transition from lab to production isn't scaling up—it's hardening security.

Making This Safer

How would you harden this?

Network: Replace NSG 0.0.0.0/0 with specific IPs, deploy Azure Bastion, implement Azure Firewall

Identity: Enable MFA, implement conditional access, configure PAWs, use Identity Protection

Monitoring: Stream logs to Sentinel, configure anomaly alerts, enable Cloud App Security

You can progressively add these controls. Each layer teaches you something.

What else would you add? Drop your hardening recommendations in the comments. Someone always thinks of the control you missed.

What's Next:

We've built the foundation: AD infrastructure, 1,158 users imported, hybrid sync to Entra ID. These identities authenticate seamlessly across on-premises and cloud resources.

The next part adds Conditional Access and security monitoring: Conditional Access based on location, Azure Event Hub streaming AD sign-in events, and Stream Analytics for anomaly detection. The 1,158-user dataset provides sufficient volume for meaningful security telemetry.

The identity foundation is in place. Now we monitor and detect threats.

Questions about hybrid identity at scale? Drop them in the comments.
