I’ve been preparing for Microsoft exam 70-534: Architecting Microsoft Azure Solutions. At the time of writing, I haven’t yet sat the exam (so this post doesn’t breach any NDA) but the notes that follow were taken as I studied.
Resources I used included:
- Microsoft Association of Practicing Architects (MAPA) bootcamp (unfortunately the delivery suffered from issues with the streaming media platform and the practical labs are difficult to follow, partly due to changes in the platform).
- Hands-on time with Azure – though the exam is still mostly based on the old Classic/Azure Service Manager (ASM) model, so I found myself going back to learn things in ASM that I do differently under Azure Resource Manager (ARM).
- The Microsoft Press exam preparation book, which contains a lot more detail and is pretty readable (or it would be if I wasn’t trying to read it in PDF form – sometimes paperback books are better for flicking back and forward!).
- A free Azure subscription (either sign up for a one-off £125 credit for a month, or you can get £20 each month for 12 months through Visual Studio Dev Essentials).
— Mark Wilson (@markwilsonit) September 5, 2016
Other potentially-useful references include:
The rest of this post contains my study notes – which may be useful to others but will almost certainly not be enough to pass the exam (i.e. you’ll need to read around the topics too – the Azure documentation is generally very good).
Note that Microsoft Azure is a fast-moving landscape – these notes are based on studying the exam curriculum and may not be current – refer to the Azure documentation for the latest position.
- Virtual networks (VNets) are used to manage networking in Azure. Can only exist in one Azure region.
- CIDR notation is used to describe networks.
- Use different subnets to partition network – e.g. Internet-facing web servers from internal traffic; different environments.
- Subnet has to be part of VNet range with no overlap.
- All virtual machines (VMs) in a VNet can communicate (by default) but anything outside cannot talk (by default) – so VNet is default network boundary.
- In ASM, every VM has an associated cloud service (with its own name @cloudapp.net). Without subnets the VMs can only communicate via a public IP. If multiple cloud services are on same VNet then VMs can communicate using private IP.
- Endpoints are used to manage connections: internal (private) endpoint listening on a given port (e.g. for RDP on 3389); external (public) endpoint on defined port number – therefore go to a particular server, rather than just to the cloud service.
- Public from anywhere on the Internet; private only within the cloud service/VNet.
- Dynamic IP (DIP) is the private IP associated with a VM; only resolvable inside the VNet – external access needs a public IP. Can choose an IP address to use – it will then be reserved.
- Virtual IP (VIP) – assigned to a cloud service – static public IP for as long as at least one VM running inside the cloud service.
- Instance Level Public IP (ILPIP) – for direct connection to Azure VM from Internet (not via the cloud service); public IP attached to a VM. In this configuration, whatever ports open on the VM are open to the Internet – effectively bypassing the security of the VNet.
- Use a VNet-to-VNet VPN to create a tunnel between VNets in different regions. This extends VNets to appear as if they were one.
- Site-to-site VPN to create tunnel between on-premises network and Azure VNet. Uses persistent hardware on-premises.
- Point-to-site VPN to create tunnel from individual computers to an Azure VNet. Software-based.
- Multi-site VPN is a combination of the other methods.
- Azure ExpressRoute avoids routing via ISP – effectively a dedicated link from customer datacentre to Azure region, bypassing ISP (high throughput, low latency and no effect on Internet link).
- ExpressRoute Providers provide point-to-point Ethernet or connect via a cloud exchange. BGP sessions with edge routers on customer site. 200Mbps/500Mbps/1Gbps/10Gbps.
- Can use for Azure computing (IaaS); Azure public services (web apps, etc. – PaaS) or Office 365 (SaaS).
- Secure network with Network Access Control Lists (ACLs), attached to a VIP – define what traffic will be allowed/denied to/from the VIP (i.e. the cloud service). Lower number rule has higher priority. First match is executed and rest are ignored.
- If there is no ACL – all traffic is allowed (whatever endpoints are open will allow access); if there is one or more permit, deny all others; if there is one or more deny, allow all others; combination of permit and deny to define a specific IP range.
- Network ACL affects Incoming traffic only.
- Network Security Groups (NSGs) are attached to a VM or a subnet and act on both inbound and outbound traffic.
- Default inbound rules block all inbound access except from within the VNet and from the Azure load balancer (allow inbound within VNet; allow inbound from Azure LB; deny all other inbound – rules 65000/65001/65500).
- Default outbound rules allow outbound within the VNet and to the Internet (0.0.0.0/0), and deny all other outbound – rules 65000/65001/65500.
- Default rules can’t be edited but can be overridden with higher priority rules.
- Can only use Network ACLs or NSGs – not both together.
- VMs can have multiple NICs in different subnets – i.e. dual-homed machine.
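The subnet rule (inside the VNet range, no overlap) and the Network ACL rule order (lowest number first, first match wins) from the notes above can be sketched in Python. This is only an illustration using the standard `ipaddress` module, not an Azure API: the function names, the `(priority, action, cidr)` rule tuples and the example address ranges are all assumptions, as is the choice to deny unmatched traffic when a list mixes permit and deny rules.

```python
import ipaddress

def validate_subnets(vnet_cidr, subnet_cidrs):
    """Return a list of problems; an empty list means the layout is valid."""
    vnet = ipaddress.ip_network(vnet_cidr)
    subnets = [ipaddress.ip_network(c) for c in subnet_cidrs]
    problems = []
    # Every subnet has to sit inside the VNet address range.
    for s in subnets:
        if not s.subnet_of(vnet):
            problems.append(f"{s} is outside {vnet}")
    # No two subnets may overlap.
    for i, a in enumerate(subnets):
        for b in subnets[i + 1:]:
            if a.overlaps(b):
                problems.append(f"{a} overlaps {b}")
    return problems

def evaluate_acl(rules, source_ip):
    """rules: list of (priority, action, cidr); action is 'permit' or 'deny'."""
    if not rules:
        return "permit"  # no ACL: whatever endpoints are open allow access
    addr = ipaddress.ip_address(source_ip)
    # Lower number = higher priority; the first matching rule is applied.
    for _, action, cidr in sorted(rules):
        if addr in ipaddress.ip_network(cidr):
            return action
    # Nothing matched: permit-only lists deny the rest, deny-only lists
    # allow the rest. Treating mixed lists as "deny the rest" is an assumption.
    actions = {action for _, action, _ in rules}
    return "deny" if "permit" in actions else "permit"
```

For example, `validate_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"])` returns an empty list, while adding `10.0.1.128/25` would be reported as overlapping `10.0.1.0/24`.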
Azure Virtual Machines
- Azure Hypervisor is similar to Hyper-V (but not the same).
- Different sizes of VMs are available.
- VMs are isolated at network and execution level – Azure customers never get access to the hypervisor – only to the VM layer.
- Use Windows Server 2008 onwards or Linux: OpenSUSE; SUSE Enterprise Linux; CentOS; Ubuntu 12.04; Oracle Enterprise Linux; CoreOS; OpenLogic; RHEL
- Basic and Standard service tiers – different machine types available:
- General Purpose: A0-A4 Basic; A5-A7 Standard; A8-A9 Network Optimised (10Gbps networking); A10-A11 Compute Intensive (high end CPUs)
- D1-D4, D11-D14 with SSD temp storage.
- DS1-DS4, DS11-14 with premium (SSD) storage.
- G1-G5 (and GS) with local SSD and lots of RAM.
- F and N too
- Every Azure VM has temporary storage drive (D:) – lost when VM is moved/restarted.
- VMs may be attached to data disks that persist across VM restarts/redeployments and are locally replicated in-region (and beyond if specified).
- Can use gallery images or create custom images (to meet custom requirements, e.g. with certain software pre-installed).
- OS disk always has caching, default Read/Write (data disk caching is optional, default none) – changes need a reboot.
- Can create a bootable image from an OS disk (not data disk).
- Can change caching on data disk without reboot.
- OS disk max 127GB, data disk max 1TB.
- Only charged for storage used (regardless of what is provisioned).
- Can take VHDs from on-premises: (Windows Server 2008 R2 SP1 or later), sysprep then upload with
Add-AzureVhd -Destination storageaccount/container/name.vhd -LocalFilePath localfile.vhd; for Linux install WALinuxAgent (different preparation for different distributions).
- Tell cloud service to load balance an endpoint to split load between VMs. With ARM there is the option to define a separate Load Balancer.
- Encryption at rest for data disks requires third party applications (encryption is in preview though…).
- Availability set: 2 or more VMs distributed across fault domains and upgrade domains for SLA of 99.95% (no SLA for single VMs).
- Auto-scaling based on thresholds (min/max number of instances, CPU utilisation, queue length – between web and worker roles) or time schedule (also time to wait before adding/removing more instances – AKA cooldown period). Needs at least 2 VMs in an availability set.
- Basic VMs have no load balancing or auto-scaling.
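The threshold-based auto-scaling described above (min/max instances, a CPU threshold and a cooldown period) can be illustrated with a small Python sketch. It is not how Azure implements auto-scale: the class, the function and all of the default values are hypothetical and chosen only to show how the settings interact.

```python
from dataclasses import dataclass

@dataclass
class AutoscalePolicy:
    # All defaults are illustrative assumptions, not Azure defaults.
    min_instances: int = 2
    max_instances: int = 10
    scale_out_cpu: float = 80.0   # add an instance above this average CPU %
    scale_in_cpu: float = 30.0    # remove an instance below this average CPU %
    cooldown_minutes: int = 20    # wait this long after the last change

def next_instance_count(policy, current, avg_cpu, minutes_since_last_change):
    """Return the instance count to run after evaluating one metric sample."""
    # During the cooldown period no further scaling action is taken.
    if minutes_since_last_change < policy.cooldown_minutes:
        return current
    if avg_cpu > policy.scale_out_cpu:
        return min(current + 1, policy.max_instances)
    if avg_cpu < policy.scale_in_cpu:
        return max(current - 1, policy.min_instances)
    return current
```

With the default policy, a sample of 90% CPU on 2 instances scales out to 3 once the cooldown has elapsed, and the count never drops below `min_instances`.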
Azure Storage Service
- Blob, table, or queue storage (plus file storage for legacy apps) encapsulated inside a storage account.
- Two types: Standard/Premium – essentially HDD/SSD.
- Up to 500TB per storage account – can create multiple accounts.
- Data stored in multiple locations (minimum 3 copies).
- LRS (Locally Redundant Storage) synchronously replicates 3 copies of data in separate fault and update domains. Use for: low cost; high throughput (less replication); data sovereignty concerns re: transfer out of region. If region goes down, so do all copies.
- ZRS (Zone Redundant Storage) also 3 copies but in at least 2 facilities (1 or 2 regions). Data durable in case of facility failure.
- GRS (Geo-Redundant Storage) – 6 copies (3 copies in primary region asynchronously replicated to 3 more copies in a secondary region). Data still safe in a secondary region but cannot be read (unless Azure flips primary and secondary in event of catastrophic failure).
- RA-GRS (Read Access Geo Redundant Storage) – read from secondary copy. -secondary.cloud.core.windows.net domain name.
- More copies and more bandwidth is more cost! Also:
- GRS ingress max 10 Gibps (20 egress) but does not impact latency of transactions made to primary location.
- LRS ingress max 20 Gibps (30 egress)
- File storage – mounted by servers and accessed via API. Provides shared storage for applications using SMB 2.1. Use cases:
- On-premises apps that rely on file shares migrated to Azure VMs or cloud services without app re-write.
- Storing shared application settings (e.g. config files) or diagnostic data like logs, metrics and crash dumps.
- Tools and utils for developing or administering Azure VMs or cloud services.
- Create shares inside storage accounts – up to 5TB per share, 1TB per file. Unlimited total number of files and folders.
- Blob storage: Not a file system – an object store.
- Create containers inside storage accounts with up to 500TB data per container
- Block blobs, with block ID; blocks are uploaded and then committed – a block does not become part of the blob until it is committed: max 64MB per upload (blocks <=4MB), max 200GB per blob; can upload in parallel, better for large blobs (generally) and for sequential streaming of data.
- Page blobs – collection of 512-byte pages. Max size set during creation and initialisation (up to 1TB). Write by offset and range – instantly committed. Overwrite single page or up to 4MB at once; generally used for random read/write operations (e.g. disks in VMs). Page blobs can be created on premium storage for higher IOPS.
- Access control is via 512-bit keys (secret key – used in API calls to sign requests) – two keys so can maintain connectivity whilst regenerating the other (i.e. during key rotation).
- Can have full public read access for anonymous access to blobs in a container; public read access for blobs only (but not list the blobs in the container); no public read access (default – only signed requests allowed); shared access signature – signed URL for access including permissions, start time and expiry time.
- Lease blob for atomic operations – lease for 15-60 seconds (or infinite). Acquire/renew/change/release (immediately)/break (at lease end).
- Snapshots – used to create a read-only copy of a blob (multiple snapshots possible but cannot outlive the original blob – i.e. deleting blob deletes the snapshots); charges based on difference.
- Copy blob to any container within the same storage account (e.g. between environments).
- Table storage:
- Store data for simple query – NoSQL key-value store – no locks, joins, validation.
- Generally, use row key to retrieve data.
- Can partition tables and generate a partition key.
- Use shared access signatures for querying/adding/updating/deleting/upserting (insert if does not already exist, else update) table entries
- Queue storage:
- Store and access messages through HTTP/HTTPS calls.
- Each queue entry up to 64KB in size.
- Store messages up to 100TB.
- Use for an asynchronous list for processing; messaging layer between applications (avoid handshaking – just add to or consume from the queue); or messaging between web and worker roles.
- Operations to put (add), get (which makes message invisible), peek (get first entry without making invisible), delete, clear (all), update (visibility timeout or contents) for messages.
- Pricing based on storage (per GB/month); replication type (LRS/ZRS/GRS/RA-GRS); bandwidth (ingress is free; egress charged per GB); requests/transactions.
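The block blob behaviour noted above (upload blocks with IDs, then commit) can be modelled in a few lines of Python. This is an in-memory sketch, not the Azure Storage SDK: the class and function names are hypothetical, and the only figure taken from the notes is the 4MB block limit.

```python
import base64

MAX_BLOCK_BYTES = 4 * 1024 * 1024  # block size limit quoted in the notes

class InMemoryBlockBlob:
    """Toy model: blocks stay invisible until a block list is committed."""

    def __init__(self):
        self.uncommitted = {}   # block ID -> bytes, not yet part of the blob
        self.committed = b""    # what a reader of the blob would see

    def put_block(self, block_id, data):
        if len(data) > MAX_BLOCK_BYTES:
            raise ValueError("block exceeds the 4MB limit")
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Committing assembles the blob from the listed blocks, in order.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

def upload(blob, payload, block_size=MAX_BLOCK_BYTES):
    """Split payload into blocks, upload each, and return the block IDs."""
    ids = []
    for n, offset in enumerate(range(0, len(payload), block_size)):
        # Block IDs here are base64 strings of equal length (an assumption
        # made for this sketch).
        block_id = base64.b64encode(f"{n:06d}".encode()).decode()
        blob.put_block(block_id, payload[offset:offset + block_size])
        ids.append(block_id)
    return ids
```

Because each block is independent until the commit, the `put_block` calls could be issued in parallel, which is why block blobs suit large uploads.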
- Web Apps are available in 5 tiers: free/shared/basic/standard/premium.
- These tiers affect: the maximum number of web/mobile/API apps (10/100/unlimited/unlimited/unlimited), logic apps (10/10/10/20 per core/20 per core), integration options (dev/test up to basic; Standard connectors for Standard; Premium Connectors and BizTalk Services for premium), disk space (1GB/1GB/10GB/50GB/500GB), maximum instances (-/-/3/10/50), App Service environments (Premium only), SLA (Free/shared none; Basic 99.9%; Standard and Premium 99.95%)
- Resource Group and Web Hosting Plan are used to group websites and other resources in a single view; can also add databases and other resources; deleting a resource group will delete all of the resources in it.
- Instance types:
- Free F1.
- Shared D1.
- Basic B1-B3: B1 has 1 core, 1.75GB RAM, 10GB storage; cores and RAM double at each step (2/3.5; 4/7) – VMs running web apps.
- Standard S1-S3 same cores and RAM but more storage (50GB).
- Premium P1-P4 same again but 500GB storage (P4 is 8 cores, 14GB RAM).
- Other things to configure:
- .NET Framework version.
- PHP version (or off).
- Java version (or off) – use web container version to choose between Tomcat and Jetty; enabling Java disables .NET, PHP and Python.
- Python version (or off).
- Scale web apps by moving up plans: Free-Shared-Basic-Standard – changes apply in seconds and affect all websites in web hosting plan. No real scaling for Free or Shared plans. Basic can change instance size and count. Standard can autoscale based on schedule or CPU – min/max instances (checked every 5 mins).
- Scale database separately.
- Deployment pipeline can be automated and can flip environments when move from staging to production (flips virtual IP). Can flip back if there are issues.
- SSL certificates – can add own custom certs (2 options – server name indication with multiple SSL certs on a single VM; or IP SSL for older browsers but only one SSL cert for IP address).
- Site extensions – no RDP access to the VM, so tools for website: Visual Studio Online for viewing code or phpMyAdmin.
- Webjobs allow running programs or scripts on website (like cron in Linux or scheduled task in Windows) – one time, schedules or recurring.
- Can use .cmd, .bat or .exe; .ps1, .sh, .php, .py, .js
- Monitoring web app via metrics in the portal.
- For more complex, multi-tier apps.
- Web role with IIS
- Worker role for back-end (asynchronous, perpetual tasks – independent of user interaction; uses polling, listening or third party process patterns).
- Upload code and Azure manages infrastructure (provisioning, load balancing, availability, monitoring, patch management, updates, hardware failures…)
- 99.95% SLA (min 2 role machines)
- Auto-scale based on CPU or queue.
- Communicate via internal endpoints, Azure storage queues, Azure Service Bus (pub/sub model – service bus creates a topic, published by web role and worker role subscriber is notified).
- Availability: fault domain (physical – power, network, etc.) – cannot control but can programmatically query to find out which domain a service is running in. In ASM, normally 0 or 1. ASM automatically distributes VMs across fault domains.
- Upgrade domain (logical – services stopped one domain at a time) – default is 5, can be changed.
- Web and worker roles are automatically placed in an availability set.
- Azure Service Definition Schema (.csdef file) has definitions for cloud service (number of web/worker roles, communications, etc.), service endpoints, config for the service – changes require a restart of services.
- Azure Service Configuration Schema (.cscfg file) runtime components, number of VMs per web/worker role and size etc. – changes do not require service restart.
- Deployment pipeline as for Web Apps.
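To see why spreading role instances across upgrade domains keeps a service available while one domain is being updated, the idea can be sketched in Python. The round-robin placement below is an assumption made for illustration (Azure decides the real placement); the defaults of 5 upgrade domains and 2 fault domains are the figures quoted in the notes.

```python
def assign_domains(instance_count, upgrade_domains=5, fault_domains=2):
    """Place instances round-robin across upgrade and fault domains."""
    return [
        {
            "instance": i,
            "upgrade_domain": i % upgrade_domains,
            "fault_domain": i % fault_domains,
        }
        for i in range(instance_count)
    ]

def available_during_upgrade(placement, domain):
    """Instances still serving while the given upgrade domain is stopped."""
    return [p["instance"] for p in placement if p["upgrade_domain"] != domain]
```

With 7 instances and 5 upgrade domains, stopping domain 0 takes instances 0 and 5 offline and leaves the other five running.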
Azure Active Directory
- Identity and Access Management in the cloud – provided as a service.
- Optionally integrate with on-premises AD.
- Integrate with SaaS (e.g. Office 365).
- Use cases: system to take care of authentication for application in the cloud; “same sign-on” for applications on-premises and cloud; federation to avoid concerns re: syncing passwords and avoid multiple logins to different apps (even with same sign-on) – provide single sign-on; SSO for 1000s of third-party applications. Effectively, if sync password then same sign-on, if no password sync then single sign-on.
- Can also enable Multi-Factor Authentication (MFA) for Azure AD and therefore add MFA to third party apps.
- Directory integration with Azure Active Directory Synchronization Tool (DirSync) or Azure AD Sync. Use Azure Active Directory Connect instead.
- Can also use Forefront Identity Manager 2010 R2 (or Microsoft Identity Manager?) – originally was needed if sync multiple ADs.
- Each directory gets a DNS name at .onmicrosoft.com. Also possible to use custom domains (verify domains in DNS).
- Supports WS-Federation (SAML token format); OAuth 2.0; OpenID Connect; SAML 2.0.
Role-based access control
- Role = collection of actions that can be performed on Azure resources.
- Users for RBAC are from the associated Azure AD.
- Roles can be assigned to external account users by invite.
- Roles can be assigned to Azure AD security groups (recommended practice, rather than direct role assignment).
- Roles can also be assigned for Resource Groups (resources inherit access from subscription-Resource Group-Resource).
- Built-in roles: Owner (create and manage all types of resource); Reader (read all types of resource); Contributor (manage everything except access). Lots of other roles built on this construct – e.g. Virtual Network Contributor.
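The inheritance described above (subscription, then Resource Group, then resource) can be sketched as a prefix check on scope paths. This is a simplified Python model, not the Azure RBAC engine: the three action names, the scope strings and the principals are hypothetical, and the built-in roles are reduced to the summary given in the notes.

```python
# Simplified action sets based on the role summaries in the notes.
ROLES = {
    "Owner": {"read", "write", "manage_access"},
    "Contributor": {"read", "write"},   # everything except access
    "Reader": {"read"},
}

def is_allowed(assignments, principal, scope, action):
    """assignments: list of (principal, role, scope) tuples."""
    for assigned_principal, role, assigned_scope in assignments:
        # A role assigned at a parent scope is inherited by child scopes.
        inherited = scope == assigned_scope or scope.startswith(assigned_scope + "/")
        if assigned_principal == principal and inherited and action in ROLES[role]:
            return True
    return False
```

For instance, a Reader assignment at `/subscriptions/sub1` allows `read` on every resource group and resource beneath it, but never `write`.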
Azure SQL Database
- Relational database as a service (PaaS) – up to 500 GB per database.
- Easy provisioning, automatic HA, load balancing, built-in management portal, scalability, use existing skills to deploy database, patching, etc. taken care of so less time to manage, easy sync with offline data.
- It is not same as SQL Server on a VM though!
- Unsupported features may have corresponding features in Azure; some are just not available.
- Performance model with different tiers: Basic, then Standard S0-S3, Premium P1-P2, P4, P6 (formerly P3).
- Measured in Database Throughput Units (DTUs) – standardised model to help sizing (relative model [like ACU for VMs]).
- Only committing to transactions per hour in Basic, per minute in Standard, per second in Premium.
- Scaling Azure SQL: Federation is deprecated; Custom Sharding (create multiple databases and use application logic to separate, e.g. based on customer ID); Elastic Scale (application doesn’t need to be so smart, endpoint is same but multiple applications).
- SQL database creates automatic backup for active database; at least 3 replicas at any one time – one primary replica and two or more secondaries (more if using GRS).
- Can restore to point-in-time (self-service capability to restore from automated system – creates new database on same server – zero-cost/zero-admin – number of days depends on service tier – 7, 14, 35 days for basic/standard/premium), or geo-restore (restore from geo-redundant backup to any server in any region).
- Geo-restore is automatically enabled for all tiers at no extra cost – helps when there is a region outage (estimated recovery time <12h, RPO <1h).
- Also standard geo-replication (protect app from regional outage – one secondary database in Microsoft-defined paired region; secondary is visible but can’t connect to it until failover occurs – discount for secondary DB as offline until failover – standard/premium only with ERT <30s RPO <5s) and active geo-replication (database redundancy within different regions – up to 4 readable secondary servers – asynchronous replication of committed transactions from one DB to another; for read-intensive applications – e.g. load balancing for read-only workloads – premium only with ERT <30s RPO <5s).
- Regional disaster – Geo Restore, Standard or Active Geo-Replication.
- Online application upgrade – Active Geo replication.
- Online application relocation – Active Geo replication.
- Read load balancing – Active Geo replication.
- Security: only available via TCP 1433 – blocked by default – define firewall rules at server and database level to open up (i.e. to own IP address). Can define firewall rules programmatically with T-SQL, REST API and Azure PowerShell.
- Data encrypted on wire – SSL required all the time
- Data encrypted at rest – encryption with transparent data encryption – real-time I/O encryption/decryption for data and log files.
- Only supports SQL Server authentication or Azure AD authentication – i.e. no Windows authentication.
- First user created (master database principal) cannot be altered or dropped; can configure user-level permissions by logging on to the database and issuing SQL commands.
- Pricing: DB size plus outbound data transfers (per database, per month) – per hour pricing, so drop DTUs at quiet time.
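Custom sharding, as mentioned above, means the application decides which database holds a given customer's data. A minimal Python sketch follows, assuming integer customer IDs and simple modulo placement; the class name and the shard names are hypothetical, and a real system would need a plan for moving data when shards are added.

```python
class ShardMap:
    """Route each customer ID to one of several databases (modulo placement)."""

    def __init__(self, connection_strings):
        if not connection_strings:
            raise ValueError("at least one shard is required")
        self.connection_strings = list(connection_strings)

    def shard_for(self, customer_id):
        # Application logic picks the database; the databases themselves
        # know nothing about each other.
        index = customer_id % len(self.connection_strings)
        return self.connection_strings[index]
```

Modulo placement keeps the example short, but changing the number of shards remaps most customers, which is one reason a lookup table or range map is often preferred in practice.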
Azure Mobile Service
- Cross-platform app development service (PaaS).
- Mobile apps need to be cross-platform, with cloud storage, ID management, database integration and push notifications.
- Azure Mobile Services provides mobile back-end as a service (MBaaS).
- Easily connect to SaaS APIs – e.g. Facebook, Salesforce, etc.
- Auto-scaling based on incoming customer load.
- User authentication taken care of by the service.
- Push notifications to millions in seconds.
- Offline-ready apps with sync capability.
Azure Content Delivery Network (CDN)
- Caching public objects from a storage account at point of presence (POP) for faster access close to users (and to scale when a lot of traffic hits).
- Content served from local edge location. If content not there (first serve), it fetches information from the origin and caches locally.
- Drastic reduction in traffic on original content (so faster access and more scalable!)
- Use a CDN for lower latency, higher throughput, improved performance!
- POP locations separate to Azure regions – not full-fledged DCs.
- CDN origin can be Azure Storage, Apps, Cloud Services or Media Services (including live streaming) – or a custom origin on any web server.
- CDN Edge is a cache – not a permanent store.
- Anycast protocol is used to route user to closest endpoint.
- Create a CDN endpoint: http://cdnname.azureedge.net/
- Change website code to point to the CDN. Route dynamic content to origin, static to CDN.
- Can set a custom domain too (e.g. cdn.domain.com) – avoid browser warnings about content from other domains.
- Can also enable HTTPS – need to upload the SSL certificate.
- Default cache is 72 hours – cache control header can be used to control (any value >300s). Use to ensure not serving stale content.
- Use CDN to cache images, scripts, CSS from Azure Cloud Service but have to provide using HTTP on port 80.
- Pricing based on bandwidth (between edge and origin) and requests.
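The edge behaviour described above (serve from the cache, fetch from the origin on the first request, expire after a time-to-live) can be sketched in Python. This is a toy model rather than anything Azure CDN exposes: the class, its method and the explicit `now` argument are assumptions, and the only figure taken from the notes is the 72-hour default.

```python
DEFAULT_TTL_SECONDS = 72 * 3600  # default cache lifetime quoted in the notes

class EdgeCache:
    """Toy point-of-presence cache in front of an origin fetch function."""

    def __init__(self, fetch_from_origin):
        self.fetch = fetch_from_origin
        self.store = {}        # path -> (content, expires_at)
        self.origin_hits = 0   # how often the origin had to be contacted

    def get(self, path, now, ttl=DEFAULT_TTL_SECONDS):
        cached = self.store.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]   # cache hit: origin is not touched
        # Cache miss or expired entry: fetch from the origin and cache it.
        content = self.fetch(path)
        self.origin_hits += 1
        self.store[path] = (content, now + ttl)
        return content
```

Passing `now` explicitly keeps the sketch deterministic; repeated requests inside the TTL leave `origin_hits` unchanged, which is the traffic reduction the notes refer to.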
Azure Traffic Manager
- DNS-based routing for infrastructure. Route to different regions, monitoring health of endpoints (HTTP checks) to assist with DR. Many routing policies.
- Create a Traffic Manager endpoint and route to this via DNS.
- Options include failover load balancing (re-route based on availability, with priority list – 100% of traffic to one endpoint – used for DR/BC rather than scaling); round robin load balancing (shared across various endpoints in rotation – but only to healthy endpoints cf. DNS RR); Weighted round robin load balancing (use weight to distribute traffic between endpoints); performance load balancing (based on latency times).
- Different to traditional load balancer in that it is DNS-based – user request is direct to endpoint, not through load balancer. Also, note that traffic is direct to web servers – not to Edge locations as in CDN.
- Pay per DNS request resolved (TTL will keep this down) and per health-check configured.
- Diagnostic tasks may include performance measurement, troubleshooting and debugging, capacity planning, traffic analysis, billing and auditing.
- Monitor via portal; Visual Studio (plugins to parse logs, etc.) or third party tools.
- Azure management services to manage alerts or view operational logs. Create alerts based on metrics and thresholds (and average to smooth out spikes) and send email to service admins and co-admins or to a specific address.
- Operational logs are service requests – operation, timestamped, by whom.
- Visual Studio 2013 has Azure SDK for managing Azure services. Some limitations: with remote debugging cannot have more than 25 role instances in a cloud service.
- Azure Redis cache monitoring allows diagnostic data stored in storage account – enable desired chart from Redis cache blade to display the metric blade for that chart.
- System Center 2012 R2 can also monitor, provision, configure, automate, protect and self-service Azure and on-premises.
- Third party tools like New Relic and AppDynamics.
- For websites there are application diagnostic logs and site diagnostic logs (3 types: web server logging; detailed error messages; failed request tracing) – access via Visual Studio, PowerShell or portal. Kudu dashboard at https://sitename.scm.azurewebsites.net.
- View streaming log files (i.e. just see the end):
Get-AzureWebsiteLog -Name "sitename" -Tail -Path http
- View only the error logs:
Get-AzureWebsiteLog -Name "sitename" -Tail -Message Error
- Options include -ListPath (to list log paths) -Message <string> -Name <string> -Path (defaults to root) -Slot <string> -Tail (to stream instead of downloading entire log)
- Can also turn on diagnostics on storage accounts.
Azure HDInsight
- Microsoft Implementation of Hadoop – create clusters in minutes (Windows or Linux); pay per use (no need to leave running); use blob storage as storage layer and Excel to visualise the data.
- Hadoop uses divide and conquer approach to solving big data problems (chunking): processes the data, then combines it again – using HDFS and MapReduce components.
- Provision cluster, take large data set (e.g. search engine queries) on master node, distributed to processing nodes (Map). Reduce collects results and collates.
- Hybrid Hadoop – e.g. for organisations that offer analytics services – burst to cloud…
- Either site-to-site VPN on-premises to Azure, or ExpressRoute.
- Supports Storm and HBase clusters natively – can install other software via custom script.
- Connectors in WebApp (Standard and Premium) – connect to other services (e.g. Azure HDInsight).
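The divide-and-conquer approach described above can be shown with a word count over search queries in plain Python. This only mimics the shape of MapReduce on one machine; it does not use Hadoop or HDInsight, and the function names and sample queries are made up for illustration.

```python
from collections import Counter
from functools import reduce

def split(data, nodes):
    """Distribute the data set across the given number of processing nodes."""
    return [data[i::nodes] for i in range(nodes)]

def map_chunk(queries):
    """Map step: each node counts the words in its own chunk."""
    return Counter(word for q in queries for word in q.lower().split())

def reduce_counts(partials):
    """Reduce step: collect the partial results and combine them."""
    return reduce(lambda a, b: a + b, partials, Counter())

queries = ["azure vm", "azure storage", "vm sizes", "Azure cdn"]
chunks = split(queries, 2)
totals = reduce_counts(map_chunk(chunk) for chunk in chunks)
```

Each chunk is processed independently, so the map step could run on separate nodes; only the small per-node counters need to travel to the reducer.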
High Performance Computing (HPC)
- HPC not the same as big data:
- Big data analytics is usually bounded by data volumes and so network IO.
- HPC usually CPU-bounded.
- HPC good for financial modelling, media encoding, video and image rendering, smaller computer-aided engineering models, etc.
- HPC instances are A8/9 (network optimised – high-bandwidth RDMA network 32Gbps within cloud service as well as 10Gbps Ethernet to other services) and A10/11 (compute intensive).
- Both 8/16 cores, 56/112GB RAM, 382GiB disk.
- Microsoft HPC Pack 2012 R2 SP1 on Windows Server (on-premises, in Azure or hybrid) – Message Passing Interface (MPI) used (over RDMA network).
Azure Machine Learning
- Predictive analysis in cloud – as a service, no VMs etc. to manage.
- Take existing data, analyse by running predictive models and predict future outcomes/trends.
- Deploy in minutes; drag and drop machine learning algorithms (built-in); use data in Azure; add custom scripts; Marketplace of vendors providing custom solutions.
- Classification (group data).
- Regression (predict a value).
- Ranking (order items by criteria).
- Clustering (take a set of data, e.g. by date range).
- Get raw data (unstructured or loosely structured) -> data cleaning -> build machine learning model -> predict results.
- Script and automate the application lifecycle; simplify cloud management; automate manual, long-running and frequently-repeated tasks (save time and increase reliability).
- Works with Web Apps, Virtual Machines, Storage, SQL Server and other Azure services.
- Automation account is a container for Azure Automation resources.
- Create runbooks – set of tasks that perform an automated process – PowerShell workflow.
- Scheduler to start run-books daily/hourly/at a defined point in time.
- Pricing based on minutes/triggers:
- Free = 500 minutes
- Basic tier
- Standard tier
- Automation is an enabler for DevOps:
- Dev team loves changes.
- Ops Team loves stability.
- Agile used for development between business-dev.
- DevOps fills gap between dev and ops.
- Infrastructure as code; configuration automation; automation testing.
- Continuous integration – pipeline to delivery and deployment – cycle of integrating solution with various phases:
- Delivery team check-in to Version Control, triggers Build and Unit Tests (with Feedback). When Build and Unit tests are clean, triggers Automated Acceptance tests (with feedback). When approval gained, move to User Acceptance Tests, and then on Final Approval move to release.
- Continuous Delivery – push-button deployment of any version of software to any environment, on demand – similar to CI but can feed business logic tests.
- Need automated testing to achieve CD.
- Continuous Deployment – natural extension to CD; every check-in ends up in a production release.
- Chef for Configuration Automation: Configuration Management between environments: Build, Test, Release, Deploy (and automate CI/CD). Manage Windows and Linux VMs, integration via Azure Portal. Chef and DSC can be used together to manage infrastructure.
- Puppet – integrated with Azure and VS 2013 for easy deployment of infrastructure across physical and virtual machines. Can deploy pre-configured Puppet image to create a VM.
- Deploy Custom Script with VM configuration – run when VM is launched (one of the available config extensions).
- VM agent is used to install and manage extensions that help interact with the VM (Chef, Puppet, Custom Script).
Azure Media Services
- Developing video on demand is challenging: cost/managing content/encoding/distribution across multiple devices/streaming experience/DRM content protection/providing high quality video for any device any time anywhere.
- Ingest data, encode, format conversion, content protection (DRM policies), on-demand streaming, live streaming, analytics, advertising.
- Need media service account and associated storage account.
- Media Player is web video player service backed by Azure Media Service: one player for all popular devices – no need to develop device-specific player; plays format for that device; easy integration with web and apps; standard player controls.
- Data caching via Azure CDN.
- In management portal, create new Media Service with name, storage account and region.
- Start the Media Service.
- Scale up streaming units (1 unit=200Mbps).
- Upload a video file (from local or from Azure storage) – will be stored in storage account without encryption.
- Publish the file.
- Configure the encoding options, then video is uploaded into portal (can encode multiple times for different formats with different names).
- View the media content (copy link into browser).
Azure Resource Manager
- With ASM even a VM has a cloud service.
- ARM is pure IaaS, not necessarily cloud service.
- Deploy, manage and monitor services as a group; deploy repeatedly throughout the application life cycle; use declarative templates to define deployment; can have dependencies between resources; apply RBAC; organise logically by tagging.
- ASM tightly couples to cloud service – VM in subnet, in VNet, in cloud service, in region, with VIP for DNS and public IP.
- ARM is more loosely coupled – can have multiple VIPs, NICs, etc. All in a RG (which can span regions). Attached via reference.
|Item||ASM XML||ARM JSON|
|VM deployment||Cloud service as container||Does not require a cloud service|
|Availability set||Define VMs under same availability set||Availability set is a resource exposed by the Microsoft.Compute provider – VMs that need HA must be included in availability set|
|Fault domain||Maximum 2 fault domains||Maximum 3 fault domains|
|Load balancing||Cloud service provides an implicit load balancer for the VMs||The load balancer is a resource exposed by the Microsoft.Network provider|
|Virtual IP address||Default static VIP as long as one VM running in the cloud service||Public IP is a resource exposed by Microsoft.Network – can be static (reserved) or dynamic|
|Reserved IP address||Reserve an IP address in Azure and associate with a cloud service||Public IP can be created as static and assigned to a load balancer|
- Choose deployment mode when provisioning resources. Limited inter-operability so choose the right model.
- Deploy using:
- PowerShell: Switch-AzureMode -Name AzureResourceManager
- ARM REST API.
- Azure CLI: azure config mode arm
- Resource Manager template – JSON document – deploys and provisions all of the related resources in a single, co-ordinated operation.
- Tags are key-value pairs of metadata: applied to individual ARM resources or ARM RGs – up to 15 tags per Resource or RG.
- RBAC – Owner, Reader or Contributor.
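To make the template idea more concrete, the sketch below builds a small Resource Manager template as a Python dictionary and prints it as JSON. It declares a static public IP and a load balancer that depends on it, both tagged. The resource names, the `build_template` helper and the `2015-06-15` API version are assumptions chosen for illustration, and the 15-tag check simply mirrors the limit quoted in these notes, so verify both against the current Azure documentation.

```python
import json

MAX_TAGS = 15  # limit quoted in the notes; may differ in current Azure


def build_template(prefix: str, tags: dict) -> dict:
    """Build a minimal ARM template: static public IP plus a load balancer."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags per resource")
    ip_name = f"{prefix}-pip"  # hypothetical naming scheme
    ip_id = f"[resourceId('Microsoft.Network/publicIPAddresses', '{ip_name}')]"
    public_ip = {
        "type": "Microsoft.Network/publicIPAddresses",
        "apiVersion": "2015-06-15",  # assumed API version
        "name": ip_name,
        "location": "[resourceGroup().location]",
        "tags": tags,
        "properties": {"publicIPAllocationMethod": "Static"},
    }
    load_balancer = {
        "type": "Microsoft.Network/loadBalancers",
        "apiVersion": "2015-06-15",  # assumed API version
        "name": f"{prefix}-lb",
        "location": "[resourceGroup().location]",
        "tags": tags,
        # Dependency: the public IP must exist before the load balancer.
        "dependsOn": [ip_id],
        "properties": {
            "frontendIPConfigurations": [
                {"name": "frontend",
                 "properties": {"publicIPAddress": {"id": ip_id}}}
            ]
        },
    }
    return {
        "$schema": "https://schema.management.azure.com/schemas/"
                   "2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [public_ip, load_balancer],
    }


if __name__ == "__main__":
    print(json.dumps(build_template("web", {"environment": "test"}), indent=2))
```

Because the template is declarative, the same document can be deployed repeatedly into different resource groups; Resource Manager works out the order from `dependsOn`.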
Azure Messaging Solutions
- Service Bus: multi-tenant cloud service – each user creates a namespace to work within.
- Queues – one-way communication, asynchronous queuing with guarantee of message delivery order (worker has to keep polling).
- Topics – let each receiving application create a subscription by defining a filter (avoid polling – get notification instead) – pub-sub model. Read with ReceiveAndDelete or PeekLock; can have multiple subscribers.
- Relays – synchronous 2 way communications between applications – won’t help with buffering.
- Event hubs – highly scalable ingestion system that can process millions of events per second (e.g. for IoT).
- Can also queue via storage – more options with service bus but more scalable with storage.
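The difference between the two receive modes is easier to see in code. The sketch below is a toy in-memory queue, not the Service Bus SDK: the class and method names are invented for illustration, and real locks also expire after a timeout, which is not modelled here.

```python
from collections import deque


class InMemoryQueue:
    """Toy FIFO queue imitating two receive modes (not the Service Bus SDK)."""

    def __init__(self):
        self._messages = deque()
        self._locked = {}      # lock token -> message body
        self._next_token = 0

    def send(self, body):
        self._messages.append(body)

    def receive_and_delete(self):
        """Remove the message at once; it is lost if the worker then crashes."""
        return self._messages.popleft() if self._messages else None

    def peek_lock(self):
        """Hide the message behind a lock until it is completed or abandoned."""
        if not self._messages:
            return None
        body = self._messages.popleft()
        self._next_token += 1
        self._locked[self._next_token] = body
        return self._next_token, body

    def complete(self, token):
        """Processing succeeded: delete the locked message for good."""
        del self._locked[token]

    def abandon(self, token):
        """Processing failed: put the message back at the front of the queue."""
        self._messages.appendleft(self._locked.pop(token))


if __name__ == "__main__":
    q = InMemoryQueue()
    q.send("order-1")
    token, body = q.peek_lock()
    q.abandon(token)                 # simulate a failed worker
    print(q.receive_and_delete())    # order-1 (the message was not lost)
```

PeekLock costs an extra round trip (complete or abandon) but survives a worker failure; ReceiveAndDelete is simpler and suits messages that can safely be dropped.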
Azure Backup
- Backup service targeted at replacing tape backup.
- Can work with on-premises workloads or Azure workloads.
- On-premises backup – pick region and create a vault; download vault credential files; download and install Azure backup agent; can seed through Azure Import/Export Service; select backup policy (start time of backup, retention policies (weekly/monthly/yearly)) – backups are incremental.
- Azure VM Backup – install agent if not already installed, register VMs with Azure Backup Service (installs backup agent in extensions); select backup policy.
- Azure Backup backs up the data on a VM. Priced per protected instance and storage consumed (price for protected instance goes up at 50GB, then 500GB, then each additional 500GB).
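The protected-instance tiering can be sketched as a small function. This is only one reading of the notes: it assumes one price up to 50GB and a second price charged per started 500GB block above that. Both prices are hypothetical inputs, and the storage-consumed charge is billed separately and left out.

```python
import math


def protected_instance_charge(size_gb: float,
                              small_price: float,
                              standard_price: float) -> float:
    """Monthly protected-instance charge under the tiering in the notes.

    Assumption: instances up to 50 GB pay `small_price`; larger instances
    pay `standard_price` once per started 500 GB block. Prices are
    hypothetical; check the Azure pricing page for real figures.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    if size_gb <= 50:
        return small_price
    return standard_price * math.ceil(size_gb / 500)


if __name__ == "__main__":
    # Hypothetical prices of 5 and 10 currency units per month.
    for size in (40, 300, 1200):
        print(size, protected_instance_charge(size, 5, 10))
```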
Azure Site Recovery
- Orchestrates failover and recovery of a VM.
- On-premises machine replicated to vault in Azure, or to another datacentre – not Azure to Azure.
- Protect AD and DNS, SQL Server, SharePoint, Dynamics AX, RDS, Exchange, SAP.
- Can also perform a test failover, starting resources in Azure but not routing the traffic.
- Use to protect VMware ESX or Hyper-V VMs or physical servers and can be used to migrate to Azure
Business continuity (BC) and disaster recovery (DR)
- Scenarios: recover from local failures; loss of a region; on-premises to Azure
- For Azure failures:
- HA in PaaS (per region), just make sure web and worker roles have 2 or more instances each – then will automatically be spread across fault domains.
- For region failure need to plan across regions – more elaborate (make sure code and config is available in a second region).
- HA in IaaS needs management of VMs in availability sets (need to define manually).
- At region level, also think about load balancing (VIP), storage (LRS, ZRS, GRS or RA-GRS), Azure SQL replication.
- Recover from loss of region:
- Redeploy on disaster (cold DR) – replicate data ready to run (not suitable where a low RTO/RPO is needed).
- Warm spare (active/passive) – infrastructure in DR region but not fully available (e.g. SQL replication with secondary copy not accessed, not routing traffic to passive).
- Hot spare (active/active) – two regions at the same time (e.g. SQL on IaaS and replicating itself).
- Cross regional strategies for DR:
- VNet – export settings, import in secondary region.
- Cloud Services – create a separate cloud service in target region; publish to secondary region if primary fails; use Traffic Manager to route traffic.
- VM – use blob copy API to duplicate VM disks; geo-replicated VM images.
- Storage – use GRS or RA-GRS (replicated in minutes, so tight RPOs cannot rely on this – need to write own algorithm).
- Azure SQL:
- Geo-restore (1 hour RPO/<12 hours RTO).
- Standard geo-replication (5 secs RPO/30 mins RTO) – no access to secondary.
- Active geo-replication (5 secs RPO/30 mins RTO) – read access to secondary.
- Manually export to Azure Storage (blob) with Azure SQL database import/export service.
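Choosing between the Azure SQL options comes down to comparing the required RPO and RTO with what each option offers. The sketch below encodes the figures from these notes (treating "<12 hours" as a 12-hour upper bound) and filters on them; the `DrOption` type and `candidates` function are illustrative names, and the numbers should be checked against current documentation before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    rpo_seconds: int
    rto_seconds: int
    readable_secondary: bool


# Figures copied from the notes above; they may no longer be current.
OPTIONS = [
    DrOption("Geo-restore", 3600, 12 * 3600, False),
    DrOption("Standard geo-replication", 5, 30 * 60, False),
    DrOption("Active geo-replication", 5, 30 * 60, True),
]


def candidates(max_rpo_seconds: int, max_rto_seconds: int,
               need_readable_secondary: bool = False) -> list:
    """Return the names of options that meet the RPO/RTO requirement."""
    return [
        o.name for o in OPTIONS
        if o.rpo_seconds <= max_rpo_seconds
        and o.rto_seconds <= max_rto_seconds
        and (o.readable_secondary or not need_readable_secondary)
    ]


if __name__ == "__main__":
    # Requirement: lose at most 1 minute of data, be back within 1 hour.
    print(candidates(60, 3600))
    # ['Standard geo-replication', 'Active geo-replication']
```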
Securing Azure Resources
- Cloud security model is shared security model:
- Users are responsible for securing applications.
- Cloud Service Provider (CSP) is responsible for providing controls; users for using them!
- CSP is responsible for infrastructure security.
- VNet/VM security: use endpoints (ACL for endpoints, NSGs at VM or VNet level).
- Storage: use shared access signatures.
- Role-based access control.