Office updates in an unfamiliar language

A few weeks ago, I spotted this when I went to apply updates to Office 365 ProPlus on my work laptop:

It had me confused, but Colin Powers (@iamcolpow) pointed me to a Microsoft Community forum post with a potential fix.

I changed the language order in my Office settings so that English was the first option after Match Windows. Whatever was causing Office to fall back down the list then picked English rather than Arabic.

Office Language Preferences as originally set
Office Language Preferences after the change

Now I can read the dialogue boxes on my Office updates!

Microsoft Online Services: tenants, subscriptions and domain names

I often come across confusion with clients trying to understand the differences between tenants, subscriptions and domain names when deploying Microsoft services. This post attempts to clear up some misunderstandings and to – hopefully – make things a little clearer.

Each organisation has a Microsoft Online Services tenant, which has a unique DNS name in the format organisationname.onmicrosoft.com. This name is unique to the tenant and cannot be changed. Of course, a company can establish multiple organisations, each with its own tenant, but these will always be independent of one another and need to be managed separately.

It’s important to remember that each tenant has a single Azure Active Directory (Azure AD). There is a 1:1 relationship between the Azure AD and the tenant. The Azure AD directory has a unique tenant ID, represented in GUID format. Azure AD can be synchronised with an existing on-premises Active Directory Domain Services (AD DS) directory using the Azure AD Connect software.
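
Incidentally, the tenant ID behind any domain can be discovered anonymously via the OpenID Connect discovery endpoint at login.microsoftonline.com. Here's a minimal sketch in Python (the parsing helper is illustrative, not an official client library):

```python
import json
import re
from urllib.request import urlopen

DISCOVERY_URL = "https://login.microsoftonline.com/{domain}/v2.0/.well-known/openid-configuration"

def tenant_id_from_discovery(document: str) -> str:
    """Extract the tenant GUID from an OpenID Connect discovery document.

    The 'issuer' value has the form:
    https://login.microsoftonline.com/<tenant-guid>/v2.0
    """
    issuer = json.loads(document)["issuer"]
    match = re.search(
        r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
        issuer,
    )
    if not match:
        raise ValueError(f"No tenant GUID found in issuer: {issuer}")
    return match.group(0)

def lookup_tenant_id(domain: str) -> str:
    """Fetch the discovery document for a domain and return its tenant ID."""
    with urlopen(DISCOVERY_URL.format(domain=domain)) as response:
        return tenant_id_from_discovery(response.read().decode())
```

This can be handy when checking which tenant a custom domain is actually registered against.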

Multiple service offerings (services) can be deployed into the tenant: Office 365; Intune; Dynamics 365; Azure. Some of these services support multiple subscriptions that may be deployed for several reasons, including separation of administrative control. Quoting from the Microsoft documentation:

“An Azure subscription has a trust relationship with Azure Active Directory (Azure AD). A subscription trusts Azure AD to authenticate users, services, and devices.

Multiple subscriptions can trust the same Azure AD directory. Each subscription can only trust a single directory.”

Associate or add an Azure subscription to your Azure Active Directory tenant

Multiple custom (DNS) domain names can be applied to services – so mycompany.com, mycompany.co.uk and myoldcompanyname.com could all be directed to the same services – but there is still a limit of one tenant name per tenant.

Further reading

Subscriptions, licenses, accounts, and tenants for Microsoft’s cloud offerings.

Digital transformation – it’s not about the technology

For the last few years, every IT organisation has been talking about “digital”. Digital this, digital that. “Digital transformation” has become a buzzword (OK, two words), just like “Cloud” in 2010 or “Big Data” a few years later.

But what do we mean when we talk about digital transformation? It certainly caused a stir in my recent team meeting.

To answer that, let’s look at three forms of transformation – Cloud, Business and Digital – and how they build on each other:

  • Cloud transformation is about tools and technology. It’s often IT-led (though it should involve business stakeholders too) and so it’s the domain where we techies are happiest. Often, it involves creating new platforms, using cloud services – Azure, Amazon Web Services, Office 365, G-Suite, Dynamics 365, Salesforce. But cloud transformation is just an enabler. In order to deliver value, business transformation is required.
  • Business transformation is about re-engineering internal processes to better serve the needs of the business and improve the way in which services are delivered. It’s about driving efficiencies and delivering better outcomes, but still focused on the way that a business (or other organisation) operates. Business transformation should be business-led but will often (but not always) demand new platforms and services from IT – which leads back to cloud transformation.
  • Digital transformation relates to the external interface with clients/customers/citizens/students. This is the domain of disruptive innovation. Evolve or become extinct. It’s often spoken about in terms of channel shifting – getting people to use digital services in place of older, more laborious alternatives – but, ideally, it’s complementary to, rather than a replacement for, existing methods (because otherwise we run the risk of digital exclusion). Importantly though, it’s no good having digital transformation without business transformation and, like business transformation, digital transformation should be business-led.
Cloud, Business and Digital Transformation

Let’s take an example of digital transformation: when my bins were missed from a council waste collection, I logged a call via my local council’s website, which created an incident in a case management system and within an hour or so the bin lorry was back in my street because the driver had been alerted to the missed collection on his in-cab display. The service was excellent (OK, there was a mistake but it was quickly dealt with), the resolution was effective, and it was enabled using digital technologies.

But here’s another example. When I was held up in the neighbouring county by some defective temporary traffic lights at some roadworks, the local authority’s out-of-hours phone service wanted me to channel shift to the website (not appropriate when driving a car). It also couldn’t cope with my problem – the out-of-hours phone service ended up at a random mobile voice mailbox. In the end, I called the Police on 101 (non-emergency) when really some basic business processes needed to be fixed. That shouldn’t necessarily require a technical solution, but digital transformation of external services does rely on effective internal processes. Otherwise, what you have created is a shiny new approach on the outside, with the same clunky processes internally.

Hopefully this post has helped to describe the differences between cloud, business and digital transformation. But also consider this… digital transformation relies on business transformation – but not all business transformation needs new IT… the important thing is to identify the challenges being faced, the opportunities to innovate, and only then consider the platforms that are needed in order to move forwards.

Workaround for when Microsoft Edge Beta (v79) fails to load pages

Last week, following Microsoft’s announcement at Ignite that the (Chromium-based) Edge was nearing completion, ready for launch on 15 January 2020, my browser updated itself to the final beta – version 79.0.309.14. And it broke.

I found the original Microsoft Edge to be unreliable and for many years my default browser was Google Chrome. When my latest work PC came without Chrome, rather than ask an administrator to install Chrome for me I installed the Edge beta, which had been rock solid until this latest update.

Various people suggested a later build might fix it (my PC says Edge is up to date, but there will be faster deployment rings than the one I’m on) but the real fix came from my colleague Thom McKiernan (@thommck) who suggested adding the -no-sandbox switch to the Edge Beta shortcut:

"C:\Users\mark\AppData\Local\Microsoft\Edge Beta\Application\msedge.exe" -no-sandbox

The Edge beta complains that “You’re using an unsupported command-line flag: --no-sandbox. This poses stability and security risks.” but all my tabs load without issue now. Let’s see if this is properly fixed next time my browser updates…

[Updated 14/11/2019: this post originally indicated that the use of the -no-sandbox switch was a fix for the issue. It’s not – it should be seen as a workaround and used with caution until this problem is properly resolved.]

A toolkit for cyclists: saving money on basic maintenance

Anyone who regularly reads my blog or follows me on Twitter will know that cycling (or being a “cyclist’s Dad”) is one of my major activities. I also commute by bike where practical – I have a Brompton folding bike, which has caused some amusement in the office (think BBC W1A) – though I mostly work from home and commuting by bike up two flights of stairs might be a bit tricky…


With more bikes in the family than I care to admit (I have at least 5 at the moment and my eldest son’s n+1 count is increasing too), I’ve been trying to do more of the servicing myself (or with my son) to reduce costs. This has been spurred on by a few things including:

  • I needed to call on help from others to swap pedals at a race recently. I had the skills but not the tools. I then bought the right tools…
  • I discovered that drive chains are supposed to be shiny, not grimy and that they perform much better when you know how to remove and degrease the chain, cassette, chainrings, etc. A chain cleaner is great, but if the other components are still covered in gunk then the chain quickly turns black again.
  • I also had to remove the cranks and bottom bracket on a bike as part of another project so the bike-specific toolkit has been growing.
  • My son was able to use his new knowledge and my tools to swap components between frames.

Luckily, it needn’t be expensive. Much as I would like to have a wall of Park Tool tools, that’s a stretch too far for my wallet, so this is what’s in the toolkit so far:

BTW, almost every task I’ve needed to complete has had a short video available on YouTube to tell me how to do it. GCN is consistently good.

Note: Wiggle rejected me for their affiliate marketing scheme, so there is no financial incentive for me if you click the links above – they are purely for the convenience of readers!

Why landscaping my garden was just like an IT project

Over the last few weeks, I’ve been redeveloping the garden at home and the whole experience has made me reflect on the way that IT projects are often delivered…

Who’s been developing the garden?

Well, when I say “I’ve”, that may be pushing it slightly… I paid other people to do significant chunks of it but that’s the first similarity. I started work and quickly realised that it would a) take me a long time and b) involve the use of tools and machinery that I don’t have so I needed to engage specialist assistance.

This is just like my customers, who have something in mind that aligns with their strategic goals and objectives but lack the resources or experience, and so look externally for assistance.

Getting some quotes

Having decided that I needed help, the next step was to get some idea of what it might cost. After speaking to a selection of potential contractors, I knew that my budget was hopelessly optimistic and I’d either need to scale the plans back or dig deeper into my pockets.

Again, just as in my professional world, everyone has their idea of what something might cost but sometimes that’s just not realistic.

How quickly can we start?

Having agreed on a price and a scope, the next question was how soon? Actually, for me this was pretty good: 2 weeks to start and it should take about 2 weeks. Great. Let’s do this.

In my professional life, I come across procurement periods that can run for months but then the project must happen right now. It’s not realistic to expect a professional services company to have people waiting around for your order (if they do, then maybe ask why). Expect to take a few weeks to engage.

The flurry of activity

The big day came. My drive was filled up with a skip and several tons of aggregate, sand and cement. Materials came and went. People were on site. Earth was moved. Things happened.

It always feels good when something becomes real. Progress on any project is good, especially after waiting a while to get going. But don’t expect a smooth ride the whole way…

The first sprint delivered

Whilst my family took a break, work continued at home. Drainage was installed, wooden sleepers were built up into steps and walls and a stone patio was laid.

That sounds like a successful first sprint. Step one completed, demonstrable progress and a milestone payment due.

Slippage

But hang on, we’re already 9 days into a 2-week project and there are still many items on the backlog. The weather had either been too wet or too hot. And there were delays from the skip hire company that led to inefficiencies in removing materials from the site. We were making progress but the timeline (and so the budget) was starting to slip.

Many projects will have unforeseen issues. That’s life. Managing them is what makes the difference. And the key to that is communication between client and supplier.

Scope creep

What about the electrics? I had already spotted that they were missing from the quote but there was armoured cabling to be buried before the garden was completed. And that meant bringing in another contractor. Thankfully, he had worked with the landscaping team before, so he could fit around them without delay (at least for the first fix).

More contractors mean dependencies. Even when teams have worked together previously, there will be some complications to work out. Again, good project management helps.

When will this end? And what about the budget?

Sprint 2 was more of a jog. There was still earth to move, a pergola to be built, a concrete base for my “man cave” to be poured and turf to be laid. Time was ticking – the gap I’d left between the landscaping and the project work packages I was due to deliver myself (log cabin construction, garden furniture arrival) was shrinking – and with work taking place on a time and materials basis the budget was stretched.

Time for a meeting. Let’s agree what’s still left to do and how long it will take, lock down the budget and push towards completion.

I have to admit this was frustrating. But I’ve seen it in my world of IT too. Want a fixed price? Be prepared to pay more, as the risk taken on by the organisation delivering the work needs to be factored in. Time and materials can work both ways (finish early, pay less – or pay more if the project over-runs) and after a while, patience will wear thin. Again, communication is key. Establish what’s left to do in the agreed scope, nail down the timescales and push for completion.

And as for the other work packages, very few projects exist in isolation. There’s nearly always an entire programme of works to deliver to meet the stated goals/objectives. Some realism is required about how dependencies will align because if you expect the various work packages to run on from each other, you should be prepared for the occasional disappointment.

Phase 1 complete

Three and a bit weeks after work started, phase 1 was complete. And it looked great. All the pain was worthwhile. Just in time to start construction of the log cabin on that base.

phase 1 of the garden completed

60% over time, 7% over budget. Not wonderful stats but also not atypical.

Postscript: Phase 2 delayed

The log cabin arrived on time but was damaged on delivery. And it would take 2 weeks for a replacement roof apex to be manufactured and shipped. With most of the materials on-site though, it needed to be built as far as it could be and then wrapped up to protect it from the elements.

Sometimes, even the best planning can come undone. Supplier contracts might help with speedy resolution of issues but sometimes there’s nothing to be done except to sit and wait…

Using Microsoft Bookings to manage device rollouts

Microsoft Bookings, showing available services

End-user computing (EUC) refreshes can place significant logistical challenges on an organisation. Whilst technologies like Windows 10 Autopilot will take us to a place where users can self-provision, often there’s more involved and some training is required to help users to adopt the technology (and potentially associated business changes).

Over the last few years, I’ve worked on projects that have used a variety of systems to manage the allocation of training/handover sessions to people, but they’ve always been lacking in some way. We’ve tried using a PowerApps app and a SharePoint calendar extension, but then Microsoft made their Bookings app available on Office 365 Enterprise subscriptions (it was previously only available for Business subscriptions).

Microsoft Bookings is designed for small businesses and the example given in the Microsoft documentation is a pet grooming parlour. You could equally apply the app to other scenarios though: a hairdressing salon; bike repairs; or IT Services.

I can’t see Microsoft Bookings on my tenant!

That’s because, by default, it’s not there for Enterprise customers. Most of my customers use an E3 or E5 subscription and I was able to successfully test on a trial E3 tenant. My E1 was no good though…

The process to add the Business Apps (free) – including Bookings – to an Enterprise tenant will depend on whether it’s Credit Card (PAYG), Enterprise Agreement (EA) or Cloud Solutions Provider (CSP) licenced but it’s fully documented by Microsoft. When I enabled it on my test tenant, I received an invoice for £0.00.

So, how do I configure Microsoft Bookings?

The app is built around a calendar on a website, with a number of services and assigned staff. Each “staff member” needs to have a valid email address but they don’t need to be a real person – all of the email messages could be directed to a single mailbox, which also reduces the number of licences needed to operate the solution.

It took some thought to work out how to apply this to my End User Device Handover scenario, but I set up:

  • A calendar for the project.
  • A service for the handover sessions. Use this to control when services are provided (e.g. available times and staff).
  • A number of dummy “staff” for the number of slots in each session (e.g. 10 people in each session, 10 slots so 10 “staff”).
Microsoft Bookings, showing confirmation of booked service Microsoft Bookings, calendar view

Once all of the staff available for a session are booked (i.e. all of the slots for a session are full), it’s no longer offered in the calendar. There’s no mechanism to prevent multiple/duplicate bookings, but a simple manual check – exporting a .TSV file of all the bookings each day – allows duplicates to be identified and remediated.

(Incidentally, Excel wouldn’t open a TSV file for me. What I could do though was open the file in Notepad and copy/paste it to Excel, for sorting and identification of multiple bookings from the same email address.)
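
That daily duplicate check is easy to script, too. A minimal sketch in Python – note that the column heading “Customer Email” is an assumption here, and may differ in your export:

```python
import csv
import io
from collections import Counter

def find_duplicate_bookings(tsv_text: str, email_column: str = "Customer Email"):
    """Return email addresses that appear in more than one booking row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter(row[email_column].strip().lower() for row in reader)
    return sorted(email for email, n in counts.items() if n > 1)

# Example export (tab-separated, illustrative data):
sample = (
    "Customer Email\tService\tStart\n"
    "alice@example.com\tHandover\t09:00\n"
    "bob@example.com\tHandover\t09:00\n"
    "alice@example.com\tHandover\t11:00\n"
)
print(find_duplicate_bookings(sample))  # ['alice@example.com']
```

Anyone flagged by the check can then be contacted to cancel their surplus booking(s).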

Further reading

These blog posts are a couple of years old now but helped a lot:

A logical view on a virtual datacentre services architecture

A couple of years ago, I wrote a post about a logical view of an End-User Computing (EUC) architecture (which provides a platform for Modern Workplace). It’s served me well and the model continues to be developed (although the changes are subtle so it’s not really worth writing a new post for the 2019 version).

Building on the original EUC/Modern Workplace framework, I started to think what it might look like for datacentre services – and this is something I came up with last year that’s starting to take shape.

Just as for the EUC model, I’ve tried to step up a level from the technology – to get back to the logical building blocks of the solution so that I can apply them according to a specific client’s requirements. I know that it’s far from complete – just look at an Azure or AWS feature list and you can come up with many more classifications for cloud services – but I think it provides the basics and a starting point for a conversation:

Logical view of a virtual datacentre environment

Starting at the bottom left of the diagram, I’ll describe each of the main blocks in turn:

  • Whether hosted on-premises, co-located or making use of public cloud capabilities, Connectivity is a key consideration for datacentre services. This element of the solution includes the WAN connectivity between sites, site-to-site VPN connections to secure access to the datacentre, Internet breakout and network security at the endpoints – specifically the firewalls and other network security appliances in the datacentre.
  • Whilst many of the solution building blocks (SBBs) in the virtual datacentre services architecture are equally applicable for co-located or on-premises datacentres, there are some specific Cloud Considerations. Firstly, cloud solutions must be designed for failure – i.e. to design out any elements that may lead to non-availability of services (or at least to fail within agreed service levels). Depending on the organisation(s) consuming the services, there may also be considerations around data location. Finally, and most significantly, the cloud provider(s) must practice trustworthy computing and, ideally, will conform to the UK National Cyber Security Centre (NCSC)’s 14 cloud security principles (or equivalent).
  • Just as for the EUC/Modern Workplace architecture, Identity and Access is key to the provision of virtual datacentre services. A directory service is at the heart of the solution, combined with a model for limiting the scope of access to resources. Together with Role Based Access Control (RBAC), this allows for fine-grained access permissions to be defined. Some form of remote access is required – both to access services running in the datacentre and for management purposes. Meanwhile, identity integration is concerned with integrating the datacentre directory service with existing (on-premises) identity solutions and providing SSO for applications, both in the virtual datacentre and elsewhere in the cloud (i.e. SaaS applications).
  • Data Protection takes place throughout the solution – but key considerations include intrusion detection and endpoint security. Just as for end-user devices, endpoint security covers such aspects as firewalls, anti-virus/malware protection and encryption of data at rest.
  • In the centre of the diagram, the Fabric is based on the US National Institute of Standards and Technology (NIST)’s established definition of essential characteristics for cloud computing.
  • The NIST guidance referred to above also defines three service models for cloud computing: Infrastructure as a Service (IaaS); Platform as a Service (PaaS) and Software as a Service (SaaS).
  • In the case of IaaS, there are considerations around the choice of Operating System. Supported operating systems will depend on the cloud service provider.
  • Many cloud service providers will also provide one or more Marketplaces with both first and third-party (ISV) products ranging from firewalls and security appliances to pre-configured application servers.
  • Application Services are the real reason that the virtual datacentre services exist, and applications may be web, mobile or API-based. There may also be traditional hosted server applications – especially where IaaS is in use.
  • The whole stack is wrapped with a suite of Management Tools. These exist to ensure that the cloud services are effectively managed in line with expected practices and cover all of the operational tasks that would be expected for any datacentre including: licensing; resource management; billing; HA and disaster recovery/business continuity; backup and recovery; configuration management; software updates; automation; management policies and monitoring/alerting.

If you have feedback – for example, a glaring hole or suggestions for changes, please feel free to leave a comment below.

Amazon Web Services (AWS) Summit: London Recap

I’ve written previously about the couple of days I spent at ExCeL in February, learning about Microsoft’s latest developments at the Ignite Tour and, a few weeks later I found myself back at the same venue, this time focusing on Amazon Web Services (AWS) at the London AWS Summit (four years since my last visit).

Even with a predominantly Microsoft-focused client base, there are situations where a multi-cloud solution is required and so, it makes sense for me to expand my knowledge to include Amazon’s cloud offerings. I may not have the detail and experience that I have with Microsoft Azure, but certainly enough to make an informed choice within my Architect role.

One of the first things I noticed is that, for Amazon, it’s all about the numbers. The AWS Summit had a lot of attendees – 12,000+ were claimed – for more than 60 technical sessions supported by 98 sponsoring partners. Frankly, it felt to me that there were a few too many people there at times…

AWS is clearly growing – citing 41% growth comparing Q1 2019 with Q1 2018. And, whilst the comparisons with the industrial revolution and the LSE research showing that 95% of today’s startups would find traditional IT models limiting were all good and valid, the keynote soon switched to focus on AWS claims of “more”. More services. More depth. More breadth.

There were some good customer slots in the keynote: Sainsbury’s Group CIO Phil Jordan and Group Digital Officer Clodagh Moriaty spoke about improving online experiences, integrating brands such as Nectar and Sainsbury’s, and using machine learning to re-plan retail space and to plan online deliveries. Ministry of Justice CDIO Tom Read talked about how the MOJ is moving to a microservice-based application architecture.

After the keynote, I immersed myself in technical sessions. In fact, I avoided the vendor booths completely because the room was absolutely packed when I tried to get near. My afternoon consisted of:

  • Driving digital transformation using artificial intelligence by Steven Bryen (@Steven_Bryen) and Bjoern Reinke.
  • AWS networking fundamentals by Perry Wald and Tom Adamski.
  • Creating resilience through destruction by Adrian Hornsby (@adhorn).
  • How to build an Alexa Skill in 30 minutes by Andrew Muttoni (@muttonia).

All of these were great technical sessions – and probably too much for a single blog post but, here goes anyway…

Driving digital transformation using artificial intelligence

Amazon thinks that driving better customer experience requires Artificial Intelligence (AI), specifically Machine Learning (ML). Using an old picture of London Underground workers sorting through used tickets in the 1950s to identify the most popular journeys, Steven Bryen suggested that more data leads to better analytics and better outcomes that can be applied in more ways (in a cyclical manner).

The term “artificial intelligence” has been used since John McCarthy coined it in 1955. The AWS view is that AI is taking off because of:

  • Algorithms.
  • Data (specifically the ability to capture and store it at scale).
  • GPUs and acceleration.
  • Cloud computing.

Citing research from PwC [which I can’t find on the Internet], AWS claim that world GDP was $80Tn in 2018 and is expected to be $112Tn in 2030 ($15.7Tn of which can be attributed to AI).

Data science, artificial intelligence, machine learning and deep learning can be thought of as a series of concentric rings.

Machine learning can be supervised learning (getting better at finding targets); unsupervised (assume nothing and question everything); or reinforcement learning (rewarding high-performing behaviour).

Amazon claims extensive AI experience through its own ML experience:

  • Recommendations Engine
  • Prime Air
  • Alexa
  • Go (checkoutless stores)
  • Robotic warehouses – taking trolleys to packers to scan and pack (using an IoT wristband to make sure robots avoid maintenance engineers).

Every day Amazon applies new AI/ML-based improvements to its business, at a global scale through AWS.

Challenges for organisations are that:

  • ML expertise is rare
  • plus: building and scaling ML technology is hard
  • plus: deploying and operating models in production is time-consuming and expensive
  • equals: a lack of cost-effective, easy-to-use and scalable ML services

Most time is spent getting data ready to get intelligence from it. Customers need a complete end-to-end ML stack and AWS provides that with edge technologies such as Greengrass for offline inference and modelling in SageMaker. The AWS view is that ML prediction becomes a RESTful API call.

With the scene set, Steven Bryen handed over to Bjoern Reinke, Drax Retail’s Director of Smart Metering.

Drax has converted former coal-fired power stations to use biomass: capturing carbon into biomass pellets, which are burned to create steam that drives turbines – representing 15% of the UK’s renewable energy.

Drax uses a systems thinking approach with systems of record, intelligence and engagement.

Systems of intelligence need:

  • Trusted data.
  • Insight everywhere.
  • Enterprise automation.

Customers expect tailoring: efficiency; security; safety; and competitive advantage.

Systems of intelligence can be applied to team leaders, front-line agents (so they already know that a customer has just been online looking for a new tariff), leaders (for reliable data sources), and assistant-enabled recommendations (which are no longer futuristic).

Fragmented/conflicting data is pumped into a data lake from where ETL and data warehousing technologies are used for reporting and visualisation. But Drax also pull from the data lake to run analytics for data science (using Inawisdom technology).

The data science applications can monitor usage and see base load, holidays, etc. Then, they can look for anomalies – a deviation from an established time series. This might help to detect changes in tenants, etc. and the information can be surfaced to operations teams.

AWS networking fundamentals

After hearing how AWS can be used to drive insight into customer activities, the next session was back to pure tech. Not just tech but infrastructure (albeit as a service). The following notes cover off some AWS IaaS concepts and fundamentals.

Customers deploy into virtual private cloud (VPC) environments within AWS:

  • For demonstration purposes, a private address range (CIDR) was used – 172.31.0.0/16 (a private IP range from RFC 1918). Importantly, AWS ranges should be selected to avoid potential conflicts with on-premises infrastructure. Amazon recommends using /16 (65,536 addresses) but network teams may suggest something smaller.
  • AWS is dual-stack (IPv4 and IPv6) so even if an IPv6 CIDR is used, infrastructure will have both IPv4 and IPv6 addresses.
  • Each VPC should be spread across availability zones (AZs) – risk domains on different power grids/flood profiles – with a subnet placed in each (e.g. 172.31.0.0/24, 172.31.1.0/24, 172.31.2.0/24).
  • Each VPC has a default routing table but an administrator can create and assign different routing tables to different subnets.
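
That address plan is easy to sanity-check with Python’s standard ipaddress module – carving the /16 into per-AZ /24 subnets (the AZ names are just illustrative):

```python
import ipaddress

vpc = ipaddress.ip_network("172.31.0.0/16")  # the VPC CIDR (RFC 1918)

# One /24 subnet per availability zone, as in the example above
az_subnets = list(vpc.subnets(new_prefix=24))[:3]
for az, subnet in zip(["eu-west-2a", "eu-west-2b", "eu-west-2c"], az_subnets):
    print(az, subnet)  # e.g. eu-west-2a 172.31.0.0/24

# A /16 gives 65,536 addresses; each /24 gives 256
assert vpc.num_addresses == 65536
assert all(s.num_addresses == 256 for s in az_subnets)
assert all(s.subnet_of(vpc) for s in az_subnets)
```

The same approach works for checking that a proposed VPC range leaves room for future subnets before anything is provisioned.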

To connect to the Internet you will need a connection, a route and a public address:

  • Create a public subnet (one with public and private IP addresses).
  • Then, create an Internet Gateway (IGW).
  • Finally, create a route so that the default gateway is the IGW (172.31.0.0/16 local and 0.0.0.0/0 igw_id).
  • Alternatively, create a private subnet and use a NAT gateway for outbound only traffic and direct responses (172.31.0.0/16 local and 0.0.0.0/0 nat_gw_id).
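
Route tables like the ones above resolve destinations by longest prefix match – the most specific matching route wins. A small sketch, modelling the two routes from the public-subnet example:

```python
import ipaddress

# Route table for the public subnet: local route plus default route via the IGW
routes = {
    ipaddress.ip_network("172.31.0.0/16"): "local",
    ipaddress.ip_network("0.0.0.0/0"): "igw_id",
}

def next_hop(destination: str) -> str:
    """Pick the most specific (longest-prefix) route that matches."""
    addr = ipaddress.ip_address(destination)
    matching = [net for net in routes if addr in net]
    return routes[max(matching, key=lambda net: net.prefixlen)]

print(next_hop("172.31.1.10"))  # local (in-VPC traffic stays local)
print(next_hop("8.8.8.8"))      # igw_id (everything else goes out via the IGW)
```

Swapping igw_id for nat_gw_id gives the private-subnet variant described above.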

Moving on to network security:

  • Security groups provide a stateful distributed firewall, so a request from one direction automatically sets up permissions for a response from the other (avoiding the need to set up separate rules for inbound and outbound traffic).
    • Using an example VPC with 4 web servers and 3 back end servers:
      • Group into 2 security groups
      • Allow web traffic from anywhere to web servers (port 80 and source 0.0.0.0/0)
      • Only allow web servers to talk to back end servers (port 2345 and source security group ID)
  • Network Access Control Lists (NACLs) are stateless – they are just lists and need to be explicit to allow both directions.
  • Flow logs work at instance, subnet or VPC level and write output to S3 buckets or CloudWatch logs. They can be used for:
    • Visibility
    • Troubleshooting
    • Analysing traffic flow (no payload, just metadata)
      • Network interface
      • Source IP and port
      • Destination IP and port
      • Bytes
      • Action (accept/reject)
  • DNS in a VPC is switched on by default for resolution and assigning hostnames (rather than just using IP addresses).
    • AWS also has the Route 53 service for customers who would like to manage their own DNS.
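
Flow log records are space-delimited metadata, so they are easy to pick apart. A minimal parser, assuming the default (version 2) record format, with a made-up example record:

```python
# Default (version 2) VPC Flow Log fields, in order
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_log(line: str) -> dict:
    """Split one space-delimited flow log record into named fields."""
    return dict(zip(FIELDS, line.split()))

# A made-up record: an accepted inbound HTTP request to a web server
record = parse_flow_log(
    "2 123456789012 eni-0abc12de 203.0.113.12 172.31.0.5 "
    "49152 80 6 10 840 1563108188 1563108248 ACCEPT OK"
)
print(record["srcaddr"], record["dstport"], record["action"])
```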

Finally, connectivity options include:

  • Peering for private communication between VPCs
    • Peering is 1:1 and can span regions, but the CIDR ranges must not overlap.
    • One VPC owner sends a peering request, which the owner of the other VPC accepts; both sides then update their routing tables.
    • Peering can get complex if there are many VPCs. There is also a limit of 125 peerings so a Transit Gateway can be used to act as a central point but there are some limitations around regions.
    • Each Transit Gateway can support up to 5000 connections.
  • AWS can be connected to on-premises infrastructure using a VPN or with AWS Direct Connect
    • A VPN is established with a customer gateway and a virtual private gateway is created on the VPC side of the connection.
      • Each connection has 2 tunnels (2 endpoints in different AZs).
      • Update the routing table to define how to reach on-premises networks.
    • Direct Connect
      • AWS services on public address space are outside the VPC.
      • Direct Connect locations have a customer or partner cage and an AWS cage.
      • Create a private virtual interface (VLAN) and a public virtual interface (VLAN) for access to VPC and to other AWS services.
      • A Direct Connect Gateway is used to connect to each VPC.
    • Before Transit Gateway customers needed a VPN per VPC.
      • Now they can consolidate on-premises connectivity
      • For Direct Connect it’s possible to have a single tunnel with a Transit Gateway between the customer gateway and AWS.
  • Route 53 Resolver service can be used for DNS forwarding on-premises to AWS and vice versa.
  • VPC Sharing provides separation of resources with:
    • An Owner account to set up infrastructure/networking.
    • Subnets shared with other AWS accounts so they can deploy into the subnet.
  • Interface endpoints make an API look as if it’s part of an organisation’s VPC.
    • They override the public domain name for the service.
    • Using PrivateLink, only a specific service port is exposed, the direction of communication is controlled, and IP addresses no longer matter.
  • Amazon Global Accelerator brings traffic onto the AWS backbone close to end users and then uses that backbone to provide access to services.
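
The "CIDRs must not overlap" constraint on peering is simple to check with the ipaddress module – the ranges below are illustrative:

```python
import ipaddress

def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """VPC peering requires the two VPC CIDR ranges not to overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

print(can_peer("172.31.0.0/16", "10.0.0.0/16"))    # distinct ranges: peering possible
print(can_peer("172.31.0.0/16", "172.31.4.0/24"))  # overlapping ranges: not allowed
```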

Creating resilience through destruction

Adrian Horn presenting at AWS Summit London

One of the most interesting sessions I saw at the AWS Summit was Adrian Horn’s session that talked about deliberately breaking things to create resilience – which is effectively the infrastructure version of test-driven development (TDD), I guess…

Actually, Adrian made the point that it’s not so much the issues caused by bringing things down as the complexity of bringing them back up.

“Failures are a given and everything will eventually fail over time”

Werner Vogels, CTO, Amazon.com

We may break a system into microservices to scale but we also need to think about resilience: the ability for a system to handle and eventually recover from unexpected conditions.

This needs to consider a stack that includes:

  • People
  • Application
  • Network and Data
  • Infrastructure

And building confidence through testing only takes us so far. Adrian referred to another presentation, by Jesse Robbins, where he talks about creating resilience through destruction.

Firefighters train to build intuition – so they know what to do in the event of a real emergency. In IT, we have the concept of chaos engineering – deliberately injecting failures into an environment:

  • Start small and build confidence:
    • Application level
    • Host failure
    • Resource attacks (CPU, latency…)
    • Network attacks (dependencies, latency…)
    • Region attack
    • Human attack (remove a key resource)
  • Then, build resilient systems:
    • Steady state
    • Hypothesis
    • Design and run an experiment
    • Verify and learn
    • Fix
    • (maybe go back to experiment or to start)
  • And use bulkheads to isolate parts of the system (as in shipping).

Think about:

  • Software:
    • Certificate Expiry
    • Memory leaks
    • Licences
    • Versioning
  • Infrastructure:
    • Redundancy (multi-AZ)
    • Use of managed services
    • Bulkheads
    • Infrastructure as code
  • Application:
    • Timeouts
    • Retries with back-offs (not infinite retries)
    • Circuit breakers
    • Load shedding
    • Exception handling
  • Operations:
    • Monitoring and observability
    • Incident response
    • Measure, measure, measure
    • You build it, you run it
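
The application-level point about retries with back-offs (not infinite retries) can be sketched in a few lines of Python – a minimal illustration, not production code:

```python
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=0.1, cap=2.0):
    """Retry a flaky operation with capped exponential backoff and jitter.

    Gives up (re-raising the last exception) rather than retrying forever.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # bounded retries: surface the failure eventually
            delay = min(cap, base_delay * 2 ** attempt)
            time.sleep(delay * random.random())  # "full jitter" avoids thundering herds
```

The cap and jitter matter as much as the back-off itself: without them, a fleet of clients retrying in lock-step can turn a brief outage into a sustained one.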

The AWS Well-Architected Framework has been developed to help cloud architects build secure, high-performing, resilient and efficient infrastructure for their applications, based on some of these principles.

Adrian then moved on to consider what a steady state looks like:

  • The normal behaviour of the system
  • A business metric (e.g. the “pulse” of Netflix – users repeatedly pressing play when streaming isn’t working)
    • Amazon extra 100ms load time led to 1% drop in sales (Greg Linden)
    • Google extra 500ms of load time led to 20% fewer searches (Marissa Mayer)
    • Yahoo extra 400ms of load time caused 5-9% increase in back clicks (Nicole Sullivan)

He suggests asking questions about “what if?” and following some rules of thumb:

  • Start very small
  • As close as possible to production
  • Minimise the blast radius
  • Have an emergency stop
    • Be careful with state that can’t be rolled back (corrupt or incorrect data)

Use canary deployments with A/B testing (via DNS or similar) to route a small proportion of traffic (say, 1%) to the chaos experiment and the rest (99%) to the normal path.
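
One simple way to get a consistent 1%/99% split is deterministic hash-based bucketing, so each user always lands on the same side of the experiment (the user IDs and threshold here are illustrative):

```python
import hashlib

def bucket(user_id: str, canary_percent: int = 1) -> str:
    """Deterministically assign a user to the chaos experiment or the normal path."""
    h = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if h < canary_percent else "normal"

# Roughly 1% of a large user population should land in the experiment
users = [f"user-{i}" for i in range(10000)]
canary_share = sum(bucket(u) == "canary" for u in users) / len(users)
print(f"{canary_share:.1%} of users routed to the experiment")
```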

Adrian then went on to demonstrate his approach to chaos engineering, including:

  • Fault injection queries for Amazon Aurora (can revert immediately)
    • Crash a master instance
    • Fail a replica
    • Disk failure
    • Disk congestion
  • DDoS yourself
  • Add latency to network
    • tc qdisc add dev eth0 root netem delay 200ms
  • https://github.com/Netflix/SimianArmy
    • Shut down services randomly
    • Slow down performance
    • Check conformity
    • Break an entire region
    • etc.
  • The chaos toolkit
  • Gremlin
    • Destruction as a service!
  • ToxiProxy
    • Sit between components and add “toxics” to test impact of issues
  • kube-monkey project (for Kubernetes)
  • Pumba (for Docker)
  • Thundra (for Lambda)

Use post mortems for correction of errors – the 5 whys. Also, understand that there is no isolated “cause” of an accident.

My notes don’t do Adrian’s talk justice – there’s so much more that I could pick up from re-watching his presentation. Adrian tweeted a link to his slides and code – if you’d like to know more, check them out:

How to build an Alexa Skill in 30 minutes

Spoiler: I didn’t have a working Alexa skill at the end of my 30 minutes… nevertheless, here’s some info to get you started!

Amazon’s view is that technology tries to constrain us. Things got better with mobile and voice is the next step forward. With voice, we can express ourselves without having to understand a user interface [except we do, because we have to know how to issue commands in a format that’s understood – that’s the voice UI!].

I get the point being made – to add an item to a to-do list involves several steps:

  • Find phone
  • Unlock phone
  • Find app
  • Add item
  • etc.

Or, you could just say (for example) “Alexa, ask Ocado to add tuna to my trolley”.

Alexa is a service in the AWS cloud that understands requests and acts upon them. There are two components:

  • Alexa voice service – how a device manufacturer adds Alexa to its products.
  • Alexa Skills Kit – to create skills that make something happen (and there are currently more than 80,000 skills available).

An Alexa-enabled device only needs to know to wake up, then stream some “mumbo jumbo” to the cloud, at which point:

  • Automatic speech recognition (ASR) will translate speech to text
  • Natural language understanding will infer intent (not just text, but understanding…)

Creating a skill requires two parts: the interaction model (the voice user interface) and the back-end service that fulfils requests.

Alexa-hosted skills use Lambda under the hood and creating the skill involves:

  1. Give the skill a name.
  2. Choose the development model.
  3. Choose a hosting method.
  4. Create a skill.
  5. Test in a simulation environment.
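
To give a flavour of the back-end side, here’s a minimal Lambda-style handler returning the Alexa response JSON – the speech text is made up, and a real skill would also dispatch on intent names:

```python
def lambda_handler(event, context):
    """Minimal Alexa skill handler: greet on launch, apologise otherwise."""
    request_type = event["request"]["type"]
    if request_type == "LaunchRequest":
        text = "Hello! Your skill is working."
    else:
        text = "Sorry, I didn't understand that."
    # Shape of the Alexa Skills Kit response envelope
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }

# Simulate the event Alexa sends when the skill is opened
reply = lambda_handler({"request": {"type": "LaunchRequest"}}, None)
print(reply["response"]["outputSpeech"]["text"])
```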

Finally, some more links that may be useful:

In summary

Looking back, the technical sessions made my visit to the AWS Summit worthwhile but overall, I was a little disappointed, as this tweet suggests:

Would I recommend the AWS Summit to others? Maybe. Would I watch the keynote from home? No. Would I try to watch some more technical sessions? Absolutely, if they were of the quality I saw on the day. Would I bother to go to ExCeL with 12000 other delegates herded like cattle? Probably not…

Shop local when buying a new bike

Back in 2013, I bought myself a road bike. It’s a Bianchi Via Nirone 7 C2C and it was the first road bike I’d had since my teenage years when I had a 21-speed “racer” (complete with shifters on the down tube).

My Bianchi has served me well but, after nearly 12000km I’m starting to notice some hairline cracks in the paint, a bit of corrosion on the chainstay – and I recently had to cut out one of the upgrades I’d made as the carbon fibre seat post had bonded itself to the inside of the aluminium alloy seat tube.

I’d been saving up for a new bike for a while (promising myself that I could have a new bike when I lost some weight…) but I decided to retire the Bianchi (or at least just use it for Zwifting) and get something new (maybe I can lose some weight by riding more now I have the new bike).

For a long while, I was tempted by a Canyon Endurace CF SL Disc 8.0 Di2. Canyon make some lovely bikes but they are mail-order only (unless you can visit them in Germany). Not having distributors reduces the price, but it also increases the risk of buying the wrong size, etc. Added to which, recent experience (buying a frame from Planet X for my son) showed me that sometimes you get what you pay for.

I also feel guilty every time I shop at Wiggle – we’ll miss our local bike shops (LBS) when they are gone and I’ve relied on a few for parts at short notice recently (including Corley Cycles and Chaineys in Milton Keynes). But, just like buying from Amazon instead of a high street store, sometimes the economics mean it just makes sense. Even so, with a new bike purchase, I wasn’t entirely comfortable buying online.

I looked at some of the other mainstream brands too (how about a Trek Domane?). But what about the price difference?

Well, there were a few things to take into account there:

  • Online sizing tools are good, but not perfect and the Canyon would need a bike fit before I could be sure I was ordering the right size. Corley Cycles included not only the sizing fit but also an advanced bike fit with the new bike.
  • Then, membership of my local cycling club got me a further discount (10%).
  • At this point, we’re getting close to pretty much the same price.
  • Chuck in some bottles, cages, and a lot of advice – plus I’m helping to keep my LBS in business and I decided that I’d rather have the “purchase from a shop” experience.

So, I’m now the proud owner of the new Specialized Roubaix Comp (2020 edition). Sure, the lightweight endurance bike with electronic shifting became a lightweight endurance bike with mechanical shifting and front suspension instead but my conscience is clear – and it is pretty damned awesome.