General

Saving money in the cloud?

MoneyCloud.png

One of the cloud’s big selling points is the promise of lower costs, but more often than not customers who move servers to the cloud end up paying more for the same workload.  Have we all been duped?  Is the promise a lie?  Over the past several years the ACE team (the group of experts behind the AzureFieldNotes blog) has helped a number of customers on their Azure journey, many of whom were motivated by the economic benefits of moving to the cloud.  Yet few take the time to truly understand the business value as it applies to their unique technology estate, or to develop plans to achieve and measure the benefits.  Most simply assume that running workloads in the cloud will result in lower costs - the more they move, the more they will save.  As a result, management establishes a "Cloud First" initiative and IT scrambles to find workloads that are low-risk, low-complexity candidates.  Inevitably, these end up being existing virtual machines or physical servers which can be easily migrated to Azure.  And here is where the problems begin.

When customers view Azure as simply another datacenter (which just happens to be in the cloud), they apply their existing datacenter thinking to Azure workloads and negate any cost benefit.  To realize the savings from cloud computing, customers need to shift into consumption-based models, and this goes far beyond simply migrating virtual machines to Azure.  When server instances are deployed just like those in the old datacenter and left running 24x7, the same workload will most likely end up costing more in Azure.  In addition, if instances aren't decommissioned when no longer needed, the result is sprawl, environment complexity, and costs that quickly get out of control.

Taking it a step further, customers must also consider which services should continue to be built and maintained in-house, and which should simply be consumed as a service.  These decisions will shape the technical cloud foundations for the enterprise.  Unfortunately, many of these decisions are made based on early applications deployed to Azure.  We call this the "first mover" issue.  Decisions made to support the first app in the cloud may not be the right decisions for subsequent apps or for the enterprise as a whole, leading to redundant and perhaps incompatible architecture, poor performance, higher complexity, and ultimately higher cost.  Take identity as an example:  existing identity solutions deployed in-house are often sacred cows because of the historical investment and specialized skills required to maintain the platform.  Previously, these investments were necessary because the only way to deliver this function was to build your own.  But (with limited exception) identity doesn't differentiate your core business and customers don't pay more or buy more product because of your beloved identity solution.  With the introduction of cloud-based identity, such as Azure Active Directory, companies can now choose to consume identity as a service, eliminate the complexity and specialized skills required to support in-house solutions, and focus talent and resources on higher value services which can truly differentiate the business.

Breaking it down, there are a handful of critical elements that must be addressed for any customer to realize value in the cloud:

  • Business Case:  understand what is valuable to your business, how you measure those things, and how you will achieve the value.  The answers to these questions will be different for every customer, but the need to answer them is universal.  Assuming the cloud will bring value - whether you view value as speed to market, cost reduction, evergreen, simplification, etc. - without understanding how you achieve and measure that goal is a recipe for failure.
  • Cloud Foundations:  infrastructure components that will be shared across all services need to be designed for the enterprise, not driven by the first mover.  It's not unusual for Azure environments to quickly evolve from early Proof of Concept deployments to running production workloads, even though the foundations (such as subscription model, network, storage, compute, backup, security, identity, etc.) were never designed for production - you need to spend the time early to get these right, or your ability to realize results from Azure will be negatively impacted.
  • Ruthless Automation:  standardization and automation underpin virtually every element of the cloud's value proposition, and you must embrace them to realize maximum benefit from the cloud.  This goes beyond systems admins having scripts to automate build processes (although that is a start).  It means build and configuration become part of the software development practice, including version control, testing, and design patterns.  In other words, you write code to provision and manage cloud resources, and the underlying infrastructure is treated just like software:  infrastructure as code (see the sketch after this list).
  • Operating Model:  workloads running in the cloud are different from those in your datacenter, and supporting these instances will require changes to the traditional operating model.  As you move higher up the as-a-Service stack (IaaS -> PaaS -> SaaS -> BPaaS, etc.), the management layer shifts more and more to the cloud provider.  Introduce DevOps into the equation and the impact on traditional operating models is even greater.  When there is an issue, how is the root cause determined when no single party is responsible for the full stack?  Who is responsible for service resolution, and how will hand-offs work between the cloud provider and your in-house support teams?  What tools are involved, what skills are required, and how is information tracked and communicated?  In the end, much of the savings from cloud can come from transformation of the operating model.
  • Governance and Controls:  if you thought keeping a handle on systems running in your datacenter was a challenge, the cloud can make it exponentially worse.  Self-service and near-instantaneous access to resources are a perfect storm for server sprawl without proper governance and controls.  In addition, since cloud resources aren't sitting within the datacenter where IT has full control of the entire stack, how can you be sure data is secure, systems are protected, and the company is not exposed to regulatory or legal risk?
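
To make the infrastructure-as-code point concrete, here is a minimal sketch assuming the AzureRM PowerShell module; the resource group, template, and parameter file names are placeholders:

```powershell
# Provision from code: the template and parameter files live in version
# control, and every environment is built by the same repeatable command.
New-AzureRmResourceGroup -Name "rg-app-prod" -Location "eastus"

New-AzureRmResourceGroupDeployment -ResourceGroupName "rg-app-prod" `
    -TemplateFile ".\azuredeploy.json" `
    -TemplateParameterFile ".\azuredeploy.parameters.prod.json"
```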

In future posts I'll cover each one of these in more detail to help frame how you can maximize the value of Azure (and how Azure Stack can play an important role) in your cloud journey.


Azure Subscription commercials and layout best practices - From the Vault: Part 1

vault.jpg

First, what's an Azure subscription, really?

An Azure subscription is the base container into which related resources (those with similar business and/or technical needs) are grouped together for usage and billing purposes. The key thing about a subscription is that it's a billing boundary in Azure: Microsoft rolls up all the resources in a subscription and sends a single bill for the entire subscription. There is no ability to define separate payment information below the subscription level. Additionally, a subscription is an administration boundary. A subscription co-admin can access any resource within the subscription, and can then delegate that access through role-based access control as needed.

Now, a little history. Originally, a subscription was the only role-based access control boundary: either everyone was an administrator of everything in the subscription, or they were a reader, or they had no access at all. This led to a school of thought where many subscriptions were created, one for each working group or access boundary. This superuser-or-nothing access was one of the consistent early failures of the old Azure ASM console approach in major enterprises. Having to receive 100 bills from 100 subscriptions and reconcile them, as well as having all-or-nothing access, didn't sit well in the enterprise customer market. It didn't help that separating resources between dev and production (and the corresponding access rights) meant separating subscriptions. Further, separate subscriptions meant new networking infrastructure and shared services setup - each time. This didn't seem right. People started placing brokers and cloud management platforms in front of Azure, looking for ways to automate subscription creation, tear-down, and bill reconciliation. This waters the platform down to a least common denominator, and doesn't fit well with PaaS.

Fast forward a few years: Azure is picking up steam, and also has a new look and API. In the ARM console (the new portal), this was solved. Azure introduced granular RBAC into not just the console, but the underlying ARM API. The model chosen allows delegation of individual rights and permissions down to the lowest node in the tree (any ARM object) while still allowing use of the full API. Further, you can apply policy at any level of the ARM API (we'll cover this in a future post). This switch changes our guidance dramatically, and in fact makes it possible for us to recommend a fewer-subscriptions policy.

Less is more

Overall, the number of subscriptions should be kept to a minimum. In many cases a single subscription is adequate; add more only when it makes sense. Fewer subscriptions mean less complexity and overhead.

Several ARM features facilitate a reduced number of subscriptions, especially Role-Based Access Control (RBAC), tagging, VNETs/VNET peering, NSGs, ARM Policy, and Azure Active Directory (AAD).

  • RBAC allows creation of different access control security models within one subscription (see the sketch after this list).
  • Tagging permits billing to be segregated as needed within a single subscription.
  • VNETs and NSGs allow you to segregate traffic within a subscription.
  • VNET peering allows you to connect those segregated VNETs as needed to a central VNET or an ExpressRoute connection.
  • ARM Policy allows you to set up restrictions for naming conventions, resource and region usage, images, etc.
  • AAD can be used to grant permissions within the RBAC framework.
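
As an illustration, here is a minimal sketch (assuming the AzureRM PowerShell module; the user and resource group names are hypothetical) of scoping rights with RBAC instead of handing out subscription-wide access:

```powershell
# Grant Contributor rights on a single resource group rather than making
# the user a co-administrator of the whole subscription.
New-AzureRmRoleAssignment -SignInName "dev.lead@contoso.com" `
    -RoleDefinitionName "Contributor" `
    -ResourceGroupName "rg-app-dev"
```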

With the new Azure Resource Manager model, fewer subscriptions are needed, resulting in fewer billing invoices, fewer top-level administrators, and lower overall complexity. Less is better.

Best Practices

What you should do:

  • Keep subscriptions to a minimum to reduce complexity
  • Segment bills by leveraging Tagging instead of creating more subscriptions (See our tagging post here)
  • Use Resource Groups as an application lifecycle container
  • Use Resource Groups to define security boundaries
  • Use RBAC to grant access and delegate administration

What you shouldn't do:

  • Do not create a subscription for each environment (Dev/Test/Prod) to protect quota and enforce security.
    • Instead, leverage the Azure DevTest Labs capability to ring-fence pools of IaaS capacity for specific dev teams/users if necessary. (Understand the limitations of DevTest Labs in its current state.)
    • Consider whether enforcing quota is necessary. Using ARM Policy (see the sketch after this list), or even an advanced billing system, can have the same outcome without the complexity and business pain.
    • Resource limits can be modified by Microsoft up to a point, but most limits are well beyond what a typical client would require. For example, one current limit is 10,000 VMs per region per subscription. At present, storage accounts are the limit most customers hit first, at 250 per subscription. Spend some time on careful management of that resource, but don't go crazy locking it down; 250 is still a lot, and the limit may change someday.
  • Do not create multiple subscriptions just to have separate bills for each department.
    • Tagging can be used to separate out cost
    • Separate subscriptions introduce the need for a second set of networking infrastructure, or cross-subscription VNETs through site-to-site VPNs. While possible, this does increase complexity.
  • Do not use a subscription as the primary method of delegating administration.
    • Subscriptions should be a very high-level administrative container, but that’s it (e.g. one subscription per IT department in a company with multiple IT departments might make sense).
  • Avoid spanning a single application across multiple subscriptions, even one with multiple environments, because doing so reduces your ability to view and manage all the related items from one place and on one bill.
    • If you must have multiple subscriptions, don't split by dev/test/prod; instead split by groups of apps, with an entire app and its related apps contained within a single subscription.
  • Do not proactively split subscriptions based on "eventually" needing more resources.
    • Resource limits are always increasing, so by the time you get close to a limit it will likely have been raised.
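
As referenced above, here is a minimal sketch of constraining usage with ARM Policy instead of extra subscriptions, assuming the AzureRM PowerShell module; the policy name, regions, and subscription ID are placeholders:

```powershell
# Deny any resource deployed outside the approved regions, assigned at
# subscription scope.
$policyJson = @'
{
  "if": {
    "not": { "field": "location", "in": [ "eastus", "westus" ] }
  },
  "then": { "effect": "deny" }
}
'@

$definition = New-AzureRmPolicyDefinition -Name "allowed-regions" `
    -Description "Deny resources outside approved regions" `
    -Policy $policyJson

New-AzureRmPolicyAssignment -Name "allowed-regions-assignment" `
    -PolicyDefinition $definition `
    -Scope "/subscriptions/<subscription-id>"
```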

Billing

A single bill per client is best; leverage tagging to segment the bill.

There is a 1:1 relationship between an Azure subscription and its corresponding bill, so knowing how IT-related expenses are handled internally will be a major factor to consider. For most clients, the IT department handles the bill from a central budget, while other clients require the bill to be broken down in order to charge costs back to each business unit. Utilize tags to achieve the desired result.
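
For example, a minimal sketch of tagging for charge-back, assuming a recent AzureRM PowerShell module (where -Tag accepts a hashtable; names and values are placeholders):

```powershell
# Stamp a resource group with its cost center, then find everything
# billed to that cost center when reconciling the invoice.
Set-AzureRmResourceGroup -Name "rg-finance-app" `
    -Tag @{ CostCenter = "CC-1234"; Environment = "Prod" }

Get-AzureRmResourceGroup -Tag @{ CostCenter = "CC-1234" }
```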

Be mindful that only the Account administrator can manage billing.

Plans and Offers/Commercials

If you have an Enterprise Agreement with Microsoft, an EA subscription will be the best choice.  

There are many different Azure subscription offer types available (Enterprise, MSDN, Pay-As-You-Go, Free, Sponsorship, etc.). The recommended subscription type applicable to most customers is Enterprise.

Be aware that some offer types do not have access to certain services. For example, an EA tenant cannot purchase AAD Basic or Premium through the Azure portal; it must be purchased separately through the EA portal (because it is a per-user service). The same is also true of certain third-party services available only through specific offer types. Before moving forward with a subscription, ensure the commercial construct of the subscription matches the intended use. Sponsorship subscriptions should not be used as the main subscription, as you'll need to migrate the resources to another subscription at some point. This is because Sponsorship subscriptions usually have a hard usage cap or time limit; when they reach that cap/limit, they revert to Pay-As-You-Go with no discounts.

Resource Limits

Do not create additional subscriptions in anticipation of growing beyond current resource limits.

Resource limits in Azure change frequently.  Do not use current limits and your anticipated growth to determine the number of subscriptions to deploy. Always have the mindset of having as few subscriptions as possible (ideally one) for as long as possible. Defer adding complexity that may never be needed, or won't be needed for a while.

There are hard limits on how many of a given resource can exist within a subscription. These limits can be raised by Microsoft to a point, but once that point is reached additional subscriptions are required. It is important to note that ARM limits supersede the old subscription limits; for example, ARM does not cap total cores, but there is a limit of 10,000 VMs per region per subscription. Be aware that many quotas for resources under ARM are per-region, not per-subscription (as Service Management quotas are).

It is important to keep current on Azure Subscription Limits, and expect them to change frequently. Azure Subscription Limit information is available here.
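
Rather than guessing, you can check current consumption against the regional quotas; a quick sketch assuming the AzureRM PowerShell module (the region is a placeholder):

```powershell
# Compare current compute usage to the regional quota before concluding
# that another subscription is needed.
Get-AzureRmVMUsage -Location "eastus" |
    Select-Object @{ n = 'Quota'; e = { $_.Name.LocalizedValue } },
                  CurrentValue, Limit
```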

When you have to separate into multiple subscriptions

Each application should be wholly contained within a single subscription whenever possible, e.g. the Dev/Test/Production environments for an app, across all regions, should exist in a single subscription. Next, try to place adjacent applications in a single subscription, particularly if they are managed and/or billed by the same people. If you split an app between subscriptions, you reduce your ability to view and manage all the related items from one place and see them on one bill. We have seen cases where clients have separated dev/test/production into separate subscriptions, which adds complexity without value. We don't recommend this now that RBAC and tagging are available.


Again, the goal is as few subscriptions as possible. Only add new subscriptions as you need them. Do not proactively split subscriptions based on "eventually" needing more resources. Odds are that by the time "eventually" comes, the limits will have increased.


A Quick note on Azure Administrators and Co-Administrators

They still exist. Co-Administrators (limit 200 per subscription for ASM; no limit for ARM) have the same access privileges as the Service Administrator, but can’t change the association of subscriptions to Azure directories. Minimize the number of co-admins as much as possible, and use RBAC to grant access on the principle of least privilege. Think of co-admins as the domain admins of your Azure account: they should be restricted to the specific tasks that need them.

https://azure.microsoft.com/en-us/documentation/articles/billing-add-change-azure-subscription-administrator/
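
A quick way to see who holds broad rights today is a sketch like the following, assuming the AzureRM PowerShell module (the -IncludeClassicAdministrators switch lists classic admins alongside RBAC assignments; the exact classic role labels vary by module version):

```powershell
# List broadly privileged principals so they can be pared back to
# narrowly scoped RBAC roles.
Get-AzureRmRoleAssignment -IncludeClassicAdministrators |
    Where-Object { $_.RoleDefinitionName -in "Owner", "Contributor" -or
                   $_.RoleDefinitionName -like "*Administrator*" } |
    Select-Object DisplayName, SignInName, RoleDefinitionName, Scope
```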

A Final note on Azure Stack and Azure Pack

Windows Azure Pack uses the ASM/Co-Administrator model, and since you own the whole thing, it's not that detrimental to give each group its own subscription.

Azure Stack uses the ARM model. While you could give everyone their own subscription, we don't see a reason to deviate from the guidance here within a business unit. Azure Stack subscriptions still need networking, etc., just like an Azure subscription. There is a concept of delegated subscriptions (basically nesting) in Stack, but since it doesn't currently translate to Azure, and because RBAC/Resource Group-based rights management works well, we simply don't see the need. Instead, use multiple subscriptions in Stack to peel off resources for groups you truly don't trust, like partners and third-party organizations.

This is the first in a series of blog posts discussing the subscription and resource group layout we recommend for large enterprise customers. While smaller customers may not hit some of the limits discussed here, the guidance is equally relevant and applicable.

While this content is primarily written by the author, some guidance is the work of a collection of people within Avanade's Azure Cloud Enablement group.

Automation Frameworks & Threshold for Creation

gears-930x620cut.jpg

Introduction

Years ago I was walking through an office and saw an administrator logging onto a system, checking the C: drive, updating a spreadsheet, logging off, and then repeating the whole routine on the next system.  In pure disbelief, I stood and watched for a minute, then asked, "What are you doing?"  The response was the one I feared: "One of the monthly maintenance tasks, checking C: drive space."  I calmly asked him to please stop and went back to my desk to write a script.

As administrators and consultants we constantly have to evaluate when to automate and when to just do the task.  There are many variables: time to automate, level of production access or change required (and security's opinion about this), how long the task takes now, who's paying, how long you will have to keep doing it, and what else you have to do.

Automation Frameworks

While there are tasks we can automate, and this programmer takes task automation to a new level, including scripts for emailing his wife and making coffee (if I may quote Wayne and Garth, "we're not worthy, we're not worthy"), there is another side: automation frameworks built for multiple scenarios and reuse.  The PowerShell Deployment Toolkit (PDT) was an amazing framework for deploying System Center.  It took deployment from days and weeks of installation and months of fixing deployments down to minutes, hours, and days, while still flexing to a variety of deployment patterns.

However, there was a learning curve: a list of prerequisites (some documented, some not), a few tricks, and custom code you sometimes had to dig through and reverse engineer.  This PowerShell framework could have deployed anything; you just had to write your own XML manifest for what you wanted to deploy, but that would take a lot of time learning the XML syntax the deployment framework understood, testing the deployment, working through 'undocumented features', and so on.  I actually extended the existing scripts for our own needs, but now a grand total of one person knows what those extra XML properties mean.

New Thresholds

Cloud technologies like Azure are changing at a pace unseen in the enterprise space.  VMs (IaaS compute) have shifted through Classic and ARM deployment models, versions V1 and V2, not to mention API versions.  Extensions keep being added, and DSC brings another layer to the table.  These layers are all shifting with and without each other.  Recently the VM.OSProfile property changed on me from virtualharddisks to VHDs, which broke a script.

When we consider developing frameworks like PDT in the cloud space with tools like PowerShell and RGTs, we have to weigh this threshold against a new set of values.  Is a script with 30 parameters at the top and some blocks of code you have to run manually enough, as opposed to building a script with if/then/else, switch, and validation logic around the original code, where the logic flow is more code than the actual deployment?  The next level is script that generates the RGT JSON or PowerShell syntax dynamically.  If this complex code had been using the property VM.OSProfile.virtualharddisks, how would the script have responded, and would the time to develop (and in the future maintain) this framework, around what is already a fairly streamlined deployment, be worth trading for the time to deploy manually?
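
One way to hedge against this churn is to make scripts fail loudly when a property they depend on disappears, instead of misbehaving silently.  A minimal sketch, assuming the AzureRM PowerShell module and hypothetical resource names:

```powershell
# Verify an expected property still exists before using it, so an API
# rename surfaces as a clear error instead of a null value downstream.
$vm = Get-AzureRmVM -ResourceGroupName "rg-demo" -Name "vm01"

if (-not $vm.StorageProfile.PSObject.Properties['OsDisk']) {
    throw "Expected property 'OsDisk' not found - the API shape may have changed."
}
$osDisk = $vm.StorageProfile.OsDisk
```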

Azure Resource Group Templates are a great example: the language is JSON and fairly straightforward at a glance, with Parameters, Variables, Resources, and Outputs sections.  With Azure's rate of change, writing code around this could take weeks and months, as opposed to managing several RGTs as the deployments are needed.  DevOps methodologies are starting to introduce this level of automation into code development, and infrastructure as code is rapidly changing what we can tear down and redeploy, and how quickly.
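
For reference, a minimal sketch of that skeleton (the schema URL shown is the 2015-01-01 deployment template schema; resources for an actual deployment go in the "resources" array):

```powershell
# The bare skeleton every Resource Group Template shares.  Saved to disk,
# it deploys with: New-AzureRmResourceGroupDeployment -TemplateFile .\azuredeploy.json
$template = @'
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": { },
  "variables": { },
  "resources": [ ],
  "outputs": { }
}
'@
Set-Content -Path .\azuredeploy.json -Value $template
```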

Investing Time

If you do want to invest time, it could be spent on scripts and policy that reduce cloud costs, such as turning off VMs over the weekend, or catching machines that aren't tagged properly and were possibly deployed for testing and never deleted; that could save more dollars than your time spent executing a complex deployment script every month (a sketch follows below).  Perhaps write PowerShell modules to help with things like reporting or authentication.  Or maybe it's worth just reading about what's new instead; I found out about Azure Resource Manager policies months after they were released.  Keeping up is almost becoming a full-time job.
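
A sketch of that kind of cleanup, assuming the AzureRM PowerShell module (review the list before enabling the stop):

```powershell
# Find VMs with no tags at all - often forgotten test deployments - and
# optionally deallocate them to stop the meter.
$untagged = Get-AzureRmVM | Where-Object { -not $_.Tags -or $_.Tags.Count -eq 0 }

foreach ($vm in $untagged) {
    Write-Output "Untagged VM: $($vm.Name) (resource group $($vm.ResourceGroupName))"
    # Uncomment after review; -Force skips the confirmation prompt.
    # Stop-AzureRmVM -ResourceGroupName $vm.ResourceGroupName -Name $vm.Name -Force
}
```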

Summary

This article doesn't have the answers; it's meant to add new perspective and raise questions about what you automate and how far you push the modularization and reuse of code.

Disk to LUN Mapping for a Windows VM using Microsoft Storage Spaces in Azure

disks.jpg

When multiple data disks are attached to a Windows VM using Microsoft Storage Spaces in Azure, it can be difficult to identify which LUN a disk belongs to. This is especially troublesome when there is a need to delete, clone, or migrate specific disks from one VM to another VM in Azure. A scenario where this may come in handy is if you need to migrate only 1 or 2 disks associated with Microsoft Storage Spaces while leaving other disks in place on the VM.

When a disk is not associated with a storage pool, the disk’s LUN can easily be seen in the Location property on the General tab of the disk’s properties in the “Disk Management” utility.

When a disk is associated with a storage pool, the disk will not show up in the “Disk Management” utility, so its LUN is not easily viewable.

The following can help identify the Disk to LUN Mapping on your Windows VM in Azure when Microsoft Storage Spaces are used.

We will collect some information on the disks displayed in Device Manager and use it to identify the LUNs of the disks you would like to delete, clone, or migrate. You will need to do these steps for each of the “Microsoft Storage Space Device” objects under “Disk drives” in the Device Manager utility.

  • Physical Device Object Name – Can be used as the unique identifier for the “Microsoft Storage Space Device”.
  • Volume and Drive Letter(s) – Will show what volumes are associated with the storage space device.
  • Volume Capacity – This is optional but helpful to see.
  • Power relations – Uniquely identifies each “Microsoft Virtual Disk” used by the “Microsoft Storage Space Device”.
  • LUN ID(s) – Derived by manually converting the last 6 characters of each of the “Power relations” values (see above) from hex to decimal.

1. Open Device Manager

a. Remote Desktop into your Azure VM that has the disks you would like to delete, clone, or migrate.

b. Right-Click Start Menu –> Select “Device Manager”

2. Get the properties of the “Microsoft Storage Space Device” object under “Disk drives”.

Note: This will need to be done for each “Microsoft Storage Space Device” object.

a. Right Click on the “Microsoft Storage Space Device” object and select properties.

01-DeviceManager


3. Get the “Microsoft Storage Space Device” object’s “Physical Device Object name” property.

a. Select the “Details” tab of the “Microsoft Storage Space Device” properties window.

b. Select the “Physical Device Object name” property in the “Property” field dropdown menu.

c. Write down the “Physical Device Object Name” value for this “Microsoft Storage Space Device”.

Sample Value: “\Device\Space1”

03-DeviceManagerDevicePropertiesDetails-Physical Device Object name

4. Get the Volumes associated with the “Microsoft Storage Space Device”.

a. Select the “Volumes” tab of the “Microsoft Storage Space Device” properties window.

b. Click the “Populate” button to retrieve all volume information for this “Microsoft Storage Space Device”.

02a-DeviceManagerDevicePropertiesVolumes-Unpopulated

c. For each volume in the “Microsoft Storage Space Device” make note of the “Volume” and “Capacity” fields.

Sample Value “Volume”: “volume-on-pool-a (M:)”

Sample Value “Capacity”: “2016MB”

02b-DeviceManagerDevicePropertiesVolumes-Populated


5. Get the “Power relations” properties for the “Microsoft Storage Space Device”.

a. Select the “Details” tab of the “Microsoft Storage Space Device” properties window.

b. Select the “Power relations” property in the “Property” field dropdown menu.

c. Write down the “Power relations” value(s) for this “Microsoft Storage Space Device”.

Sample “Power relations” Value(s):

SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\000001

SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\000000


04-DeviceManagerDevicePropertiesDetails-Power relations

6. Identify the LUN information in the “Power relations” Value string(s) and convert from HEX to Decimal.

Note: The LUN information is stored in hexadecimal format as the last 6 characters on the “Power relations” Value string(s). At this time this will need to be manually identified and converted to a decimal format for LUN identification. See this sample data below:

Sample “Power relations” value: “SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\00001c”

Sample hexadecimal value (last six characters from above): “00001c”

Sample LUN ID (converted from hex): “28”

a. Get the last 6 characters of the “Power relations” string(s).

b. Convert those last 6 characters from hexadecimal to decimal using Calc.exe (in Programmer mode) or the PowerShell sketch below. This is the LUN ID.

c. Note down the LUN ID with the other collected information.
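
The conversion is also a one-liner in PowerShell, shown here with the sample value from above:

```powershell
# Derive the LUN ID from the tail of a "Power relations" value.
$powerRelation = 'SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\00001c'
$hex = $powerRelation.Substring($powerRelation.Length - 6)   # "00001c"
$lun = [Convert]::ToInt32($hex, 16)                          # 28
Write-Output "LUN ID: $lun"
```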


Below are some example tables that were created using the method above. These can be used to aggregate the data when multiple disks are involved.


Volumes Table

| Microsoft Storage Space Device (Physical Device Object Name) | Drive Letter | Volume Name | Capacity |
| --- | --- | --- | --- |
| \Device\Space1 | M: | volume-on-pool-a (M:) | 2016 MB |
| \Device\Space2 | N: | volume-on-pool-b (N:) | 2016 MB |


Power Relations Table (LUNs)

| Microsoft Storage Space Device (Physical Device Object Name) | Power relations | HEX (last 6 of “Power relations”) | LUN (converted from hex) |
| --- | --- | --- | --- |
| \Device\Space1 | SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\000001 | 000001 | 1 |
| \Device\Space1 | SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\000000 | 000000 | 0 |
| \Device\Space2 | SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\000003 | 000003 | 3 |
| \Device\Space2 | SCSI\Disk&Ven_Msft&Prod_Virtual_Disk\000002 | 000002 | 2 |