Tuesday 30 June 2015

Solution: The processing of Group Policy failed. Windows could not authenticate to the Active Directory service on a domain controller.

The processing of Group Policy failed. Windows could not authenticate to the Active Directory service on a domain controller. (LDAP Bind function call failed). Look in the details tab for error code and description.

Looking at the detailed description in the Windows Event Viewer or the "Alert Context" tab in SCOM I found the following:

User: domain\joebloggs

Event Data:
<DataItem type="System.XmlData" time="2011-01-15T08:00:01.4111071+02:00" sourceHealthServiceId="353-3533535-4353535353">
  <EventData>
    <Data Name="SupportInfo1">1</Data>
    <Data Name="SupportInfo2">5111</Data>
    <Data Name="ProcessingMode">0</Data>
    <Data Name="ProcessingTimeInMilliseconds">3422</Data>
    <Data Name="ErrorCode">49</Data>
    <Data Name="ErrorDescription">Invalid Credentials</Data>
    <Data Name="DCName" />
  </EventData>
</DataItem>

From this it appears that a user (joebloggs) who is still logged into this computer (albeit with a disconnected RDP session) has had their password expire. This can be confirmed with the qwinsta command:

C:\Users\adminuser>qwinsta
 SESSIONNAME       USERNAME                 ID  STATE   TYPE        DEVICE
 services                                    0  Disc
                   joebloggs                 1  Disc
                   adminuser                 2  Active

So we simply use the rwinsta command to boot out the appropriate session ID e.g.:

rwinsta 1
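
If this crops up regularly, a minimal PowerShell sketch along these lines could automate the lookup and logoff (it assumes qwinsta/rwinsta are on the PATH; the user name is hypothetical):

$user = "joebloggs"
qwinsta | Select-String "\b$user\b" | ForEach-Object {
    # qwinsta output is fixed-width; the session ID is the first purely numeric field
    $id = ($_ -split '\s+') | Where-Object { $_ -match '^\d+$' } | Select-Object -First 1
    if ($id) { rwinsta $id }
}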

For more information on this error please refer to:
https://technet.microsoft.com/en-us/library/cc727283.aspx



Thursday 18 June 2015

Enabling Kerberos authentication for MAPI clients with Exchange 2013

In my experience the security behind authentication mechanisms in Exchange is often overlooked - I am not going to go on a big rant about what Kerberos can do and why it should be used (you can find that out pretty easily with a quick search, e.g. "ntlm vs kerberos"). In this post I will instead highlight the main steps to get Kerberos up and running in your Exchange environment.

It is worth noting that by default Outlook clients are set to negotiate the most secure authentication mechanism available - so once Kerberos has been set up on the server side the client configuration should pick up the changes.

1. Create a computer account called "ExchangeASA" which will be used as an 'Alternate Service Account' - this account is necessary because, when working with CAS arrays (in Exchange 2010) or a load-balanced set of Client Access servers, the service ticket contains the target server name - which will be the load balancer namespace (e.g. cas-cluster.domain.com) - when in actuality the server handling the request will be different (e.g. cas01.domain.com or cas02.domain.com).

The following from TechNet goes into a little more detail on this process:

a) Client contacts the KDC and obtains a TGT.
b) Client sends a request to the TGS based on the TGT, the authenticator, and the name of the target server. In this case, the name of the target server is outlook.contoso.com, and not the FQDN of the Client Access server.
c) Client retrieves the service ticket and sends it to outlook.contoso.com, which happens to get directed by the load balancer to cas1.contoso.com. cas1.contoso.com fails to decrypt the service ticket because its name does not match outlook.contoso.com, the SPN is not associated with cas1.contoso.com, or multiple objects within Active Directory have an SPN associated with the service and outlook.contoso.com.
d) Client fails to obtain an access token using Kerberos authentication.


2. Run the following PowerShell script to set up the relevant SPNs - replacing the relevant hostnames where necessary:

$acct="domain.com\ExchangeASA`$"

$spnlist=@("http/owa.domain.com","http/autodiscover.domain.com","http/failback.domain.com","exchangeMDB/loadbalancer-namespace.domain.com","exchangeRFR/loadbalancer-namespace.domain.com","exchangeAB/loadbalancer-namespace.domain.com")

foreach ($spn in $spnlist) {setspn -F -S $spn $acct}
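
To sanity check the registrations afterwards, you can list the SPNs now associated with the account (account name follows the example above):

setspn -L domain.com\ExchangeASA$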

3. Within the Exchange 2013 scripts folder you will find a script entitled RollAlternateServiceAccountPassword.ps1 - this will register (or update) the ASA credentials associated with the CAS load balancing set (or CAS array if you are running 2010):

.\RollAlternateServiceAccountPassword.ps1 -ToArrayMembers loadbalancing-namespace.domain.com -GenerateNewPasswordFor Contoso\ExchangeASA$ -Verbose

That's it! Just note that if you ever add any additional CAS nodes to the load balancing set / array you will have to re-run this script - as Kerberos will otherwise fail!

Sources:
http://anexinetisg.blogspot.co.uk/2013/07/kerberosslaying-dragon-in-exchange.html
http://blogs.technet.com/b/exchange/archive/2011/04/15/recommendation-enabling-kerberos-authentication-for-mapi-clients.aspx


Deleting a specific message from an (or all) Exchange account(s)

In order to perform this operation we need to use the "Search-Mailbox" cmdlet with the -DeleteContent switch - which is not actually available by default! So we must give the user executing the following commands the "Mailbox Import Export" role:

We should firstly create a dedicated security group for the users who are likely to run this command:

New-ManagementRoleAssignment -Name "Mailbox Import Export Admins" -SecurityGroup "Mailbox Admins" -Role "Mailbox Import Export"

or specify an individual user:

New-ManagementRoleAssignment -Name "Mailbox Import Export Admins" -User "Joe Bloggs" -Role "Mailbox Import Export"
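
Note that you may need to restart your Exchange Management Shell session before the newly assigned cmdlets appear. To double check the assignment took effect, a quick query (using the role name above):

Get-ManagementRoleAssignment -Role "Mailbox Import Export" | Format-Table Name,RoleAssigneeName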

Since this operation is potentially very dangerous we can issue the following command to firstly verify what *would* be deleted:

Get-Mailbox -Server  "MailboxServer01" | Search-Mailbox -SearchQuery 'Subject:"Offensive Email" and Body:"Joe Bloggs"' -targetmailbox "AdminUser" -targetfolder "Inbox" -logonly -loglevel full

The above example searches all mailboxes on all of the databases on MailboxServer01 for an email with a specific subject and body and sends the results to the 'AdminUser' mailbox's Inbox folder.

If we wanted to perform a search query on an individual user we could alternatively run:

Search-Mailbox "joebloggs@domain.com" -SearchQuery 'Subject:"Offensive Email" and Body:"Joe Bloggs"' -targetmailbox "AdminUser" -targetfolder "Inbox" -logonly -loglevel full

Upon verification of the prior commands we can perform the deletion operation on all mailboxes on the server:

Get-Mailbox -Server  "MailboxServer01" | Search-Mailbox -SearchQuery 'Subject:"Offensive Email" and Body:"Joe Bloggs"' -deletecontent

or on a specific user:

Search-Mailbox "joebloggs@domain.com" -SearchQuery 'Subject:"Offensive Email" and Body:"Joe Bloggs"' -deletecontent

Wednesday 17 June 2015

Understanding and managing cached credentials with Windows 7/8/2008/2012

Windows will often cache credentials for user logins and services such as Remote Desktop Services. If you've ever wondered how to manage all of these credentials, there is a utility called cmdkey (available from Server 2003 onwards) that can help you achieve this.

Passwords stored within the cache are encrypted - although some are easier to decrypt than others.

Typically there are three main switches:

cmdkey /list
That will display a list of all cached credentials

cmdkey /add:targetname /user:username /pass:password
That will allow you to add credentials for a specific target (e.g. remote server you are rdp'ing to)

cmdkey /delete:targetname
That will allow you to delete a specific target's credentials from the cache.
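
For example, to pre-seed the credentials used for an RDP connection (RDP targets are stored under a TERMSRV/ prefix; the server and account names here are hypothetical):

cmdkey /add:TERMSRV/server01 /user:domain\joebloggs /pass:P@ssw0rd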

Example output of 'cmdkey /list':

Currently stored credentials:

    Target: LegacyGeneric:target=Microsoft_OC1:uri=user@domain.com:specific
EWS:1
    Type: Generic
    User: domain\user

    Target: LegacyGeneric:target=Microsoft_OC1:uri=user@domain.com:specific
OCS:1
    Type: Generic
    User: user@domain.com

There are several different credential types:

- Generic Password: Used to store user specific credentials (e.g. Outlook, Lync etc.)
- Domain Password: Used for network authentication (e.g. Outlook, RDP etc.) - more secure, as only the LSASS.exe process can encrypt / decrypt the passwords.
- Domain Visible Password: Similar to a generic password, although the username is not encrypted - used for services such as .NET Passport.
- Certificates

These credentials are stored in the following paths (in Windows 7 and above):

%USERPROFILE%\AppData\Roaming\Microsoft\Credentials

and

%USERPROFILE%\AppData\Local\Microsoft\Credentials

** Although you will need to ensure "Hide protected operating system files" is unticked before you will see anything in this directory! **

Sources: http://securityxploded.com/networkpasswordsecrets.php

Cleaning up stale device entries in SCCM 2012 and Active Directory

While you can run the Site Maintenance tasks within SCCM - "Delete Aged Discovery Data" and "Delete Inactive Client Discovery Data" - unfortunately this does not delete the associated computer objects in AD. So I wrote a quick proof of concept script to do just this...

We should firstly launch the SCCM Console >> Monitoring >> Reporting >> Reports >> Computers not discovered recently >> enter "30" days and select the relevant collection that holds all of your workstations (e.g. 'All Systems'). When I find the time I would like to use SQL Reporting Services to extract the data - so that the above process could be automated in the script below.

We will now need to feed the report into the following script - which will then delete the relevant SCCM devices and AD computer objects:

** This script should be tested in a development environment first! It is only in an alpha state and should ideally only be used as a concept to understand how the task could be completed **

# SCCM 2012 R2 Device Cleanup Script

# Pre-requisites
# - Windows PowerShell 3.0 https://www.microsoft.com/en-gb/download/details.aspx?id=34595
# - Tested on Server 2008 R2 and Server 2012 SP1
# - SCCM Console is installed on the server
# - Executed by a user with the relevant privileges to access SCCM and delete computer objects from AD.

# Variables (note: param() must be the first statement in a script, and
# PowerShell variable names cannot contain hyphens)
param(
    [string]$InputReport
)
$SccmSiteCode = "S01"   # your three-character site code
$AutoDelete = $false    # set to $true to suppress confirmation prompts

# PowerShell version check
if ($PSVersionTable.PSVersion.Major -gt 2)
{
    Write-Output "PowerShell 3.0 or above detected!"
}
else
{
    Write-Output "Please ensure you are running this script with PowerShell 3.0 or above! (Use the -Version switch)"
    pause
    exit
}

# Register modules
Import-Module ((Split-Path $env:SMS_ADMIN_UI_PATH) + "\ConfigurationManager.psd1")
Import-Module ActiveDirectory

# Parse the report as CSV, stripping the first three (header) lines from the file
$sccm_devices_csv = Get-Content $InputReport | Select-Object -Skip 3 | ConvertFrom-Csv

# Get the device names from the 'Details_Table0_Netbios_Name0' column, stripping any blank entries
$sccm_devices = $sccm_devices_csv |
    Select-Object -ExpandProperty Details_Table0_Netbios_Name0 |
    Where-Object { $_ -match '\S' }

# Change to the SCCM site drive (required by the CM cmdlets)
Set-Location ($SccmSiteCode + ":")

foreach ($device in $sccm_devices)
{
    # Remove the node from SCCM and also Active Directory
    if ($AutoDelete)
    {
        Remove-ADComputer -Identity $device -ErrorAction Stop -Confirm:$false
        Remove-CMDevice -DeviceName $device -Force
    }
    else
    {
        Remove-ADComputer -Identity $device -ErrorAction Stop
        Remove-CMDevice -DeviceName $device
    }
}
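
A hypothetical invocation, assuming the script above has been saved as Cleanup-SccmDevices.ps1 alongside the exported report:

.\Cleanup-SccmDevices.ps1 -InputReport .\sccm_report.csv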

Tuesday 16 June 2015

New Server Build Checklist

This is more of a quick reference to ensure all of the main areas of a new server build have been checked off - these should typically be independent of the OS.

Identity: Hostname, host file entries

TCP / IP Settings: (Static) IP Address, DNS Servers, Default Gateway, Static Routing

Security: Antimalware products, server hardening, user accounts/permissions, NTP setup, syslog setup

Firewall: Deny-all policy, explicit rules in place

Management: DRAC / iLO Configured? Server Management Software Installed?

Monitoring Agents: SolarWinds, SCOM, Nagios

Updates: Server up to date? (Windows Update, apt, etc.) Have you enabled automatic patching of critical security updates?

Components: All unnecessary server components uninstalled?

Virtualization: VMware tools installed?

Licensing: Windows Activated? All licensing in place?

Documentation: All server details documented?

Thursday 11 June 2015

19 ways to reduce your Windows Server 2008/2012 R2 disk space consumption

The other day I came across a poorly provisioned VM that had a measly 20GB of disk space allocated for Windows Server 2008 R2 - while the server was just about running OK despite the lack of disk, upgrading the server to 2012 R2 was simply not going to be feasible since the disk only had around 1GB of free space!

So I decided to compile a checklist of pretty much anything that can be done to reduce disk usage (albeit some of them are only temporary measures!)

1. Disk Cleanup with Administrative Rights (the obvious choice - but somewhat limited too): This utility is not actually available with 2008 R2 by default and requires the "Desktop Experience" feature to be installed manually. This utility cleans up the WinSXS directory (which is the home for Windows components) - which can grow pretty large.

2. CCleaner: Another obvious choice - but again does not cover everything! http://www.piriform.com/ccleaner/download

3. Ensuring the following files / directories are clear:

- C:\Windows\Temp
- C:\Users\%USERNAME%\AppData\Local\Temp
- C:\Users\%USERNAME%\AppData\LocalLow\Temp
- %SystemRoot%\MEMORY.DMP
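
A quick sketch to clear those locations from an elevated PowerShell prompt (files currently locked / in use will simply be skipped):

Remove-Item "$env:SystemRoot\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue
Remove-Item "$env:USERPROFILE\AppData\Local\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue
Remove-Item "$env:USERPROFILE\AppData\LocalLow\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue
Remove-Item "$env:SystemRoot\MEMORY.DMP" -Force -ErrorAction SilentlyContinue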

4. Disable Pagefile: This should be treated as a temporary measure (unless you really know what you are doing) - Right-hand click "Computer" >> Properties >> Advanced Settings >> Performance Settings >> Advanced >> Virtual Memory >> Change >> Untick "Automatically manage paging file size for all drives" >> Select "No paging file".

5. Moving Windows Search Indexing Database (If installed): Start >> Search for "Indexing" >> Indexing Options >> Advanced >> Set "Index Location" >> Restart Indexing service.

6. Manually compressing any files that are not often accessed.

7. Use Microsoft's 'junction' utility to create symbolic links from your system drive to another fixed drive. E.g. you could re-locate your software distribution / updates folder to another drive (a consolidated sketch follows these steps):

Stop the Windows Update service: sc stop wuauserv
Rename C:\WINDOWS\SoftwareDistribution to C:\WINDOWS\SoftwareDistribution.old
Create the destination directory on the other drive: D:\WINDOWS\SoftwareDistribution
Download the junction utility from SysInternals: https://technet.microsoft.com/en-gb/sysinternals/bb896768.aspx
Create a symbolic link as follows: junction "C:\WINDOWS\SoftwareDistribution" "D:\WINDOWS\SoftwareDistribution" -q
Start the Windows Update service: sc start wuauserv
Ensure Windows Updates are working as expected.
Delete C:\WINDOWS\SoftwareDistribution.old
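
Consolidated into a single elevated PowerShell session (the D: drive is an assumption - adjust to suit):

sc.exe stop wuauserv
Rename-Item C:\Windows\SoftwareDistribution C:\Windows\SoftwareDistribution.old
New-Item -ItemType Directory D:\Windows\SoftwareDistribution
junction "C:\Windows\SoftwareDistribution" "D:\Windows\SoftwareDistribution" -q
sc.exe start wuauserv
# ...and once Windows Update is confirmed to be working:
Remove-Item C:\Windows\SoftwareDistribution.old -Recurse -Force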

8. dism /online /cleanup-image /spsuperseded

9. Remove any unnecessary Windows Features / third party software

10. IIS Log Files (If applicable): Ensure C:\inetpub\logs\LogFiles is clear

11. Clear Event Logs (not recommended)

12. Removing old installer packages: Installer packages for your software are stored within C:\Windows\Installer - although sometimes you can get a fair few orphaned installers - so to save space we can use a utility called "msizap" (provided by Microsoft - no longer available / supported, but it still works on the OS's I have tried it on: Windows 7, 8.1 and Server 2008 R2). Do not simply delete all of the files within this directory - installed software relies on these packages to perform maintenance tasks such as repairing and uninstalling. I have provided a direct link to msizap below:

<msizap dl>

Simply run 'msizap !g' to remove any orphaned packages.

You can also run the following script that will identify packages (.msp files) that are currently in use:

http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-components-postattachments/00-01-56-98-47/WiMsps.zip

13. Deleting the Internet Explorer Web Cache: WebCacheV01.dat - This file contains resources such as cookies, images etc.

C:\Users\%USER%\AppData\Local\Microsoft\Windows\WebCache\WebCacheV01.dat

14. Delete unnecessary fonts and drivers

15. .NET Framework - The native image assemblies (C:\Windows\assembly\NativeImages_VX.X) associated with each version can easily take up around 500MB per version - consider uninstalling unneeded versions via Add or Remove Roles / Programs and Features.

16. Clear all wallpapers (Except default): %SystemRoot%\Web\Wallpaper

17. Delete any unnecessary user profiles: Right-hand click computer >> Properties >> Advanced >> User Profiles >> Settings.

*Warning: The following tweaks could negatively affect user experience - only perform them if you know what you are doing!*

18. Remove Windows Help files: Delete all of the '.h1s' files in C:\Windows\Help\Windows\en-US

19. Verify folders within C:\ProgramData - identifying any folders related to orphaned software 


Wednesday 10 June 2015

Exchange Server 2013 Hardware Requirements Checklist

- Calculate the necessary IOPS needed for your Exchange organization: https://technet.microsoft.com/en-us/library/dd298109%28v=exchg.141%29.aspx

- Choose RAID configuration: e.g. RAID0 for Exchange Installation and RAID6 for Mailbox Databases

- Mailbox Database, Log and Content Indexing Requirements:
http://blogs.technet.com/b/exchange/archive/2013/05/06/ask-the-perf-guy-sizing-exchange-2013-deployments.aspx

- CPU Time: Calculate the necessary CPU time for your Exchange organization.
https://technet.microsoft.com/en-us/library/dd298109%28v=exchg.141%29.aspx

- Exchange Role Resource Requirements: https://gallery.technet.microsoft.com/office/Exchange-2013-Server-Role-f8a61780

- iSCSI / Network Storage: NIC Teaming, Port Aggregation

- Virtual Disks: Thick Provisioned (fixed size disks), Eager Zeroed

Exchange Server 2013 Minimum Requirements

Mailbox Role: 8GB of RAM
Client Access Server: 4GB of RAM
Mailbox and Client Access Role (Combined): 8GB of RAM

- 30GB of disk space for each Exchange Server
- 500MB of additional disk space for each language pack
- 200MB of disk space on the system drive
- Minimum AND maximum page file size should be set to your RAM size plus 10MB

Friday 5 June 2015

IOmeter Quickstart Guide

IOmeter is a multi-platform tool that is used to benchmark disk IOPS. At first the utility can look a little intimidating - hence I wrote this short guide to help new users get up and running quickly.

Once IOmeter has been downloaded and started up, expand the topology tree in the left-hand view:

All Managers >> "Computer-Name" >> Select "Worker 2" and click on the "Disconnect Selected Worker or Manager" icon (second from the right on the top navigation bar) - rinse and repeat for workers 3, 4 etc., ensuring that we have one worker left (Worker 1)

Now go to the "Access Specifications" tab (this is where you can select the block size for testing), select the "All in one" node under the "Global Access Specifications" list view and hit the "Add" button.

We should now highlight "Worker 1" (if not already done) and click on "Start a Duplicate of This Worker" a few times - so we have several workers in total (the more the better!)

Proceed by clicking on the "Start Tests" (green flag) icon - it should prompt you to save the results somewhere and then begin the tests.


IOPS Testing and Planning

IOPS represent the number of input and output requests (per second) a disk is handling. The kind of applications you are running significantly determines how many IOPS are required.

For example, resource-intensive applications such as heavily used relational databases and NoSQL databases typically require a large number of IOPS due to the frequency of disk operations (e.g. writing / requesting data from the database)

When taking into account IOPS we should also be aware of latency (which relates to how long a request actually takes from the application's point of view). IOPS can vary greatly dependent on the block sizes being tested - typically testing lower block sizes (e.g. 12KB) results in greater IOPS, while larger block sizes (e.g. 100MB) result in lower IOPS but greater throughput.

In order to get an idea of how many IOPS a disk is capable of performing we can use a tool called IOmeter (there are a whole host of alternatives - I personally like IOmeter as it will work on multiple platforms) - available below:

http://www.iometer.org/doc/downloads.html

** Update **  A much more user-friendly disk benchmark tool can be downloaded below (will also work perfectly fine on mechanical disks):

AS SSD Benchmark

In order to maximize available IOPS on a disk the following actions can be taken:

- Stripe a volume over multiple physical disks (e.g. RAID 0 or RAID 10)
- Use a faster disk type, e.g. SSD, rather than conventional media such as mechanical disks.

Calculating IOPS can become a pretty complex task (and although there is nothing wrong with working out advanced RAID configurations yourself - it never harms to double check it with an IOPS calculator):

http://www.thecloudcalculator.com/calculators/disk-raid-and-iops.html
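
As a rough worked example (with assumed numbers): eight disks at 150 IOPS each in RAID 10 give 8 x 150 = 1,200 raw IOPS; with a 60/40 read/write split and RAID 10's write penalty of 2, the usable front-end figure is 1200 / (0.6 + 0.4 x 2) = ~857 IOPS.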

As a rough average I have outlined some common configurations below:

7,200 rpm (SATA) = 75 - 100 IOPS
10,000 rpm (SATA / SAS) = 125 - 150 IOPS
15,000 rpm (SAS) = 175 - 210 IOPS
SSD = 45,000 - 50,000 IOPS

Determining whether a mounted disk has a file system on it

Firstly identify which disk we are after with fdisk:
fdisk -l
We can also check for disks and partitions with lsblk:
lsblk
And finally check whether there is a file system on the disk / partition:
file -s /dev/sda1
To create a file system we can use the mkfs utility:
mkfs -t ext4 /dev/sda1
and to mount it temporarily:
mkdir -p /mount/partition1
mount -t auto /dev/sda1 /mount/partition1
or to mount it permanently:
sudo vi /etc/fstab
and add the following line:
/dev/sda1    /mount/partition1    auto    auto,nouser,exec,rw,async,atime    0 0

Thursday 4 June 2015

Amazon S3 Key Features

S3 is a cloud based storage service offered by Amazon. It makes use of "buckets" (accessed over HTTPS), which are effectively folders. I have briefly outlined the main features available below:

- Access Control: By default S3 buckets are private - although you have the ability to make them publicly accessible. Access can be fine-tuned for specific users - for example restricting the uploading, deleting and editing of permissions.

- Static Website Hosting: You have the ability to turn your bucket into a static website - although adding dynamic content is not possible. A common use scenario here is enabling your bucket to act as a host for pictures / media.

- Logging: You have the ability to audit access to the bucket by logging events such as users adding or deleting items from the bucket.

- Versioning: Allows you to retain deleted / older copies of files for recovery in the event that you need to restore a file due to accidental deletion. You can apply lifecycle rules to archive older versions of files automatically (hence saving space, and money, in your S3 bucket)

- Cross Region Replication: Allows you to replicate data across regions (used in conjunction with versioning) - by default data in the bucket is replicated anyway between different sites within a region, but there are circumstances (e.g. regulatory requirements) where cross-region replication is required.

Ensuring high availability in Windows Azure with Availability Groups

It might come as a shock to many (or maybe not) that by default virtual machines on the Azure Cloud offer no high availability whatsoever. Microsoft does offer a 99.95% availability SLA if you utilize availability sets - however competitors such as AWS offer this as standard.

Availability sets require at least two virtual machines to function - for example you could have two web servers and two backend SQL servers both running in an availability set.

An availability set provides guaranteed availability by ensuring that the VMs are spread across different racks in the Azure datacentre - hence offering switch and power supply redundancy. They are also vital when Microsoft are performing planned maintenance, due to the fact that sometimes the VMs need to be restarted as a result.

We can set an availability set up in one of two ways - either defining the availability set upon creation of the VMs:

$vm1 = New-AzureVMConfig -Name "myvm1" -InstanceSize Small -ImageName $img -AvailabilitySetName "myavset"
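
For completeness - in the classic Azure PowerShell module the configuration above would then be provisioned with something along these lines (the cloud service name is a placeholder):

New-AzureVM -ServiceName "mycloudsvcname" -VMs $vm1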

or simply assigning the availability set to a pre-provisioned VM (although note that a VM restart will be needed for the changes to take effect):
                        
Get-AzureVM -ServiceName "mycloudsvcname" -Name "myvm1" | Set-AzureAvailabilitySet -AvailabilitySetName "myavset" | Update-AzureVM

In the scenario of a web application as above - you would also obviously require the use of a load balancer for the web servers.

Pricing / Costs

As far as I can tell there are no charges explicitly for using an availability set - although the fact that both of the VMs must be running in the group doubles your costs anyway.

Wednesday 3 June 2015

Enabling federation between Active Directory and AWS IAM (Identity Access Management)

AWS allows us to integrate our existing AD infrastructure with its Identity Access Management service through the use of SAML - allowing us to utilize features such as SSO (Single Sign-On), enabling users to manage our different AWS services.

So what exactly is SAML?

SAML (Security Assertion Markup Language) is an XML based format that provides authentication and authorization between two parties - in this case the claims provider (AD) and the relying party (AWS).

The following demonstrates the process flow of an AD user logging into AWS:

From Amazon: http://cdn.amazonblogs.com/security_awsblog/images/AD1.png

1. User browses to the AWS sign-in URL and is then re-directed to our ADFS instance
2. User authenticates themselves with ADFS
3. User then receives a SAML assertion from the ADFS server
4. User then posts the SAML assertion to the AWS SAML endpoint (https://signin.aws.amazon.com/saml)
5. Finally the sign-in endpoint validates the assertion and the user is redirected to the AWS console

For this tutorial we will setup a domain called example.internal and join an ADFS server to the domain. Once we have a DC / domain setup we will proceed by setting up ADFS as follows:

Add Roles and Services >> Active Directory Federation Services >> 'Configure the federation service on this server' >> 'Create the first federation server...' >> Setup Certificates, Service Accounts etc.

The next step will involve creating a SAML provider in your AWS account:

AWS Console >> IAM >> Identity Providers >> Create Provider >> SAML >> Enter a name and upload the metadata file that can be downloaded from your ADFS server as follows:

http://<adfs-servicename>/FederationMetadata/2007-06/FederationMetadata.xml

(NOT the hostname of the ADFS server!)

** Be aware: ADFS on Server 2012 R2 (ADFS 3.0) no longer has a dependency on IIS - AD FS is now built directly on top of HTTP.SYS and does not require the installation of Internet Information Services (IIS). This caught me out! **

We should then proceed by creating a SAML Role as follows:

AWS Console >> IAM >> Roles >> 'Create new role' >> 'Role for Identity Provider Access' >> 'Grant Web Single Sign-On (WebSSO) access to SAML providers' >> Select the SAML provider we created earlier >> Select the relevant policies e.g. 'AmazonEC2FullAccess'.

We will proceed by adding AWS as a Trusted Relying Party within ADFS:

ADFS Console >> Right-hand click ADFS 2.0 >> 'Add Relying Party Trust...' >> 'Import data about the relying party published online...' which should be: https://signin.aws.amazon.com/static/saml-metadata.xml >> Display Name should be "signin.aws.amazon.com" >> Choose whether you want any multi-factor options >> "Permit all users to access this relying party" >> Finish Wizard.

Once the relying trust has been created we should now right-hand click on the newly created relying trust and select "Edit claim rules..." >> Issuance Transform Rules >> Add Rule >> "Transform an incoming claim" >> and enter the following in line with the form:

Claim rule name: NameId
Incoming claim type: Windows Account Name
Outgoing claim type: Name ID
Outgoing name ID format: Persistent Identifier
Pass through all claim values: Checked

We then add another rule >> 'Send LDAP Attributes as Claims' >> enter the following:

Claim rule name: RoleSessionName
Attribute store: Active Directory
LDAP Attribute: E-Mail-Addresses
** Ensure you have an email address attached to your AD user object! **
Outgoing Claim Type: https://aws.amazon.com/SAML/Attributes/RoleSessionName

We proceed by adding another rule >> 'Add rule' >> 'Send Claims Using a Custom Rule' and enter as follows:

Claim Rule Name: Get AD Groups
Custom Rule: c:[Type == "http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname", Issuer == "AD AUTHORITY"] => add(store = "Active Directory", types = ("http://temp/variable"), query = ";tokenGroups;{0}", param = c.Value);

and finally one more >> 'Add rule' >> 'Send Claims Using a Custom Rule' and enter as follows:

Claim Rule Name: Roles
Custom Rule: c:[Type == "http://temp/variable", Value =~ "(?i)^AWS-"] => issue(Type = "https://aws.amazon.com/SAML/Attributes/Role", Value = RegExReplace(c.Value, "AWS-", "arn:aws:iam::123456789012:saml-provider/ADFS,arn:aws:iam::123456789012:role/ADFS-"));

This final rule maps AD groups prefixed with "AWS-" onto IAM roles - for example, membership of an AD group named "AWS-Admins" (an illustrative name) would resolve to the IAM role "ADFS-Admins".

Finally we can test our configuration by going to the following URL:

https://localhost/adfs/ls/IdpInitiatedSignOn.aspx

Creating a self-signed certificate within Windows (the easy way!)

One of the great annoyances I have with Windows is the inability to do such simple tasks easily - like generating a self-signed certificate. On Linux you can easily install and make use of openssl with a one-liner. With Windows you have the following options:

- Install IIS and generate a new self-signed certificate (this is a little excessive in my opinion to achieve something so trivial).

- Install a dedicated CA and give up the idea of a self-signed certificate

- Use the makecert and pvk2pfx utilities (unfortunately requiring you to download a whole host of unneeded junk with them).

I will explain the latter option:

1. Download and install the Windows Software Development Kit for Windows 7 (or 8) from:

https://msdn.microsoft.com/en-us/windows/desktop/hh852363.aspx

2. Open an elevated command prompt and change the directory to:

"C:\Program Files (x86)\Windows Kits\8.0\bin\x86"

3. Create the self-signed certificate:

makecert -r -n "CN=www.example.com" -sv cert.pvk cert.cer

4. Combine the private key and certificate into a PFX container:

pvk2pfx -pvk cert.pvk -spc cert.cer -pi yourpassword -pfx cert.pfx
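
As an aside - on Windows 8 / Server 2012 and later, PowerShell makes this a genuine one-liner; a minimal example (placing the certificate in the local machine store):

New-SelfSignedCertificate -DnsName "www.example.com" -CertStoreLocation "cert:\LocalMachine\My"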

Lab: WAN Acceleration with WANOS and WANEM: Part 1

WAN links can be expensive and unless you have a lot of money to shell out it is likely that you don't have dedicated gigabit bandwidth available to you - so it makes sense to optimize the performance in any way we can.

While there are dozens and dozens of different solutions out there all offering similar results, I chose WANOS (http://wanos.co) since it seems to be a fairly well-established product and provides a reasonable pricing model - either free (with limited support) or paid (with support and updates).

WANOS can also be downloaded as a virtual appliance enabling me to lab it up nice and easily!

I have outlined a few of the main / interesting features of WANOS below:

- Packet Loss Recovery: Ensures that data is delivered reliably over the WAN.
- Universal Deduplication: Provides cross-flow deduplication on byte patterns (hence it is protocol independent)
- Compression: Compresses data going over the WAN link on the fly
- Quality of Service: Provides features such as traffic shaping, tagging and classification.

The lab will consist of two sites (A and B) connected over a capped WAN link of 10Mbps.

I want to perform some performance tests over the two sites with and without WANOS.

Obviously being in a lab environment; packet loss, jitter and latency are unlikely to be anything like the real world – although fortunately we can emulate these within WANEM! Other factors to take into consideration are the data de-duplication process and the processor speed (as this may affect performance greatly.)

I will set up the WANEM virtual appliance (emulating the WAN link) as follows:

RAM: 2GB
Disk: 8GB
Virtual NIC: 1 x Bridged

In part two we will start to build our virtual lab.

Tuesday 2 June 2015

Requesting an Exchange 2013 certificate from an enterprise certificate authority

We should firstly identify the relevant certificate authority:

certutil -config - -ping

RDP into the CA and go to the Certificate Authority snap-in >> Right-hand click 'Certificate Templates' >> Manage >> Right-hand click the 'Web Server' certificate and select "Duplicate" >> Name it something like 'Exchange 2013 Server' and ensure that under the 'Request Handling' tab 'Allow private key to be exported' is selected and that the requesting COMPUTER has 'Enroll' permission under the 'Security' tab >> Finally exit the 'Certificate Templates Management' console.

Now from the CA snap-in right-hand click 'Certificate Templates' >> New >> Certificate Template to Issue >> select our newly created template: Exchange 2013 Server.

We can then request a new certificate by going to the certificates snap-in for the local COMPUTER >> Right-hand click the 'Personal' node >> All Tasks >> 'Request New Certificate...'. ** During this process ensure that as well as entering the common name(s) for the certificate you also specify a 'friendly name' - otherwise Exchange will display the certificate with a blank name in its UI! **

We should now export the certificate (along with its private key) and simply import it from the Exchange ECP:

Servers >> Certificates >> Import Exchange Certificate >> Associate with the relevant server >> Click Finish >> Double-click the newly created certificate >> Services >> Select the relevant services.
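
The same service assignment can also be performed from the Exchange Management Shell - a hedged example, where the thumbprint placeholder is obviously your own (list them with Get-ExchangeCertificate):

Enable-ExchangeCertificate -Thumbprint <thumbprint> -Services "IIS,SMTP"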

Finally restart the relevant services for the change to take effect.

Monday 1 June 2015

Troubleshooting poor network performance with ESXTOP

When networking issues arise a good place to start is ESXTOP:

SSH into the ESXi host >> run esxtop >> press 'n'.

This will present you with all of your ports, network labels, vSwitch names and so on. The significant columns here are '%DRPTX' and '%DRPRX' - values above zero indicate that the vSwitch queue has been exhausted and packets are being dropped.

The other columns are represented as follows:

- MbTX/s: Data transmitted in Mbps.
- MbRX/s: Data received in Mbps.
- PKTTX/s: Average number of packets transmitted per second
- PKTRX/s: Average number of packets received per second

Possible causes could be:

- High guest CPU utilization
- Insufficient bandwidth supplied by the uplink

Possible resolutions include:

- Network I/O control
- Moving network hungry VMs onto another host
- Upgrading the uplink capacity
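
If you would rather identify the network-hungry VMs from PowerCLI than from esxtop, a hedged sketch along these lines could help (it assumes VMware PowerCLI is installed and you have already run Connect-VIServer):

# List the five VMs with the highest realtime network usage (KBps)
Get-VM | ForEach-Object {
    $kbps = ($_ | Get-Stat -Stat "net.usage.average" -Realtime -MaxSamples 1 |
        Measure-Object -Property Value -Sum).Sum
    [pscustomobject]@{ VM = $_.Name; NetKBps = $kbps }
} | Sort-Object NetKBps -Descending | Select-Object -First 5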

Quickstart guide to using and interpreting ESXTOP

I have pulled together a quick and dirty guide to capturing and interpreting ESXTOP results.

Launching ESXTOP

ESXTOP can be launched from the command line after SSHing into the ESXi host:

esxtop

Metrics and Thresholds

Display | Metric | Threshold | Explanation
CPU | %RDY | 10 | Overprovisioning of vCPUs, excessive usage of vSMP or a limit (check %MLMTD) has been set. See Jason's explanation for vSMP VMs.
CPU | %CSTP | 3 | Excessive usage of vSMP. Decrease the number of vCPUs for this particular VM. This should lead to increased scheduling opportunities.
CPU | %SYS | 20 | The percentage of time spent by system services on behalf of the world. Most likely caused by a high-IO VM. Check other metrics and the VM for a possible root cause.
CPU | %MLMTD | 0 | The percentage of time the vCPU was ready to run but deliberately wasn't scheduled because that would violate the "CPU limit" settings. If larger than 0 the world is being throttled due to the limit on CPU.
CPU | %SWPWT | 5 | VM waiting on swapped pages to be read from disk. Possible cause: memory overcommitment.
MEM | MCTLSZ | 1 | If larger than 0 the host is forcing VMs to inflate the balloon driver to reclaim memory as the host is overcommitted.
MEM | SWCUR | 1 | If larger than 0 the host has swapped memory pages in the past. Possible cause: overcommitment.
MEM | SWR/s | 1 | If larger than 0 the host is actively reading from swap (vswp). Possible cause: excessive memory overcommitment.
MEM | SWW/s | 1 | If larger than 0 the host is actively writing to swap (vswp). Possible cause: excessive memory overcommitment.
MEM | CACHEUSD | 0 | If larger than 0 the host has compressed memory. Possible cause: memory overcommitment.
MEM | ZIP/s | 0 | If larger than 0 the host is actively compressing memory. Possible cause: memory overcommitment.
MEM | UNZIP/s | 0 | If larger than 0 the host is accessing compressed memory. Possible cause: previously the host was overcommitted on memory.
MEM | N%L | 80 | If less than 80 the VM experiences poor NUMA locality. If a VM has a memory size greater than the amount of memory local to each processor, the ESX scheduler does not attempt to use NUMA optimizations for that VM and "remotely" uses memory via the "interconnect". Check "GST_ND(X)" to find out which NUMA nodes are used.
NETWORK | %DRPTX | 1 | Dropped packets transmitted, hardware overworked. Possible cause: very high network utilization.
NETWORK | %DRPRX | 1 | Dropped packets received, hardware overworked. Possible cause: very high network utilization.
DISK | GAVG | 25 | Look at "DAVG" and "KAVG" as the sum of both is GAVG.
DISK | DAVG | 25 | Disk latency most likely to be caused by the array.
DISK | KAVG | 2 | Disk latency caused by the VMkernel; high KAVG usually means queuing. Check "QUED".
DISK | QUED | 1 | Queue maxed out. Possibly the queue depth is set too low. Check with the array vendor for the optimal queue depth value.
DISK | ABRTS/s | 1 | Aborts issued by the guest (VM) because storage is not responding. For Windows VMs this happens after 60 seconds by default. Can be caused, for instance, when paths fail or the array is not accepting any IO for whatever reason.
DISK | RESETS/s | 1 | The number of commands reset per second.
DISK | CONS/s | 20 | SCSI Reservation Conflicts per second. If many SCSI Reservation Conflicts occur performance could be degraded due to the lock on the VMFS.
%VMWAIT: A derivative of %WAIT that represents just the hardware and swap waiting time, and hence is a better metric to use than %WAIT when diagnosing performance issues with, for example, storage controllers.

%WAIT: Represents the waiting time for devices (e.g. a storage controller), swap waiting time AND %IDLE time - so should not be taken at face value!

%RUN: Represents the percentage of total time scheduled for the world to run. %USED = %RUN + %SYS - %OVRLP. When the %RUN value of a virtual machine is high, it means the VM is using a lot of CPU resource.

ESXTOP Toggles

c = cpu
m = memory
n = network
i = interrupts
d = disk adapter
u = disk device (includes NFS as of 4.0 Update 2)
v = disk VM
p = power states

V = only show virtual machine worlds
e = Expand/Rollup CPU statistics, show details of all worlds associated with group (GID)
k = kill world, for tech support purposes only!
l  = limit display to a single group (GID), enables you to focus on one VM
# = limiting the number of entitites, for instance the top 5

2 = highlight a row, moving down
8 = highlight a row, moving up
4 = remove selected row from view
e = statistics broken down per world
6 = statistics broken down per world

Exporting results from ESXTOP

From the command line we can run:

esxtop -b -d 2 -n 250 > esxtopout.csv

Interpreting results from ESXTOP

You can directly hook into ESXTOP with a utility called VisualESXTOP (rather than having to manually export its results) - it will build pretty graphs to help you interpret the data a little more easily.

References

http://www.yellow-bricks.com/esxtop/
https://communities.vmware.com/docs/DOC-11812
http://buildvirtual.net/analyzing-esxtop-data/

Troubleshooting poor storage controller performance with ESXTOP

Firstly consult the following logs to check for any controller issues:
/var/log/messages
and
/var/log/vmkernel
We should proceed by SSHing into the ESXi host and running the esxtop command:
esxtop
Press d (to go to the device statistics screen), then f (to go to the field selector menu) and finally j (to add error stats).

From this view we should then look at the DAVG/cmd column (Device Average Latency) - this will give you an idea (latency in milliseconds) of how long the ESXi host is waiting for SCSI commands submitted to the SAN to come back with a response. According to VMware - anything greater than 50ms for the local HBA indicates device contention is occurring.

CMDS/s: This is the total number of commands per second and includes IOPS (Input/Output Operations Per Second) and other SCSI commands such as SCSI reservations, locks, vendor string requests, unit attention commands etc. being sent to or coming from the device or virtual machine being monitored.

DAVG/cmd: This is the average response time in milliseconds per command being sent to the device.

KAVG/cmd: This is the amount of time the command spends in the VMkernel. 

GAVG/cmd: This is the response time as it is perceived by the guest operating system. This number is calculated with the formula: DAVG + KAVG = GAVG 

** In order to determine which storage controller we should be looking at, we can go to the vSphere Client >> ESXi Host >> Configuration >> Storage Adapters and identify the connected devices. **

HP also provide a live CD to perform a thorough diagnostics run on a large series of storage controllers - to download the ISO you should go to:

http://h20564.www2.hp.com/hpsc/swd/public/detail?swItemId=MTX_3c888073127c4c65b7bd8559eb

The current supported list is as follows (as of the 1st of June 2015):

     Smart Array 5312 Controller
     Smart Array 5302 Controller
     Smart Array 5304 Controller
     Smart Array 532 Controller
     Smart Array 5i Controller 
     Smart Array 641 Controller
     Smart Array 642 Controller
     Smart Array 6400 Controller
     Smart Array 6400 EM Controller
     Smart Array 6i Controller
     Smart Array P600 Controller
     Smart Array P400 Controller
     Smart Array P400i Controller
     Smart Array E200 Controller
     Smart Array E200i Controller
     Smart Array P800 Controller
     Smart Array E500 Controller
     Smart Array P700m Controller
     Smart Array P410i Controller
     Smart Array P411 Controller
     Smart Array P212 Controller
     Smart Array P712m Controller
     Smart Array B110i SATA RAID
     Smart Array P812 Controller
     Smart Array P220i Controller
     Smart Array P222 Controller
     Smart Array P420 Controller
     Smart Array P420i Controller
     Smart Array P421 Controller
     Dynamic Smart Array B320i RAID

Collecting hardware diagnostic logs for ESXi 5.0 hosts on HP servers

** Forenote: I do not believe this works on ESXi 5.5/6.0+. Instead take a look at the HP Agentless Management Service Offline Bundle: http://h20564.www2.hp.com/hpsc/swd/public/detail?swItemId=MTX_b05d4c644fb742aa87cb5f5da1#tab1 **

HP have provided a utility that allows you to remotely collect hardware diagnostic logs from an ESXi host.

We should firstly download the utility as below:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=5351&prodSeriesId=1121516&swItem=MTX-b677f47e08794b8cba62d12c3f&prodNameId=3288134&swEnvOID=4115&swLang=8&taskId=135&mode=4&idx=2

We should then launch the HP Remote System Management CLI and export the relevant logs:

Array Controller and Hard Drive LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o ArrayLogs.txt -f TEXT -t SA

System Fan LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o FAN.txt -f TEXT -t FAN

ILO LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o ILO.txt -f TEXT -t ILO

MEMORY LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o MEMORY.txt -f TEXT -t MEMORY

OA LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o OA.txt -f TEXT -t OA

CPU LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o CPU.txt -f TEXT -t CPU

Fiber channel HBA LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o FC.txt -f TEXT -t FC

NIC LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o NIC.txt -f TEXT -t NIC

Power Supply LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o PSU.txt -f TEXT -t PS

Server Firmware LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o Firmware.txt -f TEXT -t SF

Server Temperature LOGS and Status:
hprsmcli -s xx.xx.xx.xx -u root -p password -o TEMP.txt -f TEXT -t TEMP

IML LOGS:
hprsmcli -s xx.xx.xx.xx -u root -p password -o iml.xml -f TEXT -t IML

Export ALL of the logs:
hprsmcli -s xx.xx.xx.xx -u root -p password -o AllLogs.txt -f TEXT