We had a problem with SMVI not taking backups and, to make matters worse, not alerting us to the fact that it was not taking backups. The software can only alert if the backup job fails or if it generates warnings, which is fine.
But what happens when the job doesn't start, the SMVI service doesn't stop, and nothing alerts you to the fact that no backups are being taken? This was the second time this had happened. The first time could be attributed to a random occurrence, but when it happens a second time, well, then you're waiting to get caught with your pants around your ankles.
I made a short script, scheduled to run each evening, which sends a mail if no snapshot files have been written since midnight that day.
# Midnight today as a DateTime; anything written after this counts as one of today's snapshots
$NowDate = (Get-Date).Date
# Loop through c:\snapshotsourcelist.txt, assigning one line at a time to $snappath
foreach ($snappath in Get-Content "c:\snapshotsourcelist.txt")
{
    # Skip blank lines in the list
    if (-not $snappath.Trim()) { continue }
    # Recursively scan all subdirectories of $snappath, keeping only files (not
    # directories) with a last write time of today. Anything matching *iegwydc01*
    # can't be scanned for some reason, so it is excluded from the search.
    $Snapshotlist = Get-ChildItem $snappath -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        Where-Object { $_.Name -notlike "*iegwydc01*" } |
        Where-Object { $_.LastWriteTime -gt $NowDate }
    # If the list is empty then send a mail
    if (!$Snapshotlist) {
        $Title = "Snapshot Alert, no snapshots for $snappath"
        $Body = "Snapshot Alert, no snapshots for $snappath were taken since 00:00 last night"
        Write-Host $Body
        $SmtpServer = "mailserver.domain.com"
        $SmtpClient = New-Object System.Net.Mail.SmtpClient
        $SmtpClient.Host = $SmtpServer
        $From = "ie-dlitalerts@domain.com"
        $To = "Darragh@domain.com"
        $SmtpClient.Send($From, $To, $Title, $Body)
    }
}
Wednesday, July 25, 2012
Storage basics, IOPS penalty and RAID
Post from Yellow Bricks showing the write penalty for various RAID implementations:
http://www.yellow-bricks.com/2009/12/23/iops/
Nice series of blogs to cement the basics ... again: http://vmtoday.com/2009/12/storage-basics-part-i-intro/
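As a rough illustration of what the write penalty means in practice, here is a small PowerShell sketch. The function name Get-FunctionalIops and the example workload (1200 raw IOPS, 50% writes) are my own assumptions for illustration. The penalty values are the commonly quoted ones (RAID 0 = 1, RAID 1/10 = 2, RAID 5 = 4, RAID 6 = 6), so check the linked post for the figures that apply to your array.

```powershell
function Get-FunctionalIops {
    param(
        [double]$RawIops,        # total back-end IOPS the spindles can deliver
        [double]$WriteFraction,  # share of the workload that is writes, 0 to 1
        [string]$RaidLevel       # one of the keys in the table below
    )
    # Commonly quoted write penalties per RAID level (assumed values)
    $writePenalty = @{ 'RAID0' = 1; 'RAID1' = 2; 'RAID10' = 2; 'RAID5' = 4; 'RAID6' = 6 }
    if (-not $writePenalty.ContainsKey($RaidLevel)) { throw "Unknown RAID level: $RaidLevel" }

    # Reads cost one back-end IO each; every write costs $penalty back-end IOs
    $reads  = $RawIops * (1 - $WriteFraction)
    $writes = ($RawIops * $WriteFraction) / $writePenalty[$RaidLevel]
    return $reads + $writes
}

# Hypothetical workload: 1200 raw IOPS, half of it writes, on RAID 5
Get-FunctionalIops -RawIops 1200 -WriteFraction 0.5 -RaidLevel 'RAID5'
```

With these assumed numbers the front-end figure drops to 750 IOPS (600 reads plus 600 / 4 = 150 writes), which is why a write-heavy workload hurts so much more on RAID 5 or 6 than on RAID 10.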
High %costop values - no CPU contention - Poor performance
This presented itself as general performance problems in a regional office. Specifically, one of the application servers was performing very poorly with frequent application timeouts, Exchange was going offline, and VMs were becoming orphaned in vSphere. I logged in to a Windows server and saw that the CPU was running at 100%, yet in the performance view in vSphere the server was consuming approximately 300 MHz. This behaviour was repeated on all other servers in the cluster.
In the DRS view I could see that the servers were receiving approximately 10% of their entitled resources, and there were no limits or reservations set on any of the VMs. On examining the CPU counters in ESXTOP I found that all of the servers had extremely high %costop values (~80% - 90%). This would normally indicate overcommitted CPU resources on SMP VMs, as ESX throttles individual vCPUs to prevent skew when some vCPUs make progress and others cannot because the physical CPUs they need are busy running other VMs. In our case this could not have been the cause, as we had more physical CPUs than vCPUs.
During the troubleshooting I noticed that we periodically had huge latencies on the storage system, sometimes spiking to 6 seconds. The first strange thing was that the latencies were within acceptable limits until the IOPS rose above 600. The second strange point was that the combined total of CIFS and NFS IOPS was rarely sustained above 400 IOPS.
This had me stumped until I discovered that the storage array had been populated with 7.2K SATA disks instead of the 15K disks which I expected. Immediately I saw why we weren't seeing at least double the number of IOPS before latencies ramped up: with 8 usable disks we should see 600 IOPS, instead of the 1200 we expected.
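The arithmetic behind the 600 and 1200 figures, as a quick sketch. The per-disk numbers are assumptions worked back from those totals (600 / 8 = 75 IOPS per 7.2K SATA disk, 1200 / 8 = 150 IOPS per 15K disk). They are rule-of-thumb values, not measurements from this array, and Get-ArrayIops is just a hypothetical helper name.

```powershell
function Get-ArrayIops {
    param(
        [int]$DiskCount,     # number of usable data disks
        [string]$DiskType    # one of the keys in the table below
    )
    # Rule-of-thumb IOPS per spindle (assumed, derived from the totals in the post)
    $iopsPerDisk = @{ '7.2K SATA' = 75; '15K' = 150 }
    if (-not $iopsPerDisk.ContainsKey($DiskType)) { throw "Unknown disk type: $DiskType" }
    return $DiskCount * $iopsPerDisk[$DiskType]
}

$expected = Get-ArrayIops -DiskCount 8 -DiskType '15K'
$actual   = Get-ArrayIops -DiskCount 8 -DiskType '7.2K SATA'
"Expected $expected IOPS, actually have about $actual IOPS"
```

Anything the array does on top of the client workload has to fit under that lower ceiling as well, so it takes very little background activity to push latencies up.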
The second question was where the mysterious extra IOPS were coming from. After more investigation we found that the NetApp 2020 has hidden aggregate-level snapshots which were tipping us well over the 600 IOPS threshold. These hidden jobs were set to run every 3 hours, so we rescheduled them to run outside of production hours.
The high %costop value can be attributed to the fact that the vCPU has to wait for IO completion, and as IO completion was taking an extended period of time, ESX was co-stopping the vCPUs, leading to extremely poor performance.
Friday, July 6, 2012
PS Script to move users to an OU
AD functional account cleanup: I created this script to move a list of users from multiple OUs to a single OU where they will be mass disabled.
# Needs the ActiveDirectory module. -Header names the single column, so the first line of the CSV is treated as data.
Import-Module ActiveDirectory
Import-Csv c:\testdisableduseraccounts.csv -Header @("Name") | ForEach-Object { Get-ADUser $_.Name | Move-ADObject -TargetPath "ou=temp disabled,ou=disabled user accounts,dc=DC,dc=DC,dc=domain,dc=com" }