Monday, December 29, 2008

Failing over to SCR target

Anyone who has done this knows it is a touchy game.  Luckily a coworker (Kevin Miller) decided to put all the info in one place, along with some good suggestions to think about while performing this procedure.

You can check it out here.

Wednesday, December 17, 2008

RPC/HTTP -- Outlook Anywhere Login Prompts Fail

An issue came up where users couldn't access their email via RPC over HTTP / Outlook Anywhere.  An authentication prompt appears, but the user's credentials are never accepted.

When checking their configurations, everything seemed at first to be in order.  Upon checking into it further, I noticed that their mail FQDN is mail.domain.com, but the common name on the certificate is just domain.com (though mail.domain.com was also on the cert under the subject alternative names).  While the cert registered as valid, it did not match up for the mutually authenticated session.  The problem?  The principal name in the Outlook profile and the common name on the certificate (the red outlined boxes in the screenshots) didn't match up.

After changing the principal name from msstd:mail.domain.com to msstd:domain.com so that it matched the certificate's common name, authentication began to work once again.
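
A check like this is easy to script before users start seeing prompts. Below is a minimal Python sketch, assuming you already know the certificate's common name and the principal name configured in the Outlook profile. The function name msstd_matches_cn is hypothetical, and the comparison only mirrors what I observed here (the common name has to match; the subject alternative names did not help):

```python
def msstd_matches_cn(principal_name: str, cert_common_name: str) -> bool:
    """Return True when an msstd: principal name names the certificate's CN.

    Assumption: a case-insensitive exact match against the common name
    only, which is the behaviour seen in this post.
    """
    prefix = "msstd:"
    if not principal_name.lower().startswith(prefix):
        return False
    # Everything after the "msstd:" prefix is the host name to compare.
    host = principal_name[len(prefix):].strip()
    return host.lower() == cert_common_name.strip().lower()


if __name__ == "__main__":
    # The failing and working values from this post.
    print(msstd_matches_cn("msstd:mail.domain.com", "domain.com"))
    print(msstd_matches_cn("msstd:domain.com", "domain.com"))
```

The first call is the broken configuration and the second is the one that fixed it.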

Tuesday, December 16, 2008

Calendar Items Disappearing From Mailbox - BlackBerry

When calendar items are disappearing from people's mailboxes, the first thing I ask nowadays is "Do you own a BlackBerry?"  The reason for this is simple.  If the BlackBerry handheld runs out of memory, the first things it goes for are your calendar items.

While it is a good idea to upgrade your phone's memory anyway, you can also help your memory utilization by properly closing windows so that they do not continually occupy space.  It is also a good idea to periodically reboot your phone to clean this up.

I can't tell you how valuable this information is when users come to you wondering what happened to their meetings.  Kudos to my coworkers that attended the RIM/BlackBerry/BES training for the initial identification of this problem!

Reference:



Monday, December 15, 2008

Issues Mounting NDMP Backed Up LUNs

A coworker of mine cracked the case on this one with a bit of help from NetApp, but I thought I'd share as it was a strange little issue.  First off, what is an NDMP backup?

NDMP stands for Network Data Management Protocol, and was actually pioneered by Network Appliance (NetApp) in association with Intelliguard.  Its purpose was to back up various platforms and provide interoperability.  Fast forward to our current predicament.

The current predicament is that someone uses Veritas to back up and restore.  They want to be able to restore from this backup and have the LUN mount correctly so that they can recover data in Exchange.  The problem is that the restored LUN won't mount in SnapDrive or otherwise.  Initially NetApp merely told us that the LUN had to be in the root of the share rather than in a folder.  No problem: Veritas must restore to a folder, but we can move it to the root after the fact.

We then tried to mount the LUN, but it didn't work.  Instead, it told us that it was already mounted.

The trick that my compatriot discovered was that it requires you to change the name as well by doing the following via command line:

lun move /vol/Volume1/restore.lun /vol/Volume1/restorenew.lun

He was then able to mount the LUN.  Go team.

Tuesday, December 9, 2008

SMTP Categorizer Queue Length Spikes

An issue has cropped up with the Categorizer Queue Length intermittently spiking.  First off, what is the Categorizer Queue Length?

This counter reports the number of items that reside in the categorizer queue.  The categorizer does a few things: it resolves and validates recipients, determines whether the message should be queued for local or remote delivery, expands Distribution Lists (DLs), and detects limits and restrictions.

The first thing to check in order to resolve the spikes is that there are ample Global Catalog servers (GCs) and Domain Controllers (DCs) in the environment to perform the look-ups.  The best way to check this is on the DSAccess tab in ESM (2k3), or the System Settings tab under the properties of the Exchange server (2k7).

It would also be prudent to run a network diagnostic tool in search of a bottleneck.

Microsoft also recommends monitoring the processor utilization of Inetinfo.exe (categorizer component) and using the e2kdsinteg config object from the ConfigDSInteg tool in order to check for malformed objects that could be slowing the process of directory look-ups.  You can get the tool here.

Knowing that the environment probably wasn't the shining star on the top of the Active Directory hill, this was the first thing to check.

The e2kdsinteg log came back with numerous entries for old mail servers, and objects not seen by human eyes in many moons.  My advice was to continue to run diagnostics for bottlenecks, and to investigate the purging of these rogue/malformed objects.

Monday, December 8, 2008

Outlook Anywhere Failing - RPC End Points - 6004

It was brought to my attention that autodiscover was not behaving correctly externally.  I ran it through Microsoft's Exchange connectivity tester @ http://www.testexchangeconnectivity.com/ and received the following output:

To resolve this first simple part I just went into the EMS and gave it an ExternalURL via:

Get-AutodiscoverVirtualDirectory | Set-AutodiscoverVirtualDirectory -ExternalUrl https://autodiscover.domain.com/Autodiscover/Autodiscover.xml

I now received this error:

"Failed to ping RPC Endpoint 6004 (NSPI Proxy Interface)"

..and also RPC_S_SERVER_UNAVAILABLE error (0x6ba) was thrown by the RPC Runtime

An RPC error at this level was most curious.  Perhaps a connection problem between the Hub/CAS and MBX server, or between the MBX server and AD/DCs/GCs?  The environment was not 2008, nor was it using IPv6.

The following is what fixed my issue:

Using the configurations here I was able to remedy the situation.  Basically, Outlook Anywhere could not use DSProxy via HTTP, which is a known issue.  The fix is to:

1. Changes for Mailbox servers..

a. Create a DWORD called "Do Not Refer HTTP to DSProxy" at HKLM\System\CurrentControlSet\Services\MSExchangeSA\Parameters\ and set its value to 1.  This will, as it spells out, stop it from trying to use DSProxy when using HTTP.
b. Under the same HKLM\System\CurrentControlSet\Services\MSExchangeSA\Parameters key, set "NSPI Target Server" to the FQDN of the domain controller that you would like used for profile creation.

2. Changes for Client Access Servers..

a. Ensure that the "PeriodicPollingMinutes" key at HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeServiceHost\RpcHttpConfigurator\ is set to zero.  This will ensure that the system won't continue to overwrite our settings every 15 minutes.
b. Also modify "ValidPorts" at HKLM\Software\Microsoft\RPC\RPCProxy so that it lists the DCs which can be accessed via port 6004.  An example of this would be:

domaincontroller.domain.com:6004;domaincontroller2.domain.com:6004

3. Changes for all Global Catalog (GC) servers..

a. Be sure that there is a REG_MULTI_SZ entry named "NSPI interface protocol sequences" at HKLM\System\CurrentControlSet\Services\NTDS\Parameters\ with its value set to "ncacn_http:6004"
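
With more than a couple of DCs, typing the ValidPorts string by hand invites typos. The following is a rough Python sketch that only builds the DC portion of the value in the format shown above. The function name valid_ports_entry is hypothetical, and it does not read or write the registry, so whatever is already in ValidPorts would need to be merged in by hand:

```python
def valid_ports_entry(domain_controllers, port=6004):
    """Join DC FQDNs into 'host:port;host:port' form.

    Assumption: every DC is reached on the same port (6004 in this post).
    Blank entries are skipped so a trailing comma in a list does no harm.
    """
    hosts = [dc.strip() for dc in domain_controllers if dc.strip()]
    return ";".join(f"{host}:{port}" for host in hosts)


if __name__ == "__main__":
    dcs = ["domaincontroller.domain.com", "domaincontroller2.domain.com"]
    print(valid_ports_entry(dcs))
```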


Testing Autodiscover/Outlook Anywhere now yields the following output in the connectivity tester:

You can double-check these settings by configuring a profile in Outlook, then Ctrl+right-clicking the Outlook icon in the system tray, and running "Test E-Mail Autoconfiguration."

For the full explanation I highly recommend reading the official blog post by Siddhartha Mathu at:



Good read!

New Blog - Jeremy Hayes

Another of my coworkers has created a blog over at: http://www.irundis.com/

Primary topics will probably be PowerShell based.

Wednesday, December 3, 2008

Archive Attender - HTML Formatting on x64

As a follow-up to my previous blog located here, this issue has now been resolved.  With the combined efforts of Azaleos and SherpaSoftware, we have successfully rolled out this update to our first 6 clients.  It was much anticipated, and has been well received so far.

Tuesday, December 2, 2008

OWA Exception - Exchange Cluster Name Stolen


ERROR:

While attempting to access OWA..


Outlook Web Access could not connect to Microsoft Exchange. If the problem continues, contact technical support for your organization.



 Request

Url: https://email.domain.gov:443/owa/forms/premium/StartPage.aspx
User host address: XXX.XXX.XXX.XXX
User: Username
EX Address: /o=DOM/ou=Exchange Administrative Group (--)/cn=Recipients/cn=
SMTP Address: SMTP ADDRESS
OWA version: 8.1.336.0
Mailbox server: MBX SERVER

Exception
Exception type: Microsoft.Exchange.Data.Storage.ConnectionFailedTransientException
Exception message: Event Manager was not created.

Call stack

Microsoft.Exchange.Data.Storage.EventPump..ctor(EventPumpManager eventPumpManager, String server, Guid mdbGuid)

Microsoft.Exchange.Data.Storage.EventPumpManager.GetEventPump(StoreSession session)

Microsoft.Exchange.Data.Storage.EventPumpManager.RegisterEventSink(StoreSession session, EventSink eventSink)

Microsoft.Exchange.Data.Storage.EventSink.InternalCreateEventSink[T](StoreSession session, EventWatermark watermark, ConstructSinkDelegate`1 constructEventSinkDelegate)

Microsoft.Exchange.Clients.Owa.Core.OwaFolderCountAdvisor..ctor(UserContext userContext, StoreObjectId folderId, EventObjectType objectType, EventType eventType)

Microsoft.Exchange.Clients.Owa.Core.OwaNotificationManager.CreateOwaFolderCountAdvisor(UserContext userContext, StoreObjectId folderId, EventObjectType objectType, EventType eventType)

Microsoft.Exchange.Clients.Owa.Premium.StartPage.OnInit(EventArgs e)

System.Web.UI.Control.InitRecursive(Control namingContainer)

System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)


Inner Exception
Exception type: Microsoft.Mapi.MapiExceptionNetworkError
Exception message: MapiExceptionNetworkError: Unable to make admin interface connection to server. (hr=0x80040115, ec=-2147221227) Diagnostic context: ...... Lid: 8600 dwParam: 0x721 Msg: EEInfo: ProcessID: 4208 Lid: 12696 dwParam: 0x721 Msg: EEInfo: Generation Time: 2008-12-02 16:23:58:282 Lid: 10648 dwParam: 0x721 Msg: EEInfo: Generating component: 3 Lid: 14744 dwParam: 0x721 Msg: EEInfo: Status: -2146893022 Lid: 9624 dwParam: 0x721 Msg: EEInfo: Detection location: 150 Lid: 13720 dwParam: 0x721 Msg: EEInfo: Flags: 0 Lid: 11672 dwParam: 0x721 Msg: EEInfo: NumberOfParameters: 3 Lid: 12952 dwParam: 0x721 Msg: EEInfo: prm[0]: Long val: 9 Lid: 12952 dwParam: 0x721 Msg: EEInfo: prm[1]: Long val: 6 Lid: 12952 dwParam: 0x721 Msg: EEInfo: prm[2]: Long val: 0 Lid: 24060 StoreEc: 0x80040115 Lid: 23746 Lid: 31938 StoreEc: 0x80040115 Lid: 19650 Lid: 27842 StoreEc: 0x80040115 Lid: 20866 Lid: 29058 StoreEc: 0x80040115

Call stack

Microsoft.Mapi.MapiExceptionHelper.ThrowIfError(String message, Int32 hresult, Int32 ec, DiagnosticContext diagCtx)

Microsoft.Mapi.ExRpcAdmin.Create(String server, String user, String domain, String password)

Microsoft.Exchange.Data.Storage.EventPump..ctor(EventPumpManager eventPumpManager, String server, Guid mdbGuid)


I've seen numerous things fix errors like this.  Restarting the Information Store (or verifying that it is started), or restarting the AD Topology Service and then the Information Store (one starts before the other, so creating a dependency would be a good idea), are both common fixes.  This problem was not fixed by either.

This particular issue seemed to affect only OWA Premium users, not Lite users.  We also received sporadic reports of issues creating MAPI profiles.

About an hour earlier, we had fixed a clustering issue where the network name was in a failed state.  After checking the event logs, it was obvious that the disaster recovery machine was stealing the network name after an SCR fail over was tested.

Event Type: Error
Event Source: ClusSvc
Event Category: Network Name Resource
Event ID: 1214
Date: 10/31/2006
Time: 7:30:45 AM
User: N/A
Computer: NODE1
Description:
Cluster Network Name resource 'Network Name (EXCHANGE)' cannot be brought online because the name could not be added to the system for the following reason: You were not connected because a duplicate name exists on the network. Go to System in Control Panel to change the computer name and try again.

Event Type: Error
Event Source: NetBT
Event Category: None
Event ID: 4321
Date: 10/31/2006
Time: 7:45:23 AM
User: N/A
Computer: NODE1
Description:
The name "EXCHANGE          :20" could not be registered on the Interface with IP address XXX.XXX.XXX.XXX. The machine with the IP address XXX.XXX.XXX.XXX did not allow the name to be claimed by this machine.


We had rebooted the SCR server to get Exchange up and working, but we had neglected to keep it shut down until it could be cleaned up.  When it came back up, it interfered with Exchange authentication and generated a number of Kerberos errors.  Upon checking, a coworker noticed that the DCs weren't showing up as accessible in the EMC.  After shutting down the SCR node completely and failing Exchange over, the DCs repopulated, and OWA worked.  The moral of the story: don't leave your SCR failbacks in an incomplete state, or you get the fun task of scheduling a new maintenance window to perform the cleanup.
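
In hindsight, the numbers buried in the OWA inner exception were already pointing at an authentication problem.  The EEInfo "Status" value is a signed 32-bit integer, and converting it to hex makes it searchable.  Here is a small Python sketch of that conversion.  The helper names are hypothetical, and the symbolic names in the table are the standard Windows constants as I know them, so verify them against winerror.h before relying on them:

```python
def to_hresult_hex(signed_status: int) -> str:
    """Render a signed 32-bit status as an unsigned 0xXXXXXXXX string."""
    return f"0x{signed_status & 0xFFFFFFFF:08X}"


# Assumption: names taken from the standard Windows/MAPI error headers;
# only the codes that appear in the exception above are listed.
KNOWN_CODES = {
    "0x80090322": "SEC_E_WRONG_PRINCIPAL",
    "0x00000721": "RPC_S_SEC_PKG_ERROR",
    "0x80040115": "MAPI_E_NETWORK_ERROR",
}


def describe(signed_status: int) -> str:
    """Return the hex code followed by its symbolic name, if known."""
    code = to_hresult_hex(signed_status)
    return f"{code} {KNOWN_CODES.get(code, 'unknown')}"


if __name__ == "__main__":
    print(describe(-2146893022))  # EEInfo Status from the inner exception
    print(describe(0x721))        # dwParam from the diagnostic context
    print(describe(-2147221227))  # ec value from the exception message
```

A wrong-principal status fits the picture of another machine holding the Exchange network name while Kerberos tickets were being issued for it.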