Yesterday, I finished setting up a second instance of TFS 2005 so that I could test the upgrade to TFS 2008. I restored all the databases from the original TFS, ran the TfsAdminUtil utility to rename the application tier and the data tier, and connected with Visual Studio. Then all hell broke loose – our developers could no longer connect to the original TFS. We are a software company, and losing TFS would be tantamount to me losing my job. I almost cried and wet my pants… luckily my bladder is strong.
[Note: run TfsAdminUtil RenameDT <server name> BEFORE running ActivateAT <server name>, so that your AT correctly points at the cloned DT machine (after having restored the dbs).]
How did this problem happen?
In your “%userprofile%\Local Settings\Application Data\Microsoft\Team Foundation\1.0\Cache” directory there is a file called ServerMap.xml.
This file holds a definition for each TFS server you have connected to, pairing the server's URL with a GUID.
After dumping the HTTP traffic between my client and the servers, I discovered that whenever the client talks to TFS, the first thing it does is request the Registration.asmx web service. The output of this web service contains an InstanceId field with a GUID.
This InstanceId uniquely identifies a TFS server. What happens next is the problem. Because I had been working against the production TFS server, I had an entry in ServerMap.xml with a GUID and the production URL. When I executed against the development instance, my client looked at Registration.asmx, matched up the InstanceId, and then started using the cached URL – which happened to be the production server.
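To make the failure mode concrete, here is a toy Python model of the lookup described above. It is only a sketch of the observed behaviour, not the real TFS client code: the class, the host names, and the GUID are all made up for illustration.

```python
# Toy model of the client-side cache described above (not real TFS client code).
from dataclasses import dataclass, field


@dataclass
class ToyClient:
    # Maps InstanceId (GUID string) -> first URL seen for that GUID,
    # standing in for the entries in ServerMap.xml.
    server_map: dict = field(default_factory=dict)

    def connect(self, requested_url, instance_id_of):
        """Return the URL the client ends up talking to.

        instance_id_of is a hypothetical callable standing in for the
        Registration.asmx lookup: it takes a URL and returns a GUID string.
        """
        guid = instance_id_of(requested_url).lower()
        # setdefault keeps an existing entry, so a GUID that is already
        # cached wins over the URL that was actually requested.
        return self.server_map.setdefault(guid, requested_url)


# Both servers report the same GUID because the databases were cloned.
SHARED_GUID = "11111111-1111-1111-1111-111111111111"  # made-up value
instance_ids = {
    "http://tfs-prod:8080": SHARED_GUID,  # hypothetical production URL
    "http://tfs-dev:8080": SHARED_GUID,   # hypothetical development URL
}

client = ToyClient()
first = client.connect("http://tfs-prod:8080", instance_ids.__getitem__)
second = client.connect("http://tfs-dev:8080", instance_ids.__getitem__)
print(first)
print(second)  # the cached production URL, not the dev URL that was requested
```

Once the two servers report different GUIDs, the second call gets its own cache entry and resolves to the development URL.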
There’s not much documentation I could find supporting this; however, I did find a post on the forums from Buck: “Just copying the db is dangerous, because having two servers with the same guid will wreak havoc.” However, he recommends that you follow the MSDN instructions (which we did). The instructions seem correct, because if you are restoring a backup you probably don’t want to change the InstanceId; otherwise you will break all your clients’ workspaces.
Recommendations for setting up a second instance of TFS
So now, we’re rethinking our strategy for a development TFS server. How can this be prevented in the future? This is what we’re thinking about doing:
Run the development server in a different domain
Run the development server with a different service account (wish we’d done this one earlier!)
See Tip 3 below, then execute commands against the development server with different credentials – first run the following:
TfsAdminUtil RenameDT TeamFoundationDataTierServerName
then run: TfsAdminUtil ActivateAT newTeamFoundationApplicationTierServerName
Change the InstanceId – if we can find out how.
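The command ordering from the list above can be captured in a small Python helper so the two steps never get swapped. This is a sketch only: the server names are placeholders, and the helper defaults to a dry run that prints the commands rather than executing them.

```python
import subprocess


def clone_fixup_commands(data_tier, app_tier):
    """Return the two TfsAdminUtil invocations, RenameDT first, ActivateAT second."""
    return [
        ["TfsAdminUtil", "RenameDT", data_tier],
        ["TfsAdminUtil", "ActivateAT", app_tier],
    ]


def run_all(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises on a non-zero exit code, so ActivateAT
            # is never attempted after a failed RenameDT.
            subprocess.run(argv, check=True)


# Placeholder server names; substitute the cloned data tier and application tier.
commands = clone_fixup_commands("DEVTFS-DT", "DEVTFS-AT")
run_all(commands)
```

Calling run_all(commands, dry_run=False) on the cloned application tier would actually execute the two steps, assuming TfsAdminUtil is on the PATH.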
So the lesson to be learnt here is – be very careful when working with restored TFS databases!
Buck and James have kindly pointed out that there is actually a tool installed with the Application Tier. It’s called InstanceInfo.exe, and it is designed for exactly this. It’s just not documented anywhere except in this post on the forums:
Because you’ve "cloned" the production machine, there will now be 2 server machines with the same machine instance ID. Unfortunately our move instructions are for moving –
there is a way to "restamp" the cloned machine using a shipping command-line tool called InstanceInfo.exe (which can be found under the TFS install directory in the Tools folder on the Application Tier machine – along with the other server command-line tools, like TfsAdminUtil). You should restamp the server after following the other "move" steps.
After making this change it should be safe to connect a client to both the original server and the cloned server.
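One way to double-check the restamp before pointing clients at both machines is to save the Registration.asmx output from each server and compare the InstanceId values. The Python sketch below assumes only that the response is XML with an element or attribute named InstanceId holding a GUID. The sample payloads are invented and much simpler than a real response.

```python
import re
import xml.etree.ElementTree as ET

GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def local_name(name):
    """Drop an '{namespace}' prefix from an ElementTree tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def find_instance_id(xml_text):
    """Return the first GUID stored under the name InstanceId, or None."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        values = []
        if local_name(elem.tag) == "InstanceId" and elem.text:
            values.append(elem.text)
        values.extend(v for k, v in elem.attrib.items()
                      if local_name(k) == "InstanceId")
        for value in values:
            match = GUID_RE.search(value)
            if match:
                return match.group(0).lower()
    return None


def safe_to_use_both(xml_a, xml_b):
    """True only when both responses carry an InstanceId and the two differ."""
    id_a = find_instance_id(xml_a)
    id_b = find_instance_id(xml_b)
    return id_a is not None and id_b is not None and id_a != id_b


# Made-up payloads for illustration only.
original = ("<Registration><InstanceId>11111111-1111-1111-1111-111111111111"
            "</InstanceId></Registration>")
clone_before = original  # a fresh clone still reports the same GUID
clone_after = ("<Registration><InstanceId>22222222-2222-2222-2222-222222222222"
               "</InstanceId></Registration>")

print(safe_to_use_both(original, clone_before))
print(safe_to_use_both(original, clone_after))
```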
What we did was manually change the Instance ID on the TFSWarehouse database in SQL Server Management Studio.
The error text told us this:
TF30046: The instance information does not match. Team Foundation expected 3eeb4cd0-4cf6-485b-9d7c-6deb954d6582 but found c3ee6106-7b7a-4089-8896-77bacc302b3d
So we changed the Instance ID on the TFSWarehouse DB to 3eeb4cd0-4cf6-485b-9d7c-6deb954d6582, which is the same ID as the other TFS databases.
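If this error shows up in logs more than once, the two GUIDs can be pulled out mechanically. The Python sketch below parses the TF30046 text quoted above. The regular expression is built from that single sample message, so other wordings of the error may not match.

```python
import re

GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
TF30046 = re.compile(
    r"TF30046:.*?expected\s+(" + GUID + r")\s+but found\s+(" + GUID + r")",
    re.IGNORECASE | re.DOTALL,
)


def parse_tf30046(message):
    """Return (expected_guid, found_guid) in lower case, or None if no match."""
    match = TF30046.search(message)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


message = (
    "TF30046: The instance information does not match. Team Foundation "
    "expected 3eeb4cd0-4cf6-485b-9d7c-6deb954d6582 but found "
    "c3ee6106-7b7a-4089-8896-77bacc302b3d"
)
expected, found = parse_tf30046(message)
print("expected:", expected)
print("found:   ", found)
```

The "expected" value is the one the other TFS databases already share, which is the ID we copied onto TFSWarehouse.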
We did NOT re-run the InstanceInfo tool with the TFSWarehouse included – but we believe this would also work.
The instructions for using InstanceInfo provided by Dan Kershaw are missing the TFSWarehouse. A confirmation of this would be nice.
It is probably wiser to re-run the InstanceInfo.exe commands from scratch – but with the TFSWarehouse included.
So the correct commands should probably be:
"%TFSInstallDir%\Tools\InstanceInfo.exe" stamp /setup /install /rollback /d TFSWorkItemTracking, TFSBuild, TFSVersionControl, TFSIntegration, TFSWarehouse /s <<your new data tier>>
"%TFSInstallDir%\Tools\InstanceInfo.exe" stamp /d TFSWorkItemTracking, TFSBuild, TFSVersionControl, TFSIntegration, TFSWarehouse /s <<your new data tier>>
If you are going to take the second TFS off the domain and use a different account to run services, make sure you change the account in several places.
To replace tfsservice with a local account:
1) App Pools
2) Services: SharePoint Timer Service, TfsServerScheduler
3) SQL: SERVERNAME\tfsservice, tfssetup, tfsreports – replace all 3 of the DOMAIN\accounts you used to set this beast up.
To verify that the TFS services are running, go to http://server:8080/Services/v1.0/ServerStatus.asmx
SharePoint – install to STS_Config_TFS, use Server Farm mode
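The ServerStatus.asmx check mentioned above can also be scripted. The Python sketch below only reports the HTTP status code for that URL. It does not authenticate, so a server that insists on Windows credentials will most likely answer 401, which still shows that something is listening. The host name is a placeholder.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def server_status_url(host, port=8080):
    """Build the ServerStatus.asmx URL quoted in the comment above."""
    return f"http://{host}:{port}/Services/v1.0/ServerStatus.asmx"


def probe(host, port=8080, opener=urlopen, timeout=10):
    """Return the HTTP status code, or None if nothing answered at all."""
    try:
        with opener(server_status_url(host, port), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, just not with a success code (for example 401).
        return err.code
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return None


if __name__ == "__main__":
    # "server" is a placeholder host name, as in the URL above.
    print(probe("server"))
```

The opener parameter exists only so the function can be exercised without a live server; in normal use the default urlopen is what gets called.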