Tuesday, November 24, 2009

What I learned today – temp file created when reading from multiple trace files

I’m going to do some updating on past posts before moving forward on my BI learning. Today I want to close the book on using Profiler.

First, there’s an excellent series of three webcasts on Profiler on the MSDEV website. MSDEV is a great resource for anyone looking for webcasts on Microsoft developer products. Check out the site even if you’re familiar with Profiler; chances are you’ll find something you didn’t know before in one of the other webcast series.

Second, Profiler will create a temporary file when you read from multiple trace files. Like a lot of things, I found this out by accident. I had created a trace for replay on one of our production servers. It wasn’t until the trace had been running for a while that I noticed that I had forgotten to change the maximum size of the trace files. I let the trace complete, knowing I could read them all later. My trace ended up creating 369 .trc files totaling about 1.89 GB.

So I began reading all of the .trc files in Profiler, meaning to save the results into one .trc file. While Profiler was reading the files, I noticed that the amount of free space on the C drive of my test server was dropping. It bottomed out when Profiler finished reading, and when I investigated, I found a file in my Local Settings\Temp folder that was the same size as the total size of the 369 .trc files. The temp file was deleted automatically when I closed the Profiler session. Finally, I opened the single large .trc file I had saved when reading the multiple smaller files. This time no temporary file was created at all.

I wasn’t expecting the temp file to be created. Luckily this happened on my test server, where I had enough free space. But it’s something to consider anyway; you may easily run out of space on your C drive without knowing why. I don’t know what would happen if I read the same trace with fewer, larger .trc files; that’s something I’ll need to look at for sure.
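One thing I want to try next is skipping Profiler altogether and reading the rollover files server-side with the fn_trace_gettable function; since that keeps everything inside the database engine, I’d expect no local temp file. A minimal sketch, assuming made-up file names in a made-up C:\Traces folder:

    -- Passing DEFAULT as the second argument tells SQL Server to
    -- keep reading rollover files after the initial one until
    -- there are none left.
    SELECT *
    INTO dbo.ReplayTraceData   -- lands all the files' events in one table
    FROM sys.fn_trace_gettable(N'C:\Traces\ReplayTrace.trc', DEFAULT);

    -- From here the events can be queried like any other table
    SELECT TextData, Duration, StartTime
    FROM dbo.ReplayTraceData
    WHERE EventClass = 12;     -- 12 = SQL:BatchCompleted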

Thursday, November 12, 2009

What I learned today – GID in sysusers table

I had a question from a colleague yesterday about the GID field in the sysusers table. He had a SQL 2000 database that he had moved to a SQL 2005 instance. When he did this, his third-party application stopped working. When he investigated, he noticed that the app was using the GID field in sysusers to validate logins and set roles. After he restored the database on the new server, the GID field was 0 for everyone, and at that point no one could connect to the database through the app.

In SQL 2000, when you add a user to a database, the GID is 0 for the public role. If you add the user to a database role, the GID becomes the UID of that role. I’m not sure what happens if you add a user to more than one database role; that’s something I’ll have to check out. In SQL 2005 this has changed: the GID for all database users is 0, and it’s not a field that you can update.
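A quick way to see the difference for yourself is to query sysusers (a real table in SQL 2000, but only a compatibility view in SQL 2005) on each instance; just a simple check:

    -- On SQL 2000 the gid column reflects role membership;
    -- on SQL 2005 it comes back as 0 for every database user.
    SELECT name, uid, gid
    FROM sysusers
    WHERE issqluser = 1   -- SQL Server users only
    ORDER BY name;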

This is a good example of why you should never depend on the system tables staying the same. Microsoft has always warned that they may change when you apply patches or when you upgrade.

Monday, November 9, 2009

SSIS Expressions Cheatsheet

Maybe you’re new to SSIS, like I am, or maybe you just need a little refresher on expressions in SSIS. I just ran across the SSIS Expressions Cheatsheet. This is a page on the PragmaticWorks website. It’s also available as a downloadable .pdf file. For me, anyway, it’s a great one-page reference. I’ve got it printed and pinned to my cube wall.


I learned about this resource from Dougbert.com.

Tuesday, November 3, 2009

My first data warehouse – Part 1

I’m going to lay a little groundwork for future posts here. My plan for learning BI, and also for creating a useful data warehouse from scratch, is to build one based on reporting server performance for my company’s eventual use. I want to be able to drill down to an individual report to view performance data, as well as summarize report performance by location, by date, by user, by date range, and by other factors that I don’t envision right now.

Today I’m going to discuss the download I found on the CodePlex site, thanks to Tyler Chessman from SQL Magazine. This package and the accompanying reports will be incorporated into my data warehouse project.

To use the downloaded code you’ll need to create a database. The examples use RSExecutionLog as the database name, so I kept the same name. The download includes a script to create the necessary tables and relationships. The tables it creates are used to store data extracted from the ExecutionLogStorage table in your SQL 2008 report server database. Mine has the default name of ReportServer, created when I configured Report Server for the first time. This database stores the report catalog along with useful information about when reports are run.
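If you want a peek at the raw data the package will extract, you can query the log table directly. A quick look at the most recent executions, assuming the default ReportServer database name:

    -- ExecutionLogStorage is the raw log table in SQL 2008;
    -- one row is written for each report execution.
    SELECT TOP 10
        ReportID,
        UserName,
        TimeStart,
        TimeEnd,
        Status,
        ByteCount,
        [RowCount]        -- bracketed: ROWCOUNT is a reserved keyword
    FROM ReportServer.dbo.ExecutionLogStorage
    ORDER BY TimeStart DESC;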

There are two Visual Studio solutions in the download: an SSIS solution called RSExecutionLog with a single package called RSExecutionLog_Update.dtsx, and a reporting solution with three sample reports to run against the extracted report server data and two more to view schema and table information. I’m going to concentrate on the first two.

I had some initial issues with the package and reports. They were written using SQL 2005; I’m using SQL 2008. After converting the reports (basically just opening the report solution in VS 2008), I still had issues with the package and its connections. There were places where the connection to the source or destination seemed to be hard-coded inside different components of the package, even after I changed the source and destination connections. I ended up building a new package based on the original.

The package is pretty sweet. Not only does it populate tables for users and reports, it also creates one for the report parameters. When the parameter string is stored in the report server catalog, it looks something like “PARAM1=123&PARAM2=456”. The package shreds that string into individual values, so now I can easily search by parameter. I had been thinking of doing the same thing, only storing the values as one record instead of one record per parameter.
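The shredding itself happens inside the SSIS package; here’s roughly the same split done in T-SQL, just my own sketch against a sample string, not the package’s actual code:

    DECLARE @params nvarchar(max);
    DECLARE @pair nvarchar(max);
    DECLARE @pos int;
    DECLARE @shredded TABLE (ParamName nvarchar(128), ParamValue nvarchar(max));

    SET @params = N'PARAM1=123&PARAM2=456';   -- sample catalog string

    WHILE LEN(@params) > 0
    BEGIN
        -- Peel off everything up to the next '&', or the whole
        -- remainder when no '&' is left
        SET @pos = CHARINDEX(N'&', @params);
        SET @pair = CASE WHEN @pos > 0 THEN LEFT(@params, @pos - 1) ELSE @params END;
        SET @params = CASE WHEN @pos > 0 THEN SUBSTRING(@params, @pos + 1, LEN(@params)) ELSE N'' END;

        -- Split the NAME=VALUE pair on '=' and keep one row per parameter
        INSERT INTO @shredded (ParamName, ParamValue)
        VALUES (LEFT(@pair, CHARINDEX(N'=', @pair) - 1),
                SUBSTRING(@pair, CHARINDEX(N'=', @pair) + 1, LEN(@pair)));
    END;

    SELECT ParamName, ParamValue FROM @shredded;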

So I can use this package with very little customization. My next step is to pull data into my data warehouse that is tied to each report execution. That package should be fairly straightforward.

My next post will show my database design.