Tuesday, October 14, 2008

"ETL"ing Source Code

For the past couple of weeks, I've been between projects, which has gotten me involved in a number of "odd jobs". An interesting pattern I'm seeing in them is querying, joining, and updating data from traditionally "unlikely" sources... especially code.

SQL databases are heavily involved, but I find myself querying the system views that describe the schema itself, rather than its contents. In fact, I'm doing so much of this that I'm building skeleton databases... no data, just schema, stored procs, and supporting structures.
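To make "querying the schema rather than the contents" concrete, here's a minimal sketch. It swaps in SQLite (via Python's built-in sqlite3 module) so it runs anywhere; on SQL Server the equivalent queries would go against system views such as INFORMATION_SCHEMA.COLUMNS. The table names and the build_skeleton / describe_schema helpers are made up for illustration.

```python
import sqlite3


def build_skeleton(conn):
    """Create a schema-only ("skeleton") database: tables, but no rows.

    The customer/invoice tables are hypothetical examples.
    """
    conn.executescript("""
        CREATE TABLE customer (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE invoice (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            total REAL
        );
    """)


def describe_schema(conn):
    """Return {table: [(column, declared_type, not_null), ...]}."""
    # sqlite_master is SQLite's catalog of schema objects.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(c[1], c[2], bool(c[3])) for c in cols]
    return schema


conn = sqlite3.connect(":memory:")
build_skeleton(conn)
schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

Nothing here touches a single data row; the "data" being queried is the schema itself, which is why an empty skeleton database is enough to work against.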

I'm also pulling and updating metadata from the likes of SharePoint sites, SSRS RDL files, SSIS packages... and most recently, CLR objects that were serialized and persisted to a file. In some cases, the output isn't a report at all, but more source code.
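As a rough illustration of metadata going in and source code coming out, the sketch below reads field definitions from an XML document and emits a Python class. The XML is a simplified, made-up fragment that only loosely resembles a report definition (it is not real RDL, which uses namespaces and a richer structure), and generate_classes is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metadata; NOT the real RDL schema.
REPORT_XML = """
<Report>
  <DataSets>
    <DataSet Name="Sales">
      <Fields>
        <Field Name="Region" Type="str"/>
        <Field Name="Amount" Type="float"/>
      </Fields>
    </DataSet>
  </DataSets>
</Report>
"""


def generate_classes(xml_text):
    """Extract dataset/field metadata and emit dataclass source code."""
    root = ET.fromstring(xml_text)
    lines = ["from dataclasses import dataclass", ""]
    for dataset in root.iter("DataSet"):
        # One generated class per dataset, one attribute per field.
        lines += ["", "@dataclass", f"class {dataset.get('Name')}Row:"]
        for field in dataset.iter("Field"):
            lines.append(f"    {field.get('Name')}: {field.get('Type')}")
    return "\n".join(lines) + "\n"


generated = generate_classes(REPORT_XML)
print(generated)
```

For the sample above, the generated text defines a SalesRow dataclass with Region and Amount fields. The same extract-then-emit shape applies whether the source is XML, a schema query, or a deserialized object.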

I've already blogged a bit about pulling SharePoint lists into ADO.NET DataSets. I'll post about some of the other fun stuff I've been hacking at soon.

I think the interesting part is how relatively easy it's becoming to write code to "ETL" source code.