When SSIS is used to export data to a MS Excel workbook, Excel tries to guess the data type from the first row of data. (Tip: This can be overridden by the IMEX=n parameter in the Excel connection string, where n is the number of rows you want it to scan). As it happens, Excel can not always win the guessing game (or maybe it just plays it safe) and treats data as plain text. Sometimes even numbers and URLs are treated as text and an apostrophe is appended at the beginning of the data value.
If you have a requirement to export hot (i.e. clickable) URLs to Excel, you can use the workaround that I am going to demonstrate below.
We will use the following SQL table. The sample data is intentionally in bad format. We will fix it with a query -
create table URLDemo
id int identity
insert into URLDemo
-- Data in table ------------------------
select id, url, friendlyname
-- Clean data with query ----------------
when left(url, 4) = 'www.' then 'http://'+url
when left(url, 11) <> 'http://www.' then 'http://www.'+url
when left(url, 7) <> 'http://' then 'http://'+url
) as url,
when isnull(friendlyname, '') = '' then url
) as friendlyname
The Clean Data query is an attempt to add any missing http://www. to the URL and also show a friendly name if missing. These will be useful later for the Excel HYPERLINK formula.
The problem -
We can export this data to Excel, even try to encode the Excel HYPERLINK formula as a derived column, the url and the formula will show up just as plain text, that is they will not be clickable.
The workaround -
Create an Excel Workbook that will be the template for data export. Type in the column names [id], [url] and [friendlyname] for column A, B and C respectively. Put the following Excel formula in the column D :-
=IF( AND(B1<>"", C1<>""), HYPERLINK(B1,C1), "")
Drag this formula cell to the number of rows as appropriate for your expected data volume. I did it for the first 20 rows. The idea here is to populate the [url] and [friendlyname] columns via SSIS and let the formula create a clickable link for us.
Then select cells A1 to C1 and give this range a name, e.g. MyDataRange.
Now create an SSIS package. Add a Data Flow Task and double-click to open its designer. Add a OLEDB Source in the Data Flow designer. The query in this editor is the Clean Data query above.
Add a Data Conversion Task. It will be used to convert non-unicode (VARCHAR) columns namely [url] and [friendlyname] to unicode. Excel likes to have character data in unicode.
Add an Excel Destination. It will use the data range name. Verify the column mappings.
Gyaan: Excel tries to identify the last “dirty” row in the worksheet and puts new data from SSIS in the next row. In my case it will be row 21 because I dragged the formula till row 20. Even if the cells once used are emptied by deleting the values/formulas, they are still considered “dirty”. The only way to remove the “dirty” flag from a cell is to delete the whole row! If the package will put data in row 21 and beyond, we miss out on the formulas that we put in the first 20 rows. Using the range tells Excel to look for dirty cells only within the range, thus allowing us to use the formulas in the adjacent cells.
Now verify the column mappings. Note that Excel columns [url] and [friendlyname] are mapped to the [urlWSTR] and [friendlynameWSTR] of the Data Conversion.
That is it, execute the package and click away at the URLs!