[SP20xx] Migrating HTM, HTML, MHT files to SharePoint
It seems this issue arises in almost every migration project: “Hey, we have a bunch of HTML-based files which we’ve uploaded to SharePoint, but now they don’t work any more!” Here, “don’t work” usually means the user is prompted to download the file instead of being able to view it in the browser. So it is working, but the end result is not what the end user expects. Also, “a bunch” is usually a couple of hundred files which were exported from some legacy application a few years back and have been sitting on a file share or in SharePoint 2007 ever since.
The problem
So first we need to understand what the problem is. I won’t go into too much detail because there are already enough blogs out there on this. There are three things you should know about:
- Trusted MIME types: each file has a MIME type, which describes the type of file we’re dealing with. SharePoint comes with a default list of trusted MIME types which you can change via PowerShell. Check out this blog on TechNet for a deep dive.
- Blocked file types: the list of blocked file types is found under “Security” in Central Administration. That alone should be an indication of what it’s for: security. Use this list to manage the types of files a user can upload to SharePoint, based on their extension (.zip, .exe, etc.).
- Browser File Handling: this is a setting which you manage at the web application level. It has two values: Strict and Permissive.
So, when you want to use HTML files on SharePoint, there are a few things you can bump into. First, the file type might be in the blocked file types list, which means you cannot upload the files at all. You might have avoided this because your files were migrated from a SP2007 site where this usually wasn’t an issue. Or your administrator removed the file type from the blocked list, which they shouldn’t have done (more about that later).
If you were able to upload, or the files were already there, the browser file handling is normally set to Strict and the HTML MIME type is not in the trusted MIME types list. That combination causes SharePoint to add a security header, which in turn tells the browser to download the file instead of opening it. That is not what your users want: they want the file to open in their browser.
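To make the interplay of the three settings concrete, here is a minimal sketch in Python. It is a simplified model of the behaviour described above, not SharePoint’s actual implementation: the `WebAppSettings` class, the `outcome` function, the default lists and the small extension-to-MIME table are all made up for illustration.

```python
from dataclasses import dataclass, field

# Illustrative extension-to-MIME mapping (an assumption, not SharePoint's own table).
MIME_BY_EXTENSION = {
    "htm": "text/html",
    "html": "text/html",
    "mht": "message/rfc822",
    "pdf": "application/pdf",
}


@dataclass
class WebAppSettings:
    """Hypothetical stand-in for the three settings discussed above."""
    blocked_extensions: set = field(default_factory=lambda: {"exe", "bat"})
    trusted_mime_types: set = field(default_factory=lambda: {"application/pdf"})
    browser_file_handling: str = "Strict"  # or "Permissive"


def outcome(filename: str, settings: WebAppSettings) -> str:
    """Return what the end user experiences for a given file name."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""

    # 1. Blocked file types stop the upload before anything else matters.
    if ext in settings.blocked_extensions:
        return "upload blocked"

    # 2. Strict handling plus an untrusted MIME type forces a download.
    mime = MIME_BY_EXTENSION.get(ext, "application/octet-stream")
    if (settings.browser_file_handling == "Strict"
            and mime not in settings.trusted_mime_types):
        return "download prompt"

    # 3. Otherwise the browser is allowed to render the file.
    return "opens in browser"


if __name__ == "__main__":
    print(outcome("report.html", WebAppSettings()))
```

With the defaults in this sketch, an uploaded `report.html` ends up in the second branch, which is the situation most migration projects run into.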
Solutions
Without too much detail about the why, let’s skip to the solution part. The bad news: you should not allow people to view HTML-based files straight from SharePoint in their browser. There are two reasons, and the security risk is the most important one. Users can download all kinds of weird pages and scripts from the internet and host them on SharePoint. Colleagues might click these pages and run scripts they should not run. Be warned: a JavaScript file is perfectly capable of asking the user for credentials and sending those to some remote server, especially when it is running in IE’s Intranet zone, which has low security by default. The second reason applies to companies that are thinking about moving to the cloud. Note that Office 365 and other hosted SharePoint services usually do not allow you to host these files. So you’re creating migration issues for yourself if you allow it on-prem.
The good news: there are some things you can offer your users to achieve the same result.
- SharePoint pages: for plain HTML content, it’s perfectly doable to create some pages in SharePoint and copy/paste the contents of the files. You can do this manually, find a tool which does it for you, or build such a tool yourself. When your files use all kinds of scripts, linked CSS files and more, this might not be the best solution though.
- Host the files / Page Viewer web part: you’re not forced to host the files on SharePoint, and hosting them elsewhere would be my preference. When your IT department has a web server available (don’t use SharePoint front-end servers for this…), explore the option of hosting the files there. Once they are hosted, you can use the Page Viewer web part to include the content in a SharePoint site. For your end users, it will look like the content is in SharePoint when it’s not.
- Office Web Apps: in some cases, the HTML content was exported from Office applications. This is usually Visio, Word or PowerPoint content. Know that you can host the original files of these types in SharePoint and display them in the browser by using Office Web Apps. Office Web Apps comes with web parts which you can include on a page to display specific Office documents on that page. The downside is that Office Web Apps requires a license and, starting with version 2013, a separate server.
- Host HTML in SharePoint: as a last resort, you can configure SharePoint to host the files. That includes removing the .html / .htm / .mht extensions from the blocked file types list. You also need to either add the text/html MIME type to the trusted MIME types, or change the browser file handling of the web application to Permissive (adding the MIME type is the least bad option). Note that these changes will affect all sites in your web application, and the blocked file types list is likewise managed per web application. I would definitely recommend not doing this.
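With a couple of hundred files, it can help to triage them before picking an option. The sketch below uses Python’s standard `html.parser` to check whether a file references scripts or stylesheets, and whether it carries a `generator` meta tag hinting at an Office export. The `suggest_option` function and the generator strings it matches are my own assumptions for illustration, so check what your exports actually contain before relying on it.

```python
from html.parser import HTMLParser


class DependencyScanner(HTMLParser):
    """Counts script and stylesheet references and records the generator meta tag."""

    def __init__(self):
        super().__init__()
        self.scripts = 0
        self.stylesheets = 0
        self.generator = ""

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; values may be None.
        a = {name: (value or "") for name, value in attrs}
        if tag == "script":
            self.scripts += 1
        elif tag == "link" and "stylesheet" in a.get("rel", "").lower():
            self.stylesheets += 1
        elif tag == "meta" and a.get("name", "").lower() == "generator":
            self.generator = a.get("content", "")


# Assumed generator strings for Office exports; adjust to your own files.
OFFICE_GENERATORS = ("microsoft word", "microsoft visio", "microsoft powerpoint")


def suggest_option(html_text: str) -> str:
    """Map one HTML document to one of the options listed above."""
    scanner = DependencyScanner()
    scanner.feed(html_text)
    scanner.close()

    generator = scanner.generator.lower()
    if any(name in generator for name in OFFICE_GENERATORS):
        return "use the original Office document"
    if scanner.scripts or scanner.stylesheets:
        return "host on a separate web server"
    return "copy into a SharePoint page"


if __name__ == "__main__":
    sample = '<html><head><link rel="stylesheet" href="site.css"></head></html>'
    print(suggest_option(sample))
```

Run over a whole folder, this gives a rough first split between files that can simply be pasted into a page and files that need hosting elsewhere. It is a heuristic only and does not replace a manual check.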
Those are the options I normally recommend to my customers. There might be other good ones out there, so feel free to comment. The important lesson to learn here is that the easiest way to get it to work is also the one you don’t want to use.