Externalizing SharePoint Content Storage
SharePoint has been used for document storage along with the document management and work flow association since the beginning of SharePoint evolution. SharePoint by default store the file system with in the content database as binary large object format. This design has very high level of drawbacks with SQL bloat. The database will be inflated with storage of large volume of files or with big files. To resolve this problems the concept of externalizing the storage in SharePoint is introduced in SharePoint.
Externalizing SharePoint Content Storage Evolution
At the time of SharePoint 2007 a concept called External Blob Storage (EBS) was introduced in SharePoint to externalize the storage of large files, thus to improve the performance. External Blob storage is a mechanism to store the large files in the physical location than storing the files in the content database in BLOB format.
The counterpart of EBS in SharePoint 2010 was Remote Blob Storage (RBS) which came with faster API along with a different way of implementation. If we examine the implementation of EBS and RBS, It is found that both technology store the metadata in the content database and the actual content reside as files outside the content database.
EBS is implemented by MOSS 2007 (available as a hot fix to MOSS 2007 SP1 and later). The EBS provider lives at the very bottom of the SharePoint stack, just above the interface in to SQL Server. Just before the BLOB is passed to SQL server the EBS provider gives process the opportunity to optionally take ownership of the BLOB. A token in exchanged for each stored file internally so it knows how to get the object back from database.
RBS is implemented by SQL Server, (only SQL Server 2008 and later); RBS actually sits as a library attached to the application, not as part of the SQL stack. When you enable RBS, all BLOB streams that SQL Server would normally be compelled to store internally are spewed forth to the file system. You can limit this storage based on the file size, to decide the file below certain specified size will reside in the content database and file above the specified size will held in the physical location.
There are pros and cons while comparison of both RBS and EBS, but generally RBS is primarily targeted at large scale CAS systems and is in fact a direct replacement of EBS.EBS will probably not survive in the long term. For what it is worth, Microsoft ensure that a transition to RBS in the future would be seamless.
Features in RBS compared to EBS
RBS has Managed interface whereas EBS’s is unmanaged. The RBS BLOB store scope is per Farm, EBS is per Content DB. SharePoint 2010 adds the Configurable Maintainer to sync operations from SharePoint down to RBS.RBS shows a commitment to the integration by having a PowerShell-based UI within SharePoint. RBS supports multiple providers. RBS implementation includes PowerShell-based migration between content stores. That means, move BLOBs from SQL to any RBS provider, from RBS provider to RBS provider and from EBS to RBS, where a ‘shallow migration’ is possible the BLOB would stay where it is but the management of it would be migrated (only for already externalized BLOBs).
Microsoft put forward the following bullet points in promotion of RBS implementations
• Optimal Capital Expense
o Trade cost effective BLOB storage for expensive SQL Storage
o Ability to group/store BLOB separate from Metadata
• Optimal Operational Expense
o Storage management beyond SQL
o Facilitates cost effective/optimal backup/DR policies for BLOB vs. Metadata
• Take advantage of advanced storage features provided by BLOB store vendors
o Expunge, multiple storage locations, immutable writes
o Guaranteed retention, guaranteed deletion,
• Hierarchical Storage Management
o SQL and layers of BLOB stores offer more savings in CapEx and OpEx
o Efficient access patterns.
More about RBS
RBS is implemented in SQL Server and is application agnostic. That’s to say, if you turn RBS on then all BLOB objects from any SQL Server-based application will be externalized. If that’s what you want to happen then that’s great but if you need to be able to apply business logic to what is externalized and whence it goes then you are severely restricted. Still you can restrict some files resides in the content database and some in the blob location based on the file size.
RBS is simple to implement – turn RBS on and the content is simply stored on to the local file system. If you have some kind of file system virtualization software in place then you can do some basic management tasks but only based on the file system attributes of the object. If you want access to the context and metadata of the object then you are going to have to dip in to SQL Server and start hunting down SharePoint based reference information; Microsoft do not recommend this - in fact they do not publicly publish the DB schema for SharePoint so it would be potentially dangerous.
It is highly recommended by Microsoft to read the following link (http://technet.microsoft.com/en-us/library/ff628583.aspx) regarding the capacity planning for SQL server and SharePoint before we go to implement the RBS for SharePoint. This will give the insight of under what situations we should opt for the RBS.
Configuring Remote Blob Storage
Configuring RBS does not come with a single click, it need a series of steps to do in SQL server, Install wizard or in the power shell management window. The following subtitles depict the sequence of implementing RBS.
Enable file stream in SQL server
The first thing we need to do for this is to configure SQL server 2008 R2 to enable File stream. This configuration needs to be done for the instance of the SQL server that runs for the SharePoint. From SQL Server Configuration Manager, Click SQL Server Services in the left panel. Right Click on your target database server and then click properties.
Fig. Configuring File Stream in the SQL server instance.
Then select the file stream tab and enable the following by checking the option button as depicted in the below figure. What we are doing here is nothing but enabling file stream for T-SQL access and file I/O streaming access and also enabling the remote clients to have access to the file stream data. The third option is to use when the web front ends and the file storage location reside on different servers.
Fig. File Stream options enabling.
Configure File streaming for SharePoint
The next thing we need to do is configure File streaming for SharePoint. Open SQL server management studio and run the following T-SQL statement in the master data base.
EXEC sp_configure filestream_access_level, 2 RECONFIGURE
This will change the file stream access level from 1 to 2, in case the file stream was initially enabled while installation of SQL server this statement will run successfully and will display the result as file stream access level is changed from 2 to 2, that means no change and all are good to go.
Fig. Setting File stream access level.
The next step is to either create a new application that uses the SQL server 2008 R2 database as content database or to identify the database name (database should be in SQL server 2008 R2) of the existing application. We can easily identify this using a power shell script as follows.
>$cdb = Get-SPContentDatabase –WebApplication <webapplication url>
This will display the content database name depicted in the following figure.
Fig. Getting the Content database details.
Say in this case the database name is “WSS_Content”. Next we need to create a master key for our database, Back in SQL Server Management Studio And need to create a file group and add our file stream file For this run the following sql statement in the wss_content(your database for the application) database.
/*To create a master key for the database*/
if not exists (select * from sys.symmetric_keys where name = N'##MS_DatabaseMasterKey##')create master key encryption by password = N'Admin Key Password !2#4'
/*To add file group*/
if not exists (select groupname from sysfilegroups where groupname=N'RBSFilestreamProvider')alter database [WSS_Content] add filegroup RBSFilestreamProvider contains filestream
/*To add file stream file to filegroup provider*/
Alter database [WSS_Content] add file (name = RBSFilestreamFile, filename = 'c:\Blobstore') to filegroup RBSFilestreamProvider
Note: WSS_Content mentioned in the above script need to be replaced by the database name used by the SharePoint application.
Installing RBS package
Once these steps are completed we can install RBS. For this down load the RBS.msi and install the package. This package will install a few stored procedures, tables in the content database and some API s that interact with SharePoint.
Fig. Start the RBS wizard.
Provide the Server Instance Name, Database Name and File group for internal tables.
Fig. Setting the RBS instance, database, and file group.
Ensure the connection is proper by clicking the Test Connection before you click next. In the next screen input the file stream blob store file group and information’s to connect to the file stream blob store. By default the connection will be maintained in the same server as the content database, but there is option to configure the same to a different database.
Fig. Setting the Filestream blob store.
In the next screen of wizard you can configure the store name.
Fig. Setting the Filestream store name.
In the next screen the configuration for running the RBS maintainer can be set.
Fig. Setting the Maintainer task.
There is an option for schedule the Maintainer at the later point of time, this option is explained in the below session “RBS Maintainer” this is the Microsoft suggested way to do the Scheduling. But we can go with the default configuration provided by the wizard. After all RBS maintainer is a EXE file which need to be run explicitly by a scheduler. It is important to configure the task scheduler with the task properties and the account under which it need to run.
Next screen is where you need to configure the type of error to different target.
Fig. Setting the RBS Logging settings.
By Clicking Next you will be installing the RBS in your SQL server.
Enable the RBS in the SharePoint.
Use the following power shell scripts from the PowerShell command window to enable the RBS in the SharePoint for a particular application. Run the scripts one after the other in the given order.
/*Get the content database*/
>$cdb = Get-SPContentDatabase –WebApplication <webapplication url>
/*Get the remote blob storage settings */
>$rbss = $cdb.RemoteBlobStorageSettings
/*Check remote blob storage install*/
/*Enable remote blob storage for the application*/
/*Set the Provider for remote blob storage*/
Testing the RBS
Create a Site under the web application which is configured for RBS and create a document library in it. Load a file in the document library. And check the presence of the file in the folder which is configured via the tsl query “To add file stream file to filegroup provider”(in the physical path mentioned in the query).You can see the same file with a system generated name got added there. The file can be opened and verified by the same application that uses to open the file. But ensure that nothing get modified from the file blob location for safer side.
Setting the File Size Limit for Storage in BLOB location
With the help of the Power shell script we can set the file size limit from which we can keep in the BLOB location. This option is available in the wizard also while we configure. For example the following command (please refer how we derive at $rbss in the steps mentioned in the “Enable the RBS in Sharepoint” section)
Will set the minimum blob storage size to 1mb, if you would like to raise the threshold, you can enter any desired number. By this setting any file less than 1 MB will reside in the content database itself and file above the 1 MB in the blob location.
RBS Maintainer module is the piece that creates an element of awareness between SharePoint and RBS. It tracks the deletions in SharePoint and pushes them to the BLOB store. Interestingly it includes the ability to set up rules to delay the delete from the BLOB store to make it easier to manage restores – if the BLOB is still in the external store when you restore SQL then you are probably in good shape. RBS maintainer the clean-up of BLOBs for failed transactions - orphaned BLOBs. Basically garbage collection management is the function doing by the RBS maintainer. RBS Maintainer can be run on the DB server or separate box. RBS Maintainer should be scheduled to run periodically as a separate job.
RBS Maintainer either needs to install at the time of the installation of RBS package or need to configure explicitly. By default the wizard help to configure the RBS maintainer, the maintainer is an exe, which is configured by a connection string mentioned in the configuration file reside in the same folder.
The connection string in the Maintainer is specified in the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe file (lives with the exe). The connection strings are either all encrypted or none. The default connection string that the RBS specifies is always encrypted and it’s a good practice to encrypt the connection strings.
Maintainer Connection string Encrypting
Here is how we need to specify the connection strings:[From any WFE where you want to run GC]
For each content Database that is RBS enabled: Add content database connection string to the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config file
a) Rename the existing config file to web.config using following command (located in %programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer )
cd /d %programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer
ren Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config web.config
b) Un-encrypt the connection strings from the config file by using following command (aspnet_regiis is located in %windir%\Microsoft.NET\Framework64\v2.0.50727
cd /d %windir%\Microsoft.NET\Framework64\v2.0.50727
aspnet_regiis.exe -pdf connectionStrings "%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer"
c) Add additional connection string to the web.config file for each content database that is rbs enabled like following . Change the name of connection string , data source and Initial Catalog as needed
<add connectionString="Data Source=localhost;Initial Catalog=RemoteBlobStorage;Integrated Security=True;Application Name="Remote Blob Storage Maintainer"" providerName="System.Data.SqlClient" />
<add connectionString="Data Source=wsslab40;Initial Catalog=wss_content;Integrated Security=True;Application Name="Remote Blob Storage Maintainer for WSS_Content"" providerName="System.Data.SqlClient" />
d) Encrypt the web.config file again by using following command.
cd /d %windir%\Microsoft.NET\Framework64\v2.0.50727
aspnet_regiis -pef connectionStrings "%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer" -prov DataProtectionConfigurationProvider
e) Rename the file back to original
cd /d %programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer
ren web.config Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config
Note: 1) XML file is case sensitive; you need to use the exact string for 'connectionStrings' parameter above.
2) In case there are more than one content database connection string need to set for each content database in the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config file.
Create a Maintenance Task
Create a Maintenance Task using following steps (for each database).Follow the given steps to create a task in the task scheduler.
1. Click Start, point to Administrative Tools, and click Task Scheduler.
2. Right-click Task Scheduler (Local) and click Create Task.
3. Click the Actions tab and click New.
4. On the New Action page, specify:
i. Action as Start a Program.
ii. For the Program/script, click Browse and navigate to the RBS Maintainer application; by default, the location is %programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer \Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.
iii. In the Add Arguments (optional) field, enter the following parameter string: (change the name of the connection string as specified in the config file earlier)
-ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120
iv. Click OK
Note: XML file is case sensitive, you need to use the exact string for the connection string above.
5. On the Triggers tab, click New.
6. In the New task dialog box, set:
i. Begin the task to on a schedule.
ii. The trigger schedule to be Weekly, Sunday, at 2am (or at another time when system usage is low.)
iii. Click OK.
7. On the General tab, enter a name for the task, such as “<Database Name> RBS Maintainer”, where <Database Name> identifies the database associated with the task. In the Security settings section:
8. Make sure that the account under which the task is to be run has sufficient permissions to the database.
9. Select the option to run whether user is logged on or not.
10. Click OK.
RBS Maintainer will be running for this schedules and maintain consistency between the physical store and metadata in the content database.
Backups and RBS.
While taking back up, first to do the sql back up then the blob location and while restore, first the blob then the sql. This ensures that data will be consistent while backup and restore.
The following steps are recommended by Microsoft for uninstalling the RBS and brought back the external blob black in content in database.
a) Set the active RBS provider to none: $rbs.SetActiveProviderName(“”)
b) Migrate all the data from blob storage to WSS inline: $rbs.Migrate()
c) Disable RBS: $rbs.Disable()
d) Remove all the blobs from blob storage by running RBS Maintainer. Before running Maintainer, do the following on the ContentDB:
e) exec mssqlrbs.rbs_sp_set_config_value 'garbage_collection_time_window','time 00:00:00'
f) exec mssqlrbs.rbs_sp_set_config_value 'delete_scan_period','time 00:00:00'
g) Uninstall RBS (database component) by running the following store procedure: exec mssqlrbs.rbs_sp_uninstall_rbs 0
This option will work fine if the file sizes in the RBS abide to the capacity limitation mentioned for the content database by Microsoft.
Remote BLOB Storage is an optimum solution provided by Microsoft to reduce the SQL server inflation with respect to content database in SharePoint environment. RBS make SharePoint more productive and substantial to the business needs. The clear abstraction of the data and metadata provided by RBS implementation and the possibility of SharePoint to use large data stores along with its popular capability of document management and work flow association, will leverage SharePoint to replace legacy document management systems with SharePoint Libraries.