Sunday, March 29, 2015

Difference between SharePoint 2010 Search and SharePoint 2013 Search

Recently, I got an opportunity to configure Search in SharePoint 2013. Before starting the configuration, I decided to compare search in SharePoint 2010 and SharePoint 2013.

Below are some of the differences found during the analysis.

SharePoint 2010
SharePoint 2013
The majority of search runs under the MSSearch.exe process.
Only the crawl component runs under the MSSearch.exe process. The rest of the search components run as noderunner.exe processes managed by the SharePoint Search Host Controller Service (hostcontrollerservice.exe).
During a crawl, items are indexed in memory and then streamed (propagated) to a query server, where they are added to the index.
During a crawl, crawled items are sent to the Content Processing Component for further processing.
Crawl freshness is poor compared to 2013 search. For example, in SharePoint 2010 it is easy to get into a state where a very long incremental crawl overlaps the next scheduled incremental crawl. This usually occurs when a large number of security changes are being processed.
SharePoint 2013 improves crawl freshness considerably through continuous crawls and cleanup crawls.
In SharePoint 2010 search, the crawl component is ultimately responsible for extracting metadata and links and for property mapping.
In SharePoint 2013 search, the Content Processing Component performs the document parsing, metadata extraction, links and property mapping.
In SharePoint 2010, the web analytics service application is responsible for analytics processing.
All analytics processing is performed by the Analytics Processing Component within Search Service Application.
Whenever a document is changed or updated, the entire document is re-indexed and propagated to the query server.
The index is more efficient in SharePoint 2013 search because it is broken into update groups. A single crawled document can be indexed across several update groups, and each update group contains a unique portion of the index. This allows partial updates: if I make a change to a document, only that change is updated in the index of the associated update group instead of the entire document.
Query processing runs under MSSearch.exe, retrieves properties from the Property Store database, and invokes the Search Administration database for security trimming.
The Query Processing Component now runs under noderunner.exe and no longer retrieves properties from the property store or invokes the Search Administration database for security trimming when processing a query. It only fetches results from the index components, which simplifies query processing considerably.
In SharePoint 2010 Search, each crawl component is mapped to a single crawl database.
This fixed relationship between crawl component and crawl database does not exist in SharePoint 2013 Search, so a crawl component automatically communicates with all crawl databases if there is more than one.
To gain fault tolerance in SharePoint 2010 Search, two crawl components (one mirrored) would need to be provisioned per crawl database.
Since there is no unique relationship between crawl component and crawl database, SharePoint 2013 Search automatically gains fault tolerance by simply provisioning a new crawl component.
Installation: Different installation for SharePoint 2010 and FAST Search.
Installation: Single installation in SharePoint 2013.
Crawling Components: Multiple crawl components could run on a single server, but each component was associated with only one crawl database.
Crawling Components: No unique relationship is maintained between crawl components and crawl databases. All crawl components can talk to all crawl databases.
Host Distribution: In SP2010, you could pick certain crawl components to crawl a particular URL.
Host Distribution: Hosts can be distributed across multiple crawl databases. Distribution is now based on content database ID rather than host URL.
Content Processing Customization: In SP 2010, we had pipeline extensibility, which let you write custom code to transform crawled content before it was added to the index.
Content Processing Customization: In SP 2013, the pipeline extensibility model is gone. It has been replaced with a new web service callout feature, which is invoked via triggers that you create based on the values of different managed properties.
Security Information: The Search Admin DB stores the security information (ACL information).
Security Information: The Search Admin DB does NOT store any security information (ACL information). It is stored with the index itself.
Built in BDC Connector: No Documentum Connector OR Term Store Connector.
Built in BDC Connector: New built-in Documentum Connector and Term Store Connector are available in SP 2013.
HTTP Site Crawling: Anonymous crawling was not supported in SP 2010 for HTTP sites.
HTTP Site Crawling: Anonymous crawling is available in SP 2013 for HTTP sites.
Crawling - Continuous Crawl: This feature doesn’t exist in SP 2010 Search.
Crawling - Continuous Crawl: The continuous crawl feature is available in SP 2013 Search, for SharePoint sources only. The default interval is 15 minutes and can be changed through PowerShell commands.
Indexing - Remote SharePoint Source: This functionality is not available in SharePoint 2010 Search.
Indexing - Remote SharePoint Source: SP 2013 Search adds a new feature to index “Remote SharePoint Sources”, with which we can crawl a remote SharePoint farm in a different geographic location using an “OAuth trust”. This removes the issues of “Kerberos” authentication between the farms.
Search Sources: In SP 2010 Search, scopes can only be created by Search Service Admin.
Search Sources: In SP 2013 Search, Site and Site Collection Admin can configure the result sources for a site collection.
Data Source: Exchange is not a data source
Data Source: Exchange is a data source for a Result Source.
Query Transformation for a Result Source: Not Applicable
Query Transformation for a Result Source: Query transformation can be applied for a Result Source.
Indexing PDF Support: An advanced filter pack/iFilter is used in SP 2010.
Indexing PDF Support: PDF indexing is an OOB feature of SP 2013 Search.
Format Handlers: Not Applicable
Format Handlers: New parsing feature in SP 2013 Search.
-          Automatic file format detection: no longer relies on file extension.
-          Deep link extraction for Word and PowerPoint formats.
-          Visual metadata extraction: titles, authors and dates.
-          High-performance format handlers for HTML, DOCX, PPTX, TXT, Image, XML and PDF formats.
New Filters for Visio and OneNote: Not Applicable
New Filters for Visio and OneNote: Available in SP 2013 Search.
Schema Management: Only the Search Service Application Admin can create “Crawled” and “Managed” properties, and they are available to the entire farm.
Schema Management: Limited schema management is delegated down to Site Collection Admins. A Site Collection Admin can create custom crawled and managed properties that are used only in their site collection. The managed properties that can be created are limited: they cannot be refinable or sortable, and only the text data type is supported.
Schema Management: In SharePoint 2010 Search, you had to do one FULL crawl of all your content to create a crawled property. You could then create a managed property and map it to the crawled property. But you had to do a second FULL crawl of all your content to populate it with data.
Schema Management: That is no longer required in SharePoint 2013 Search. To begin with, when you create a site column, crawled properties are created from it automatically right then and there, with no FULL crawl required. Secondly, after you create a managed property and map it to your crawled property, you NO longer need to crawl the entire farm. Instead, you can go into the list or library and use the new option to Re-Index the list. Alternatively, you can go to the site level and choose to Re-Index the site.
Query Spelling Corrections: Default spelling dictionaries and Query Spelling Inclusions list.
Query Spelling Corrections: Customizations to Query Spelling Correction, both inclusions and exclusions, are now managed in the term store. In addition, we still have a dynamic dictionary that is based on content in the index itself, or you can still choose to go with a static dictionary.
Query Rules: SharePoint 2010 query is simple i.e. one query has one set of results.
Query Rules: Query rules allow you to have search requests from a user trigger multiple queries and multiple results sets.
Display Templates: Not Applicable
Display Templates: A display template is applied to the result based on the result type that it matches. You can define your own Custom Display Type.
Thumbnail Preview: Thumbnail preview was available for Office documents, showing the first page in Word and 3 slides in PowerPoint.
Thumbnail Preview: With the new web app engine, you can browse through the entire document in the preview.
-          See all pages, animations, zoom in/out, and scroll through the entire document.
-          The point of this is to let users find the exact item they are looking for right in the search results, with no more clicking a result, hitting the back button, and repeating until they find the right one.
Previews only work with claims authentication; they do not work with classic Windows authentication.
Search Portability: Not Applicable
Search Portability: Search portability supports transferring the following items
-          Result Sources
-          Query Rules
-          Result Types
-          Schema
-          Custom Ranking Models
Transfers can happen between a tenant, site collection or site using the export/import search configuration option of CSOM (Client Side Object Model).
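To make the update group idea from the comparison above more concrete, here is a minimal Python sketch. It is only a toy model, not how SharePoint stores its index: the property names and the group names ("default", "security", "usage") are assumptions made up for illustration. It shows that a change to one property only touches the update group that owns that property.

```python
# Hypothetical assignment of document properties to update groups.
# These names are illustrative only, not SharePoint's real schema.
PROPERTY_TO_GROUP = {
    "title": "default",
    "body": "default",
    "acl": "security",
    "view_count": "usage",
}


def apply_update(index, doc_id, changed_properties):
    """Write only the changed properties and return the update groups touched."""
    touched = set()
    for name, value in changed_properties.items():
        group = PROPERTY_TO_GROUP[name]
        # Each update group keeps its own portion of the document.
        index.setdefault(group, {}).setdefault(doc_id, {})[name] = value
        touched.add(group)
    return sorted(touched)


index = {}

# First indexing of the document: every group receives its portion.
first = apply_update(
    index,
    "doc1",
    {"title": "Plan", "body": "Q1 plan", "acl": "team-a", "view_count": 0},
)

# Later, only the permissions change: a partial update of one group.
second = apply_update(index, "doc1", {"acl": "team-b"})

print(first)   # groups written on the initial index
print(second)  # groups written on the partial update
```

In this sketch the second call rewrites only the "security" portion, while the title and body stored in the "default" group are left untouched.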

I hope this helps. Please let me know if you find more differences, technical or functional.

Have a nice day!!!!

Thursday, March 12, 2015

Crawled Properties and Managed Properties in SharePoint

The two terms “Crawled Properties” and “Managed Properties” are important to understand for a better knowledge of SharePoint search. These two concepts are core to SharePoint search.

Crawled Properties

By default, when you create a SharePoint column on a list or library, it generates a crawled property marked with “Include in Full-Text Index”. The crawled property is named ows_internalColumnName. You do not have any control over the creation of crawled properties.

A crawled property by itself is of no use when you try to run or build search queries, or even to display its value in search results. Crawled properties are discovered in two ways: during crawling of the content and during content processing.
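As a small illustration of the naming convention, the Python sketch below builds the crawled property name from a column's internal name. The helper `internal_name_from_display` is a hypothetical simplification: it only handles the common case where a space in the display name is encoded as `_x0020_` in the internal name, and ignores other encoding and length rules.

```python
def internal_name_from_display(display_name):
    """Simplified assumption: only spaces are encoded, as _x0020_."""
    return display_name.replace(" ", "_x0020_")


def crawled_property_name(internal_column_name):
    """Apply the ows_ prefix convention to an internal column name."""
    return "ows_" + internal_column_name


print(crawled_property_name("Department"))
print(crawled_property_name(internal_name_from_display("Customer Name")))
```

Under these assumptions, a column named Department is exposed as ows_Department.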

Crawled Property found during Crawling Time

When you crawl content from a SharePoint library, each column in the library is metadata associated with the corresponding document, so each column is exposed as a crawled property. In the image below, each column will be exposed as a crawled property, including the custom column “Department”.

Crawled Property found during Content Processing

The second way a crawled property can be found is during content processing. The best example is the metadata properties defined for Office documents, as given below.

Each one of the above properties is extracted at content processing time and exposed as a crawled property.

Managed Properties

Managed properties are basically groups of one or more crawled properties. When site columns are crawled, SharePoint 2013 automatically creates crawled properties, creates managed properties, and adds a mapping between them. Regular list columns do not get this automatic managed property. Say a user has created columns such as “Customer Name” and “Client”. For the organization these two columns represent the same information, but not for search. For search, they are just crawled properties, and they are different because they do not share the same name. On top of this, since they are just crawled properties, if someone searches for all documents where Client = XYZ, they will find nothing at all, because no search-related feature works with crawled properties themselves. To make it work, you need to create a managed property and associate the respective crawled properties with it. Managed properties appear in search results and users can run queries against them. Crawled properties can’t be used for this.

Mapping between Crawled and Managed Properties

We all know that SharePoint can crawl data from various systems and sources. The data in these source systems has metadata fields that can have different names but refer to the same information. For example, author information may be stored in different systems as Author, Writer, Created By, Owner, etc. All of these fields represent the same information: who created the file or document.

When SharePoint crawls these systems, these fields are exposed as crawled properties. We can create a single managed property called “Author” in SharePoint and associate all of these crawled properties with it, as given below.

If you look at a managed property, it has a setting called “Searchable”. If a crawled property is mapped to a managed property marked as searchable, that column is searchable. However, if the managed property is NOT marked as searchable, the column will NOT be searchable, even though the crawled property is marked with “Include in Full-Text Index”. Please refer to the table below for clarification.
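The mapping idea can be sketched in a few lines of Python. This is a toy model, not the SharePoint search schema API: the item data and the function names are made up for illustration. It shows how one managed property ("Author") can answer a query across items whose source systems used different field names.

```python
# One managed property mapped to several crawled properties.
MANAGED_PROPERTY_MAPPINGS = {
    "Author": ["Author", "Writer", "Created By", "Owner"],
}


def managed_value(item, managed_property):
    """Return the first non-empty mapped crawled property value for an item."""
    for crawled_property in MANAGED_PROPERTY_MAPPINGS[managed_property]:
        value = item.get(crawled_property)
        if value:
            return value
    return None


def search(items, managed_property, wanted):
    """Return ids of items whose managed property equals the wanted value."""
    return [
        item["id"]
        for item in items
        if managed_value(item, managed_property) == wanted
    ]


# Hypothetical items crawled from three different source systems.
items = [
    {"id": "doc1", "Writer": "Asha"},
    {"id": "doc2", "Created By": "Ravi"},
    {"id": "doc3", "Owner": "Asha"},
]

print(search(items, "Author", "Asha"))
```

Here a single query on Author finds doc1 and doc3, even though neither item has a field literally named Author.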

Crawled Properties: Included in Full-Text Index / Not Included in Full-Text Index
Managed Properties: Not Searchable / Column Searchable
If a crawled property is mapped to two managed properties and one of them is set as searchable, then the value will be searchable.

Happy reading, and please provide your feedback.
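The searchability rule described in this post can be written as a one-line check. The Python sketch below models only the rule as stated here (a value is searchable when at least one mapped managed property is marked searchable); the dictionary layout and property names are assumptions for illustration.

```python
def is_value_searchable(mapped_managed_properties):
    """True if at least one mapped managed property is marked searchable.

    Each entry is a dict with a boolean "searchable" flag (hypothetical layout).
    With no mapped managed property the result is False, following the post's
    statement that crawled properties alone cannot be queried.
    """
    return any(mp["searchable"] for mp in mapped_managed_properties)


two_mappings = [
    {"name": "ClientName", "searchable": False},
    {"name": "Customer", "searchable": True},
]
no_searchable_mapping = [{"name": "ClientName", "searchable": False}]

print(is_value_searchable(two_mappings))
print(is_value_searchable(no_searchable_mapping))
```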

Tuesday, March 3, 2015

Office Web App cannot open Office Documents in office client


Recently, one of my users reported an issue: he was not able to open an Office document in the Office client from Office Web App. The user was presented with the error below while trying to open the document from Office Web App.

"To open this document (or presentation or spreadsheet), your computer must be running a version of Microsoft <Office Application> and a browser that supports opening files directly from the Office Online."

Probable Cause

It took me some time to find the root cause. I validated the user's browser and Office versions, and they were fine: the user was using IE 9 32-bit and Office 2010 32-bit.

During the analysis I learned that the user had also installed Microsoft Visio 2013 for a business need. This means multiple versions of Office products were installed on the same laptop, which caused the "SharePoint OpenDocuments Class" add-on (OWSSUPP.dll) to be registered with a version starting with 15. In other words, the DLL matching Office 2010 was no longer the one registered to open documents from the web app. This add-on is for IE and is the integration add-on for opening documents from SharePoint.

To check the version of the SharePoint OpenDocuments Class add-on (OWSSUPP.DLL), open Internet Explorer while the error is showing and go to Tools > Manage Add-ons. In Manage Add-ons, click SharePoint OpenDocuments Class and note the version:

12.0.. stands for Office 2007

14.0.. stands for Office 2010

15.0.. stands for Office 2013
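The version check above can be expressed as a small lookup. This Python sketch only restates the mapping listed here; the version strings passed in are placeholders, not values read from a real machine.

```python
# Major version of OWSSUPP.DLL mapped to the Office release, as listed above.
OFFICE_RELEASE_BY_MAJOR_VERSION = {
    12: "Office 2007",
    14: "Office 2010",
    15: "Office 2013",
}


def office_release(addon_version):
    """Return the Office release for an add-on version string such as '15.0.0.0'."""
    major = int(addon_version.split(".")[0])
    return OFFICE_RELEASE_BY_MAJOR_VERSION.get(major, "unknown")


# Placeholder version strings for illustration.
print(office_release("15.0.0.0"))
print(office_release("14.0.0.0"))
```

In the case described in this post, the add-on reported a 15.x version on a machine where Office 2010 (14.x) was expected.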


To resolve this issue, you must disable SharePoint support in every other version of Office installed on the machine and then repair Office 2010.

Follow the steps below:

1) Go to Control Panel, Programs, Uninstall a Program.

2) For each Office program that is not an Office 2010 program, select the item, click Change, choose Add or Remove Features, and click Continue.

3) In Installation Options, expand Office Tools, click Windows SharePoint Services Support, choose Not Available, and click Continue. You must do this for all Office family programs that are not Office 2010, including other versions of Office Project or Visio.

4) Once configuration is complete, in Programs, click Office 2010, click Change, choose Repair, and click Continue.

Note: A reboot will be required after the repair completes.