Creating a SharePoint 2013 Index Component – PowerShell Bug

While setting up SharePoint 2013 Search with David at Trek Bicycles, we encountered a bug with the New-SPEnterpriseSearchIndexComponent PowerShell cmdlet:

When using the -RootDirectory parameter to specify the index location (for example, C:\SPSearch), you must create the folders ahead of time. Even after creating the folders on the servers that will host the index components, you will still receive an error message beginning “New-SPEnterpriseSearchIndexComponent : Cannot bind parameter ‘RootDirectory’”.

The cmdlet checks for the folder not only on the servers where you are creating the index component, but also on the server where you run the PowerShell script (in this case the app server, which has no index components at this point). After creating the folders on the app server, the cmdlet creates the index partitions successfully, even though nothing is (or should be) written to that folder on the app server.
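
To make the workaround concrete, here is a small Python sketch that lists every server where the folder has to exist before running the cmdlet: the index servers plus the box the script runs on. The server names are made up, and reaching the folder through the C$ administrative share is my assumption for illustration, not something the cmdlet requires.

```python
from pathlib import PureWindowsPath


def servers_needing_folder(index_servers, script_server):
    """Index servers plus the server the script runs on, without duplicates."""
    needed = list(index_servers)
    if script_server not in needed:
        needed.append(script_server)
    return needed


def admin_share_path(server, local_path):
    """Turn a local path like C:\\SPSearch into \\\\server\\C$\\SPSearch.

    Assumes the default administrative share for the drive is available.
    """
    p = PureWindowsPath(local_path)
    drive_letter = p.drive.rstrip(":")
    rest = "\\".join(p.parts[1:])
    return "\\\\{}\\{}$\\{}".format(server, drive_letter, rest)


if __name__ == "__main__":
    # Hypothetical farm: two index servers, script run from the app server.
    for server in servers_needing_folder(["IDX01", "IDX02"], "APP01"):
        print(admin_share_path(server, r"C:\SPSearch"))
```

The point is only that the app server ends up in the list even though it hosts no index component.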

This is described at the bottom of this blog post:

Using Fiddler2 as a proxy for SharePoint Search

I was looking for a way to troubleshoot search crawl errors a little bit better and figured why not use a tool I already use daily…Fiddler! Here is a quick guide on how to use Fiddler2 as a search proxy so you can see what is happening when the user agent is crawling the content.

  1. Log in to the appropriate SharePoint 2010 application server where Fiddler is installed
  2. Run Fiddler2 as the SharePoint 2010 default content access account
  3. Minimize the Fiddler2 window and browse to http://localhost:8888
  4. You should see the Fiddler Echo Service page
  5. Log in to Central Administration
  6. Go to Application Management > Manage Service Applications > Search Service Application
  7. Click None next to Proxy Server, select “Use the proxy server specified”, and type http://localhost for Address and 8888 for Port
  8. Click OK
  9. Now run a full crawl on the content source in question. The crawl traffic will appear in the Fiddler2 window on the server.
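
Before pointing the Search Service Application at the proxy, it can save some head scratching to confirm that something is actually listening on the Fiddler port. This is a minimal Python sketch of that check; the function name is mine, and it only tests that the port accepts a TCP connection, not that Fiddler specifically is behind it.

```python
import socket


def proxy_is_listening(host="localhost", port=8888, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    if proxy_is_listening():
        print("Something is listening on localhost:8888")
    else:
        print("Nothing is listening on localhost:8888; is Fiddler2 running?")
```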

Now watch for any errors…especially those pesky red 500 errors!

SharePoint 2013 Metadata Extraction – Redefining how we should style our documents

I’ve been noticing some odd results coming back from search in SharePoint 2013, so I decided to do some research. These odd results seem to come from a new search feature in 2013 called metadata extraction. SharePoint 2013 search tries to determine the document title from the styling in the document, usually the first text with the Heading 1 (H1) style. Because of this, some documents were showing up with the title “Table of Contents”, or “Insert Title Here” if the document was a template. I ran some tests to determine what takes precedence in search results. For this testing I created blank Word documents, Excel spreadsheets, and PowerPoint presentations and ran through the scenarios listed in the table below.

Scenario                                                   | Word (doc/docx)       | Excel (xls/xlsx)    | PowerPoint (ppt/pptx)
No title defined in SharePoint/File Properties (H1)        | Displayed header text | Displayed file name | Displayed title slide text
Title defined in both SharePoint/File Properties (H1)      | Displayed header text | Displayed title     | Displayed title slide text
No title defined in SharePoint/File Properties (No H1)     | Displayed file name   | Displayed file name | Displayed file name
Title defined in both SharePoint/File Properties (No H1)   | Displayed title       | Displayed title     | Displayed title

Note: The SharePoint title property and the file title property are directly linked; updating one updates the other.

Here’s a little bit more about each test document and how it returned in search results:

1) Word (one doc/one docx file) – DOCX didn’t extract text from the document, but DOC did

doc and docx - metadata extraction

  1. H1: Heading (H1) with sample text (normal) underneath
  2. No H1: Sample Text only

2) PowerPoint (one ppt/one pptx file) – PPTX didn’t extract text from the slides, but PPT did

ppt and pptx - metadata extraction

  1. H1: Title Slide with one content slide. A content slide with a title behaved the same
  2. No H1: One Content Slide (No title at top and text in text box underneath)

3) Excel (one xls/one xlsx file) – Both XLS and XLSX extracted text from the spreadsheet, but XLS extracted the title of the sheet as well

xls and xlsx - metadata extraction

  1. H1: Text with a Heading style applied and text in the column underneath
  2. No H1: Plain text in top 2 cells

Summary: Here is the order of precedence for the 3 types of documents tested:

  1. Word: H1 (Style) > Title > File Name
  2. Excel: Title > File Name
  3. PowerPoint: Title slide text > Title > File Name
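
The precedence list above can be written down as a small lookup. This Python sketch only encodes what I observed in these tests; the field names and the function are my own for illustration and have nothing to do with how SharePoint implements metadata extraction internally.

```python
# Order of precedence observed in the tests, highest first.
PRECEDENCE = {
    "word": ("heading", "title", "file_name"),
    "excel": ("title", "file_name"),
    "powerpoint": ("title_slide_text", "title", "file_name"),
}


def displayed_title(doc_type, file_name, title=None, heading=None,
                    title_slide_text=None):
    """Return the text search would be expected to show as the title."""
    candidates = {
        "heading": heading,
        "title": title,
        "title_slide_text": title_slide_text,
        "file_name": file_name,
    }
    for field in PRECEDENCE[doc_type]:
        if candidates[field]:
            return candidates[field]
    return file_name


if __name__ == "__main__":
    # A Word template whose first Heading 1 is a placeholder wins over the title.
    print(displayed_title("word", "budget.docx", title="FY Budget",
                          heading="Insert Title Here"))
```

This is why a template with a placeholder Heading 1 shows up under the placeholder text even when a proper title is set.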

SharePoint 2013 Continuous Crawl

As I learn all about SharePoint 2013 Administration this week at SharePoint911 – SharePoint 2013 Administration Training, one thing stands out above the rest: search. Search could really be a week-long class in itself, but we did dedicate 4 hours to it today. In 2013 Search has undergone an extreme makeover: SharePoint Edition. One of the new features that I had heard about, but didn’t really understand until today, is the new concept of a continuous crawl. Is it really continuous?

In SharePoint 2010 we had 2 options: Incremental or Full. You could schedule an incremental crawl to run every 15 minutes, but if a crawl ran past the 15-minute mark, the next one would not start until the next 15-minute block. For example, let’s say we schedule 15-minute incremental search crawls on a 2010 farm. The first crawl completes in 15 minutes, but the second crawl takes 20 minutes. SharePoint 2010 would then wait 10 minutes before kicking off another incremental crawl.

The OLD Way:

15 minutes | 20 minutes | 10 minutes (no crawl) | 15 minutes

The NEW Way:

15 minutes | 20 minutes | 15 minutes | 15 minutes
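
To check the arithmetic in the two timelines, here is a rough Python simulation. It models the old way as "start only on a 15-minute boundary" and the new way as "start as soon as the previous crawl finishes", which is how the timelines above are drawn. It is a sketch of this post's description, not of the crawler's real scheduling logic, and the function names are mine.

```python
def old_start_times(durations, interval=15):
    """Scheduled incremental crawls: each start waits for the next boundary."""
    starts, t = [], 0
    for d in durations:
        starts.append(t)
        end = t + d
        # Round the end time up to the next multiple of the interval.
        t = -(-end // interval) * interval
    return starts


def new_start_times(durations):
    """Back-to-back crawls as drawn in the NEW timeline: no waiting."""
    starts, t = [], 0
    for d in durations:
        starts.append(t)
        t += d
    return starts


def idle_gaps(starts, durations):
    """Minutes with no crawl running between consecutive crawls."""
    return [starts[i + 1] - (starts[i] + durations[i])
            for i in range(len(starts) - 1)]


if __name__ == "__main__":
    durations = [15, 20, 15]  # minutes, matching the example in the post
    old = old_start_times(durations)
    new = new_start_times(durations)
    print("old starts:", old, "idle:", idle_gaps(old, durations))
    print("new starts:", new, "idle:", idle_gaps(new, durations))
```

With the 15/20/15 example, the old schedule leaves a 10-minute gap after the 20-minute crawl, while the new one has no gap.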

This is really just a continuous incremental crawl, but names don’t always match up in 2013 (SkyDrive vs. SkyDrive Pro, or Apps as in lists and document libraries vs. Cloud Apps as in development). Some other fun search facts for 2013: Search Best Bets and Enterprise Keywords are dead (replaced by query rules), instant search (pre-query suggestions) is awesome, search scopes are gone (replaced by result sources), and most admin-level search work is now done in PowerShell, while pretty much everything else is done at the site collection level. SharePoint 2013 is the real deal and definitely is one search to rule them all in the Enterprise…long live Google Search appliances.

SharePoint Invalid File Name Workaround..ish

I’m sure anyone dealing with SharePoint end user support has run into this error: “The file name you specified is not valid or too long. Specify a different file name.”

Invalid Filename

After some searching we noticed that this is a limitation with all versions of SharePoint! Usually renaming one or two files isn’t too big of a problem (and it never seems to be a problem if they want to get around blocked file types!), but in this specific case we needed to upload thousands of documents into SharePoint. I did some more searching around online and found a few different options: PowerShell (love it, but training someone who has never used it would be a fun day), creating an event receiver to change the file name at the time of upload (not enough time in this case), or finding an application capable of renaming files in bulk. I found this little gem on SourceForge:

SharePoint File Fix is capable of renaming files in bulk (and even does recursive renames) to make them compatible with SharePoint. It is a lightweight HTML application that doesn’t require an install, and it is very easy to train a user on.
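
For anyone curious what such a rename involves, here is a minimal Python sketch of the idea. This is not how SharePoint File Fix works internally; the list of invalid characters, the rules about periods, and the 128-character limit are my recollection of the SharePoint 2010/2013 restrictions, so verify them against the documentation for your version before relying on this.

```python
import re

# Assumed restrictions (not taken from the tool): verify for your version.
INVALID_CHARS = '~"#%&*:<>?/\\{|}'
MAX_NAME_LENGTH = 128


def sharepoint_safe_name(name, replacement="_"):
    """Return a file name that avoids the assumed SharePoint restrictions."""
    cleaned = "".join(replacement if ch in INVALID_CHARS else ch
                      for ch in name)
    cleaned = re.sub(r"\.{2,}", ".", cleaned)  # collapse consecutive periods
    cleaned = cleaned.strip(". ")              # no leading/trailing period or space
    if len(cleaned) > MAX_NAME_LENGTH:
        stem, dot, ext = cleaned.rpartition(".")
        if dot:
            # Shorten the stem and keep the extension.
            cleaned = stem[:MAX_NAME_LENGTH - len(ext) - 1] + "." + ext
        else:
            cleaned = cleaned[:MAX_NAME_LENGTH]
    return cleaned


if __name__ == "__main__":
    print(sharepoint_safe_name("Q3 report #2 & notes.docx"))
```

A bulk version would walk a folder tree and rename each file to the result, taking care not to overwrite files that collapse to the same name.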

SharePoint File Fix

Export the User Information List of a site collection to Excel

Obtain the “ListID” and “ViewID” for the “User Information List”.
a) Go to the “People and Groups” page of the site and navigate to List Settings (Settings > List Settings) via the toolbar.
b) Choose the view from the list of views in the bottom portion of the page.
c) In the address bar of the page, you will find the ListID and ViewID.
For example: http://sitename/_layouts/ViewEdit.aspx?List=%7B83959520%2DE3F8%2D4696%2D802B%2D22B547F097F4%7D&View=%7B8E86F4BF%2D1B65%2D4AC5%2DA507%2D87236E3F8D05%7D
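
The IDs in that address are URL-encoded (%7B is "{", %2D is "-", %7D is "}"). If reading them by eye is error-prone, a few lines of Python can pull them out. This is just a convenience sketch; the function name is mine.

```python
from urllib.parse import urlsplit, parse_qs


def list_and_view_ids(view_edit_url):
    """Return the decoded (ListID, ViewID) pair from a ViewEdit.aspx URL."""
    query = parse_qs(urlsplit(view_edit_url).query)
    return query["List"][0], query["View"][0]


if __name__ == "__main__":
    url = ("http://sitename/_layouts/ViewEdit.aspx"
           "?List=%7B83959520%2DE3F8%2D4696%2D802B%2D22B547F097F4%7D"
           "&View=%7B8E86F4BF%2D1B65%2D4AC5%2DA507%2D87236E3F8D05%7D")
    list_id, view_id = list_and_view_ids(url)
    print("ListID:", list_id)
    print("ViewID:", view_id)
```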


Replace the viewID and ListID we obtained from previous step in the url