Searching for Bob

Search, Information, Architecture and Technology

Month: November 2018

Enterprise Metadata Tagging – The Setup

November 27, 2018 · January 19, 2019 · Bob Bachand · 3 Comments

This is the second of four blog posts about how we implemented a complex metadata mapping strategy within our enterprise. I wrote these posts to help others use BA Insight metadata tagging products with SharePoint and SQL content. There is little material online that walks through real-world scenarios showing how these tools work together to produce a working solution.

Series of posts describing the Enterprise Metadata Tagging project:

  • Enterprise Metadata Tagging – The Requirements
  • Enterprise Metadata Tagging – The Setup [This one]
  • Enterprise Metadata Tagging – The Implementation
  • Enterprise Metadata Tagging – The Demo

This post concentrates on the content and database sources needed to create this prototype. We will have three SharePoint content sources and one SQL content source. Taxonomy will be stored in two places: SharePoint MMS and a SQL database. We will also need to add some new managed properties and map them to the crawled properties of the content. Let's walk through this prototype data setup.

Taxonomy and Content Setup View

Taxonomy process setup

Tool installation

We assume that the following are installed and functioning properly on the prototype server:

  • Microsoft SharePoint 2013 – could also be the 2016 version
  • Microsoft SQL Server 2008 R2 – could also be a later version
  • BA Insight Connector Framework
  • BA Insight SQL Connector
  • BA Insight Smart Pipeline
  • BA Insight AutoClassifier

SharePoint MMS Taxonomy Setup

Static values

Taxonomy values which do not change often or are easily changed when needed. These are mostly business rules or complex tagging schemes.

  • Application – New term set used to tag the listed applications to the enterprise application Id using the content source property.
  • Industry – New term set used to tag documents within a specific path to an industry Id

Content source taxonomies

  • Prototype Industry Term Store – Used to store Enterprise, US and UK term sets consumed by the Content, US and UK content sources (SharePoint sites)

SQL Database Taxonomy Setup

Dynamic values

Taxonomy values which change often, are not easily updated, or need data deployments to update. These are mostly large and/or complex taxonomies which also include the mapping from application to enterprise taxonomy.

These tables capture the application-to-enterprise tagging which is key to this metadata strategy. They can be updated as often as necessary to keep the metadata tagging approach consistent. Generally, the mapping is from App GUID, TermId and Text to enterprise Term Ids.

  • US Application – SharePoint content source using an application taxonomy

Since the US application is a SharePoint content source, the GUID (AppGUID) represents the MMS value to be matched in the application industry managed property. The system tags each document by mapping AppGUID to the enterprise industry term Id (EntTermId).

  • UK Application – SharePoint content source using an application taxonomy

Since the UK application is a SharePoint content source, the GUID (AppGUID) represents the MMS value to be matched in the application industry managed property. The system tags each document by mapping AppGUID to the enterprise industry term Id (EntTermId).

  • Employee Application – For all non-SharePoint content sources which use an application taxonomy

Since the Employee application does not use SharePoint MMS, the application Term Id (AppTermId) and Text (AppText) represent the values to be matched in the application industry managed property. The system then tags each document with the enterprise industry term Id (EntTermId) from this relationship.
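To make the mapping tables more concrete, here is a minimal sketch in Python using an in-memory SQLite database. The column names AppGUID, AppTermId, AppText and EntTermId come from the tables described above; the table names, column types, sample rows and function names are assumptions made for illustration and are not the actual schema or how the BA Insight components perform the lookup.

```python
import sqlite3

# Assumed schema: only the column names come from the post. Table names,
# types and the sample rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE USApplication (
    AppGUID   TEXT PRIMARY KEY,   -- MMS term GUID, stored lower-case
    EntTermId INTEGER NOT NULL    -- enterprise industry term Id
);
CREATE TABLE EmployeeApplication (
    AppTermId INTEGER,            -- application term Id (may be NULL)
    AppText   TEXT,               -- application term text (may be NULL)
    EntTermId INTEGER NOT NULL
);
INSERT INTO USApplication VALUES ('11111111-aaaa-4bbb-8ccc-000000000001', 1001);
INSERT INTO EmployeeApplication VALUES (42, 'Banking', 1001);
INSERT INTO EmployeeApplication VALUES (43, 'Insurance', 1002);
""")


def ent_term_for_guid(app_guid):
    """Map an MMS GUID (SharePoint sources) to an enterprise term Id."""
    row = conn.execute(
        "SELECT EntTermId FROM USApplication WHERE AppGUID = ?",
        (app_guid.lower(),),
    ).fetchone()
    return row[0] if row else None


def ent_term_for_app_term(app_term_id):
    """Map an application term Id (non-SharePoint sources) to an enterprise term Id."""
    row = conn.execute(
        "SELECT EntTermId FROM EmployeeApplication WHERE AppTermId = ?",
        (app_term_id,),
    ).fetchone()
    return row[0] if row else None


def ent_term_for_app_text(app_text):
    """Map application term text to an enterprise term Id, ignoring case."""
    row = conn.execute(
        "SELECT EntTermId FROM EmployeeApplication "
        "WHERE AppText = ? COLLATE NOCASE",
        (app_text,),
    ).fetchone()
    return row[0] if row else None
```

In this sketch, a document whose application industry property holds the GUID above would be tagged with enterprise term Id 1001, and an unknown GUID returns None so the document is left untagged.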

SharePoint Schema Setup

The following managed properties were added to the schema:

Enterprise taxonomy

  • EntApplicationId – Used to capture application tagging
  • EntIndustryId – Used to capture Industry tagging

Application taxonomy

  • AppBusinessId – Used to capture the TermId values from the SQL content source
  • AppBusinessText – Used to capture the text values from the SQL content source
  • AppBusinessMMS – Used to capture the MMS values from the SharePoint content sources

SharePoint Content Sources Setup

US Application – SharePoint MMS tagged content source

  • Documents in root document repository
  • Example of tagging with MMS US Industry term set

UK Application – SharePoint MMS taxonomy tagged content source [multiple tags]

  • Documents in root document repository
  • Example of tagging with MMS UK Industry term set

Content Application – SharePoint content source that is not taxonomy tagged. It is used to test tagging with static path values.

  • Folders – All documents are in these folders
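Tagging by static path values can be sketched roughly as follows. This is only an illustration of the idea in Python, not how AutoClassifier is configured: the folder names, the industry Ids, the example URL and the function name are all hypothetical.

```python
from urllib.parse import urlparse, unquote

# Hypothetical static rules: a folder fragment in the document path maps
# to an enterprise industry Id. Folder names and Ids are made up.
PATH_RULES = [
    ("/banking/", 1001),
    ("/insurance/", 1002),
]


def industry_ids_for_path(url):
    """Return the industry Ids whose folder fragment appears in the URL path."""
    # Decode %20 and similar escapes, then compare case-insensitively.
    path = unquote(urlparse(url).path).lower()
    return [ent_id for fragment, ent_id in PATH_RULES if fragment in path]
```

For example, a document at the assumed address https://sp.example.com/sites/content/Shared%20Documents/Banking/report.docx would be tagged with Id 1001, while a document outside the listed folders gets no industry tag from these rules.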

SQL Content Source Setup

Employee Application is SQL based and is representative of our non-SharePoint content sources

  • EmployeeSQL table

909 employees will be crawled from this table. Some have an application Term Id and some do not; some have application Text and some do not; some have both. This table will be used to test falling back to an alternate field when the primary field is empty, as well as normal SQL content tagging.
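The alternate-field test can be pictured with a short Python sketch. It assumes that the application Term Id is the primary field and the application Text is the alternate; the post does not say which is which, and the lookup dictionaries, sample values and function name are hypothetical.

```python
def resolve_ent_term(row, by_term_id, by_text):
    """Pick the enterprise term Id for one crawled employee row.

    Assumption: AppTermId is the primary field and AppText the alternate.
    The alternate is consulted only when the primary field is empty.
    """
    term_id = row.get("AppTermId")
    if term_id is not None:
        # Primary field is present: use it, even if it has no mapping.
        return by_term_id.get(term_id)

    text = (row.get("AppText") or "").strip().lower()
    if text:
        # Primary field is empty: fall back to the text value.
        return by_text.get(text)

    # Neither field is populated, so the row is left untagged.
    return None


# Hypothetical lookups built from the mapping table.
BY_TERM_ID = {42: 1001, 43: 1002}
BY_TEXT = {"banking": 1001, "insurance": 1002}
```

With these sample lookups, a row holding only AppText "Insurance" resolves to 1002 through the alternate field, and a row holding both values resolves through AppTermId alone.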

Next post in this series:

Enterprise Metadata Tagging – The Implementation

Enterprise Metadata Tagging – The Requirements

November 27, 2018 · January 19, 2019 · Bob Bachand · 3 Comments

This is the first of four blog posts about how we implemented a complex metadata mapping strategy within our enterprise. I wrote these posts to help others use BA Insight metadata tagging products with SharePoint and SQL content. There is little material online that walks through real-world scenarios showing how these tools work together to produce a working solution.

Series of posts describing the Enterprise Metadata Tagging project:

  • Enterprise Metadata Tagging – The Requirements [This one]
  • Enterprise Metadata Tagging – The Setup
  • Enterprise Metadata Tagging – The Implementation
  • Enterprise Metadata Tagging – The Demo

We are moving off an older, less flexible metadata tagging system and are looking for a more scalable and efficient one. Below are the goals and requirements for the new system, as well as a description of the conceptual approach and the use cases we will use.

Goals and Requirements

  • Automate taxonomy update process – Our current process is manual and we need an approach which would automate that process.
  • Increase scalability of application taxonomies – We can currently support only 3 application taxonomies. The next system should be able to handle any number (n+1) of application taxonomies.
  • Keep current taxonomy metadata tagging process in place until consumers can move to new approach – We will be changing our approach from refinements to filters and we have applications which will need time to change from the current properties and approach to the new.
  • Increase metadata tagging performance. Always…
  • Reduce deployment and down time when updating or adding metadata tagging data, processes or rules – We currently have to redeploy code during any taxonomy change or update.

Conceptual View

We have a SharePoint 2013 Search index into which we crawl our enterprise data. Our current user experience (a JavaScript-enhanced SharePoint page) queries that index for organic, promoted and refinement results.

This prototype will look at using a combination of a SQL database and the Managed Metadata Service (MMS) as sources of metadata for the tagging solution. The SQL database will be automated to consume data from our taxonomy tool, and MMS will be used for static data which does not change often.

We have SharePoint, Lotus Notes, SharePoint Online and SQL databases as sources of our Enterprise content. Our current connectors include the SharePoint connector and the BA Insight SharePoint Online, Notes and SQL connectors. We will only use the SharePoint connector and BA Insight SQL connectors for this prototype.

We will be using some BA Insight components to achieve the new taxonomy metadata tagging requirements. Smart Pipeline will be used to interact with the application documents in the SharePoint index pipeline being crawled by the connectors. AutoClassifier will be used to tag the application properties to enterprise properties for static values. Smart Pipeline Components like Custom Entity Extraction will be used to tag the application properties to enterprise properties for dynamically changing values.

Use Cases

  • Allow the current metadata tagging process to run simultaneously through the Content Enrichment Web Service (CEWS) – We have an existing metadata process which will need to continue after this new process is implemented.
  • Tag content to the enterprise taxonomy using the SharePoint and BA Insight SQL connectors – We use the OOTB SharePoint connector and many of the BA Insight connectors but the prototype will only use these two.
  • Tag content from application document properties to enterprise taxonomy properties – Identify and tag the application content correctly.
  • Tag all enterprise content to enterprise Term Ids and include Ids for each taxonomy level – Tag to the enterprise standard and ensure that every Id in a taxonomy path is present. This will facilitate the new filtering approach in the user experience.
  • Tag content from application document taxonomy property types (TermId, Text and MMS GUID) to enterprise taxonomy – Provide the ability to use these data types to map to the standard enterprise Term Id.
  • Tag content to enterprise taxonomy by application using URL fragments – Use the document's path property to tag to the desired industry or application taxonomy.
  • Tag content to enterprise taxonomy by application using document properties – Use the contentsource property to tag content to the correct application Id.
  • If content is not present in a specific application document property use an alternative property for tagging to enterprise taxonomy – Use a predefined alternate property to identify the correct taxonomy to tag a document to.
  • Dynamically change application taxonomy or enterprise metadata and see changes in search user experience – Demonstrate normal changes to documents and taxonomies, then show those changes flowing through the tagging process, which in turn applies them to the appropriate documents.
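The use case of including an Id for each taxonomy level can be illustrated with a small Python sketch. It assumes the enterprise taxonomy is available as a child-to-parent map; the Ids, the map itself and the function names are hypothetical and only show the idea of expanding a tag to its full path.

```python
# Hypothetical enterprise taxonomy: term Id -> parent term Id (None at the root).
PARENT = {100: None, 110: 100, 111: 110, 120: 100}


def path_ids(term_id, parent=PARENT):
    """Return the Ids from the root down to term_id, inclusive."""
    path = []
    seen = set()
    current = term_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in taxonomy at Id {current}")
        if current not in parent:
            raise KeyError(f"unknown term Id {current}")
        seen.add(current)
        path.append(current)
        current = parent[current]
    path.reverse()  # collected leaf-to-root, returned root-to-leaf
    return path


def expand_tags(term_ids, parent=PARENT):
    """Expand each tagged Id to its full path, de-duplicated, order preserved."""
    result = []
    for term_id in term_ids:
        for level_id in path_ids(term_id, parent):
            if level_id not in result:
                result.append(level_id)
    return result
```

Storing every level this way means a filter on a parent term can match documents that were tagged only with one of its descendants, which is the behavior the new filtering approach depends on.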

Next post in this series:

Enterprise Metadata Tagging – The Setup
