Enterprise Search Buyer’s Guide

by Team Capacity | Nov 24, 2019

While deciding on the correct enterprise search interface is daunting—new technology and accompanying jargon springs up almost daily—there’s no reason for the task to rise to the level of intimidation. Look at the search as a marathon, not a sprint.

Use a process consisting of methodical selection criteria, carefully evaluating each against the needs of the users and the organization. Prioritize those needs and work on the factors provided below, one at a time. It’s important to remain flexible with priorities. As you examine the types of enterprise search tools available, as you may see where a slight modification has no real impact on the final result.

Understand the infrastructure to search.

It is important to know how the interconnected systems are constructed. To do that, you must know:

Where the data is stored, the current volume of data, projected volume of data, and the technology used to store it
The type of data stored and any specific security needs for that data to maintain compliance (e.g., HIPAA, social security numbers, credit card numbers, etc.)
Anticipated future expansion of the data system, to allow for scalability
Which technologies are used in your storage system as well as the programming language utilized
If a proprietary standalone search system is needed, or if an out-of-the-box solution works, or perhaps a hybridized system is best
If the solution is able to communicate effectively and efficiently with other systems in use
If necessary, whether the solution allows for the complexity of security protocols needed to limit access to different levels of data
Whether or not the search system is accessible by clients
The availability of personnel with the skillset to customize the enterprise search solution

Your task is to research and make a recommendation on the type of enterprise search solution that is best for your company. How you make that decision requires some knowledge of the capabilities of the different types of solutions.

Compare the pros and cons of each type, and then compare the pros and cons of what is available to ensure you recommend a solution that is relevant to the function of your business and is scalable. In order to set your company up for success, both now and into the future, there are some terms you must understand to get started.

Types of Enterprise Search Solutions

Open source toolkits offer the most flexibility for scaling with the business. This type of solution requires the business to provide its own personnel to monitor the growth and expanding needs. In order to customize the search technology, personnel with the relevant knowledge in that technology and in the company’s data storage technology are needed. Some companies opt to go open source and use a technology firm to manage it.

Legacy solutions are older platforms from vendors no longer in business. It is unlikely that legacy systems have incorporated much, if any, of the newest technologies. Likewise, these systems are the least likely to integrate with AI or ML technologies. Legacy systems do not scale as well as newer solutions. However, if a business has the personnel to keep the system going, perhaps doing so can carry them over until the next iteration of solutions becomes available.

Application-specific search are the native search engines to a content management system. There are third-party solutions, often referred to as plugins, that mesh with the existing data management system. These search tools are limited to one dataset. As a result, it requires the user to open each system for various datasets and search it individually, as there is no cross-platform connectivity.

A lack of interoperability makes for a less than desirable user experience. If a business is small enough and uses only a couple of data management tools, these application-specific search systems may suffice.

Cloud solutions are, as the name implies, maintained on the cloud and out of view of clients. Cloud search platforms vary from the simple to comprehensive. Many are capable of connecting to other solutions to help businesses create a more robust system. Cloud solutions are customizable and scalable, but lack the single point of search.

Comprehensive platforms are available that incorporate a little of all of these solutions. Many vendors are building more machine learning and artificial intelligence into their flagship solutions. These vendors then provide varying amounts of ongoing support as the amount of data expands as a result of continued business growth.

Vendors can maintain the entire system, monitor the growth, identify issues, provide the fixes, add more customized enterprise search features, analyze the search logs, and divine better search algorithms as the needs of a business change. While this is the most comprehensive solution, it is also the most expensive.

How the enterprise search system operates.

Faceting is the vital capability that permits for user filtering of results, often filtered by category or field type. Filtering for ranges, such as between dates and dollar amounts, are other enterprise search features. Another key function of faceting is building a thesaurus of synonyms to allow users to use a single term to identify other terms with the same meaning.

Signal capture is the recording of the behavior data or “signals” of all, or specific users. Signals include user queries, clicks, adds, purchases, and other types of clicks when a user engages with a search result. This helps fine tune the algorithms so that the most relevant (as determined by clicks and time-on-page). Other signal captures include GPS, activity level, movement patterns, and other actions or events. Collectively, signals provide the basis for the predictive abilities of search engines.

Query pipelines are used with complex queries and complex data handling. This tool makes it possible for search engines to profile behavior. This is accomplished by breaking the search and returns into stages to allow the search engine the opportunity to modify the information.

Scalability and capacity are likely at the top of any business looking to implement an enterprise search engine. The whole point is to increase efficiency by speeding the delivery of the most accurate data returns. That leads to business expansion. The data types come into play under this category, as certain data types are more difficult to search than others.

Media data is notoriously large, and as we settle into media streaming, the ability to add more storage to the data system is vital. Thus, the enterprise search engine must have the capability of accommodating exponential data expansion.

Artificial intelligence and recommendations are the new frontier in search engines. Since there are so many data types, not every search is amenable to using keywords or key phrases. AI looks at the clicks from keyword/key phrase queries and determines those are present on the clicked page. If not, a subroutine updates the algorithm allowing that result to climb higher on the results page. Sometimes, none of the results are relevant. When that happens, the search engine learns by looking at the way the keywords or key phrases were used.

Illustration of enterprise search being helped by artificial intelligence

As the AI becomes smarter, the results to queries have better quality. Since all search queries are focused through the same AI “brain,” everyone with access to that search engine benefits.

High availability systems offer the ability to modify traffic if something goes wrong with hardware locally, or when internal networks experience some type of interruption. More importantly, advanced enterprise search systems have multiple layers of redundancy built into them. If there is data loss on one data server, the system recognizes this and reroutes queries to the next nearest server.

Even better is the ability of the system to attempt to self-heal, as it alerts its human minders that there is an anomaly. This reduces downtime and is often invisible to the users. This type of AI also lends itself to disaster recovery by providing the ability of the system to recognize external events, such as a fire, a hack, an earthquake, etc. This enables the system to shut down or disconnect to protect data.

Security is a critical factor for any search solution. Various types of data require different levels of security protocol. Some protocols will limit access to certain information, and other protocols are designed to sense intrusions from outside sources.

Without a robust security system built into the AI, the system is at risk of unauthorized data retrieval. For that reason, the teams responsible for maintaining it must remain vigilant and keep the system updated. While the AI is capable of performing some of these tasks, team members must manually verify that it is done.

Cookie	Duration	Description
__tld__	session	Description is currently not available.
_cfuvid	session	Description is currently not available.
_no_tracky_101397840	1 hour	Description is currently not available.
rl_session	1 year	Description is currently not available.
ubpv	6 months 1 day	No description available.
ubvs	5 months 27 days	No description available.
ubvt	3 days	No description available.
VISITOR_PRIVACY_METADATA	5 months 27 days	Description is currently not available.

Cookie	Duration	Description
bcookie	1 year	LinkedIn sets this cookie from LinkedIn share buttons and ad tags to recognize browser IDs.
li_sugr	3 months	LinkedIn sets this cookie to collect user behaviour data to optimise the website and make advertisements on the website more relevant.
yt.innertube::nextId	never	YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requests	never	YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen.

Cookie	Duration	Description
__cf_bm	30 minutes	Cloudflare set the cookie to support Cloudflare Bot Management.
INGRESSCOOKIE	session	This cookie is used for load balancing and session stickiness. This technical session identifier is required for some website features.
li_gc	5 months 27 days	Linkedin set this cookie for storing visitor's consent regarding using cookies for non-essential purposes.
lidc	1 day	LinkedIn sets the lidc cookie to facilitate data center selection.
UserMatchHistory	1 month	LinkedIn sets this cookie for LinkedIn Ads ID syncing.
VISITOR_INFO1_LIVE	5 months 27 days	YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface.
visitorId	1 year	ZoomInfo sets this cookie to identify a user.
yt-remote-connected-devices	never	YouTube sets this cookie to store the user's video preferences using embedded YouTube videos.
yt-remote-device-id	never	YouTube sets this cookie to store the user's video preferences using embedded YouTube videos.