Software Recognition and License Management – The Pros and Cons of Different Recognition Methodologies

By Kris Barker, Apptria Technologies

Anyone working in the trenches of asset management has experienced the difficulties related to poor software recognition. On the surface, software identification may seem to be a very basic, low-level task; however, nothing could be further from the truth. And the consequences of poor software identification, especially for organizations trying to stay compliant with license agreements, can be extremely costly.

Any software product that needs to access, interact with, or report on installed desktop applications requires a method of discovering and recognizing those applications. Without strong application discovery and recognition capabilities, software data is not reliable as the foundation for critical IT processes and decision-making. For software license management, software identification capabilities are absolutely critical.

Ascertaining the presence of applications on a PC is a fairly straightforward process from a technology standpoint. However, there exists no common, standardized method for correlating discoverable application data with its actual software title, version, manufacturer, or other important application data. In fact, most products that claim to discover and identify applications in an automated way rarely deliver information reliable and accurate enough to calculate the correct number of installed titles.

Many license management products rely on one of two methodologies to identify software:

  • Examining file headers
  • Analyzing registry entries

Due to the inherent lack of uniformity within the computing environment, either of these methods used in isolation is inaccurate and/or incomplete, leading to incorrect conclusions about one’s license position and/or countless work-hours spent trying to correct the tool’s mistakes.

Other products rely on a third methodology of using a software catalog. While software catalogs vary widely in their accuracy and comprehensiveness, if properly maintained, they offer the strongest potential for effective license management at this time.

Common Software Recognition Methodologies

File Header Identification

File header analysis is one commonly used technique for identifying installed applications. The advantage of this approach is that it is directly tied to the application executable; the presence of an application is determined by the presence of its executable(s). However, one major issue with file header analysis is the lack of reliability of the information that the file header contains. Many software vendors don’t provide complete information or don’t update file headers on a regular basis, leading to inaccurate, inconsistent, or even missing application data. For example, the file header for Google’s Chrome browser, when read on the Windows platform, reveals its name and copyright, but no version information at all. In addition, Adobe products are notorious for being named inconsistently, making it difficult to evaluate data in a consolidated, unified manner.

Even when applications do have reliable header information, a much more significant problem exists. Applications frequently consist of numerous (sometimes hundreds or thousands of) executable files, and in many cases, multiple applications from a single vendor may share one or more identical executables. Because file headers contain no information revealing the relationship between executables, there is no way of knowing to which application a shared executable corresponds.
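
To make the shared-executable problem concrete, here is a minimal Python sketch. It is not any product's actual method. The folders, file names, and product names are all hypothetical and stand in for header data a scanner might read from a single PC on which two licensable applications are installed.

```python
from collections import defaultdict

# Hypothetical header data from one PC: (install folder, file name, header product name).
# Every value here is made up for illustration.
discovered = [
    ("C:/Apps/ExampleSuite", "suite.exe", "Example Suite"),
    ("C:/Apps/ExampleSuite", "engine.exe", "Example Engine"),
    ("C:/Apps/ExampleEditor", "editor.exe", "Example Editor"),
    ("C:/Apps/ExampleEditor", "engine.exe", "Example Engine"),
]


def naive_title_count(records):
    """Treat every distinct header product name as a separate installed title."""
    return len({product for _folder, _file, product in records})


def shared_executables(records):
    """Map each file name seen in more than one folder to the folders holding it."""
    folders = defaultdict(set)
    for folder, file_name, _product in records:
        folders[file_name].add(folder)
    return {name: sorted(paths) for name, paths in folders.items() if len(paths) > 1}
```

With this sample data, the header-based count reports three titles even though only two licensable applications are assumed to be installed, and nothing in the header says which of the two applications engine.exe belongs to.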

The end results of basing application recognition on this information are:

  • Artificially inflated application counts that may lead to the wrong conclusion that fewer licenses have been purchased than are actually required
  • Uncertainty as to which application a shared executable represents

Registry Analysis

Many technologies rely on information contained in the Windows registry to determine what software is installed on a PC. However, there are difficulties associated with this methodology. First, programs installed using means other than the Windows installer are often not detected; and even when they are, they may lack critical version information. More importantly, registry information (such as that shown in the Add/Remove Programs control panel) is based on how a product is packaged from an installation perspective, not how the components of the product need to be evaluated from a licensing standpoint. In other words, each entry in Add/Remove does not necessarily indicate a separate application for which a license is required.
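
The gap between installer packaging and licensing can be illustrated with a small Python sketch. The entry names and the mapping below are hypothetical; in practice, someone has to build and maintain such a mapping, which is precisely the knowledge the registry does not contain.

```python
# Hypothetical Add/Remove Programs display names found on one PC.
uninstall_entries = [
    "Example Office Suite 2010",
    "Example Office Suite 2010 Proofing Tools",
    "Example Office Suite 2010 Service Pack 1",
    "Example PDF Reader",
]

# Hypothetical, hand-maintained mapping from installer entries to the product
# that actually requires a license.
LICENSABLE_PRODUCT = {
    "Example Office Suite 2010": "Example Office Suite 2010",
    "Example Office Suite 2010 Proofing Tools": "Example Office Suite 2010",
    "Example Office Suite 2010 Service Pack 1": "Example Office Suite 2010",
    "Example PDF Reader": "Example PDF Reader",
}


def licensable_titles(entries, mapping):
    """Collapse installer entries into licensable products; report unmapped entries."""
    titles = set()
    unmapped = []
    for entry in entries:
        product = mapping.get(entry)
        if product is None:
            unmapped.append(entry)
        else:
            titles.add(product)
    return sorted(titles), unmapped
```

Here four registry entries collapse into two licensable titles. Counting the raw entries would overstate the number of licenses required.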

Software Catalogs

An alternative to the identification methods described above is the use of a software recognition database. Such software catalogs are generally built by teams who understand how to turn raw data into normalized information representing how software is actually licensed. Software catalogs are therefore typically more useful than the other methods for generating accurate license counts and evaluating one’s license position. Special algorithms align discovered executable files and other application information to the application data residing in the database. The end result is a list of normalized software titles, manufacturers, versions, and other information indicated by the discovered data.

Software catalogs often contain non-discoverable information such as software categories, SKU information, product use rights, and just about any other relevant data point that can be associated with a specific application title. Assuming that the algorithms are properly constructed, this additional software data makes the software catalog approach comprehensive and highly accurate.
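
As a rough illustration of catalog-based recognition, the Python sketch below matches discovered data against a tiny catalog. The signature format (file name plus raw header version), the class, and all entries are assumptions made for this example; real catalogs use richer signatures and matching algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    title: str
    manufacturer: str
    version: str
    category: str  # non-discoverable data supplied by the catalog's curators


EDITOR = CatalogEntry("Example Editor", "Example Corp", "3.1", "Text editing")

# Hypothetical signatures: (lower-cased file name, raw header version) -> catalog entry.
CATALOG = {
    ("exedit.exe", "3.1.0.2291"): EDITOR,
    ("exedit_helper.exe", "3.1.0.2291"): EDITOR,
}


def normalize(discovered, catalog):
    """Return (recognized entries without duplicates, unrecognized raw records)."""
    recognized = set()
    unrecognized = []
    for file_name, raw_version in discovered:
        entry = catalog.get((file_name.lower(), raw_version))
        if entry is None:
            unrecognized.append((file_name, raw_version))
        else:
            recognized.add(entry)
    return sorted(recognized, key=lambda e: (e.title, e.version)), unrecognized
```

Two executables from the same product resolve to a single normalized title, and anything the catalog does not know is returned separately so it can be reviewed rather than silently dropped.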

Evaluating Software Recognition within License Management Products

There are a number of areas one should carefully evaluate when assessing a license management product’s software identification capabilities. If managing software licenses is the most important goal, it’s critical to evaluate not just what a license management product recognizes, but also how it recognizes, interprets, and classifies the data it discovers. In particular, products should perform recognition in a license-centric manner that allows what’s discovered to be easily aligned with entitlement requirements.

Here are other key areas to carefully examine:

Breadth of product’s application coverage, especially when compared to your organization’s titles: Many license management solutions base their application identification capabilities, at least in part, upon a software catalog or signature library. If this is the case, it’s critical to evaluate how frequently the catalog and/or its recognition capabilities are updated. Furthermore, depending on your own organization’s software upgrade policies and procedures, it’s equally important to investigate a catalog’s coverage from a historical standpoint. Actively maintaining an exhaustive signature library requires both expertise and a long-term commitment from the catalog’s curator. Unfortunately, license management vendors often place a greater emphasis on features that look impressive on a spec sheet but contribute far less to end users’ abilities to effectively manage their software assets.

Proper identification of complex applications such as suites and editions: Accurate software recognition means more than simply knowing to which application an executable or registry entry corresponds. Additional information is needed to determine which version, release, or edition of an application or suite is installed, which is a critical data element for understanding license entitlements. The following two questions present specific capabilities to take into account:

  1. Can it differentiate between applications that share a common executable? Applications that are bundled within multiple editions (such as the Standard and Professional versions of Adobe Acrobat) generally share a common executable; therefore, executable analysis alone isn’t enough to reveal which edition is actually installed. In such situations, additional information (such as that found in the registry) is required to make this determination.
  2. How does the tool recognize suites and their constituent components? The presence of suites such as Microsoft Office adds complexity to the software identification process. It’s not sufficient to simply identify which components are installed to determine the license entitlement. Also, analyzing Add/Remove Programs entries generally won’t produce the component-level installation or usage data required for optimal license management. Identifying such suites requires a hybrid approach that takes into account both discovered executable files and additional information found within the registry.
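
The hybrid approach raised by both questions can be sketched in Python as follows. The product names, executable names, and registry value names are invented for this example; a real tool would read them from the file system and the Windows registry and would apply far more detailed rules.

```python
# Hypothetical rules. The executable identifies the product family; a registry
# value (names invented for this sketch) identifies the edition or suite.
EDITION_BY_CODE = {"STD": "Standard", "PRO": "Professional"}
SUITE_COMPONENTS = {"exwrite.exe": "Example Writer", "excalc.exe": "Example Calc"}


def identify_edition(executables, registry_values):
    """Name the installed edition of the hypothetical 'Example Doc' product."""
    if "exampledoc.exe" not in executables:
        return None
    code = registry_values.get("ExampleDoc.Edition")
    edition = EDITION_BY_CODE.get(code, "unknown edition")
    return f"Example Doc {edition}"


def identify_suite(executables, registry_values):
    """Report the license to count plus the components found on disk."""
    components = sorted(
        title for exe, title in SUITE_COMPONENTS.items() if exe in executables
    )
    is_suite = registry_values.get("ExampleSuite.Installed") == "1"
    return {"license": "Example Suite" if is_suite else None, "components": components}
```

The shared executable alone only establishes that some edition is present; the registry value decides which one. Likewise, the suite marker decides whether the discovered components count as one suite license, while the executables still provide the component-level detail.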

Handling unrecognized applications: It’s important to acknowledge that no license management product will be 100% complete; some applications – particularly those developed in-house and more specialized or obscure commercial software – will not be recognized out of the box. But any technology you consider for license management should provide a straightforward way to account for custom and/or unrecognized applications so these applications can be managed alongside recognized software in a unified manner.

In summary, it’s critically important to carefully explore any license management product’s software recognition capabilities. Without accurate software identification, it’s virtually impossible to rely on presented data for making decisions related to the management of licenses and software assets. The end result is the need to devote resources to reconciling raw, discovered data with their corresponding licensable entities, a time-consuming and error-prone process that’s impractical to repeat with any regularity. Evaluating a license management technology’s software identification capabilities in isolation from its other features is highly recommended, since recognition is critical to the success of any organization’s software license management efforts.