Do You Trust Your Discovery Data?

By Aimee Dundon

Good asset discovery data is the foundation of every IT Asset Management program, for both hardware and software. Tools do more than ever to track and monitor assets. Despite this, a lot of IT Asset Management is still accomplished in spreadsheets, with manual steps needed to fill in missing information or to clean up a report before it is sent to a key stakeholder. Knowledge workers spend up to 50% of their time looking for high-quality data (Redman, 2008). As a practitioner, I have seen organizations spend time and money on tools and projects to “get better data”. Without specific, measurable goals, these projects are doomed to failure. If you want discovery data you can trust, you need to understand your needs, honestly assess the quality of the information in your inventory system today, and make a plan to correct problems.

Your sources (probably) weren’t made with ITAM in mind

Many discovery tools are not designed for IT Asset Management. They are designed for patching or network administration. Financial systems might include detailed financial information but no technical information. As a result, these systems may be missing information that is critical for doing your job. For example, network management tools will not have financial information, but they are a good source of information about network resources. DAMA, the Data Management Association International, defines data quality as “The degree to which data is accurate, complete, timely, consistent with all requirements and business rules, and relevant for a given use” (DAMA International, 2009). In order to have “good data”, you have to understand your needs. Look for the information that appears most often on your most commonly used reports. Also consider the information that is being filled in after the fact, and look for ways to add that information to your tool.

In addition to understanding what information is available in your discovery tool, it is essential to understand some basics about how data is collected. Most discovery for Windows desktops and laptops relies on information in the Windows registry or the Windows Management Instrumentation (WMI) repository. These sources are usually used to get hardware configuration information. The information about installed software usually comes from Add/Remove Programs, from the executable files on the device, or from a combination of the two. There are tools that work in a similar way for Apple devices and Linux. All of this information might be consolidated on the server for the discovery tool, or on a beacon server, before it is processed and added to the application. New information is transmitted and added to the application on a schedule. It’s important to have a high-level understanding of these elements in order to see where the system could be breaking down.

Ensuring High-Quality Data for ITAM

Think about the typical report from a discovery system. A casual scan will reveal that some information is missing for some devices. There could be repeated information and duplicates. There could be old information. In the following sample discovery report, there are some problems with the data. Two rows have the same serial number, which means that the serial number on one or both of them is incorrect. Several fields are missing information. If you count every individual cell with a defect and divide by the total number of cells, this looks like a 9% error rate. If you count the rows with at least one defect and divide by the total number of rows, it becomes a 70% error rate (Redman, 2008). Focusing on the number of rows that are unusable changes how the problem will be perceived when you ask another team to change a process. To fix this problem, you would have to report the issue and get it fixed, preferably at the source, so that the same problem won’t affect other people.
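The gap between a cell-level and a row-level error rate is easy to reproduce. The sketch below uses a small, made-up inventory (the field names and values are hypothetical and are not the sample report discussed above). It assumes that a defect is either an empty value or a serial number that appears on more than one row.

```python
from collections import Counter

# Hypothetical inventory rows; None marks a missing value.
ROWS = [
    {"name": "PC-001", "serial": "SN100", "model": "Model A", "location": "HQ"},
    {"name": "PC-002", "serial": "SN101", "model": None, "location": "HQ"},
    {"name": "PC-003", "serial": "SN102", "model": "Model A", "location": "HQ"},
    {"name": "PC-004", "serial": "SN102", "model": "Model B", "location": "Branch"},
    {"name": "PC-005", "serial": None, "model": "Model B", "location": None},
]


def defect_rates(rows, key_field="serial"):
    """Return (cell_error_rate, row_error_rate) for a list of dict rows."""
    # Serial numbers that appear on more than one row are treated as defects.
    counts = Counter(r[key_field] for r in rows if r[key_field])
    repeated = {value for value, n in counts.items() if n > 1}

    bad_cells = 0
    bad_rows = 0
    total_cells = 0
    for row in rows:
        row_defects = 0
        for field, value in row.items():
            total_cells += 1
            missing = value is None or value == ""
            duplicate = field == key_field and value in repeated
            if missing or duplicate:
                row_defects += 1
        bad_cells += row_defects
        if row_defects:
            bad_rows += 1
    return bad_cells / total_cells, bad_rows / len(rows)


cell_rate, row_rate = defect_rates(ROWS)
print(f"cell-level error rate: {cell_rate:.0%}")
print(f"row-level error rate: {row_rate:.0%}")
```

With this made-up data, 5 of 20 cells have a defect (25%), but 4 of 5 rows have at least one defect (80%). The same data looks very different depending on which denominator is used.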

It’s easy to see that some records are missing a single piece of information. This could be happening for multiple reasons. It’s not uncommon for an individual workstation to have a corrupted WMI repository, which could result in missing information. If a large number of computers have the same issue, such as a missing serial number, that suggests a larger problem than an individual workstation. It may be necessary to work with technical teams to find out what the affected devices have in common. If all of the machines have a specific application installed, come from the same location, or share some other common factor, that may be part of the root cause.
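One way to look for a common factor is to compare how often a field is missing within each group of devices. This is a minimal sketch under assumed field names (`serial`, `location`, `model`) and made-up records. A real export from a discovery tool will have its own column names.

```python
from collections import defaultdict

# Hypothetical discovery records; "" and None both count as missing.
RECORDS = [
    {"name": "PC-001", "serial": "SN100", "location": "HQ", "model": "Model A"},
    {"name": "PC-002", "serial": "SN101", "location": "HQ", "model": "Model B"},
    {"name": "PC-003", "serial": "SN102", "location": "HQ", "model": "Model A"},
    {"name": "PC-004", "serial": "", "location": "Branch", "model": "Model A"},
    {"name": "PC-005", "serial": None, "location": "Branch", "model": "Model B"},
    {"name": "PC-006", "serial": "SN105", "location": "Branch", "model": "Model B"},
]


def missing_rate_by(records, missing_field, group_field):
    """Share of records in each group whose missing_field is empty."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        group = rec.get(group_field) or "(unknown)"
        totals[group] += 1
        if not rec.get(missing_field):
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


by_location = missing_rate_by(RECORDS, "serial", "location")
by_model = missing_rate_by(RECORDS, "serial", "model")
print("by location:", by_location)
print("by model:", by_model)
```

In this made-up data the missing serial numbers are spread evenly across models but concentrated at one location, which would point the investigation at that site rather than at a hardware model.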

It’s possible for machines to be completely missing from discovery data. While examining reports, count the number of machines showing for each location, and verify that the counts look right. A beacon server can stop communicating, which would make one or more entire sites disappear from the system. Talk to the administrator of the discovery tool to find out how long a system remains visible after it stops communicating. If computers are being archived or deleted, make sure there is a process to verify that they are in storage or have been sent for disposal. Discovery records can also be compared to purchase records and disposal records. If a machine with a particular serial number was purchased but doesn’t show up in discovery data or in the disposal records, and the discovery system itself is working, there may be a process that is breaking down.
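Both checks described above, per-site counts and reconciliation against purchase and disposal records, can be sketched with plain set and counter operations. Everything in this example is an assumption for illustration: the serial numbers, the site names, the expected counts, and the 80% tolerance are made up, and the function names are hypothetical.

```python
from collections import Counter


def normalize(serial):
    """Trim whitespace and upper-case a serial so sources compare cleanly."""
    return serial.strip().upper()


def unaccounted_serials(purchased, discovered, disposed):
    """Serials that were purchased but appear in neither discovery nor disposal."""
    bought = {normalize(s) for s in purchased}
    seen = {normalize(s) for s in discovered}
    retired = {normalize(s) for s in disposed}
    return sorted(bought - seen - retired)


def sites_below_expected(discovered_locations, expected_counts, tolerance=0.8):
    """Sites whose discovered count is below tolerance * expected count."""
    actual = Counter(discovered_locations)
    return {
        site: (actual.get(site, 0), expected)
        for site, expected in expected_counts.items()
        if actual.get(site, 0) < tolerance * expected
    }


# Hypothetical extracts from purchasing, discovery, and disposal records.
PURCHASED = ["SN100", "SN101", "SN102", "SN103", "SN104"]
DISCOVERED = ["sn100", "SN101 ", "SN104"]
DISPOSED = ["SN102"]

missing = unaccounted_serials(PURCHASED, DISCOVERED, DISPOSED)
print("purchased but unaccounted for:", missing)

# Hypothetical location of each discovered machine, and expected site sizes.
LOCATIONS = ["HQ", "HQ", "HQ", "Branch"]
EXPECTED = {"HQ": 3, "Branch": 5, "Warehouse": 2}

low = sites_below_expected(LOCATIONS, EXPECTED)
print("sites reporting fewer machines than expected:", low)
```

A site that reports nothing at all, like the made-up "Warehouse" here, is the pattern a silent beacon server would produce. A single unaccounted serial number is more likely a process gap for one machine.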

There are several things that could cause duplicates, like the two rows with the same serial number but different machine names. This could be the result of poor computer imaging practices, such as cloning the hard drive between machines. It could also happen if the computer was redeployed for a new employee. Or it could be caused by the discovery tool itself. Discovery tools and Asset Management programs have built-in business rules that determine whether incoming information should update an existing record or create a new one. Depending on how those rules are set up, they could create duplicates. For example, a tool that uses the combination of computer name and serial number to decide whether a record is unique would not catch this error.

To permanently correct the problem, it might be necessary to have a strict process for naming computers that ensures that every computer has a unique record. Alternatively, some tools allow the application administrator to adjust the rules for incoming data to prevent problems. The criteria selected should be either something that refers to a unique device, or a combination of fields that identify the device and the configuration. Information that can change or be shared between devices does not work as a unique identifier. For example, an IP address might be used by one computer on one day and by a different device on another day. MAC addresses identify network cards, but they are difficult to use for desktops and laptops because an individual workstation might have multiple MAC addresses. Some companies track a configuration item, which might be defined as a specific computer with a specific software configuration for a particular user. In this case, it may be necessary to use a combination of two or three pieces of information, like the computer name and the serial number together, to identify that configuration item.
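The effect of the matching rule on the duplicate-serial case can be shown by grouping the same records under two different keys. This is only an illustration with made-up records and a hypothetical helper, not how any particular discovery tool implements its rules.

```python
from collections import defaultdict

# Hypothetical records: two machine names reporting the same serial number.
RECORDS = [
    {"name": "PC-003", "serial": "SN102"},
    {"name": "PC-004", "serial": "SN102"},
    {"name": "PC-005", "serial": "SN103"},
]


def find_duplicates(records, key_fields):
    """Group records by the chosen key fields and keep groups with more than one record."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        groups[key].append(rec)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


# Rule 1: a record is unique if name + serial is unique.
by_name_and_serial = find_duplicates(RECORDS, ("name", "serial"))

# Rule 2: a record is unique if the serial number alone is unique.
by_serial = find_duplicates(RECORDS, ("serial",))

print("name + serial rule flags:", by_name_and_serial)
print("serial-only rule flags:", by_serial)
```

Under the name-plus-serial rule nothing is flagged, because each combination is distinct. Under the serial-only rule the two machines that share a serial number are grouped together for review. Which rule is appropriate depends on whether the organization is tracking a physical device or a configuration item.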

Working with other departments

As a result of all of this work, it will probably be necessary to work with other teams to resolve these issues. It can be very difficult to hold other people accountable for data quality issues. When working with other teams, keep in mind that the goal is to make everyone’s work better. By focusing on the issues, not on the people, and by avoiding blame, it’s possible to make relationships between teams stronger. Data quality should be tracked with the relevant groups on an ongoing basis. This will keep problems from recurring and make sure that changes to processes become permanent.


DAMA International. (2009, April). The DAMA Guide to the Data Management Body of Knowledge (DAMA-DMBOK). Technics Publications.
Redman, T. C. (2008). Data Driven: Profiting from Your Most Important Business Asset. Harvard Business Review Press.

About the Author