Validation of MS Excel Spreadsheets

Are your spreadsheets in compliance, avoid a 483!

This article will present a simple method for validating MS Excel spreadsheets for GXP use. The goal of our validation strategy is to provide better testing and documentation of individual spreadsheets in less time. Once we establish this methodology, we can rapidly apply it to many spreadsheets to reach compliance faster.

Scope

This article will attempt to cover the most common examples of spreadsheet validation activities that we find. We will attempt to cover special or very complicated examples whenever possible, but it should be understood that anything not explicitly covered in this article can be added as needed to fit your existing validation requirements or for special cases. A good example is any spreadsheet that uses or is used with custom automation. We have validated spreadsheets that contained hundreds of formulas and custom macros, but did not require any user intervention at all because they were created, populated with data and saved entirely by external code. In this case, the focus of the validation was on validating formulas and macros, but security testing was limited to proving that users could not intercept or interrupt the operation at any time.

Assumptions

We are not going to make any assumptions about the existing SDLC (Software Development Life Cycle) that may be currently in use, any material presented here can be adapted to meet your existing standards and practices.
We are also going to assume that the spreadsheet that need validation has already been written, as this is what we find in the overwhelming majority of the cases presented to us.

Approach

The basics of our approach are as follows:

  • We define our basic documentation practices and methodology in a single document, the Spreadsheet Validation Master Plan. This document can be referred to for all individual spreadsheet validation projects, and will not need to be reviewed or approved for each validation effort.
  • We define the requirements for each sheet or chart in the workbook, then focus our testing on verifying these requirements.
  • A heavy emphasis is placed on defining and testing formulas, and also on the security for each sheet by limiting the parts of each sheet that the users are allowed to edit.

Methodology

To keep things simple, we define how we enter information into the requirements and design specification documents, then describe how we go about testing the spreadsheet in the test protocols. Keep in mind that input cells may be defined as one single cell or as a range of cells.

Requirements Specification

The requirements specification should include all the requirements that your spreadsheet must accomplish. This document should be kept simple and relatively non-technical so that anyone who reads the document will understand that the requirements are.

Starting with a common validation template:

  • List all sheets and charts in the workbook
  • What users should be allowed to enter into the sheet. These are your input cells. You may find it helpful here and for later use to label each input cell, or they can be labeled in the design specification document.
  • What formulas or calculations exist on the sheet.
  • What cells represent the output or final calculation of the sheet. This may also refer to a chart.
  • For each chart, define properties like the title, axis labels and units, and the datasets used to create the chart.

Design Specification

Now that you have defined your requirements for each sheet, you can create the detailed software design specification document. The purpose of this document is to describe how the requirements have been implemented. This document should include enough information so that a developer could create the entire software project based on the information contained within this document and from reading the requirements specification
We like to break each sheet into four sections: Inputs, Processing, Outputs and Security.

Inputs

Document the cells users are expected to enter or update data. In an automated system, you can also define the source or the input data or instructions. If any validation rules are used to enforce proper data entry, these should be documented here as well.

Processing

Processing is mostly about documenting the formulas that are used on the sheet. Any custom macros or code is also documented here.
The majority of errors that we find when validating sheets is in the formulas. The best way we have found to catch these errors is to define each formula using the actual names of the variables represented as input cells.

For example, it may be easy to document that the range of cells F10:F20 contains the formula “=(A10*$C$5)/($D$5*B10). However, it is difficult to verify if this formula is correct. We recommend writing the formulas out like this:

  • Cell $C$5: Volume (V)
  • Cell $D$5: Ideal Gas Constant (R)
  • Cells A10:A20 Pressure (P)
  • Cells B10:B20 Temperature (T)
  • F10:F20 Final Result (moles of gas, n)
  • n=(PV)/(RT)

This is the easiest and most effective way we have found to verify that formulas are correct / to catch errors in formulas.

The other kind of processing is the validation of macros and code that is used in the spreadsheet. Here, the design specification is a good place to copy the code and annotate as needed to describe the purpose of each macro or function.

Note: If you have good coding standards and user proper headers and comments in your code, this step may be already done for you.

Outputs

Outputs usually fall under one of the following three main categories:

1. The cell or range of cells that contain the final result of all previous calculations
2. Charts – many times these are printed and saved with external reports
3. Data that is copied into a final result sheet or exported to a separate file or database.

Security

This section can be a short statement, i.e. “All non-input cells should be locked to prevent changes.” You can also include additional security settings if you are using either custom code or a third party add-on to implement multiple levels of security to control who can edit certain cells, run a macro or function, etc.

Test Protocols

The testing of any spreadsheet should prove that the requirements were properly implemented according the design specifications.

Installation Qualification (IQ) testing is usually limited to making sure the file is in a location where users can access the file, unless the workbook is part of a larger automation project. Operational Qualification (OQ) testing is mostly about verifying formulas, macros, and also to test the security of each sheet to verify that all non-input cells are locked to prevent changes. The IQ and OQ can be combined into a single IOQ protocol as needed.

To start generating our test cases, Once again, we like to break testing down into Inputs, Processing, and Outputs, but security testing is easier to test in separate test cases.

Process & Output Testing

These are usually easier to combine when writing test cases. Once again, FOCUS ON TESTING THE FORMULAS!

  • Are formulas correct? There are several methods for verifying and testing formulas
  • Continue testing until all input cells have had data entered on changed, and that result has been verified. The key point is to look for formulas that are incorrect or charts that are not using the correct or complete set of data, or are pointing to the wrong columns of data.
  • Macros can be tested by entering a range of data and comparing the results with a hand calculator, or by visual inspection that the function performed as expected.
  • Charts can be tested by a combination of visual inspection or verification of the properties, including the dataset used as the basis for the charts.

Security Testing

The type and amount of security testing that you can do for spreadsheets is largely based on how security has been implemented. At a bare minimum, you should test that users are limited to entering data into the defined input cells only, and that they do not have the ability to alter any other part of the spreadsheet. Failure to do this can compromise the integrity of the validation effort and of any data or information generated by the workbook.

Conclusion

This methodology will result in a User/Functional Requirements Specification, a Software Design Specification, and an IOQ Protocol ready for approval and execution. Any deviations found during testing can be handled according to existing validation practices, and a summary report that shows that all the activities specified in the Validation Master Plan or SOP can be generated.

About the Author

Tyson Mew is President of Ofni Systems, a regulatory compliance consulting, software and validation firm for FDA-regulated companies. They are the creators of the ExcelSafe software for adding full Part 11 compliance to Excel Spreadsheets, and also the FastVal Validation software for automating the generation and execution of many popular types of software application, including Excel spreadsheets. For more information, visit www.Ofnisystems.com.

Excel Complete – Validate your spreadsheets today!

If you would like to get a free quote on any spreadsheet validation project please feel free to contact us directly.

17 thoughts on “Validation of MS Excel Spreadsheets

  1. Very nice article, its a good refresher course for me.

    Thanks you – Love the new design!

    Keep up the good work

  2. Very nice article, its a good refresher course for me.

    Thanks you – Love the new design!

    Keep up the good work

  3. If you do a search for “Printing Formulas To Word” you’ll come across a page by CPearson. The macro will print all the formulas in a selected region in an Excel spreadsheet to a Word document. It saves so much time!
    To get the macro to run you need to “Make sure that you have enabled references to Word objects, from the Tools->References menu” in the VB Editor menu which can be confusing for novices to do.
    The problem we’ve had with validating Excel spreadsheets is getting the upper echelons to sign off the documents… they just don’t understand it! ;)

  4. If you do a search for “Printing Formulas To Word” you’ll come across a page by CPearson. The macro will print all the formulas in a selected region in an Excel spreadsheet to a Word document. It saves so much time!
    To get the macro to run you need to “Make sure that you have enabled references to Word objects, from the Tools->References menu” in the VB Editor menu which can be confusing for novices to do.
    The problem we’ve had with validating Excel spreadsheets is getting the upper echelons to sign off the documents… they just don’t understand it! ;)

  5. The problem with the entire concept of validating an Excel spread sheet is that there is no audit trail that comes with an out of the box Excel spreadsheet. Which means there is no way to track if the data has been modified after the original entry, when the modification occurred, why it occurred and who did it. The Excel spread sheet maybe doing the calculations correctly but any auditor worth their salt would not be satisfied with a system carrying out a critical function that did not have an audit trail. There are some third party add-ons that install an audit trail into an excel spread sheet. I am not sure how good these are.

  6. The problem with the entire concept of validating an Excel spread sheet is that there is no audit trail that comes with an out of the box Excel spreadsheet. Which means there is no way to track if the data has been modified after the original entry, when the modification occurred, why it occurred and who did it. The Excel spread sheet maybe doing the calculations correctly but any auditor worth their salt would not be satisfied with a system carrying out a critical function that did not have an audit trail. There are some third party add-ons that install an audit trail into an excel spread sheet. I am not sure how good these are.

  7. Excelent article! Very clear and straight forward.

    My only comment is although it is probably not a problem 99% of the time, be aware of rounding errors when comparing Excel calculations with a calculator, especially where there may be several to many iterations in a calculation! Also if displaying the result to only a few decimal places errors can creep in as Excel calculates to 18 dp (I think) and doing cumlulative calculations can again cause rounding errors. I know, because this has happened to me!

    I have also heard of audit trail addons, but I cant vouch for these either.

  8. Excelent article! Very clear and straight forward.

    My only comment is although it is probably not a problem 99% of the time, be aware of rounding errors when comparing Excel calculations with a calculator, especially where there may be several to many iterations in a calculation! Also if displaying the result to only a few decimal places errors can creep in as Excel calculates to 18 dp (I think) and doing cumlulative calculations can again cause rounding errors. I know, because this has happened to me!

    I have also heard of audit trail addons, but I cant vouch for these either.

  9. There are numerous add-ons to Excel which provide audit trails and the necessary features to enable 21CFRpart11 compliance. DaCS is probably the most widely used add-on in the Life Sciences arena, details can be found at http://www.spreadsheetvalidation.com

    One other consideration when validating spreadsheets is the time/costs required for the documentation/testing. If you have a large (ie more than about 5!) portfolio of GxP critical spreadsheets, then I would strongly recommend that the validation process is ‘genericised’ by means of standard document templates to allow rapid turnaround.

    There is a mini-series of papers available at http://www.spreadsheetvalidation.com describing a “Pragmatic Approach to the Validation of Excel Spreadsheets”. Using this approach we find that the majority of relatively simple spreadsheets (as typically used for data calculation/reporting etc), can be validated in 4-5 days, this includes full documentation and testing.

  10. There are numerous add-ons to Excel which provide audit trails and the necessary features to enable 21CFRpart11 compliance. DaCS is probably the most widely used add-on in the Life Sciences arena, details can be found at http://www.spreadsheetvalidation.com

    One other consideration when validating spreadsheets is the time/costs required for the documentation/testing. If you have a large (ie more than about 5!) portfolio of GxP critical spreadsheets, then I would strongly recommend that the validation process is ‘genericised’ by means of standard document templates to allow rapid turnaround.

    There is a mini-series of papers available at http://www.spreadsheetvalidation.com describing a “Pragmatic Approach to the Validation of Excel Spreadsheets”. Using this approach we find that the majority of relatively simple spreadsheets (as typically used for data calculation/reporting etc), can be validated in 4-5 days, this includes full documentation and testing.

  11. Kindly note that the Off the box MS EXcel does actually have a form of audit trail. This is activated when the file is shared through the Share Workbook function or Protect and Share Workbook. Once saved in this form on the network new options appear on track changes in which a new sheet (hidden normally) appears where there will be the audit trail.

  12. Kindly note that the Off the box MS EXcel does actually have a form of audit trail. This is activated when the file is shared through the Share Workbook function or Protect and Share Workbook. Once saved in this form on the network new options appear on track changes in which a new sheet (hidden normally) appears where there will be the audit trail.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Single Sign On provided by vBSSO