As the uptake of digital health technologies increases, so too does the need for robust digital health assessment methodologies. Evaluations of impact enable healthcare professionals and the public to make informed decisions about which technologies to use to best support their health and wellbeing.

When thinking about creating a digital health standard as a national body, or looking to gather evidence as an innovator, the question at the forefront of minds is: what is suitable evidence for digital health technologies? And what proof do we need to satisfy assurance in this emerging field?

We take a deep dive into how current standards tackle the issue of evidence, the gaps and challenges that emerge, and what factors to consider.

What is suitable evidence?

Digital health technologies are often born out of the desire to help people tackle a health challenge, be it to reduce anxiety or monitor for signs of skin cancer. The possibilities offered by technology, combined with the convenience of mobile, create the potential for improved outcomes. But how can you tell which digital health technologies are safe, which will deliver improved outcomes, and in what scenarios?

Whilst there are many elements that people need to be assured about in the digital health space, it is the questions around product efficacy and safety that are perhaps the most important. 

Many early digital health assessment models recognised how crucial efficacy and safety are, and establishing evidence of claimed benefits or impact has been a common requirement ever since. But what types of evidence or assurance should a digital health technology be required to provide to establish its efficacy and safety?

Initially, there was no international reference point to help with this challenge. Traditional healthcare approaches to evidence typically centre around randomised controlled trials (RCTs) or, more recently, high-quality observational studies capturing real-world evidence. But digital health presents challenges to these traditional evidential approaches.

Firstly, it is crucial to avoid categorising all digital health technologies as one homogeneous group of products or services that demand the same level of assurance. Digital health technologies are connected as a group through their primary delivery model – digital – yet technologies vary widely in complexity, use cases and associated risk.

A product that plays music to help children brush their teeth for the right amount of time is rightly part of the digital health technology space, alongside a product that checks heart rhythms for arrhythmias like atrial fibrillation. Both are great products and have value to offer, but beyond their common mode of delivery, they are poles apart, with different levels of complexity and risk. As such, should the evidence of efficacy be the same for both? Or should the complexity and risk have a bearing on what needs to be demonstrated? We describe this as the Proportionality Principle.

Secondly, digital health technologies are typically developed using an agile and iterative approach. They are built on a rapid real-world evidence approach, with user-centred design and user-driven feedback and iteration dictating product refinement and driving product efficacy in target areas. This model involves a growing maturity cycle, with early-stage prototypes and proofs of concept morphing into more stable and established products.

Traditional evidential approaches assume that, once built and tested, a product doesn’t change. They are not designed to deal with a product development model underpinned by rapid evolution.

As they mature, digital health technologies will reach a point where more substantive and traditional approaches to evidence collation are appropriate, and where the rate of change and development has slowed to a manageable and predictable level.

More work needs to be done to factor this situation into the various assessment frameworks. We describe this as the Lifecycle Challenge.

Finally, for digital health technologies, evidence of efficacy often needs to span distinct areas of required proof. These are:

  • Evidence of comparative effectiveness: Evidence that a digital health technology is as effective or more effective at delivering a given outcome than the equivalent non-digital process. The digital health technology should lead to a demonstrable improvement in patient or clinical outcomes. In traditional evidence terminology, this may be considered as a ‘per protocol’ analysis, determining the benefit of the technology if used properly.

  • Human factor analysis: This is an assessment of how effectively the digital health technology is utilised and engaged with by end users, be they patients, or health and care professionals. It is missing from many studies and trials, which focus on the efficacy of the algorithm but not on the user interface and engagement. This may be considered as an ‘intention to treat’ analysis, i.e. how the technology is actually used, and whether that use results in a benefit, under real-world circumstances as opposed to ‘ideal’ conditions.

  • Evidence of economic benefit: Just like any other health intervention, a digital health technology can have strong evidence of comparative effectiveness but not actually deliver any material economic benefit. In the pharmaceutical world, the split between ‘clinical effectiveness’ and economic value is well established, but for digital health technologies, substantive health economic evidence is even harder to find. To date there are a little over 700 published and peer-reviewed economic analyses of digital health technologies, compared to the 366,000 technologies available to download today.

We call this the Evidential Range Challenge.
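To put the two figures quoted for economic evidence side by side, a quick calculation helps. The sketch below is illustrative only: it uses the rounded figures from the text (treating "a little over 700" as 700), the variable names are our own, and it makes the generous assumption that every analysis covers a different technology.

```python
# Rounded figures quoted in the text; "a little over 700" is treated as 700.
published_economic_analyses = 700
available_technologies = 366_000

# Generous assumption: each analysis covers a different technology.
coverage = published_economic_analyses / available_technologies
technologies_per_analysis = available_technologies / published_economic_analyses

print(f"Share of technologies with an economic analysis: {coverage:.2%}")
print(f"Technologies per published analysis: {technologies_per_analysis:.0f}")
```

On these rounded figures, fewer than one technology in 500 has a published economic analysis behind it, which gives a sense of the scale of the gap.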

Given these specific challenges that digital health technologies must meet in the evidential space, how can we develop a model that navigates a path through them and balances assurance rigour on the one hand against practicality and achievability on the other?

Placing the evidential bar too high is likely to exclude a huge number of digital health technologies that are either not at the right level of maturity, or are very simple, low risk technologies that are unlikely to ever reach the higher levels of evidence. Placing the bar too low, however, risks the accreditation affording no material assurance and so defeating the purpose of undertaking such a process. Striking this balance remains one of the most challenging aspects of digital health technology accreditation development. 

How, therefore, have accreditation models and assessment frameworks in the digital health space responded to these challenges and how have they sought to manage this balancing act?

Where does evidence fit into standards?

The global response to this challenge for digital health assessment has been limited historically, but over the last few years a number of frameworks and models have emerged that seek to address some of these issues. Below, we look at the emerging models and how they approach the incorporation of evidence into digital health assessment.

Evidence Standards Framework for Digital Health Technologies (ESF)

Between June 2018 and February 2019, the National Institute for Health and Care Excellence (NICE) created the Evidence Standards Framework for Digital Health Technologies (ESF), in collaboration with NHS England, Public Health England and MedCity, to move away from a reliance on RCTs as the way to demonstrate evidential credibility. Although a lot of research has been undertaken globally into what evidence should look like, this tiered approach is the most established methodology to date.

The Framework, which was recently updated, groups products into tiers based on their functionality, with each tier outlining what the developer must establish for their digital health technology. NICE explains that their Framework is ‘a set of standards that support innovation while ensuring an appropriate level of rigour and assurance for the health and care system’.

Whereas the higher tiers of the Framework require RCTs and observational studies that meet stated minimum quality standards, the lower tiers do not carry traditional evidence requirements. Alternative methods of assurance can instead be accepted. This model was the first to properly enshrine the Proportionality Principle, and at its heart is the acknowledgement that not all products in this space have the same risk profile.
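The idea of proportionate evidence can be sketched as a simple lookup from tier to expected evidence. This is a deliberately simplified illustration, not the Framework itself: the two bands ("lower" and "higher"), the function name and the band assigned to each example product are all our own assumptions, and the real Framework defines its own tiers and criteria in far more detail.

```python
# Hypothetical two-band simplification of a tiered framework. The real
# Framework defines its own tiers and criteria, which are not reproduced here.
EXPECTED_EVIDENCE = {
    "lower": ("alternative methods of assurance",),
    "higher": (
        "randomised controlled trial",
        "observational study meeting stated minimum quality standards",
    ),
}


def expected_evidence(tier: str) -> tuple[str, ...]:
    """Return the kinds of evidence the text associates with a tier band."""
    try:
        return EXPECTED_EVIDENCE[tier]
    except KeyError:
        raise ValueError(f"unknown tier band: {tier!r}") from None


# Illustrative band assignments (our assumption, not an official classification).
products = {
    "toothbrushing timer with music": "lower",
    "heart rhythm checker": "higher",
}

for name, tier in products.items():
    print(f"{name}: {', '.join(expected_evidence(tier))}")
```

The point of the sketch is only that the evidence expected follows from the band a product falls into, rather than being the same for every product.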

Introducing a benchmark appropriate to the role and risks of a product helps many of the smaller innovators to avoid the prohibitive cost, time and skills barrier of being required to conduct trials or studies. This aims to ensure a good supply of innovative products of varying levels of complexity, enabling a more streamlined and proportionate roadmap to accreditation.

Although the NICE Framework predominantly caters to a UK audience in response to UK-related needs, elements of the Framework, particularly the tiering of products, have inspired the development of global standards. The Proportionality Principle has now been largely adopted, albeit in different ways, by subsequent models.

ISO 82304-2

The ISO 82304-2 (Health software — Part 2: Health and wellness apps—Quality and reliability), developed as a new European standard, is expected to be officially launched in 2021. The standard has been designed for self-certification by developers, enabling innovators to easily identify if they meet the standard. It has also been designed to inform assessment processes being developed with different accrediting bodies.

For assurance, the guidance asks organisations to demonstrate different types of evidence, and indicates, based on some intended usages, that an observational study or a randomised controlled trial is required. Whilst the tiering isn’t as detailed as the NICE Framework, it does adapt the requirements based on functionality and reinforces the Proportionality Principle.

The ISO has yet to be tested at scale and, as with all models, there will undoubtedly be some challenges around the tiering definitions as more products start to seek certification. However, it is a major piece of work and supports the emerging direction of travel established through the NICE Framework.

The Digital Health Applications (DiGA) process

Created in Germany in 2020, the DiGA requires healthcare tools to meet specified criteria to be recognised as DiGA under the Digital Healthcare Act. These criteria include providing preliminary data on the benefits they provide, akin to the European Medicines Agency’s ‘adaptive pathways’ approach, and being CE-certified as medical products in the EU’s lowest-risk classes. The DiGA only covers medical devices.

The DiGA includes evidence requirements: a quantitative comparative study is required, with a methodology adequate for the product. The DiGA also considers the maturity of digital health technologies in their product development cycle. Innovators can apply for the DiGA without an RCT and get temporary registration for a year, but they have to complete an RCT within this year. This gives lower maturity technologies time to gather evidence, within a process that is designed for complex technologies only but still takes their maturity into account.
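The one-year window can be sketched as a simple date check. This is a minimal illustration of the rule as described here, not of the DiGA process itself: the function name and status labels are hypothetical, and "a year" is approximated as 365 days from the registration date.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "a year" is approximated as 365 days from the registration date.
TEMPORARY_WINDOW = timedelta(days=365)


def registration_status(
    registered_on: date, rct_completed_on: Optional[date], today: date
) -> str:
    """Classify a hypothetical temporary registration against the RCT window."""
    deadline = registered_on + TEMPORARY_WINDOW
    if rct_completed_on is not None and rct_completed_on <= deadline:
        return "rct completed in time"
    if today <= deadline:
        return "temporary, rct pending"
    return "rct not completed in time"


# A product registered without an RCT, checked part-way through the year.
print(registration_status(date(2021, 1, 1), None, date(2021, 6, 1)))
# The same product once an RCT has been completed inside the window.
print(registration_status(date(2021, 1, 1), date(2021, 11, 1), date(2022, 3, 1)))
```

What happens to a product that misses the window is not covered in this article, so the sketch simply reports that the RCT was not completed in time.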

The DiGA is still at a relatively early stage of its evolution and has so far been applied to a fairly small number of products. It will be interesting to see how the model evolves and whether its approach to the Lifecycle Challenge offers a potential route that could be adapted elsewhere.

Adapted Evidence Standards Framework

The Organisation for the Review of Care and Health Applications (ORCHA) continues to evolve standards in this field, and introduced an Adapted Evidence Standards Framework to its reviews in early 2021.

Designed to be adopted both in the UK and internationally, as part of international assessment methodologies, this adapted model builds on the NICE Framework and adopts the same core structure and approach. The Adapted Evidence Standards Framework has evolved following the application of the original NICE Framework to in excess of 1,000 assessments of digital health products. Through this process, a number of edge cases and issues were identified with the original model, and these drove the development of a range of adaptations that have been collated in this alternative model.

Through these adaptations, the model seeks to avoid any subjective or broad questions and includes only questions that can be assessed objectively with clear ‘evidence/assurance’ requirements. Where bands of tiers in other standards are quite broad, this model introduces subclasses within tiers. This prevents simpler products being caught in a category that assesses them against higher criteria than is appropriate. It also moves up some products that the original NICE Framework would class at a lower tier.

The model also looks to address some of the specific issues identified from the significant sample set, regarding the higher end evidential requirements in the NICE Framework and, in particular, the challenges associated with undertaking RCTs or Interventional Studies for some types of products – for example, diagnostics. 

Having been applied to in excess of 1,500 digital health assessments, the model has now been adopted within a growing number of international accreditation models.


A number of models have now emerged, seeking to meet the challenges faced when assessing digital health technologies. The Proportionality Principle is clearly factored into most of the leading frameworks. Some are also starting to address the Evidential Range Challenge. But the one challenge that is still largely unresolved is the Lifecycle Challenge, although the DiGA model does address aspects of it.

ORCHA continues to investigate the best ways to incorporate evidence and assurance components into digital health assessment. It is working with a growing variety of international bodies to incorporate its approach into accreditation methodologies specific to the needs of particular countries and their populations. 

When considering which elements to include in the development of your own digital health assessment, it is important to consider the gaps and challenges in current approaches, but also the growing consensus that is emerging at a principle level.

We are currently undertaking a number of research projects to test our approaches and understand the complexities of digital health evidence, including: 

(1) A wide-ranging Delphi study with global leaders in digital health accreditation to identify existing assessment gaps.

(2) A research project with the Netherlands eLiving Lab (NeLL) testing a representative sample set of digital health technologies against the key evidence assessment frameworks outlined above to determine the comparative impact of each model.

(3) An assessment of how user experience plays a role in promoting clinical effectiveness with Ulster University. 

(4) An exploration and critical appraisal of behavioural change techniques employed in digital health technologies, and how best to evaluate them, with the University of Warwick.

To find out more about these projects, receive the published papers, or to discuss your accreditation needs, please email: