Innovating R&D with the cloud

Cloud-enabled business transformation for R&D

The pandemic is often credited with helping to accelerate change, challenge the status quo, and drive innovative research and development (R&D) solutions. Organizations the world over are innovating quickly, whether it is making ventilators (largely) from car parts,1 creating contact-tracing apps,2 or investing in R&D for public health.3 But the technology infrastructure required to support innovation and data strategies can be substantial. After all, upfront technology investment into on-premise infrastructure is heavy, procurement periods are long, and fixed costs are considerable—and together, these may become a barrier to progress. While many global organizations are embracing cloud infrastructure to support remote work,4 many may be missing out on optimizing their cloud and data strategies to enable innovation and R&D.

The cloud could be uniquely positioned to support R&D given that it can provision infrastructure almost instantaneously, scale up or down as the need shifts, and provide physical data center and virtual network security. What’s more, organizations can innovate secure digital applications and platforms more rapidly. Finally, the cloud brings a capacity to store and integrate information across robust networks. So, organizations can have interoperable data, and can engage in teamwork, collaboration, and cocreation.


To understand the potential of cloud technology for next-generation R&D, in September and October of 2020, we interviewed 10 specialists in business transformation, cloud, data engineering, and R&D whose input informed the insights of this article.

Based on our research, we found three distinct approaches related to cloud-enabled data, ecosystems, and services (figure 1).

Three key cloud approaches for next-generation R&D

While organizations across all industries can benefit, the opportunity may be especially relevant in the life sciences and health care (LSHC) industry. The pandemic has highlighted the need for global coordination among public- and private-sector organizations, academia, and consortiums. This paper examines the different ways in which the cloud can enable R&D across the LSHC industry and beyond.

LSHC: What COVID-19 teaches us about cloud-enabled collaborative research

Is the pandemic a “break the glass” scenario for cloud-enabled R&D in LSHC?

The technology to bring together real-world and clinical data, securely, at scale, and at velocity—the cloud—has been there all along. By urgent necessity, the pandemic has hypercharged this transformation for global collaboration around shared research objectives.

  • Pharmaceutical research is seeing unprecedented levels of collaboration. For COVID-19, 213 vaccines are in development and 319 potential treatments in progress.5
  • Biopharma companies have reconfigured their individual clinical operations and trials to manage efficiency and cost, and better connect with patients.6
  • Health care organizations are maturing in using advanced analytics and machine learning (ML) for diagnostics and are sharing real-world evidence from clinical notes, lab reports, pathology images, and radiology scans. Deloitte research shows that 84% of physicians expect secure, efficient sharing of patient data integrated into care in the next 5–10 years.7
  • Bioinformatics has progressed to sequence and analyze the RNA of SARS-CoV-2 (the virus that causes COVID-19) to develop antiviral drugs by drawing on ML tools.8

Deloitte’s research on radical data interoperability reveals that 60% of US LSHC organizations host more than half of their applications on the cloud already.9 However, nonstandardized data infrastructures pose a challenge in coordination10 and data interoperability within and across organizations. This is where cloud technologies can help advance transformation now and into the future.

The cloud is an enabler of data, and cloud and data modernization strategies are inextricably intertwined.11 As human genome sequencing data volumes grow to an expected 40 exabytes by 2025,12 and as scientists spend up to 30%–40% of their time searching for, aggregating, and cleansing data,13 the cloud may well be a force multiplier to get drugs to market faster and cheaper. In fact, it has already set the world record14 for elastic analysis of genomic data.15 And Deloitte research shows that shared infrastructure and resources for master protocols can reduce research cycle time by 13%–18% and deliver overall cost savings of 12%–15%.16 The question becomes: How does cloud enable such efficiencies and cost savings?

Innovating innovation technology for LSHC R&D across three core approaches

Cloud data platforms

Organizations are beginning to understand that a centralized data warehouse isn’t the only model. There are numerous cloud data platform options, including enterprise data management, data exchanges, open architecture strategies, the centralized data warehouse, and data lakes, each with its own advantages.

Enterprise data management

It’s common for valuable laboratory data to be saved on local hard drives, thumb drives, and storage area networks, which introduces storage capacity, searchability, and security challenges. The cloud gives organizations various solutions to ingest, transform, analyze, and share millions of existing records with flexibility and at scale to make data a reusable asset across teams. Cloud-enabled enterprise data management platforms with a shared data lake are one common solution. Our interviews also revealed that LSHC organizations, in particular, are beginning to explore an emerging operating model that manages data across the organization in the form of “data-sharing neighborhoods” to generate cross-domain insights and to share data with regulators.

The biotechnology company Biogen, for example, produced 50 images for a single sample per day, which were saved in local, electronic lab notes that were archived every six months. This created a data access and searchability challenge, which they addressed via the cloud.17 In another instance, Pfizer launched its Scientific Data Cloud to make research data shareable, customizable, and reusable for researchers, data scientists, software engineers, and operations. The platform was designed to enable automated scalable analysis for precision medicine to find individualized treatments for diseases like cancer, and it provides a foundation for a longer-term data marketplace.18

External data exchanges

Cloud providers have started to offer data exchanges with clinical data, real-world evidence, and imaging data to power R&D studies and match drugs to patient populations.19 Over the next five years, cloud-enabled data exchanges and marketplaces could disintermediate data aggregators and resellers and provide new opportunities for large-scale and secure data transfers across organizations. These data marketplaces are expected to become increasingly important and could lead to a world where digital health data enables “learning health care,” with real-time clinical interventions that save lives.

The National Institutes of Health’s (NIH) All of Us Research Program will collect genomic data for 1 million people over 20 years for collaborative research.20 The UK National Health System’s Biobank generated 1.5PB+ of genetic, clinical, behavioral, and biometric data in its global population study.21 These projects are a concerted effort to create massive data repositories that can be used to support future internal or external data exchanges.

Merck, a leading global biopharmaceutical company, deployed an enterprisewide cloud-based real-world data (e.g., medical claims, EMR, etc.) analytics platform called the “Real World Data Exchange” to advance product development and commercialization. Merck’s Real World Data Exchange is an open, API-first platform and serves a broad set of stakeholders decreasing the time to insight-generation and fostering collaboration to positively impact all aspects of the product life cycle.22

Creating more open and interoperable systems

It may be a challenge for commercial pharmaceutical companies to share proprietary data today, but many are sharing noncompetitive, HIPAA-compliant placebo data across studies without privacy concerns. They are doing so via an application programming interface (API)–first approach (a strategy which anticipates data-sharing across applications by design and allows for standardized, programmatic connection of applications). This approach creates a technical foundation for interoperable data-sharing that can expand as collaboration incentives and cultural norms change, and it serves as an alternative to more open models (i.e., open APIs). API-first organizations have frameworks that:

  1. Rank, flag, and categorize data sets for regulatory purposes
  2. Manage shareable and nonshareable data
  3. Establish baseline and incremental controls as greater security is needed
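The three framework elements above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any company's actual framework: the sensitivity tiers, control names, and the `DataSet` and `shareable_catalog` names are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Tuple


class Sensitivity(IntEnum):
    """Hypothetical sensitivity tiers; higher values call for more controls."""
    PUBLIC = 0
    NONCOMPETITIVE = 1  # e.g., de-identified placebo-arm data
    PROPRIETARY = 2


# Baseline controls apply to every data set; each tier adds incremental ones.
# The control names are illustrative assumptions.
BASELINE_CONTROLS: Tuple[str, ...] = ("encryption_at_rest", "access_logging")
INCREMENTAL_CONTROLS: Dict[Sensitivity, Tuple[str, ...]] = {
    Sensitivity.PUBLIC: (),
    Sensitivity.NONCOMPETITIVE: ("data_use_agreement",),
    Sensitivity.PROPRIETARY: ("data_use_agreement", "named_user_approval"),
}


@dataclass(frozen=True)
class DataSet:
    name: str
    sensitivity: Sensitivity
    regulatory_flags: Tuple[str, ...] = ()  # e.g., ("HIPAA",)

    @property
    def shareable(self) -> bool:
        # Assumption: only proprietary data is withheld from sharing.
        return self.sensitivity < Sensitivity.PROPRIETARY

    def required_controls(self) -> Tuple[str, ...]:
        return BASELINE_CONTROLS + INCREMENTAL_CONTROLS[self.sensitivity]


def shareable_catalog(data_sets: List[DataSet]) -> List[dict]:
    """Rank data sets by sensitivity and list only the shareable ones."""
    ranked = sorted(data_sets, key=lambda d: (d.sensitivity, d.name))
    return [
        {
            "name": d.name,
            "flags": list(d.regulatory_flags),
            "controls": list(d.required_controls()),
        }
        for d in ranked
        if d.shareable
    ]
```

An API built on a catalog like this could expose only the shareable entries to partners, while the nonshareable data sets stay internal and carry the stricter control set.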

In health care, wearables are API-first technologies that allow data to be shared from devices to digital apps with privacy controls in place. Available in shareable and manageable formats, this data can be distributed across organizations for R&D purposes.

Cloud data warehouses, storage archives, and more

The cloud encrypts all data at rest and offers a variety of storage options for data with high input/output requirements. It also allows unique scale-up/scale-down capabilities. These advantages enable researchers to ingest petabytes of data at a given time and run queries by scaling up compute on-demand and temporarily, only paying for additional capacity when used. At the same time, the cloud offers dense/cold archive storage for long-term archival needs.
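The pay-for-what-you-use argument can be made concrete with simple arithmetic. The sketch below compares provisioning for peak demand all month against running a small baseline and bursting for a few hours. All node counts, hours, and the hourly rate are hypothetical, and the comparison assumes the same rate in both cases, which real pricing may not honor.

```python
def fixed_capacity_cost(peak_nodes: int, hours: int, rate_per_node_hour: float) -> float:
    """Cost of keeping peak capacity running for the whole period."""
    return peak_nodes * hours * rate_per_node_hour


def elastic_cost(baseline_nodes: int, burst_nodes: int, burst_hours: int,
                 hours: int, rate_per_node_hour: float) -> float:
    """Cost of a steady baseline plus extra nodes only during the burst."""
    node_hours = baseline_nodes * hours + burst_nodes * burst_hours
    return node_hours * rate_per_node_hour


# Hypothetical month: 720 hours, 4 baseline nodes, and a 10-hour query
# burst that needs 96 additional nodes (100 nodes at peak).
HOURS, RATE = 720, 2.0
fixed = fixed_capacity_cost(peak_nodes=100, hours=HOURS, rate_per_node_hour=RATE)
elastic = elastic_cost(baseline_nodes=4, burst_nodes=96, burst_hours=10,
                       hours=HOURS, rate_per_node_hour=RATE)
print(f"fixed: {fixed:,.0f}  elastic: {elastic:,.0f}")
```

Under these assumed numbers the elastic model uses 3,840 node-hours against 72,000 for fixed peak capacity. The gap narrows as bursts become longer or more frequent.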

Scalable cloud ecosystem infrastructure

A vast majority (94%) of respondents believe real-world evidence in R&D will become increasingly important by 2022.23 A scalable cloud ecosystem would allow hospitals to share real-world data across their internal and external public/private networks to accelerate the entire ecosystem’s ability to target novel diseases or subpopulations who express major diseases differently. This is the cloud’s network effect in action, with just one important data set. To play that scenario forward, collaborative cloud ecosystems require the right operating model, such as a third-party arbiter, and incentives to ensure data safety, risk management, and IP protection. Putting these in place, and so realizing the network benefits, is still a work in progress.

Research has shown that a network of ecosystems can help harness and accelerate distributed innovation for complex, externally driven problems.24 It can facilitate a more collaborative approach and enable more diverse perspectives.25 Deloitte’s Transforming Clinical Development research has found some organizations are already experimenting with transformative approaches to drug development, such as use of real-world evidence and adaptive trials. Scaling the use of these approaches requires an ecosystem model wherein companies work collaboratively and transparently with multiple stakeholders. From a technology perspective, companies require interoperable data, knowledge management, and analytics platforms and processes, as well as scalable and secure cloud capabilities.26

From an R&D perspective, Deloitte’s 2020 Real World Evidence survey reveals that 16 out of 17 participating companies are using cloud platforms for real-world evidence and all the surveyed mature companies have a centralized, primarily cloud-based analytics platform.27 A scalable cloud ecosystem would allow hospitals to share real-world data across their internal and external public/private networks to accelerate their ability to target rare and novel diseases and cater to underrepresented populations. This could also open the door to cloud ML services to predict acute events like sepsis and understand disease states and linkages, such as congestive heart failure and diabetes.

The COVID-19 Healthcare Coalition, a private sector-led collaborative response to coronavirus, developed a cloud-native platform with secure/authenticated cloud storage, a data ingestion pipeline to sort/understand 300+ curated resources, and a big query searchable metadata repository for secure, scalable collaborative research. The platform has enabled members to support frontline responders and researchers and to improve treatment, regimens, vaccines, and device testing.28

However, there are challenges too. Take biomedical R&D, for example, where many researchers lack coding expertise and therefore the ability to drive technology transformation themselves. For this reason, as the volume of genomics data has grown, most biomedical research organizations have embraced PaaS/SaaS data platforms powered by the cloud. But these point solutions have created data silos that can make it challenging to create clean and shareable data as part of a broader digital ecosystem. As organizations realize that data is an asset beyond a single analysis, this creates an opportunity for an API-enabled, cloud-native digital ecosystem.

Advanced cloud services for LSHC R&D

Two of the top outcomes that life sciences companies are attempting to achieve with AI are enhancing existing products and creating new products and services. Cloud AI, ML, and the internet of things (IoT) services can provide greater innovation speed and agility across the R&D value chain. In fact, some organizations are exploring AI to better manage clinical trial data.29 Some are also using cloud AI services to coordinate and accelerate recruiting and matching patients with clinical trials sites,30 to analyze existing drugs on the market, to screen drugs for other diseases (such as COVID-19),31 and potentially to detect future disease outbreaks before they occur.32

By aiding physicians with real-time data-driven diagnosis and treatment plans, AI-based solutions can play an important role in streamlining clinical diagnostics.33 In cancer diagnostics, for example, both false positives and false negatives pose real challenges. Part of the problem with training AI/ML models for diagnostics is assembling a data sample large enough to detect cancers such as lung and pancreatic cancer, which are typically found at stage 3, while they are still at stage 1. Public and private organizations are using cloud ML to improve the accuracy of cancer diagnoses in private sector research34 and for early diagnosis.35

  • Takeda, a pharmaceutical company, used a cloud-enabled data platform and advanced deep learning to predict at 92% accuracy (a 40-percentage point improvement) the patients likely to respond and improve during a nonalcoholic steatohepatitis and treatment-resistant depression therapeutics trial.36
  • Johnson & Johnson, a pharmaceutical and consumer packaged goods company, connected siloed data with intelligent automation, invested in cloud/core modernization, and aligned its cyber strategy to improve predictive decision-making, expedite clinical trial screening, and improve financial processing across 300,000 transactions for a hypertension drug trial.37
  • Mayo Clinic, the hospital network, announced it would use cloud AI/ML services to store, compute, and analyze patient data to advance the diagnosis and treatment of disease38 and bring the Mayo Clinic Platform—a digital extension of the organization’s capabilities—worldwide.39

The NIH on advancing public/private partnerships

Bringing all three of these approaches together, the NIH has made progress with the NIH Data Commons to store, share, access, and interact with digital files generated from biomedical research.40 Its Accelerating COVID-19 Therapeutic Interventions and Vaccines partnership has brought together over a dozen biopharma companies to standardize collaborative frameworks across vaccine and therapeutic R&D from preclinical evaluation to immune-response testing.41 Through the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability Initiative, the organization explores cloud and ML capabilities42 to generate, analyze, and share research data with commercial and 2,500 NIH-funded institutions.43 Most recently, the NIH National Center for Advancing Translational Sciences’ (NCATS’s) National COVID Cohort Collaborative Data Enclave (N3C) has created a centralized, secure, and cloud-enabled data platform to analyze real-world COVID-19 patient data for factors and long-term health consequences across 57 sites.44 For its part, the Rapid Acceleration of Diagnostics initiative aims to expedite innovation around rapid COVID-19 testing,45 undetected cases,46 and public health monitoring.47

How Broadcom innovated a cloud-native digital application to manage workforce safety while protecting data privacy


As organizations responded to the pandemic, many adapted to remote work. However, for Broadcom, a global infrastructure technology company that employs essential workers, engineers, and fabrication units, remote solutions were not an option. Faced with new workforce safety and risk challenges related to COVID-19 and the need to operate in a physical working environment, Broadcom wanted to go beyond standard social distancing, personal protective equipment (PPE), and hygiene, and sought to rapidly innovate a digital solution to alert and act on potential exposure, with employee privacy and local regulations in mind.


The cloud-native approach allowed the organization to quickly innovate a digital solution for data-sharing and analysis with worker privacy and data security considerations managed by the organization’s global privacy officer. The cloud application allowed Broadcom to:

  • Collect relevant personal worker information on the digital app on employee mobile devices
  • Integrate the data into internal systems, organization policies, security protocols, and global communications
  • Mine, aggregate, and analyze the data to identify potential COVID-19 exposure
  • Alert potentially at-risk individuals and initiate location-based cleaning protocols
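The analysis and alerting steps in the list above can be illustrated with a short sketch. This is not Broadcom's implementation. It assumes a simplified rule (a worker is potentially exposed if they checked in at the same location on the same day as a reported case), and the function name, record layout, and IDs are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Iterable, Set, Tuple

CheckIn = Tuple[str, str, date]  # (pseudonymous worker ID, location, day)


def find_exposures(check_ins: Iterable[CheckIn],
                   positive_worker: str) -> Tuple[Set[str], Set[str]]:
    """Return (workers to alert, locations to clean) for one reported case."""
    # Group workers by the (location, day) slot they checked in to.
    by_slot = defaultdict(set)
    for worker, location, day in check_ins:
        by_slot[(location, day)].add(worker)

    at_risk: Set[str] = set()
    to_clean: Set[str] = set()
    for (location, _day), workers in by_slot.items():
        if positive_worker in workers:
            to_clean.add(location)
            at_risk |= workers - {positive_worker}
    return at_risk, to_clean


# Hypothetical check-in records.
records = [
    ("w1", "fab-1", date(2020, 9, 1)),
    ("w2", "fab-1", date(2020, 9, 1)),
    ("w3", "fab-1", date(2020, 9, 2)),
    ("w1", "lab-2", date(2020, 9, 2)),
    ("w4", "lab-2", date(2020, 9, 2)),
]
alert, clean = find_exposures(records, positive_worker="w1")
```

Working only with pseudonymous IDs at this stage keeps personal details out of the analysis. Mapping an ID back to a person for the alert would happen in a separate, access-controlled step.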

A digital, cloud-data-sharing application was launched at scale in under 10 weeks across 10 countries for 15,000+ workers and 5,000 contractors that managed 350,000+ survey collections and work passes following daily symptom analysis and 250,000 automated workplace check-ins.

Source: Deloitte analysis.

Five recommendations to advance the cloud-enabled data strategy across industries

Many organizations across industries are looking to modernize data platforms to reduce data costs, harness big data, create more data analysis flexibility, and tap into powerful artificial intelligence tools. Cloud technology could be the key enabler for them.48 Success may boil down to scalable and secure cloud data platforms to support interoperable data strategies, an ecosystem for collaborative analysis, and services to expedite and scale R&D innovation with low latency.

To give just one example of how these aspects are coming together in another industry, the US Department of Defense (DoD) Defense Information Systems Agency Joint AI Center is looking to advance its intelligence strategy with a common, shared cloud-native/edge platform across 6–7 mission areas. It is expected to bring in multiple petabytes of data that would be impossible to move otherwise. The solution, referred to as the Joint Common Foundation (JCF), is designed to include high-level controls based on security clearance for various DoD organizations to access, buy, and acquire AI solutions and the data behind them. Defensive cyber operations are also expected, including incident response, vulnerability management, continuous monitoring, and zero-trust architecture. Ultimately, the AI infrastructure could equip warfighters with secure data and tools that speed decisions and enhance national security.

Going forward, there are five key takeaways to keep in mind (figure 2).

Five recommendations for cloud-enabled R&D business transformation

  1. Make data-sharing FAIR in the cloud. Cloud-native databases provide elasticity and reduced total cost of ownership.49 Guide cloud data migration by making sure the data is findable, accessible, interoperable, and reusable (FAIR) for collaborative research. Any R&D organization needs a repository of searchable and discoverable information. This baseline gives a new data scientist or engineer the context needed to focus research, build on existing data, implement improved practices, and find the white space for innovation.
  2. Build a business case for data-sharing. Integration patterns across organizations should change to enable data-sharing. That would require building and articulating a well-defined business case across a shared purpose that addresses data use, confidentiality requirements, and controls. Consider articulating why all parties should want to share the data, how IP/ownership would be protected, what incentives there are, and how industry standards/protocols for data integrity will be followed. In many industries, there may be commercial and other incentives that directly conflict with data-sharing, in which case finding a shared purpose and realigning incentive models based on that purpose can provide a useful starting point.
  3. Address data privacy. Cloud platforms should be designed with the proper governance, compliance, and security boundaries in place. They should be configured to industry norms, but be porous enough to encourage data-sharing, trust building, and network integrity maintenance. This type of open data platform would be nearly impossible on premise. Cloud, however, is built to support federated data models, with identity and access management, encryption and network security frameworks, and controls at each layer.50 It also has the inherent ability to create connected “islands” that can be crossed for private and secure collaboration. For data privacy, consider obfuscating, anonymizing, deidentifying, or pseudonymizing the data based on the level of control needed for trusted actors across the ecosystem. 
  4. Build the right cloud engagement model. Organizations can consider service-oriented, data-oriented, and process-oriented architectures.51

Service-oriented approaches might include SaaS solutions that allow for shared services, like online task management, work planning, and collaboration tools, across cloud-native web-based applications. These solutions can enable exploratory R&D and teaming on a broader scale for greater transparency. They have the ability to manage financial incentives based on contribution, protect the IP, and are basically a “try as you go” model. Additionally, IaaS models allow for ideation with ease and accessibility given the ability to rent standardized cloud environments for hours at a time (and at a fraction of the cost/time on the secondary market) and to tap into hundreds of cloud services. R&D can benefit from the ability to build a hypothesis, incubate an idea, and if it fails, scale down—if it succeeds, burst capacity across teams, organizations, and geographies with identical test environments across increasingly open and portable or interoperable and proprietary tech stacks.

In terms of data approaches, PaaS gives developers rapid delivery and prototyping capability to test ideas. When combined with Data-as-a-Service, PaaS allows organizations to replicate and propagate data across the R&D environment for a single version of truth. This model generally needs some level of agreement on API standards to facilitate reusable data calls and queries for an open and interoperable network.52

Finally, with process-oriented R&D, organizations can integrate innovation into the work processes. For example, Google’s R&D follows a hybrid research model that can make the line between research and engineering blurry by writing R&D as near-production code to accelerate production timelines.53

  5. Embrace cloud services for automation, analytics, and ML. R&D and innovation teams are using cloud services for everything from automated research54 and next best action recommendations55 to advanced military-grade AI programs. Investigate which of the many available services could advance the organization’s broader digital ecosystem for innovation.
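To make the third recommendation more concrete, the sketch below shows one common way to pseudonymize a record before sharing it: replace the identifier with a keyed hash (HMAC-SHA256 from Python's standard library) and drop direct identifiers. The field names and the `pseudonymize` helper are hypothetical, and this is only an illustration. Pseudonymization on its own may not satisfy the de-identification requirements of a given regulation.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(record: dict, secret_key: bytes, id_field: str = "patient_id") -> dict:
    """Return a copy of the record with the ID replaced by a keyed hash."""
    token = hmac.new(
        secret_key,
        str(record[id_field]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Drop direct identifiers and the raw ID, keep everything else.
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != id_field
    }
    cleaned["pseudonym"] = token
    return cleaned


# Hypothetical record and key; a real key would come from a secrets manager.
record = {"patient_id": 1042, "name": "Jane Doe", "email": "jane@example.com",
          "diagnosis": "T2D", "hba1c": 7.9}
shared = pseudonymize(record, secret_key=b"demo-key-not-for-production")
```

Because the same key always maps the same identifier to the same pseudonym, records for one individual can still be linked across data sets by trusted parties who hold the key, while partners without the key see only the token.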

Whether on the front lines of LSHC R&D or working in another industry looking to innovate new products and services, a solid cloud-enabled data strategy may increasingly be a cornerstone to advancing the organizational journey toward becoming a data-driven digital enterprise in a data-driven digital ecosystem. Cloud technology could be uniquely positioned to serve as the digital core across a variety of enablement models to support secure collaboration and innovate the future.

The stakes are high, but the lessons could be transformative for R&D across industries.