PathLAKE Data Lake

The PathLAKE project aims to curate raw image data from multiple samples across the entire landscape of human disease and make it available to researchers. This will be achieved by taking digitised images created in the routine scanning of histopathology slides, either at the time they are reported or by scanning slides from existing research cohorts. De-identified images and metadata from these cases are then transferred from the contributing NHS centres to the Data Lake and stored. Each record is tagged with fields taken from the pathology system (such as specimen type, site and SNOMED code) so that cases can be identified in search strategies. In addition, each case contains metadata files, which are also derived from the original hospital record and stored in de-identified form. This is designed to give researchers as much detail as possible about the individual cases they are studying. For more information on what data is currently stored within the Data Lake, please click the link below.

PathLake Portal
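As a rough illustration of how tagged records could support a search strategy, the following minimal sketch filters de-identified case records by specimen type, site and SNOMED code. It is not the Data Lake's actual schema or interface: the CaseRecord class, its field names, the search_cases function and every value in the example data (including the code strings, which are placeholders rather than real SNOMED codes) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CaseRecord:
    """Hypothetical de-identified case record; field names are illustrative."""
    case_id: str        # assumed de-identified identifier
    specimen_type: str  # e.g. "biopsy" or "resection"
    site: str           # anatomical site, e.g. "colon"
    snomed_code: str    # placeholder string, not a real SNOMED code


def search_cases(
    records: Iterable[CaseRecord],
    specimen_type: Optional[str] = None,
    site: Optional[str] = None,
    snomed_code: Optional[str] = None,
) -> List[CaseRecord]:
    """Return the records matching every criterion that was supplied.

    Specimen type and site are compared case-insensitively; the code is
    compared exactly. Criteria left as None are ignored.
    """
    def matches(record: CaseRecord) -> bool:
        if specimen_type is not None and record.specimen_type.lower() != specimen_type.lower():
            return False
        if site is not None and record.site.lower() != site.lower():
            return False
        if snomed_code is not None and record.snomed_code != snomed_code:
            return False
        return True

    return [record for record in records if matches(record)]


# Made-up example data; none of these values come from the Data Lake.
EXAMPLE_RECORDS = [
    CaseRecord("case-001", "biopsy", "colon", "CODE-A"),
    CaseRecord("case-002", "resection", "colon", "CODE-B"),
    CaseRecord("case-003", "biopsy", "breast", "CODE-C"),
]

colon_biopsies = search_cases(EXAMPLE_RECORDS, specimen_type="Biopsy", site="colon")
```

In this sketch, combining criteria narrows the result set, so a search strategy can start broad (site only) and be refined by adding specimen type or a code.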

Data Access

PathLAKE is moving to become part of the West Midlands Secure Data Environment; applications will now be received and approved through its processes [Data Request Process - West Midlands Secure Data Environment].

Approved Applications

Application No. | Company | Title of Research
DL2022/007 | University of Warwick | Automatic cell recognition in colon tissue samples
DL2022/015 | RAIR Health Limited | The extraction of proprietary clinical insights from real world pathology data using graph database technology

What is Computer Assisted Diagnosis?

The PathLAKE research database has been reviewed and approved by South Central – Oxford C Research Ethics Committee,