first release
72
README.md
Normal file
@@ -0,0 +1,72 @@
# BRAinS

This repo contains the code for creating the BRAinS-Graph (*Biomedical Knowledge Graph for Recommending and Analysing Health Studies*). It loads data from four sources into a Neo4j graph database:
- [ClinicalTrials.gov](https://clinicaltrials.gov/)
- [The Portal of Medical Data Models (MDM Portal)](https://medical-data-models.org/)
- [Medical Subject Headings (MeSH)](https://www.nlm.nih.gov/mesh/meshhome.html)
- [Unified Medical Language System (UMLS)](https://www.nlm.nih.gov/research/umls/index.html)

## Structure of the repository

The repository consists of three dataloaders (`study2neo4j`, `moi`, `umls2neo4j`) and a postprocessing script (`postprocessing`).

> [!NOTE]
> The dataloader for the MDM Portal, `mdm2neo4j`, can be found in an additional repository: [mdm2neo4j repo](https://git.uni-greifswald.de/MILA_public/mdm2neo4j).
> It should be run as well to create the BRAinS-Graph.

### File Structure

```
src/
│
├── study2neo4j
│   ├── run.py                  # Main script to execute the ClinicalTrials.gov import
│   ├── ct2neo4j.py             # Functions for connecting to the database and importing data
│   └── README.md               # Documentation for setup and usage
│
├── moi
│   ├── moi.py                  # Main script to execute the ontology import and processing
│   ├── methods.py              # Functions for configuring, importing, and processing the graph
│   └── README.md               # Documentation for setup and usage
│
├── umls2neo4j
│   ├── umls2neo4j.py           # Main script to execute the UMLS import
│   ├── methods_umls2neo4j.py   # Functions for loading CUI names, parsing relations, and loading into Neo4j
│   └── README.md               # Documentation for setup and usage
│
└── postprocessing
    └── postprocess.py          # Main script for postprocessing (creating relationships between data sources)
```

## How to

Create a configuration file that stores your database connection details, e.g. a file named `brains.conf` in your home directory.
It should have the following structure:

```ini
[neo4j]
uri = bolt://localhost:7687
username = neo4j
password = myfancypassword
```
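
To illustrate how a file with this structure can be consumed, here is a minimal Python sketch. It is not taken from the dataloaders; the function names `read_neo4j_config` and `connect` are hypothetical, and it assumes the official `neo4j` Python driver is installed for the connection step.

```python
import configparser
from pathlib import Path


def read_neo4j_config(path):
    """Return (uri, username, password) from the [neo4j] section of a config file."""
    # interpolation=None keeps '%' characters in passwords from being
    # treated as configparser placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read() silently skips missing files, so check explicitly.
    if not parser.read(path):
        raise FileNotFoundError(f"Configuration file not found: {path}")
    section = parser["neo4j"]
    return section["uri"], section["username"], section["password"]


def connect(path=Path.home() / "brains.conf"):
    """Open a Neo4j driver using the settings from the config file."""
    # Imported here so that the parsing above works without the driver installed.
    from neo4j import GraphDatabase

    uri, username, password = read_neo4j_config(path)
    driver = GraphDatabase.driver(uri, auth=(username, password))
    driver.verify_connectivity()  # raises if the database is unreachable
    return driver
```

Calling `connect()` without arguments looks for `brains.conf` in the home directory; how the dataloaders themselves locate the file is described in their individual README.md files.
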

Run the dataloaders `study2neo4j`, `moi`, and `umls2neo4j`, which can be found under :file_folder: `src`.
Instructions for usage are provided in the individual README.md files.
Also run the `mdm2neo4j` dataloader from the [mdm2neo4j repo](https://git.uni-greifswald.de/MILA_public/mdm2neo4j).

> [!IMPORTANT]
> In general, the individual dataloaders (`study2neo4j`, `moi`, `umls2neo4j`) can be run in any order.
> Postprocessing only works once data has been loaded into Neo4j, so run it last, after all dataloaders have finished.
> Do not forget to run `mdm2neo4j` as one of the dataloaders.
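
The ordering constraint can be sketched as a small Python driver script. This is only an illustration: the script paths come from the file structure listed earlier, but `build_run_plan`, `run_all`, and `extra_args` are hypothetical, and the arguments each script actually expects are documented in the individual README.md files. The `mdm2neo4j` dataloader lives in a separate repository and is not covered here; it still has to be run before postprocessing.

```python
import subprocess
import sys
from pathlib import Path

# Script paths relative to the repository root, as listed in the file structure.
DATALOADERS = [
    Path("src/study2neo4j/run.py"),
    Path("src/moi/moi.py"),
    Path("src/umls2neo4j/umls2neo4j.py"),
]
POSTPROCESSING = Path("src/postprocessing/postprocess.py")


def build_run_plan(repo_root, extra_args=()):
    """Return the commands to run, with postprocessing always last."""
    root = Path(repo_root)
    scripts = DATALOADERS + [POSTPROCESSING]
    # extra_args is a placeholder for whatever arguments the scripts require.
    return [[sys.executable, str(root / script), *extra_args] for script in scripts]


def run_all(repo_root, extra_args=()):
    """Run each step in turn and stop at the first failure."""
    for command in build_run_plan(repo_root, extra_args):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```

Because `subprocess.run(..., check=True)` raises on a non-zero exit code, postprocessing is never started if one of the dataloaders fails.
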

## Requirements
- make sure `python3` is installed
- install the required libraries with `pip install -r requirements.txt`
- have a running Neo4j DB (Neo4j version 5)
- create the configuration file as described in the [How To section](#how-to)

The Neo4j instance must have the APOC and neosemantics plugins installed.
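
One way to verify the plugin requirement before loading data is to list the procedures registered in the database and look for the plugin prefixes. The sketch below is an assumption-laden illustration, not part of the repository: it assumes the official `neo4j` Python driver, that APOC procedures start with `apoc.` and neosemantics procedures with `n10s.`, and the helper names `missing_plugins` and `check_database` are hypothetical.

```python
# Expected procedure-name prefix for each required plugin (assumed).
REQUIRED_PREFIXES = {"APOC": "apoc.", "neosemantics": "n10s."}


def missing_plugins(procedure_names):
    """Return the plugins for which no procedure with the expected prefix exists."""
    names = list(procedure_names)
    return [
        plugin
        for plugin, prefix in REQUIRED_PREFIXES.items()
        if not any(name.startswith(prefix) for name in names)
    ]


def check_database(uri, username, password):
    """List the procedures registered in Neo4j and report missing plugins."""
    from neo4j import GraphDatabase  # imported lazily; needs the `neo4j` package

    with GraphDatabase.driver(uri, auth=(username, password)) as driver:
        with driver.session() as session:
            result = session.run("SHOW PROCEDURES YIELD name RETURN name")
            names = [record["name"] for record in result]
    return missing_plugins(names)
```

An empty list from `check_database(...)` suggests that both plugins are available; any names in the list point to a plugin that still needs to be installed.
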

## Licence

This program is released under [Version 3 of the GPL or any later version](https://www.gnu.org/licenses/gpl-3.0.en.html).