first release

2025-08-22 11:52:43 +02:00
commit ec27c71148
23 changed files with 1543 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,72 @@
+# BRAinS
+
+This repo contains the code for creating the BRAinS-Graph (*Biomedical Knowledge Graph for Recommending and Analysing Health Studies*). It loads data from four sources into a Neo4j graph database:
+- [ClinicalTrials.gov](https://clinicaltrials.gov/)
+- [The Portal of Medical Data Models (MDM Portal)](https://medical-data-models.org/)
+- [Medical Subject Headings (MeSH)](https://www.nlm.nih.gov/mesh/meshhome.html)
+- [Unified Medical Language System (UMLS)](https://www.nlm.nih.gov/research/umls/index.html)
+
+## Structure of the repository
+
+The repository consists of three dataloaders (`study2neo4j`, `moi`, `umls2neo4j`) and a postprocessing script (`postprocessing`). 
+
+> [!NOTE]
+> The dataloader for the MDM Portal, `mdm2neo4j`, lives in a separate repository: [mdm2neo4j repo](https://git.uni-greifswald.de/MILA_public/mdm2neo4j).
+> It must also be run to create the complete BRAinS-Graph.
+
+### File Structure
+
+```
+src/
+│
+├── study2neo4j
+│   ├── run.py                  # Main script to execute the ClinicalTrials.gov import
+│   ├── ct2neo4j.py             # Functions for connecting to the database and importing data
+│   └── README.md               # Documentation for setup and usage
+│
+├── moi
+│   ├── moi.py                  # Main script to execute the ontology import and processing
+│   ├── methods.py              # Functions for configuring, importing, and processing the graph
+│   └── README.md               # Documentation for setup and usage
+│
+├── umls2neo4j
+│   ├── umls2neo4j.py           # Main script to execute the UMLS import
+│   ├── methods_umls2neo4j.py   # Functions for loading CUI names, parsing relations, and loading into Neo4j
+│   └── README.md               # Documentation for setup and usage
+│
+└── postprocessing
+    └── postprocess.py          # Main script for postprocessing (creating relationships between data sources)
+```
+
+## How to
+
+Create a configuration file that stores your database connection details, e.g., `brains.conf` in your home directory.
+It should have the following structure:
+
+```ini
+[neo4j]
+uri = bolt://localhost:7687
+username = neo4j
+password = myfancypassword
+```
+
+Run the dataloaders `study2neo4j`, `moi`, and `umls2neo4j`, which can be found under :file_folder: `src`.
+Usage instructions are provided in the individual README.md files.
+Also run the `mdm2neo4j` dataloader from the [mdm2neo4j repo](https://git.uni-greifswald.de/MILA_public/mdm2neo4j).
+
+> [!IMPORTANT]  
+> In general, the individual dataloaders (`study2neo4j`, `moi`, `umls2neo4j`) can be run in any order.
+> The postprocessing only works once data has been loaded into Neo4j, so run it last (after all dataloaders have finished).
+> Remember to run `mdm2neo4j` as a dataloader as well.
+
+## Requirements
+- make sure `python3` is installed
+- install the required libraries with `pip install -r requirements.txt`
+- have a running Neo4j instance (Neo4j version 5) with APOC and neosemantics installed
+- create the configuration file as described in the [How to section](#how-to)
+
+## Licence
+
+This program is released under [Version 3 of the GPL or any later version](https://www.gnu.org/licenses/gpl-3.0.en.html).
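
The README's "How to" section describes an INI-style configuration file with a `[neo4j]` section. As a rough illustration of how such a file could be read, here is a minimal sketch using only Python's standard library. The function name `load_neo4j_settings` is hypothetical and not taken from the repository; the dataloaders may parse the file differently.

```python
import configparser
from pathlib import Path


def load_neo4j_settings(path):
    """Return (uri, username, password) from the [neo4j] section of an INI file.

    Hypothetical helper for illustration; not part of the repository.
    """
    # interpolation=None keeps '%' characters in passwords from being
    # treated as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read() silently skips missing files, so check its result.
    if not parser.read(Path(path).expanduser()):
        raise FileNotFoundError(f"configuration file not found: {path}")
    section = parser["neo4j"]  # raises KeyError if the section is missing
    return section["uri"], section["username"], section["password"]


# Example call, assuming the file was saved as ~/brains.conf:
# uri, username, password = load_neo4j_settings("~/brains.conf")
```

The returned values could then be passed to the official Neo4j Python driver, for instance via `GraphDatabase.driver(uri, auth=(username, password))`. Whether the dataloaders connect this way is an assumption.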