umls2neo4j: UMLS to Neo4j Importer

This Python script parses selected relationships from the UMLS Metathesaurus (MRREL.RRF and MRCONSO.RRF) and loads them into a Neo4j graph database.

Important

Requires a UMLS licence!

Features

  • Filters and loads PAR (parent) and CHD (child) relationships from MRREL.RRF
  • Loads only preferred English concept names from MRCONSO.RRF
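
As a rough illustration of these two filters, the sketch below assumes the standard pipe-delimited RRF column layout (MRCONSO.RRF: CUI at index 0, LAT at 1, TS at 2, STT at 4, ISPREF at 6, STR at 14; MRREL.RRF: CUI1 at 0, REL at 3, CUI2 at 4). It treats a name as preferred when LAT is ENG, TS is P, STT is PF and ISPREF is Y. The function names are hypothetical, and the actual script may apply its filters differently.

```python
# Assumption: mirrors the two relationship types named in the feature list.
ALLOWED_RELS = {"PAR", "CHD"}


def preferred_english_names(lines):
    """Return a dict CUI -> preferred English name from MRCONSO.RRF lines."""
    names = {}
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 15:
            continue  # skip malformed or truncated lines
        cui, lat, ts, stt, ispref, name = (
            fields[0], fields[1], fields[2], fields[4], fields[6], fields[14]
        )
        if lat == "ENG" and ts == "P" and stt == "PF" and ispref == "Y":
            # Keep the first preferred name seen for each concept.
            names.setdefault(cui, name)
    return names


def allowed_relationships(lines, allowed=ALLOWED_RELS):
    """Yield (CUI1, REL, CUI2) for MRREL.RRF lines whose REL is allowed."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 5:
            continue  # skip malformed or truncated lines
        if fields[3] in allowed:
            yield fields[0], fields[3], fields[4]
```

Both functions accept any iterable of lines, so an open file handle can be passed directly and the large RRF files never need to be held in memory as a whole.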

Quickstart

Create a configuration file that stores your database connection details, for example umls.conf in your home directory:

[neo4j]
uri = bolt://localhost:7687
username = neo4j
password = myfancypassword
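
A file in this format can be read with Python's standard configparser module. The following is a minimal sketch of how that might look; the helper name load_neo4j_settings is hypothetical and not necessarily what the script itself uses.

```python
import configparser
from pathlib import Path


def load_neo4j_settings(conf_path):
    """Read uri, username and password from the [neo4j] section."""
    path = Path(conf_path).expanduser()  # resolves a leading "~"
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips missing files and returns the
    # list of files it managed to parse, so check the result explicitly.
    if not parser.read(path):
        raise FileNotFoundError(f"cannot read configuration file: {path}")
    section = parser["neo4j"]  # raises KeyError if the section is missing
    return section["uri"], section["username"], section["password"]
```

Calling load_neo4j_settings("~/umls.conf") on the file above would return the three values as a tuple of strings.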

Start the program, passing the path to your configuration file and the paths to the UMLS files:

python3 src/umls2neo4j.py --conf ~/umls.conf --mrconsofiles ~/umls/MRCONSO.RRF --mrrelfiles ~/umls/MRREL.RRF

Requirements

  • make sure python3 is installed
  • install the required libraries with pip install -r requirements.txt
  • download the UMLS Metathesaurus files (MRREL.RRF, MRCONSO.RRF); this requires a UMLS licence
  • have a running Neo4j DB (Neo4j version 5), with APOC installed
  • create the configuration file as described in the Quickstart section

Detailed Information

The script will:

  1. Load preferred English concept names from MRCONSO.RRF
  2. Parse allowed relationships from MRREL.RRF
  3. Insert nodes and relationships into Neo4j using chunked batches
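
To make step 3 more concrete, here is a minimal sketch of chunked insertion. It is not the script's actual implementation: it uses a plain Cypher UNWIND statement per chunk rather than APOC, and the Concept label, the cui property, the row keys cui1 and cui2, the edge direction and all function names are assumptions chosen for illustration. The session argument is expected to behave like a Neo4j Python driver session, i.e. to offer run(query, **parameters) returning a result with consume().

```python
from itertools import islice

# Assumption: relationship types are validated against an allow-list
# before being formatted into the query text.
ALLOWED_RELS = {"PAR", "CHD"}


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def merge_query(rel_type):
    """Build the Cypher statement for one relationship type."""
    # Relationship types cannot be passed as ordinary query parameters,
    # so only allow-listed values are ever interpolated.
    if rel_type not in ALLOWED_RELS:
        raise ValueError(f"relationship type not allowed: {rel_type}")
    return (
        "UNWIND $rows AS row "
        "MERGE (a:Concept {cui: row.cui1}) "
        "MERGE (b:Concept {cui: row.cui2}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


def insert_relationships(session, rel_type, rows, chunk_size=10_000):
    """Send rows to the database chunk by chunk; return the row count."""
    query = merge_query(rel_type)
    sent = 0
    for chunk in chunked(rows, chunk_size):
        # One statement per chunk keeps each transaction bounded in size.
        session.run(query, rows=chunk).consume()
        sent += len(chunk)
    return sent
```

Sending rows in bounded chunks avoids building one very large transaction, which is the usual motivation for batching when a file the size of MRREL.RRF is loaded.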

Customisation

  • Adjust ALLOWED_RELS in the script to include more relationship types
  • Tune batch_chunk_size and apoc_batch_size for better performance