{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "70659ec5-54c4-4eee-ba6a-c3f17ac88638",
   "metadata": {},
   "source": [
    "## Output options for rendering the dataframes/tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "98078b40-9c8f-4b74-aaa7-275df72c9b79",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "# Maximum number of table rows to show\n",
    "pd.set_option(\"display.max_rows\", 1000)\n",
    "# Show the full error messages (often longer than the default column width)\n",
    "pd.set_option(\"display.max_colwidth\", 10000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9566f3ad-3587-415b-a49a-02ffaee35b34",
   "metadata": {},
   "source": [
    "## Validate one FHIR resource and render the validation results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f93e44bc-9644-4c50-94af-d9a29a91d54b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Uncomment to validate a single resource and render the validation outcome\n",
    "#from fhirvalidation import Validator\n",
    "#validator = Validator()\n",
    "#validator.validate_resource_and_render_validation_outcome('Condition/resource1')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0cd94b5-3a97-40c8-9a4a-d5d33feb9d32",
   "metadata": {},
   "source": [
    "## Bulk validation of resources found by FHIR search\n",
    "\n",
    "Validate all resources returned by a FHIR search"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d8ec4f1c-d933-4222-a35a-178454aba98a",
   "metadata": {},
   "source": [
    "#### Validate conditions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "42eeca2e-8b33-44d7-b194-e72630e66140",
   "metadata": {
    "editable": true,
    "scrolled": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from fhirvalidation import Validator\n",
    "\n",
    "validator = Validator()\n",
    "\n",
    "# Set auth parameters for your FHIR server (if you do not want to use the basic auth credentials from the environment variables in .env)\n",
    "# Documentation: https://requests.readthedocs.io/en/latest/user/authentication/#basic-authentication\n",
    "# validator.requests_kwargs['auth'] = ('myusername', 'mypassword')\n",
    "\n",
    "# Search for all resources of the resource_type\n",
    "search_parameters = {}\n",
    "\n",
    "# Search only for resources with a certain code\n",
    "# search_parameters = {\"code\": \"A00.0\"}\n",
    "\n",
    "df = validator.search_and_validate(resource_type=\"Condition\", search_parameters=search_parameters, limit=10000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a400b6d7-2565-4c06-8fcf-354cd9f0e970",
   "metadata": {},
   "source": [
    "## Issues\n",
    "\n",
    "Issues found in the dataframe returned by the bulk validation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "001dcc6b-0517-4101-8270-686dcde78e55",
   "metadata": {},
   "source": [
    "### Count of resources with issues\n",
    "\n",
    "Count of resources (unique `fullUrl`) with issues of any severity (including severity \"info\", which may not indicate a real problem)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "efa56586-2b67-4219-814a-3d679f360faa",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "len(pd.unique(df['fullUrl']))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e0fe487-fa30-4c64-8ba2-b13ac20a7714",
   "metadata": {},
   "source": [
    "### Grouped issues with aggregation of code systems sorted by count of affected resources\n",
    "\n",
    "Issues are additionally aggregated by code system (e.g. ICD-10): the individual codes of the same code system are removed, so there is no separate issue for each code used (e.g. each ICD-10 code).\n",
    "\n",
    "Sorted by count of affected resources"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "09af41b7-c4c4-422a-a1c6-577afbec98ac",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "df[['severity', 'location_aggregated', 'diagnostics_aggregated', 'fullUrl']].groupby([\"severity\", \"location_aggregated\", \"diagnostics_aggregated\"]).count().sort_values(['fullUrl'], ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4761060-19c3-4a9e-84a5-ce83088caefd",
   "metadata": {},
   "source": [
    "### Grouped issues with aggregation of code systems sorted by severity\n",
    "\n",
    "Issues are additionally aggregated by code system (e.g. ICD-10): the individual codes of the same code system are removed, so there is no separate issue for each code used (e.g. each ICD-10 code).\n",
    "\n",
    "Sorted by severity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e596c4df-bfc2-4faa-b63a-b606e96dbade",
   "metadata": {
    "editable": true,
    "scrolled": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "df[['severity', 'location_aggregated', 'diagnostics_aggregated', 'fullUrl']].groupby([\"severity\", \"location_aggregated\", \"diagnostics_aggregated\"]).count().sort_values(['severity','fullUrl'], ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b84016bd-2aba-4c28-a20a-f4b48a497234",
   "metadata": {},
   "source": [
    "### Grouped issues without aggregation of code systems\n",
    "\n",
    "Issues and the count of affected resources, sorted by the number of affected resources. Because code systems are not aggregated here (for aggregation by code system, see the sections above), a separate issue is shown for each code used from a code system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "83f03e4a-d3f5-406a-aa16-de846a72b0b1",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "df[['severity', 'location', 'diagnostics', 'fullUrl']].groupby([\"severity\", \"location\", \"diagnostics\"]).count().sort_values(['fullUrl'], ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "227a8710-5b07-421f-91ba-8a2a5ba25172",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "### Filter on severity \"error\"\n",
    "\n",
    "Show only issues with severity \"error\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20b5c938-6a08-4698-9297-7d2764c49838",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "#### Count of resources with severity \"error\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5609dfb4-501b-4cc8-9318-b19cd6399b69",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "len(pd.unique(df[df['severity']==\"error\"]['fullUrl']))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e72ccfa-813f-43f6-9e1d-c713c03c2714",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "#### Show only issues with severity \"error\" grouped by code systems\n",
    "\n",
    "Grouped issues, filtered on severity \"error\".\n",
    "\n",
    "Issues are additionally aggregated by code system (e.g. ICD-10): the individual codes of the same code system are removed, so there is no separate issue for each code used (e.g. each ICD-10 code).\n",
    "\n",
    "Sorted by count of affected resources"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c1205db8-0efe-491c-961b-97ad7d8149cf",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "df.query('severity==\"error\"')[['location_aggregated', 'diagnostics_aggregated', 'fullUrl']].groupby([\"location_aggregated\", \"diagnostics_aggregated\"]).count().sort_values(['fullUrl'], ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a8f220d-d314-4053-b42b-1ef6bf9a9bdd",
   "metadata": {},
   "source": [
    "#### Grouped issues with severity \"error\" without aggregation of code systems\n",
    "\n",
    "Issues and the count of affected resources, sorted by the number of affected resources.\n",
    "\n",
    "Because code systems are not aggregated here (for aggregation by code system, see the sections above), a separate issue is shown for each code used from a code system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e12ab11b-c245-440f-acfb-400cdb24e6d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.query('severity==\"error\"')[['location', 'diagnostics', 'fullUrl']].groupby([\"location\", \"diagnostics\"]).count().sort_values(['fullUrl'], ascending=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea051ae1-6698-4888-a672-3bc55c6740cd",
   "metadata": {},
   "source": [
    "## Resources with a specific error"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cef7c2f4-22a3-4c18-8eb2-ae7f811df532",
   "metadata": {},
   "outputs": [],
   "source": [
    "myerror = \"Condition.code.coding:icd10-gm.version: minimum required = 1, but only found 0 (from https://www.medizininformatik-initiative.de/fhir/core/modul-diagnose/StructureDefinition/Diagnose|2024.0.0)\"\n",
    "\n",
    "# Use boolean indexing:\n",
    "# df[df['diagnostics']==myerror]\n",
    "#\n",
    "# or use df.query and reference the local variable with the @ prefix\n",
    "# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html\n",
    "# (this avoids quoting problems if the message itself contains double quotes)\n",
    "df.query('diagnostics == @myerror')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f4146e5-ce88-42e0-b1ac-41b93c1d59f1",
   "metadata": {},
   "source": [
    "## Info\n",
    "\n",
    "Information about the dataframe, e.g. its memory usage"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2be8ee0e-5c48-4453-804e-c2db23115bd9",
   "metadata": {},
   "source": [
    "### Dataframe memory usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "73994859-6824-42e3-9ee3-9570ef9183a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.info()\n",
    "df.memory_usage(deep=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c46b93d4-35ad-427d-bda8-44c42f6b91a1",
   "metadata": {},
   "source": [
    "### Head - returns the first rows of the dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2e521216-ad7d-4ca6-8e04-7d86435a3a6a",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6311f067-72d1-4a32-91fe-585ebfb74c55",
   "metadata": {},
   "source": [
    "## Snippets\n",
    "\n",
    "Additional code snippets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff1ed096-dc18-492b-988b-8c5b7899adb9",
   "metadata": {},
   "source": [
    "### Markdown generation\n",
    "\n",
    "How to generate a table in Markdown format (e.g. for a CI/CD status report)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8843debf-ca4c-409c-89eb-8eba64432438",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# The reserved pipe character | has to be escaped as \\| (https://github.com/astanin/python-tabulate/issues/241)\n",
    "# Note: pandas >= 2.1 deprecates DataFrame.applymap in favor of DataFrame.map\n",
    "df_escaped = df.applymap(lambda s: s.replace('|','\\\\|') if isinstance(s, str) else s)\n",
    "\n",
    "print(df_escaped[['severity', 'location_aggregated', 'diagnostics_aggregated', 'fullUrl']].groupby([\"severity\", \"location_aggregated\", \"diagnostics_aggregated\"]).count().sort_values(['fullUrl'], ascending=False).to_markdown(tablefmt=\"github\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9dd17c8-3249-4959-967c-affb1d30cf23",
   "metadata": {},
   "source": [
    "### Navigate the validation results dataframe with an interactive user interface\n",
    "\n",
    "Use an interactive UI to navigate and filter the dataframe\n",
    "\n",
    "Documentation: [English](https://docs.kanaries.net/pygwalker#use-pygwalker-in-jupyter-notebook) / [German](https://docs.kanaries.net/de/pygwalker#verwendung-von-pygwalker-in-jupyter-notebook)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "086cda6f-2508-4b04-9c29-2894cf3d8b4b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install the pip package in the current Jupyter kernel\n",
    "# (adjust or remove the --proxy option to match your network)\n",
    "import sys\n",
    "!{sys.executable} -m pip install pygwalker --proxy http://141.53.65.163:8080/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4b8b1e8-bb9d-48ae-818f-b32b75b47ec6",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Render the dataframe with pygwalker\n",
    "import pygwalker as pyg\n",
    "walker = pyg.walk(df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "054d49ef-1c15-4ce1-bc0a-d941446e60dd",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}