PSS neo4j database

Basic schema

The basic schema of the PSS database (in terms of reactions) is shown below:

For a more detailed description of the schema, see: PSS database schema

Cypher querying language

The standard query language to interface with a neo4j database is Cypher. In Cypher nodes are represented by () and relationships are represented by -[]->. An edge therefore has the following general syntax:

(source)-[relationship]->(target)

The following patterns apply to nodes:
()	any node, no assigned variable
(:Metabolite)	any Metabolite node
(m:Metabolite)	any Metabolite node, assigned to variable m
( {<property>:<value>})	any node with property equal to value
The following patterns apply to relationships:
()--()	any relationship
()-[:SUBSTRATE]-()	any substrate relationship
()-[{<property>:<value>}]-()	any relationship with property equal to value
()-[p:SUBSTRATE]-()	any substrate relationship and assign to a variable called p
(:Metabolite)-[:SUBSTRATE]-(:Reaction)	any substrate relationship between a Metabolite and a Reaction
(:Metabolite)-[:SUBSTRATE]->(:Reaction)	any substrate relationship with Metabolite source and Reaction target (i.e. directed)
Basic query structures:
MATCH <pattern>	searches for the given pattern in the db (SELECT in SQL)
OPTIONAL MATCH	searches for the given pattern in the db, results in NULL if no match (still returns)
WHERE	filter result, can be combined with AND OR XOR NOT
WHERE <entity.property> <comparison> <value>	Comparisons include: =< > <= >= =~ (a regex expression) IN (a list)
WHERE	More filter options: exists(<pattern>) <entity.property> STARTS WITH <entity.property> ENDS WITH <entity.property> CONTAINS
RETURN <result>	returns values or results of a query
RETURN <result> AS <alias>	return result with an alias property key instead of property name
DISTINCT	filters to unique set

Further statements:

CREATE
SET
MERGE
DELETE
REMOVE

Example cypher statements

Fetch all nodes:: MATCH (n)
RETURN DISTINCT n.name AS name
Fetch all undirected neighbours of JAR:: MATCH (n {name:"JAR"})--()
RETURN DISTINCT other.name AS name

Installation and deployment by Docker

Installation

First, install Docker on your computer. The instructions can be found at: Get Docker.

Clone the repository

Clone the repository to your computer:

    git clone https://github.com/nib-si/skm-neo4j.git

The correct branch is ---:

    cd skm-neo4j
    git checkout ---

The repository is organised as follows:

      skm-neo4j
        ├─ docker-compose.yaml
        ├─ docker-entrypoint.sh
        ├─ dockerfile
        ├─ conf
        ├─ logs
        ├─ data
        .   ├─ dumps
        .   ├─ raw
        .   ├─ import
        .   └─ db
        ├─ work
        └─ docs
            └─readme.md

docker-compose file
neo4j startup script
neo4j container build script
contains configuration files for neo4j and plugins
for neo4j logging

contains dumped databases
raw, original data (dev only)
for neo4j importing (dev only)
neo4j graph storage volume (dev only)
work scripts & notebooks
documentation folder

Set up neo4j database

to deploy the graph database on your computer as a container, you will need docker installed.

in the repository root folder, build the neo4j container with the graph:
```
    cd skm-neo4j
    docker build --tag skm-graph .
```
To run this image:
```
    docker run -it \
        -p7474:7474 -p1337:1337 -p7687:7687 \
        -v$pwd/logs:/logs -v$pwd/conf:/conf
        skm-graph
```
And visit the neo4j browser at http://localhost:7474/.

Link the neo4j database to a Jupyter notebook

Start up the neo4j database and link a jupyter notebook using docker-compose. To run the first time, navigate to the repo folder (containing docker-compose.yml) and run:
```
    docker-compose up
```
Thereafter you can use:
```
    docker-compose start
```
Open your browser at the http://localhost:8888/?token=... link printed to the terminal. If the logs are not printed, run
```
    docker-compose logs | grep 'http://localhost:.*/?token' | tail -1
```
to find the correct link.

To stop the containers, use:
```
    docker-compose stop
```
Or to stop and remove them:
```
    docker-compose down
```
Remove database

If you need to remove the graph image, first remove the container:
```
    docker-compose down
```
Or get the container ID from the first column in:
```
    docker ps -a | grep skm-graph
```
And delete it using:
```
    docker rm <container id>
```
Then delete the image:
```
    docker rmi skm-graph
```
Delete the folder with the graph data (if it exists).
```
    sudo rm -rf data/db/*
```
Dump a neo4j graph from Docker

In neo4j enterprise edition, follow instructions here.

For community edition, do the following
1. Start up a neo4j container, without starting the neo4j database:
```
  docker run \
  --volume=$pwd/data/db/:/data \
  --volume=$pwd/data/dumps:/dumps \
  --ulimit=nofile=40000:40000 \
  --user=1000:1000 \
  -it \
  neo4j:4.3.2 \
  /bin/bash
```
2. In the container, use neo4j-admin to dump the graph:
```
  bin/neo4j-admin dump --database=skm --to=/dumps/skm.dump
```
  Ctrl + d to exit the container.
3. Remove the new container:
```
    docker rm dump
```

Stress Knowledge Map

PSS neo4j database

Basic schema

Cypher querying language

Example cypher statements

Installation and deployment by Docker

Installation

Clone the repository

Set up neo4j database

Link the neo4j database to a Jupyter notebook

Remove database

Dump a neo4j graph from Docker