SQL Server Graph

28 Mar 2021 18304 views 0 minutes to read Contributors

Getting Started with SQL Server Graph

 

The concept of storing data in a graph structure is not new. The data in graph databases is not stored in a relational structure, so storing graph data in SQL Server therefore required careful planning and execution. The SQL Server relational engine supports graph data as of SQL Server 2017 and this support represents the first steps of a full graph functionality implementation in the product.

SQL Server’s graph databases can help simplify the process of modeling data that contains complex many-to-many and hierarchical relationships. At its most basic, a graph database is a collection of nodes and edges that work together to define various types of relationships. A node is an entity such as a person or location. An edge is a relationship between two entities. For example, a relationship might exist between a location such as Seattle and a person named John, who lives in Seattle. John and Seattle are the entities, and ‘lives in’ is the relationship between the two.

The "special" nature of graph relational table storage consists of the addition of several system defined and managed columns in the graph tables. Beyond this, the tables are regular relational tables and therefore can be indexed with clustered, non-clustered and even columnstore indexes. It is possible to have multiple node types and multiple edge types in single tables, but generally you would separate them and create additional columns in each table for the node and edge properties specific to each type.

To create a node table you just need to add "AS NODE" to a create table statement. To create an edge table, similarly, just requires "AS EDGE" on a create table statement. Below is a script that creates a small sample graph structure with City, Person and Restaurant nodes and friendof, likes, livesIn and locatedIn edges.

 

Setting up demo environment :

-- Create a graph demo database

IF NOT EXISTS (SELECT * FROM sys.databases WHERE NAME = 'SqlTreeoGraphDemo')

                  CREATE DATABASE SqlTreeoGraphDemo;

GO

USE SqlTreeoGraphDemo;

GO

-- Create NODE tables

CREATE TABLE Person (

  ID INTEGER PRIMARY KEY,

  name VARCHAR(100)

) AS NODE;

 

CREATE TABLE Restaurant (

  ID INTEGER NOT NULL,

  name VARCHAR(100),

  city VARCHAR(100)

) AS NODE;

 

CREATE TABLE City (

  ID INTEGER PRIMARY KEY,

  name VARCHAR(100),

  stateName VARCHAR(100)

) AS NODE;

 

-- Create EDGE tables.

CREATE TABLE likes (rating INTEGER) AS EDGE;

CREATE TABLE friendOf AS EDGE;

CREATE TABLE livesIn AS EDGE;

CREATE TABLE locatedIn AS EDGE;

 

 

 

We can use below query to verify the node and edge tables in the database

 

 

 

 

 

Inserting sample data into Node and Edge tables :

-- Insert data into node tables. Inserting into a node table is same as inserting into a regular table

INSERT INTO Person (Id, name)

                  VALUES (1, 'John')

                                     , (2, 'Mary')

                                     , (3, 'Alice')

                                     , (4, 'Jacob')

                                     , (5, 'Julie');

 

INSERT INTO Restaurant (Id, name, city)

                  VALUES (1, 'Taco Dell','Bellevue')

                                     , (2, 'Ginger and Spice','Seattle')

                                     , (3, 'Noodle Land', 'Redmond');

 

INSERT INTO City (Id, name, stateName)

                  VALUES (1,'Bellevue','wa')

                                     , (2,'Seattle','wa')

                                     , (3,'Redmond','wa');

 

-- Insert into edge table. While inserting into an edge table,

-- you need to provide the $node_id from $from_id and $to_id columns.

/* Insert which restaurants each person likes */

INSERT INTO likes

  VALUES ((SELECT $node_id FROM Person WHERE ID = 1), (SELECT $node_id FROM Restaurant WHERE ID = 1), 9)

               , ((SELECT $node_id FROM Person WHERE ID = 2), (SELECT $node_id FROM Restaurant WHERE ID = 2), 9)

              , ((SELECT $node_id FROM Person WHERE ID = 3), (SELECT $node_id FROM Restaurant WHERE ID = 3), 9)

              , ((SELECT $node_id FROM Person WHERE ID = 4), (SELECT $node_id FROM Restaurant WHERE ID = 3), 9)

              , ((SELECT $node_id FROM Person WHERE ID = 5), (SELECT $node_id FROM Restaurant WHERE ID = 3), 9);

 

/* Associate in which city live each person*/

INSERT INTO livesIn

              VALUES ((SELECT $node_id FROM Person WHERE ID = 1), (SELECT $node_id FROM City WHERE ID = 1))

              , ((SELECT $node_id FROM Person WHERE ID = 2), (SELECT $node_id FROM City WHERE ID = 2))

              , ((SELECT $node_id FROM Person WHERE ID = 3), (SELECT $node_id FROM City WHERE ID = 3))

              , ((SELECT $node_id FROM Person WHERE ID = 4), (SELECT $node_id FROM City WHERE ID = 3))

               , ((SELECT $node_id FROM Person WHERE ID = 5), (SELECT $node_id FROM City WHERE ID = 1));

 

/* Insert data where the restaurants are located */

INSERT INTO locatedIn

               VALUES ((SELECT $node_id FROM Restaurant WHERE ID = 1), (SELECT $node_id FROM City WHERE ID =1))

               , ((SELECT $node_id FROM Restaurant WHERE ID = 2), (SELECT $node_id FROM City WHERE ID =2))

               , ((SELECT $node_id FROM Restaurant WHERE ID = 3), (SELECT $node_id FROM City WHERE ID =3));

 

/* Insert data into the friendOf edge */

INSERT INTO friendOf

               VALUES ((SELECT $NODE_ID FROM Person WHERE ID = 1), (SELECT $NODE_ID FROM Person WHERE ID = 2))

               ,((SELECT $NODE_ID FROM Person WHERE ID = 2), (SELECT $NODE_ID FROM Person WHERE ID = 3))

               , ((SELECT $NODE_ID FROM Person WHERE ID = 3), (SELECT $NODE_ID FROM Person WHERE ID = 1))

               , ((SELECT $NODE_ID FROM Person WHERE ID = 4), (SELECT $NODE_ID FROM Person WHERE ID = 2))

               , ((SELECT $NODE_ID FROM Person WHERE ID = 5), (SELECT $NODE_ID FROM Person WHERE ID = 4));

 

 

As notes, we could have combined the two edge tables into a single table, since they don't have edge specific properties, but separation makes the model more understandable.

Node tables have a system generated $node_id column. The $node_id column is a pseudo column that you can use in queries, but the actual table column name has a string of hex digits after it.

The node tables also have a graph_id column, similarly followed by a string of hex digits, but this column is not selectable and does not show up in a select * from the table.

When the node table is created, a unique index on the $node_id column is created. During loading, you may want to disable the unique index temporarily for performance reasons 

 

Edge tables have an $edge_id column, $from_id and $to_id columns (again, these are all pseudo column references, the actual columns have hex strings after them). There are also some other columns in the underlying table (i.e. graph_id, from_obj_id, from_id etc.), but these are not accessible and are system maintained. The key columns for most uses are the $from_id and $to_id columns.

 

 

 

In edge tables, some of the columns are also hidden from a select *, but you can see them in SSMS.

 

 

 

Querying Node and Edge tables :

The extensions to TSQL consist of a new "WHERE" predicate "MATCH". You always have the alternative to explicitly join the node and edge tables using normal join syntax, but the ASCII-art MATCH predicate can be used to easily traverse a graph in a join free manner.

 

-- Find Restaurants that John likes

SELECT Restaurant.name

FROM Person, likes, Restaurant

WHERE MATCH (Person-(likes)->Restaurant)

AND Person.name = 'John';

 

 

 

-- Find Restaurants that John's friends like

SELECT Restaurant.name

FROM Person person1, Person person2, likes, friendOf, Restaurant

WHERE MATCH(person1-(friendOf)->person2-(likes)->Restaurant)

 

AND person1.name='John';

 

 

 

-- Find people who like a restaurant in the same city they live in

SELECT Person.name

FROM Person, likes, Restaurant, livesIn, City, locatedIn

 

WHERE MATCH (Person-(likes)->Restaurant-(locatedIn)->City AND Person-(livesIn)->City);

 

 

 

 

 

Report a Bug

In this article