Query Language

From Dgraph Wiki

GraphQL+-

Dgraph uses a variation of GraphQL[1] as its primary language of communication. GraphQL[1] is a query language created by Facebook for describing the capabilities and requirements of data models for client‐server applications. While GraphQL isn't aimed at graph databases, its graph-like query syntax, schema validation and subgraph-shaped response make it a great language choice. That said, we have modified GraphQL to support graph operations and removed some features that we felt weren't the right fit for a graph database language. We call this simplified, feature-rich language GraphQL+-.

Note: This language is a work in progress. We're adding more features, and we might further simplify some of the existing ones.

Mutations

Note: All mutation queries shown here can be run using curl localhost:8080/query -XPOST -d $'mutation { ... }'

To add data to Dgraph, GraphQL+- uses the widely accepted RDF N-Quad format[2]. Simply put, an RDF N-Quad[2] represents a fact. Let's understand this with a few examples.

<0x01> <name> "Alice" .

Here, 0x01 is the internal unique id assigned to an entity. name is the predicate (relationship). It is the connector between the entity (a person) and her name. Finally, we have "Alice", a string value representing her name. An RDF N-Quad ends with a dot (.) to mark the end of a single fact.

Language

RDF N-Quad[2] allows specifying the language for string values, using @lang. Using that syntax, we can set 0x01's name in other languages.

<0x01> <name> "Алисия"@ru .
<0x01> <name> "Adélaïde"@fr .

Note that Dgraph converts these to predicate.lang. So, the above would be equivalent to the following for Dgraph. In fact, this is how we'll be querying for names later.

<0x01> <name.ru> "Алисия" .
<0x01> <name.fr> "Adélaïde" .
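
As a rough illustration of the conversion described above, the following Python sketch formats a language-tagged N-Quad and the equivalent predicate.lang line. The helper names nquad and as_stored are hypothetical and not part of any Dgraph client; the sketch also assumes the value contains no characters that need escaping.

```python
def nquad(subject, predicate, value, lang=None):
    """Format one RDF N-Quad line with a string value and an optional @lang tag."""
    tag = "@" + lang if lang else ""
    return '<%s> <%s> "%s"%s .' % (subject, predicate, value, tag)


def as_stored(subject, predicate, value, lang=None):
    """Format the equivalent line using the predicate.lang convention."""
    if lang:
        predicate = predicate + "." + lang
    return nquad(subject, predicate, value)


print(nquad("0x01", "name", "Adélaïde", lang="fr"))
print(as_stored("0x01", "name", "Adélaïde", lang="fr"))
```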

Batch mutations

You can put multiple RDF lines into a single query to Dgraph. This is highly recommended. By default, the Dgraph loader batches 1000 RDF lines into one query and runs 100 such queries in parallel.

mutation {
  set {
    <0x01> <name> "Alice" .
    <0x01> <name> "Алисия"@ru .
    <0x01> <name> "Adélaïde"@fr .
  }
}

Tip: Using a $ in front of the curl payload maintains newline characters, which are essential for RDF N-Quads.
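
The batching idea can be sketched in a few lines of Python. This is only an illustration of grouping RDF lines into mutation blocks, not the loader's actual code; batch_mutations is a hypothetical helper, and running the batches in parallel is left out.

```python
def batch_mutations(rdf_lines, batch_size=1000):
    """Yield one 'mutation { set { ... } }' string per batch of RDF lines."""
    for start in range(0, len(rdf_lines), batch_size):
        batch = rdf_lines[start:start + batch_size]
        body = "\n".join("    " + line for line in batch)
        yield "mutation {\n  set {\n" + body + "\n  }\n}"


lines = ['<0x01> <name> "Alice" .',
         '<0x01> <name> "Алисия"@ru .',
         '<0x01> <name> "Adélaïde"@fr .']

# A tiny batch size is used here only to make the split visible.
for mutation in batch_mutations(lines, batch_size=2):
    print(mutation)
```

Each yielded string can be sent as one query, so 2500 lines with the default batch size would produce three queries.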


Assigning UID

Blank nodes (_:<identifier>) in mutations let you create a new entity in the database by assigning it a UID. Dgraph can assign multiple UIDs in a single query while preserving relationships between these entities. Say you want to create a database of the students in a class: you could create each new person entity using this method. The example below creates a class, students and some relationships.

mutation {
 set {
   _:class <student> _:x .
   _:class <name> "awesome class" .
   _:x <name> "alice" .
   _:x <planet> "Mars" .
   _:x <friend> _:y .
   _:y <name> "bob" .
 }
}

# This on execution returns
{"code":"ErrorOk","message":"Done","uids":{"class":13454320254523534561,"x":15823208343267848009,"y":14313500258320550269}}
# Three new entities have been assigned UIDs.

The result of the mutation has a field called uids which contains the assigned UIDs. The mutation above created three new entities, class, x and y, whose UIDs can be kept for later reference. Now we can query as follows using the _uid_ of class.

{
 class(_uid_:<assigned-uid>) {
  name
  student {
   name
   planet
   friend {
    name
   }
  }
 }
}

# This query on execution results in
{
    "class": {
        "name": "awesome class",
        "student": {
            "friend": {
                "name": "bob"
            },
            "name": "alice",
            "planet": "Mars"
        }
    }
}
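
A client would typically read the uids field from the mutation response and reuse it in the follow-up query. The Python sketch below shows one way to do that with the response from the example above. Writing the UID in hexadecimal is an assumption (the other examples on this page use hex UIDs), and class_query is a hypothetical helper.

```python
import json

# Response copied from the mutation example above.
response = ('{"code":"ErrorOk","message":"Done","uids":{"class":13454320254523534561,'
            '"x":15823208343267848009,"y":14313500258320550269}}')

uids = json.loads(response)["uids"]  # maps blank-node name -> assigned UID


def class_query(uid):
    """Build the follow-up query; formatting the UID as hex is an assumption."""
    return ("{ class(_uid_: %s) { name student { name planet friend { name } } } }"
            % hex(uid))


print(class_query(uids["class"]))
```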

External IDs (_xid_)

Although it is not recommended, Dgraph supports using external ids directly in queries. These could be useful if you are importing existing data into Dgraph. You could rewrite the above example as follows:

<alice> <name> "Alice" .

This would use a deterministic fingerprint of the XID alice and map that to a UID.

Warning: Dgraph does not store a UID to XID mapping. So, given a UID, it won't locate the corresponding XID automatically. Also, XID collisions are possible and are NOT checked for. If the fingerprints of two XIDs map to the same UID, their data would get mixed up.

You can add and query the above data like so.

mutation{
  set {
    <alice> <name> "Alice" .
  }
}

$ curl localhost:8080/query -XPOST -d '{debug(_xid_:alice) {name}}'

Delete

The delete keyword lets you delete a specified S P O triple through a mutation. For example, to delete the record <lewis-carrol> <died> "1998" ., you would do the following.

mutation {
  delete {
     <lewis-carrol> <died> "1998" .
  }
}

Or you can delete it using the _uid_ for lewis-carrol, like so:

mutation {
  delete {
     <0xf11168064b01135b> <died> "1998" .
  }
}

Queries

Note:

Most of the examples here are based on the 21million.rdf.gz file located here. The geo-location queries are based on sf.tourism.gz file located here. The corresponding 21million.schema for both is located here.

To try out these queries, you can download these files, run Dgraph and load the data like so.

$ dgraph --schema 21million.schema
$ dgraphloader -r 21million.rdf.gz,sf.tourism.gz

With Dgraph running, all queries shown can be run using curl localhost:8080/query -XPOST -d $'{ ... }'.
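
For readers who prefer a script over curl, here is a minimal Python sketch that prepares the same kind of POST request using only the standard library. The URL mirrors the curl command above; build_request is a hypothetical helper, and the request is only constructed here, not sent.

```python
import urllib.request


def build_request(query, url="http://localhost:8080/query"):
    """Prepare (but do not send) a POST request carrying the query text."""
    return urllib.request.Request(url, data=query.encode("utf-8"), method="POST")


req = build_request("{ me(_xid_: m.06pj8) { type.object.name.en } }")
print(req.get_method(), req.full_url)

# To actually run it against a local server:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```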

Queries in GraphQL+- look very much like queries in GraphQL. You typically start with a node, and expand edges from there. Each {} block goes one layer deep.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film  {
      film.film.genre {
        type.object.name.en
      }
      type.object.name.en
      film.film.initial_release_date
    }
  }
}
{
  "me": [
  {
    "film.director.film": [
      {
        "film.film.genre": [
          {
            "type.object.name.en": "Costume Adventure"
          },
          {
            "type.object.name.en": "Adventure Film"
          },
          {
            "type.object.name.en": "Action/Adventure"
          },
          {
            "type.object.name.en": "Action Film"
          }
        ],
        "film.film.initial_release_date": "1984-05-22",
        "type.object.name.en": "Indiana Jones and the Temple of Doom"
      },
      ...
      ...
      ...
      {
        "film.film.genre": [
          {
            "type.object.name.en": "Drama"
          }
        ],
        "film.film.initial_release_date": "1985-12-15",
        "type.object.name.en": "The Color Purple"
      }
    ],
    "type.object.name.en": "Steven Spielberg"
  }
  ]
}

What happened above is that we start the query with an entity denoting Steven Spielberg (from Freebase data), then expand by two predicates: type.object.name.en, which yields the value Steven Spielberg, and film.director.film, which yields the entities of films directed by Steven Spielberg.

Then for each of these film entities, we expand by three predicates: type.object.name.en which yields the name of the film, film.film.initial_release_date which yields the film release date and film.film.genre which yields a list of genre entities. Each genre entity is then expanded by type.object.name.en to get the name of the genre.
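
Because the response mirrors the shape of the query, client code can walk it level by level. The Python sketch below flattens a trimmed copy of the response above into one row per film; films_with_genres is a hypothetical helper written for this illustration.

```python
# A trimmed copy of the response shown above.
response = {
    "me": [{
        "film.director.film": [
            {"film.film.genre": [{"type.object.name.en": "Drama"}],
             "film.film.initial_release_date": "1985-12-15",
             "type.object.name.en": "The Color Purple"},
        ],
        "type.object.name.en": "Steven Spielberg",
    }]
}

NAME = "type.object.name.en"


def films_with_genres(resp):
    """Return (director, film, release date, [genres]) tuples, one per film."""
    rows = []
    for director in resp.get("me", []):
        for film in director.get("film.director.film", []):
            genres = [g[NAME] for g in film.get("film.film.genre", [])]
            rows.append((director[NAME], film[NAME],
                         film.get("film.film.initial_release_date"), genres))
    return rows


print(films_with_genres(response))
```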

Pagination

Often there is too much data and you only want a slice of the data.

First

If you want just the first few results as you expand the predicate for an entity, you can use the first argument. The value of first can be positive for the first N results, and negative for the last N.

Note: Without a sort order specified, the results are sorted by _uid_, which is assigned randomly. So the ordering, while deterministic, might not be what you expect.

# Retrieve the first two films of Steven Spielberg.

{
  me(_xid_: m.06pj8) {
    film.director.film (first: 2) {
      type.object.name.en
      film.film.initial_release_date
      film.film.genre (first: -3) {
          type.object.name.en
      }
    }
  }
 }
{
    "me": [
        {
            "film.director.film": [
                {
                    "film.film.genre": [
                        {
                            "type.object.name.en": "Adventure Film"
                        },
                        {
                            "type.object.name.en": "Action/Adventure"
                        },
                        {
                            "type.object.name.en": "Action Film"
                        }
                    ],
                    "film.film.initial_release_date": "1984-05-23",
                    "type.object.name.en": "Indiana Jones and the Temple of Doom"
                },
                {
                    "film.film.genre": [
                        {
                            "type.object.name.en": "Mystery film"
                        },
                        {
                            "type.object.name.en": "Horror"
                        },
                        {
                            "type.object.name.en": "Thriller"
                        }
                    ],
                    "film.film.initial_release_date": "1975-06-20",
                    "type.object.name.en": "Jaws"
                }
            ]
        }
    ]
}
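
The meaning of positive and negative values of first can be modelled with list slicing. This Python sketch only mirrors the behaviour described above on an in-memory list; apply_first is a hypothetical helper, and how Dgraph treats first: 0 is not covered by this page.

```python
def apply_first(results, first):
    """Keep the first N results if first > 0, or the last N if first < 0."""
    if first >= 0:
        return results[:first]
    return results[first:]


# Genres of "Indiana Jones and the Temple of Doom" from the earlier response.
genres = ["Costume Adventure", "Adventure Film", "Action/Adventure", "Action Film"]
print(apply_first(genres, 2))
print(apply_first(genres, -3))
```

With first: -3, the model keeps the last three genres, which matches the response above.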

Offset

To get the single result that follows the first two, skip the first two results with offset:2 and keep only one result with first:1.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film(first:1, offset:2)  {
      film.film.genre {
        type.object.name.en
      }
      type.object.name.en
      film.film.initial_release_date
    }
  }
}

Notice the first and offset arguments. Here is the output which contains only one result.

{
  "me": [
    {
      "film.director.film": [
        {
          "film.film.genre": [
            {
              "type.object.name.en": "War film"
            },
            {
              "type.object.name.en": "Drama"
            },
            {
              "type.object.name.en": "Action Film"
            }
          ],
          "film.film.initial_release_date": "1998-07-23",
          "type.object.name.en": "Saving Private Ryan"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}

After

Dgraph assigns a uint64 to every entity; these are called UIDs (unique internal IDs). All results are sorted by UID by default. Therefore, another way to get the single result that follows the first two is to specify that the UIDs of all results must be larger than the UID of the second result.

This helps in pagination, where the first query would be of the form <attribute>(after: 0x0, first: N) and the subsequent ones would be of the form <attribute>(after: <uid of last entity in last result>, first: N).

In the above example, the first two results are the film entities of "Indiana Jones and the Temple of Doom" and "Jaws". You can obtain their UIDs by adding the predicate _uid_ in the query.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film(first:2) {
      _uid_
      type.object.name.en
    }
  }
}

The response looks like:

{
  "me": [
    {
      "film.director.film": [
        {
          "_uid_": "0xc17b416e58b32bb",
          "type.object.name.en": "Indiana Jones and the Temple of Doom"
        },
        {
          "_uid_": "0xc6f4b3d7f8cbbad",
          "type.object.name.en": "Jaws"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}

Now we know the UID of the second result is 0xc6f4b3d7f8cbbad. We can get the next result by specifying the after argument.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film(first:1, after:0xc6f4b3d7f8cbbad)  {
      film.film.genre {
        type.object.name.en
      }
      type.object.name.en
      film.film.initial_release_date
    }
  }
}

The response is the same as the one we got earlier with offset:2 and first:1.

{
  "me": [
    {
      "film.director.film": [
        {
          "film.film.genre": [
            {
              "type.object.name.en": "War film"
            },
            {
              "type.object.name.en": "Drama"
            },
            {
              "type.object.name.en": "Action Film"
            }
          ],
          "film.film.initial_release_date": "1998-07-23",
          "type.object.name.en": "Saving Private Ryan"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}
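
To see why offset and after give the same result here, the Python sketch below models both arguments over a UID-sorted list and then pages through the list with after as a cursor. It is a simplified in-memory model, not Dgraph's implementation; select and pages are hypothetical helpers, and the third UID is made up for the example.

```python
def select(uids, first, offset=0, after=0):
    """Model pagination over a UID-sorted result list.

    Keeps UIDs strictly greater than `after`, skips `offset` of them,
    then returns at most `first` results.
    """
    remaining = [u for u in sorted(uids) if u > after]
    return remaining[offset:offset + first]


# The first two UIDs come from the example above; the third is made up.
uids = [0xc17b416e58b32bb, 0xc6f4b3d7f8cbbad, 0xc6f4b3d7f8cbbad + 1]

by_offset = select(uids, first=1, offset=2)
by_after = select(uids, first=1, after=0xc6f4b3d7f8cbbad)
print(by_offset == by_after)


def pages(uids, n):
    """Walk all results n at a time, using the last UID seen as the cursor."""
    cursor = 0  # corresponds to after: 0x0
    while True:
        page = select(uids, first=n, after=cursor)
        if not page:
            return
        yield page
        cursor = page[-1]


print(list(pages(uids, 2)))
```

One reason to prefer the cursor form is that it stays correct when earlier results are added or removed between requests, whereas an offset would shift.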

Alias

Alias lets us provide alternate names to predicates in results for convenience.

For example, the following query replaces the predicate type.object.name.en with name in the JSON result.

{
  me(_xid_: m.0bxtg) {
    name:type.object.name.en
  }
}
{
  "me":[
    {
      "name":"Tom Hanks"
    }
  ]
}

Count

The _count_ keyword lets us obtain the number of entities instead of retrieving the entire list. For example, the following query retrieves the name of the actor with _xid_ m.0bxtg and the number of films the actor has appeared in.

 {
  me(_xid_: m.0bxtg) {
    type.object.name.en
    film.actor.film {
      _count_
    }
  }
}

Functions

Note: Functions can only be applied to indexed attributes. Ensure that you specify the index in the schema file.

Term matching

AllOf

The AllOf function searches for entities that contain all of the specified terms (one or more). In essence, this is an intersection of entities containing the specified terms; the ordering does not matter. It follows this syntax: allof(predicate, "space-separated terms")

Usage as filter

Suppose we want the films of Steven Spielberg whose names contain both the words indiana and jones.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film @filter(allof("type.object.name.en", "jones indiana"))  {
      _uid_
      type.object.name.en
    }
  }
}

allof tells Dgraph that the matching films' type.object.name.en have to contain both the words "indiana" and "jones". Here is the response.

{
  "me": [
    {
      "film.director.film": [
        {
          "_uid_": "0xc17b416e58b32bb",
          "type.object.name.en": "Indiana Jones and the Temple of Doom"
        },
        {
          "_uid_": "0x7d0807a6740c25dc",
          "type.object.name.en": "Indiana Jones and the Kingdom of the Crystal Skull"
        },
        {
          "_uid_": "0xa4c4cc65751e98e7",
          "type.object.name.en": "Indiana Jones and the Last Crusade"
        },
        {
          "_uid_": "0xd1c161bed9769cbc",
          "type.object.name.en": "Indiana Jones and the Raiders of the Lost Ark"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}

Usage at root

In the following example, we list all the entities (in this case all films) that have both terms "jones" and "indiana". Moreover, for each entity, we query their film genre and names.

{
  me(allof("type.object.name.en", "jones indiana")) {
    type.object.name.en
    film.film.genre {
      type.object.name.en
    }
  }
}

Here is a part of the response.

{
  "me": [
    {
      "film.film.genre": [
        {
          "type.object.name.en": "Adventure Film"
        },
        {
          "type.object.name.en": "Horror"
        }
      ],
      "type.object.name.en": "The Adventures of Young Indiana Jones: Masks of Evil"
    },
    {
      "film.film.genre": [
        {
          "type.object.name.en": "War film"
        },
        {
          "type.object.name.en": "Adventure Film"
        }
      ],
      "type.object.name.en": "The Adventures of Young Indiana Jones: Adventures in the Secret Service"
    },
    ...
    {
      "film.film.genre": [
        {
          "type.object.name.en": "Comedy"
        }
      ],
      "type.object.name.en": "The Adventures of Young Indiana Jones: Espionage Escapades"
    }
  ]
}

AnyOf

The AnyOf function searches for entities that contain any of the specified terms (two or more). In essence, this is a union of entities containing the specified terms. Again, the ordering does not matter. It follows this syntax: anyof(predicate, "space-separated terms")

Usage as filter

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film @filter(anyof("type.object.name.en", "war spies"))  {
      _uid_
      type.object.name.en
    }
  }
}
{
  "me": [
    {
      "film.director.film": [
        {
          "_uid_": "0x38160fa42cf3f4c9",
          "type.object.name.en": "War Horse"
        },
        {
          "_uid_": "0x39d8574f26521fcc",
          "type.object.name.en": "War of the Worlds"
        },
        {
          "_uid_": "0xd915cb0eb9ad47c0",
          "type.object.name.en": "Bridge of Spies"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}

Usage at root

We can look up films whose names contain either the word "passion" or "peacock". Surprisingly many films satisfy this criterion. We will query their names and genres.

{
  me(anyof("type.object.name.en", "passion peacock")) {
    type.object.name.en
    film.film.genre {
      type.object.name.en
    }
  }
}
{
  "me": [
    {
      "type.object.name.en": "Unexpected Passion"
    },
    {
      "film.film.genre": [
        {
          "type.object.name.en": "Drama"
        },
        {
          "type.object.name.en": "Silent film"
        }
      ],
      "type.object.name.en": "The New York Peacock"
    },
    {
      "film.film.genre": [
        {
          "type.object.name.en": "Drama"
        },
        {
          "type.object.name.en": "Romance Film"
        }
      ],
      "type.object.name.en": "Passion of Love"
    },
    ...
    {
      "film.film.genre": [
        {
          "type.object.name.en": "Crime Fiction"
        },
        {
          "type.object.name.en": "Comedy"
        }
      ],
      "type.object.name.en": "The Passion of the Reefer"
    }
  ]
}

Note that the first result with the name "Unexpected Passion" is either not a film entity, or it is a film entity with no genre.
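
The intersection and union semantics of allof and anyof can be modelled with Python sets. This is a sketch only: the tokenization (lowercasing and splitting on non-alphanumeric ASCII characters) is an assumption and may differ from what Dgraph's index does, and the function names simply mirror the query functions.

```python
import re


def terms(text):
    """Lowercase the text and split it into terms (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def allof(value, query):
    """True if the value contains every term in the query."""
    return terms(query) <= terms(value)


def anyof(value, query):
    """True if the value contains at least one term in the query."""
    return bool(terms(query) & terms(value))


names = ["Indiana Jones and the Last Crusade", "War Horse", "Bridge of Spies", "Jaws"]
print([n for n in names if allof(n, "jones indiana")])
print([n for n in names if anyof(n, "war spies")])
```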

Inequality

Type Values

The following are possible values to be used in inequality functions for scalar types.

date

  • YYYY (year)
  • YYYY-MM (year-month)
  • YYYY-MM-DD (year-month-day)

Less than or equal to

leq is used to filter or obtain UIDs whose value for a predicate is less than or equal to a given value.

 {
    me(_xid_: m.06pj8) {
      type.object.name.en
      film.director.film @filter(leq("film.film.initial_release_date", "1970-01-01"))  {
          film.film.initial_release_date
          type.object.name.en
      }
    }
 }

This query would return the name and release date of all the movies directed by Steven Spielberg and released on or before 1970-01-01.

{
    "me": [
        {
            "film.director.film": [
                {
                    "film.film.initial_release_date": "1964-03-24",
                    "type.object.name.en": "Firelight"
                },
                {
                    "film.film.initial_release_date": "1968-12-18",
                    "type.object.name.en": "Amblin"
                },
                {
                    "film.film.initial_release_date": "1967-01-01",
                    "type.object.name.en": "Slipstream"
                }
            ],
            "type.object.name.en": "Steven Spielberg"
        }
    ]
}

Greater than or equal to

geq is used to filter or obtain UIDs whose value for a predicate is greater than or equal to a given value.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
     film.director.film @filter(geq("film.film.initial_release_date", "2008"))  {
        Release_date: film.film.initial_release_date
        Name: type.object.name.en
    }
  }
}

This query would return the name and release date of the movies directed by Steven Spielberg that were released in 2008 or later.

{
    "me": [
        {
            "film.director.film": [
                {
                    "Name": "War Horse",
                    "Release_date": "2011-12-04"
                },
                {
                    "Name": "Indiana Jones and the Kingdom of the Crystal Skull",
                    "Release_date": "2008-05-18"
                },
                {
                    "Name": "Lincoln",
                    "Release_date": "2012-10-08"
                },
                {
                    "Name": "Bridge of Spies",
                    "Release_date": "2015-10-16"
                },
                {
                    "Name": "The Adventures of Tintin: The Secret of the Unicorn",
                    "Release_date": "2011-10-23"
                }
            ],
            "type.object.name.en": "Steven Spielberg"
        }
    ]
}

Less than, greater than, equal to

Above, we have seen the usage of geq and leq. You can also use gt for "strictly greater than" and lt for "strictly less than" and eq for "equal to".
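
The five inequality functions and the three accepted date forms can be modelled as follows. This Python sketch assumes that a partial date such as 2008 is treated as the first day of that period (2008-01-01), which is consistent with the geq example above but is not stated explicitly on this page; parse_date and compare are hypothetical helpers.

```python
import operator
from datetime import datetime

FORMATS = ("%Y-%m-%d", "%Y-%m", "%Y")

# Function names from the text mapped to Python comparison operators.
OPS = {"leq": operator.le, "geq": operator.ge,
       "lt": operator.lt, "gt": operator.gt, "eq": operator.eq}


def parse_date(text):
    """Parse YYYY, YYYY-MM or YYYY-MM-DD; missing parts default to 01."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError("unsupported date value: %r" % text)


def compare(name, stored, given):
    """Apply the named inequality to a stored date and a query argument."""
    return OPS[name](parse_date(stored), parse_date(given))


print(compare("geq", "2008-05-18", "2008"))
print(compare("leq", "1968-12-18", "1970-01-01"))
```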

Geolocation

Note: Geolocation functions support only polygons and points as of now. Also, polygons with holes are replaced with the outer loop ignoring any holes.

The data used for testing the geo functions can be found in the benchmarks repository[3]. You will need to index the loc predicate with type geo before loading the data for these queries to work.

Near

Near returns all entities which lie within a specified distance from a given point. It takes three arguments: the predicate (on which the index is based), a geo-location point and a distance (in metres).

{
  tourist( near("loc", "{\'type\':\'Point\', \'coordinates\': [-122.469829, 37.771935]}", "1000" ) ) {
    name
  }
}

This query returns all the entities located within 1000 metres of the specified point, which is given in GeoJSON format.

{
    "tourist": [
        {
            "name": "National AIDS Memorial Grove"
        },
        {
            "name": "Japanese Tea Garden"
        },
        {
            "name": "Peace Lantern"
        },
        {
            "name": "Steinhart Aquarium"
        },
        {
            "name": "De Young Museum"
        },
        {
            "name": "Morrison Planetarium"
        },
         .
         .
        {
            "name": "San Francisco Botanical Garden"
        },
        {
            "name": "Buddha"
        }
    ]
}
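
As an illustration of what a distance check like near involves, the Python sketch below computes great-circle distances with the haversine formula and keeps the points within the given radius. This is not Dgraph's implementation; the Earth radius is an approximation, the candidate coordinates are made up, and the helper names are hypothetical. Note that GeoJSON points are ordered [longitude, latitude].

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an approximation


def haversine_m(p, q):
    """Great-circle distance in metres between two [longitude, latitude] points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def near(points, centre, distance_m):
    """Return the names of points lying within distance_m of centre."""
    return [name for name, loc in points.items()
            if haversine_m(loc, centre) <= distance_m]


centre = [-122.469829, 37.771935]        # query point from the example above
points = {"close": [-122.468, 37.771],   # made-up coordinates
          "far": [-122.503, 37.733]}
print(near(points, centre, 1000))
```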

Within

Within returns all entities which lie completely within the specified region. It takes two arguments: the predicate (on which the index is based) and a geo-location region.

{
  tourist(within("loc", "{ \'type\': \'Polygon\', \'coordinates\': [ [ [ -122.47266769409178, 37.769018558337926 ], [ -122.47266769409178, 37.773699921075135 ], [ -122.4651575088501, 37.773699921075135 ], [ -122.4651575088501, 37.769018558337926 ], [ -122.47266769409178, 37.769018558337926 ] ] ] }")) {
    name
  }
}

This query returns all the entities (points/polygons) located completely within the specified polygon in geojson format.

{
    "tourist": [
        {
            "name": "Japanese Tea Garden"
        },
        {
            "name": "Peace Lantern"
        },
        {
            "name": "Rose Garden"
        },
        {
            "name": "Steinhart Aquarium"
        },
        {
            "name": "De Young Museum"
        },
        {
            "name": "Morrison Planetarium"
        },
        {
            "name": "Spreckels Temple of Music"
        },
        {
            "name": "Hamon Tower"
        },
        {
            "name": "Buddha"
        }
    ]
}

Note: The containment check for polygons is approximate as of v0.7.1.
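
For point entities, a containment check of this kind can be sketched with the classic ray-casting test. The Python code below treats coordinates as planar, handles only the outer ring, and is meant as an illustration rather than a description of how Dgraph evaluates within; point_in_ring is a hypothetical helper.

```python
def point_in_ring(point, ring):
    """Ray-casting test on a closed ring of [longitude, latitude] pairs."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Count the edges that a horizontal ray from the point crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Outer ring of the polygon used in the query above.
ring = [[-122.47266769409178, 37.769018558337926],
        [-122.47266769409178, 37.773699921075135],
        [-122.4651575088501, 37.773699921075135],
        [-122.4651575088501, 37.769018558337926],
        [-122.47266769409178, 37.769018558337926]]

print(point_in_ring([-122.469829, 37.771935], ring))
print(point_in_ring([-122.50326097011566, 37.73353615592843], ring))
```

The first point is the one used in the Near example and falls inside the polygon; the second is the point used in the Contains example and falls outside it.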

Contains

Contains returns all entities which completely enclose the specified point or region. It takes two arguments: the predicate (on which the index is based) and a geo-location point or region.

{
  tourist(contains("loc", "{ \'type\': \'Point\', \'coordinates\': [ -122.50326097011566, 37.73353615592843 ] }")) {
    name
  }
}

This query returns all the entities that completely enclose the specified point (or polygon) in geojson format.

{
    "tourist": [
        {
            "name": "San Francisco Zoo"
        },
        {
            "name": "Flamingo"
        }
    ]
}

Intersects

Intersects returns all entities which intersect with the given polygon. It takes two arguments: the predicate (on which the index is based) and a geo-location region.

{
  tourist(intersects("loc", "{ \'type\': \'Polygon\', \'coordinates\': [ [ [ -122.503325343132, 37.73345766902749 ], [ -122.503325343132, 37.733903134117966 ], [ -122.50271648168564, 37.733903134117966 ], [ -122.50271648168564, 37.73345766902749 ], [ -122.503325343132, 37.73345766902749 ] ] ] }")) {
    name
  }
}

This query returns all the entities that intersect with the specified polygon/point in geojson format.

{
    "tourist": [
        {
            "name": "San Francisco Zoo"
        },
        {
            "name": "Flamingo"
        }
    ]
}

Filters

Functions can be applied to results as filters. Dgraph supports both AND and OR filters. The syntax is of the form A || B, or A && B. You can add round brackets to build more complex filters, for example (A || B) && (C && (D || E)).

In this query, we are getting film names which contain either both "indiana" and "jones" OR both "jurassic" and "park".

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film @filter(allof("type.object.name.en", "jones indiana") || allof("type.object.name.en", "jurassic park"))  {
      _uid_
      type.object.name.en
    }
  }
}
{
  "me": [
    {
      "film.director.film": [
        {
          "_uid_": "0xc17b416e58b32bb",
          "type.object.name.en": "Indiana Jones and the Temple of Doom"
        },
        {
          "_uid_": "0x22e65757df0c94d2",
          "type.object.name.en": "Jurassic Park"
        },
        {
          "_uid_": "0x7d0807a6740c25dc",
          "type.object.name.en": "Indiana Jones and the Kingdom of the Crystal Skull"
        },
        {
          "_uid_": "0x8f2485e4242cbe6e",
          "type.object.name.en": "The Lost World: Jurassic Park"
        },
        {
          "_uid_": "0xa4c4cc65751e98e7",
          "type.object.name.en": "Indiana Jones and the Last Crusade"
        },
        {
          "_uid_": "0xd1c161bed9769cbc",
          "type.object.name.en": "Indiana Jones and the Raiders of the Lost Ark"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}
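
When filters are assembled programmatically, small helpers can keep the bracketing explicit. The Python sketch below builds the filter expression used in the query above as a plain string; func, or_, and_ and group are hypothetical helpers, not part of any Dgraph client.

```python
def func(name, predicate, arg):
    """Render a function call such as allof("pred", "terms")."""
    return '%s("%s", "%s")' % (name, predicate, arg)


def or_(*exprs):
    """Join sub-expressions with ||."""
    return " || ".join(exprs)


def and_(*exprs):
    """Join sub-expressions with &&."""
    return " && ".join(exprs)


def group(expr):
    """Wrap a sub-expression in round brackets."""
    return "(" + expr + ")"


NAME = "type.object.name.en"
expr = or_(func("allof", NAME, "jones indiana"),
           func("allof", NAME, "jurassic park"))
print("@filter(%s)" % expr)

# The nested form mentioned above: (A || B) && (C && (D || E))
print(and_(group(or_("A", "B")), group(and_("C", group(or_("D", "E"))))))
```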

Sorting

We can sort results by a predicate using the order or orderdesc argument. The predicate has to be indexed and this has to be specified in the schema. As you may expect, order sorts in ascending order while orderdesc sorts in descending order.

For example, we can sort the films of Steven Spielberg by their release date, in ascending order.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film(order: film.film.initial_release_date) {
      type.object.name.en
      film.film.initial_release_date
    }
  }
}
{
  "me": [
    {
      "film.director.film": [
        {
          "film.film.initial_release_date": "1964-03-23",
          "type.object.name.en": "Firelight"
        },
        {
          "film.film.initial_release_date": "1966-12-31",
          "type.object.name.en": "Slipstream"
        },
        {
          "film.film.initial_release_date": "1968-12-17",
          "type.object.name.en": "Amblin"
        },
        ...
        ...
        ...
        {
          "film.film.initial_release_date": "2012-10-07",
          "type.object.name.en": "Lincoln"
        },
        {
          "film.film.initial_release_date": "2015-10-15",
          "type.object.name.en": "Bridge of Spies"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}

If you use orderdesc instead, the films will be listed in descending order.

{
  me(_xid_: m.06pj8) {
    type.object.name.en
    film.director.film(orderdesc: film.film.initial_release_date, first: 2) {
      type.object.name.en
      film.film.initial_release_date
    }
  }
}

Here is the output.

{
  "me": [
    {
      "film.director.film": [
        {
          "film.film.initial_release_date": "2015-10-15",
          "type.object.name.en": "Bridge of Spies"
        },
        {
          "film.film.initial_release_date": "2012-10-07",
          "type.object.name.en": "Lincoln"
        }
      ],
      "type.object.name.en": "Steven Spielberg"
    }
  ]
}
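
The interaction between orderdesc and first can be modelled as "sort, then slice". The Python sketch below uses a few films from the output above with shortened keys (name and date are stand-ins for the full predicate names); order is a hypothetical helper. Dates in YYYY-MM-DD form sort correctly as plain strings.

```python
films = [
    {"name": "Lincoln", "date": "2012-10-07"},
    {"name": "Firelight", "date": "1964-03-23"},
    {"name": "Bridge of Spies", "date": "2015-10-15"},
    {"name": "Amblin", "date": "1968-12-17"},
]


def order(results, key, desc=False, first=None):
    """Sort by key (ascending unless desc), then optionally keep the first N."""
    ranked = sorted(results, key=lambda r: r[key], reverse=desc)
    return ranked if first is None else ranked[:first]


print([f["name"] for f in order(films, "date")])
print([f["name"] for f in order(films, "date", desc=True, first=2)])
```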

Schema

The schema is used to specify the scalar types of the predicates and which fields constitute an object type. It is used for type checking, result validation and type coercion.

Scalar Types

Scalar types are specified with the scalar keyword.

Dgraph Type   Go type
int           int32
float         float
string        string
bool          bool
id            string
date          time.Time (only day, month, year fields are valid. It could be of the form yyyy-mm-dd or yyyy-mm or yyyy)
datetime      time.Time (RFC3339 format [Optional timezone] eg: 2006-01-02T15:04:05.999999999+10:00 or 2006-01-02T15:04:05.999999999)
geo           go-geom[4]
uid           uint64

Note: The uid type is used to denote objects, though internally it is a uint64.

To declare a field age as int, include the line scalar age: int in the schema file.

Object Types

Object types in the schema are defined using the type keyword. For example, to declare an object type person, we add the following snippet to the schema file. All objects are of the uid type, which denotes a uint64.

type person {
  name: string
  age: int
  strength: float
  profession: string
  friends: uid
  relatives: uid
}

An object can have scalar fields, which contain values, and object fields, which link to other nodes in the graph. In the above declaration, name, age, strength and profession are scalar fields which hold values of the specified types, while friends and relatives are object fields (of the person object type) which link to other nodes in the graph.

A node can have zero or more entities linked through each object field (in this example, friends and relatives).

Schema File

A sample schema file would look as follows:

scalar (
  age:int
  address: string
)
type  Person {
  name: string
  age: int
  address: string
  friends: uid
}
type Actor {
  name  : string
  films: uid 
}
type Film {
  name: string
  budget: int
}

A schema file is passed to the server during invocation through the --schema flag. Some points to remember about the schema system are:

  • A given field can have only one type throughout the schema (both inside and outside object types). For example, age, which is declared both using scalar and inside the Person object, must be int in both places.
  • Scalar fields inside the object types are also considered global scalars and need not be explicitly declared globally. In the above example, name is automatically inferred as string type.
  • Mutations only check the scalar types (inside objects and global explicit definition). For example, in the given schema, any mutation that sets age would be checked for being a valid integer, any mutation that sets name would be checked for being a valid string (though name is not globally declared as a string scalar, it would be inferred from the object types).
  • The returned fields are of types specified in the schema (given they were specified, otherwise they would be strings).
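
The type check performed on mutations, and the fallback to strings for undeclared fields, can be sketched as a lookup followed by a conversion. The Python code below models only a few scalar types from the sample schema and does not check the int32 range; SCHEMA, CONVERTERS and check_value are hypothetical names for this illustration.

```python
# Scalar types from the sample schema above; only a few converters are modelled.
SCHEMA = {"age": "int", "address": "string", "name": "string", "budget": "int"}
CONVERTERS = {"int": int, "float": float, "string": str}


def check_value(predicate, raw):
    """Convert a raw string to the schema type, or raise ValueError if invalid.

    Predicates missing from the schema are treated as strings.
    """
    type_name = SCHEMA.get(predicate, "string")
    return CONVERTERS[type_name](raw)


print(check_value("age", "15"))
print(check_value("nickname", "Al"))
```

A value such as "fifteen" for age raises ValueError in this model, which corresponds to a mutation being rejected as an invalid integer.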

Indexing

The @index keyword at the end of a scalar field declaration in the schema file specifies that the predicate should be indexed. For example, if we want to index some fields, we should have a schema file similar to the one below.

scalar (
  name: string @index
  age: int @index
  address: string @index
  dateofbirth: date @index
  health: float @index
  location: geo @index
  timeafterbirth: datetime @index
)

All the scalar types except the uid type can be indexed in Dgraph.

Reverse Edges

Each graph edge is unidirectional: it points from one node to another. Often, you want to access data in both directions, forward and backward. Instead of having to send edges in both directions, you can use the @reverse keyword at the end of a uid (entity) field declaration in the schema file. This specifies that the reverse edge should be generated automatically. For example, if we want to add a reverse edge for the film.film.directed_by predicate, we should have a schema file as follows.

scalar (
  type.object.name.en: string @index
  film.film.directed_by: uid @reverse
)

This adds a reverse edge for each film.film.directed_by edge. The reverse edge can be accessed by prefixing the original predicate with ~, i.e. ~film.film.directed_by.

In the following example, we find films that are directed by Steven Spielberg, by using the reverse edges of film.film.directed_by. Here is the sample query:

query {
  me(_xid_: m.06pj8) {
    type.object.name.en
    ~film.film.directed_by(first: 5) {
      type.object.name.en
    }
  }
}

The results are:

{
  "me":[
    {
      "type.object.name.en":"Steven Spielberg",
      "~film.film.directed_by":[
        {
          "type.object.name.en":"Indiana Jones and the Temple of Doom"
        },
        {
          "type.object.name.en":"Jaws"
        },
        {
          "type.object.name.en":"Saving Private Ryan"
        },
        {
          "type.object.name.en":"Close Encounters of the Third Kind"
        },
        {
          "type.object.name.en":"Catch Me If You Can"
        }
      ]
    }
  ]
}
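Conceptually, a reverse edge is an extra entry stored under the ~ predicate on the target node. The Python sketch below illustrates that idea with an in-memory adjacency map. The node identifiers and the function name are made up for the example, and the sketch does not describe Dgraph's storage layout.

```python
from collections import defaultdict

# Forward edges as (subject, predicate, object) triples; identifiers are made up.
EDGES = [
    ("jaws", "film.film.directed_by", "spielberg"),
    ("saving_private_ryan", "film.film.directed_by", "spielberg"),
    ("some_other_film", "film.film.directed_by", "some_other_director"),
]

# Predicates declared with @reverse in the (hypothetical) schema.
REVERSED = {"film.film.directed_by"}


def build_adjacency(edges, reversed_predicates):
    """Store every forward edge, plus a ~predicate edge where @reverse applies."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in edges:
        adjacency[(subject, predicate)].append(obj)
        if predicate in reversed_predicates:
            adjacency[(obj, "~" + predicate)].append(subject)
    return adjacency


adjacency = build_adjacency(EDGES, REVERSED)
# Films reachable from the director node through the reverse edge.
print(adjacency.get(("spielberg", "~film.film.directed_by"), []))
```

Only the forward triples are supplied as input; the ~film.film.directed_by entries are derived from them, which mirrors what the @reverse keyword asks the server to do.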

RDF Types

RDF types can also be used to specify the type of values. They can be attached to the values using the ^^ separator.

mutation {
 set {
  _:a <name> "Alice" .
  _:a <age> "15"^^<xs:int> .
  _:a <health> "99.99"^^<xs:float> .
 }
}

This means that name is stored as a string (the default), age as an int, and health as a float.

Note: The RDF type overrides the schema type when both are present. If both the RDF type and the schema type are missing, the value is assumed to be a string.

Supported

The following table lists all the supported RDF types and the corresponding internal type format in which the data is stored.

Storage Type                                  Dgraph type
<xs:string>                                   String
<xs:dateTime>                                 DateTime
<xs:date>                                     Date
<xs:int>                                      Int32
<xs:boolean>                                  Bool
<xs:double>                                   Float
<xs:float>                                    Float
<geo:geojson>                                 Geo
<http://www.w3.org/2001/XMLSchema#string>     String
<http://www.w3.org/2001/XMLSchema#dateTime>   DateTime
<http://www.w3.org/2001/XMLSchema#date>       Date
<http://www.w3.org/2001/XMLSchema#int>        Int32
<http://www.w3.org/2001/XMLSchema#boolean>    Bool
<http://www.w3.org/2001/XMLSchema#double>     Float
<http://www.w3.org/2001/XMLSchema#float>      Float

If a predicate's schema type differs from its storage type, convertibility between the two is checked during mutation, and an error is thrown if they are incompatible.
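As a rough illustration of how a typed literal maps to a Dgraph type, the Python sketch below parses the value and the optional ^^ suffix, then looks the RDF type up in the table above. The function names, the regular expression, and the simplified conversion rules are assumptions for the example, not Dgraph's parser.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Short forms from the table above; full XMLSchema URIs are normalised to these.
RDF_TO_DGRAPH = {
    "xs:string": "String", "xs:dateTime": "DateTime", "xs:date": "Date",
    "xs:int": "Int32", "xs:boolean": "Bool", "xs:double": "Float",
    "xs:float": "Float", "geo:geojson": "Geo",
}

LITERAL = re.compile(r'^"(.*)"(?:\^\^<([^>]+)>)?$')


def resolve_type(literal, schema_type=None):
    """Return (raw value, Dgraph type); schema_type is a Dgraph type name or None."""
    match = LITERAL.match(literal.strip())
    if match is None:
        raise ValueError(f"not a quoted literal: {literal!r}")
    value, rdf_type = match.groups()
    if rdf_type is None:
        return value, schema_type or "String"  # both missing: string
    if rdf_type.startswith(XSD):
        rdf_type = "xs:" + rdf_type[len(XSD):]
    if rdf_type not in RDF_TO_DGRAPH:
        raise ValueError(f"unsupported RDF type: {rdf_type}")
    return value, RDF_TO_DGRAPH[rdf_type]  # RDF type wins over schema type


def convert(value, dgraph_type):
    """Convert a few of the types; raises ValueError when the value does not fit."""
    if dgraph_type == "Int32":
        number = int(value)
        if not -2**31 <= number < 2**31:
            raise ValueError(f"{value} does not fit in 32 bits")
        return number
    if dgraph_type == "Float":
        return float(value)
    return value  # other types are left as strings in this sketch


value, kind = resolve_type('"15"^^<xs:int>')
print(kind, convert(value, kind))
```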

Debug root

Using debug as the root attribute of a query retrieves the _uid_ attribute for all entities, along with latency information. Any other root attribute returns only the fields that were explicitly requested, in accordance with the GraphQL specification.

Query with debug as root

{
  debug(_xid_: m.07bwr) {
    type.object.name.en
  }
}

Returns _uid_ and server_latency

{
  "debug": [
    {
      "_uid_": "0xff4c6752867d137d",
      "type.object.name.en": "The Big Lebowski"
    }
  ],
  "server_latency": {
    "json": "29.149µs",
    "parsing": "129.713µs",
    "processing": "500.276µs",
    "total": "661.374µs"
  }
}

Query with any other root attribute

{
  me(_xid_: m.07bwr) {
    type.object.name.en
  }
}

Returns

{
  "me": [
    {
      "type.object.name.en": "The Big Lebowski"
    }
  ]
}
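A client could prepare such a query and pick the debug information out of the response roughly as follows. This is a sketch that assumes the server listens on localhost:8080/query, as in the curl note near the top of this page. The helper names are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

DEBUG_QUERY = "{ debug(_xid_: m.07bwr) { type.object.name.en } }"

# Response copied from the example above.
SAMPLE_RESPONSE = """
{
  "debug": [
    {"_uid_": "0xff4c6752867d137d", "type.object.name.en": "The Big Lebowski"}
  ],
  "server_latency": {
    "json": "29.149µs", "parsing": "129.713µs",
    "processing": "500.276µs", "total": "661.374µs"
  }
}
"""


def build_query_request(query, host="http://localhost:8080"):
    """Build (but do not send) a POST request carrying the query as the body."""
    return urllib.request.Request(
        host + "/query", data=query.encode("utf-8"), method="POST"
    )


def split_debug_response(body):
    """Return (list of _uid_ values, latency dict or None) from a response body."""
    data = json.loads(body)
    latency = data.pop("server_latency", None)
    uids = [node["_uid_"] for node in data.get("debug", []) if "_uid_" in node]
    return uids, latency


request = build_query_request(DEBUG_QUERY)
uids, latency = split_debug_response(SAMPLE_RESPONSE)
print(request.get_method(), request.full_url)
print(uids, latency["total"])
```

To actually run the query, the request object could be passed to urllib.request.urlopen against a running server.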

Fragments

The fragment keyword allows you to define new fragments that can be referenced in a query, as per the GraphQL specification[5]. If multiple parts of a query request the same set of fields, you can define a fragment once and refer to it multiple times instead. Fragments can be nested inside fragments, but no cycles are allowed. Here is one contrived example.

query {
  debug(_xid_: m.07bwr) {
    type.object.name.en
    ...TestFrag
  }
}
fragment TestFrag {
  film.film.initial_release_date
  ...TestFragB
}
fragment TestFragB {
  film.film.country
}
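Fragment expansion with the no-cycles rule can be sketched as a small recursive function. The Python below is a simplified model in which a selection is a flat list of field names and fragment references are strings starting with three dots. The data layout and the function name are assumptions for the example, not the server's query parser.

```python
# Fragment bodies from the example above, as flat lists of fields and references.
FRAGMENTS = {
    "TestFrag": ["film.film.initial_release_date", "...TestFragB"],
    "TestFragB": ["film.film.country"],
}


def expand(selection, fragments, active=()):
    """Replace ...Name references with the fragment's fields, rejecting cycles."""
    fields = []
    for item in selection:
        if not item.startswith("..."):
            fields.append(item)
            continue
        name = item[3:]
        if name in active:
            # The fragment is already being expanded further up the call chain.
            raise ValueError("fragment cycle: " + " -> ".join(active + (name,)))
        if name not in fragments:
            raise KeyError(f"undefined fragment: {name}")
        fields.extend(expand(fragments[name], fragments, active + (name,)))
    return fields


root_selection = ["type.object.name.en", "...TestFrag"]
print(expand(root_selection, FRAGMENTS))
```

The active tuple records the fragments currently being expanded, so a fragment that refers back to itself, directly or through another fragment, is reported instead of recursing forever.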

Variables

Variables can be defined and used in GraphQL queries. Passing a separate variable map helps with query reuse and avoids costly string building in clients at runtime. A variable starts with a $ symbol. For complete information on variables, please check out the GraphQL specification on variables[6]. We encode the variables as a separate JSON object, as shown in the example below.

{
 "query": "query test($a: int, $b: int){  me(_xid_: m.06pj8) {film.director.film (first: $a, offset: $b) {film.film.genre(first: $a) { type.object.name.en }}}}",
 "variables" : {
  "$a": "5",
  "$b": "10"
 }
}

The type of a variable can be suffixed with a ! to enforce that the variable must have a value. The value of the variable must also be parsable to the given type; if it is not, an error is thrown. Any variable that is used must be declared in the named query clause at the beginning. Default values for variables are also supported. Example:

{
 "query": "query test($a: int = 2, $b: int! = 3){  me(_xid_: m.06pj8) {film.director.film (first: $a, offset: $b) {film.film.genre(first: $a) { type.object.name.en }}}}",
 "variables" : {
  "$a": "5"
 }
}

If the variable is initialized in the variable map, the default value is overridden (in the example, $a will be 5 and $b will be 3).

The variable types that are supported as of now are: int, float, bool and string.
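The sketch below shows, in Python, how a client might build the JSON body shown above and how the default and ! rules could be applied when resolving values. It is a simplified model under stated assumptions: the declared variables are written out by hand rather than parsed from the query text, and the function names are hypothetical.

```python
import json

QUERY = (
    "query test($a: int = 2, $b: int! = 3){ me(_xid_: m.06pj8) "
    "{film.director.film (first: $a, offset: $b) { type.object.name.en }}}"
)


def build_payload(query, variables):
    """Encode the query and a variable map; values are sent as strings."""
    encoded = {"$" + name: str(value) for name, value in variables.items()}
    return json.dumps({"query": query, "variables": encoded})


def parse_bool(raw):
    if raw not in ("true", "false"):
        raise ValueError(f"{raw!r} is not a valid bool")
    return raw == "true"


PARSERS = {"int": int, "float": float, "string": str, "bool": parse_bool}


def resolve(declared, supplied):
    """declared maps name -> (type, required, default); supplied is the variable map."""
    resolved = {}
    for name, (kind, required, default) in declared.items():
        raw = supplied.get(name, default)  # a supplied value overrides the default
        if raw is None:
            if required:
                raise ValueError(f"{name} must have a value")
            continue
        resolved[name] = PARSERS[kind](raw)  # raises ValueError if not parsable
    return resolved


# Hand-written equivalent of the declarations in QUERY.
DECLARED = {"$a": ("int", False, "2"), "$b": ("int", True, "3")}

payload = build_payload(QUERY, {"a": 5})
values = resolve(DECLARED, json.loads(payload)["variables"])
print(values)
```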

Note: In the GraphiQL interface, the query and the variables have to be entered separately in their respective boxes.

Resources

Legacy Test Queries

  1. GraphQL Working Draft (syntax guide)
  2. RDF N-Quad format
  3. https://github.com/dgraph-io/benchmarks/blob/master/data/sf.tourism.gz
  4. https://github.com/twpayne/go-geom
  5. https://facebook.github.io/graphql/#sec-Language.Fragments
  6. https://facebook.github.io/graphql/#sec-Language.Variables