Get Started

New to Dgraph? Here's a 5-step tutorial to get you up and running.

Step 1: Installation

System Installation

You can install the binaries with:

$ curl https://get.dgraph.io -sSf | sh

The script installs Dgraph automatically. It may also prompt you to set some environment variables; set them as instructed. Once done, you can jump straight to Step 2.

Alternative: To reduce the security risk of piping a remote script straight into a shell, download the script, inspect it, and then run it:

$ curl https://get.dgraph.io > /tmp/get.sh
$ vim /tmp/get.sh  # Inspect the script
$ sh /tmp/get.sh   # Execute the script
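As a small addition to the manual inspection above, here is a sketch of an automated sanity check you could run on the downloaded file first. The helper name check_script is hypothetical and not part of Dgraph. It only confirms that the download is non-empty and parses as shell; it does not replace reading the script yourself.

```shell
# Hypothetical helper (not part of Dgraph): basic sanity checks on a
# downloaded installer before running it.
check_script() {
  script="$1"
  # Reject a missing or empty file, for example after a failed download.
  [ -s "$script" ] || { echo "empty or missing: $script" >&2; return 1; }
  # sh -n parses the script without executing any of it.
  sh -n "$script" 2>/dev/null || { echo "syntax errors in: $script" >&2; return 1; }
  echo "ok: $script"
}

# Possible usage, with the same path as in the commands above:
#   check_script /tmp/get.sh && sh /tmp/get.sh
```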

Docker Image Installation

You can pull our Docker image from Docker Hub[1]. From a terminal, run:

$ docker pull dgraph/dgraph

# Map port 8080 in the container to port 8080 on the host:
$ docker run -it -p 8080:8080 dgraph/dgraph

To persist data, mount host directories as volumes for the posting lists (p) and the write-ahead log (w):

# Replace $(pwd) with the absolute path of the directory where you'd like to persist the data.
$ docker run -it -p 8080:8080 -v $(pwd)/p:/dgraph/p -v $(pwd)/w:/dgraph/w dgraph/dgraph
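If you would rather create the two host directories up front and see the resulting command with an absolute path filled in, a sketch along these lines may help. The helper name prepare_data_dirs is hypothetical. It only prints the docker command; it does not run it.

```shell
# Hypothetical helper: create the host directories that back the two
# container paths used above (/dgraph/p and /dgraph/w), then print the
# matching docker command.
prepare_data_dirs() {
  base="$1"
  mkdir -p "$base/p" "$base/w" || return 1
  # Resolve to an absolute path, since docker -v needs one for a host directory.
  abs=$(cd "$base" && pwd) || return 1
  echo "docker run -it -p 8080:8080 -v $abs/p:/dgraph/p -v $abs/w:/dgraph/w dgraph/dgraph"
}

# Possible usage:
#   prepare_data_dirs ~/dgraph
```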

Step 2: Download dataset

Download the goldendata.rdf.gz file and the corresponding schema file, goldendata.schema, and put them both in the ~/dgraph directory. The following commands create the directory if necessary and fetch both files:

$ mkdir -p ~/dgraph
$ cd ~/dgraph
$ wget "https://github.com/dgraph-io/benchmarks/blob/master/data/goldendata.rdf.gz?raw=true" -O goldendata.rdf.gz -q
$ wget "https://github.com/dgraph-io/benchmarks/blob/master/data/goldendata.schema?raw=true" -O goldendata.schema -q
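Because wget runs quietly with -q, a failed or partial download is easy to miss. The following is a minimal sketch of a check you could run afterwards; the helper name verify_download is hypothetical, and it assumes the two file names used in the commands above.

```shell
# Hypothetical helper: confirm both tutorial files arrived intact.
verify_download() {
  dir="${1:-$HOME/dgraph}"
  # gzip -t tests archive integrity without writing the decompressed data.
  gzip -t "$dir/goldendata.rdf.gz" 2>/dev/null || {
    echo "goldendata.rdf.gz is missing or not valid gzip" >&2
    return 1
  }
  # The schema is plain text, so only check that it exists and is non-empty.
  [ -s "$dir/goldendata.schema" ] || {
    echo "goldendata.schema is missing or empty" >&2
    return 1
  }
  echo "downloads look intact"
}

# Possible usage:
#   verify_download ~/dgraph
```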

Step 3: Run Dgraph

Using System Installation

Warning: Before proceeding, ensure that the ICU_DATA environment variable is set and points to the ICU data (as mentioned at the end of the Dgraph installation in Step 1). To check, run echo $ICU_DATA and then ls $ICU_DATA.
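The same check can be wrapped in a small function that fails loudly when something is off. This is only a sketch; the name check_icu_data is hypothetical, and it verifies that the path exists, not that it contains valid ICU data.

```shell
# Hypothetical helper: fail early if ICU_DATA is unset or points nowhere.
check_icu_data() {
  if [ -z "${ICU_DATA:-}" ]; then
    echo "ICU_DATA is not set" >&2
    return 1
  fi
  if [ ! -e "$ICU_DATA" ]; then
    echo "ICU_DATA points to a missing path: $ICU_DATA" >&2
    return 1
  fi
  echo "ICU_DATA=$ICU_DATA"
}

# Possible usage, before starting the server:
#   check_icu_data && dgraph --schema goldendata.schema
```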

Follow these commands to run Dgraph:

$ cd ~/dgraph # The directory where you downloaded the rdf.gz and schema files.
$ dgraph --schema goldendata.schema

Using Docker

# Running dgraph image with the directories mounted.
$ docker run -it -p 8080:8080 -v $(pwd)/p:/dgraph/p -v $(pwd)/w:/dgraph/w dgraph/dgraph dgraph -schema goldendata.schema
Tip: The Dgraph server listens on port 8080 and writes its log output to the terminal.


Step 4: Load dataset

In another terminal, load the golden dataset downloaded above as follows:

$ cd ~/dgraph # The directory where you downloaded the rdf.gz and schema files.
$ dgraphloader -r goldendata.rdf.gz
...
Processing goldendata.rdf.gz
Number of mutations run   : 1121
Number of RDFs processed  : 1120879
Time spent                : MMmSS.FFFFFFFFs
RDFs processed per second : XXXXX
$
Tip: Your counts should match the ones above, but your timing statistics will vary.
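To compare the count automatically, you could save the loader summary to a file and extract the number from it. The sketch below assumes the summary lines look exactly like the sample output above; the helper names rdfs_processed and check_load, and the file name load.log, are hypothetical.

```shell
# Hypothetical helper: read loader output on stdin and print the value of
# the "Number of RDFs processed" line.
rdfs_processed() {
  # Split each line on ":" and strip whitespace from the value.
  awk -F: '/^Number of RDFs processed/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical helper: succeed only if the count matches the sample output.
check_load() {
  expected=1120879   # count shown in the sample output above
  actual=$(rdfs_processed)
  [ "$actual" = "$expected" ]
}

# Possible usage, assuming the summary was saved to load.log:
#   check_load < load.log && echo "load complete"
```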


Step 5: Query Dgraph

Movies by Steven Spielberg

Let's find all the entities named "Steven Spielberg" and the movies they directed.

curl localhost:8080/query -XPOST -d '{
  director(allof("type.object.name.en", "steven spielberg")) {
    type.object.name.en
    film.director.film (orderdesc: film.film.initial_release_date) {
      type.object.name.en
      film.film.initial_release_date
    }
  }
}
' | python -m json.tool | less

This returns all the movies by the popular director Steven Spielberg, sorted by release date in descending order. The query also returns two other entities that have "Steven Spielberg" in their names.

Tip: You can use either python or python3 here.
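If you want to try the same query for other directors, one option is to generate the query text from a small shell function and pipe it to curl. This is a sketch only; the name director_query is hypothetical, and it does no escaping, so it assumes the name contains no double quotes.

```shell
# Hypothetical helper: print the director query used above for a given
# name. The second argument selects the sort keyword and defaults to
# orderdesc.
director_query() {
  name="$1"
  order="${2:-orderdesc}"   # "orderdesc" or "order", as used in this tutorial
  cat <<EOF
{
  director(allof("type.object.name.en", "$name")) {
    type.object.name.en
    film.director.film ($order: film.film.initial_release_date) {
      type.object.name.en
      film.film.initial_release_date
    }
  }
}
EOF
}

# Possible usage (curl reads the request body from stdin with @-):
#   director_query "steven spielberg" \
#     | curl localhost:8080/query -XPOST --data-binary @- \
#     | python -m json.tool | less
```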


Released after August 1984

Now, let's do some filtering. This time we'll retrieve only the movies released in or after August 1984 (the geq filter is inclusive). We'll also sort in increasing order by using order instead of orderdesc.

curl localhost:8080/query -XPOST -d '{
  director(allof("type.object.name.en", "steven spielberg")) {
    type.object.name.en
    film.director.film (order: film.film.initial_release_date) @filter(geq("film.film.initial_release_date", "1984-08")) {
      type.object.name.en
      film.film.initial_release_date
    }
  }
}
' | python -m json.tool | less

Released in 1990s

We'll now combine two conditions with AND (&&) to find only the movies released in the 1990s.

curl localhost:8080/query -XPOST -d '{
  director(allof("type.object.name.en", "steven spielberg")) {
    type.object.name.en
    film.director.film (order: film.film.initial_release_date) @filter(geq("film.film.initial_release_date", "1990") && leq("film.film.initial_release_date", "2000")) {
      type.object.name.en
      film.film.initial_release_date
    }
  }
}
' | python -m json.tool | less
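The two filtered queries above differ only in their @filter clause. As an illustration, the clause could be produced by a small function that takes a lower bound and an optional upper bound. The name release_filter is hypothetical; the output simply mirrors the filter syntax shown in this tutorial.

```shell
# Hypothetical helper: print an @filter clause on the release date.
# With one argument it emits a lower bound only; with two it joins a
# lower and an upper bound with &&.
release_filter() {
  from="$1"
  to="${2:-}"
  pred="film.film.initial_release_date"
  if [ -n "$to" ]; then
    printf '@filter(geq("%s", "%s") && leq("%s", "%s"))\n' "$pred" "$from" "$pred" "$to"
  else
    printf '@filter(geq("%s", "%s"))\n' "$pred" "$from"
  fi
}

# Possible usage:
#   release_filter 1984-08
#   release_filter 1990 2000
```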

Released since 2016

So far, we've been going from director to films. Now, we'll start with films released since 2016 and find their directors. To make things interesting, we'll retrieve the director's name only if it matches either travis or knight. In addition, we'll alias type.object.name.en to name and film.film.initial_release_date to release, which makes the result easier to read.

curl localhost:8080/query -XPOST -d '{
  films(geq("film.film.initial_release_date", "2016")) {
    name: type.object.name.en
    release: film.film.initial_release_date
    film.film.directed_by @filter(anyof("type.object.name.en", "travis knight")) {
      name: type.object.name.en
    }
  }
}
' | python -m json.tool | less

This should give you an idea of the sort of queries Dgraph can run. To see the whole range of queries available, visit Query Language.

Need Help

References

  1. https://hub.docker.com/r/dgraph/dgraph/