# Avoiding the N+1 Problem With Dataloaders

A common issue with GraphQL servers is how the resolvers query their data source.
Resolving each field independently can result in a large number of unnecessary database queries or HTTP requests.
Say you want to list a number of people along with the cult each of them belongs to:

```graphql
query {
  persons {
    id
    name
    cult {
      id
      name
    }
  }
}
```

A naive implementation would run the following queries against a SQL database:

```sql
SELECT id, name, cult_id FROM persons;
SELECT id, name FROM cults WHERE id = 1;
SELECT id, name FROM cults WHERE id = 1;
SELECT id, name FROM cults WHERE id = 1;
SELECT id, name FROM cults WHERE id = 1;
SELECT id, name FROM cults WHERE id = 2;
SELECT id, name FROM cults WHERE id = 2;
SELECT id, name FROM cults WHERE id = 2;
-- ...
```

Once the list of persons has been returned, a separate query is run to find the cult of each person, even when many of them share the same cult.
You can see how this could quickly become a problem.
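To make the cost concrete, here is a minimal sketch that counts queries for the seven persons in the SQL listing above. `FakeDb` and both counting functions are hypothetical, in-memory stand-ins written only for this illustration; they are not part of Juniper or any database crate.

```rust
use std::cell::Cell;
use std::collections::{HashMap, HashSet};

// Hypothetical in-memory stand-in for the `cults` table that counts queries.
struct FakeDb {
    cults: HashMap<i32, String>,
    queries: Cell<usize>,
}

impl FakeDb {
    fn new() -> Self {
        let mut cults = HashMap::new();
        cults.insert(1, "Cult One".to_string());
        cults.insert(2, "Cult Two".to_string());
        FakeDb { cults, queries: Cell::new(0) }
    }

    // Mimics `SELECT id, name FROM cults WHERE id = $1`: one query per call.
    fn cult_by_id(&self, id: i32) -> Option<String> {
        self.queries.set(self.queries.get() + 1);
        self.cults.get(&id).cloned()
    }

    // Mimics `SELECT id, name FROM cults WHERE id = ANY($1)`: one query for many ids.
    fn cults_by_ids(&self, ids: &[i32]) -> HashMap<i32, String> {
        self.queries.set(self.queries.get() + 1);
        ids.iter()
            .filter_map(|id| self.cults.get(id).map(|name| (*id, name.clone())))
            .collect()
    }
}

// N+1 style: every person triggers its own cult lookup.
fn naive_query_count(person_cult_ids: &[i32]) -> usize {
    let db = FakeDb::new();
    for id in person_cult_ids {
        let _ = db.cult_by_id(*id);
    }
    db.queries.get()
}

// Batched style: deduplicate the ids, then fetch them all in one query.
fn batched_query_count(person_cult_ids: &[i32]) -> usize {
    let db = FakeDb::new();
    let unique: HashSet<i32> = person_cult_ids.iter().copied().collect();
    let ids: Vec<i32> = unique.into_iter().collect();
    let by_id = db.cults_by_ids(&ids);
    for id in person_cult_ids {
        // Each person is resolved from the in-memory map, not the database.
        let _ = by_id.get(id);
    }
    db.queries.get()
}

fn main() {
    let person_cult_ids = [1, 1, 1, 1, 2, 2, 2];
    println!("naive cult queries: {}", naive_query_count(&person_cult_ids));
    println!("batched cult queries: {}", batched_query_count(&person_cult_ids));
}
```

The naive version issues one cult query per person, while the batched version issues a single query no matter how many persons are listed.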

A common solution to this is to introduce a **dataloader**.
This can be done with Juniper using the crate [cksac/dataloader-rs](https://github.com/cksac/dataloader-rs), which has two types of dataloaders: cached and non-cached.

### Cached Loader

DataLoader provides a memoization cache: after `.load()` is called once with a given key, the resulting value is cached to eliminate redundant loads.

DataLoader caching does not replace Redis, Memcache, or any other shared application-level cache. DataLoader is first and foremost a data loading mechanism, and its cache only serves to avoid loading the same data repeatedly within a single request to your application. [(read more)](https://github.com/graphql/dataloader#caching)
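The memoization idea can be sketched in a few lines of plain Rust. `MemoLoader` below is a hypothetical, synchronous toy written for this page. It is not how the `dataloader` crate is implemented, and it leaves out batching entirely; it only shows that a repeated key does not reach the backing store a second time.

```rust
use std::collections::HashMap;

// Hypothetical memoizing loader; illustrates caching only, not batching.
struct MemoLoader<F: Fn(i32) -> String> {
    fetch: F,                    // stand-in for the real data source
    cache: HashMap<i32, String>, // values already loaded by this loader
    fetches: usize,              // how many times the data source was hit
}

impl<F: Fn(i32) -> String> MemoLoader<F> {
    fn new(fetch: F) -> Self {
        MemoLoader { fetch, cache: HashMap::new(), fetches: 0 }
    }

    fn load(&mut self, key: i32) -> String {
        // Serve repeated keys from the cache.
        if let Some(value) = self.cache.get(&key) {
            return value.clone();
        }
        // First time this key is seen: hit the data source and remember the result.
        self.fetches += 1;
        let value = (self.fetch)(key);
        self.cache.insert(key, value.clone());
        value
    }
}

fn main() {
    let mut loader = MemoLoader::new(|id| format!("cult-{}", id));
    loader.load(1);
    loader.load(1); // served from the cache
    loader.load(2);
    println!("data source hits: {}", loader.fetches);
}
```

Because the cache lives inside the loader value, dropping the loader at the end of a request also drops everything it cached.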

### What does it look like?

!FILENAME Cargo.toml

```toml
[dependencies]
actix-identity = "0.4.0-beta.2"
actix-rt = "1.0"
actix-web = {version = "2.0", features = []}
juniper = { git = "https://github.com/graphql-rust/juniper" }
futures = "0.3"
postgres = "0.15.2"
dataloader = "0.12.0"
async-trait = "0.1.30"
```

```rust, ignore
// For the memoizing variant, import `dataloader::cached::Loader` instead.
use async_trait::async_trait;
use dataloader::non_cached::Loader;
use dataloader::BatchFn;
use std::collections::HashMap;
use postgres::{Connection, TlsMode};
use std::env;

pub fn get_db_conn() -> Connection {
    let pg_connection_string = env::var("DATABASE_URI").expect("need a db uri");
    println!("Connecting to {}", pg_connection_string);
    let conn = Connection::connect(&pg_connection_string[..], TlsMode::None).unwrap();
    println!("Connection is fine");
    conn
}

#[derive(Debug, Clone)]
pub struct Cult {
  pub id: i32,
  pub name: String,
}

pub fn get_cult_by_ids(hashmap: &mut HashMap<i32, Cult>, ids: Vec<i32>) {
  let conn = get_db_conn();
  for row in &conn
    .query("SELECT id, name FROM cults WHERE id = ANY($1)", &[&ids])
    .unwrap()
  {
    let cult = Cult {
      id: row.get(0),
      name: row.get(1),
    };
    hashmap.insert(cult.id, cult);
  }
}

pub struct CultBatcher;

#[async_trait]
impl BatchFn<i32, Cult> for CultBatcher {

    // A HashMap is returned so that each requested key can be mapped back to its Cult.
    async fn load(&self, keys: &[i32]) -> HashMap<i32, Cult> {
        println!("load cult batch {:?}", keys);
        let mut cult_hashmap = HashMap::new();
        get_cult_by_ids(&mut cult_hashmap, keys.to_vec());
        cult_hashmap
    }
}

pub type CultLoader = Loader<i32, Cult, CultBatcher>;

// To create a new loader
pub fn get_loader() -> CultLoader {
    Loader::new(CultBatcher)
      // Usually a DataLoader will coalesce all individual loads which occur 
      // within a single frame of execution before calling your batch function with all requested keys.
      // However sometimes this behavior is not desirable or optimal. 
      // Perhaps you expect requests to be spread out over a few subsequent ticks
      // See: https://github.com/cksac/dataloader-rs/issues/12 
      // More info: https://github.com/graphql/dataloader#batch-scheduling 
      // A larger yield count will allow more requests to append to batch but will wait longer before actual load.
      .with_yield_count(100)
}

#[juniper::graphql_object(Context = Context)]
impl Cult {
  //  your resolvers

  // To call the dataloader 
  pub async fn cult_by_id(ctx: &Context, id: i32) -> Cult {
    ctx.cult_loader.load(id).await
  }
}

```

### How do I call them?

Once created, a dataloader exposes the async functions `.load()` and `.load_many()`.
In the above example, `cult_loader.load(id: i32).await` returns `Cult`. If we had used `cult_loader.load_many(Vec<i32>).await`, it would have returned `Vec<Cult>`.


### Where do I create my dataloaders?

**Dataloaders** should be created per-request to avoid bugs where one user is able to load cached or batched data belonging to another user, or data outside of their authenticated scope.
Creating dataloaders within individual resolvers will prevent batching from occurring and will nullify the benefits of the dataloader.

For example:

_When you declare your context_
```rust, ignore
use juniper;

#[derive(Clone)]
pub struct Context {
    pub cult_loader: CultLoader,
}

impl juniper::Context for Context {}

impl Context {
    pub fn new(cult_loader: CultLoader) -> Self {
        Self {
            cult_loader
        }
    }
}
```

_Your handler for GraphQL (Note: instantiating context here keeps it per-request)_
```rust, ignore
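use std::sync::Arc;

use actix_web::{error, web, Error, HttpResponse};
use juniper::http::GraphQLRequest;

pub async fn graphql(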
    st: web::Data<Arc<Schema>>,
    data: web::Json<GraphQLRequest>,
) -> Result<HttpResponse, Error> {

    // Context setup
    let cult_loader = get_loader();
    let ctx = Context::new(cult_loader);

    // Execute
    let res = data.execute(&st, &ctx).await; 
    let json = serde_json::to_string(&res).map_err(error::ErrorInternalServerError)?;

    Ok(HttpResponse::Ok()
        .content_type("application/json")
        .body(json))
}
```
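The sketch below illustrates why a loader created inside each resolver cannot batch. `QueueLoader` is a hypothetical, synchronous toy with an explicit `dispatch` step. The real crate schedules batches asynchronously, so treat this only as a model of the counting argument.

```rust
use std::collections::HashMap;

// Hypothetical loader that queues keys and resolves them in one batch call.
struct QueueLoader {
    pending: Vec<i32>,
    batch_calls: usize,
}

impl QueueLoader {
    fn new() -> Self {
        QueueLoader { pending: Vec::new(), batch_calls: 0 }
    }

    // A resolver registers the key it needs; duplicates are ignored.
    fn enqueue(&mut self, key: i32) {
        if !self.pending.contains(&key) {
            self.pending.push(key);
        }
    }

    // Runs the batch function once over everything queued so far.
    fn dispatch(&mut self) -> HashMap<i32, String> {
        if self.pending.is_empty() {
            return HashMap::new();
        }
        self.batch_calls += 1;
        self.pending
            .drain(..)
            .map(|id| (id, format!("cult-{}", id)))
            .collect()
    }
}

// One loader shared by all resolvers of a request: keys coalesce into one batch.
fn shared_loader_batches(keys: &[i32]) -> usize {
    let mut loader = QueueLoader::new();
    for &key in keys {
        loader.enqueue(key);
    }
    let _ = loader.dispatch();
    loader.batch_calls
}

// A fresh loader inside every resolver: each one only ever sees a single key.
fn per_resolver_batches(keys: &[i32]) -> usize {
    let mut total = 0;
    for &key in keys {
        let mut loader = QueueLoader::new();
        loader.enqueue(key);
        let _ = loader.dispatch();
        total += loader.batch_calls;
    }
    total
}

fn main() {
    let keys = [1, 1, 2, 2, 3];
    println!("shared loader batches: {}", shared_loader_batches(&keys));
    println!("per-resolver loader batches: {}", per_resolver_batches(&keys));
}
```

With a shared loader the five lookups collapse into a single batch. With a loader per resolver there are five batches of one key each, which is the N+1 pattern again.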

### Further Example:

For a full example using dataloaders and a context, check out [jayy-lmao/rust-graphql-docker](https://github.com/jayy-lmao/rust-graphql-docker).