Rails provides a convenient .excluding method for identifying a group of items while omitting another set.

There may be a drawback that you might not notice when using this method.

When using .excluding ActiveRecord will execute 2 database queries, one to get the id or items that are to be excluded, and the second to perform the main query.

Imagine we have a Post model and a table of posts, for simplicity a Post only has an id and a published boolean attribute. The table would look like this:

id published
1 true
2 true
3 false
4 true
5 false

If we want to get all the Posts excluding the published Posts we could use the excluding method, resulting in 2 queries being executed

published_posts = Post.where(published: true)
# Generated SQL for the query
# => SELECT "posts".* FROM "posts" WHERE "posts"."published" = true

unpublished_posts = Post.all.excluding(published_posts)
# **Executes** published_posts query
# => #<ActiveRecord::Relation [#<Post id: 1, published: true>, #<Post id: 2, published: true>, #<Post id: 4, published: true>]>

# Injects the result into the main query
# => SELECT "posts".* FROM "posts" WHERE "posts"."id" NOT IN (1, 2, 4)

# **Executes** the unpublished_posts query

For large or complex queries, this double execution may be costly to the applicationโ€™s performance.

Part of the reason that 2 queries are executed, is that the .excluding method calls .flatten on the collection, which will trigger the first database query.

To optimise for fewer database queries, we can switch back to the examples that .excluding was introduced to remove.

We can use the .where.not chain to essentially inject the query from the first part (published_posts) into the second part of the query (unpublished_posts).


published_posts = Post.where(published: true)
# Generated SQL for the query
# => SELECT "posts".* FROM "posts" WHERE "posts"."published" = true

# SQL generated but not executed

unpublished_posts = Post.all.where.not(id: published_posts)
# Injects the SQL from published_posts into the main query
# => SELECT "posts".* FROM "posts" WHERE "posts"."id" NOT IN (SELECT "posts".* FROM "posts" WHERE "posts"."published" = true)

# **Executes** the combined query

While using .where.not isnโ€™t as pleasant to read as .excluding in certain circumstances it can be more performant by only executing a single database query rather than multiple queries.