What You Should Know Before Choosing EdgeDB for a Real Production

July 25, 2023

8 min read

Our company has its own set of rules for introducing new technologies into production. One of the things we do is describe our experience in the internal knowledge base, so that it is easier for other developers in the company to decide whether to implement certain technologies or not. This time we decided to publicly share our experience, because information about EdgeDB on the internet is actually quite scarce.

Some lyrical musings: I have been following the development of EdgeDB since its announcement and I had been thinking about trying this technology out in some project or another. The opportunity presented itself when EdgeDB had already reached version 2.0.

Overall, everything described below relates to this version and the state of EdgeDB at the time of writing the article, but I have updated some points in connection with the release of the 3.0.

Reasons for choosing EdgeDB

We needed a service (or rather an MVP of a service) that would store certain sets of related data and provide granular access to that data by other services (the backend-to-backend model). The service did not require any special business logic for working with this data, at least not in the first iteration. So, we needed a technology stack that would allow us to describe the stored data conveniently and flexibly, and also conveniently provide data to clients (which are other services), with the ability to dynamically change which data fields will be provided.

We ultimately chose EdgeDB because:

1. It allows us to conveniently describe the data structure and has a built-in migration mechanism. In general, document-oriented databases (such as MongoDB) could have been suitable here, but we wanted strict schema description.

2. Provided out of the box is GraphQL API, which ensures the necessary flexibility in data querying for service consumers.

3. EdgeDB is based on PostgreSQL, which theoretically makes it easier to switch to plain SQL if we ever need to abandon EdgeDB. For us, this also meant that our existing data backup and recovery, metrics collection, etc. would still work. In other words, there was no need to do everything specifically for the new DBMS.

In the classic version of things, the stack for such a project might look like the following:

1. PostgreSQL as the main database.

2. FastAPI for implementing the backend and API.

3. SQLAlchemy + Alembic for schema description and migrations.

In our case, however, EdgeDB allowed us to create the service without writing any backend code at all, which means that only the first item from the above list was required. EdgeDB also provides a WebUI for managing data, which relieves us of any need to implement our own admin panel. Moreover, there is also a simple web interface for GraphQL.

All in all, this saves resources and time that would have had to be spent on development and implementation.

In the end, indeed, using EdgeDB allowed us to quickly build the required service, but we also identified many drawbacks during the process that may lead one to reflect on the feasibility of using this technology for a specific project. I would like to share these observations more widely.

What to Consider Before Choosing EdgeDB

Here are some points to consider when working with EdgeDB for your own projects and requirements. These are not necessarily negatives, but some things to keep in mind.

[Initially, I had more objections, but with the release of the 3.0 version of EgdeDB some of them became irrelevant or needed to be rechecked, and eventually I deleted them.]

1. Information is lacking on the internet

There is really very little information available, which means that you will spend a significant amount of time and effort to perform tasks or solve problems during both development and operation. Almost all you will have at your disposal is the official documentation and the source code (we had to seek answers there as well). And the more complex your project is, the more you will be venturing into the unknown as you search for answers.

As a plus, it is worth noting that the documentation, guides, and examples are very well made.

2. The migration functionality is very limited

Though you may well find it sufficient, if you have experience with other systems (like Alembic or Django ORM), you might be surprised that:

There is no rollback functionality—yes, you can only move forward or restore from a backup.
Migration files cannot have custom names—I always prefer to use names with meanings for database schema change files, but here only numeric names are enforced for some reason.
There is no offline migration creation functionality—you need to have a database connection in order to generate a migration file.
There are no ready alternatives—if you hit the limits of the standard functionality, there will be nothing to replace it with.

3. Modifying data using the GraphQL API is available to everyone without authentication

The embedded GraphQL, firstly, does not support any authentication at all, and secondly, it provides an ability to perform mutations (data changes) that cannot be disabled. It is also impossible to restrict which types of data (models, tables) can be viewed and which cannot. There is hardly any indication of these issues in the documentation.

We solved this problem by using EdgeDB's Access Policies and Globals mechanism, and by means of this, we implemented a token system with granular access. However, this is no longer a trivial task and it may require a separate article. By the way, the documentation hints that this is exactly what should be done, but it does not contain any examples and this section of the documentation is clearly outdated (since version 2.0 has already been released):

In the documentation for EgdeDB 3.0 this block was deleted.

4. GraphQL API and Globals

If you are using Globals and you need to account for them in the GraphQL API, then the only way to pass them is to use the globals special field. An example from the documentation:

{
	"query": "query getMovie($title: String!) { Movie(filter: {title:{eq: $title}}) { id title }}",
	"variables": { "title": "The Batman" },
	"globals": {
		"default::current_user": "04e52807-6835-4eaa-999b-952804ab40a5"
	}
}

The problem here is that this is not a standard field for GraphQL, and ready-made libraries for working with GraphQL simply will not allow you to add it.

The consumers of our service had to give up their favorite libraries and form a raw http request.

I think it would be possible to extend the variables field to work with Globals, something like:

{
	"query": "query getMovie($title: String!) { Movie(filter: {title:{eq: $title}}) { id title }}",
	"variables": {
	    "title": "The Batman",
	    "__globals__": {
                    "default::current_user": "04e52807-6835-4eaa-999b-952804ab40a5"
            }
}

5. Non-informative errors when writing Access Policies

When describing Access Policies, errors may occur, but error messages provide little guidance on what needs to be done, and there is no information on the internet. I have encountered situations where it took a long time to get a rule to work, let alone migration. Sometimes I just wanted to give up trying to use Access Policies.
Apologies, I don't have any examples here, just emotions.

6. No Data Nesting

As it seems to me, EdgeDB aims to be as simple as using document-oriented databases in many ways, but there are places where the SQL nature of EdgeDB is clearly visible, and one is the lack of data nesting. From document-oriented databases, you can expect simplicity in storing a document in a document and obvious manipulation of them, for example, deleting all nested documents when deleting a parent.

With EdgeDB, nesting can be described by link fields, but it is important to correctly configure the properties of these fields. Otherwise, there is a risk that your database will have a lot of orphaned data, and you will need your own additional mechanism to clean it up. The complexity here is that you need to know how to configure the fields and be careful not to accidentally miss any potential errors during code review. These actions are roughly proportional to those that would have to be done for regular SQL systems. EdgeDB does not simplify these aspects, but only makes them slightly more visual.

Here I would expect EdgeDB to have some syntactic sugar that would allow describing a “typical” nested data structure, while under the hood automatically configuring all fields correctly.

Here are two similar examples with the “right” settings, in order to save you from having to painstakingly gather them from documentation or trial and error.

First way:

type Token {
  required link scope -> Scope {
    # Protection against link substitution after record creation:
    readonly := true;
    # One-to-one link:
    constraint exclusive;
    # Automatically deletes the linked Scope when Token is deleted:
    on source delete delete target;
  };
  # ...
}
 
type Scope {
  link token := .<scope[is Token];  # just backlink (optional)
  # ...
}

Second way:

type Application {
  link owner_contacts := .<application[is OwnerContacts];  # just backlink (optional)
  # ...
};
 
type OwnerContacts {
  required link application -> Application {
    # Protection against link substitution after creation:
    readonly := true;
    # One-to-one relationship:
    constraint exclusive;
    # Automatically deletes the linked OwnerContacts when deleting Application:
    on target delete delete source;
  };
  # ...
};

In the first case, there is a possibility to enforce the presence of Scope for Token at the schema level. The second approach is more flexible if there is a lot of related data, but it cannot guarantee that there will be a related record (OwnerContacts) for the parent record (Application).

7. Unexpected UnsupportedFeatureError

Sometimes I read the documentation, describe a schema, or write queries, doing everything pretty logically, and then edgedb says, “You did everything correctly, but I don’t support that right now”. I then had to search for and create workarounds that clearly seemed worse.

As an example of this, the Upserts pattern suggested in the documentation does not work if inheritance with an exclusive field is used, even if the base class is abstract:

abstract type Animal {
  required property name {
    readonly := true;
    constraint exclusive;
  }
}
 
type Dog extending Animal {
  property breed -> str;
}

Yet there is not a single word about this in the documentation. By the way, the same problem occurs if you do upserts by id, but on Github there is already an issue regarding this.

I did not document all such cases, but a series of such UnsupportedFeature led to the fact that it was necessary to completely abandon the use of inheritance (mixins) for data schemas and duplicate a lot of monotonous code multiple times in our service.

8. Transactions and Locks

Transactions can be managed manually and even the isolation levels familiar to SQL are supported, but locks cannot be managed and there is no analogue for SELECT ... FOR UPDATE.

9. No ORM/ODM for your programming language

I think this will be a significant drawback for some, though for me it is OK to write raw queries.

10. EdgeQL requires training

For simple tasks, EdgeQL is really more concise than SQL, but as the complexity increases, one will need to spend more and more time digging into the documentation in an attempt to understand how to form a query. If you have a good level of experience in SQL, it may often seem that it is easier to write in SQL than to understand how to shift everything to EdgeSQL.

11. One gigabyte of memory is needed for running in the cloud

We deployed EdgeDB in GKE and the EdgeDB instance (Pod) looks at PostgreSQL running in GCP, but the EdgeDB instance needs 1G of memory to work (requests.memory: 1Gi). Otherwise, it simply will not start and will crash with an error. It is unclear why it requires so much memory, and yet this requirement cannot be overcome in any way.

12. GraphQL is a schema option, not a server option

It's rather strange that enabling GraphQL functionality is performed at the level of data schema description and migration application. My own expectation is that this would be a server option and enabled by some -enable—graphql flag at startup or through the server configuration mechanism.

Incidentally, the same applies to the Edge over HTTP extension. Why is this not a server option?

13. EdgeDB utility

I have no complaints here. This is a really neat thing, as by using it, working with EdgeDB locally is made as convenient as possible. I would like to see similar mechanisms for other databases.

Instead of the conclusion

In general, we were able to launch the MVP of our service using the capabilities and functionality of EdgeDB as planned without writing backend code. I do not rule out that in the next iteration, we might abandon it in favor of a more classic stack, but this will be determined by, among other things, new requirements for our service.

In any case, I continue to closely follow EdgeDB’s development, and I wish the project every success.

Follow us:

Follow us:

What You Should Know Before Choosing EdgeDB for a Real Production

Reasons for choosing EdgeDB

What to Consider Before Choosing EdgeDB

Instead of the conclusion