PostgreSQL Indexing: GIN, Django, and Trigram Similarity
PostgreSQL supports a special type of index called GIN – Generalized Inverted Index. This type of index is especially helpful for improving the performance of queries against composite fields.
PostgreSQL supports a special type of index called GIN – Generalized Inverted Index. This type of index is especially helpful for improving the performance of queries against composite fields, particularly for applications such as full-text search or any relatively static data.
GIN and GiST
If you are familiar with GiST indexes in PostgreSQL, then GIN should be familiar. Some of the main performance differences between the two index types follow from the documentation [1]:
- GIN index lookups are about three times faster than GiST
- GIN indexes take about three times longer to build than GiST
- GIN indexes are moderately slower to update than GiST indexes, but about 10 times slower if fast-update support was disabled
- GIN indexes are two-to-three times larger than GiST indexes
In summary, while GIN lookups are generally faster, they can be much slower to build and update than GiST ones and take up more space. This can be an advantageous tradeoff, however, depending on the application. Static data is particularly well suited for GIN.
Django Support
In Django, there is support for GIN and related technology for querying against these indexes called Trigram Similarity. To enable support for this feature, one must use the extension TrigramExtension. In addition, if one is using model fields that are not natively supported by GIN such as character fields – anything that is not jsonb, array, and tsvector – the BtreeGinExtension must also be enabled [2].
These extensions can be placed in a Django migration to enable database support. Once in place, subsequent migration operations can add GIN indexes to fields using the GinIndex module. For example, if one were to index a name field - which should be relatively static in the database – with GIN, it may look like the following (assuming a Django model enclosing the Meta class):
class Meta:
indexes = [
GinIndex(fields=['name']),
]
Note that GIN does not support NULL values, so the column must be null=False in the model definition.
Once in place, one can then efficiently query the data using a trigram similarity annotation:
results = MyModel.objects.annotate(
similarity=TrigramSimilarity('name', ‘Joe’
).filter(similarity__gt=0.3).order_by('-similarity')
Conclusion
In this post, we have only scratched the surface of what GIN indexing can provide in the context of a web application. If you have been considering alternative search technologies but are already using PostgreSQL, you may find that leveraging its native support for text search and indexing beneficial.
The JBS Quick Launch Lab
Free Qualified Assessment
Quantify what it will take to implement your next big idea!
Our assessment session will deliver tangible timelines, costs, high-level requirements, and recommend architectures that will work best. Let JBS prove to you and your team why over 24 years of experience matters.