Skip to content

User Guide

django-pgbulk comes with the following functions:

Below we show examples and advanced functionality.

Using pgbulk.upsert

pgbulk.upsert allows for updating or inserting rows atomically and returning results based on inserts or updates. Update fields, returned values, and ignoring unchanged rows can be configured.

See the Postgres INSERT docs for more information on how ON CONFLICT works.

A basic bulk upsert on a model

import pgbulk

pgbulk.upsert(
    MyModel,
    [
        MyModel(int_field=1, some_attr="some_val1"),
        MyModel(int_field=2, some_attr="some_val2"),
    ],
    # These are the fields that identify the uniqueness constraint.
    ["int_field"],
    # These are the fields that will be updated if the row already
    # exists. If not provided, all fields will be updated
    ["some_attr"]
)

Return the results of an upsert

results = pgbulk.upsert(
    MyModel,
    [
        MyModel(int_field=1, some_attr="some_val1"),
        MyModel(int_field=2, some_attr="some_val2"),
    ],
    ["int_field"],
    ["some_attr"],
    # `True` will return all columns. One can also explicitly
    # list which columns will be returned
    returning=True
)

# Print which results were created
print(results.created)

# Print which results were updated.
# By default, if an update results in no changes, it will not
# be updated and will not be returned.
print(results.updated)

Use an expression for updates

In this example, we increment some_int_field by one whenever an update happens. Otherwise it defaults to zero:

pgbulk.upsert(
    MyModel,
    [
        MyModel(some_int_field=0, some_key="a"),
        MyModel(some_int_field=0, some_key="b")
    ],
    ["some_key"],
    [
        # Use UpdateField to specify an expression for the update.
        pgbulk.UpdateField(
            "some_int_field",
            expression=models.F("some_int_field") + 1
        )
    ],
)

Ignore updates to unchanged rows

pgbulk.upsert(
    MyModel,
    [
        MyModel(int_field=1, some_attr="some_val1"),
        MyModel(int_field=2, some_attr="some_val2"),
    ],
    ["int_field"],
    ["some_attr"],
    ignore_unchanged=True
)

Warning

Triggers and auto-generated fields not in the update won't be applied. Unchanged rows also won't be returned if using returning=True.

Using pgbulk.update

pgbulk.update issues updates to multiple rows with an UPDATE SET ... FROM VALUES statement. Update fields, returned values, and ignoring unchanged rows can be configured.

Update an attribute of multiple models in bulk

import pgbulk

pgbulk.update(
    MyModel,
    [
        MyModel(id=1, some_attr='some_val1'),
        MyModel(id=2, some_attr='some_val2')
    ],
    # These are the fields that will be updated. If not provided,
    # all fields will be updated
    ['some_attr']
)

Use an expression in an update

In the example, we increment some_int_field by one:

pgbulk.update(
    MyModel,
    [
        MyModel(some_int_field=0, some_key="a"),
        MyModel(some_int_field=0, some_key="b")
    ],
    [
        # Use UpdateField to specify an expression for the update.
        pgbulk.UpdateField(
            "some_int_field",
            expression=models.F("some_int_field") + 1
        )
    ],
)

Return the results of an update

results = pgbulk.upsert(
    MyModel,
    [
        MyModel(int_field=1, some_attr="some_val1"),
        MyModel(int_field=2, some_attr="some_val2"),
    ],
    ["int_field", "some_attr"],
    # `True` will return all columns. One can also explicitly
    # list which columns will be returned.
    returning=True
)

# Results can be accessed as a tuple
print(results[0].int_field)

Ignore updates to unchanged rows

pgbulk.upsert(
    MyModel,
    [
        MyModel(int_field=1, some_attr="some_val1"),
        MyModel(int_field=2, some_attr="some_val2"),
    ],
    ["int_field"],
    ignore_unchanged=True
)

Warning

Triggers and auto-generated fields not in the update won't be applied. Unchanged rows also won't be returned if using returning=True.

Using pgbulk.copy

Using pgbulk.copy issues a COPY ... FROM STDIN statement to insert rows, which can be substantially faster than bulk INSERT statement or Django's bulk_create. Unlike bulk_create, pgbulk.copy cannot return inserted results.

Note

pgbulk.copy is not only available when using psycopg2.

Inserting Rows

import pgbulk

pgbulk.copy(
    models.TestModel,
    [
        models.TestModel(int_field=5, float_field=1),
        models.TestModel(int_field=6, float_field=2),
        models.TestModel(int_field=7, float_field=3),
    ],
)

Inserting Specific Columns

Specify columns or use exclude to configure which columns are copied:

pgbulk.copy(
    models.TestModel,
    [
        models.TestModel(int_field=5, float_field=1),
        models.TestModel(int_field=6, float_field=2),
        models.TestModel(int_field=7, float_field=3),
    ],
    ["int_field"]  # Only copy the int_field
)
pgbulk.copy(
    ...,
    exclude=["generated_field"]  # Exclude only this field
)

Note

Columns that are excluded from the copy must be generated, nullable, or have database defaults.