Our Django application uses South to
manage database migrations for model changes. South seems to be universally
recommended and generally quite capable at its job. But when I went to do a
fresh install of our project,
./manage.py migrate failed with
messages about missing tables or no-such-relation. Where did things go wrong?
The answer, it turns out, is in how South handles multiple apps. The
migrate command will, by default, walk through all
INSTALLED_APPS [1] with models and run
all the migrations for each app in turn.
If the developer next to you has been working on the same installation of this project for the past year, that is almost certainly not the order their migrations ran in. Maybe Accounts got a migration in January, Inventory and Distributors got migrations in March, and Accounts got another migration in June. If that last Accounts migration depended on a field added in March's update to the Distributors table, it will not work when we try to run all the Accounts migrations before any of those for the Distributors.
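To make the failure concrete, here is a minimal sketch in plain Python (no South involved) of what running migrations app by app does to that history. The app names come from the example above; the migration names and the alphabetical app order are assumptions made purely for illustration.

```python
# Hypothetical migration history; migration names are invented.
MIGRATIONS = {
    "accounts": ["0001_initial", "0002_link_to_distributor_region"],
    "distributors": ["0001_initial", "0002_add_region"],
    "inventory": ["0001_initial"],
}

# (app, migration) -> migrations that must already have been applied.
REQUIRES = {
    ("accounts", "0002_link_to_distributor_region"): [
        ("distributors", "0002_add_region"),
    ],
}


def run_app_by_app(migrations, requires):
    """Apply every migration of one app before moving on to the next app."""
    applied = []
    for app in sorted(migrations):  # alphabetical order, for illustration only
        for name in migrations[app]:
            for needed in requires.get((app, name), []):
                if needed not in applied:
                    raise RuntimeError(
                        "%s.%s needs %s.%s, which has not run yet"
                        % (app, name, needed[0], needed[1])
                    )
            applied.append((app, name))
    return applied


try:
    run_app_by_app(MIGRATIONS, REQUIRES)
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)

print(outcome)
```

The second Accounts migration is reached before any Distributors migration has run, so the sketch stops there, much like the fresh install did.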
That second Accounts migration can be made to work; South does have
the ability to declare the dependencies of a migration. But those dependencies aren't added by
schemamigration --auto, so your historical migrations
probably don't have them. (After all, they worked for the migration author
at the time!)
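For reference, a hand-added dependency looks roughly like the sketch below. The depends_on attribute is South's; the app and migration names are hypothetical, and the fallback base class is only there so the snippet can be read and run without South installed.

```python
# Sketch of a hand-edited migration file in the Accounts app.
try:
    from south.v2 import SchemaMigration
except ImportError:
    # Stand-in so the sketch runs without South; not part of a real migration.
    class SchemaMigration(object):
        pass


class Migration(SchemaMigration):
    # Hypothetical names: make sure the Distributors migration that adds
    # the field we rely on has been applied before this one runs.
    depends_on = (
        ("distributors", "0002_add_region"),
    )

    def forwards(self, orm):
        # The schema changes generated by schemamigration would go here.
        pass

    def backwards(self, orm):
        pass
```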
Time out for a second here. If these models are all inter-related, why are
they in different apps? I know
"what is an app?" and
"how many apps
should I have in my project?" were certainly questions I had when I
came to Django, as they have been for many others.
If we turn to Two Scoops of Django for guidance here, they say:
"Each app should be tightly focused on its task. If an app can’t be explained in a single sentence of moderate length, or you need to say 'and' more than once, it probably means the app is too big and should be broken up."
A little later, in the chapter on model design, there's a bold section
heading that reads
"Break Up Apps With Too Many Models". So it's easy
to see why project authors might trend towards making a lot of apps.
We flagged down a Django veteran from the neighboring office and asked them for their take on the subject. They said that the conclusion they've been coming to is that it can be better to have one central app where you put all the models that may be shared by various parts of your project. You can then make more apps, if you find that organization helpful, but have them use the models from the central app. That way there is only one set of migrations for South to deal with, and you don't have to worry about explicitly adding migration dependencies.
The advice in Two Scoops quotes James Bennett, Django core developer
and author of presentations like Reusable Apps.
But if you aren't focusing on
reusable apps — if your apps
all live together in a single project, and you don't intend to use them
separately in your other projects or redistribute them for third-party use —
then it's quite likely they will grow interdependent as they evolve
alongside one another. At that point you may be doing yourself a disservice
if you're operating as if your apps are independent (as tools like South
will assume) when they're really not.
That doesn't mean the advice
break up apps with too many models
doesn't still apply; any unit of organization will get harder to
manage if it gets too big. Managing that is a central responsibility of a
software developer. But perhaps there are other ways of doing that, such as
using multiple modules or sub-packages within a single app. Don't feel like
you have to start your project with half a dozen new apps just because you
Anyway, back to the task at hand. We already have a project with models in a lot of apps. And we already have a lot of migrations. Migrations that aren't currently functional. What next?
We could, through some combination of trial-and-error and detective work, by looking at the individual migrations and the commit history, try to go back and put dependencies on all the old migrations that need them.
Or, if we don't actually have a set of data from the Beginning of Time that needs to be brought up through all these migrations, we could throw them away and start with fresh definitions.
Since our data was already in an up-to-date schema, that second option sounded a lot more attractive. So we set about doing that. It goes something like this:
- Throw away all your existing migration files.
- Remove all migrations for your apps from the migration history,
  - taking care not to remove migrations for any third-party apps you may have installed.
  - Or, if you do zealously wipe the entire migration history, bring those back by running migrate --fake on them.
- Create new initial migrations describing the current state of each of your apps with schemamigration --initial.
  - If some of your apps depend on others, be sure to add depends_on attributes to those initial migrations.
- Run migrate --fake on each of those to get your local migration history back in sync.
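The last two steps of the list above can be sketched as a small helper that spells out the manage.py invocations in order. The helper name and the app list are hypothetical; deleting the old migration files and cleaning the history table are left out because we did those by hand.

```python
def reset_commands(apps):
    """Return the manage.py invocations for the reset, in the order to run them."""
    commands = []
    # First create a fresh initial migration for every app...
    for app in apps:
        commands.append(["./manage.py", "schemamigration", app, "--initial"])
    # ...then record each one as applied without touching the database.
    for app in apps:
        commands.append(["./manage.py", "migrate", app, "--fake"])
    return commands


for argv in reset_commands(["accounts", "distributors"]):
    print(" ".join(argv))
```

Creating all the initial migrations before faking any of them leaves a chance to add depends_on attributes by hand in between.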
Looking around again now that I know what's going on, I find Ben Roberts on
Stack Overflow figured out a good way to reset your
migrations that looks rather less error-prone than what we did;
--delete-ghost-migrations is a better idea than
other ways of revising your migration history. Do it that way.
Except we still had a problem.
We had circular dependencies between our apps — e.g. some model in Accounts
referred to some model in Distributors and some model in Distributors
referred to some model in Accounts — so
schemamigration --initial couldn't create either app in a single step.
We ended up commenting out the model fields which created the circular
dependencies, made the
--initial schemas, un-commented those
fields, ran a second set of
schemamigration --auto on those apps, added
depends_on attributes to that second set of
migrations, and then we were done.
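If you are not sure which apps are tangled up like this, a rough check is to write down which apps refer to which and look for a loop. This is a minimal sketch using a depth-first search; the reference map is hypothetical and filled in by hand, not read from Django or South.

```python
def find_cycle(references):
    """Return one cycle of app names as a list, or None if there is none."""
    visiting = []  # apps on the current search path, in order
    done = set()   # apps fully explored without finding a cycle

    def visit(app):
        if app in done:
            return None
        if app in visiting:
            # We came back to an app already on the path: that is a cycle.
            return visiting[visiting.index(app):] + [app]
        visiting.append(app)
        for other in sorted(references.get(app, ())):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(app)
        return None

    for app in sorted(references):
        cycle = visit(app)
        if cycle:
            return cycle
    return None


# Hypothetical map: app -> apps whose models it refers to.
REFERENCES = {
    "accounts": {"distributors"},
    "distributors": {"accounts"},
    "inventory": {"distributors"},
}

print(find_cycle(REFERENCES))
```

Any app that shows up in a reported cycle is a candidate for the comment-out, --initial, un-comment, --auto routine described above.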
I did a diff of
pg_dump --schema-only on the existing database and
the one produced by our new set of migrations, and it checked out. (Although
that diff was pretty tedious to read, with occasionally reordered fields and
different names for indices and constraints. Is there a better way to compare
the schema of two databases?)
Was that less work than figuring out the dependencies for the entire history? Probably. Still too much work? It felt like it. But I have working migrations now.
What could have saved us some trouble this time?
Well, when South creates a migration, it knows which models are the targets
of new references. Determining if those models are not in the current app
should not be hard. It should, at the very least, throw up a flag at that
point and say
"Hey, you should add some dependencies on this migration
before you commit it!"
Ideally, it'd look at the migration history for the other app and add the dependency for you, depending on the most recently applied migration for the other app.
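A minimal sketch of that idea, assuming the migration history is available as an ordered list of (app, migration) pairs. This is not how South works today; the function name and the sample history are made up for illustration.

```python
def suggest_depends_on(current_app, referenced_apps, history):
    """Suggest a depends_on tuple for a new migration in current_app.

    history is a list of (app, migration) pairs in the order they were
    applied; each referenced app is pinned to its latest applied migration.
    """
    latest = {}
    for app, migration in history:
        latest[app] = migration  # later entries overwrite earlier ones
    return tuple(
        (app, latest[app])
        for app in sorted(set(referenced_apps))
        if app != current_app and app in latest
    )


# Hypothetical history for illustration.
HISTORY = [
    ("accounts", "0001_initial"),
    ("distributors", "0001_initial"),
    ("distributors", "0002_add_region"),
]

print(suggest_depends_on("accounts", ["distributors", "accounts"], HISTORY))
```

References within the same app are skipped, since South already runs an app's own migrations in order.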
Not surprisingly, I am not the first person to think of this: #509: add dependencies
Also relevant: #829: add
how to reset
migrations to documentation.
I guess that means if you see me at a Python sprint or project hack night, we have something to work on!
- The migrate command calls
all_migrations, which gets all apps with models from the django
- Two Scoops of Django: Best Practices for Django 1.5
- This book by Daniel "pydanny" Greenfield and Audrey Roy has generally been helpful for me, as someone who already knows a bit about Python and web servers and databases but isn't quite sure of the best way to go about things in Django. Even if I have been questioning some of the content in this article today, that's been outweighed by the many times I said "Yes! I was wondering how to do that!" when reading the book.
When I asked myself if I should write this post today, I thought
"it won't take as long as the last one!"
The last one was a little over 400 words. This one is over 1300.
Please don't expect me to keep this pace. But then, I'm sure many of you will appreciate it if my posts don't get to tl;dr length. I do get verbose at times. But I do have more topics queued up!