Add an ADR to keep document history in sync
pezholio committed Nov 28, 2024
1 parent 5a8efc2 commit 3932918
Showing 1 changed file with 64 additions and 0 deletions.
64 changes: 64 additions & 0 deletions docs/adr/0005-keep-document-history-in-sync-with-rabbit-mq.md
@@ -0,0 +1,64 @@
# 5. Keep Document history in sync with RabbitMQ

Date: 2024-11-27

## Status

Accepted

## Context

When Content Blocks created in the Content Block Manager are used in documents, we want to be able to
record when a change to a content block triggers an update to the host document. Currently this works
like so:

* Content block is updated
* We find all documents that use the content block
* Each document is then re-presented to the content store with the updated content block details

This all happens in Publishing API, so there is no record in Whitehall (or any other publishing apps)
of when a change to a document has been triggered by an update to a content block.

To do this, we need to update the Publishing API to record an event when a document has been
republished as a result of a change to a content block. We can then have an endpoint that allows us to
see the events for a particular document.

However, we still need a way to include these events in the history. Whitehall is particularly complex as
the document history is stored in the database and [paginated][1]. This means we can't fetch the events and
weave them into the history, as we don't have the entire history to hand to ensure we add the events to the
right place within the history.

We could send a request to the Publishing API endpoint before we fetch the history and then create
new events, however:

1. This will result in an API call every time a user views a document; and
2. Carrying out an INSERT query on a GET request isn't a pattern we want to encourage

## Decision

With this in mind, we are proposing adding a new message queue consumer in Whitehall. RabbitMQ messages
are already sent by Publishing API when documents are republished, so we can consume the existing
`published_documents` queue. We will listen for events with the `host_content` key, so we only listen
for events triggered by a content object update. When we receive a message, we will:

* Make a call to the `events` endpoint in Publishing API for that Content ID to find the latest
`HostContentUpdateJob` event
* Create a new `EditorialRemark` for the latest live edition for the Whitehall Document with that
Content ID, informing the user that the document was republished by a change to the content block
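
The two steps above could be sketched roughly as follows. This is an illustrative outline only, not the
Whitehall implementation: `HostContentUpdateProcessor`, the `events_for` client method, the event hash
shape (`action`, `created_at`, `payload.source_block.title`) and the `Edition` / `EditorialRemark`
structs are all assumptions made for the example, and the RabbitMQ subscription itself is left out.

```ruby
require "json"

# Hypothetical stand-ins for Whitehall models; the real classes differ.
Edition = Struct.new(:id, :content_id, :state)
EditorialRemark = Struct.new(:edition_id, :body)

# Hypothetical handler for messages received with the `host_content` key.
class HostContentUpdateProcessor
  EVENT_ACTION = "HostContentUpdateJob".freeze

  attr_reader :remarks

  # events_client: any object responding to #events_for(content_id) and
  # returning an array of event hashes (assumed response shape).
  def initialize(events_client:, editions:)
    @events_client = events_client
    @editions = editions
    @remarks = []
  end

  # Returns the created remark, or nil if there was nothing to record.
  def process(raw_payload)
    content_id = JSON.parse(raw_payload).fetch("content_id")

    event = latest_host_content_event(content_id)
    edition = latest_live_edition(content_id)
    return nil if event.nil? || edition.nil?

    title = event.dig("payload", "source_block", "title") || "a content block"
    remark = EditorialRemark.new(edition.id, "Republished after an update to #{title}")
    @remarks << remark
    remark
  end

  private

  # Assumes ISO 8601 UTC timestamps, which sort correctly as strings.
  def latest_host_content_event(content_id)
    @events_client.events_for(content_id)
      .select { |event| event["action"] == EVENT_ACTION }
      .max_by { |event| event["created_at"] }
  end

  # Treats "published" as the live state and the highest id as the latest edition.
  def latest_live_edition(content_id)
    @editions
      .select { |edition| edition.content_id == content_id && edition.state == "published" }
      .max_by(&:id)
  end
end
```

Keeping the lookup and the remark creation in a plain object like this would let the logic be tested
without a running broker; the queue consumer would only need to pass each message body to `process`.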

The response from the `events` endpoint will include information about the triggering content block.
We did consider sending this information as part of the RabbitMQ message payload, but concluded that
we should keep the message payload as small as possible, minimising bandwidth and reducing complexity
in the Publishing API code.

## Consequences

We will need to set up a RabbitMQ consumer in Whitehall, which will require some minor work on the
ops side of things. It will also mean we will need to consider two-way communication between the
two applications when thinking about the publishing platform architecture.

However, once this is set up, it could open up more two-way communication between Whitehall and
Publishing API in the future, such as feeding back to the user when something has not published
successfully.

[1]: https://github.com/alphagov/whitehall/blob/main/app/models/document/paginated_timeline.rb
