diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 0000000000..e69de29bb2 diff --git a/404.html b/404.html new file mode 100644 index 0000000000..7a96ebeb38 --- /dev/null +++ b/404.html @@ -0,0 +1,949 @@ + + + + + + + + + + + + + + + + + + Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+
+ +

404 - Not found

+ +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/CNAME b/CNAME new file mode 100644 index 0000000000..9fa2072c7f --- /dev/null +++ b/CNAME @@ -0,0 +1 @@ +data-workspace.docs.trade.gov.uk \ No newline at end of file diff --git a/architecture/ADRs/0001/index.html b/architecture/ADRs/0001/index.html new file mode 100644 index 0000000000..ae01586cec --- /dev/null +++ b/architecture/ADRs/0001/index.html @@ -0,0 +1,1103 @@ + + + + + + + + + + + + + + + + + + + + + + 0001: Using a custom proxy - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + + + + + + +

0001: Using a custom proxy

+ +

Context

+

A common question is: why not just use NGINX instead of a custom proxy? The reason is the dynamic routing for the applications, e.g. URLs like https://jupyterlab-abcde1234.mydomain.com/some/path: each one has a number of fairly complex requirements.

+
    +
  • It must redirect to SSO if not authenticated, and redirect back to the URL once authenticated.
  • +
  • It must perform IP filtering that is not applicable to the main application.
  • +
  • It must check that the current user is allowed to access the application, and show a forbidden page if not.
  • +
  • It must start the application if it's not started.
  • +
  • It must show a starting page with countdown if it's starting.
  • +
  • It must detect if an application has started, and route requests to it if it is.
  • +
  • It must route cookies from all responses back to the user. For JupyterLab, the first response contains cookies used in XSRF protection that are never resent in later requests.
  • +
  • It must show an error page if there is an error starting or connecting to the application.
  • +
  • It must allow a refresh of the error page to attempt to start the application again.
  • +
  • It must support WebSockets, without knowing ahead of time which paths are used by WebSockets.
  • +
  • It must support streaming uploads and downloads.
  • +
  • Ideally, there would not be duplicate responsibilities between the proxy and other parts of the system, e.g. the Django application.
  • +
+
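The requirements above can be read as a single decision made for every incoming request. The following is a minimal sketch of that decision using only the Python standard library. It is not the proxy's actual code: the `Action` and `RequestContext` names, the state names, the order of the checks, and the choice to show a forbidden page when IP filtering fails are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    REDIRECT_TO_SSO = auto()
    SHOW_FORBIDDEN_PAGE = auto()
    START_APPLICATION = auto()
    SHOW_STARTING_PAGE = auto()
    SHOW_ERROR_PAGE = auto()
    PROXY_TO_APPLICATION = auto()


@dataclass(frozen=True)
class RequestContext:
    authenticated: bool
    ip_allowed: bool
    user_can_access: bool
    # Hypothetical state names, chosen only for this sketch:
    # "NOT_STARTED", "STARTING", "RUNNING" or "FAILED".
    application_state: str


def decide(ctx: RequestContext) -> Action:
    """Pick what to do with one incoming request (assumed ordering)."""
    if not ctx.authenticated:
        # The caller would remember the original URL and redirect back to it.
        return Action.REDIRECT_TO_SSO
    if not ctx.ip_allowed or not ctx.user_can_access:
        return Action.SHOW_FORBIDDEN_PAGE
    if ctx.application_state == "NOT_STARTED":
        return Action.START_APPLICATION
    if ctx.application_state == "STARTING":
        return Action.SHOW_STARTING_PAGE
    if ctx.application_state == "RUNNING":
        return Action.PROXY_TO_APPLICATION
    # Any other state is treated as an error; refreshing the error page
    # would run this decision again and may start the application.
    return Action.SHOW_ERROR_PAGE
```

Even in this simplified form, every branch needs information that lives outside a static NGINX configuration, which is the point the requirements list is making.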

While it would not be impossible to use NGINX to take over some of the proxy's code, custom code would still be needed, and NGINX would have to communicate with it through some mechanism to achieve all of the above: extra HTTP or Redis requests, or perhaps a custom NGINX module. We suspect this would make things more complex rather than less, and increase the burden on the developer.

+

Decision

+

We will use a custom proxy for Data Workspace, rather than simply using NGINX.

+

Consequences

+

Positive

+
    +
  • +

    This avoids the burden on the developer of custom NGINX modules or extra HTTP or Redis requests, all of which would still have required custom code.

    +
  • +
  • +

    Using the custom proxy lets us meet all of the complex requirements, including the dynamic routing of our applications, with full control over the behaviour.

    +
  • +
+

Negative

+
    +
  • +

    Onboarding new team members is initially harder, as they need to understand these decisions and requirements.

    +
  • +
  • +

    There is an extra network hop compared to not having a proxy.

    +
  • +
+ + + + + + +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/architecture/ADRs/0002/index.html b/architecture/ADRs/0002/index.html new file mode 100644 index 0000000000..75a523010d --- /dev/null +++ b/architecture/ADRs/0002/index.html @@ -0,0 +1,1108 @@ + + + + + + + + + + + + + + + + + + + + + + 0002: Usage of asyncio in proxy - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + + + + + + +

0002: Usage of asyncio in proxy

+ +

Context

+
    +
  • +

    The proxy fits the typical use case of event-loop-based programming: low CPU but high IO requirements, with a potentially high number of connections.

    +
  • +
  • +

    The asyncio library aiohttp provides enough low-level control over the headers and the bytes of requests and responses to work as a controllable proxy. For example, the typical HTTP request cycle can be programmed fairly explicitly.

    +
  • +
  • +

    An incoming request begins: its headers are received.

    +
  • +
  • The proxy makes potentially several requests to the Django application, to Redis, and/or to SSO to authenticate and determine where to route the request.
  • +
  • The incoming request's headers are passed to the application (removing certain hop-by-hop headers).
  • +
  • The incoming request's body is streamed to the application.
  • +
  • The response headers are sent back to the client, combining cookies from the application and from the proxy.
  • +
  • The response body is streamed back to the client.
  • +
+
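Two of the steps above, removing hop-by-hop headers and combining cookies, can be sketched with plain lists of header pairs. This is an illustration using only the standard library, not the proxy's implementation: the set of headers removed is the conventional hop-by-hop set from the HTTP specifications, which may differ from the "certain" headers the proxy actually removes, and the function names are hypothetical.

```python
from typing import List, Tuple

Headers = List[Tuple[str, str]]

# Conventional hop-by-hop headers from the HTTP specifications.
# Assumption: the real proxy may remove a different subset.
HOP_BY_HOP = frozenset({
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
})


def without_hop_by_hop(headers: Headers) -> Headers:
    """Drop hop-by-hop headers, including any named in Connection."""
    named_in_connection = {
        token.strip().lower()
        for name, value in headers
        if name.lower() == "connection"
        for token in value.split(",")
        if token.strip()
    }
    dropped = HOP_BY_HOP | named_in_connection
    return [(name, value) for name, value in headers if name.lower() not in dropped]


def response_headers(application_headers: Headers, proxy_set_cookies: List[str]) -> Headers:
    """Keep the application's headers (and its cookies), then add the proxy's cookies."""
    combined = without_hop_by_hop(application_headers)
    combined.extend(("Set-Cookie", cookie) for cookie in proxy_set_cookies)
    return combined
```

Keeping headers as a list of pairs, rather than a dictionary, preserves repeated `Set-Cookie` headers, which matters because each cookie is sent in its own header.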

The library also allows for receiving and making WebSockets requests. This is done without knowing ahead of time which paths are WebSockets and which are HTTP. This doesn't seem possible with, for example, Django Channels.
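Deciding between WebSockets and HTTP without a list of paths is possible because a WebSockets connection announces itself in the request headers. The sketch below shows that check following the WebSocket opening handshake, using only the standard library. It does not use aiohttp and is not necessarily how the proxy performs the check; `is_websocket_upgrade` is a hypothetical name.

```python
from typing import Mapping


def is_websocket_upgrade(method: str, headers: Mapping[str, str]) -> bool:
    """Return True if the request asks to be upgraded to WebSockets."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # Connection is a comma-separated list, e.g. "keep-alive, Upgrade".
    connection_tokens = {
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    }
    return (
        method.upper() == "GET"
        and "upgrade" in connection_tokens
        and lowered.get("upgrade", "").strip().lower() == "websocket"
    )
```

Because the decision depends only on the headers of each request, the same path can serve ordinary HTTP for one request and WebSockets for the next.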

+

Requests and responses can be of the order of several GBs, so this streaming behaviour is a critical requirement.
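The reason streaming matters for bodies of this size is that each chunk is forwarded as soon as it arrives, so memory use is bounded by the chunk size rather than the body size. A minimal sketch of the idea with asyncio follows. The `stream_body` function and its `write` callback are hypothetical stand-ins for the proxy's upstream and downstream connections.

```python
from typing import AsyncIterator, Awaitable, Callable


async def stream_body(
    chunks: AsyncIterator[bytes],
    write: Callable[[bytes], Awaitable[None]],
) -> int:
    """Forward a body chunk by chunk, returning the number of bytes sent.

    Only one chunk is held at a time. Awaiting write() before reading the
    next chunk means a slow receiver slows down the reading (back-pressure).
    """
    total = 0
    async for chunk in chunks:
        await write(chunk)
        total += len(chunk)
    return total
```

The same function would serve both directions: request bodies streamed to the application, and response bodies streamed back to the client.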

+
    +
  • Django gives a lot of benefits for the main application: for example, it is within the skill set of most available developers. Only a small fraction of changes need to involve the proxy.
  • +
+

Decision

+

We will use the asyncio library aiohttp.

+

Consequences

+

Positive

+
    +
  • +

    Allows for the critical requirement of streaming behaviour.

    +
  • +
  • +

    We can stream HTTP(S) and WebSockets requests in an efficient way with one cohesive Python package.

    +
  • +
+

Negative

+
    +
  • +

    A core bit of infrastructure will depend on a flavour of Python that is unfamiliar even to experienced Python developers.

    +
  • +
  • +

    aiohttp is unable to proxy things that are not HTTP or WebSockets, e.g. SSH. This is why GitLab isn't behind the proxy.

    +
  • +
+ + + + + + +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/architecture/ADRs/index.html b/architecture/ADRs/index.html new file mode 100644 index 0000000000..a116a1cf3b --- /dev/null +++ b/architecture/ADRs/index.html @@ -0,0 +1,985 @@ + + + + + + + + + + + + + + + + + + + + + + Architecture Decision Records - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+ +
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/architecture/application-lifecycle/index.html b/architecture/application-lifecycle/index.html new file mode 100644 index 0000000000..4ba447db3a --- /dev/null +++ b/architecture/application-lifecycle/index.html @@ -0,0 +1,1004 @@ + + + + + + + + + + + + + + + + + + + + + + Application lifecycle - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + + + + +

Application lifecycle

+

As an example, from the point of view of user abcde1234, https://jupyterlab-abcde1234.mydomain.com/ is the fixed address of their private JupyterLab application. Going to https://jupyterlab-abcde1234.mydomain.com/ in a browser will:

+
    +
  • show a starting screen with a countdown;
  • +
  • and when the application is loaded, the page will reload and show the application itself;
  • +
  • and subsequent loads will show the application immediately.
  • +
+

If the application is stopped, then a visit to https://jupyterlab-abcde1234.mydomain.com/ will repeat the process. The user will never leave https://jupyterlab-abcde1234.mydomain.com/. If the user visits https://jupyterlab-abcde1234.mydomain.com/some/path, they will also remain at https://jupyterlab-abcde1234.mydomain.com/some/path to ensure, for example, bookmarks to any in-application page work even if they need to start the application to view them.
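Since the address is fixed per user and per tool, the application can be identified from the hostname alone, while the path is left untouched. A small sketch of that lookup follows, using the example domain from this page. The function name is hypothetical, and the real proxy may derive the host differently.

```python
from typing import Optional
from urllib.parse import urlsplit

ROOT_DOMAIN = "mydomain.com"  # the example domain used on this page


def application_host(url: str, root_domain: str = ROOT_DOMAIN) -> Optional[str]:
    """Return the application's host label, or None for other hosts."""
    hostname = urlsplit(url).hostname or ""
    suffix = "." + root_domain
    if not hostname.endswith(suffix):
        return None
    label = hostname[: -len(suffix)]
    # Only a single label directly under the root domain names an application.
    return label if label and "." not in label else None
```

For example, `application_host("https://jupyterlab-abcde1234.mydomain.com/some/path")` gives `jupyterlab-abcde1234`, regardless of the path.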

+

The browser will only make GET requests during the start of an application. While potentially a small abuse of HTTP, it allows the straightforward behaviour described: no HTML form or JavaScript is required to start an application (although JavaScript is used to show a countdown to the user and to check if an application has loaded), and the GET requests are idempotent.

+

The proxy, however, has more complex behaviour. On an incoming request from the browser for https://jupyterlab-abcde1234.mydomain.com/:

+
    +
  • it will attempt to GET details of an application with the host jupyterlab-abcde1234 from an internal API of the main application;
  • +
  • if the GET returns a 404, it will make a PUT request to the main application that initiates creation of the Fargate task;
  • +
  • if the GET returns a 200, and the details contain a URL, the proxy will attempt to proxy the incoming request to it;
  • +
  • it does not treat errors connecting to a SPAWNING application as a true error: they are effectively swallowed;
  • +
  • if an application is returned from the GET as STOPPED, which happens on error, it will DELETE the application, and show an error to the user.
  • +
+
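The behaviour above can be sketched as a pure function of the internal API's response. This is an illustration rather than the proxy's code: the `state` and `proxy_url` field names, the `Step` names, and showing the starting page when no URL is available yet are assumptions. Only the STOPPED and SPAWNING states and the GET, PUT and DELETE calls come from the description above.

```python
from enum import Enum, auto
from typing import Any, Mapping, Optional


class Step(Enum):
    PUT_APPLICATION = auto()        # initiate creation of the Fargate task
    PROXY_REQUEST = auto()          # forward the incoming request
    SHOW_STARTING_PAGE = auto()     # assumed: no URL to proxy to yet
    DELETE_AND_SHOW_ERROR = auto()  # application came back as STOPPED
    SHOW_ERROR = auto()             # assumed: any unexpected API response


def next_step(get_status: int, details: Optional[Mapping[str, Any]]) -> Step:
    """Choose the proxy's next step from the result of the GET."""
    if get_status == 404:
        return Step.PUT_APPLICATION
    if get_status == 200 and details is not None:
        if details.get("state") == "STOPPED":
            return Step.DELETE_AND_SHOW_ERROR
        if details.get("proxy_url"):
            return Step.PROXY_REQUEST
        return Step.SHOW_STARTING_PAGE
    return Step.SHOW_ERROR


def swallow_connection_error(details: Mapping[str, Any]) -> bool:
    """Connection errors are not true errors while the application is SPAWNING."""
    return details.get("state") == "SPAWNING"
```

Because the function depends only on the current response from the main application, any instance of the proxy can handle any request.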

The proxy itself only responds to incoming requests from the browser, and has no long-lived tasks that go beyond one HTTP request or WebSockets connection. This ensures it can be horizontally scaled.

+ + + + + + +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/architecture/comparison-with-jupyterhub/index.html b/architecture/comparison-with-jupyterhub/index.html new file mode 100644 index 0000000000..b50fcbc7df --- /dev/null +++ b/architecture/comparison-with-jupyterhub/index.html @@ -0,0 +1,1006 @@ + + + + + + + + + + + + + + + + + + + + + + Comparison with JupyterHub - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + + + + +

Comparison with JupyterHub

+

In addition to being able to run any Docker container, not just JupyterLab, Data Workspace has some deliberate architectural features that are different to JupyterHub.

+
    +
  • +

    All state is in the database, accessed by the main Django application.

    +
  • +
  • +

    Specifically, no state is kept in the memory of the main Django application. This means it can be horizontally scaled without issue.

    +
  • +
  • +

    The proxy is also stateless: it fetches how to route requests from the main application, which itself fetches the data from the database. This means it can also be horizontally scaled without issue, and potentially independently of the main application. It also means sticky sessions are not needed, and multiple users could access the same application, which is a planned feature for user-supplied visualisation applications.

    +
  • +
  • +

    Authentication is completely handled by the proxy. Apart from specific exceptions like the healthcheck, non-authenticated requests do not reach the main application.

    +
  • +
  • +

    The launched containers do not make requests to the main application, and the main application does not make requests to the launched containers. This means there are fewer cyclic dependencies in terms of data flow, and that applications don't need to be customised for this environment. They just need to open a port for HTTP requests, which makes them extremely standard web-based Docker applications.

    +
  • +
+

There is a notable exception to the statelessness of the main application: the launch of an application is made up of a sequence of calls to AWS, and is done in a Celery task. If this sequence is interrupted, the launch of the application will fail. This is a solvable problem: the state could be saved into the database and the sequence resumed later. However, since this sequence of calls lasts only a few seconds, and the user will be told of the error and can refresh to try to launch the application again, at this stage of the project this has been deemed unnecessary.

+ + + + + + +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/architecture/components/index.html b/architecture/components/index.html new file mode 100644 index 0000000000..6bb431ad9e --- /dev/null +++ b/architecture/components/index.html @@ -0,0 +1,1131 @@ + + + + + + + + + + + + + + + + + + + + + + Components - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + +
+ + + + + + + +
+ +
+ + + + +
+
+ + + +
+
+
+ + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + + + + +

Components

+ +

Data Workspace is made up of a number of components. This page explains what those are and how they work together.

+

Prerequisites

+

To understand the components of Data Workspace's architecture, you should be familiar with:

+ +

High level architecture

+

At the highest level, users access the Data Workspace application, which accesses a PostgreSQL database.

+
graph
+  A[User] --> B[Data Workspace]
+  B --> C["PostgreSQL (Aurora)"]
+

Medium level architecture

+

The architecture is heavily Docker/ECS Fargate based.

+
graph
+  A[User] -->|Staff SSO| B[Amazon Quicksight];
+  B --> C["PostgreSQL (Aurora)"];
+  A --> |Staff SSO|F["'The Proxy' (aiohttp)"];
+  F --> |rstudio-9c57e86a|G[Per-user and shared tools];
+  F --> H[Shiny, Flask, Django, NGINX];
+  F --> I[Django, Data Explorer];
+  G --> C;
+  H --> C;
+  I --> C;
+
+
+
+

User-facing

+
    +
  • +

    Main application: + A Django application to manage datasets and permissions and to launch containers, a proxy to route requests to those containers, and an NGINX instance to route to the proxy and serve static files.

    +
  • +
  • +

    JupyterLab: + Launched by users of the main application, and populated with credentials in the environment to access certain datasets.

    +
  • +
  • +

    rStudio: + Launched by users of the main application, and populated with credentials in the environment to access certain datasets.

    +
  • +
  • +

    pgAdmin: + Launched by users of the main application, and populated with credentials in the environment to access certain datasets.

    +
  • +
  • +

    File browser: + A single-page-application that offers upload and download of files to/from each user's folder in S3. The data is transferred directly between the user's browser and S3.

    +
  • +
+

Infrastructure

+
    +
  • +

    metrics: + A sidecar-container for the user-launched containers that exposes metrics from the ECS task metadata endpoint in Prometheus format.

    +
  • +
  • +

    s3sync: + A sidecar-container for the user-launched containers that syncs to and from S3 using mobius3. This is to allow file persistence on S3 without using FUSE, which at the time of writing is not possible on Fargate.

    +
  • +
  • +

    dns-rewrite-proxy: + The DNS server of the VPC that launched containers run in. It selectively allows only certain DNS requests through to mitigate the chance of data exfiltration through DNS. When this container is deployed, it changes DHCP settings in the VPC, and will most likely break aspects of user-launched containers.

    +
  • +
  • +

    healthcheck: + Proxies through to the healthcheck endpoint of the main application, so the main application can be in a security group locked-down to certain IP addresses, but still be monitored by Pingdom.

    +
  • +
  • +

    mirrors-sync: + Mirrors pypi, CRAN and (ana)conda repositories to S3, so user-launched JupyterLab and rStudio containers can install packages without having to contact the public internet.

    +
  • +
  • +

    prometheus: + Collects metrics from user-launched containers and re-exposes them through federation.

    +
  • +
  • +

    registry: + A Docker pull-through-cache to repositories in quay.io. This allows the VPC to not have public internet access but still launch containers from quay.io in Fargate.

    +
  • +
  • +

    sentryproxy: + Proxies errors to a Sentry instance: only used by JupyterLab.

    +
  • +
+ + + + + + +
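The idea behind dns-rewrite-proxy in the list above, allowing only certain DNS requests through, can be illustrated with a simple allowlist check. This sketch is not taken from dns-rewrite-proxy and does not reflect its configuration format: the patterns use reserved example domains and are invented, and `is_allowed` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from typing import Sequence

# Invented patterns on reserved example domains, for illustration only.
ALLOWED_PATTERNS = ("*.mirror.example", "registry.example")


def is_allowed(name: str, patterns: Sequence[str] = ALLOWED_PATTERNS) -> bool:
    """Return True if a queried DNS name matches an allowed pattern."""
    # DNS names are case-insensitive and may end with a trailing dot.
    normalised = name.rstrip(".").lower()
    return any(fnmatchcase(normalised, pattern) for pattern in patterns)
```

A lookup such as `secret-data.attacker.test` matches no pattern and would be refused, which is what limits the use of DNS queries as a channel for moving data out of the VPC.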
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/assets/data-workspace-architecture.png b/assets/data-workspace-architecture.png new file mode 100644 index 0000000000..6a151d40c5 Binary files /dev/null and b/assets/data-workspace-architecture.png differ diff --git a/assets/dit-favicon.png b/assets/dit-favicon.png new file mode 100644 index 0000000000..176eb0fa16 Binary files /dev/null and b/assets/dit-favicon.png differ diff --git a/assets/dit-logo.png b/assets/dit-logo.png new file mode 100644 index 0000000000..d05834b0a8 Binary files /dev/null and b/assets/dit-logo.png differ diff --git a/assets/dw-readme-front-page.png b/assets/dw-readme-front-page.png new file mode 100644 index 0000000000..26d3ed487e Binary files /dev/null and b/assets/dw-readme-front-page.png differ diff --git a/assets/images/favicon.png b/assets/images/favicon.png new file mode 100644 index 0000000000..1cf13b9f9d Binary files /dev/null and b/assets/images/favicon.png differ diff --git a/assets/images/govuk-crest-2x.png b/assets/images/govuk-crest-2x.png new file mode 100644 index 0000000000..78e751cc20 Binary files /dev/null and b/assets/images/govuk-crest-2x.png differ diff --git a/assets/images/ogl.png b/assets/images/ogl.png new file mode 100644 index 0000000000..17dc7a4da3 Binary files /dev/null and b/assets/images/ogl.png differ diff --git a/assets/javascripts/bundle.407015b8.min.js b/assets/javascripts/bundle.407015b8.min.js new file mode 100644 index 0000000000..4361bb787e --- /dev/null +++ b/assets/javascripts/bundle.407015b8.min.js @@ -0,0 +1,29 @@ +"use strict";(()=>{var Ri=Object.create;var gr=Object.defineProperty;var ki=Object.getOwnPropertyDescriptor;var Hi=Object.getOwnPropertyNames,Ht=Object.getOwnPropertySymbols,Pi=Object.getPrototypeOf,yr=Object.prototype.hasOwnProperty,on=Object.prototype.propertyIsEnumerable;var nn=(e,t,r)=>t in e?gr(e,t,{enumerable:!0,configurable:!0,writable:!0,value:r}):e[t]=r,P=(e,t)=>{for(var r in 
t||(t={}))yr.call(t,r)&&nn(e,r,t[r]);if(Ht)for(var r of Ht(t))on.call(t,r)&&nn(e,r,t[r]);return e};var an=(e,t)=>{var r={};for(var n in e)yr.call(e,n)&&t.indexOf(n)<0&&(r[n]=e[n]);if(e!=null&&Ht)for(var n of Ht(e))t.indexOf(n)<0&&on.call(e,n)&&(r[n]=e[n]);return r};var Pt=(e,t)=>()=>(t||e((t={exports:{}}).exports,t),t.exports);var $i=(e,t,r,n)=>{if(t&&typeof t=="object"||typeof t=="function")for(let o of Hi(t))!yr.call(e,o)&&o!==r&&gr(e,o,{get:()=>t[o],enumerable:!(n=ki(t,o))||n.enumerable});return e};var yt=(e,t,r)=>(r=e!=null?Ri(Pi(e)):{},$i(t||!e||!e.__esModule?gr(r,"default",{value:e,enumerable:!0}):r,e));var cn=Pt((xr,sn)=>{(function(e,t){typeof xr=="object"&&typeof sn!="undefined"?t():typeof define=="function"&&define.amd?define(t):t()})(xr,function(){"use strict";function e(r){var n=!0,o=!1,i=null,s={text:!0,search:!0,url:!0,tel:!0,email:!0,password:!0,number:!0,date:!0,month:!0,week:!0,time:!0,datetime:!0,"datetime-local":!0};function a(T){return!!(T&&T!==document&&T.nodeName!=="HTML"&&T.nodeName!=="BODY"&&"classList"in T&&"contains"in T.classList)}function c(T){var Qe=T.type,De=T.tagName;return!!(De==="INPUT"&&s[Qe]&&!T.readOnly||De==="TEXTAREA"&&!T.readOnly||T.isContentEditable)}function f(T){T.classList.contains("focus-visible")||(T.classList.add("focus-visible"),T.setAttribute("data-focus-visible-added",""))}function u(T){T.hasAttribute("data-focus-visible-added")&&(T.classList.remove("focus-visible"),T.removeAttribute("data-focus-visible-added"))}function p(T){T.metaKey||T.altKey||T.ctrlKey||(a(r.activeElement)&&f(r.activeElement),n=!0)}function m(T){n=!1}function d(T){a(T.target)&&(n||c(T.target))&&f(T.target)}function h(T){a(T.target)&&(T.target.classList.contains("focus-visible")||T.target.hasAttribute("data-focus-visible-added"))&&(o=!0,window.clearTimeout(i),i=window.setTimeout(function(){o=!1},100),u(T.target))}function v(T){document.visibilityState==="hidden"&&(o&&(n=!0),G())}function 
G(){document.addEventListener("mousemove",N),document.addEventListener("mousedown",N),document.addEventListener("mouseup",N),document.addEventListener("pointermove",N),document.addEventListener("pointerdown",N),document.addEventListener("pointerup",N),document.addEventListener("touchmove",N),document.addEventListener("touchstart",N),document.addEventListener("touchend",N)}function oe(){document.removeEventListener("mousemove",N),document.removeEventListener("mousedown",N),document.removeEventListener("mouseup",N),document.removeEventListener("pointermove",N),document.removeEventListener("pointerdown",N),document.removeEventListener("pointerup",N),document.removeEventListener("touchmove",N),document.removeEventListener("touchstart",N),document.removeEventListener("touchend",N)}function N(T){T.target.nodeName&&T.target.nodeName.toLowerCase()==="html"||(n=!1,oe())}document.addEventListener("keydown",p,!0),document.addEventListener("mousedown",m,!0),document.addEventListener("pointerdown",m,!0),document.addEventListener("touchstart",m,!0),document.addEventListener("visibilitychange",v,!0),G(),r.addEventListener("focus",d,!0),r.addEventListener("blur",h,!0),r.nodeType===Node.DOCUMENT_FRAGMENT_NODE&&r.host?r.host.setAttribute("data-js-focus-visible",""):r.nodeType===Node.DOCUMENT_NODE&&(document.documentElement.classList.add("js-focus-visible"),document.documentElement.setAttribute("data-js-focus-visible",""))}if(typeof window!="undefined"&&typeof document!="undefined"){window.applyFocusVisiblePolyfill=e;var t;try{t=new CustomEvent("focus-visible-polyfill-ready")}catch(r){t=document.createEvent("CustomEvent"),t.initCustomEvent("focus-visible-polyfill-ready",!1,!1,{})}window.dispatchEvent(t)}typeof document!="undefined"&&e(document)})});var fn=Pt(Er=>{(function(e){var t=function(){try{return!!Symbol.iterator}catch(f){return!1}},r=t(),n=function(f){var u={next:function(){var p=f.shift();return{done:p===void 0,value:p}}};return r&&(u[Symbol.iterator]=function(){return 
u}),u},o=function(f){return encodeURIComponent(f).replace(/%20/g,"+")},i=function(f){return decodeURIComponent(String(f).replace(/\+/g," "))},s=function(){var f=function(p){Object.defineProperty(this,"_entries",{writable:!0,value:{}});var m=typeof p;if(m!=="undefined")if(m==="string")p!==""&&this._fromString(p);else if(p instanceof f){var d=this;p.forEach(function(oe,N){d.append(N,oe)})}else if(p!==null&&m==="object")if(Object.prototype.toString.call(p)==="[object Array]")for(var h=0;hd[0]?1:0}),f._entries&&(f._entries={});for(var p=0;p1?i(d[1]):"")}})})(typeof global!="undefined"?global:typeof window!="undefined"?window:typeof self!="undefined"?self:Er);(function(e){var t=function(){try{var o=new e.URL("b","http://a");return o.pathname="c d",o.href==="http://a/c%20d"&&o.searchParams}catch(i){return!1}},r=function(){var o=e.URL,i=function(c,f){typeof c!="string"&&(c=String(c)),f&&typeof f!="string"&&(f=String(f));var u=document,p;if(f&&(e.location===void 0||f!==e.location.href)){f=f.toLowerCase(),u=document.implementation.createHTMLDocument(""),p=u.createElement("base"),p.href=f,u.head.appendChild(p);try{if(p.href.indexOf(f)!==0)throw new Error(p.href)}catch(T){throw new Error("URL unable to set base "+f+" due to "+T)}}var m=u.createElement("a");m.href=c,p&&(u.body.appendChild(m),m.href=m.href);var d=u.createElement("input");if(d.type="url",d.value=c,m.protocol===":"||!/:/.test(m.href)||!d.checkValidity()&&!f)throw new TypeError("Invalid URL");Object.defineProperty(this,"_anchorElement",{value:m});var h=new e.URLSearchParams(this.search),v=!0,G=!0,oe=this;["append","delete","set"].forEach(function(T){var Qe=h[T];h[T]=function(){Qe.apply(h,arguments),v&&(G=!1,oe.search=h.toString(),G=!0)}}),Object.defineProperty(this,"searchParams",{value:h,enumerable:!0});var N=void 
0;Object.defineProperty(this,"_updateSearchParams",{enumerable:!1,configurable:!1,writable:!1,value:function(){this.search!==N&&(N=this.search,G&&(v=!1,this.searchParams._fromString(this.search),v=!0))}})},s=i.prototype,a=function(c){Object.defineProperty(s,c,{get:function(){return this._anchorElement[c]},set:function(f){this._anchorElement[c]=f},enumerable:!0})};["hash","host","hostname","port","protocol"].forEach(function(c){a(c)}),Object.defineProperty(s,"search",{get:function(){return this._anchorElement.search},set:function(c){this._anchorElement.search=c,this._updateSearchParams()},enumerable:!0}),Object.defineProperties(s,{toString:{get:function(){var c=this;return function(){return c.href}}},href:{get:function(){return this._anchorElement.href.replace(/\?$/,"")},set:function(c){this._anchorElement.href=c,this._updateSearchParams()},enumerable:!0},pathname:{get:function(){return this._anchorElement.pathname.replace(/(^\/?)/,"/")},set:function(c){this._anchorElement.pathname=c},enumerable:!0},origin:{get:function(){var c={"http:":80,"https:":443,"ftp:":21}[this._anchorElement.protocol],f=this._anchorElement.port!=c&&this._anchorElement.port!=="";return this._anchorElement.protocol+"//"+this._anchorElement.hostname+(f?":"+this._anchorElement.port:"")},enumerable:!0},password:{get:function(){return""},set:function(c){},enumerable:!0},username:{get:function(){return""},set:function(c){},enumerable:!0}}),i.createObjectURL=function(c){return o.createObjectURL.apply(o,arguments)},i.revokeObjectURL=function(c){return o.revokeObjectURL.apply(o,arguments)},e.URL=i};if(t()||r(),e.location!==void 0&&!("origin"in e.location)){var n=function(){return e.location.protocol+"//"+e.location.hostname+(e.location.port?":"+e.location.port:"")};try{Object.defineProperty(e.location,"origin",{get:n,enumerable:!0})}catch(o){setInterval(function(){e.location.origin=n()},100)}}})(typeof global!="undefined"?global:typeof window!="undefined"?window:typeof self!="undefined"?self:Er)});var 
Kr=Pt((Mt,qr)=>{/*! + * clipboard.js v2.0.11 + * https://clipboardjs.com/ + * + * Licensed MIT © Zeno Rocha + */(function(t,r){typeof Mt=="object"&&typeof qr=="object"?qr.exports=r():typeof define=="function"&&define.amd?define([],r):typeof Mt=="object"?Mt.ClipboardJS=r():t.ClipboardJS=r()})(Mt,function(){return function(){var e={686:function(n,o,i){"use strict";i.d(o,{default:function(){return Ci}});var s=i(279),a=i.n(s),c=i(370),f=i.n(c),u=i(817),p=i.n(u);function m(j){try{return document.execCommand(j)}catch(O){return!1}}var d=function(O){var E=p()(O);return m("cut"),E},h=d;function v(j){var O=document.documentElement.getAttribute("dir")==="rtl",E=document.createElement("textarea");E.style.fontSize="12pt",E.style.border="0",E.style.padding="0",E.style.margin="0",E.style.position="absolute",E.style[O?"right":"left"]="-9999px";var H=window.pageYOffset||document.documentElement.scrollTop;return E.style.top="".concat(H,"px"),E.setAttribute("readonly",""),E.value=j,E}var G=function(O,E){var H=v(O);E.container.appendChild(H);var I=p()(H);return m("copy"),H.remove(),I},oe=function(O){var E=arguments.length>1&&arguments[1]!==void 0?arguments[1]:{container:document.body},H="";return typeof O=="string"?H=G(O,E):O instanceof HTMLInputElement&&!["text","search","url","tel","password"].includes(O==null?void 0:O.type)?H=G(O.value,E):(H=p()(O),m("copy")),H},N=oe;function T(j){return typeof Symbol=="function"&&typeof Symbol.iterator=="symbol"?T=function(E){return typeof E}:T=function(E){return E&&typeof Symbol=="function"&&E.constructor===Symbol&&E!==Symbol.prototype?"symbol":typeof E},T(j)}var Qe=function(){var O=arguments.length>0&&arguments[0]!==void 0?arguments[0]:{},E=O.action,H=E===void 0?"copy":E,I=O.container,q=O.target,Me=O.text;if(H!=="copy"&&H!=="cut")throw new Error('Invalid "action" value, use either "copy" or "cut"');if(q!==void 0)if(q&&T(q)==="object"&&q.nodeType===1){if(H==="copy"&&q.hasAttribute("disabled"))throw new Error('Invalid "target" attribute. 

How to contribute

+

Contributions to Data Workspace are welcome, such as reporting issues, requesting features, making documentation changes, or submitting code changes.

+

Prerequisites

+
    +
  • In all cases a GitHub account is needed to contribute.
  • +
  • To contribute code or documentation, you must have a copy of the Data Workspace source code locally, and have certain tools installed. See Running locally for details of these.
  • +
  • To contribute code, knowledge of Python is required.
  • +
+

Issues

+

Suspected issues with Data Workspace can be submitted at Data Workspace issues. An issue that contains a minimal, reproducible example stands the best chance of being resolved. However, it is understood that this is not possible in all circumstances.

+

Feature requests

+

A feature request can be submitted using the Ideas category in Data Workspace discussions.

+

Documentation

+

The source of the documentation is in the docs/ directory of the source code, and is written using Material for MkDocs.

+

Changes are then submitted via a Pull Request (PR). To do this:

+
    +
  1. +

    Decide on a short hyphen-separated descriptive name for your change, prefixed with docs/, for example docs/add-example.

    +
  2. +
  3. +

    Make a branch using this descriptive name:

    +
    cd data-workspace
    +git checkout -b docs/add-example
    +
    +
  4. +
  5. +

    Make your changes in a text editor.

    +
  6. +
  7. +

    Preview your changes locally:

    +
    pip install -r requirements-docs.txt  # Only needed once
    +mkdocs serve
    +
    +
  8. +
  9. +

    Commit your change and push to your fork. Ideally the commit message will follow the Conventional Commit specification:

    +
    git add docs/getting-started.md  # Repeat for each file changed
    +git commit -m "docs: add an example"
    +git push origin docs/add-example
    +
    +
  10. +
  11. +

    Raise a PR at https://github.com/uktrade/data-workspace/pulls against the master branch in data-workspace.

    +
  12. +
  13. +

    Wait for the PR to be approved and merged, and respond to any questions or suggested changes.

    +
  14. +
+

When the PR is merged, the documentation is deployed automatically to https://data-workspace.docs.trade.gov.uk/.

+

Code

+

Changes are submitted via a Pull Request (PR). To do this:

+
    +
  1. +

    Decide on a short hyphen-separated descriptive name for your change, prefixed with the type of change. For example fix/the-bug-description.

    +
  2. +
  3. +

    Make a branch using this descriptive name:

    +
    git checkout -b fix/the-bug-description
    +
    +
  4. +
  5. +

    Make sure you can run existing tests locally, for example by running:

    +
    make docker-test
    +
    +

    See Running tests for more details on running tests.

    +
  6. +
  7. +

    Make your changes in a text editor. When changing behaviour, this would usually include changing or adding tests within dataworkspace/dataworkspace/tests, and running them.

    +
  8. +
  9. +

    Commit your changes and push to your fork. Ideally the commit message will follow the Conventional Commit specification:

    +
    git add my_file.py  # Repeat for each file changed
    +git commit -m "fix: the bug description"
    +git push origin fix/the-bug-description
    +
    +
  10. +
  11. +

    Raise a PR at https://github.com/uktrade/data-workspace/pulls against the master branch of data-workspace.

    +
  12. +
  13. +

    Wait for the PR to be approved and merged, and respond to any questions or suggested changes.

    +
  14. +
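The naming convention used in the steps above (a change type, then a short hyphen-separated description) can be sketched in Python. This is an illustrative helper only and not part of Data Workspace: the function names and the list of accepted change types are assumptions, and the regular expression covers just the basic `type: description` subject form of Conventional Commits.

```python
import re

# Change types accepted by this sketch. The list is an assumption, not project policy.
ALLOWED_TYPES = ("docs", "fix", "feat", "chore", "refactor", "test")


def parse_commit_subject(subject: str) -> tuple[str, str]:
    """Split a 'type: description' subject into its type and description."""
    match = re.fullmatch(r"(\w+)(?:\([\w-]+\))?!?: (.+)", subject.strip())
    if match is None or match.group(1) not in ALLOWED_TYPES:
        raise ValueError(f"not a recognised Conventional Commit subject: {subject!r}")
    return match.group(1), match.group(2)


def branch_name(subject: str) -> str:
    """Derive a hyphen-separated branch name prefixed with the change type."""
    change_type, description = parse_commit_subject(subject)
    # Lowercase the description and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{change_type}/{slug}"


print(branch_name("fix: the bug description"))
```

Branch names do not have to be derived mechanically from the commit message; a shorter name such as docs/add-example is equally acceptable.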
\ No newline at end of file
diff --git a/data-ingestion/index.html b/data-ingestion/index.html
new file mode 100644
index 0000000000..dff1ae2583
--- /dev/null
+++ b/data-ingestion/index.html
@@ -0,0 +1,1072 @@

Data ingestion

+ +

Data Workspace is essentially an interface to a PostgreSQL database, referred to as the datasets database. Technical users can access specific tables in the datasets database directly, but there is a concept of "datasets" on top of this direct access. Each dataset has its own page in the user-facing data catalogue that has features for non-technical users.

+

Conceptually, there are 3 different types of datasets in Data Workspace: source datasets, reference datasets, and data cuts. Metadata for the 3 dataset types is controlled through a single administration interface, but how data is ingested into these depends on the dataset type.

+

In addition to the structured data exposed in the catalogue, data can be uploaded by users on an ad-hoc basis, treated by Data Workspace as binary blobs.

+

Dataset metadata

+

Data Workspace is a Django application, with a staff-facing administration interface, usually referred to as Django admin. Metadata for each of the 3 types of dataset is managed within Django admin.

+

Source datasets

+

A source dataset is the core Data Workspace dataset type. It is made up of one or more tables in the PostgreSQL datasets database. Typically a source dataset would be updated frequently.

+

However, ingesting into these tables is not handled by the Data Workspace project itself. There are many ways to ingest data into PostgreSQL tables. The Department for Business and Trade uses Airflow to handle ingestion using a combination of Python and SQL code.

+
+

Note

+

The Airflow pipelines used by The Department for Business and Trade to ingest data are not open source. Some parts of Data Workspace relating to this ingestion depend on this closed source code.

+
+

Reference datasets

+

Reference datasets are datasets usually used to classify or contextualise other datasets, and are expected to not change frequently. "UK bank holidays" or "ISO country codes" could be reference datasets.

+

The structure and data of reference datasets can be completely controlled through Django admin.

+

Data cuts

+

Data isn't ingested into data cuts directly. Instead, data cuts are defined by SQL queries entered into Django admin that run dynamically, querying from source and reference datasets. As such they update as frequently as the data they query from updates.

+

A data cut could filter a larger source dataset for a specific country, calculate aggregate statistics, join multiple source datasets together, join a source dataset with a reference dataset, or a combination of these.
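As a rough illustration of this idea, the sketch below defines a data cut as a query that joins a source table to a reference table, filters by country, and aggregates. It uses Python's built-in SQLite module as a stand-in for the PostgreSQL datasets database, and the table and column names are hypothetical. Because the query runs each time it is requested, the result changes as soon as the underlying data does.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL datasets database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source dataset table
    CREATE TABLE export_wins (id INTEGER PRIMARY KEY, country_code TEXT, value_gbp INTEGER);
    -- Hypothetical reference dataset table
    CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT);
    INSERT INTO countries VALUES ('FR', 'France'), ('DE', 'Germany');
    INSERT INTO export_wins (country_code, value_gbp) VALUES ('FR', 100), ('DE', 250), ('FR', 50);
""")

# The data cut itself: a filter, a join, and an aggregate in one query.
DATA_CUT_SQL = """
    SELECT c.name AS country, COUNT(*) AS wins, SUM(w.value_gbp) AS total_value_gbp
    FROM export_wins AS w
    JOIN countries AS c ON c.code = w.country_code
    WHERE c.code = 'FR'
    GROUP BY c.name
"""


def run_data_cut(connection):
    """Run the data cut query against the current state of the tables."""
    return connection.execute(DATA_CUT_SQL).fetchall()


before = run_data_cut(conn)
# New data arrives in the source table; the data cut reflects it on the next run.
conn.execute("INSERT INTO export_wins (country_code, value_gbp) VALUES ('FR', 25)")
after = run_data_cut(conn)
print(before, after)
```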

+

Ad-hoc binary blobs

+

Each user is able to upload binary blobs in ad-hoc cases to their own private prefix in an S3 bucket, as well as to any authorized team prefixes. Read and write access to these prefixes is available through 3 mechanisms:

+
    +
  • +

    Through a custom React-based S3 browser built into the Data Workspace Django application.

    +
  • +
  • +

    From tools using the S3 API or S3 SDKs, for example boto3.

    +
  • +
  • +

    Certain parts of each user's prefix are automatically synced to and from the local filesystem in on-demand tools they launch. This gives users the illusion of a permanent filesystem in their tools, even though the tools are ephemeral.

    +
  • +
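For the second mechanism, a minimal sketch of uploading a file under a prefix with boto3 might look like the following. The bucket name and prefix layout are not specified in this documentation, so any values passed in are assumptions, and the helper names are hypothetical. Credentials are expected to come from the environment in the usual boto3 way.

```python
import os


def object_key(prefix: str, filename: str) -> str:
    """Join a prefix and a file name into an S3 object key."""
    return f"{prefix.strip('/')}/{os.path.basename(filename)}"


def upload_to_prefix(bucket: str, prefix: str, path: str) -> str:
    """Upload a local file under the given prefix and return the key used."""
    import boto3  # imported lazily so object_key works without boto3 installed

    key = object_key(prefix, path)
    boto3.client("s3").upload_file(path, bucket, key)
    return key
```

A call such as `upload_to_prefix("my-bucket", "user/abc123", "results.csv")` would then write to the key `user/abc123/results.csv`, where both the bucket and the prefix are placeholders.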
\ No newline at end of file
diff --git a/deployment/aws/index.html b/deployment/aws/index.html
new file mode 100644
index 0000000000..b00d565a67
--- /dev/null
+++ b/deployment/aws/index.html
@@ -0,0 +1,1113 @@

Deploying to AWS

+

Data Workspace contains code that helps it be deployed using Amazon Web Services (AWS). This page explains how to use this code.

+

Prerequisites

+

To deploy Data Workspace to AWS you must have:

+
    +
  • The source code of Data Workspace cloned to a folder data-workspace. See Running locally for details
  • +
  • An AWS account
  • +
  • Terraform installed
  • +
  • An OAuth 2.0 server for authentication
  • +
+

You should also have familiarity with working on the command line, working with Terraform, and with AWS.

+

Environment folder

+

Each deployment, or environment, of Data Workspace requires a folder for its configuration. This folder should be within a sibling folder to data-workspace.

+

The Data Workspace source code contains a template for this configuration. To create a folder in an appropriate location based on this template:

+
    +
  1. +

    Decide on a meaningful name for the environment. In the following steps, production is used.

    +
  2. +
  3. +

    Ensure you're in the root of the data-workspace folder that contains the cloned Data Workspace source code.

    +
  4. +
  5. +

    Copy the template into a new folder for the environment:

    +
    mkdir -p ../data-workspace-environments
    +cp -Rp infra/environment-template ../data-workspace-environments/production
    +
    +
  6. +
+

This folder structure allows the configuration to find and use the infra/ folder in data-workspace which contains the low level details of the infrastructure to provision in each environment.

+

Initialising environment

+

Before deploying the environment, it must be initialised.

+
    +
  1. +

    Change to the new folder for the environment:

    +
    cd ../data-workspace-environments/production
    +
    +
  2. +
  3. +

    Generate new SSH keys:

    +
    ./create-keys.sh
    +
    +
  4. +
  5. +

    Install AWS CLI and configure an AWS CLI profile. This will support some of the included configuration scripts.

    +

    You can do this by putting credentials directly into ~/.aws/credentials or by using aws sso.

    +
  6. +
  7. +

    Create an S3 bucket and DynamoDB table for Terraform to use, and add them to main.tf. The value passed to --bucket provides the base name for both.

    +
    ./bootstrap-terraform.sh \
    +    --profile <value> \
    +    --bucket <value> \
    +    --region <value>
    +
    +
  8. +
  9. +

    Enter the details of your hosting platform, SSH keys, and OAuth 2.0 server by changing all instances of REPLACE_ME in:

    +
      +
    • admin-environment.json
    • +
    • gitlab-secrets.json
    • +
    • main.tf
    • +
    +
  10. +
  11. +

    Initialise Terraform:

    +
    terraform init
    +
    +
  12. +
+
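Before running terraform init, it can help to confirm that no REPLACE_ME placeholders remain in the three configuration files. The following is a small sketch of such a check, not a script shipped with Data Workspace; the function name is hypothetical, and it assumes the files are plain text in the environment folder.

```python
from pathlib import Path

CONFIG_FILES = ("admin-environment.json", "gitlab-secrets.json", "main.tf")
PLACEHOLDER = "REPLACE_ME"


def find_placeholders(environment_dir):
    """Return (file name, line number) pairs for lines still containing the placeholder."""
    remaining = []
    for name in CONFIG_FILES:
        path = Path(environment_dir) / name
        if not path.exists():
            continue  # a missing file is skipped here rather than treated as an error
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if PLACEHOLDER in line:
                remaining.append((name, number))
    return remaining


if __name__ == "__main__":
    # Run from inside the environment folder, for example production.
    for name, number in find_placeholders("."):
        print(f"{name}:{number} still contains {PLACEHOLDER}")
```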

Deploying environment

+

Check that the environment has been set up correctly by reviewing the changes Terraform plans to make:

+
terraform plan
+
+

If everything looks right, you're ready to deploy:

+
terraform apply
+
\ No newline at end of file
diff --git a/deployment/other-platforms/index.html b/deployment/other-platforms/index.html
new file mode 100644
index 0000000000..655dc6dabd
--- /dev/null
+++ b/deployment/other-platforms/index.html
@@ -0,0 +1,989 @@

Deploying to other platforms

+

It should be possible to deploy to platforms other than Amazon Web Services (AWS), but at the time of writing this hasn't been done. It may involve a significant amount of work.

+

You can start a discussion on how best to approach this.

+ + + + + + +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + + \ No newline at end of file diff --git a/development/assets/pycharm-breakpoint.png b/development/assets/pycharm-breakpoint.png new file mode 100644 index 0000000000..8aed2566b0 Binary files /dev/null and b/development/assets/pycharm-breakpoint.png differ diff --git a/development/assets/pycharm-debug-ouput.png b/development/assets/pycharm-debug-ouput.png new file mode 100644 index 0000000000..ff17a7be96 Binary files /dev/null and b/development/assets/pycharm-debug-ouput.png differ diff --git a/development/assets/pycharm-remote-interpreter.png b/development/assets/pycharm-remote-interpreter.png new file mode 100644 index 0000000000..eac5742396 Binary files /dev/null and b/development/assets/pycharm-remote-interpreter.png differ diff --git a/development/assets/pycharm-start-debugger.png b/development/assets/pycharm-start-debugger.png new file mode 100644 index 0000000000..c1f82651d6 Binary files /dev/null and b/development/assets/pycharm-start-debugger.png differ diff --git a/development/assets/remote-debug-server.png b/development/assets/remote-debug-server.png new file mode 100644 index 0000000000..9cdaf302cf Binary files /dev/null and b/development/assets/remote-debug-server.png differ diff --git a/development/assets/vscode-debugger-output.png b/development/assets/vscode-debugger-output.png new file mode 100644 index 0000000000..630c4f9680 Binary files /dev/null and b/development/assets/vscode-debugger-output.png differ diff --git a/development/assets/vscode-run-debug.png b/development/assets/vscode-run-debug.png new file mode 100644 index 0000000000..506a327439 Binary files /dev/null and b/development/assets/vscode-run-debug.png differ diff --git a/development/database-migrations/index.html b/development/database-migrations/index.html new file mode 100644 index 0000000000..7003b705bf --- /dev/null +++ b/development/database-migrations/index.html @@ -0,0 +1,1038 @@ + + + + + + + + + + + + + + + + + + + + + + Database migrations - Data Workspace 
Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Database migrations

+

Data Workspace's user-facing metadata catalogue uses Django. When developing Data Workspace, if a change is made to Django's models, to reflect this change in the metadata database, migrations must be created and run.

+

Prerequisites

+

To create migrations you must have the Data Workspace prerequisites installed and have cloned its source code. See Running locally for details.

+

Creating migrations

+

After making changes to Django models, to create any required migrations:

+
docker compose build && \
+docker compose run \
+    --user root \
+    --volume=$PWD/dataworkspace:/dataworkspace/ \
+    data-workspace django-admin makemigrations
+
+

The migrations must be committed to the codebase, and will run when Data Workspace is next started.

+

This pattern can be used to run other Django management commands by replacing makemigrations with the name of the command.
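As a rough sketch of that pattern, the Python helper below builds the same docker compose invocation for an arbitrary management command. The function name manage_command and its project_dir parameter are illustrative assumptions, not part of Data Workspace; showmigrations is a standard Django management command used here only as an example.

```python
import os
import shlex


def manage_command(command, *args, project_dir=None):
    """Build the docker compose argv for a Django management command.

    Hypothetical helper: mirrors the makemigrations invocation above,
    swapping in a different command name.
    """
    project_dir = project_dir or os.getcwd()
    source_dir = os.path.join(project_dir, "dataworkspace")
    return [
        "docker", "compose", "run",
        "--user", "root",
        # Mount the local source so generated files land in the working tree
        f"--volume={source_dir}:/dataworkspace/",
        "data-workspace", "django-admin", command, *args,
    ]


if __name__ == "__main__":
    # Print rather than execute, so the command can be inspected first
    print(shlex.join(manage_command("showmigrations")))
```

The list can be passed to subprocess.run once you are happy with it; the image still needs to be built first with docker compose build.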

+ + + + + + + + + \ No newline at end of file diff --git a/development/enhancedtables/index.html b/development/enhancedtables/index.html new file mode 100644 index 0000000000..dc108d002e --- /dev/null +++ b/development/enhancedtables/index.html @@ -0,0 +1,1095 @@ + + + + + + + + + + + + + + + + + + + + + + Enhanced tables - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Enhanced tables

+

Turn an existing govuk styled table into a govuk styled ag-grid grid.

+
    +
  • Allows for sorting columns.
  • +
  • Automatically falls back to the standard govuk table if the user has JavaScript disabled.
  • +
  • Can be enhanced in the future to add column filtering.
  • +
+

Create table

+
    +
  1. Create a govuk styled table and give it the class enhanced-table.
  2. +
  3. The table must have one <thead> and one <tbody>.
  4. +
  5. You can optionally add the data attribute data-size-to-fit to ensure columns fit the whole width of the table:
  6. +
+
<table class="govuk-table enhanced-table data-size-to-fit">
+  ...
+</table>
+
+

Configure columns

+

Configuration for the columns is done on the <th> elements via data attributes. The options are:

+
    +
  • data-sortable - enable sorting for this column (disabled by default).
  • +
  • data-column-type - use a specific ag-grid column type.
  • +
  • data-renderer - optionally specify the renderer for the column. Only needed for certain data types.
  • +
  • data-renderer="htmlRenderer" - render/sort column as html (mainly used to display buttons or links in a cell).
  • +
  • data-renderer="dateRenderer" - render/sort column as dates.
  • +
  • data-renderer="datetimeRenderer" - render/sort column as datetimes.
  • +
  • data-width - set a width for a column.
  • +
  • data-min-width - set a minimum width in pixels for a column.
  • +
  • data-max-width - set a maximum width in pixels for a column.
  • +
  • data-resizable - allow resizing of the column (disabled by default).
  • +
+
<table class="govuk-table enhanced-table data-size-to-fit">
+  <thead class="govuk-table__head">
+    <tr class="govuk-table__row">
+      <th class="govuk-table__header" data-sortable data-renderer="htmlRenderer">A link</th>
+      <th class="govuk-table__header" data-sortable data-renderer="dateRenderer">A date</th>
+      <th class="govuk-table__header" data-width="300">Some text</th>
+      <th class="govuk-table__header" data-column-type="numericColumn">A number</th>
+    </tr>
+  </thead>
+  <tbody class="govuk-table__body">
+    {% for object in object_list %}
+      <tr>
+        <td class="name govuk-table__cell">
+          <a class="govuk-link" href="#">The link</a>
+        </td>
+        ...
+      </tr>
+    {% endfor %}
+  </tbody>
+</table>
+
+
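If the header cells are generated server-side, the mapping from options to data attributes could be sketched as below. The render_th helper is hypothetical and not part of Data Workspace; it only illustrates how the attributes listed above combine on a single <th>.

```python
from html import escape


def render_th(label, **options):
    """Render a govuk table header cell with enhanced-table data attributes.

    Hypothetical helper. Keyword names use underscores (min_width) and become
    data-min-width. A value of True renders a bare attribute such as
    data-sortable; None or False omits the attribute.
    """
    attrs = ['class="govuk-table__header"']
    for name, value in options.items():
        attr = "data-" + name.replace("_", "-")
        if value is True:
            attrs.append(attr)
        elif value is not None and value is not False:
            attrs.append(f'{attr}="{escape(str(value), quote=True)}"')
    return f"<th {' '.join(attrs)}>{escape(label)}</th>"
```

For example, render_th("A date", sortable=True, renderer="dateRenderer") produces the same markup as the second header cell in the template above.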

Initialise it

+

Add the following to your page:

+
<script src="{% static 'ag-grid-community.min.js' %}"></script>
+<script src="{% static 'dayjs.min.js' %}"></script>
+<script src="{% static 'js/grid-utils.js' %}"></script>
+<script src="{% static 'js/enhanced-table.js' %}"></script>
+<link rel="stylesheet" type="text/css" href="{% static 'data-grid.css' %}"/>
+<script nonce="{{ request.csp_nonce }}">
+  document.addEventListener('DOMContentLoaded', () => {
+    initEnhancedTable("enhanced-table");
+  });
+</script>
+
+ + + + + + + + + \ No newline at end of file diff --git a/development/maintenance-mode/index.html b/development/maintenance-mode/index.html new file mode 100644 index 0000000000..dd5fcb39e6 --- /dev/null +++ b/development/maintenance-mode/index.html @@ -0,0 +1,990 @@ + + + + + + + + + + + + + + + + + + Enabling Maintenance Mode in Django - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Enabling Maintenance Mode in Django

+ +

This guide will walk you through the process of enabling maintenance mode in your Django application through the admin interface.

+

Enabling Maintenance Mode

+
    +
  1. +

    Access the Django Admin Interface: Navigate to the Django Admin Interface (/admin).

    +
  2. +
  3. +

    Navigate to the Maintenance Settings: On the admin dashboard, go to the "Maintenance" section. In this section, click on "Settings".

    +
  4. +
  5. +

    Create or Edit a MaintenanceSettings Object: In the "Settings" section, you'll see a MaintenanceSettings object. If one does not exist, create a new one by clicking on "ADD MAINTENANCE SETTINGS".

    +
  6. +
  7. +

    Configure the Maintenance Settings: In the MaintenanceSettings form, you'll find two fields:

    +
      +
    • Maintenance Text: This is the message that will be displayed to users when maintenance mode is enabled. For example, "The service is currently unavailable. Please try again later".
    • +
    • Maintenance Toggle: This checkbox enables or disables maintenance mode. Check this box to enable maintenance mode.
    • +
    +
  8. +
+

Once these steps are completed, maintenance mode will be enabled, and users will see your maintenance message when they try to access your application.

+ + + + + + + + + \ No newline at end of file diff --git a/development/releases/index.html b/development/releases/index.html new file mode 100644 index 0000000000..423cef9da2 --- /dev/null +++ b/development/releases/index.html @@ -0,0 +1,1106 @@ + + + + + + + + + + + + + + + + + + + + + + Releases - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Releases

+ +

Tag your release

+
    +
  1. +

    View the current tags in Data Workspace

    +
  2. +
  3. +

    Make a note of the latest tag

    +
  4. +
  5. Check out the master branch and pull the latest.
  6. +
  7. Create a new tag from the master branch in the format v<year>-<month>-<day>, e.g. v2024-01-19
  8. +
  9. Push the new tag to GitHub
  10. +
+

Example of how to tag and push

+
git tag -a v2024-01-19 -m v2024-01-19
+
+
git push origin v2024-01-19
+
+
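The tag format can also be generated or checked programmatically. The following is a minimal sketch, assuming tags always use zero-padded numeric year, month and day; the function names are illustrative only.

```python
import re
from datetime import date

# v<year>-<month>-<day>, zero-padded
TAG_PATTERN = re.compile(r"v\d{4}-\d{2}-\d{2}")


def release_tag(day):
    """Return the release tag name for the given date."""
    return f"v{day:%Y-%m-%d}"


def is_release_tag(tag):
    """Check that a tag name has the expected shape (not that the date is valid)."""
    return TAG_PATTERN.fullmatch(tag) is not None
```

Calling release_tag(date.today()) gives the tag name to pass to git tag for a release made today.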

Create draft release notes

+
    +
  1. +

    View the current tags in Data Workspace

    +
  2. +
  3. +

    Click on the tag that you just created/pushed

    +
  4. +
  5. Compare your tag with the previous one. This will give you a list of all the changes to be released
  6. +
  7. View the current releases in Data Workspace
  8. +
  9. Click "Draft a new release"
  10. +
  11. Click the "Generate release notes" option
  12. +
  13. Check that the form is in the format below
  14. +
  15. Check the option "Set as latest release"
  16. +
  17. Click "Save as draft"
  18. +
+

Example of release notes

+
## What's Changed
+
+The main change today was a fix to an issue where external links in Quicksight dashboards weren't working. They were opening a new tab, but that tab remained blank. Now, that new tab loads the external page as it should.
+
+* build: add latest QuickSight embedding SDK (but don't use it) by @michalc in https://github.com/uktrade/data-workspace/pull/2949
+* fix: remove reference to source map to get collectstatic to work by @michalc in https://github.com/uktrade/data-workspace/pull/2950
+* feat: use latest Quicksight embedding SDK by @michalc in https://github.com/uktrade/data-workspace/pull/2951
+* feat: set COOP to same-origin-allow-popups for Quicksight by @michalc in https://github.com/uktrade/data-workspace/pull/2952
+
+**Full Changelog**: https://github.com/uktrade/data-workspace/commits/v2024-01-15
+
+

Release tag to production

+
    +
  1. +

    Visit the build job in Jenkins

    +
  2. +
  3. +

    Click "build with parameters" and wait until you have the option to click "proceed to staging"

    +
  4. +
  5. Click "proceed"
  6. +
  7. Check the changes in staging
  8. +
  9. Click "proceed" to production
  10. +
  11. Check the changes in production
  12. +
  13. Go back to the releases in GitHub
  14. +
  15. Click on the draft release you created earlier
  16. +
  17. Click "Publish release"
  18. +
+

Post in the DW channel

+

Once the release is complete you can then notify everyone in the Data Workspace Teams channel. Use the format below.

+
Data Workspace <tag> has just been released
+
+What's changed?
+<high level description of what has been released>
+
+For more details please see the release notes
+
+ + + + + + + + + \ No newline at end of file diff --git a/development/remotedebugging/index.html b/development/remotedebugging/index.html new file mode 100644 index 0000000000..df6239050d --- /dev/null +++ b/development/remotedebugging/index.html @@ -0,0 +1,1117 @@ + + + + + + + + + + + + + + + + + + + + + + Remote debugging - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Remote debugging containers

+

PDB

+

As ipdb has some issues with gevent and monkey patching we are only able to debug using vanilla pdb currently.

+

To set this up locally:

+
    +
  1. Install remote-pdb-client with pip install remote-pdb-client, or just run pip install -r requirements-dev.txt.
  2. +
  3. Ensure you have the following in dev.env:
      +
    • PYTHONBREAKPOINT=remote_pdb.set_trace
    • +
    • REMOTE_PDB_HOST=0.0.0.0
    • +
    • REMOTE_PDB_PORT=4444
    • +
    +
  4. +
  5. Sprinkle some breakpoint()s liberally in your code.
  6. +
  7. Bring up the docker containers with docker compose up.
  8. +
  9. Listen to remote pdb using remotepdb_client --host localhost --port 4444.
  10. +
  11. Go and break things at http://dataworkspace.test:8000.
  12. +
+
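For background on why the PYTHONBREAKPOINT setting matters: Python's built-in breakpoint() looks up this environment variable each time it is called and invokes whatever callable it names. The sketch below is a simplified reimplementation of that lookup for illustration, not the interpreter's actual code; resolve_breakpoint_hook is a made-up name.

```python
import importlib


def resolve_breakpoint_hook(value):
    """Return the callable that PYTHONBREAKPOINT=value selects.

    Simplified sketch of the documented behaviour:
    - "0" disables breakpoints (returns None here)
    - unset or empty falls back to pdb.set_trace
    - otherwise the value is a dotted path to a callable; a name
      with no dots is looked up in builtins
    """
    if value == "0":
        return None
    if not value:
        value = "pdb.set_trace"
    module_name, _, attr = value.rpartition(".")
    module = importlib.import_module(module_name or "builtins")
    return getattr(module, attr)
```

With the dev.env settings above, the value is remote_pdb.set_trace, so each breakpoint() call hands control to the remote debugger instead of the local pdb prompt. Resolving that value with this sketch requires the remote_pdb module to be importable.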

Pycharm

+

To debug via the pycharm remote debugger you will need to jump through a few hoops:

+
    +
  1. +

    Configure docker-compose.yml as a remote interpreter:
    +Remote interpreter config

    +
  2. +
  3. +

    Configure a python debug server for pydev-pycharm to connect to. You will need to ensure the path mapping is set to the path of your dev environment:
    +Python debug server

    +
  4. +
  5. +

    Bring up the containers: docker compose up

    +
  6. +
  7. +

    Start the pycharm debugger:
    +Start the debugger

    +
  8. +
  9. +

    Add a breakpoint using pydev-pycharm:
    +Pydev breakpoint

    +
  10. +
  11. +

    Profit:
    +Pycharm debug output

    +
  12. +
+

VSCode

+

Below are the basic steps for debugging remotely with vscode. They are confirmed to work but may need some tweaks, so feel free to update the docs:

+
    +
  1. Install the Docker and Python debug plugins.
  2. +
  3. Add a remote debug configuration to your launch.json: +
    {
    +  "configurations": [
    +    {
    +      "name": "Python: Remote Attach",
    +      "type": "python",
    +      "request": "attach",
    +      "connect": {
    +        "host": "0.0.0.0",
    +        "port": 4444
    +      },
    +      "pathMappings": [
    +        {
    +          "localRoot": "${workspaceFolder}/dataworkspace",
    +          "remoteRoot": "/dataworkspace"
    +        }
    +      ]
    +    }
    +  ]
    +}
    +
  4. +
  5. Add the following code snippet to the file that you wish to debug: +
    import debugpy
    +debugpy.listen(('0.0.0.0', 4444))
    +debugpy.wait_for_client()
    +
  6. +
  7. Set a breakpoint in your code:
    +breakpoint()
  8. +
  9. Bring up the containers:
    +docker compose up
  10. +
  11. Start the remote python debugger:
    +Vscode run debug
  12. +
  13. Load the relevant page http://dataworkspace.test:8000.
  14. +
  15. Start debugging: vscode debugger
  16. +
+ + + + + + + + + \ No newline at end of file diff --git a/development/running-locally/index.html b/development/running-locally/index.html new file mode 100644 index 0000000000..e953f6064d --- /dev/null +++ b/development/running-locally/index.html @@ -0,0 +1,1152 @@ + + + + + + + + + + + + + + + + + + + + + + Running locally - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Running locally

+ +

To develop features on Data Workspace, or to evaluate if it's suitable for your use case, it can be helpful to run Data Workspace on your local computer.

+

Prerequisites

+

To run Data Workspace locally, you must have these tools installed:

+ +

You should also have familiarity with the command line, and editing text files. If you plan to make changes to the Data Workspace source code, you should also have familiarity with Python.

+

Cloning source code

+

To run Data Workspace locally, you must also have the Data Workspace source code, which is stored in the Data Workspace GitHub repository. The process of copying this code so it is available locally is known as cloning.

+
    +
  1. +

    If you don't already have a GitHub account, create a GitHub account.

    +
  2. +
  3. +

    Set up an SSH key and associate it with your GitHub account.

    +
  4. +
  5. +

    Create a new fork of the Data Workspace repository. Make a note of the owner you choose to fork to. This is usually your GitHub username. There is more documentation on forking at GitHub's guide on contributing to projects.

    +

    If you're a member of the uktrade GitHub organisation you should skip this step and not fork. If you're not planning on contributing changes, you can also skip forking.

    +
  6. +
  7. +

    Clone the repository by running the following command, replacing owner with the owner that you forked to in step 3. If you skipped forking, owner should be uktrade:

    +
    git clone git@github.com:owner/data-workspace.git
    +
    +

    This will create a new directory containing a copy of the Data Workspace source code, data-workspace.

    +
  8. +
  9. +

    Change to the data-workspace directory:

    +
    cd data-workspace
    +
    +
  10. +
+

Creating domains

+

In order to be able to properly test cookies that are shared with subdomains, localhost is not used for local development. Instead, by default the dataworkspace.test domain is used. For this to work, you will need the below in your /etc/hosts file:

+
127.0.0.1       dataworkspace.test
+127.0.0.1       data-workspace-localstack
+127.0.0.1       data-workspace-sso.test
+127.0.0.1       superset-admin.dataworkspace.test
+127.0.0.1       superset-edit.dataworkspace.test
+
+

To run tool and visualisation-related code, you will need subdomains in your /etc/hosts file, such as:

+
127.0.0.1       visualisation-a.dataworkspace.test
+
+
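A quick way to confirm the entries are in place is to scan the hosts file for the expected names. This is a minimal sketch: the missing_hosts function is illustrative, and it only checks that each name appears on some line, not which address it maps to.

```python
REQUIRED_HOSTS = [
    "dataworkspace.test",
    "data-workspace-localstack",
    "data-workspace-sso.test",
    "superset-admin.dataworkspace.test",
    "superset-edit.dataworkspace.test",
]


def missing_hosts(hosts_text, required=REQUIRED_HOSTS):
    """Return the required hostnames that have no entry in the hosts file text."""
    present = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        fields = line.split()
        present.update(fields[1:])  # first field is the IP address
    return [host for host in required if host not in present]
```

Passing the contents of /etc/hosts to missing_hosts returns an empty list when every required name is present. Add any visualisation subdomains you use to the required list.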

Starting the application

+

Set the required variables:

+
cp .envs/sample.env .envs/dev.env
+cp .envs/superset-sample.env .envs/superset.dev.env
+
+

Start the application:

+
docker compose up --build
+
+

The application should then be visible at http://dataworkspace.test:8000.

+

Running Superset locally

+

To run Superset locally, run docker compose using the superset profile:

+
docker compose --profile superset up
+
+

You can then visit http://superset-edit.dataworkspace.test:8000/ or http://superset-admin.dataworkspace.test:8000/.

+

Front end static assets

+

We use node-sass to build the front end css and include the GOVUK Front End styles.

+

Building this locally requires NodeJS, ideally installed via nvm (https://github.com/nvm-sh/nvm):

+
  # this will configure node from .nvmrc or prompt you to install
+  nvm use
+  npm install
+  npm run build:css
+
+

React apps

+

We're set up to use django-webpack-loader for hotloading the React app while developing.

+

You can get it running by starting the dev server:

+
docker compose up
+
+

and in a separate terminal changing to the js app directory and running the webpack hotloader:

+
cd dataworkspace/dataworkspace/static/js/react_apps/
+npm run dev
+
+

For production usage we use pre-built JavaScript bundles to reduce the pain of having to build npm modules at deployment.

+

If you make any changes to the React apps you will need to rebuild and commit the bundles. This will create the relevant js files in the /static/js/bundles/ directory:

+
cd dataworkspace/dataworkspace/static/js/react_apps/
+# this may take about 10 minutes to install all dependencies
+npm install
+npm run build
+git add ../bundles/*.js ../stats/react_apps-stats.json
+
+

Issues on Apple Silicon

+

If you have issues building the containers try the following:

+
DOCKER_DEFAULT_PLATFORM=linux/amd64 docker compose up --build
+
+ + + + + + + + + \ No newline at end of file diff --git a/development/running-tests/index.html b/development/running-tests/index.html new file mode 100644 index 0000000000..279a495e68 --- /dev/null +++ b/development/running-tests/index.html @@ -0,0 +1,1107 @@ + + + + + + + + + + + + + + + + + + + + + + Running tests - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Running tests

+ +

Running tests locally is useful when developing features on Data Workspace to make sure existing functionality isn't broken, and to ensure any new functionality works as intended.

+

Prerequisites

+

To run tests you must have the Data Workspace prerequisites installed and have cloned its source code. See Running locally for details.

+

Unit and integration tests

+

To run all tests:

+
make docker-test
+
+

To only run Django unit tests:

+
make docker-test-unit
+
+

To only run higher level integration tests:

+
make docker-test-integration
+
+

Without rebuilding

+

To run the tests locally without having to rebuild the containers every time append -local to the test make commands:

+
make docker-test-unit-local
+
+
make docker-test-integration-local
+
+
make docker-test-local
+
+

To run specific tests pass -e TARGET=<test> into make:

+
make docker-test-unit-local -e TARGET=dataworkspace/dataworkspace/tests/test_admin.py::TestCustomAdminSite::test_non_admin_access
+
+
make docker-test-integration-local -e TARGET=test/test_application.py
+
+

Watching Selenium tests

+

We have some Selenium integration tests that launch a (headless) browser in order to interact with a running instance of Data Workspace to assure some core flows (only Data Explorer at the time of writing). It is sometimes desirable to watch these tests run, e.g. in order to debug where it is failing. To run the selenium tests through docker compose using a local browser, do the following:

+
    +
  1. +

    Download the latest Selenium Server and run it in the background, e.g. java -jar ~/Downloads/selenium-server-standalone-3.141.59 &.

    +
  2. +
  3. +

    Run the selenium tests via docker-compose, exposing the Data Workspace port and the mock-SSO port and setting the REMOTE_SELENIUM_URL environment variable, e.g. docker compose --profile test -p data-workspace-test run -e REMOTE_SELENIUM_URL=http://host.docker.internal:4444/wd/hub -p 8000:8000 -p 8005:8005 --rm data-workspace-test pytest -vvvs test/test_selenium.py.

    +
  4. +
+

E2E tests

+

There are 2 ways to run the E2E tests locally. The easiest way is to use docker, but this option should only be used to run the tests and view their results. Because docker caches images, if any tests need to be updated or new ones added it is better to use npm, as any test changes are immediately accessible.

+

Running the E2E tests locally using cypress docker image

+

The E2E tests can be run locally using the make command make docker-e2e-build-run from the root. This will spin up the data workspace app pointing at a dedicated E2E database, that will install some E2E specific fixtures. This DB will not interfere with any local test data you have.

+

The cypress tests are run using the data-workspace-e2e-test docker container. This container will start as soon as it detects the data workspace app is available on port 8000, and as soon as the tests complete the docker containers will be closed. To view any tests that failed, browse the e2e-data-workspace-cypress-1 docker container, where the logs will show a summary of all tests. Any failed tests will also have their screenshots saved in the cypress/screenshots folder in your local environment.

+

Running the E2E tests locally using npm

+

Before running the tests, to get the E2E data workspace app run the make command make docker-e2e-start from the root, which will spin up the data workspace app pointing at a dedicated E2E database.

+

Once the containers have started, you can use either npm run cypress:run to run the tests in headless mode, or npm run cypress:open to use the Cypress test runner app.

+ + + + + + + + + \ No newline at end of file diff --git a/development/updating-dependencies/index.html b/development/updating-dependencies/index.html new file mode 100644 index 0000000000..ff48f19fd8 --- /dev/null +++ b/development/updating-dependencies/index.html @@ -0,0 +1,983 @@ + + + + + + + + + + + + + + + + + + + + + + Updating dependencies - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Updating dependencies

+ +

We use pip-tools to manage dependencies across two files - requirements.txt and requirements-dev.txt. These have corresponding .in files where we specify our top-level dependencies.

+

Add the new dependencies to those .in files, or update an existing dependency, then (with pip-tools already installed), run make save-requirements.

+ + + + + + + + + \ No newline at end of file diff --git a/favicons/moj.ico b/favicons/moj.ico new file mode 100644 index 0000000000..de25990df7 Binary files /dev/null and b/favicons/moj.ico differ diff --git a/index.html b/index.html new file mode 100644 index 0000000000..f075144d4e --- /dev/null +++ b/index.html @@ -0,0 +1,1048 @@ + + + + + + + + + + + + + + + + + + + + Data Workspace Technical Documentation - Data Workspace Technical Documentation + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Data Workspace Technical Documentation

+ + + + + + + + + \ No newline at end of file diff --git a/logos/moj.png b/logos/moj.png new file mode 100644 index 0000000000..e7bfc249f9 Binary files /dev/null and b/logos/moj.png differ diff --git a/search/search_index.json b/search/search_index.json new file mode 100644 index 0000000000..61bc23c21d --- /dev/null +++ b/search/search_index.json @@ -0,0 +1 @@ +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Data Workspace Technical Documentation","text":"

Host your own data analysis platform

Data Workspace is an open source data analysis platform with features for users with a range of technical skills. Features include:

  • a data catalogue for users to discover, filter, and download data
  • a permission system that allows users to only access specific datasets
  • a framework for hosting tools that allows users to analyse data without downloading it, such as through JupyterLab, RStudio, or Theia (a VS Code-like IDE)
  • dashboard creation and hosting

Data Workspace has been built with features specifically for the Department for Business and Trade. However, we are open to contributions to make this more generic. See Contributing for details on how to make changes for your use case.

Run Data Workspace locally"},{"location":"contributing/","title":"How to contribute","text":"

Contributions to Data Workspace are welcome, such as reporting issues, requesting features, making documentation changes, or submitting code changes.

"},{"location":"contributing/#prerequisites","title":"Prerequisites","text":"
  • In all cases a GitHub account is needed to contribute.
  • To contribute code or documentation, you must have a copy of the Data Workspace source code locally, and have certain tools installed. See Running locally for details of these.
  • To contribute code, knowledge of Python is required.
"},{"location":"contributing/#issues","title":"Issues","text":"

Suspected issues with Data Workspace can be submitted at Data Workspace issues. An issue that contains a minimal, reproducible example stands the best chance of being resolved. However, it is understood that this is not possible in all circumstances.

"},{"location":"contributing/#feature-requests","title":"Feature requests","text":"

A feature request can be submitted using the Ideas category in Data Workspace discussions.

"},{"location":"contributing/#documentation","title":"Documentation","text":"

The source of the documentation is in the docs/ directory of the source code, and is written using Material for mkdocs.

Changes are then submitted via a Pull Request (PR). To do this:

  1. Decide on a short hyphen-separated descriptive name for your change, prefixed with docs/, for example docs/add-example.

  2. Make a branch using this descriptive name:

    cd data-workspace\ngit checkout -b docs/add-example\n
  3. Make your changes in a text editor.

  4. Preview your changes locally:

    pip install -r requirements-docs.txt  # Only needed once\nmkdocs serve\n
  5. Commit your change and push to your fork. Ideally the commit message will follow the Conventional Commit specification:

    git add docs/getting-started.md  # Repeat for each file changed\ngit commit -m \"docs: add an example\"\ngit push origin docs/add-example\n
  6. Raise a PR at https://github.com/uktrade/data-workspace/pulls against the master branch in data-workspace.

  7. Wait for the PR to be approved and merged, and respond to any questions or suggested changes.

When the PR is merged, the documentation is deployed automatically to https://data-workspace.docs.trade.gov.uk/.

"},{"location":"contributing/#code","title":"Code","text":"

Changes are submitted via a Pull Request (PR). To do this:

  1. Decide on a short hyphen-separated descriptive name for your change, prefixed with the type of change. For example fix/the-bug-description.

  2. Make a branch using this descriptive name:

    git checkout -b fix/the-bug-description\n
  3. Make sure you can run existing tests locally, for example by running:

    make docker-test\n

    See Running tests for more details on running tests.

  4. Make your changes in a text editor. In the cases of changing behaviour, this would usually include changing or adding tests within dataworkspace/dataworkspace/tests, and running them.

  5. Commit your changes and push to your fork. Ideally the commit message will follow the Conventional Commit specification:

    git add my_file.py  # Repeat for each file changed\ngit commit -m \"fix: the bug description\"\ngit push origin fix/the-bug-description\n
  6. Raise a PR at https://github.com/uktrade/data-workspace/pulls against the master branch of data-workspace.

  7. Wait for the PR to be approved and merged, and respond to any questions or suggested changes.

"},{"location":"data-ingestion/","title":"Data Ingestion","text":"

Data Workspace is essentially an interface to a PostgreSQL database, referred to as the datasets database. Technical users can access specific tables in the datasets database directly, but there is a concept of \"datasets\" on top of this direct access. Each dataset has its own page in the user-facing data catalogue that has features for non-technical users.

Conceptually, there are 3 different types of datasets in Data Workspace: source datasets, reference datasets, and data cuts. Metadata for the 3 dataset types is controlled through a single administration interface, but how data is ingested into these depends on the dataset.

In addition to the structured data exposed in the catalogue, data can be uploaded by users on an ad-hoc basis, treated by Data Workspace as binary blobs.

"},{"location":"data-ingestion/#dataset-metadata","title":"Dataset metadata","text":"

Data Workspace is a Django application, with a staff-facing administration interface, usually referred to as Django admin. Metadata for each of the 3 types of dataset is managed within Django admin.

"},{"location":"data-ingestion/#source-datasets","title":"Source datasets","text":"

A source dataset is the core Data Workspace dataset type. It is made up of one or more tables in the PostgreSQL datasets database. Typically a source dataset would be updated frequently.

However, ingesting into these tables is not handled by the Data Workspace project itself. There are many ways to ingest data into PostgreSQL tables. The Department for Business and Trade uses Airflow to handle ingestion using a combination of Python and SQL code.

Note

The Airflow pipelines used by The Department for Business and Trade to ingest data are not open source. Some parts of Data Workspace relating to this ingestion depend on this closed source code.

"},{"location":"data-ingestion/#reference-datasets","title":"Reference datasets","text":"

Reference datasets are datasets usually used to classify or contextualise other datasets, and are expected to not change frequently. \"UK bank holidays\" or \"ISO country codes\" could be reference datasets.

The structure and data of reference datasets can be completely controlled through Django admin.

"},{"location":"data-ingestion/#data-cuts","title":"Data cuts","text":"

Data isn't ingested into data cuts directly. Instead, data cuts are defined by SQL queries entered into Django admin that run dynamically, querying from source and reference datasets. As such they update as frequently as the data they query from updates.

A data cut could filter a larger source dataset for a specific country, calculate aggregate statistics, join multiple source datasets together, join a source dataset with a reference dataset, or a combination of these.

"},{"location":"data-ingestion/#ad-hoc-binary-blobs","title":"Ad-hoc binary blobs","text":"

Each user is able to upload binary blobs in ad-hoc cases to their own private prefix in an S3 bucket, as well as to any authorized team prefixes. Read and write access to these prefixes is available through 3 mechanisms:

  • Through a custom React-based S3 browser built into the Data Workspace Django application.

  • From tools using the S3 API or S3 SDKs, for example boto3.

  • Certain parts of each user's prefix are automatically synced to and from the local filesystem in on-demand tools they launch. This gives users the illusion of a permanent filesystem in their tools, even though the tools are ephemeral.

"},{"location":"architecture/application-lifecycle/","title":"Application lifecycle","text":"

As an example, from the point of view of user abcde1234, https://jupyterlab-abcde1234.mydomain.com/ is the fixed address of their private JupyterLab application. Going to https://jupyterlab-abcde1234.mydomain.com/ in a browser will:

  • show a starting screen with a countdown;
  • and when the application is loaded, the page will reload and show the application itself;
  • and subsequent loads will show the application immediately.

If the application is stopped, then a visit to https://jupyterlab-abcde1234.mydomain.com/ will repeat the process. The user will never leave https://jupyterlab-abcde1234.mydomain.com/. If the user visits https://jupyterlab-abcde1234.mydomain.com/some/path, they will also remain at https://jupyterlab-abcde1234.mydomain.com/some/path to ensure, for example, bookmarks to any in-application page work even if they need to start the application to view them.

The browser will only make GET requests during the start of an application. While potentially a small abuse of HTTP, it allows the straightforward behaviour described: no HTML form or JavaScript is required to start an application (although JavaScript is used to show a countdown to the user and to check if an application has loaded), and the GET requests are idempotent.

The proxy, however, has more complex behaviour. On an incoming request from the browser for https://jupyterlab-abcde1234.mydomain.com/:

  • it will attempt to GET details of an application with the host jupyterlab-abcde1234 from an internal API of the main application;
  • if the GET returns a 404, it will make a PUT request to the main application that initiates creation of the Fargate task;
  • if the GET returns a 200, and the details contain a URL, the proxy will attempt to proxy the incoming request to it;
  • it does not treat errors connecting to a SPAWNING application as a true error: they are effectively swallowed;
  • if an application is returned from the GET as STOPPED, which happens on error, it will DELETE the application, and show an error to the user.

The proxy itself only responds to incoming requests from the browser, and has no long-lived tasks that go beyond one HTTP request or WebSockets connection. This ensures it can be horizontally scaled.
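
The decision the proxy makes on each incoming request can be sketched roughly as follows. This is an illustrative sketch only, not the proxy's actual code: the function name decide, the dictionary keys state and url, and the outcome WAIT (showing the starting page when no URL is available yet) are assumptions made for the example. Only the 404 and 200 statuses and the SPAWNING and STOPPED states come from the description above.

```python
# Hypothetical sketch of the proxy's per-request decision; not the real proxy code.
def decide(get_status, details=None):
    # get_status: HTTP status of the GET to the internal API (404 or 200).
    # details: the application details returned on a 200, as a dict (assumed shape).
    if get_status == 404:
        return 'PUT'      # ask the main application to start the Fargate task
    details = details or {}
    if details.get('state') == 'STOPPED':
        return 'DELETE'   # delete the failed application and show an error
    if details.get('url'):
        return 'PROXY'    # attempt to proxy the incoming request to the application
    return 'WAIT'         # assumed: nothing to proxy to yet, keep showing the countdown
```

Because the function depends only on its arguments, any instance of the proxy can make the same decision for the same request, which is in keeping with the horizontal scaling described above.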

"},{"location":"architecture/comparison-with-jupyterhub/","title":"Comparison with JupyterHub","text":"

In addition to being able to run any Docker container, not just JupyterLab, Data Workspace has some deliberate architectural features that are different to JupyterHub.

  • All state is in the database, accessed by the main Django application.

  • Specifically, no state is kept in the memory of the main Django application. This means it can be horizontally scaled without issue.

  • The proxy is also stateless: it fetches how to route requests from the main application, which itself fetches the data from the database. This means it can also be horizontally scaled without issue, and potentially independently from the main application. This means sticky sessions are not needed, and multiple users could access the same application, which is a planned feature for user-supplied visualisation applications.

  • Authentication is completely handled by the proxy. Apart from specific exceptions like the healthcheck, non-authenticated requests do not reach the main application.

  • The launched containers do not make requests to the main application, and the main application does not make requests to the launched containers. This means there are fewer cyclic dependencies in terms of data flow, and that applications don't need to be customised for this environment. They just need to open a port for HTTP requests, which makes them extremely standard web-based Docker applications.

There is a notable exception to the statelessness of the main application: the launch of an application is made of a sequence of calls to AWS, and is done in a Celery task. If this sequence is interrupted, the launch of the application will fail. This is a solvable problem: the state could be saved into the database and the sequence resumed later. However, since this sequence of calls lasts only a few seconds, and the user will be told of the error and can refresh to try to launch the application again, at this stage of the project this has been deemed unnecessary.

"},{"location":"architecture/components/","title":"Components","text":"

Data Workspace is made of a number of components. This page explains what those are and how they work together.

"},{"location":"architecture/components/#prerequisites","title":"Prerequisites","text":"

To understand the components of Data Workspace's architecture, you should have familiarity with:

  • Amazon Web Services (AWS), especially VPCs and ECS
  • Docker
  • HTTP
  • The Domain name system (DNS)
  • PostgreSQL
"},{"location":"architecture/components/#high-level-architecture","title":"High level architecture","text":"

At the highest level, users access the Data Workspace application, which accesses a PostgreSQL database.

graph\n  A[User] --> B[Data Workspace]\n  B --> C[\"PostgreSQL (Aurora)\"]
"},{"location":"architecture/components/#medium-level-architecture","title":"Medium level architecture","text":"

The architecture is heavily Docker/ECS Fargate based.

graph\n  A[User] -->|Staff SSO| B[Amazon Quicksight];\n  B --> C[\"PostgreSQL (Aurora)\"];\n  A --> |Staff SSO|F[\"'The Proxy' (aiohttp)\"];\n  F --> |rstudio-9c57e86a|G[Per-user and shared tools];\n  F --> H[Shiny, Flask, Django, NGINX];\n  F --> I[Django, Data Explorer];\n  G --> C;\n  H --> C;\n  I --> C;\n\n\n
"},{"location":"architecture/components/#user-facing","title":"User-facing","text":"
  • Main application: A Django application to manage datasets and permissions and to launch containers, a proxy to route requests to those containers, and an NGINX instance to route to the proxy and serve static files.

  • JupyterLab: Launched by users of the main application, and populated with credentials in the environment to access certain datasets.

  • rStudio: Launched by users of the main application, and populated with credentials in the environment to access certain datasets.

  • pgAdmin: Launched by users of the main application, and populated with credentials in the environment to access certain datasets.

  • File browser: A single-page-application that offers upload and download of files to/from each user's folder in S3. The data is transferred directly between the user's browser and S3.

"},{"location":"architecture/components/#infrastructure","title":"Infrastructure","text":"
  • metrics: A sidecar-container for the user-launched containers that exposes metrics from the ECS task metadata endpoint in Prometheus format.

  • s3sync: A sidecar-container for the user-launched containers that syncs to and from S3 using mobius3. This is to allow file persistence on S3 without using FUSE, which at the time of writing is not possible on Fargate.

  • dns-rewrite-proxy: The DNS server of the VPC that launched containers run in. It selectively allows only certain DNS requests through to mitigate the chance of data exfiltration through DNS. When this container is deployed, it changes DHCP settings in the VPC, and will most likely break aspects of user-launched containers.

  • healthcheck: Proxies through to the healthcheck endpoint of the main application, so the main application can be in a security group locked-down to certain IP addresses, but still be monitored by Pingdom.

  • mirrors-sync: Mirrors pypi, CRAN and (ana)conda repositories to S3, so user-launched JupyterLab and rStudio containers can install packages without having to contact the public internet.

  • prometheus: Collects metrics from user-launched containers and re-exposes them through federation.

  • registry: A Docker pull-through-cache to repositories in quay.io. This allows the VPC to not have public internet access but still launch containers from quay.io in Fargate.

  • sentryproxy: Proxies errors to a Sentry instance: only used by JupyterLab.

"},{"location":"architecture/ADRs/","title":"Architecture Decision Records","text":"

This section contains a list of Architecture Decision Records (ADRs).

"},{"location":"architecture/ADRs/#accepted","title":"Accepted","text":"
  • 0001: Using a custom proxy
  • 0002: Usage of asyncio in proxy
"},{"location":"architecture/ADRs/0001/","title":"0001: Using a custom proxy","text":"","tags":["Accepted"]},{"location":"architecture/ADRs/0001/#context","title":"Context","text":"

A common question is why not just use NGINX instead of the custom proxy? The reason is the dynamic routing for the applications, e.g. URLs like https://jupyterlab-abcde1234.mydomain.com/some/path: each one has a lot of fairly complex requirements.

  • It must redirect to SSO if not authenticated, and redirect back to the URL once authenticated.
  • It must perform ip-filtering that is not applicable to the main application.
  • It must check that the current user is allowed to access the application, and show a forbidden page if not.
  • It must start the application if it's not started.
  • It must show a starting page with countdown if it's starting.
  • It must detect if an application has started, and route requests to it if it is.
  • It must route cookies from all responses back to the user. For JupyterLab, the first response contains cookies used in XSRF protection that are never resent in later requests.
  • It must show an error page if there is an error starting or connecting to the application.
  • It must allow a refresh of the error page to attempt to start the application again.
  • It must support WebSockets, without knowledge ahead of time which paths are used by WebSockets.
  • It must support streaming uploads and downloads.
  • Ideally, there would not be duplicate responsibilities between the proxy and other parts of the system, e.g. the Django application.

While not impossible to leverage NGINX to move some code from the proxy, there would still need to be custom code, and NGINX would have to communicate via some mechanism to this custom code to achieve all of the above: extra HTTP or Redis requests, or maybe through a custom NGINX module. It is suspected that this will make things more complex rather than less, and increase the burden on the developer.

","tags":["Accepted"]},{"location":"architecture/ADRs/0001/#decision","title":"Decision","text":"

We will use a custom proxy for Data Workspace, rather than simply using NGINX.

","tags":["Accepted"]},{"location":"architecture/ADRs/0001/#consequences","title":"Consequences","text":"","tags":["Accepted"]},{"location":"architecture/ADRs/0001/#positive","title":"Positive","text":"
  • This will decrease the burden on the developer that would have been required by custom NGINX modules, extra HTTP or Redis requests, which all would still have required custom code.

  • Using the custom proxy allows for all of the complex requirements and dynamic routing of our applications over which we have absolute control.

","tags":["Accepted"]},{"location":"architecture/ADRs/0001/#negative","title":"Negative","text":"
  • Initial difficulty when onboarding new team members as they will need to understand these decisions and requirements.

  • There is an extra network hop compared to not having a proxy.

","tags":["Accepted"]},{"location":"architecture/ADRs/0002/","title":"0002: Usage of asyncio in proxy","text":"","tags":["Accepted"]},{"location":"architecture/ADRs/0002/#context","title":"Context","text":"
  • The proxy fits the typical use-case of event-loop based programming: low CPU but high IO requirements, with potentially high number of connections.

  • The asyncio library aiohttp provides enough low-level control over the headers and the bytes of requests and responses to work as a controllable proxy. For example, the typical HTTP request cycle can be programmed fairly explicitly.

  • An incoming request begins: its headers are received.

  • The proxy makes potentially several requests to the Django application, to Redis, and/or to SSO to authenticate and determine where to route the request.
  • The incoming request's headers are passed to the application [removing certain hop-by-hop-headers].
  • The incoming request's body is streamed to the application.
  • The response headers are sent back to the client, combining cookies from the application and from the proxy.
  • The response body is streamed back to the client.
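
As an illustration of the header-passing step above, the following is a minimal sketch of removing hop-by-hop headers before forwarding a request. The set used is the list of hop-by-hop headers from RFC 2616 section 13.5.1. Which headers the real proxy removes is not stated here, so both the set and the function name without_hop_by_hop are assumptions for the example. A proxy that also supports WebSockets would need to treat Upgrade and Connection separately.

```python
# Hop-by-hop headers as listed in RFC 2616 section 13.5.1; illustrative only,
# and not necessarily the set removed by the real proxy.
HOP_BY_HOP = frozenset([
    'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
    'te', 'trailers', 'transfer-encoding', 'upgrade',
])


def without_hop_by_hop(headers):
    # headers: iterable of (name, value) pairs. Header names are compared
    # case-insensitively; order and duplicate headers are preserved.
    return [(name, value) for name, value in headers if name.lower() not in HOP_BY_HOP]
```

Keeping headers as a list of pairs rather than a dictionary means repeated headers, such as several Set-Cookie values, survive the filtering.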

The library also allows for receiving and making WebSockets requests. This is done without knowledge ahead of time which path is WebSockets, and which is HTTP. This is something that doesn't seem possible with, for example, Django Channels.

Requests and responses can be of the order of several GBs, so this streaming behaviour is a critical requirement.

  • Django gives a lot of benefits for the main application: for example, it is within the skill set of most available developers. Only a small fraction of changes need to involve the proxy.
","tags":["Accepted"]},{"location":"architecture/ADRs/0002/#decision","title":"Decision","text":"

We will use the asyncio library aiohttp.

","tags":["Accepted"]},{"location":"architecture/ADRs/0002/#consequences","title":"Consequences","text":"","tags":["Accepted"]},{"location":"architecture/ADRs/0002/#positive","title":"Positive","text":"
  • Allows for critical requirement of streaming behaviour.

  • We can stream HTTP(S) and Websockets requests in an efficient way with one cohesive Python package.

","tags":["Accepted"]},{"location":"architecture/ADRs/0002/#negative","title":"Negative","text":"
  • A core bit of infrastructure will depend on a flavour of Python unknown to even experienced Python developers.

  • Aiohttp is unable to proxy things that are not HTTP or WebSockets, e.g. SSH. This is why GitLab isn't behind the proxy.

","tags":["Accepted"]},{"location":"deployment/aws/","title":"Deploying to AWS","text":"

Data Workspace contains code that helps it be deployed using Amazon Web Services (AWS). This page explains how to use this code.

"},{"location":"deployment/aws/#prerequisites","title":"Prerequisites","text":"

To deploy Data Workspace to AWS you must have:

  • The source code of Data Workspace cloned to a folder data-workspace. See Running locally for details
  • An AWS account
  • Terraform installed
  • An OAuth 2.0 server for authentication

You should also have familiarity with working on the command line, working with Terraform, and with AWS.

"},{"location":"deployment/aws/#environment-folder","title":"Environment folder","text":"

Each deployment, or environment, of Data Workspace requires a folder for its configuration. This folder should be within a sibling folder to data-workspace.

The Data Workspace source code contains a template for this configuration. To create a folder in an appropriate location based on this template:

  1. Decide on a meaningful name for the environment. In the following, production is used.

  2. Ensure you're in the root of the data-workspace folder that contains the cloned Data Workspace source code.

  3. Copy the template into a new folder for the environment:

    mkdir -p ../data-workspace-environments\ncp -Rp infra/environment-template ../data-workspace-environments/production\n

This folder structure allows the configuration to find and use the infra/ folder in data-workspace which contains the low level details of the infrastructure to provision in each environment.

"},{"location":"deployment/aws/#initialising-environment","title":"Initialising environment","text":"

Before deploying the environment, it must be initialised.

  1. Change to the new folder for the environment:

    cd ../data-workspace-environments/production\n
  2. Generate new SSH keys:

    ./create-keys.sh\n
  3. Install AWS CLI and configure an AWS CLI profile. This will support some of the included configuration scripts.

    You can do this by putting credentials directly into ~/.aws/credentials or by using aws sso.

  4. Create an S3 bucket and DynamoDB table for Terraform to use, and add them to main.tf. --bucket will provide the base name for both objects.

    ./bootstrap-terraform.sh \\\n--profile <value> \\\n--bucket <value> \\\n--region <value>\n
  5. Enter the details of your hosting platform, SSH keys, and OAuth 2.0 server by changing all instances of REPLACE_ME in:

    • admin-environment.json
    • gitlab-secrets.json
    • main.tf
  6. Initialise Terraform:

    terraform init\n
"},{"location":"deployment/aws/#deploying-environment","title":"Deploying environment","text":"

Check the environment you created has worked correctly:

terraform plan\n

If everything looks right, you're ready to deploy:

terraform apply\n
"},{"location":"deployment/other-platforms/","title":"Deploying to other platforms","text":"

It should be possible to deploy to platforms other than Amazon Web Services (AWS), but at the time of writing this hasn't been done. It may involve a significant amount of work.

You can start a discussion on how best to approach this.

"},{"location":"development/database-migrations/","title":"Database migrations","text":"

Data Workspace's user-facing metadata catalogue uses Django. When developing Data Workspace, if a change is made to Django's models, to reflect this change in the metadata database, migrations must be created and run.

"},{"location":"development/database-migrations/#prerequisites","title":"Prerequisites","text":"

To create migrations you must have the Data Workspace prerequisites installed and have cloned its source code. See Running locally for details.

"},{"location":"development/database-migrations/#creating-migrations","title":"Creating migrations","text":"

After making changes to Django models, to create any required migrations:

docker compose build && \\\ndocker compose run \\\n--user root \\\n--volume=$PWD/dataworkspace:/dataworkspace/ \\\ndata-workspace django-admin makemigrations\n

The migrations must be committed to the codebase, and will run when Data Workspace is next started.

This pattern can be used to run other Django management commands by replacing makemigrations with the name of the command.

"},{"location":"development/enhancedtables/","title":"Enhanced tables","text":"

Turn an existing govuk styled table into a govuk styled ag-grid grid.

  • Allows for sorting columns.
  • If the user has JavaScript disabled, automatically fall back to the standard govuk table.
  • In the future can be enhanced to add column filtering.
"},{"location":"development/enhancedtables/#create-table","title":"Create table","text":"
  1. Create a govuk styled table and give it the class enhanced-table.
  2. The table must have one <thead> and one <tbody>.
  3. You can optionally add the data attribute data-size-to-fit to ensure columns fit the whole width of the table:
<table class=\"govuk-table enhanced-table data-size-to-fit\">\n  ...\n</table>\n
"},{"location":"development/enhancedtables/#configure-rows","title":"Configure rows","text":"

Configuration for the columns is done on the <th> elements via data attributes. The options are:

  • data-sortable - enable sorting for this column (disabled by default).
  • data-column-type - use a specific ag-grid column type.
  • data-renderer - optionally specify the renderer for the column. Only needed for certain data types.
  • data-renderer=\"htmlRenderer\" - render/sort column as html (mainly used to display buttons or links in a cell).
  • data-renderer=\"dateRenderer\" - render/sort column as dates.
  • data-renderer=\"datetimeRenderer\" - render/sort column as datetimes.
  • data-width - set a width for a column.
  • data-min-width - set a minimum width in pixels for a column.
  • data-max-width - set a maximum width in pixels for a column.
  • data-resizable - allow resizing of the column (disabled by default).
<table class=\"govuk-table enhanced-table data-size-to-fit\">\n  <thead class=\"govuk-table__head\">\n    <tr class=\"govuk-table__row\">\n      <th class=\"govuk-table__header\" data-sortable data-renderer=\"htmlRenderer\">A link</th>\n      <th class=\"govuk-table__header\" data-sortable data-renderer=\"dateRenderer\">A date</th>\n      <th class=\"govuk-table__header\" data-width=\"300\">Some text</th>\n      <th class=\"govuk-table__header\" data-column-type=\"numericColumn\">A number</th>\n    </tr>\n  </thead>\n  <tbody class=\"govuk-table__body\">\n    {% for object in object_list %}\n      <tr>\n        <td class=\"name govuk-table__cell\">\n          <a class=\"govuk-link\" href=\"#\">The link</a>\n        </td>\n        ...\n      </tr>\n    {% endfor %}\n  </tbody>\n</table>\n
"},{"location":"development/enhancedtables/#initialise-it","title":"Initialise it","text":"

Add the following to your page:

<script src=\"{% static 'ag-grid-community.min.js' %}\"></script>\n<script src=\"{% static 'dayjs.min.js' %}\"></script>\n<script src=\"{% static 'js/grid-utils.js' %}\"></script>\n<script src=\"{% static 'js/enhanced-table.js' %}\"></script>\n<link rel=\"stylesheet\" type=\"text/css\" href=\"{% static 'data-grid.css' %}\"/>\n<script nonce=\"{{ request.csp_nonce }}\">\n  document.addEventListener('DOMContentLoaded', () => {\n    initEnhancedTable(\"enhanced-table\");\n  });\n</script>\n
"},{"location":"development/maintenance-mode/","title":"Enabling Maintenance Mode in Django","text":"

This guide will walk you through the process of enabling maintenance mode in your Django application through the admin interface.

"},{"location":"development/maintenance-mode/#enabling-maintenance-mode","title":"Enabling Maintenance Mode","text":"
  1. Access the Django Admin Interface: Navigate to the Django Admin Interface (/admin).

  2. Navigate to the Maintenance Settings: On the admin dashboard, go to the \"Maintenance\" section. In this section, click on \"Settings\".

  3. Create or Edit a MaintenanceSettings Object: In the \"Settings\" section, you'll see a MaintenanceSettings object. If one does not exist, create a new one by clicking on \"ADD MAINTENANCE SETTINGS\".

  4. Configure the Maintenance Settings: In the MaintenanceSettings form, you'll find two fields:

    • Maintenance Text: This is the message that will be displayed to users when maintenance mode is enabled. For example, \"The service is currently unavailable. Please try again later\".
    • Maintenance Toggle: This checkbox enables or disables maintenance mode. Check this box to enable maintenance mode.

Once these steps are completed, maintenance mode will be enabled, and users will see your maintenance message when they try to access your application.

"},{"location":"development/releases/","title":"Releases","text":""},{"location":"development/releases/#tag-your-release","title":"Tag your release","text":"
  1. View the current tags in Data Workspace

  2. Make a note of the latest tag

  3. Check out the master branch and pull the latest.
  4. Create a new tag from the master branch following this format v<year>-<month>-<day> eg. v2024-01-19
  5. Push the new tag to GitHub

Example of how to tag and push

git tag -a v2024-01-19 -m v2024-01-19\n
git push origin v2024-01-19\n
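
If it helps, the tag name can be generated rather than typed by hand. The following is a small sketch only: the function name release_tag is hypothetical, and it assumes the month and day are always zero-padded to two digits, as in the example v2024-01-19.

```python
import datetime


def release_tag(day):
    # Format a date as v<year>-<month>-<day>, with month and day zero-padded.
    return 'v' + day.strftime('%Y-%m-%d')
```

For example, release_tag(datetime.date.today()) gives the tag name for a release made today.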
"},{"location":"development/releases/#create-draft-release-notes","title":"Create draft release notes","text":"
  1. View the current tags in Data Workspace

  2. Click on the tag that you just created/pushed

  3. Compare your tag with the previous. This will give you a list of all the changes to be released
  4. View the current releases in Data Workspace
  5. Click \"Draft a new release\"
  6. Click the \"Generate release notes\" option
  7. Check that the form is in the format below
  8. Check the option \"Set as latest release\"
  9. Click \"Save as draft\"

Example of release notes

## What's Changed\n\nThe main change today was a fix to an issue where external links in Quicksight dashboards weren't working. They were opening a new tab, but that tab remained blank. Now, that new tab loads the external page as it should.\n\n* build: add latest QuickSight embedding SDK (but don't use it) by @michalc in https://github.com/uktrade/data-workspace/pull/2949\n* fix: remove reference to source map to get collectstatic to work by @michalc in https://github.com/uktrade/data-workspace/pull/2950\n* feat: use latest Quicksight embedding SDK by @michalc in https://github.com/uktrade/data-workspace/pull/2951\n* feat: set COOP to same-origin-allow-popups for Quicksight by @michalc in https://github.com/uktrade/data-workspace/pull/2952\n\n**Full Changelog**: https://github.com/uktrade/data-workspace/commits/v2024-01-15\n
"},{"location":"development/releases/#release-tag-to-production","title":"Release tag to production","text":"
  1. Visit the build job in Jenkins

  2. Click \"build with parameters\" and wait until you have the option to click \"proceed to staging\"

  3. Click \"proceed\"
  4. Check the changes in staging
  5. Click \"proceed\" to production
  6. Check the changes in production
  7. Go back to the releases in GitHub
  8. Click on the draft release you created earlier
  9. Click \"Publish release\"
"},{"location":"development/releases/#post-in-the-dw-channel","title":"Post in the DW channel","text":"

Once the release is complete you can then notify everyone in the Data Workspace Teams channel. Use the format below.

Data Workspace <tag> has just been released\n\nWhat's changed?\n<high level description of what has been released>\n\nFor more details please see the release notes\n
"},{"location":"development/remotedebugging/","title":"Remote debugging containers","text":""},{"location":"development/remotedebugging/#pdb","title":"PDB","text":"

As ipdb has some issues with gevent and monkey patching we are only able to debug using vanilla pdb currently.

To set this up locally:

  1. Install remote-pdb-client with pip install remote-pdb-client, or just run pip install -r requirements-dev.txt.
  2. Ensure you have the following in dev.env:
    • PYTHONBREAKPOINT=remote_pdb.set_trace
    • REMOTE_PDB_HOST=0.0.0.0
    • REMOTE_PDB_PORT=4444
  3. Sprinkle some breakpoint()s liberally in your code.
  4. Bring up the Docker containers with docker compose up.
  5. Listen to remote pdb using remotepdb_client --host localhost --port 4444.
  6. Go and break things at http://dataworkspace.test:8000.
"},{"location":"development/remotedebugging/#pycharm","title":"Pycharm","text":"

To debug via the pycharm remote debugger you will need to jump through a few hoops:

  1. Configure docker-compose.yml as a remote interpreter:

  2. Configure a python debug server for pydev-pycharm to connect to. You will need to ensure the path mapping is set to the path of your dev environment:

  3. Bring up the containers: docker compose up

  4. Start the pycharm debugger:

  5. Add a breakpoint using pydev-pycharm:

  6. Profit:

"},{"location":"development/remotedebugging/#vscode","title":"VSCode","text":"

Below are the basic steps for debugging remotely with VSCode. They are confirmed to work but may need some tweaks, so feel free to update the docs:

  1. Install the Docker and Python debug plugins.
  2. Add a remote debug configuration to your launch.json:
    {\n\"configurations\": [\n{\n\"name\": \"Python: Remote Attach\",\n\"type\": \"python\",\n\"request\": \"attach\",\n\"connect\": {\n\"host\": \"0.0.0.0\",\n\"port\": 4444\n},\n\"pathMappings\": [\n{\n\"localRoot\": \"${workspaceFolder}/dataworkspace\",\n\"remoteRoot\": \"/dataworkspace\"\n}\n]\n}\n]\n}\n
  3. Add the following code snippet to the file that you wish to debug:
    import debugpy\ndebugpy.listen(('0.0.0.0', 4444))\ndebugpy.wait_for_client()\n
  4. Set a breakpoint in your code: breakpoint()
  5. Bring up the containers: docker compose up
  6. Start the remote python debugger:
  7. Load the relevant page http://dataworkspace.test:8000.
  8. Start debugging:
"},{"location":"development/running-locally/","title":"Running locally","text":"

To develop features on Data Workspace, or to evaluate if it's suitable for your use case, it can be helpful to run Data Workspace on your local computer.

"},{"location":"development/running-locally/#prerequisites","title":"Prerequisites","text":"

To run Data Workspace locally, you must have these tools installed:

  • Docker
  • Git

You should also have familiarity with the command line, and editing text files. If you plan to make changes to the Data Workspace source code, you should also have familiarity with Python.

"},{"location":"development/running-locally/#cloning-source-code","title":"Cloning source code","text":"

To run Data Workspace locally, you must also have the Data Workspace source code, which is stored in the Data Workspace GitHub repository. The process of copying this code so it is available locally is known as cloning.

  1. If you don't already have a GitHub account, create a GitHub account.

  2. Setup an SSH key and associate it with your GitHub account.

  3. Create a new fork of the Data Workspace repository. Make a note of the owner you choose to fork to. This is usually your GitHub username. There is more documentation on forking at GitHub's guide on contributing to projects.

    If you're a member of the uktrade GitHub organisation you should skip this step and not fork. If you're not planning on contributing changes, you can also skip forking.

  4. Clone the repository by running the following command, replacing owner with the owner that you forked to in step 3. If you skipped forking, owner should be uktrade:

    git clone git@github.com:owner/data-workspace.git\n

    This will create a new directory containing a copy of the Data Workspace source code, data-workspace.

  5. Change to the data-workspace directory:

    cd data-workspace\n
"},{"location":"development/running-locally/#creating-domains","title":"Creating domains","text":"

In order to be able to properly test cookies that are shared with subdomains, localhost is not used for local development. Instead, by default the dataworkspace.test domain is used. For this to work, you will need the below in your /etc/hosts file:

127.0.0.1       dataworkspace.test\n127.0.0.1       data-workspace-localstack\n127.0.0.1       data-workspace-sso.test\n127.0.0.1       superset-admin.dataworkspace.test\n127.0.0.1       superset-edit.dataworkspace.test\n

To run tool and visualisation-related code, you will need subdomains in your /etc/hosts file, such as:

127.0.0.1       visualisation-a.dataworkspace.test\n
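
As a quick check of the entries above, the following is a minimal sketch of a shell function that lists which of the default hostnames are still missing from a hosts file. The function name missing_hosts is hypothetical and not part of Data Workspace. It only checks the five default entries, not any visualisation subdomains you may add.

```shell
# Hypothetical helper: print each default hostname that has no 127.0.0.1 entry
# in the given hosts file (defaults to /etc/hosts).
missing_hosts() {
  hosts_file="${1:-/etc/hosts}"
  for host in dataworkspace.test data-workspace-localstack data-workspace-sso.test \
              superset-admin.dataworkspace.test superset-edit.dataworkspace.test; do
    # A line counts if its first field is 127.0.0.1 and a later field is exactly the hostname.
    if ! awk -v h="$host" '
      $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }
    ' "$hosts_file"; then
      echo "$host"
    fi
  done
}
```

Running missing_hosts with no arguments reads /etc/hosts. If it prints nothing, all five default entries are present.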
"},{"location":"development/running-locally/#starting-the-application","title":"Starting the application","text":"

Set the required variables:

cp .envs/sample.env .envs/dev.env\ncp .envs/superset-sample.env .envs/superset.dev.env\n
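
If you have already edited your env files, a plain cp will overwrite them. The following is a minimal sketch of a guard against that. The function name copy_env_if_missing is hypothetical and not part of the Data Workspace repository.

```shell
# Hypothetical helper: copy a sample env file only when the target is absent,
# so existing local settings are not overwritten.
copy_env_if_missing() {
  src="$1"
  dst="$2"
  if [ -e "$dst" ]; then
    echo "keeping existing $dst"
  else
    cp "$src" "$dst" && echo "created $dst"
  fi
}

# Usage, from the root of the data-workspace directory:
#   copy_env_if_missing .envs/sample.env .envs/dev.env
#   copy_env_if_missing .envs/superset-sample.env .envs/superset.dev.env
```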

Start the application:

docker compose up --build\n

The application should then be visible at http://dataworkspace.test:8000.

"},{"location":"development/running-locally/#running-superset-locally","title":"Running Superset locally","text":"

To run Superset, run docker compose using the superset profile:

docker compose --profile superset up\n

You can then visit http://superset-edit.dataworkspace.test:8000/ or http://superset-admin.dataworkspace.test:8000/.

"},{"location":"development/running-locally/#front-end-static-assets","title":"Front end static assets","text":"

We use node-sass to build the front end CSS and include the GOV.UK Frontend styles.

Building this locally requires Node.js, ideally installed via nvm (https://github.com/nvm-sh/nvm):

  # this will configure node from .nvmrc or prompt you to install\n  nvm use\n  npm install\n  npm run build:css\n
"},{"location":"development/running-locally/#react-apps","title":"React apps","text":"

We're set up to use django-webpack-loader for hotloading the React app while developing.

You can get it running by starting the dev server:

docker compose up\n

and in a separate terminal changing to the js app directory and running the webpack hotloader:

cd dataworkspace/dataworkspace/static/js/react_apps/\nnpm run dev\n

For production usage we use pre-built JavaScript bundles to reduce the pain of having to build npm modules at deployment.

If you make any changes to the React apps you will need to rebuild and commit the bundles. This will create the relevant js files in the /static/js/bundles/ directory:

cd dataworkspace/dataworkspace/static/js/react_apps/\n# this may take about 10 minutes to install all dependencies\nnpm install\nnpm run build\ngit add ../bundles/*.js ../stats/react_apps-stats.json\n
"},{"location":"development/running-locally/#issues-on-apple-silicon","title":"Issues on Apple Silicon","text":"

If you have issues building the containers try the following:

DOCKER_DEFAULT_PLATFORM=linux/amd64 docker compose up --build\n
"},{"location":"development/running-tests/","title":"Running tests","text":"

Running tests locally is useful when developing features on Data Workspace to make sure existing functionality isn't broken, and to ensure any new functionality works as intended.

"},{"location":"development/running-tests/#prerequisites","title":"Prerequisites","text":"

To run tests you must have the Data Workspace prerequisites and have cloned its source code. See Running locally for details.

"},{"location":"development/running-tests/#unit-and-integration-tests","title":"Unit and integration tests","text":"

To run all tests:

make docker-test\n

To only run Django unit tests:

make docker-test-unit\n

To only run higher level integration tests:

make docker-test-integration\n
"},{"location":"development/running-tests/#without-rebuilding","title":"Without rebuilding","text":"

To run the tests locally without having to rebuild the containers every time, append -local to the test make commands:

make docker-test-unit-local\n
make docker-test-integration-local\n
make docker-test-local\n

To run specific tests, pass -e TARGET=<test> into make:

make docker-test-unit-local -e TARGET=dataworkspace/dataworkspace/tests/test_admin.py::TestCustomAdminSite::test_non_admin_access\n
make docker-test-integration-local -e TARGET=test/test_application.py\n
"},{"location":"development/running-tests/#watching-selenium-tests","title":"Watching Selenium tests","text":"

We have some Selenium integration tests that launch a (headless) browser to interact with a running instance of Data Workspace and verify some core flows (only Data Explorer at the time of writing). It is sometimes desirable to watch these tests run, e.g. in order to debug where they are failing. To run the Selenium tests through docker compose using a local browser, do the following:

  1. Download the latest Selenium Server and run it in the background, e.g. java -jar ~/Downloads/selenium-server-standalone-3.141.59 &.

  2. Run the Selenium tests via docker compose, exposing the Data Workspace port and the mock-SSO port and setting the REMOTE_SELENIUM_URL environment variable, e.g. docker compose --profile test -p data-workspace-test run -e REMOTE_SELENIUM_URL=http://host.docker.internal:4444/wd/hub -p 8000:8000 -p 8005:8005 --rm data-workspace-test pytest -vvvs test/test_selenium.py.

"},{"location":"development/running-tests/#e2e-tests","title":"E2E tests","text":"

There are two ways to run the E2E tests locally. The easiest way is to use Docker; however, this option should only be used to run the tests and view the results. Because Docker caches images, if any tests need to be updated or new ones added it is better to use npm, as any test changes are immediately accessible.

"},{"location":"development/running-tests/#running-the-e2e-tests-locally-using-cypress-docker-image","title":"Running the E2E tests locally using cypress docker image","text":"

The E2E tests can be run locally using the make command make docker-e2e-build-run from the root. This will spin up the Data Workspace app pointing at a dedicated E2E database and install some E2E-specific fixtures. This database will not interfere with any local test data you have.

The Cypress tests are run using the data-workspace-e2e-test docker container. This container will start as soon as it detects the Data Workspace app is available on port 8000, and as soon as the tests complete the docker containers will be stopped. To view any tests that failed, inspect the logs of the e2e-data-workspace-cypress-1 docker container, which show a summary of all tests. Any failed tests will also have their screenshots saved in the cypress/screenshots folder in your local environment.

"},{"location":"development/running-tests/#running-the-e2e-tests-locally-using-npm","title":"Running the E2E tests locally using npm","text":"

Before running the tests, start the E2E Data Workspace app by running the make command make docker-e2e-start from the root, which will spin up the Data Workspace app pointing at a dedicated E2E database.

Once the containers have started, you can use either npm run cypress:run to run the tests in headless mode, or npm run cypress:open to use the Cypress test runner app.

"},{"location":"development/updating-dependencies/","title":"Updating dependencies","text":"

We use pip-tools to manage dependencies across two files - requirements.txt and requirements-dev.txt. These have corresponding .in files where we specify our top-level dependencies.

Add the new dependencies to those .in files, or update an existing dependency, then (with pip-tools already installed), run make save-requirements.

"},{"location":"architecture/ADRs/","title":"Architecture Decision Records","text":"

This section contains a list of Architecture Decision Records (ADRs).

"},{"location":"architecture/ADRs/#accepted","title":"Accepted","text":"
  • 0001: Using a custom proxy
  • 0002: Usage of asyncio in proxy
"}]} \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml new file mode 100644 index 0000000000..4852c64c99 --- /dev/null +++ b/sitemap.xml @@ -0,0 +1,103 @@ + + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + + None + 2024-01-26 + daily + + \ No newline at end of file diff --git a/sitemap.xml.gz b/sitemap.xml.gz new file mode 100644 index 0000000000..5b72fadf31 Binary files /dev/null and b/sitemap.xml.gz differ diff --git a/stylesheets/extra.css b/stylesheets/extra.css new file mode 100644 index 0000000000..820d439166 --- /dev/null +++ b/stylesheets/extra.css @@ -0,0 +1,54 @@ +/* This removes the borders in the combined nav/toc */ + +.md-nav--primary > .md-nav__list > .md-nav__item--active > .md-nav > .md-nav__list > .md-nav__item--active { + padding-left: 10px; + border-left: 4px solid; + border-left-color: #1d70b8; +} + +.md-nav--secondary > .md-nav__list > .md-nav__item > .md-nav__link--active{ + border-left: none; +} + +.md-nav--secondary { + font-weight: normal; +} + +.md-nav--secondary .md-nav__item::before { + content: '—'; + font-size: 20px; + display: inline-block; + vertical-align: middle; + color: #505a5f; +} + +.md-nav__item .md-nav__link--passed { + color: var(--md-typeset-a-color); +} + +.md-nav__item .md-nav__link--active { + color: var(--md-typeset-a-color); +} + +.md-nav--secondary .md-nav__link--active { + padding-left: 0; +} + +.md-nav--secondary .md-nav__link { + display: 
inline-block; + vertical-align: middle; + margin: 0; + line-height: 2; +} + +.md-nav--secondary .md-nav__list { + margin-top: 5px; +} + +[dir=ltr] .md-nav--secondary .md-nav__item { + padding-left: 0; +} + +[dir=ltr] .md-nav--integrated > .md-nav__list > .md-nav__item--active .md-nav--secondary { + border-left: none; +} diff --git a/stylesheets/tags-color.css b/stylesheets/tags-color.css new file mode 100644 index 0000000000..b3a09eb3ec --- /dev/null +++ b/stylesheets/tags-color.css @@ -0,0 +1,31 @@ +/* This sets the color of the ADR status tags see https://github.com/squidfunk/mkdocs-material/discussions/5101 */ + +.md-typeset .md-tag--draft, .md-typeset .md-tag--draft[href] { + background-color: #5694ca; + color: white; + } + + .md-typeset .md-tag--accepted, .md-typeset .md-tag--accepted[href] { + background-color: #00703c; + color: white; + } + + .md-typeset .md-tag--deprecated, .md-typeset .md-tag--deprecated[href] { + background-color: #b1b4b6; + color: white; + } + + .md-typeset .md-tag--proposed, .md-typeset .md-tag--proposed[href] { + background-color: #003078; + color: white; + } + + .md-typeset .md-tag--rejected, .md-typeset .md-tag--rejected[href] { + background-color: #f47738; + color: white; + } + + .md-typeset .md-tag--superseded, .md-typeset .md-tag--superseded[href] { + background-color: #505a5f; + color: white; + } \ No newline at end of file