Commit 532741df authored by 20after4, committed by Mhurd

Implement dashboards as web components

parent 54eb5781
# This file is a template, and might need editing before it works on your project.
# To contribute improvements to CI/CD templates, please follow the Development guide at:
# https://docs.gitlab.com/ee/development/cicd/templates.html
# This specific template is located at:
# https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Getting-Started.gitlab-ci.yml
# This is a sample GitLab CI/CD configuration file that should run without any modifications.
# It demonstrates a basic 3 stage CI/CD pipeline. Instead of real tests or scripts,
# it uses echo commands to simulate the pipeline execution.
#
# A pipeline is composed of independent jobs that run scripts, grouped into stages.
# Stages run in sequential order, but jobs within stages run in parallel.
#
# For more information, see: https://docs.gitlab.com/ee/ci/yaml/index.html#stages
# Official language image. Look for the different tagged releases at:
# https://hub.docker.com/r/library/python/tags/
image: docker-registry.wikimedia.org/python3-build-bullseye:latest
# Change pip's cache directory to be inside the project directory since we can
@@ -31,16 +14,19 @@ cache:
paths:
- .cache/pip
- venv/
- node_modules/
stages: # List of stages for jobs, and their order of execution
- prep
- build
- test
- deploy
unit-test-job: # This job runs in the test stage.
stage: test # It only starts when the job in the build stage completes successfully.
venv:
stage: prep
script:
- python3 -V # Print out python version for debugging
- pip3 install virtualenv
- if [ ! -d "./venv" ]
- then
- virtualenv --python=python3 venv
- fi
@@ -48,5 +34,37 @@ unit-test-job: # This job runs in the test stage.
- git submodule update --init --recursive
- pip3 install poetry pytest
- poetry build
npm:
image: docker-registry.wikimedia.org/releng/npm6:0.1.1
stage: build
script:
# From https://stackoverflow.com/q/61625191/1672995
- export NVM_DIR="$HOME/.nvm" && . "$NVM_DIR/nvm.sh" --no-use #load nvm
- eval "[ -f .nvmrc ] && nvm install || nvm install 14" #install node
- node --version
- npm --version
- npm install
- npm run build
- npx grunt
- npx tsc
pytest:
stage: test # It only starts when the job in the build stage completes successfully.
script:
- source venv/bin/activate
- poetry install
- pytest test/
deploy-testing:
stage: deploy
environment: data.releng.team
script:
- cp -af . /home/gitlab-runner/deploy
- cd /home/gitlab-runner/deploy
- virtualenv --python=python3 venv
- source /home/gitlab-runner/deploy/venv/bin/activate
- pip3 install poetry
- poetry install
- cp www/settings_staging.json www/settings.json
- touch /home/gitlab-runner/deploy/
module.exports = function(grunt) {
// Project configuration.
grunt.initConfig({
pkg: grunt.file.readJSON('package.json'),
"svgstore": {
"options": {
"prefix" : "icon-"
},
"default": {
"files": {
"www/static/icons.svg": ["www/static/icons/*.svg"]
}
}
}
});
grunt.loadNpmTasks('grunt-svgstore');
grunt.registerTask('default', ['svgstore']);
}
@@ -7,26 +7,30 @@ The first applications for the Data³ tools are focused on exploring software de
The core of the toolkit consists of the following:
* Datasette.io provides a front-end for browsing and querying one or more SQLite databases.
* A customized version of datasette-dashboards is included for visualizing the output of queries in Vega/Vega-Lite charts; it can also render jinja templates for custom reports or interactive displays.
* A simple dashboard web app that uses the datasette json api to query sqlite and displays the resulting data as vega-lite charts or html templates for custom reports or interactive displays.
* A comprehensive python library and command line interface for querying and processing Phabricator task data exported via conduit api requests.
* Several custom dashboards for datasette which provide visualization of metrics related to Phabricator tasks and workflows.
* A custom dashboard to explore data and statistics about production MediaWiki deployments.
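As a concrete illustration of the Datasette layer, the sketch below queries the JSON API from Python. The `metrics` database name and the query are examples only; the `?sql=` and `_shape=array` parameters are Datasette's standard JSON API.

```python
import requests

BASE = "https://data.releng.team/dev"  # demo instance, see below

# Datasette serves each database at /<db>.json and accepts read-only SQL
# via the ?sql= parameter; _shape=array returns the rows as a JSON list.
resp = requests.get(
    f"{BASE}/metrics.json",
    params={"sql": "select * from task_metrics limit 5", "_shape": "array"},
)
resp.raise_for_status()
for row in resp.json():
    print(row)
```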
## Demo / Development Instance
There is a development & testing instance of Datasette and the Data³ Dashboard at [https://data.releng.team/dev/](https://data.releng.team/dev/)
## Status
This tool and supporting libraries are in early stages of experimentation and
development. The APIs are not yet stable and the feature set is not yet decided,
let alone completely implemented. Stay tuned or get involved.
This tool and supporting libraries are currently experimental. The dashboard and initial data model have reached the stage of [MVP](https://en.wikipedia.org/wiki/Minimum_viable_product). The future development direction is currently uncertain but this is a solid foundation to build on.
This project has a wiki page on MediaWiki.org: [Data³/Metrics-Dashboard](https://www.mediawiki.org/wiki/Data%C2%B3/Metrics-Dashboard)
## Currently supported data sources:
* Phabricator's conduit API.
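For reference, a raw Conduit request looks roughly like the following. This is a sketch of the underlying HTTP API only (`maniphest.search` and `api.token` are standard Conduit), not the client code from this repository:

```python
import requests

PHAB_API = "https://phabricator.wikimedia.org/api"

def maniphest_search(api_token, project_phid, after=None):
    # Conduit accepts PHP-style form encoding for nested parameters.
    data = {
        "api.token": api_token,
        "constraints[projects][0]": project_phid,
        "limit": 100,
    }
    if after:
        # The pagination cursor comes back in result["cursor"]["after"].
        data["after"] = after
    resp = requests.post(f"{PHAB_API}/maniphest.search", data=data)
    resp.raise_for_status()
    return resp.json()["result"]
```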
## Coming soon:
## Future Possibilities:
* Elastic ELK
* Wikimedia SAL
* Gerrit's rest API
* GitLab APIs
# Usage
@@ -37,7 +41,10 @@ setup.py will install a command line tool called `dddcli`
To install for development use:
```bash
python3 setup.py develop
pip3 install virtualenv poetry
virtualenv --python=python3 .venv
source .venv/bin/activate
poetry install
```
### dddcli
@@ -104,7 +111,7 @@ For deployment on a server, there are sample systemd units in `etc/systemd/*` in
restart datasette when the data changes. Approximately the same behavior is achieved by the `--reload` argument to the `datasette` command given here, which is adequate for local development and testing.
### Datasette Plugins
Datasette has been extended with some plugins to add custom functionality.
@@ -113,6 +120,10 @@ Datasette has been extended with some plugins to add custom functionality.
`src/datacube-dashboards`. Do the usual `git submodule update --init` to get that source code.
* There are custom views and routes added in ddd_datasette.py that map urls like /-/ddd/$page/ to files in `www/templates/view/`.
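For illustration, a route like that can be registered with Datasette's documented `register_routes` plugin hook. This is a minimal sketch, not the actual code in `ddd_datasette.py`:

```python
from datasette import hookimpl
from datasette.utils.asgi import Response

@hookimpl
def register_routes():
    async def view_page(datasette, request):
        # Map /-/ddd/<page>/ to a template in www/templates/view/
        page = request.url_vars["page"]
        return Response.html(
            await datasette.render_template(f"view/{page}.html", request=request)
        )

    return [(r"^/-/ddd/(?P<page>[A-Za-z0-9_-]+)/$", view_page)]
```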
# Dashboards
The data³ Dashboards web application is documented in [./docs/DefiningDashboards.md](docs/DefiningDashboards.md).
# Example code:
## Conduit API client:
......
# Data³ Dashboards
Dashboards in Data³ are built in HTML from web components that implement a few custom HTML elements.
The dashboard UI consists of a query form across the top of the page, which is the primary navigation element. Editing any of the query fields changes state that is kept in the URL and in the JavaScript code that implements the reactive application.
## Application Structure
A good place to start understanding the application is with an overview of the key components and their roles:
The dashboard page is built from the view template in [dashboard.html](../www/templates/views/dashboard.html).
* Most of the application is implemented in TypeScript, which is compiled to JavaScript and bundled into a single file, `www/static/app.js`.
* The application is loaded by `require.js`, which is called from a script tag in dashboard.html.
* Once app.js is loaded, the custom elements are bound to their JavaScript implementations and the components initialize in the order in which they are referenced in the `initApp()` function in [DashboardApp.ts](../www/static/DashboardApp.ts).
* DashboardApp.ts - the "main" entrypoint for the application. It initializes the application and coordinates the other components.
* The navigation UI is implemented by classes in filter-input.ts, specifically `AutocompleteFilter` and `DaterangeFilter`.
* The dashboard charts are implemented by passing a vega-lite spec to vega-embed, which is wrapped by the web component class `VegaChart`. See [vega-tonic.ts](../www/static/vega-tonic.ts).
* The data for the charts is provided by DataSource instances (see the sketch after this list).
  * Each DataSource is defined with a SQL query template.
  * The query can contain :placeholders which are filled in at runtime with the values of corresponding URL query parameters.
  * When the URL state changes, any DataSource that references a changed variable is notified to update.
  * Each affected DataSource then fetches fresh JSON from the back-end database. When the fetch completes, dependent charts are notified of the update and given the new data.
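A rough sketch of that parameter flow, in Python for brevity (the real implementation is the TypeScript `DataSource` class; `datasource_url` and the query text here are illustrative only):

```python
from urllib.parse import urlencode

# Illustrative query template; real templates are loaded from .sql files.
SQL = "select date, count(*) as tasks from task_metrics where project = :project"

def datasource_url(base: str, db: str, url_params: dict) -> str:
    """Build the Datasette JSON API request a DataSource would issue."""
    # Only URL parameters whose names appear as :placeholders in the SQL
    # are forwarded; a change to any of them triggers a refetch.
    params = {k: v for k, v in url_params.items() if f":{k}" in SQL}
    return f"{base}/{db}.json?" + urlencode(
        {"sql": SQL, "_shape": "array", **params}
    )

# datasource_url("https://data.releng.team/dev", "metrics",
#                {"project": "PHID-PROJ-xyz", "unrelated": "ignored"})
```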
---
I've attempted to describe the application state and control flow with the following diagram:
### State Flowchart
```mermaid
flowchart TB
classDef dashed stroke:#ccc,stroke-width:2px,color:#fff,stroke-dasharray: 5 5;
User-->|interaction| InputFilter
InputFilter("InputFilter (Query filter user interface)")
Query{{Query state}}
subgraph DataSources
DataSource{{"<data-source>"}}
db[(Datasette back-end)]
SQL{{Parameterized SQL}}
SQL-- execute query -->db;
db --> |result json| DataSource;
DataSource-- :parameters -->SQL;
class SQL dashed
end
InputFilter-- query filter state change-->Query;
URL(Browser URL)-- popState / setState -->Query;
Query-- history.pushState -->URL;
Query-- :parameters -->DataSource
subgraph Charts[Charts]
DataSource--> |query results|a;
a{{"<vega-chart>"}} --> VegaChart
VegaChart -->|vega-lite spec + data| VegaEmbed(Vega-Embed renderer);
subgraph VegaChart[VegaChart instances...]
spec{{vega-lite spec www/views/charts/*.yaml}}
class layout,databind dashed
end
class VegaChart dashed
end
```
-----
## Adding new charts to the dashboard:
To add a new chart, it is easiest to start from an existing example such as [leadtime.yaml](../www/templates/views/charts/leadtime.yaml): make a copy of it under a different name.
The yaml structure controls the positioning of the chart as well as the vega spec, which maps query columns to axes on the chart.
### Example:
-----
```yaml
# The first part of the yaml defines the name of the chart,
# the database ("metrics.db") and the query name that will be
# used to get the data for the chart.
title: Lead & Cycle Time Histogram
db: metrics
tab: charts
order: 4 # the order of the chart, relative to other
# charts on the page.
query: cycletime # the name of the query, this will read the
# query's sql definition from a file called
# cycletime.sql
type: vega # this tells dashboard to use the vega-embed
# library to render the chart.
# Everything within the display section defines a vega-lite
# specification. vega-lite is normally specified in json format, and
# to satisfy the vega compiler we produce json: this yaml is converted
# directly, by parsing it with the python yaml parser and encoding the
# resulting structure using the python json encoder.
display:
# example vega-lite view specification formatted as yaml:
width: 400
height: 300
mark:
type: bar
tooltip: true
encoding:
x:
field: duration
type: ordinal
bin:
maxbins: 20
title: Cycle time (days, binned)
y:
aggregate: count
title: Count of tasks
color:
field: duration
scale:
scheme: browns
legend: null
```
-----
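The yaml-to-json conversion described in the comments above amounts to something like the following (a minimal sketch assuming PyYAML; the app's actual loading code may differ):

```python
import json
import yaml  # PyYAML

with open("www/templates/views/charts/leadtime.yaml") as f:
    chart = yaml.safe_load(f)

# The display section is the vega-lite view specification; dumping it
# to JSON yields exactly what vega-embed consumes.
print(json.dumps(chart["display"], indent=2))
```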
To learn more about the vega view specification language you can read about it in the [vega-lite documentation](https://vega.github.io/vega-lite/docs/spec.html) or browse some [examples](https://vega.github.io/vega-lite/examples/).
{
"name": "data-cubed",
"repository": {
"url": "https://gitlab.wikimedia.org/repos/releng/ddd",
"type": "git"
},
"main": "www/static/DashboardApp.ts",
"scripts": {
"prepare": "grunt svgstore",
"prebuild": "tsc",
"build": "esbuild ./www/static/DashboardApp.js --bundle --outfile=./www/static/app.js --target=es2020 --format=iife --platform=browser --keep-names"
},
"dependencies": {
"@operatortc/components": "^13.0.0",
"@operatortc/tonic": "^14.0.0",
"@popperjs/core": "^2.10.2",
"@trevoreyre/autocomplete-js": "^2.2.0",
"@types/luxon": "^2.0.7",
"@typescript/lib-dom": "npm:@types/web",
"bootstrap": "^5.1.3",
"chart.js": "^3.6.0",
"chartjs-adapter-luxon": "^1.1.0",
"chartjs-plugin-datalabels": "^2.0.0",
"jsoneditor": "^9.5.7",
"luxon": "^2.0.2",
"vega": "^5",
"vega-embed": "^6.19.1",
"vega-lite": "^5",
"xhr": "^2.6.0"
},
"devDependencies": {
"@babel/core": "^7.15.8",
"@babel/preset-env": "^7.15.8",
"@babel/preset-react": "^7.14.5",
"babel-preset-es2015": "^6.24.1",
"babel-preset-react": "^6.24.1",
"babel-preset-stage-0": "^6.24.1",
"babelify": "^10.0.0",
"browserify": "^17.0.0",
"es2015": "0.0.0"
"@rollup/plugin-node-resolve": "^13.0.6",
"esbuild": "^0.13.13",
"grunt": "^1.4.1",
"grunt-svgstore": "^2.0.0",
"typescript": "^4.6.0-dev.20211116"
},
"babel": {
"presets": [
"@babel/preset-env",
"@babel/preset-react"
]
},
"version": "0.0.1"
}
@@ -10,16 +10,26 @@ packages = [
]
[tool.poetry.dependencies]
python = '^3.7'
python = '>=3.7.3, ~=3.10.1'
typer = {extras = ["all"], version = "^0.3.2"}
datasette = { path = "src/datasette", develop = true }
datasette-dashboards = { path = "src/datasette-dashboards", develop = true }
#datasette = { path = "src/datasette", develop = true }
#datasette-dashboards = { path = "src/datasette-dashboards", develop = true }
datasette = "^0.59"
datasette-render-markdown = "^2.0"
#note: datasette-render-markdown's dependencies seem to be broken
#unless we force a newer version of importlib-metadata
importlib-metadata = ">3.10"
Markdown = "^3.3.6"
datasette-block-robots = "^1.0"
click = "<7.2"
semver = "^2.13.0"
requests = "^2.26.0"
sqlite-utils = "^3.17"
rich = "^10.11.0"
regex = "2021.10.8"
pandas = "^1.3.4"
numpy = "^1.21.4"
datasette-hovercards = "^0.1a0"
[tool.poetry.dev-dependencies]
black = "^21.9b0"
......
#!/bin/bash
# this should run after deployment, in the staging environment
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
nvm use 14
npx tsc
npx esbuild ./www/static/DashboardApp.js --bundle --outfile=www/static/app.js --target=es2020 --format=iife --platform=browser --keep-names
cp ./www/settings_staging.json ./www/settings.json
#!/bin/bash
# load nvm so this works in a non-interactive shell (as in the deploy script above)
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
nvm use 14
npx tsc --watch &
sleep 2
npx esbuild ./www/static/DashboardApp.js --bundle --outfile=www/static/app.js --target=es2020 --format=iife --platform=browser --keep-names --watch
// Snowpack Configuration File
// See all supported options: https://www.snowpack.dev/reference/configuration
/** @type {import("snowpack").SnowpackUserConfig } */
module.exports = {
root: "./www/static",
workspaceRoot: "./",
plugins: [
'@snowpack/plugin-typescript'
],
exclude: [
'**/*.py',
'**/*.sql',
'**/*.html',
'**/*.bak',
'**/*.pyc',
'**/*.zip',
'**/.mypy_cache',
'**/.mypy_cache/**',
'**/__pycache__',
'**/*.svg'
],
packageOptions: {
source: "local",
knownEntrypoints: ['chart.js', 'chart.js/helpers', 'chart.js/auto']
/* ... */
},
devOptions: {
/* ... */
},
buildOptions: {
out: "www/static/build",
clean: false
}
};
Subproject commit 944e256d653b723e171e3a2722a5478bbc41e0ed
Subproject commit d32e3ac7595791924b85ecc2cdc25ec95c0ceac1
class FakeConsole:
"""this is used if rich is not installed. This makes rich an optional dependency
and keeps debugging code and console logging from breaking the world.
"""
def log(self, *output):
print(*output)
def print_exception(self, e=None):
print(e)
def status(self, *msg):
print(*msg)
return self
def update(self, *msg):
print(*msg)
return self
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
return False
try:
from rich.console import Console
console = Console(stderr=True)
except ImportError:
console = FakeConsole()
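# Example usage (behaves the same whether or not rich is installed):
#
#   with console.status("fetching...") as sts:
#       sts.update("still fetching")
#   console.log("done")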
@@ -9,7 +9,7 @@ import sys
from datetime import datetime, timedelta
from pprint import pprint
from sqlite3 import Connection
from typing import Iterable, Optional, Sized
from typing import Iterable, Optional, Sized, Union
import click
from rich.console import Console
@@ -32,7 +32,7 @@ all_tables = ["columns", "events", "column_metrics", "task_metrics", "phobjects"
cli = Typer(callback=config, no_args_is_help=True, invoke_without_command=True)
def cache_tasks(conduit:Conduit, cache:DataCache, tasks:list, sts):
def cache_tasks(conduit: Conduit, cache: DataCache, tasks: Iterable, sts):
ids = ', '.join(tasks)
with cache.con as db:
rows = db.execute(f'select id from Task where id in ({ids})')
@@ -45,11 +45,6 @@ def cache_tasks(conduit:Conduit, cache:DataCache, tasks:list, sts):
new_instances = []
for task in r.data:
task.save()
# instance = PHObject.instance(phid=PHID(key), data=vals, save=True)
# new_instances.append(instance)
# cache.store_all(r.data)
def cache_projects(conduit: Conduit, cache, sts, project):
@@ -85,13 +80,13 @@ def cache_columns(ctx: typer.Context, project: str = Option("all")):
"""
config = ctx.meta["config"] # type: Config
PHObject.db = config.db
config.console.log("Fetching workboard column details from phabricator.")
console.log("Fetching workboard column details from phabricator.")
if project == "all":
r = config.phab.project_columns()
else:
r = config.phab.project_columns(project=PHID(project))
count = 0
with config.console.status("[bold green]Fetching more pages...") as sts:
with console.status("[bold green]Fetching more pages...") as sts:
r.fetch_all(sts)
proxy_phids = []
@@ -113,7 +108,7 @@ def cache_columns(ctx: typer.Context, project: str = Option("all")):
f"Saved [bold green]{count}[/bold green] ([bold blue]{pct}%[/bold blue]) Project Columns."
)
config.console.log(f"Fetched & cached {count} Project Columns.")
console.log(f"Fetched & cached {count} Project Columns.")
config.db.conn.commit()
config.console.log("Updating phobjects cache.")
_, cache = init_caches(config.db, config.phab)
@@ -123,18 +118,25 @@ def cache_columns(ctx: typer.Context, project: str = Option("all")):
cache_projects(config.phab, cache, sts, project)
cache_projects(config.phab, cache, sts, proxy_phids)
optimize(config)
def optimize(config):
with console.status("[bold green]Running optimize") as sts:
config.db.conn.executescript('PRAGMA analysis_limit=1000;PRAGMA optimize;')
@cli.command()
def map(
ctx: typer.Context,
project: str = Option(None),
task_ids: Optional[str] = Option(None),
taskids: Optional[str] = Option(None),
mock: Optional[str] = Option(None),
cache_objects: Optional[bool] = Option(False),
linear: Optional[bool] = Option(False),
after: Optional[int] = Option(0),
pages: Optional[int] = Option(1),
order: Optional[str] = Option('id'),
cursor_id: Optional[str] = Option(None),
reset_cursor: Optional[bool] = Option(False),
):
"""Gather workboard metrics from Phabricator"""
config = ctx.meta["config"] # type: Config
@@ -143,7 +145,10 @@ def map(
db_path = config.db_path
console = config.console
project_phid = project
if taskids:
    task_ids = taskids.split(',')
else:
    task_ids = []
all_projects:set[Project]
try:
@@ -163,36 +168,57 @@ def map(
transactions = transactions["result"]
elif linear:
task_ids = []
arg={
"queryKey":"all",
"order": ['id'],
"order": [order, 'id'],
"attachments": {
"projects": True
},
"limit":100
}
if after == -1:
with db.conn:
res = db.conn.execute('select min(id) from Task')
row = res.fetchone()
after = row[0]
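# Resume from a named cursor persisted in the conduit_cursor table;
# --reset-cursor rewinds it so paging starts over from the beginning.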
elif cursor_id:
if reset_cursor:
res = db.conn.execute('update conduit_cursor set after_id=0 where name=:name', {"name": cursor_id})
after = 0
else:
res = db.conn.execute('select after_id from conduit_cursor where name=:name', {"name": cursor_id})
row = res.fetchone()
if row and len(row):
after = row[0]
else:
after = 0
if after:
arg['after'] = after
r = phab.request("maniphest.search", arg
)
if pages and pages > 1:
for i in range(pages):
console.log("Fetching next page", r.cursor)
r.next_page()
for task in r.data:
console.log(f'Resuming after task id [bold blue]T{after}[/bold blue]')
r = phab.request("maniphest.search", arg)
while len(r.data):
task = r.data.popleft()
task_ids.append(task.id)
task.save()
if not project_phid and not task_ids:
console.log("Either A project phid or a list of tasks are required.")
return False