Command line interface
ehrql [--help] [--version] COMMAND_NAME ...
COMMAND_NAME
Name of the sub-command to execute.
generate-dataset: Take a dataset definition file and output a dataset.
generate-measures: Take a measures definition file and output measures.
sandbox: Start the ehrQL sandbox environment.
dump-example-data: Dump example data for the ehrQL tutorial to the current directory.
dump-dataset-sql: Output the SQL that would be executed to fetch the results of the dataset definition.
create-dummy-tables: Generate dummy tables and write them out as CSV files (one per table).
assure: Experimental command for running assurance tests.
test-connection: Internal command for testing the database connection configuration.
serialize-definition: Internal command for serializing a definition file to a JSON representation.
isolation-report: Internal command for testing code isolation support.
--help
Show this help message and exit.
--version
Show the exact version of ehrQL in use and then exit.
generate-dataset
ehrql generate-dataset DEFINITION_FILE [--help] [--output DATASET_FILE]
[--dummy-data-file DUMMY_DATA_FILE] [--dummy-tables DUMMY_TABLES_PATH]
[--dsn DSN] [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
[ -- ... PARAMETERS ...]
ehrQL is designed so that exactly the same command can be used to output a dummy dataset when run on your own computer and then output a real dataset when run inside the secure environment as part of an OpenSAFELY pipeline.
DEFINITION_FILE
Path of the Python file where the dataset is defined.
--help
Show this help message and exit.
--output DATASET_FILE
Path of the file where the dataset will be written (console by default).
The file extension determines the file format used. Supported formats are: .arrow, .csv, .csv.gz
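To make the extension rule concrete, here is a minimal sketch that maps an output path to one of the three supported extensions. It is an illustration only and not ehrQL's own code: the function name `output_format` and the error message are hypothetical.

```python
# Hypothetical helper, not part of ehrQL. It only mirrors the documented rule
# that the output file's extension selects the file format.
SUPPORTED_EXTENSIONS = (".arrow", ".csv", ".csv.gz")


def output_format(path: str) -> str:
    """Return the supported extension that `path` ends with."""
    for extension in SUPPORTED_EXTENSIONS:
        if path.endswith(extension):
            return extension
    raise ValueError(f"Unsupported output file extension: {path}")


if __name__ == "__main__":
    # Placeholder paths chosen for this example.
    for example in ("output/dataset.arrow", "output/dataset.csv.gz"):
        print(example, "->", output_format(example))
```

A path such as `output/dataset.txt` raises `ValueError` in this sketch; how ehrQL itself reports an unsupported extension is not covered here.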
--dummy-data-file DUMMY_DATA_FILE
Path to a dummy dataset.
This allows you to take complete control of the dummy dataset. ehrQL will ensure that the column names, types and categorical values match what they will be in the real dataset, but does no further validation.
Note that the dummy dataset doesn't need to be of the same type as the real dataset (e.g. you can use a .csv file here to produce a .arrow file).
This argument is ignored when running against real tables.
--dummy-tables DUMMY_TABLES_PATH
Path to directory of CSV files (one per table) to use as dummy tables (see create-dummy-tables).
This argument is ignored when running against real tables.
PARAMETERS
Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash (--).
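As a rough sketch of how a definition file might read such parameters, the snippet below uses Python's standard `argparse` module. It assumes the parameters after the double-dash reach the definition file as ordinary command-line arguments; the parameter name `--index-date`, its default value and the helper `parse_parameters` are hypothetical and chosen only for this example.

```python
# Sketch of the top of a definition file that accepts one parameter.
# `--index-date` and its default are hypothetical, chosen for illustration.
import argparse


def parse_parameters(argv=None):
    """Parse parameters; with argv=None argparse reads the command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--index-date", default="2024-01-01")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_parameters()
    # The rest of the definition would use args.index_date from here on.
    print(args.index_date)
```

Under these assumptions the value could be supplied with a command such as `ehrql generate-dataset dataset_definition.py --output output/dataset.csv -- --index-date 2023-04-01`, where the file names are placeholders.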
You should not normally need to use the following arguments: they are for the internal operation of ehrQL and the OpenSAFELY platform.
--dsn DSN
Data Source Name: URL of remote database, or path to data on disk (defaults to value of DATABASE_URL environment variable).
--query-engine QUERY_ENGINE_CLASS
Dotted import path to Query Engine class, or one of: mssql, sqlite, csv, trino
--backend BACKEND_CLASS
Dotted import path to Backend class, or one of: emis, tpp
generate-measures
ehrql generate-measures DEFINITION_FILE [--help] [--output OUTPUT_FILE]
[--dummy-tables DUMMY_TABLES_PATH] [--dummy-data-file DUMMY_DATA_FILE]
[--dsn DSN] [--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
[ -- ... PARAMETERS ...]
DEFINITION_FILE
Path of the Python file where measures are defined.
--help
Show this help message and exit.
--output OUTPUT_FILE
Path of the file where the measures will be written (console by default). Supported formats: .arrow, .csv, .csv.gz
--dummy-tables DUMMY_TABLES_PATH
Path to directory of CSV files (one per table) to use as dummy tables (see create-dummy-tables).
This argument is ignored when running against real tables.
--dummy-data-file DUMMY_DATA_FILE
Path to dummy measures output.
This allows you to take complete control of the dummy measures output. ehrQL will ensure that the column names, types and categorical values match what they will be in the real measures output, but does no further validation.
Note that the dummy measures output doesn't need to be of the same type as the real measures output (e.g. you can use a .csv file here to produce a .arrow file).
This argument is ignored when running against real tables.
PARAMETERS
Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash (--).
You should not normally need to use the following arguments: they are for the internal operation of ehrQL and the OpenSAFELY platform.
--dsn DSN
Data Source Name: URL of remote database, or path to data on disk (defaults to value of DATABASE_URL environment variable).
--query-engine QUERY_ENGINE_CLASS
Dotted import path to Query Engine class, or one of: mssql, sqlite, csv, trino
--backend BACKEND_CLASS
Dotted import path to Backend class, or one of: emis, tpp
sandbox
ehrql sandbox DUMMY_TABLES_PATH [--help]
DUMMY_TABLES_PATH
Path to directory of CSV files (one per table).
--help
Show this help message and exit.
dump-example-data
ehrql dump-example-data [--help]
--help
Show this help message and exit.
dump-dataset-sql
ehrql dump-dataset-sql DEFINITION_FILE [--help] [--output OUTPUT_FILE]
[--query-engine QUERY_ENGINE_CLASS] [--backend BACKEND_CLASS]
[ -- ... PARAMETERS ...]
By default, this command will output SQL suitable for the SQLite database.
To get the SQL as it would be run against the real tables you will need to supply the appropriate --backend argument, for example --backend tpp.
Note that due to configuration differences this may not always exactly match what gets run against the real tables.
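The difference between the default invocation and one targeting a specific backend can be sketched by assembling the two command lines. The helper `dump_sql_command` and the file names are hypothetical; the snippet only builds and prints the commands and does not run ehrQL.

```python
# Builds (but does not run) dump-dataset-sql command lines.
# "dataset_definition.py" and "output/dataset.sql" are placeholder paths.
import shlex


def dump_sql_command(definition_file, backend=None, output=None):
    """Return the argument list for an `ehrql dump-dataset-sql` call."""
    command = ["ehrql", "dump-dataset-sql", definition_file]
    if backend is not None:
        command += ["--backend", backend]
    if output is not None:
        command += ["--output", output]
    return command


if __name__ == "__main__":
    # Default: no backend given, so the SQL targets SQLite.
    print(shlex.join(dump_sql_command("dataset_definition.py")))
    # SQL as it would be generated for the tpp backend, written to a file.
    print(shlex.join(dump_sql_command(
        "dataset_definition.py", backend="tpp", output="output/dataset.sql"
    )))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting, should you want to execute the command.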
DEFINITION_FILE
Path of the Python file where the dataset is defined.
--help
Show this help message and exit.
--output OUTPUT_FILE
SQL output file (outputs to console by default).
--query-engine QUERY_ENGINE_CLASS
Dotted import path to Query Engine class, or one of: mssql, sqlite, csv, trino
--backend BACKEND_CLASS
Dotted import path to Backend class, or one of: emis, tpp
PARAMETERS
Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash (--).
create-dummy-tables
ehrql create-dummy-tables DEFINITION_FILE DUMMY_TABLES_PATH [--help]
[ -- ... PARAMETERS ...]
This command generates the same dummy tables that the generate-dataset command would generate, but instead of using them to produce a dummy dataset, it writes them out as CSV files.
The directory containing the CSV files can then be used as the --dummy-tables argument to generate-dataset to produce the dummy dataset.
The CSV files can be edited in any way you wish, giving you full control over the dummy tables.
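The two-step workflow described above can be sketched as a pair of command lines: one to write the dummy tables, one to build the dataset from them. The helper `dummy_table_workflow` and all paths are hypothetical; the snippet only assembles and prints the commands, leaving room to edit the CSV files between the two steps.

```python
# Assembles the two commands of the dummy-tables workflow without running
# them. All paths below are placeholders chosen for this example.
import shlex


def dummy_table_workflow(definition_file, tables_dir, dataset_file):
    """Return [create-dummy-tables command, generate-dataset command]."""
    create = ["ehrql", "create-dummy-tables", definition_file, tables_dir]
    generate = [
        "ehrql", "generate-dataset", definition_file,
        "--dummy-tables", tables_dir,
        "--output", dataset_file,
    ]
    return [create, generate]


if __name__ == "__main__":
    commands = dummy_table_workflow(
        "dataset_definition.py", "dummy_tables", "output/dataset.csv"
    )
    for command in commands:
        print(shlex.join(command))
```

The same directory is passed to both commands, so any edits made to the CSV files after the first step are what the second step reads.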
DEFINITION_FILE
Path of the Python file where the dataset is defined.
DUMMY_TABLES_PATH
Path to directory where CSV files (one per table) will be written.
--help
Show this help message and exit.
PARAMETERS
Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash (--).
assure
ehrql assure TEST_DATA_FILE [--help] [ -- ... PARAMETERS ...]
Note that this command is experimental and not yet intended for widespread use.
TEST_DATA_FILE
Path of the file where the test data is defined.
--help
Show this help message and exit.
PARAMETERS
Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash (--).
test-connection
ehrql test-connection [--help] [-b BACKEND_CLASS] [-u URL]
Note that this is an internal command and not intended for end users.
--help
Show this help message and exit.
-b BACKEND_CLASS
Dotted import path to Backend class, or one of: emis, tpp
-u URL
Database connection string.
serialize-definition
ehrql serialize-definition DEFINITION_FILE [--help]
[--definition-type DEFINITION_TYPE] [--output OUTPUT_FILE]
[ -- ... PARAMETERS ...]
Note that this is an internal command and not intended for end users.
DEFINITION_FILE
Definition file path.
--help
Show this help message and exit.
--definition-type DEFINITION_TYPE
Options: dataset, measures, test
--output OUTPUT_FILE
Output file path (stdout by default).
PARAMETERS
Parameters are extra arguments you can pass to your Python definition file. They must be supplied after all ehrQL arguments and separated from the ehrQL arguments with a double-dash (--).
isolation-report
ehrql isolation-report [--help]
Note that this is an internal command and not intended for end users.
--help
Show this help message and exit.