Command Line Interface API

Users can interact with DSI Readers, Writers, and Backends more easily through DSI’s Command Line Interface (CLI). While slightly more restrictive than the Python API, the CLI lets users work with DSI without any knowledge of Python.

Users can load several files into DSI, then query, search, and export the loaded data to other formats. Users can also write the loaded data to a permanent database file for post-analysis.

The CLI actions and example workflows are shown below.

CLI Setup and Actions

Once a user has successfully installed DSI, they can activate the CLI environment by entering dsi in their command line. This automatically creates a hidden SQLite database that users can interact with.

To use DuckDB instead, users should activate the CLI with dsi -b duckdb in their command line. All subsequent actions then use a hidden DuckDB database.

To view all available CLI actions without launching the CLI, users can enter dsi help in their command line.
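Because dsi help runs without opening the interactive prompt, its output can also be captured from a script. The following is a minimal sketch using only the Python standard library. The function name capture_cli_help and the timeout value are illustrative choices, not part of DSI, and the sketch assumes the dsi executable is on the PATH.

```python
import shutil
import subprocess


def capture_cli_help(executable="dsi", timeout=30):
    """Return the text printed by `<executable> help`, or None if it is not installed."""
    if shutil.which(executable) is None:
        return None  # the executable is not on PATH in this environment
    result = subprocess.run(
        [executable, "help"],   # the non-interactive help invocation described above
        capture_output=True,
        text=True,
        timeout=timeout,        # guard against an unexpected interactive prompt
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    help_text = capture_cli_help()
    print(help_text if help_text is not None else "dsi was not found on PATH")
```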

A comprehensive list of all actions in the CLI environment is as follows:

help

Displays a help menu for CLI actions and their inputs.

display <table_name> [-n num_rows] [-e filename]

Displays data from a specified table, with optional arguments.

  • table_name is a mandatory input to display that table.

  • num_rows is optional and only displays the first N rows.

  • filename is optional and exports the table to a CSV or Parquet file.

draw [-f filename]

Draws an ER diagram of all data loaded into DSI.

  • filename is optional; default is er_diagram.png.

exit

Exits the CLI and closes all active DSI modules.

find <condition>

Finds all rows of a table that match a condition written in the format [column] [operator] [value]. Example: find 'age = 6'

Valid operators:

  • age > 4

  • age < 4

  • age >= 4

  • age <= 4

  • age = 4

  • age == 4

  • age ~ 4 -> column age contains the number 4

  • age ~~ 4 -> column age contains the number 4

  • age != 4

  • age (4, 8) -> all values in age between 4 and 8 (inclusive)
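To make the operator semantics concrete, the sketch below translates a find condition into an equivalent SQLite WHERE clause. It is an illustration of the documented behavior, not DSI's implementation: the function names condition_to_where and find_rows are hypothetical, and mapping ~ and ~~ to a LIKE substring match is an assumption based on the "contains" wording above.

```python
import re
import sqlite3

# Hypothetical mapping from CLI comparison operators to SQL operators.
_COMPARISONS = {">": ">", "<": "<", ">=": ">=", "<=": "<=",
                "=": "=", "==": "=", "!=": "!="}


def _coerce(text):
    """Turn '4' into 4 and '4.5' into 4.5; leave anything else as text."""
    text = text.strip().strip("'\"")
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def condition_to_where(condition):
    """Translate '[column] [operator] [value]' into (where_clause, params)."""
    # Range form, e.g. "age (4, 8)": inclusive on both ends.
    match = re.fullmatch(r"\s*(\w+)\s*\(\s*([^,]+),\s*([^)]+)\)\s*", condition)
    if match:
        column, low, high = match.groups()
        return f'"{column}" BETWEEN ? AND ?', [_coerce(low), _coerce(high)]

    # Two-character operators are listed first so ">=" is not read as ">".
    match = re.fullmatch(r"\s*(\w+)\s*(>=|<=|==|!=|~~|~|>|<|=)\s*(.+?)\s*", condition)
    if not match:
        raise ValueError(f"Unrecognized condition: {condition!r}")
    column, operator, value = match.groups()

    if operator in ("~", "~~"):
        # Assumed meaning of "contains": substring match on the text form.
        return f'CAST("{column}" AS TEXT) LIKE ?', [f"%{_coerce(value)}%"]
    return f'"{column}" {_COMPARISONS[operator]} ?', [_coerce(value)]


def find_rows(conn, table, condition):
    """Run the translated condition against one table of an open connection."""
    where, params = condition_to_where(condition)
    return conn.execute(f'SELECT * FROM "{table}" WHERE {where}', params).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?)",
                     [("a", 3), ("b", 4), ("c", 6), ("d", 14)])
    for cond in ("age = 4", "age ~ 4", "age (4, 8)"):
        print(cond, "->", find_rows(conn, "people", cond))
```

Values are passed as query parameters rather than pasted into the SQL string, so quoting in the value cannot change the query.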

list

Lists the names of all tables and their dimensions.

plot_table <table_name> [-f filename]

Plots numerical data from the specified table.

  • table_name is a mandatory input to plot that table.

  • filename is optional; default is <table_name>_plot.png.

query <SQL_query> [-n num_rows] [-e filename]

Executes a specified query (in quotes) and prints the result with optional arguments.

  • SQL_query is mandatory and must match the syntax of the active backend (SQLite or DuckDB).

  • num_rows is optional; prints the first N rows of the result.

  • filename is optional; exports the result to a CSV or Parquet file.
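A file exported with -e can be picked up by other tools for further analysis. The sketch below loads an exported CSV with Python's standard csv module. It assumes the export is comma-delimited with a header row of column names, which this page does not state explicitly, and the function name read_exported_csv is illustrative.

```python
import csv
from pathlib import Path


def read_exported_csv(path):
    """Load an exported CSV into a list of {column: value} dictionaries."""
    # newline="" is the csv module's recommended way to open CSV files.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    export = Path("nodes.csv")  # hypothetical export file name
    if export.exists():
        for row in read_exported_csv(export):
            print(row)
```

The csv module returns every value as a string, so numeric columns need an explicit conversion such as int(row["resources_gpu"]).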

read <filename> [-t table_name]

Reads specified data into DSI.

  • filename is a mandatory input of data to ingest. Accepted formats:

    • CSV, JSON, TOML, YAML, Parquet, SQLite databases, DuckDB databases

    • URL pointing to data stored in one of the above formats

  • table_name is optional. If reading a CSV, JSON, or Parquet file, users can specify table_name.

search <value>

Searches for a value, either a number or text, across all data loaded into DSI.

summary [-t table_name]

Displays numerical statistics of all tables or a specified table.

  • table_name is optional and summarizes only that specified table.

write <filename>

Writes the hidden DSI backend to a designated location. This permanent file has the same type as the hidden backend: SQLite by default, or DuckDB if the CLI was started with dsi -b duckdb.
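Since the written file is an ordinary database of the backend's type, it can be opened outside the CLI for post-analysis. The sketch below inspects a file written from the default SQLite backend using Python's standard sqlite3 module. The function name table_row_counts and the file name in the usage section are illustrative, and a file written from a DuckDB backend would need the duckdb package instead.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # Open read-only so post-analysis cannot modify the written database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__":
    written = Path("dsi_output.db")  # hypothetical file name passed to write
    if written.exists():
        for table, rows in table_row_counts(written).items():
            print(f"{table}: {rows} rows")
```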

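The CLI also supports basic Unix commands such as cd (change directory), ls (list files), and clear (clear the command line view). Its help menu additionally lists viewers, which prints the available viewers, and view <available viewer>, which opens a DSI viewer in another application.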

CLI Example

The terminal session below shows several ways to use DSI’s CLI for data science analysis.

my_user@local-machine examples % dsi
    _____           ___                          
   /  /  \         /  /\         ___     
  /  / /\ \       /  / /_       /  /\    
 /  / /  \ \     /  / / /\     /  / /    
/__/ / \__\ |   /  / / /  \   /__/  \    
\  \ \ /  / /  /__/ / / /\ \  \__\/\ \__ 
 \  \ \  / /   \  \ \/ / / /     \  \ \/\
  \  \ \/ /     \  \  / / /       \__\  /
   \  \  /       \__\/ / /        /__/ / 
    \__\/          /__/ /         \__\/  
                   \__\/                   v1.1.4

Created a temporary Sqlite DSI backend

Enter "help" for usage hints.
dsi> help

display <table_name> [-n num_rows] [-e filename] Displays a table's data. Optionally limit 
                                                  displayed rows and export to CSV/Parquet
draw [-f filename]                               Draws an ER diagram of all tables in the 
                                                  current DSI database
exit                                             Exits the DSI Command Line Interface (CLI)
find <condition>                                 Finds all rows of a table that match a 
                                                  column-level condition.
help                                             Shows this help message.
list                                             Lists all tables in the current DSI database
plot_table <table_name> [-f filename]            Plots numerical data from a table to an 
                                                  optional file name argument
query <SQL_query> [-n num_rows] [-e filename]    Executes a SQL query (in quotes). Optionally 
                                                  limit printed rows or export to CSV/Parquet
read <data_source> [-t table_name]               Reads a file or URL into the DSI database. 
                                                  Optionally set table name.
search <value>                                   Searches for a string or number across DSI.
summary [-t table_name]                          Summary of the database or a specific table.
viewers                                          Prints the available viewers for the user.
view <available viewer>                          Creates an instance of the DSI viewer in 
                                                  another application.
write <filename>                                 Writes data in DSI database to a permanent 
                                                  location.
ls                                               Lists all files in the current or specified 
                                                  directory.
cd <path>                                        Changes the working directory within the CLI 
                                                  environment.

dsi> read wildfire/wildfire_oceans11.yml -t oceans_data
Loaded wildfire/wildfire_oceans11.yml into the table oceans_data
Database now has 1 table

dsi> read pennant/pennant_oceans11.yml -t oceans_data
Loaded pennant/pennant_oceans11.yml into the table oceans_data
Database now has 1 table

dsi> read test/example.toml
Loaded test/example.toml into the table nodes
Database now has 2 tables

dsi> read test/results.toml
Loaded test/results.toml into the table people
Database now has 3 tables

dsi> read test/yosemite5.csv
Loaded test/yosemite5.csv into the table yosemite5
Database now has 4 tables

dsi> list

Table: oceans_data
  - num of columns: 15
  - num of rows: 2

Table: nodes
  - num of columns: 2
  - num of rows: 2

Table: people
  - num of columns: 6
  - num of rows: 1

Table: yosemite5
  - num of columns: 9
  - num of rows: 4

dsi> query "SELECT * FROM nodes" -e nodes.csv
Printing the result from input SQL query: SELECT * FROM nodes

name  | resources_gpu
---------------------
node1 | 4            
node2 | 2            

Exported the query result to nodes.csv

dsi> display people -e people_output.csv

Table: people

avg_height_units | avg_height_value | median_speed_units | median_speed_value | std_gravity_units | std_gravity_value
---------------------------------------------------------------------------------------------------------------------
m                | 5.5              | s                  | 6.95               | m/s/s             | 9.83             

Exported people to people_output.csv

dsi> draw -f dsi_er_diagram.png
Saved an ER Diagram at dsi_er_diagram.png

dsi> plot_table people -f people_plot.png
Saved a plot of the people table in people_plot.png

dsi> summary -t oceans_data

Table: oceans_data

column                  | type    | unique | min  | max  | avg  | std_dev            
-------------------------------------------------------------------------------------
title                   | VARCHAR | 2      | None | None | None | None               
description             | VARCHAR | 2      | None | None | None | None               
keywords                | VARCHAR | 2      | None | None | None | None               
instructions_of_use     | VARCHAR | 2      | None | None | None | None               
authorship_authors      | VARCHAR | 2      | None | None | None | None               
authorship_release_date | VARCHAR | 2      | None | None | None | None               
authorship_la_ur        | VARCHAR | 2      | None | None | None | None               
authorship_funding      | VARCHAR | 1      | None | None | None | None               
authorship_rights       | VARCHAR | 2      | None | None | None | None               
data_file_types         | VARCHAR | 2      | None | None | None | None               
data_file_size          | VARCHAR | 2      | None | None | None | None               
data_num_files          | INTEGER | 2      | 10   | 75   | 42.5 | 32.5               
data_dataset_size       | VARCHAR | 2      | None | None | None | None               
data_version            | FLOAT   | 2      | 0.9  | 1.0  | 0.95 | 0.04999999999999999
data_doi                | VARCHAR | 1      | None | None | None | None  

dsi> viewers
Available viewers are: dashboard, ml

dsi> view ml

View the ML emulator at http://localhost:8501

To exit, press [Ctrl + C] here
^C
 Closing ML Emulator.

dsi> write dsi_output.db
Successfully wrote all data to dsi_output.db

dsi> exit

Exiting...
my_user@local-machine examples %
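The std_dev values in the summary output above are consistent with a population standard deviation (dividing by N rather than N - 1). This is an inference from the printed numbers, not a documented guarantee. The sketch below checks it with Python's statistics module, using the two values implied by the min and max columns of the two-row oceans_data table.

```python
import statistics

# With two rows and two unique values, min and max are the full column.
columns = {
    "data_num_files": [10, 75],
    "data_version": [0.9, 1.0],
}

for name, values in columns.items():
    mean = statistics.mean(values)
    population_sd = statistics.pstdev(values)  # divides by N
    sample_sd = statistics.stdev(values)       # divides by N - 1
    print(f"{name}: avg={mean} population_sd={population_sd} sample_sd={sample_sd}")
```

For data_num_files the population standard deviation is 32.5, matching the summary, while the sample standard deviation is roughly 45.96.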