Client Usage¶

Installation¶

Molar is available through PYPI, so you can simply do

$ pip install molar

Login, credentials and authentication¶

Authentication and user management is implemented separately for each databases. Each database can have a different set of users and superusers. We decided to implement it this way to give the maximum flexibility to the users.

If you are using the python client and you provided valid credentials in the client config, the authentication will be done in the background for you. You can verify that your credentials are valid using Client.test_token().

[1]:

from molar import ClientConfig, Client

admin_cfg = ClientConfig(server_url="http://localhost:8000",
                         email="docs@molar.org",
                         password="molar password",
                         database_name="main")

admin_client = Client(admin_cfg)

admin_client.test_token()

/Users/tga/.pyenv/versions/3.7.9/envs/molar-test/lib/python3.7/site-packages/jose/backends/cryptography_backend.py:18: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
  from cryptography.utils import int_from_bytes, int_to_bytes

[1]:

{'email': 'docs@molar.org',
 'is_superuser': True,
 'is_active': True,
 'full_name': 'Molar Docs',
 'created_on': '2021-06-11T08:37:34.029713',
 'user_id': 1}

Creating a new database¶

Before creating a request to create a new database, one has to choose the structure of the database they want to use. Indeed, one of the features that makes Molar modular is to support virtually any database structure, thanks to Alembic.

Available databases structure are available as revisions, and can be queried through:

[2]:

admin_client.get_alembic_revisions()

[2]:

	revision	log_entry	branch_labels
0	0bc99b5f8fcc	Rev: 0bc99b5f8fcc (head)\nParent: f31c7d486f1f...	[eventsourcing, compchem]
1	311e0c050706	Rev: 311e0c050706 (head)\nParent: bf3c5d811155...	[molar-main]
2	443c140f72c3	Rev: 443c140f72c3 (head)\nParent: f31c7d486f1f...	[eventsourcing, a3md-optoelectronics]

molar-main is the structure used for registering database requests. compchem contains a base structure for computational chemistry efforts.

It is possible to add new structure to Molar. However, you will need to have access to the server. This process in detailed in the backend section.

To create a new database, one has to make make a request first. The credentials provided in ClientConfig do not exists yet, but will be used to create the first superuser once the database request is approved.

[3]:

user_cfg = ClientConfig(server_url="http://localhost:8000",
                        email="new_user@molar.org",
                        password="new_password",
                        database_name="compchem")

user_client = Client(user_cfg)

user_client.database_creation_request(
    superuser_fullname="New User",
    alembic_revisions=["compchem@head"]
)

[10:27:02] INFO     Database request created                                    client.py:108

[3]:

{'msg': 'Database request created'}

Then, the database request can be approved by an admin:

[4]:

admin_client.approve_database("compchem")

[10:27:24] INFO     Database compchem created.                                  client.py:108

[4]:

{'msg': 'Database compchem created.'}

Now that the database compchem has been created, I can login with a new client:

[5]:

user_client.test_token()

[5]:

{'email': 'new_user@molar.org',
 'is_superuser': True,
 'is_active': True,
 'full_name': 'New User',
 'created_on': '2021-06-16T08:27:24.470584',
 'user_id': 1}

As you can see here-above, new_user@molar.org is the superuser for this new database.

Managing users¶

There are two ways to add users to the newly created database. Either the superuser can add them using Client.add_user, or it is possible for new users to register themselves. For example:

[6]:

new_test_user_cfg = ClientConfig(server_url="http://localhost:8000",
                                 email="new_user2@molar.org",
                                 password="another new password",
                                 database_name="compchem")
new_test_user_client = Client(new_test_user_cfg)

new_test_user_client.register_user(full_name="Full Name")

[10:27:28] INFO     User new_user2@molar.org has been register. Ask your        client.py:108
                    database admin to activate this account

[6]:

{'msg': 'User new_user2@molar.org has been register. Ask your database admin to activate this account'}

Once this is done, the superuser of the database needs activate this new account.

[7]:

user_client.activate_user("new_user2@molar.org")

           INFO     User new_user2@molar.org is now active!                     client.py:108

[7]:

{'msg': 'User new_user2@molar.org is now active!'}

[8]:

new_test_user_client.test_token()

[8]:

{'email': 'new_user2@molar.org',
 'is_superuser': False,
 'is_active': True,
 'full_name': 'Full Name',
 'created_on': '2021-06-16T08:27:28.095871',
 'user_id': 2}

Similarly, a superuser could also deactivate this account with Client.deactivate_user().

To get information about the uers, the superuser can do the following:

[9]:

user_client.get_users()

[9]:

	email	is_superuser	is_active	full_name	created_on
0	new_user@molar.org	True	True	New User	2021-06-16T08:27:24.470584
1	new_user2@molar.org	False	True	Full Name	2021-06-16T08:27:28.095871

Data Manipulation¶

Before adding any data to the database, we need to know what is its structure.

[10]:

db_info = user_client.get_database_information()
db_info

[10]:

	table_name	column_name	type	subtype	is_nullable	constraint_name	containt_type	references
0	calculation	calculation_id	uuid	uuid	NO	calculation_pkey	None	calculation.calculation_id
1	calculation	command_line	text	text	YES	None	None	None
2	calculation	conformer_id	uuid	uuid	NO	calculation_conformer_id_fkey	None	conformer.conformer_id
3	calculation	created_on	timestamp without time zone	timestamp	NO	None	None	None
4	calculation	input_file	text	text	YES	None	None	None
5	calculation	metadata	jsonb	jsonb	YES	None	None	None
6	calculation	output_conformer_id	uuid	uuid	YES	calculation_output_conformer_id_fkey	None	conformer.conformer_id
7	calculation	software_id	uuid	uuid	NO	calculation_software_id_fkey	None	software.software_id
8	calculation	updated_on	timestamp without time zone	timestamp	NO	None	None	None
9	conformer	atomic_numbers	ARRAY	_int4	NO	None	None	None
10	conformer	conformer_id	uuid	uuid	NO	conformer_pkey	None	conformer.conformer_id
11	conformer	created_on	timestamp without time zone	timestamp	NO	None	None	None
12	conformer	metadata	jsonb	jsonb	YES	None	None	None
13	conformer	molecule_id	uuid	uuid	YES	conformer_molecule_id_fkey	None	molecule.molecule_id
14	conformer	updated_on	timestamp without time zone	timestamp	NO	None	None	None
15	conformer	x	ARRAY	_float8	NO	None	None	None
16	conformer	y	ARRAY	_float8	NO	None	None	None
17	conformer	z	ARRAY	_float8	NO	None	None	None
18	molecule	created_on	timestamp without time zone	timestamp	NO	None	None	None
19	molecule	metadata	jsonb	jsonb	YES	None	None	None
20	molecule	molecule_id	uuid	uuid	NO	molecule_pkey	None	molecule.molecule_id
21	molecule	molecule_type_id	uuid	uuid	YES	molecule_molecule_type_id_fkey	None	molecule_type.molecule_type_id
22	molecule	smiles	text	text	NO	molecule_smiles_key	None	molecule.smiles
23	molecule	updated_on	timestamp without time zone	timestamp	NO	None	None	None
24	molecule_type	created_on	timestamp without time zone	timestamp	NO	None	None	None
25	molecule_type	metadata	jsonb	jsonb	YES	None	None	None
26	molecule_type	molecule_type_id	uuid	uuid	NO	molecule_type_pkey	None	molecule_type.molecule_type_id
27	molecule_type	name	text	text	NO	molecule_type_name_key	None	molecule_type.name
28	molecule_type	updated_on	timestamp without time zone	timestamp	NO	None	None	None
29	numerical_data	calculation_id	uuid	uuid	NO	numerical_data_calculation_id_fkey	None	calculation.calculation_id
30	numerical_data	created_on	timestamp without time zone	timestamp	NO	None	None	None
31	numerical_data	data	ARRAY	_float8	YES	None	None	None
32	numerical_data	metadata	jsonb	jsonb	YES	None	None	None
33	numerical_data	numerical_data_id	uuid	uuid	NO	numerical_data_pkey	None	numerical_data.numerical_data_id
34	numerical_data	updated_on	timestamp without time zone	timestamp	NO	None	None	None
35	software	created_on	timestamp without time zone	timestamp	NO	None	None	None
36	software	name	text	text	NO	software_name_version_unique	None	software.name
37	software	name	text	text	NO	software_name_version_unique	None	software.version
38	software	software_id	uuid	uuid	NO	software_pkey	None	software.software_id
39	software	updated_on	timestamp without time zone	timestamp	NO	None	None	None
40	software	version	text	text	NO	software_name_version_unique	None	software.name
41	software	version	text	text	NO	software_name_version_unique	None	software.version

The information obtained contains all the details of the database structure. In this example, we can see the structure contains 6 tables: - molecule, - molecule_type, - conformer, - calculation, - software, - numerical_data

Adding data¶

Now, let say we want to add a molecule_type. First, let’s have a look at the table structure:

[11]:

db_info[db_info.table_name == "molecule_type"]

[11]:

	table_name	column_name	type	subtype	is_nullable	constraint_name	containt_type	references
24	molecule_type	created_on	timestamp without time zone	timestamp	NO	None	None	None
25	molecule_type	metadata	jsonb	jsonb	YES	None	None	None
26	molecule_type	molecule_type_id	uuid	uuid	NO	molecule_type_pkey	None	molecule_type.molecule_type_id
27	molecule_type	name	text	text	NO	molecule_type_name_key	None	molecule_type.name
28	molecule_type	updated_on	timestamp without time zone	timestamp	NO	None	None	None

The columns created_on, updated_on and molecule_type_id are here for bookkeeping and are filled automatically by the database. The remaining fields are name and metadata. We can see name is of type text and metadata jsonb. This means name can be any python string, and metadata any python object that can be serialized to json. We recomment being cautious with the use of jsonb field as it can be harder to query.

You can manipulate teh data user Client.create_entry, Client.delete_entry and Client.update_entry. Here is an example:

[12]:

event = user_client.create_entry(type="molecule_type", data={"name": "brain catalyst"})
event

[12]:

{'type': 'molecule_type',
 'id': 1,
 'uuid': '32a775f3-6184-4b3b-b889-daafee3d13c4',
 'event': 'create',
 'data': {'name': 'brain catalyst'},
 'timestamp': '2021-06-16T08:27:30.098010',
 'alembic_version': ['0bc99b5f8fcc'],
 'user_id': 1}

The db_info also tells us there is a contraint on the column name. Indeed, the name has to be unique and you won’t be able to add the same data again:

[13]:

user_client.create_entry(type="molecule_type", data={"name": "brain catalyst"})

---------------------------------------------------------------------------
MolarBackendError                         Traceback (most recent call last)
<ipython-input-13-f522a4c5189d> in <module>
----> 1 user_client.create_entry(type="molecule_type", data={"name": "brain catalyst"})

~/AGG/code/goldmine/molar/client.py in create_entry(self, type, data)
    240             json=datum,
    241             headers=self.headers,
--> 242             return_pandas_dataframe=False,
    243         )
    244

~/AGG/code/goldmine/molar/client.py in request(self, url, method, params, json, data, headers, return_pandas_dataframe)
     98         if response.status_code != 200:
     99             raise MolarBackendError(
--> 100                 status_code=response.status_code, message=out["detail"]
    101             )
    102

MolarBackendError: The client got an unexpected answer from the backend:
401: Unique constraint violation!

Now let’s have a look at the molecule table.

[14]:

db_info[db_info.table_name == "molecule"]

[14]:

	table_name	column_name	type	subtype	is_nullable	constraint_name	containt_type	references
18	molecule	created_on	timestamp without time zone	timestamp	NO	None	None	None
19	molecule	metadata	jsonb	jsonb	YES	None	None	None
20	molecule	molecule_id	uuid	uuid	NO	molecule_pkey	None	molecule.molecule_id
21	molecule	molecule_type_id	uuid	uuid	YES	molecule_molecule_type_id_fkey	None	molecule_type.molecule_type_id
22	molecule	smiles	text	text	NO	molecule_smiles_key	None	molecule.smiles
23	molecule	updated_on	timestamp without time zone	timestamp	NO	None	None	None

The molecule table can take optionally (see is_nullable) a molecule_type_id, which means it can refer to an entry in the molecule_type table.

[15]:

event = user_client.create_entry(
    type="molecule",
    data={
        "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
        "molecule_type_id": event['uuid'],
        "metadata": {"name": "cafefine"}
    }
)
event

[15]:

{'type': 'molecule',
 'id': 3,
 'uuid': '0a13b53b-5a15-48aa-98d5-9b929c7c0407',
 'event': 'create',
 'data': {'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C',
  'metadata': {'name': 'cafefine'},
  'molecule_type_id': '32a775f3-6184-4b3b-b889-daafee3d13c4'},
 'timestamp': '2021-06-16T08:27:40.201146',
 'alembic_version': ['0bc99b5f8fcc'],
 'user_id': 1}

Similarly, one can update a field using its uuid:

[16]:

event = user_client.update_entry(
    uuid=event['uuid'],
    type='molecule',
    data={"metadata": {"name": "caffeine"}}
)
event

[16]:

{'type': 'molecule',
 'id': 4,
 'uuid': '0a13b53b-5a15-48aa-98d5-9b929c7c0407',
 'event': 'update',
 'data': {'metadata': {'name': 'caffeine'}},
 'timestamp': '2021-06-16T08:27:40.620397',
 'alembic_version': ['0bc99b5f8fcc'],
 'user_id': 1}

[17]:

event = user_client.delete_entry(
    type="molecule",
    uuid=event["uuid"]
)

Event sourcing¶

Rather than writing directly to the tables, Molar saves all the events into another table, called eventstore. The data is then projected to their respective table. This method is called event sourcing (as opposed to active records). The eventstore can be access through the client:

[18]:

es = user_client.view_entries()
es

[18]:

	type	id	uuid	event	data	timestamp	alembic_version	user_id
0	molecule_type	1	32a775f3-6184-4b3b-b889-daafee3d13c4	create	{'name': 'brain catalyst'}	2021-06-16T08:27:30.098010	[0bc99b5f8fcc]	1
1	molecule	3	0a13b53b-5a15-48aa-98d5-9b929c7c0407	create	{'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C', 'me...	2021-06-16T08:27:40.201146	[0bc99b5f8fcc]	1
2	molecule	4	0a13b53b-5a15-48aa-98d5-9b929c7c0407	update	{'metadata': {'name': 'caffeine'}}	2021-06-16T08:27:40.620397	[0bc99b5f8fcc]	1
3	molecule	5	0a13b53b-5a15-48aa-98d5-9b929c7c0407	delete	{}	2021-06-16T08:27:40.870340	[0bc99b5f8fcc]	1

Event sourcing becomes interesting when something goes wrong. Maybe the last entry, deleting the molecule with uuid 280ff363-78bf-4c58-b1b1-6ab152954ef6 was a mistake. In this case, we can undo the last events (or rather, empty the database and replay all the event until the last one):

[19]:

user_client.rollback(before=es.at[3, 'timestamp'])

[19]:

{'type': None,
 'id': 6,
 'uuid': None,
 'event': 'rollback-begin',
 'data': {'before': '2021-06-16 08:27:40.870340'},
 'timestamp': '2021-06-16T08:27:42.297604',
 'alembic_version': ['0bc99b5f8fcc'],
 'user_id': 1}

Here is the eventstore after the rollback. We can see the list of event that have been replayed.

[20]:

user_client.view_entries().sort_values(by=['id'])

[20]:

	type	id	uuid	event	data	timestamp	alembic_version	user_id
0	molecule_type	1	32a775f3-6184-4b3b-b889-daafee3d13c4	create	{'name': 'brain catalyst'}	2021-06-16T08:27:30.098010	[0bc99b5f8fcc]	1
1	molecule	3	0a13b53b-5a15-48aa-98d5-9b929c7c0407	create	{'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C', 'me...	2021-06-16T08:27:40.201146	[0bc99b5f8fcc]	1
2	molecule	4	0a13b53b-5a15-48aa-98d5-9b929c7c0407	update	{'metadata': {'name': 'caffeine'}}	2021-06-16T08:27:40.620397	[0bc99b5f8fcc]	1
3	molecule	5	0a13b53b-5a15-48aa-98d5-9b929c7c0407	delete	{}	2021-06-16T08:27:40.870340	[0bc99b5f8fcc]	1
9	None	6	None	rollback-begin	{'before': '2021-06-16 08:27:40.870340'}	2021-06-16T08:27:42.297604	[0bc99b5f8fcc]	1
4	molecule_type	7	32a775f3-6184-4b3b-b889-daafee3d13c4	delete	{}	2021-06-16T08:27:42.297604	[0bc99b5f8fcc]	1
5	molecule_type	8	32a775f3-6184-4b3b-b889-daafee3d13c4	create	{'name': 'brain catalyst'}	2021-06-16T08:27:42.297604	[0bc99b5f8fcc]	1
6	molecule	9	0a13b53b-5a15-48aa-98d5-9b929c7c0407	create	{'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C', 'me...	2021-06-16T08:27:42.297604	[0bc99b5f8fcc]	1
7	molecule	10	0a13b53b-5a15-48aa-98d5-9b929c7c0407	update	{'metadata': {'name': 'caffeine'}}	2021-06-16T08:27:42.297604	[0bc99b5f8fcc]	1
8	None	11	None	rollback-end	{'before': '2021-06-16 08:27:40.870340'}	2021-06-16T08:27:42.297604	[0bc99b5f8fcc]	1

In theory, an event sourcing makes it also possible to merge two databases with the same structure together. This has not been implemented yet.

Querying¶

Querying the database is relatively straight forward if you are familar with SQL. The queries that can be formulated with Molar are only a subset of what it is possible to do with SQL, but this should be sufficient for most use case.

Client.query_database is the only method that you need to use to make queries. It is relatively straight forward to use if you want to fetch data from a single table:

[21]:

user_client.query_database(
    types='molecule'
)

[21]:

	molecule_id	created_on	updated_on	metadata	smiles	molecule_type_id
0	0a13b53b-5a15-48aa-98d5-9b929c7c0407	2021-06-16T08:27:42.297604	2021-06-16T08:27:42.297604	{'name': 'caffeine'}	CN1C=NC2=C1C(=O)N(C(=O)N2C)C	32a775f3-6184-4b3b-b889-daafee3d13c4

and also:

[22]:

user_client.query_database(types="molecule_type")

[22]:

	molecule_type_id	created_on	updated_on	name
0	32a775f3-6184-4b3b-b889-daafee3d13c4	2021-06-16T08:27:42.297604	2021-06-16T08:27:42.297604	brain catalyst

Joins¶

It is also possible to perform joins. For instance we want to fetch all data from molecule and the name of the molecule_type. To do so, we can use the joins argument:

[23]:

user_client.query_database(
    types=['molecule', "molecule_type.name"],
    joins={
        'type': "molecule_type"
    }
)

[23]:

	molecule.molecule_id	molecule.created_on	molecule.updated_on	molecule.metadata	molecule.smiles	molecule.molecule_type_id	molecule_type.name
0	0a13b53b-5a15-48aa-98d5-9b929c7c0407	2021-06-16T08:27:42.297604	2021-06-16T08:27:42.297604	{'name': 'caffeine'}	CN1C=NC2=C1C(=O)N(C(=O)N2C)C	32a775f3-6184-4b3b-b889-daafee3d13c4	brain catalyst

Advanced joining¶

Of course, it is often more complicated than that. One may want to join twice the same table on different key. This would be the case with the table calculation that can have a conformer_id and an output_conformer_id.

Let’s first add some data and see how we can join these table together:

[24]:

conformer = user_client.create_entry(
    type="conformer",
    data={
        "x": [0],
        "y": [0],
        "z": [0],
        "atomic_numbers": [2],
        "molecule_id": event["uuid"]
    }
)
software = user_client.create_entry(
    type="software",
    data={
        "name": "cp2k",
        "version": "v1.0"
    }
)
calculation = user_client.create_entry(
    type="calculation",
    data={
        "software_id": software["uuid"],
        "conformer_id": conformer["uuid"],
        "output_conformer_id": conformer["uuid"]
    }
)

If we want to query for both initial_conformer and the output_conformer, we need to use aliases:

[25]:

user_client.query_database(
    types=["calculation", "initial_conformer", "output_conformer"],
    joins=[
        {
            "type": 'initial_conformer',
            "on": {
                "column1": "initial_conformer.conformer_id",
                "column2": "calculation.conformer_id"
            },
        },
        {
            "type": 'output_conformer',
            "on": {
                "column1": "output_conformer.conformer_id",
                "column2": "calculation.output_conformer_id"
            },
        }
    ],
    aliases=[
        {"type": "conformer", "alias": "initial_conformer"},
        {"type": "conformer", "alias": "output_conformer"}
    ]
)

[25]:

	calculation.calculation_id	calculation.created_on	calculation.updated_on	calculation.conformer_id	calculation.software_id	calculation.output_conformer_id	initial_conformer.conformer_id	initial_conformer.molecule_id	initial_conformer.created_on	initial_conformer.updated_on	...	initial_conformer.z	initial_conformer.atomic_numbers	output_conformer.conformer_id	output_conformer.molecule_id	output_conformer.created_on	output_conformer.updated_on	output_conformer.x	output_conformer.y	output_conformer.z	output_conformer.atomic_numbers
0	8ac0923d-51c3-4e33-b8bb-feb318282a15	2021-06-16T08:27:46.786091	2021-06-16T08:27:46.786091	dba3908e-83c7-4627-a301-6dfb31ceec53	9c4b9dcc-00ad-4a61-8034-c0be85836815	dba3908e-83c7-4627-a301-6dfb31ceec53	dba3908e-83c7-4627-a301-6dfb31ceec53	0a13b53b-5a15-48aa-98d5-9b929c7c0407	2021-06-16T08:27:46.671052	2021-06-16T08:27:46.671052	...	[0.0]	[2]	dba3908e-83c7-4627-a301-6dfb31ceec53	0a13b53b-5a15-48aa-98d5-9b929c7c0407	2021-06-16T08:27:46.671052	2021-06-16T08:27:46.671052	[0.0]	[0.0]	[0.0]	[2]

1 rows × 22 columns

This way the backend knows precisely which columns we are referring to in our join.

Filtering¶

It’s also possible to filter the entries:

[26]:

user_client.query_database(
    types='molecule',
    filters={
        'type': "molecule.metadata.name",
        "op": "==",
        "value": "caffeine"
    }
)

[26]:

	molecule_id	created_on	updated_on	metadata	smiles	molecule_type_id
0	0a13b53b-5a15-48aa-98d5-9b929c7c0407	2021-06-16T08:27:42.297604	2021-06-16T08:27:42.297604	{'name': 'caffeine'}	CN1C=NC2=C1C(=O)N(C(=O)N2C)C	32a775f3-6184-4b3b-b889-daafee3d13c4

The available operators are:

Operator	String
Equal	“==”
Not equal	“!=”
Greater than	“>”
Lower than	“<”
Greater or equal	“>=”
Lower or equal	“<=”
In	“in”
Not in	“not_in”
Pattern matching (case sensitive)	“like”
Pattern matching (cas insensitive)	“ilike”
	“notlike”
	“notilike”

For the pattern matching operators, please refer to the postgresql documentation

Ordering¶

The output of a query can be ordered according to any field.

[27]:

user_client.create_entry(
    type='molecule',
    data={
        "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",
        "metadata": {
            "name": "aspirin"
        }
    }
)

[27]:

{'type': 'molecule',
 'id': 15,
 'uuid': '4668e413-689c-43a9-81c3-9bc77d89e628',
 'event': 'create',
 'data': {'smiles': 'CC(=O)OC1=CC=CC=C1C(=O)O',
  'metadata': {'name': 'aspirin'}},
 'timestamp': '2021-06-16T08:27:50.388519',
 'alembic_version': ['0bc99b5f8fcc'],
 'user_id': 1}

[28]:

user_client.query_database(
    types='molecule',
    order_by={'type': 'molecule.metadata.name', 'order': 'asc'}
)

[28]:

	molecule_id	created_on	updated_on	metadata	smiles	molecule_type_id
0	4668e413-689c-43a9-81c3-9bc77d89e628	2021-06-16T08:27:50.388519	2021-06-16T08:27:50.388519	{'name': 'aspirin'}	CC(=O)OC1=CC=CC=C1C(=O)O	None
1	0a13b53b-5a15-48aa-98d5-9b929c7c0407	2021-06-16T08:27:42.297604	2021-06-16T08:27:42.297604	{'name': 'caffeine'}	CN1C=NC2=C1C(=O)N(C(=O)N2C)C	32a775f3-6184-4b3b-b889-daafee3d13c4

Debugging queries¶

Sometimes, to create or debug a query, it is convenient to access its raw SQL. This can be done using Client.debug_query.

[33]:

out = user_client.debug_query(
    types='molecule',
    order_by={'type': 'molecule.metadata.name', 'order': 'asc'}
)
print(out)

SELECT public.molecule.molecule_id, public.molecule.created_on, public.molecule.updated_on, public.molecule.metadata, public.molecule.smiles, public.molecule.molecule_type_id
FROM public.molecule ORDER BY public.molecule.metadata ->> 'name' ASC
 LIMIT 10 OFFSET 0

Molar provides also a way to analyze the query. Internally, this is done using the EXPLAIN ANALYZE statement. This can be useful to determine whether an index is required in the database to speed up a query. More information about this statement can be found in Postgres’ documentation.

[32]:

out = user_client.debug_query(
    types='molecule',
    order_by={'type': 'molecule.metadata.name', 'order': 'asc'},
    explain_analyze=True
)
print(out)

Limit  (cost=29.78..29.81 rows=10 width=144) (actual time=0.093..0.171 rows=2 loops=1)
  ->  Sort  (cost=29.78..31.23 rows=580 width=144) (actual time=0.078..0.114 rows=2 loops=1)
        Sort Key: ((metadata ->> 'name'::text))
        Sort Method: quicksort  Memory: 25kB
        ->  Seq Scan on molecule  (cost=0.00..17.25 rows=580 width=144) (actual time=0.021..0.045 rows=2 loops=1)
Planning Time: 0.087 ms
Execution Time: 0.237 ms

Removing a database¶

Databases can be removed. The following example will remove completely the database compchem (users and data). Be careful when using this command in code.

[ ]:

admin_client.remove_database("compchem")