Client Usage¶
Login, credentials and authentication¶
Authentication and user management is implemented separately for each databases. Each database can have a different set of users and superusers. We decided to implement it this way to give the maximum flexibility to the users.
If you are using the python client and you provided valid credentials in the client config, the authentication will be done in the background for you. You can verify that your credentials are valid using Client.test_token().
[1]:
from molar import ClientConfig, Client
admin_cfg = ClientConfig(server_url="http://localhost:8000",
email="docs@molar.org",
password="molar password",
database_name="main")
admin_client = Client(admin_cfg)
admin_client.test_token()
/Users/tga/.pyenv/versions/3.7.9/envs/molar-test/lib/python3.7/site-packages/jose/backends/cryptography_backend.py:18: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
from cryptography.utils import int_from_bytes, int_to_bytes
[1]:
{'email': 'docs@molar.org',
'is_superuser': True,
'is_active': True,
'full_name': 'Molar Docs',
'created_on': '2021-06-11T08:37:34.029713',
'user_id': 1}
Creating a new database¶
Before creating a request to create a new database, one has to choose the structure of the database they want to use. Indeed, one of the features that makes Molar modular is to support virtually any database structure, thanks to Alembic.
Available databases structure are available as revisions, and can be queried through:
[2]:
admin_client.get_alembic_revisions()
[2]:
| revision | log_entry | branch_labels | |
|---|---|---|---|
| 0 | 0bc99b5f8fcc | Rev: 0bc99b5f8fcc (head)\nParent: f31c7d486f1f... | [eventsourcing, compchem] |
| 1 | 311e0c050706 | Rev: 311e0c050706 (head)\nParent: bf3c5d811155... | [molar-main] |
| 2 | 443c140f72c3 | Rev: 443c140f72c3 (head)\nParent: f31c7d486f1f... | [eventsourcing, a3md-optoelectronics] |
molar-main is the structure used for registering database requests. compchem contains a base structure for computational chemistry efforts.
It is possible to add new structure to Molar. However, you will need to have access to the server. This process in detailed in the backend section.
To create a new database, one has to make make a request first. The credentials provided in ClientConfig do not exists yet, but will be used to create the first superuser once the database request is approved.
[3]:
user_cfg = ClientConfig(server_url="http://localhost:8000",
email="new_user@molar.org",
password="new_password",
database_name="compchem")
user_client = Client(user_cfg)
user_client.database_creation_request(
superuser_fullname="New User",
alembic_revisions=["compchem@head"]
)
[10:27:02] INFO Database request created client.py:108
[3]:
{'msg': 'Database request created'}
Then, the database request can be approved by an admin:
[4]:
admin_client.approve_database("compchem")
[10:27:24] INFO Database compchem created. client.py:108
[4]:
{'msg': 'Database compchem created.'}
Now that the database compchem has been created, I can login with a new client:
[5]:
user_client.test_token()
[5]:
{'email': 'new_user@molar.org',
'is_superuser': True,
'is_active': True,
'full_name': 'New User',
'created_on': '2021-06-16T08:27:24.470584',
'user_id': 1}
As you can see here-above, new_user@molar.org is the superuser for this new database.
Managing users¶
There are two ways to add users to the newly created database. Either the superuser can add them using Client.add_user, or it is possible for new users to register themselves. For example:
[6]:
new_test_user_cfg = ClientConfig(server_url="http://localhost:8000",
email="new_user2@molar.org",
password="another new password",
database_name="compchem")
new_test_user_client = Client(new_test_user_cfg)
new_test_user_client.register_user(full_name="Full Name")
[10:27:28] INFO User new_user2@molar.org has been register. Ask your client.py:108 database admin to activate this account
[6]:
{'msg': 'User new_user2@molar.org has been register. Ask your database admin to activate this account'}
Once this is done, the superuser of the database needs activate this new account.
[7]:
user_client.activate_user("new_user2@molar.org")
INFO User new_user2@molar.org is now active! client.py:108
[7]:
{'msg': 'User new_user2@molar.org is now active!'}
[8]:
new_test_user_client.test_token()
[8]:
{'email': 'new_user2@molar.org',
'is_superuser': False,
'is_active': True,
'full_name': 'Full Name',
'created_on': '2021-06-16T08:27:28.095871',
'user_id': 2}
Similarly, a superuser could also deactivate this account with Client.deactivate_user().
To get information about the uers, the superuser can do the following:
[9]:
user_client.get_users()
[9]:
| is_superuser | is_active | full_name | created_on | ||
|---|---|---|---|---|---|
| 0 | new_user@molar.org | True | True | New User | 2021-06-16T08:27:24.470584 |
| 1 | new_user2@molar.org | False | True | Full Name | 2021-06-16T08:27:28.095871 |
Data Manipulation¶
Before adding any data to the database, we need to know what is its structure.
[10]:
db_info = user_client.get_database_information()
db_info
[10]:
| table_name | column_name | type | subtype | is_nullable | constraint_name | containt_type | references | |
|---|---|---|---|---|---|---|---|---|
| 0 | calculation | calculation_id | uuid | uuid | NO | calculation_pkey | None | calculation.calculation_id |
| 1 | calculation | command_line | text | text | YES | None | None | None |
| 2 | calculation | conformer_id | uuid | uuid | NO | calculation_conformer_id_fkey | None | conformer.conformer_id |
| 3 | calculation | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 4 | calculation | input_file | text | text | YES | None | None | None |
| 5 | calculation | metadata | jsonb | jsonb | YES | None | None | None |
| 6 | calculation | output_conformer_id | uuid | uuid | YES | calculation_output_conformer_id_fkey | None | conformer.conformer_id |
| 7 | calculation | software_id | uuid | uuid | NO | calculation_software_id_fkey | None | software.software_id |
| 8 | calculation | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
| 9 | conformer | atomic_numbers | ARRAY | _int4 | NO | None | None | None |
| 10 | conformer | conformer_id | uuid | uuid | NO | conformer_pkey | None | conformer.conformer_id |
| 11 | conformer | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 12 | conformer | metadata | jsonb | jsonb | YES | None | None | None |
| 13 | conformer | molecule_id | uuid | uuid | YES | conformer_molecule_id_fkey | None | molecule.molecule_id |
| 14 | conformer | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
| 15 | conformer | x | ARRAY | _float8 | NO | None | None | None |
| 16 | conformer | y | ARRAY | _float8 | NO | None | None | None |
| 17 | conformer | z | ARRAY | _float8 | NO | None | None | None |
| 18 | molecule | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 19 | molecule | metadata | jsonb | jsonb | YES | None | None | None |
| 20 | molecule | molecule_id | uuid | uuid | NO | molecule_pkey | None | molecule.molecule_id |
| 21 | molecule | molecule_type_id | uuid | uuid | YES | molecule_molecule_type_id_fkey | None | molecule_type.molecule_type_id |
| 22 | molecule | smiles | text | text | NO | molecule_smiles_key | None | molecule.smiles |
| 23 | molecule | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
| 24 | molecule_type | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 25 | molecule_type | metadata | jsonb | jsonb | YES | None | None | None |
| 26 | molecule_type | molecule_type_id | uuid | uuid | NO | molecule_type_pkey | None | molecule_type.molecule_type_id |
| 27 | molecule_type | name | text | text | NO | molecule_type_name_key | None | molecule_type.name |
| 28 | molecule_type | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
| 29 | numerical_data | calculation_id | uuid | uuid | NO | numerical_data_calculation_id_fkey | None | calculation.calculation_id |
| 30 | numerical_data | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 31 | numerical_data | data | ARRAY | _float8 | YES | None | None | None |
| 32 | numerical_data | metadata | jsonb | jsonb | YES | None | None | None |
| 33 | numerical_data | numerical_data_id | uuid | uuid | NO | numerical_data_pkey | None | numerical_data.numerical_data_id |
| 34 | numerical_data | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
| 35 | software | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 36 | software | name | text | text | NO | software_name_version_unique | None | software.name |
| 37 | software | name | text | text | NO | software_name_version_unique | None | software.version |
| 38 | software | software_id | uuid | uuid | NO | software_pkey | None | software.software_id |
| 39 | software | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
| 40 | software | version | text | text | NO | software_name_version_unique | None | software.name |
| 41 | software | version | text | text | NO | software_name_version_unique | None | software.version |
The information obtained contains all the details of the database structure. In this example, we can see the structure contains 6 tables: - molecule, - molecule_type, - conformer, - calculation, - software, - numerical_data
Adding data¶
Now, let say we want to add a molecule_type. First, let’s have a look at the table structure:
[11]:
db_info[db_info.table_name == "molecule_type"]
[11]:
| table_name | column_name | type | subtype | is_nullable | constraint_name | containt_type | references | |
|---|---|---|---|---|---|---|---|---|
| 24 | molecule_type | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 25 | molecule_type | metadata | jsonb | jsonb | YES | None | None | None |
| 26 | molecule_type | molecule_type_id | uuid | uuid | NO | molecule_type_pkey | None | molecule_type.molecule_type_id |
| 27 | molecule_type | name | text | text | NO | molecule_type_name_key | None | molecule_type.name |
| 28 | molecule_type | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
The columns created_on, updated_on and molecule_type_id are here for bookkeeping and are filled automatically by the database. The remaining fields are name and metadata. We can see name is of type text and metadata jsonb. This means name can be any python string, and metadata any python object that can be serialized to json. We recomment being cautious with the use of jsonb field as it can be harder to query.
You can manipulate teh data user Client.create_entry, Client.delete_entry and Client.update_entry. Here is an example:
[12]:
event = user_client.create_entry(type="molecule_type", data={"name": "brain catalyst"})
event
[12]:
{'type': 'molecule_type',
'id': 1,
'uuid': '32a775f3-6184-4b3b-b889-daafee3d13c4',
'event': 'create',
'data': {'name': 'brain catalyst'},
'timestamp': '2021-06-16T08:27:30.098010',
'alembic_version': ['0bc99b5f8fcc'],
'user_id': 1}
The db_info also tells us there is a contraint on the column name. Indeed, the name has to be unique and you won’t be able to add the same data again:
[13]:
user_client.create_entry(type="molecule_type", data={"name": "brain catalyst"})
---------------------------------------------------------------------------
MolarBackendError Traceback (most recent call last)
<ipython-input-13-f522a4c5189d> in <module>
----> 1 user_client.create_entry(type="molecule_type", data={"name": "brain catalyst"})
~/AGG/code/goldmine/molar/client.py in create_entry(self, type, data)
240 json=datum,
241 headers=self.headers,
--> 242 return_pandas_dataframe=False,
243 )
244
~/AGG/code/goldmine/molar/client.py in request(self, url, method, params, json, data, headers, return_pandas_dataframe)
98 if response.status_code != 200:
99 raise MolarBackendError(
--> 100 status_code=response.status_code, message=out["detail"]
101 )
102
MolarBackendError: The client got an unexpected answer from the backend:
401: Unique constraint violation!
Now let’s have a look at the molecule table.
[14]:
db_info[db_info.table_name == "molecule"]
[14]:
| table_name | column_name | type | subtype | is_nullable | constraint_name | containt_type | references | |
|---|---|---|---|---|---|---|---|---|
| 18 | molecule | created_on | timestamp without time zone | timestamp | NO | None | None | None |
| 19 | molecule | metadata | jsonb | jsonb | YES | None | None | None |
| 20 | molecule | molecule_id | uuid | uuid | NO | molecule_pkey | None | molecule.molecule_id |
| 21 | molecule | molecule_type_id | uuid | uuid | YES | molecule_molecule_type_id_fkey | None | molecule_type.molecule_type_id |
| 22 | molecule | smiles | text | text | NO | molecule_smiles_key | None | molecule.smiles |
| 23 | molecule | updated_on | timestamp without time zone | timestamp | NO | None | None | None |
The molecule table can take optionally (see is_nullable) a molecule_type_id, which means it can refer to an entry in the molecule_type table.
[15]:
event = user_client.create_entry(
type="molecule",
data={
"smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
"molecule_type_id": event['uuid'],
"metadata": {"name": "cafefine"}
}
)
event
[15]:
{'type': 'molecule',
'id': 3,
'uuid': '0a13b53b-5a15-48aa-98d5-9b929c7c0407',
'event': 'create',
'data': {'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C',
'metadata': {'name': 'cafefine'},
'molecule_type_id': '32a775f3-6184-4b3b-b889-daafee3d13c4'},
'timestamp': '2021-06-16T08:27:40.201146',
'alembic_version': ['0bc99b5f8fcc'],
'user_id': 1}
Similarly, one can update a field using its uuid:
[16]:
event = user_client.update_entry(
uuid=event['uuid'],
type='molecule',
data={"metadata": {"name": "caffeine"}}
)
event
[16]:
{'type': 'molecule',
'id': 4,
'uuid': '0a13b53b-5a15-48aa-98d5-9b929c7c0407',
'event': 'update',
'data': {'metadata': {'name': 'caffeine'}},
'timestamp': '2021-06-16T08:27:40.620397',
'alembic_version': ['0bc99b5f8fcc'],
'user_id': 1}
[17]:
event = user_client.delete_entry(
type="molecule",
uuid=event["uuid"]
)
Event sourcing¶
Rather than writing directly to the tables, Molar saves all the events into another table, called eventstore. The data is then projected to their respective table. This method is called event sourcing (as opposed to active records). The eventstore can be access through the client:
[18]:
es = user_client.view_entries()
es
[18]:
| type | id | uuid | event | data | timestamp | alembic_version | user_id | |
|---|---|---|---|---|---|---|---|---|
| 0 | molecule_type | 1 | 32a775f3-6184-4b3b-b889-daafee3d13c4 | create | {'name': 'brain catalyst'} | 2021-06-16T08:27:30.098010 | [0bc99b5f8fcc] | 1 |
| 1 | molecule | 3 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | create | {'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C', 'me... | 2021-06-16T08:27:40.201146 | [0bc99b5f8fcc] | 1 |
| 2 | molecule | 4 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | update | {'metadata': {'name': 'caffeine'}} | 2021-06-16T08:27:40.620397 | [0bc99b5f8fcc] | 1 |
| 3 | molecule | 5 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | delete | {} | 2021-06-16T08:27:40.870340 | [0bc99b5f8fcc] | 1 |
Event sourcing becomes interesting when something goes wrong. Maybe the last entry, deleting the molecule with uuid 280ff363-78bf-4c58-b1b1-6ab152954ef6 was a mistake. In this case, we can undo the last events (or rather, empty the database and replay all the event until the last one):
[19]:
user_client.rollback(before=es.at[3, 'timestamp'])
[19]:
{'type': None,
'id': 6,
'uuid': None,
'event': 'rollback-begin',
'data': {'before': '2021-06-16 08:27:40.870340'},
'timestamp': '2021-06-16T08:27:42.297604',
'alembic_version': ['0bc99b5f8fcc'],
'user_id': 1}
Here is the eventstore after the rollback. We can see the list of event that have been replayed.
[20]:
user_client.view_entries().sort_values(by=['id'])
[20]:
| type | id | uuid | event | data | timestamp | alembic_version | user_id | |
|---|---|---|---|---|---|---|---|---|
| 0 | molecule_type | 1 | 32a775f3-6184-4b3b-b889-daafee3d13c4 | create | {'name': 'brain catalyst'} | 2021-06-16T08:27:30.098010 | [0bc99b5f8fcc] | 1 |
| 1 | molecule | 3 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | create | {'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C', 'me... | 2021-06-16T08:27:40.201146 | [0bc99b5f8fcc] | 1 |
| 2 | molecule | 4 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | update | {'metadata': {'name': 'caffeine'}} | 2021-06-16T08:27:40.620397 | [0bc99b5f8fcc] | 1 |
| 3 | molecule | 5 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | delete | {} | 2021-06-16T08:27:40.870340 | [0bc99b5f8fcc] | 1 |
| 9 | None | 6 | None | rollback-begin | {'before': '2021-06-16 08:27:40.870340'} | 2021-06-16T08:27:42.297604 | [0bc99b5f8fcc] | 1 |
| 4 | molecule_type | 7 | 32a775f3-6184-4b3b-b889-daafee3d13c4 | delete | {} | 2021-06-16T08:27:42.297604 | [0bc99b5f8fcc] | 1 |
| 5 | molecule_type | 8 | 32a775f3-6184-4b3b-b889-daafee3d13c4 | create | {'name': 'brain catalyst'} | 2021-06-16T08:27:42.297604 | [0bc99b5f8fcc] | 1 |
| 6 | molecule | 9 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | create | {'smiles': 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C', 'me... | 2021-06-16T08:27:42.297604 | [0bc99b5f8fcc] | 1 |
| 7 | molecule | 10 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | update | {'metadata': {'name': 'caffeine'}} | 2021-06-16T08:27:42.297604 | [0bc99b5f8fcc] | 1 |
| 8 | None | 11 | None | rollback-end | {'before': '2021-06-16 08:27:40.870340'} | 2021-06-16T08:27:42.297604 | [0bc99b5f8fcc] | 1 |
In theory, an event sourcing makes it also possible to merge two databases with the same structure together. This has not been implemented yet.
Querying¶
Querying the database is relatively straight forward if you are familar with SQL. The queries that can be formulated with Molar are only a subset of what it is possible to do with SQL, but this should be sufficient for most use case.
Client.query_database is the only method that you need to use to make queries. It is relatively straight forward to use if you want to fetch data from a single table:
[21]:
user_client.query_database(
types='molecule'
)
[21]:
| molecule_id | created_on | updated_on | metadata | smiles | molecule_type_id | |
|---|---|---|---|---|---|---|
| 0 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | 2021-06-16T08:27:42.297604 | 2021-06-16T08:27:42.297604 | {'name': 'caffeine'} | CN1C=NC2=C1C(=O)N(C(=O)N2C)C | 32a775f3-6184-4b3b-b889-daafee3d13c4 |
and also:
[22]:
user_client.query_database(types="molecule_type")
[22]:
| molecule_type_id | created_on | updated_on | name | |
|---|---|---|---|---|
| 0 | 32a775f3-6184-4b3b-b889-daafee3d13c4 | 2021-06-16T08:27:42.297604 | 2021-06-16T08:27:42.297604 | brain catalyst |
Joins¶
It is also possible to perform joins. For instance we want to fetch all data from molecule and the name of the molecule_type. To do so, we can use the joins argument:
[23]:
user_client.query_database(
types=['molecule', "molecule_type.name"],
joins={
'type': "molecule_type"
}
)
[23]:
| molecule.molecule_id | molecule.created_on | molecule.updated_on | molecule.metadata | molecule.smiles | molecule.molecule_type_id | molecule_type.name | |
|---|---|---|---|---|---|---|---|
| 0 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | 2021-06-16T08:27:42.297604 | 2021-06-16T08:27:42.297604 | {'name': 'caffeine'} | CN1C=NC2=C1C(=O)N(C(=O)N2C)C | 32a775f3-6184-4b3b-b889-daafee3d13c4 | brain catalyst |
Advanced joining¶
Of course, it is often more complicated than that. One may want to join twice the same table on different key. This would be the case with the table calculation that can have a conformer_id and an output_conformer_id.
Let’s first add some data and see how we can join these table together:
[24]:
conformer = user_client.create_entry(
type="conformer",
data={
"x": [0],
"y": [0],
"z": [0],
"atomic_numbers": [2],
"molecule_id": event["uuid"]
}
)
software = user_client.create_entry(
type="software",
data={
"name": "cp2k",
"version": "v1.0"
}
)
calculation = user_client.create_entry(
type="calculation",
data={
"software_id": software["uuid"],
"conformer_id": conformer["uuid"],
"output_conformer_id": conformer["uuid"]
}
)
If we want to query for both initial_conformer and the output_conformer, we need to use aliases:
[25]:
user_client.query_database(
types=["calculation", "initial_conformer", "output_conformer"],
joins=[
{
"type": 'initial_conformer',
"on": {
"column1": "initial_conformer.conformer_id",
"column2": "calculation.conformer_id"
},
},
{
"type": 'output_conformer',
"on": {
"column1": "output_conformer.conformer_id",
"column2": "calculation.output_conformer_id"
},
}
],
aliases=[
{"type": "conformer", "alias": "initial_conformer"},
{"type": "conformer", "alias": "output_conformer"}
]
)
[25]:
| calculation.calculation_id | calculation.created_on | calculation.updated_on | calculation.conformer_id | calculation.software_id | calculation.output_conformer_id | initial_conformer.conformer_id | initial_conformer.molecule_id | initial_conformer.created_on | initial_conformer.updated_on | ... | initial_conformer.z | initial_conformer.atomic_numbers | output_conformer.conformer_id | output_conformer.molecule_id | output_conformer.created_on | output_conformer.updated_on | output_conformer.x | output_conformer.y | output_conformer.z | output_conformer.atomic_numbers | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8ac0923d-51c3-4e33-b8bb-feb318282a15 | 2021-06-16T08:27:46.786091 | 2021-06-16T08:27:46.786091 | dba3908e-83c7-4627-a301-6dfb31ceec53 | 9c4b9dcc-00ad-4a61-8034-c0be85836815 | dba3908e-83c7-4627-a301-6dfb31ceec53 | dba3908e-83c7-4627-a301-6dfb31ceec53 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | 2021-06-16T08:27:46.671052 | 2021-06-16T08:27:46.671052 | ... | [0.0] | [2] | dba3908e-83c7-4627-a301-6dfb31ceec53 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | 2021-06-16T08:27:46.671052 | 2021-06-16T08:27:46.671052 | [0.0] | [0.0] | [0.0] | [2] |
1 rows × 22 columns
This way the backend knows precisely which columns we are referring to in our join.
Filtering¶
It’s also possible to filter the entries:
[26]:
user_client.query_database(
types='molecule',
filters={
'type': "molecule.metadata.name",
"op": "==",
"value": "caffeine"
}
)
[26]:
| molecule_id | created_on | updated_on | metadata | smiles | molecule_type_id | |
|---|---|---|---|---|---|---|
| 0 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | 2021-06-16T08:27:42.297604 | 2021-06-16T08:27:42.297604 | {'name': 'caffeine'} | CN1C=NC2=C1C(=O)N(C(=O)N2C)C | 32a775f3-6184-4b3b-b889-daafee3d13c4 |
The available operators are:
Operator |
String |
|---|---|
Equal |
“==” |
Not equal |
“!=” |
Greater than |
“>” |
Lower than |
“<” |
Greater or equal |
“>=” |
Lower or equal |
“<=” |
In |
“in” |
Not in |
“not_in” |
Pattern matching (case sensitive) |
“like” |
Pattern matching (cas insensitive) |
“ilike” |
“notlike” |
|
“notilike” |
For the pattern matching operators, please refer to the postgresql documentation
Ordering¶
The output of a query can be ordered according to any field.
[27]:
user_client.create_entry(
type='molecule',
data={
"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",
"metadata": {
"name": "aspirin"
}
}
)
[27]:
{'type': 'molecule',
'id': 15,
'uuid': '4668e413-689c-43a9-81c3-9bc77d89e628',
'event': 'create',
'data': {'smiles': 'CC(=O)OC1=CC=CC=C1C(=O)O',
'metadata': {'name': 'aspirin'}},
'timestamp': '2021-06-16T08:27:50.388519',
'alembic_version': ['0bc99b5f8fcc'],
'user_id': 1}
[28]:
user_client.query_database(
types='molecule',
order_by={'type': 'molecule.metadata.name', 'order': 'asc'}
)
[28]:
| molecule_id | created_on | updated_on | metadata | smiles | molecule_type_id | |
|---|---|---|---|---|---|---|
| 0 | 4668e413-689c-43a9-81c3-9bc77d89e628 | 2021-06-16T08:27:50.388519 | 2021-06-16T08:27:50.388519 | {'name': 'aspirin'} | CC(=O)OC1=CC=CC=C1C(=O)O | None |
| 1 | 0a13b53b-5a15-48aa-98d5-9b929c7c0407 | 2021-06-16T08:27:42.297604 | 2021-06-16T08:27:42.297604 | {'name': 'caffeine'} | CN1C=NC2=C1C(=O)N(C(=O)N2C)C | 32a775f3-6184-4b3b-b889-daafee3d13c4 |
Debugging queries¶
Sometimes, to create or debug a query, it is convenient to access its raw SQL. This can be done using Client.debug_query.
[33]:
out = user_client.debug_query(
types='molecule',
order_by={'type': 'molecule.metadata.name', 'order': 'asc'}
)
print(out)
SELECT public.molecule.molecule_id, public.molecule.created_on, public.molecule.updated_on, public.molecule.metadata, public.molecule.smiles, public.molecule.molecule_type_id
FROM public.molecule ORDER BY public.molecule.metadata ->> 'name' ASC
LIMIT 10 OFFSET 0
Molar provides also a way to analyze the query. Internally, this is done using the EXPLAIN ANALYZE statement. This can be useful to determine whether an index is required in the database to speed up a query. More information about this statement can be found in Postgres’ documentation.
[32]:
out = user_client.debug_query(
types='molecule',
order_by={'type': 'molecule.metadata.name', 'order': 'asc'},
explain_analyze=True
)
print(out)
Limit (cost=29.78..29.81 rows=10 width=144) (actual time=0.093..0.171 rows=2 loops=1)
-> Sort (cost=29.78..31.23 rows=580 width=144) (actual time=0.078..0.114 rows=2 loops=1)
Sort Key: ((metadata ->> 'name'::text))
Sort Method: quicksort Memory: 25kB
-> Seq Scan on molecule (cost=0.00..17.25 rows=580 width=144) (actual time=0.021..0.045 rows=2 loops=1)
Planning Time: 0.087 ms
Execution Time: 0.237 ms
Removing a database¶
Databases can be removed. The following example will remove completely the database compchem (users and data). Be careful when using this command in code.
[ ]:
admin_client.remove_database("compchem")