Toucan Toco data connectors.
To work on this project you need Python 3.6
(consider running pip install -U pip setuptools if needed).
You can then install:
- the main dependencies by typing
pip install -e .
- the test requirements by typing
pip install -r requirements-testing.txt
You should then be able to run the basic tests with pytest tests/test_connector.py
To test the azure_mssql and mssql connectors, you must install freetds,
for instance by running: brew install freetds
To test the postgres connector, you must install postgresql,
for instance by running: brew install postgres
You can then install the library with:
env LDFLAGS='-L/usr/local/lib -L/usr/local/opt/openssl/lib -L/usr/local/opt/readline/lib' pip install psycopg2
If you want to run the tests for another connector, you can install its extra dependencies
(e.g. to test MySQL, just type pip install -e ".[mysql]").
pytest tests/mysql should then run all the MySQL tests properly.
If you want to run the tests for all the connectors, you can install all the dependencies by typing
pip install -e ".[all]"
and then run make test.
To generate the connector and test modules from boilerplate, run:
$ make new_connector type=mytype
mytype should be the name of a system we would like to build a connector for,
such as MySQL, Hive or Magento.
Open the folder in tests for the new connector. You can start writing your tests
before implementing it. Please do not hesitate to add a docker image in
the docker-compose.yml. You can then use the fixture service_container to automatically
start the docker container and shut it down for you!
If you do not have the docker images yet, use the --pull option to retrieve them.
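As an illustration of writing a test before the implementation, here is a minimal sketch. The classes StandInDataSource and StandInConnector are hypothetical stand-ins, not part of toucan_connectors: they use SQLite from the standard library and return a list of row dicts, whereas a real connector is expected to return a pandas DataFrame from _retrieve_data.

```python
import sqlite3
from contextlib import closing
from dataclasses import dataclass


@dataclass
class StandInDataSource:
    """Hypothetical stand-in for a data source model: it only holds a query."""
    query: str


@dataclass
class StandInConnector:
    """Hypothetical stand-in for a connector, backed by an SQLite database."""
    database: str

    def _retrieve_data(self, data_source):
        # A real connector would return a pandas DataFrame; a list of row
        # dicts keeps this sketch free of third-party dependencies.
        with closing(sqlite3.connect(self.database)) as connection:
            cursor = connection.execute(data_source.query)
            columns = [column[0] for column in cursor.description]
            return [dict(zip(columns, row)) for row in cursor.fetchall()]


def test_retrieve_data():
    # Build the connector and the data source, then check what comes back.
    connector = StandInConnector(database=':memory:')
    data_source = StandInDataSource(query='SELECT 1 AS answer, 2 AS other')
    rows = connector._retrieve_data(data_source)
    assert rows == [{'answer': 1, 'other': 2}]
```

The shape is what matters here: build a connector, build a data source, call the retrieval method, and assert on the result. A test for a real connector would typically compare against an expected DataFrame instead.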
Open the folder mytype in toucan_connectors for your new connector and
create your classes:
import pandas as pd

# Careful: you need to import ToucanConnector from the deep path, not from the __init__ path.
from toucan_connectors.toucan_connector import ToucanDataSource, ToucanConnector


class MyTypeDataSource(ToucanDataSource):
    """Model of my datasource"""
    query: str


class MyTypeConnector(ToucanConnector):
    """Model of my connector"""
    data_source_model: MyTypeDataSource

    host: str
    port: int
    database: str

    def _retrieve_data(self, data_source: MyTypeDataSource) -> pd.DataFrame:
        """how to retrieve a dataframe"""
Please add your connector to toucan_connectors/__init__.py.
The key is what we call the type of the connector:
it is essentially an id used to retrieve the connector.
CONNECTORS_CATALOGUE = {
    ...,
    'MyType': 'mytype.mytype_connector.MyTypeConnector',
    ...
}
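To make the idea of a type key concrete, the sketch below shows one way a catalogue of dotted paths could be resolved into classes with importlib. It is an assumption about the general pattern, not the actual lookup code of toucan_connectors: SKETCH_CATALOGUE, resolve and package_prefix are hypothetical names, and standard-library classes stand in for connectors so that the example runs anywhere.

```python
from importlib import import_module

# Hypothetical stand-in catalogue: keys play the role of connector types,
# values are dotted paths to classes (standard-library ones here).
SKETCH_CATALOGUE = {
    'Ordered': 'collections.OrderedDict',
    'Fraction': 'fractions.Fraction',
}


def resolve(connector_type, catalogue=SKETCH_CATALOGUE, package_prefix=None):
    """Return the class registered under connector_type."""
    dotted_path = catalogue[connector_type]
    # Split 'package.module.ClassName' into the module path and the class name.
    module_path, class_name = dotted_path.rsplit('.', 1)
    if package_prefix:
        # Catalogue paths may be relative to a package (assumption).
        module_path = f'{package_prefix}.{module_path}'
    module = import_module(module_path)
    return getattr(module, class_name)


fraction_class = resolve('Fraction')
half = fraction_class(1, 2)
```

Storing dotted paths rather than classes means a connector module, and its dependencies, only needs to be importable once that connector is actually requested.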
You can now generate and edit the documentation page for your connector:
PYTHONPATH=. python doc/generate.py MyTypeConnector > doc/mytypeconnector.md
Add the main requirements to the extras_require dictionary in setup.py:
extras_require = {
    ...
    'mytype': ['my_dependency_pkg1==x.x.x', 'my_dependency_pkg2>=x.x.x']
}
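Since an "all" extra is also available, here is a sketch of a common way to derive such an extra from the per-connector entries. This is an assumption about one possible approach, not a description of this project's setup.py; the package names and version numbers are placeholders.

```python
# Placeholder extras: the package names and versions are illustrative only.
extras_require = {
    'mysql': ['example_mysql_driver>=1.0'],
    'mytype': ['my_dependency_pkg1==1.0.0', 'my_dependency_pkg2>=2.0.0'],
}

# Build 'all' as the sorted, de-duplicated union of every per-connector list.
extras_require['all'] = sorted(
    {dependency for deps in extras_require.values() for dependency in deps}
)
```

With this pattern, adding a new connector entry automatically extends what pip install -e ".[all]" installs.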
If you need to add testing dependencies, add them to the requirements-testing.txt file.
Make sure your new code is properly formatted by typing make lint.
If it's not, please use make format!