- Are you missing a special transformation in bamboolib?
- Do you want to provide custom transformations for your team?
- Do you want to load data from a custom source?
- Do you want to add custom visualizations or data explorations?
bamboolib enables you to quickly add plugins or even write your own plugins based on your specific needs.
- Make sure that you are running bamboolib 1.18.0 or higher. You can check this by running bam.__version__ (with bamboolib imported as bam). If you need to upgrade, please follow the upgrade guide.
- Write your own plugin or copy an example.
- Execute the plugin code.
- Use the plugin from within the bamboolib user interface.
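The version check from the first step can also be done programmatically. The following is a minimal sketch, not part of bamboolib: the helper is_supported is a hypothetical name, and it compares version numbers as integers because a plain string comparison would rank "1.9.3" above "1.18.0".

```python
import re


def is_supported(version, minimum=(1, 18, 0)):
    """Return True if a dotted version string is at least `minimum`."""
    parts = []
    for piece in version.split(".")[:3]:
        # Keep only the leading digits so that suffixes like "0rc1" parse as 0.
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "2.0" to three components.
    parts += [0] * (3 - len(parts))
    return tuple(parts) >= minimum


try:
    import bamboolib as bam
except ImportError:
    print("bamboolib is not installed in this environment")
else:
    print(bam.__version__, "supports plugins:", is_supported(bam.__version__))
```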
If you have questions, please reach out via bamboolib-feedback@databricks.com
3 things you should know about how plugins work:
- Plugins are added to bamboolib after you execute the plugin code.
- If you add a plugin, it will be available as long as the Python kernel is running.
- If you restart your Python kernel, the plugin will no longer be available.
Given those constraints there are multiple alternatives. Our preferred option is number 2:
1. Put the plugin code into an internal Python package and import the package at the top of your Jupyter Notebook. For example, you can quickly create a new Python package with pyscaffold. You might also want to upload your plugin package to a private or public GitHub repository and collaborate with others, so that the best plugins for your use case are always available.
2. [Preferred] Create an internal library as described in option 1. Then, use Jupyter templates to automatically import bamboolib and your internal library at the top of every new notebook file.
3. [Discouraged] Put the raw plugin code or the import of your internal library into a Python file in the IPython auto startup folder, which is located in your home directory at ~/.ipython/profile_default/startup. IPython runs this code every time you start a new Jupyter Python kernel. We list this approach only to explain why we kindly discourage it: it hides the state and dependencies of your notebook. Thus, your notebook might not work out of the box on the computer of a colleague who does not have the same startup script as you do.
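The first two options both rely on an explicit import at the top of the notebook, which keeps the dependencies visible. As a rough sketch of such a first cell: my_company_plugins is a hypothetical name for your internal plugin package, and missing_packages is an illustrative helper, not a bamboolib function.

```python
import importlib.util

# "my_company_plugins" is a hypothetical name for your internal plugin package.
REQUIRED_PACKAGES = ["bamboolib", "my_company_plugins"]


def missing_packages(names):
    """Return the top-level packages from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Install these packages before running the notebook:", ", ".join(missing))
else:
    import bamboolib as bam
    import my_company_plugins  # importing executes the plugin code
```

A colleague who opens the notebook without the internal package then sees which dependency is missing instead of silently getting a bamboolib without your plugins.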
Do you prefer another way? If so, please let us know via the issues. Your approach might be helpful to others as well :)
If you want to build the plugins of your dreams, you basically need 2 ingredients:
- the bamboolib internals that bamboolib.plugins provides to you
- any of the user interface elements of ipywidgets
Below, you will find descriptions of some of the core plugin components like LoaderPlugin, TransformationPlugin, DF_OLD, and DF_NEW.
In addition, the following components can be imported from bamboolib.plugins:
- BamboolibError: helpful for raising beautiful errors
- Singleselect
- Multiselect
- Button
- Text
If you want more information about their usage, please check the docstring, e.g. by running Text? (in IPython or Jupyter) or help(Text).
For more information about their usage in real life, please check the examples.
LoaderPlugin:
Methods that you can OVERRIDE:
- get_code(): this is the bare minimum that is required. You need to return a string that contains Python code.
- render(): for adding custom user interface elements
Helpers that you might want to USE:
Methods:
- set_title()
- set_content()
- execute(): starts the code execution
Attributes:
- new_df_name_group: input for giving the new dataframe a name that is referenced as DF_NEW in the code
- execute_button: button that calls execute() when clicked
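As an illustration of how these pieces fit together for a LoaderPlugin, here is a minimal sketch rather than an official example. The name class attribute and the dummy data are assumptions, and the try/except fallback only exists so that the sketch can also be executed where bamboolib is not installed.

```python
try:
    from bamboolib.plugins import LoaderPlugin, DF_NEW
except ImportError:
    # Stand-ins so that the sketch also runs where bamboolib is not installed.
    LoaderPlugin = object
    DF_NEW = "df"


class LoadDummyData(LoaderPlugin):
    # Assumption: the label shown in the bamboolib UI is set via `name`.
    name = "Load dummy data"

    def render(self):
        self.set_title("Load dummy data")
        # Without new_df_name_group in the content, the user cannot name
        # the new dataframe.
        self.set_content(self.new_df_name_group, self.execute_button)

    def get_code(self):
        # DF_NEW stands for the name entered in new_df_name_group.
        return (
            "import pandas as pd\n"
            f"{DF_NEW} = pd.DataFrame({{'a': [1, 2, 3], 'b': [4, 5, 6]}})"
        )
```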
TransformationPlugin:
Methods that you can OVERRIDE:
- get_code(): this is the bare minimum that is required. You need to return a string that contains Python code.
- render(): for adding custom user interface elements
- is_valid_transformation(): return True or False or even raise exceptions
- get_description(): return a description of the transformation that is shown in the history
Helpers that you might want to USE:
Methods:
- set_title()
- set_content()
- get_df()
- get_name_of_df()
- ADVANCED:
  - update_code_preview()
  - get_final_code()
  - execute()
Attributes:
- rename_df_group
- code_preview_group
- spacer
- new_df_name_input
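A TransformationPlugin that overrides all four methods might look roughly like the sketch below. It is not an official example: the name class attribute is an assumption, as is the idea that get_df() returns the current pandas DataFrame, and the try/except fallback only exists so that the sketch can also be executed where bamboolib is not installed.

```python
try:
    from bamboolib.plugins import TransformationPlugin, DF_OLD, DF_NEW
except ImportError:
    # Stand-ins so that the sketch also runs where bamboolib is not installed.
    TransformationPlugin = object
    DF_OLD = "df"
    DF_NEW = "df_new"


class DropRowsWithMissingValues(TransformationPlugin):
    # Assumption: the label shown in the bamboolib UI is set via `name`.
    name = "Drop rows with missing values"

    def render(self):
        self.set_title("Drop rows with missing values")
        # rename_df_group lets the user choose the name behind DF_NEW.
        self.set_content(self.rename_df_group)

    def is_valid_transformation(self):
        # Assumption: get_df() returns the current pandas DataFrame.
        return not self.get_df().empty

    def get_description(self):
        return "Drop all rows that contain missing values"

    def get_code(self):
        return f"{DF_NEW} = {DF_OLD}.dropna()"
```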
DF_OLD:
- Placeholder that you NEED to use within get_code().
- At runtime, bamboolib will replace the placeholder with the name of the current dataframe.
DF_NEW:
- Placeholder that you CAN use within get_code().
- At runtime, bamboolib will replace the placeholder with the new name of the current dataframe. The new name can be specified by the user inside the rename_df_group input element.
Attention: for TransformationPlugin, the renaming will only work if you add self.rename_df_group to self.set_content().
Attention: for LoaderPlugin, the renaming will only work if you add self.new_df_name_group to self.set_content().
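To make the placeholder idea concrete, the following sketch imitates the substitution with plain string replacement. It is a conceptual illustration only, not bamboolib's implementation: the token values and the helper fill_placeholders are hypothetical, and the real DF_OLD and DF_NEW should always be imported from bamboolib.plugins.

```python
# Hypothetical placeholder tokens; the real values of DF_OLD and DF_NEW are
# defined by bamboolib and may look different.
DF_OLD = "__DF_OLD__"
DF_NEW = "__DF_NEW__"


def fill_placeholders(code, current_name, new_name):
    """Swap both placeholders for concrete dataframe names."""
    return code.replace(DF_OLD, current_name).replace(DF_NEW, new_name)


# What a get_code() implementation could return:
template = f"{DF_NEW} = {DF_OLD}.dropna()"

# The user keeps the default name, so the dataframe is overwritten.
print(fill_placeholders(template, "df", "df"))        # prints: df = df.dropna()
# The user enters "df_clean" as the new name.
print(fill_placeholders(template, "df", "df_clean"))  # prints: df_clean = df.dropna()
```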