Since the Docker route is straightforward, this post focuses on installing Superset from scratch.
At the time of writing, the system Python is 3.8.5.
Let’s get pip:
sudo apt update
sudo apt install python3-pip
We also need these build dependencies:
sudo apt-get install build-essential libssl-dev libffi-dev python3-dev python3-pip libsasl2-dev libldap2-dev
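Let’s create a Python virtual environment with python3-venv. It is always recommended to deploy and run Python projects in a venv, to avoid conflicts with other packages and to contain failures. On Ubuntu, python3-venv is a system package, so install it with apt rather than pip, then create and activate the environment:
sudo apt install python3-venv
python3 -m venv venv
source venv/bin/activate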
Note: Always create a dedicated user for Superset rather than using root. This confines its processes to that user and helps a lot with security. I am not doing that here, to keep the walkthrough simple.
Install the released Superset package. We are not building it from source; refer to this guide for more details.
pip install apache-superset
Now get the database ready. Superset uses SQLite as its default datastore, so if that is enough for you, skip this step and continue. If you need MySQL or PostgreSQL, you have to update SQLALCHEMY_DATABASE_URI. One way is to edit the Superset config file at /root/venv/lib/python3.8/site-packages/superset/config.py, but changes to a core file can be lost or cause trouble on upgrade. The alternative is a separate superset_config.py: either put its directory on the PYTHONPATH environment variable, or set the SUPERSET_CONFIG_PATH environment variable to the full path of the file. You can start from a copy of config.py, but superset_config.py only needs the parameters you want to change, such as the database URI.
Make sure you install the matching Python database driver before you continue. I am using MySQL as the datastore for Superset, with this connection string:
mysql+pymysql://admin:XXXXXX@localhost:3306/superset
Refer to the SQLAlchemy documentation on connection strings.
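As an illustration, a superset_config.py that overrides only the database URI could look like the sketch below. The helper build_mysql_uri is a hypothetical name introduced here, not part of Superset, and the credentials are the same placeholders as in the connection string above. The one thing it adds is URL-escaping of the password, which matters if the password contains characters such as @ or /.

```python
# superset_config.py: a sketch of an override file. Only the settings
# defined here replace the defaults from superset/config.py.
from urllib.parse import quote_plus


def build_mysql_uri(user, password, host, port, database):
    """Hypothetical helper: assemble a mysql+pymysql URI, escaping the password."""
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


# Placeholder credentials; replace them with your own.
SQLALCHEMY_DATABASE_URI = build_mysql_uri(
    "admin", "XXXXXX", "localhost", 3306, "superset"
)
```

Keeping the override in its own file means the installed package stays untouched.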
I need pymysql to connect to the database:
pip install pymysql
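A quick way to confirm that the driver landed in the active environment is to ask Python whether the module can be found. This is only a sketch; driver_available is a name made up for this example. Run it with the venv's interpreter.

```python
import importlib.util


def driver_available(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Swap in psycopg2 here if you chose PostgreSQL instead of MySQL.
print("pymysql installed:", driver_available("pymysql"))
```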
Let Superset build our database:
superset db upgrade
If you have an existing database, or are coming from an older version of Superset, this step is crucial: it applies the migrations from older versions as well as generating the fresh schema. During migration you may need to check database operations such as adding unique keys to existing columns and creating new tables. I ran into two issues:
- A migration tried to make a column unique while its rows contained duplicate values. I had to update those rows manually, in a way that suited my dashboards.
- A migration tried to create a table with a foreign key to a column whose collation and charset did not match. I had to update them so that they matched.
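For the first issue, it helps to list the duplicated values before touching anything. The sketch below shows the usual GROUP BY / HAVING query. It runs against an in-memory SQLite database so that it is self-contained, whereas my real database was MySQL, and the table and column names (demo_dashboards, slug) are invented for the example rather than taken from the failing migration.

```python
import sqlite3


def find_duplicates(conn, table, column):
    """Return (value, count) pairs for values appearing more than once.

    The identifiers are interpolated into the SQL, so only pass names you trust.
    """
    query = (
        f'SELECT "{column}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1 ORDER BY "{column}"'
    )
    return conn.execute(query).fetchall()


# Hypothetical demo data standing in for the real metadata table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_dashboards (id INTEGER PRIMARY KEY, slug TEXT)")
conn.executemany(
    "INSERT INTO demo_dashboards (slug) VALUES (?)",
    [("sales",), ("ops",), ("sales",)],
)
duplicates = find_duplicates(conn, "demo_dashboards", "slug")
print(duplicates)
```

Once the duplicated values are known, you can decide row by row which ones to rename or remove, then rerun superset db upgrade.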
Now let’s create an admin user, load the examples, initialize Superset to get the roles, and start the dev server.
# Create an admin user (you will be prompted to set a username, first and last name before setting a password)
export FLASK_APP=superset
superset fab create-admin
# Load some data to play with
superset load_examples
# Create default roles and permissions
superset init
# To start a development web server on port 8088, use -p to bind to another port
superset run -p 8088 --with-threads --reload --debugger
To use it in production, we should run it with gunicorn:
/root/venv/bin/python /root/venv/bin/gunicorn -w 2 --timeout 60 -b 0.0.0.0:8088 --limit-request-line 0 --limit-request-field_size 0 "superset.app:create_app()"
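The same options can also be kept in a gunicorn configuration file, which is plain Python. The sketch below mirrors the flags of the command above. The file name gunicorn.conf.py and where it lives are my assumption, not something the setup above requires.

```python
# gunicorn.conf.py: a sketch mirroring the command-line flags above.
bind = "0.0.0.0:8088"          # -b 0.0.0.0:8088
workers = 2                    # -w 2
timeout = 60                   # --timeout 60 (seconds)
limit_request_line = 0         # --limit-request-line 0 (0 means unlimited)
limit_request_field_size = 0   # --limit-request-field_size 0 (0 means unlimited)
```

With such a file, the command shortens to /root/venv/bin/gunicorn -c gunicorn.conf.py "superset.app:create_app()".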
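Let’s create a systemd service to make it easy to start, stop, and run at startup. The unit below uses gunicorn's gevent worker class, which needs the gevent package in the venv (pip install gevent) if it is not already installed.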
[Unit]
Description = Apache Superset Webserver Daemon
After = network.target
[Service]
PIDFile = /root/superset-webserver.PIDFile
User = root
Group = root
Environment=SUPERSET_HOME=/root
Environment=PYTHONPATH=/root
WorkingDirectory = /root
ExecStart = /root/venv/bin/python /root/venv/bin/gunicorn --workers 8 --worker-class gevent --bind 0.0.0.0:8088 --pid /root/superset-webserver.PIDFile "superset.app:create_app()"
ExecStop = /bin/kill -s TERM $MAINPID
[Install]
WantedBy=multi-user.target
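Save this as a unit file under /etc/systemd/system (for example superset.service; the name is just my choice here), then tell systemd to reload its unit files, enable the service so that it runs at startup, and start it:
sudo systemctl daemon-reload
sudo systemctl enable superset
sudo systemctl start superset
That’s it, folks!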