How to Install Superset on Ubuntu 20?

    Since the Docker route is straightforward, I will focus here on installing from scratch.

    At the time of writing, the system Python is 3.8.5.

    Let’s get pip:

    sudo apt update
    sudo apt install python3-pip

    We also need these build dependencies:

    sudo apt-get install build-essential libssl-dev libffi-dev python3-dev python3-pip libsasl2-dev libldap2-dev
    

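    Let’s create a Python virtual environment with python3-venv. It is always recommended to deploy and run Python projects in a venv, to avoid conflicts with other packages and to contain failures. Note that python3-venv is an apt package, not a pip package, and that the venv has to be activated so that the following pip commands install into it.

    sudo apt install python3-venv
    python3 -m venv venv
    . venv/bin/activate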

    Note: Always create a dedicated user for Superset rather than running it as root. This confines the processes to that user and helps a lot with security. I am skipping that here to keep the walkthrough simple, which is why the paths below are under /root.

    Install the released Superset package. We are not building it from source; refer to this guide for more details.

    pip install apache-superset

    Now get the database ready. Superset uses SQLite as its default metadata store, so if that is enough for you, skip this step and continue. If you need MySQL or PostgreSQL, you have to set SQLALCHEMY_DATABASE_URI. One option is to edit the Superset config file at /root/venv/lib/python3.8/site-packages/superset/config.py, but an upgrade can overwrite or conflict with changes made to a core file.

    The alternative is to leave the core file untouched and put your overrides in a superset_config.py file. Either place that file in a directory listed in the PYTHONPATH environment variable, or set the SUPERSET_CONFIG_PATH environment variable to the full path of the file. You can start from a copy of config.py and modify parameters such as the database URI, although the file only needs the settings you want to change.
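    As a rough illustration of the override approach, a superset_config.py can be as small as the sketch below. It assumes a MySQL metadata database reached through the pymysql driver, with the same placeholder credentials used in this post; adjust the URI to your own setup.

```python
# superset_config.py: minimal override file (a sketch, not a complete config).
# Superset loads this module on top of its defaults, so only the values
# you want to change need to appear here.

# Assumption: MySQL metadata database via the pymysql driver.
# XXXXXX is a placeholder for the real password.
SQLALCHEMY_DATABASE_URI = "mysql+pymysql://admin:XXXXXX@localhost:3306/superset"
```

    Keeping the override in its own file means the packaged config.py stays untouched across upgrades.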

    Make sure you install the matching Python database driver before you continue. I am using MySQL as the metadata database for Superset, with this URI: mysql+pymysql://admin:XXXXXX@localhost:3306/superset

    Refer to the SQLAlchemy documentation on connection strings for other databases.
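    One detail that is easy to miss with connection strings: a password containing characters such as @ or / has to be percent-encoded, or the URI is parsed incorrectly. The sketch below builds the URI with the standard library; build_mysql_uri is a hypothetical helper name and the sample password is made up.

```python
from urllib.parse import quote_plus


def build_mysql_uri(user, password, host, port, database):
    """Return a mysql+pymysql SQLAlchemy URI with user and password percent-encoded."""
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Made-up password containing characters that must be escaped.
uri = build_mysql_uri("admin", "p@ss/word", "localhost", 3306, "superset")
print(uri)  # mysql+pymysql://admin:p%40ss%2Fword@localhost:3306/superset
```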

    I need pymysql to connect to the database.

    pip install pymysql

    Let Superset build the database schema:

    superset db upgrade

    If you have an existing database, or are coming from an older version of Superset, this step is crucial: besides generating a fresh schema, it applies the migrations from older versions. During migration you may need to check database operations such as altering columns to add unique keys and creating new tables. I ran into two issues:

    1. The migration tried to make a column unique while its rows held duplicate values. I had to update those rows manually, as required for my dashboards.
    2. The migration tried to create a table with a foreign key to a column whose collation and charset did not match. I had to change them so that they matched.
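    For the first issue, it helps to list the offending values before rerunning the migration. The sketch below shows the usual GROUP BY ... HAVING query against an in-memory SQLite database so that it runs anywhere; the table demo_items and column slug are hypothetical stand-ins, and against the real metadata database you would run the same query on whichever column the migration complains about.

```python
import sqlite3


def find_duplicates(conn, table, column):
    """Return (value, count) pairs for values that occur more than once in table.column."""
    # Identifiers cannot be bound as query parameters, so pass trusted names only.
    # NULLs are skipped here; MySQL, for example, allows repeated NULLs in a unique index.
    query = (
        f"SELECT {column}, COUNT(*) AS n FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


# Hypothetical demo data: 'sales' appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_items (id INTEGER PRIMARY KEY, slug TEXT)")
conn.executemany(
    "INSERT INTO demo_items (slug) VALUES (?)",
    [("sales",), ("ops",), ("sales",), (None,)],
)
duplicates = find_duplicates(conn, "demo_items", "slug")
print(duplicates)  # [('sales', 2)]
```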

    Now let’s create an admin user, initialize Superset to set up roles, load the examples, and start the dev server.

    # Create an admin user (you will be prompted to set a username, first and last name before setting a password)
    export FLASK_APP=superset
    superset fab create-admin
    
    # Load some data to play with
    superset load_examples
    
    # Create default roles and permissions
    superset init
    
    # To start a development web server on port 8088, use -p to bind to another port
    superset run -p 8088 --with-threads --reload --debugger

    To use it in production, we should run it with Gunicorn:

    
    /root/venv/bin/python /root/venv/bin/gunicorn -w 2 --timeout 60 -b 0.0.0.0:8088 --limit-request-line 0 --limit-request-field_size 0 "superset.app:create_app()"
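    The same options can also live in a Gunicorn config file, which is plain Python, instead of a long command line. This is only a sketch mirroring the flags above; the file name gunicorn.conf.py and where you keep it are assumptions.

```python
# gunicorn.conf.py: mirrors the command-line flags used above.
bind = "0.0.0.0:8088"          # -b 0.0.0.0:8088
workers = 2                    # -w 2
timeout = 60                   # --timeout 60
limit_request_line = 0         # --limit-request-line 0 (0 means unlimited)
limit_request_field_size = 0   # --limit-request-field_size 0 (0 means unlimited)
```

    Gunicorn would then be started with -c gunicorn.conf.py followed by the same "superset.app:create_app()" application string.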

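    Let’s create a systemd service to make it easy to start, stop, and run at boot. The unit below uses the gevent worker class, so gevent needs to be installed in the venv (pip install gevent) if it is not there already.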

    [Unit]
    Description = Apache Superset Webserver Daemon
    After = network.target
    
    [Service]
    PIDFile = /root/superset-webserver.PIDFile
    User = root
    Group = root
    Environment=SUPERSET_HOME=/root
    Environment=PYTHONPATH=/root
    WorkingDirectory = /root
    ExecStart =/root/venv/bin/python /root/venv/bin/gunicorn --workers 8 --worker-class gevent  --bind 0.0.0.0:8088 --pid /root/superset-webserver.PIDFile "superset.app:create_app()"
    ExecStop = /bin/kill -s TERM $MAINPID
    
    
    [Install]
    WantedBy=multi-user.target
    

    Then run systemctl daemon-reload so that systemd picks up the new unit file, and systemctl enable on the service so that it starts at boot.
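    Once the service is running, a quick way to confirm that it answers is to request Superset's /health endpoint. The sketch below uses only the standard library; is_up is a hypothetical helper name, and the URL assumes the port from the unit above and that your Superset version exposes /health.

```python
import http.client
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200, False on any connection or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses, as are timeouts
        # and refused connections.
        return False


if __name__ == "__main__":
    print(is_up("http://localhost:8088/health"))
```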

    That’s it, folks!