How to set up Virtuoso: the semantic web database management system
Virtuoso is an open-source, multi-model database management system. It is scalable and versatile, supporting both SQL and SPARQL, which makes it a great choice for projects that mix semantic web data with tabular data.
In this blog post you will learn how to set up a Virtuoso instance on a server and how to open up its CORS settings so that other services can reach it. You will learn how to prepare the environment, how to compile Virtuoso from source, and how to configure it to meet the most common needs.
Setup
This post assumes you have a Linux machine running an up-to-date version of Ubuntu (at the time of writing, that meant Ubuntu 24.04). You can check your OS version with the lsb_release -a command. You can also try to follow this tutorial with newer versions of the software mentioned here, though that’s not recommended.
You will need a handful of packages to get through this tutorial. Start by updating your OS’s package index and installing the ones required to build Virtuoso:
sudo apt update
sudo apt install build-essential git autoconf automake libtool bison flex gperf gawk m4 libssl-dev vim
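If you want to double-check that the build tools actually landed on your PATH before moving on, here is a minimal sketch. The helper name missing_tools is my own invention, and the command names are the ones I assume these packages provide (on Ubuntu the libtool package ships libtoolize, and libssl-dev only ships headers, so it is not checked here).

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names assumed to come from the packages installed above
# (gcc and make are pulled in by build-essential).
missing_tools gcc make git autoconf automake libtoolize bison flex gperf gawk m4
```

No output means every tool was found. Any name that is printed points at a package worth reinstalling.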
Installation
Now you have two options: you can either compile Virtuoso from source or download a pre-built version.
You can find the pre-built versions here. If they work for you, great! Feel free to skip the rest of this section. If they don’t, stick around.
The build-from-source route
I will walk you through building it from source (worry not, it’s not complicated if everything goes well).
Clone the Virtuoso repository:
git clone https://github.com/openlink/virtuoso-opensource.git
Move into the virtuoso-opensource directory created by the previous step (cd virtuoso-opensource) and list all tags:
git tag
Make sure to check out version 7.2.13. This is important for this tutorial to keep working in the future. Try to stick to the versions I’ve made explicit here.
git checkout v7.2.13
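To confirm the checkout landed on the tag you expect, something like the following could help. It is only a sketch: on_tag is a hypothetical helper of mine, and it has to be run from inside the cloned repository.

```shell
# Succeeds only when HEAD of the repository in the current directory
# sits exactly on the given tag.
on_tag() {
  [ "$(git describe --tags --exact-match 2>/dev/null)" = "$1" ]
}

if on_tag v7.2.13; then
  echo "checked out v7.2.13"
else
  echo "not on v7.2.13, run: git checkout v7.2.13"
fi
```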
Prepare the build environment:
./autogen.sh
And let Virtuoso try to guess the best configuration for your system:
./configure
If you run into an issue here, I recommend reading the README.md file of 7.2.13. It lives in the root directory of the Virtuoso repository and might contain useful tips to troubleshoot your problem.
Hopefully, your Ubuntu installation is properly set up for C compilation and all you have to do is run:
make
Followed by:
sudo make install
After a couple of minutes the project will be built and installed on your server. By default the server binary is stored at /usr/local/virtuoso-opensource/bin/virtuoso-t. In my installation no alias was created automatically, so if you would rather type virtuoso than this long path, add the following line to your .bashrc file:
alias virtuoso='/usr/local/virtuoso-opensource/bin/virtuoso-t'
Don’t forget to reload your shell configuration with:
source ~/.bashrc
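As an alternative to the alias, you could put the whole bin directory on your PATH, which makes virtuoso-t and isql available under their real names. This is a sketch of that idea, not something the installer does for you, and path_prepend is a helper name I made up:

```shell
# Prepend a directory to PATH unless it is already listed there.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already present, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

# Default install location mentioned above; adjust if you used another prefix.
path_prepend /usr/local/virtuoso-opensource/bin
export PATH
```

Prepending rather than appending matters if unixODBC is installed, since it also ships a command named isql.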
Troubleshooting
In case you get errors due to incorrect linking, there are some good pieces of advice in this blog post. If you have compiler problems, you can try to install the missing dependencies with apt, or you can download mamba and use the compilers package maintained by the incredible conda-forge community:
mamba install -c conda-forge compilers
Make sure to do all of this inside a previously created mamba environment.
If you’re encountering other weird problems, make sure to check the “Recent systems” section of the README.md. Lastly, you can try the documentation page for installing Virtuoso.
Configuring Virtuoso
If everything went well, your server is ready to be used. Now we will make sure your server’s configuration works for the most common Virtuoso use cases.
For that, choose your favorite terminal-based text editor (I like vim) and open the virtuoso.ini file in the Virtuoso installation directory. That’s the configuration file for your server.
vim /usr/local/var/lib/virtuoso/db/virtuoso.ini
There are a few important sections in your configuration file that you need to make sure look like the following:
[Parameters]
ServerPort = 1111
DirsAllowed = ., ../vad/, /usr/share/proj, path/to/data
[SPARQL]
DefaultGraph = http://localhost:8890/data
[HTTPServer]
ServerPort = 8890
HTTPProxyEnabled = 1
HttpProxyBackend = *
HTTPCORSOrigins = *
HTTPCORSHeaders = *
DefaultPage = index.html, index.htm, index.php
In the Parameters section, make sure ServerPort is set to 1111. This gives you access to the isql server, an extremely useful piece of software that allows you to run queries on your data and to install new packages. We will use this tool for both purposes later on. The DirsAllowed parameter should contain the first three paths by default; add the fourth one yourself. It should be the complete path to where your data lives on your server.
In the SPARQL section, the important parameter is your data’s IRI. It should match your database file name, for clarity’s sake.
Lastly, note the CORS lines in the HTTPServer section, which grant universal access to the server. They must be added to your file. The default server comes with several CORS restrictions enabled. If you want your server to act as an endpoint for various services, you might need to broaden these permissions. Feel free to restrict access to the origins of only the services you want to allow to draw from your Virtuoso, instead of allowing any domain as in the example file. Also, make sure ServerPort is set to 8890 and that the DefaultPage parameter lists these files. They’re important for setting up the Conductor interface, which we will get into later on.
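Since ServerPort appears in two different sections, it is easy to edit the wrong one. Here is a small sketch for reading a key back from a given section after you save the file. ini_get is a hypothetical helper of mine built on plain awk; it is not a Virtuoso tool, and it assumes simple "Key = Value" lines like the ones shown above.

```shell
# ini_get FILE SECTION KEY: print the value of KEY inside [SECTION],
# or nothing if the key is absent from that section.
ini_get() {
  awk -v section="$2" -v key="$3" '
    /^[ \t]*\[/ {                               # a new [Section] header
      current = $0
      gsub(/^[ \t]*\[|\][ \t]*$/, "", current)  # strip the brackets
      next
    }
    current == section {
      pos = index($0, "=")
      if (pos == 0) next                        # not a Key = Value line
      name = substr($0, 1, pos - 1)
      value = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)         # trim both sides
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (name == key) { print value; exit }
    }
  ' "$1"
}

# Example, using the file path from this post:
# ini_get /usr/local/var/lib/virtuoso/db/virtuoso.ini HTTPServer ServerPort
```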
Start Virtuoso
To start your Virtuoso server, use your binary together with the virtuoso.ini file we just edited. Depending on how you’ve configured your alias this might look different, but here’s a safe bet:
/usr/local/virtuoso-opensource/bin/virtuoso-t +configfile /usr/local/var/lib/virtuoso/db/virtuoso.ini
If you feel this line is cumbersome, you can set it up as another alias in your .bashrc file, as shown in the Installation section.
Congratulations! If everything ran smoothly, you should now be able to access, for example, the SPARQL interface of your Virtuoso server.
To do that, navigate to http://your-server-address:8890/sparql, which should open the SPARQL query editor in your browser.
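If your server has no browser handy, the same endpoint can be poked from the command line. The following is a sketch that assumes curl is installed. The function names sparql_endpoint and sparql_query are mine, and the example query simply counts whatever triples the endpoint exposes.

```shell
# Build the endpoint URL for a host; the port defaults to the
# HTTPServer ServerPort used in this post.
sparql_endpoint() {
  printf 'http://%s:%s/sparql\n' "$1" "${2:-8890}"
}

# Send a SPARQL query to an endpoint and ask for JSON results.
sparql_query() {
  curl --silent --show-error --get \
    --header 'Accept: application/sparql-results+json' \
    --data-urlencode "query=$2" \
    "$1"
}

# Example (needs a running server):
# sparql_query "$(sparql_endpoint your-server-address)" \
#   'SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }'
```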
Good-to-know commands
To check whether your server is running you can use the following command:
ps aux | grep virtuoso-t
And to kill your server this is probably the easiest thing to do:
killall virtuoso-t
Be aware, though, that this will end all of your Virtuoso instances. If you want to end them individually, use the kill command with the proper process ID.
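To find the right process ID when several instances are running, a sketch along these lines might be handy. It assumes pgrep is available (it usually is on Ubuntu), and virtuoso_pids is just a name I picked. Printing the full command line lets you tell instances apart by their +configfile argument.

```shell
# Print the PIDs of processes whose name is exactly the given one
# (default: virtuoso-t).
virtuoso_pids() {
  pgrep -x "${1:-virtuoso-t}"
}

# Show each instance with its full command line, so a single one
# can then be stopped with: kill <PID>
for pid in $(virtuoso_pids); do
  ps -o pid=,args= -p "$pid"
done
```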
Troubleshooting
You might run into problems when trying to query your database. That might be because the name of your database was set incorrectly or the data wasn’t added correctly. We can fix that with the extremely handy tool I mentioned earlier, isql.
You will need to supply your username and password for this. I’m going to fill in the gaps with the default values, so if you have no idea what I’m talking about (don’t worry, more on this later) just go along with my code. If you set up your own, make sure to change them to your values:
sudo /usr/local/virtuoso-opensource/bin/isql 1111 dba dba exec="DB.DBA.TTLP_MT (file_to_string_output('/path/to/dataset'), '', 'http://localhost:8890/dataset-name', 0);"
After running this command you might have to restart your server. After that, your data should be served correctly.
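The quoting in that one-liner is easy to get wrong, so here is a sketch that builds the statement from a file path and a graph IRI, plus a follow-up query to count the triples that ended up in the graph. Both helper names are my own, and the path and IRI are the same placeholders used above.

```shell
# Build the load statement for a Turtle file and a target graph IRI.
ttlp_statement() {
  printf "DB.DBA.TTLP_MT (file_to_string_output('%s'), '', '%s', 0);\n" "$1" "$2"
}

# Build a SPARQL statement that counts the triples in a graph.
count_statement() {
  printf 'SPARQL SELECT COUNT(*) FROM <%s> WHERE { ?s ?p ?o };\n' "$1"
}

stmt="$(ttlp_statement /path/to/dataset http://localhost:8890/dataset-name)"
echo "$stmt"

# To run either statement against the server (default credentials from this post):
# sudo /usr/local/virtuoso-opensource/bin/isql 1111 dba dba exec="$stmt"
# sudo /usr/local/virtuoso-opensource/bin/isql 1111 dba dba \
#   exec="$(count_statement http://localhost:8890/dataset-name)"
```

A count of zero after loading would suggest the file path or the graph IRI is not what the server expects.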
Installing Conductor
Conductor is the graphical interface for configuring Virtuoso, and it’s a valuable ally when your server grows in complexity. It takes a few extra steps to set up, but it might offer you valuable insights into your project, so I recommend going through this section as well!
Conductor might be available to you out-of-the-box, though for me it wasn’t. If it is, you should immediately be able to access it through the following URL: http://your-server-address:8890/conductor.
If that link opens to a 404, there are a few extra steps. The easiest thing you can try is installing it with the apt tool:
sudo apt-get install virtuoso-vad-{isparql,ods,cartridges,tutorial}
But that also didn’t work for me. Fortunately, there’s another solution. The isql service should be up and running. If you remember from the beginning of this blog post, I mentioned it was a versatile tool that lets users both query the server and install new packages into it.
You will need a username and a password to use isql. If you have not set any, your default username and password are the ones used below. If you have set new ones, make sure you change the following command:
sudo /usr/local/virtuoso-opensource/bin/isql 1111 dba dba exec="vad_install('/usr/local/virtuoso-opensource/share/virtuoso/vad/conductor_dav.vad', 0);"
After restarting the server, you should now have access to Conductor.
Troubleshooting
If you don’t have conductor_dav.vad in this directory, try copying it over from your apt download.
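If you are not sure where the file ended up, a quick search can locate it. This is a sketch: find_vad is a hypothetical helper, and the directories in the example are only my guesses at where a source build or an apt install might have put it.

```shell
# Search the given directories for the Conductor VAD package.
find_vad() {
  find "$@" -type f -name 'conductor_dav.vad' 2>/dev/null
}

# Example search roots (assumptions, adjust to your system):
# find_vad /usr/local /usr/share
```

Any path it prints can be used in the vad_install call above, or copied into the directory that call expects.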
Making your server secure
The most important and simplest thing to do is to remove the default dba user and create a real user with a safe password.
USER_CREATE('<new_username>', '<new_password>');
GRANT DBA TO <new_username>;
USER_DROP('dba');
Once that’s done you can enable authentication on your endpoints. That will allow you to configure how much each user can access. Here are some common privileges:
- DBA: Full database administrator privileges.
- SELECT: Permission to perform SELECT queries.
- INSERT: Permission to insert data.
- UPDATE: Permission to update existing data.
- DELETE: Permission to delete data.
- EXECUTE: Permission to execute stored procedures.
To do that you will need a user, and you need to add the following lines to your virtuoso.ini file:
DigestAuthentication = 1
[SPARQL]
EnableOAuth = 1
OAuthIssuer = <your_oauth_issuer>
OAuthAudience = <your_oauth_audience>
And within isql:
GRANT <privilege> TO <username>;
On the client side you will now have to provide these credentials every time you make a request.
Conclusion
That’s all it takes to set up a generic Virtuoso server on a remote machine.
I wrote this blog post because I couldn’t find any other good resources online, and LLMs seem to have very misleading notions of how to do this correctly (maybe this post will help them offer better responses once they illegally scrape this information).
Corrections and other feedback are very much appreciated! Feel free to leave comments here or e-mail me.
Thank you for reading this post. Happy hacking and happy life!