The number of technologies we can find these days can be quite overwhelming. In this article, I’ll show you how to use StackOverflow to better understand which technologies are often connected to one another. You might find this topic interesting if you are a Sourcer/Recruiter (to find better search keywords or even potential candidates), a Manager (to gain a high-level understanding) or simply tech-savvy (to stay up to date). I’ve divided the article into two parts. Today, I’m going to focus on the less technical side, and we’ll go through three example tags. In the next post, I’ll describe the technical aspects and show how you can run my script.

StackOverflow
As you probably know, StackOverflow is a Q&A platform for developers. It’s part of the wider StackExchange network. There is a lot of discussion about how SO should be used. Sometimes people don’t think much about a posted answer and just copy-paste the code, which is not always correct. There are also many trivial questions. Nevertheless, it’s a very active community used by developers all over the world. That’s why it’s a great source of data about technologies.
Putting StackOverflow tags into a graph
I’ve written a Python script which does the following:
- Gathers Q&A threads with the tag provided by you. From now on, we’ll treat the term tag as equivalent to the term technology.
- Collects all the users who have provided answers in those threads.
- Builds a list of tags for each user. StackOverflow provides, for each user, information about the tags they have been active in.
- Transforms the relation: user -> tag into a graph.
- Plots the graph in Graphistry.
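To make the user -> tag step more concrete, here is a minimal sketch of how such a relation could be turned into a list of graph edges. It is not the actual script: the sample data, `USER_TAGS`, `build_edges` and `node_types` are hypothetical names made up for illustration, and the real script fetches its data from StackOverflow rather than from a hard-coded dictionary.

```python
from typing import Dict, List, Tuple

# Hypothetical sample data: the tags each answering user has been active in.
# (Assumption: in the real script this comes from StackOverflow.)
USER_TAGS: Dict[str, List[str]] = {
    "John": ["spark", "bigdata"],
    "Alice": ["spark", "hadoop", "bigdata"],
}


def build_edges(user_tags: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Turn the user -> tags mapping into sorted, de-duplicated (user, tag) edges."""
    edges = {(user, tag) for user, tags in user_tags.items() for tag in tags}
    return sorted(edges)


def node_types(edges: List[Tuple[str, str]]) -> Dict[str, str]:
    """Label every node as 'user' or 'tag'.

    Assumption: no user name is identical to a tag name.
    """
    types: Dict[str, str] = {}
    for user, tag in edges:
        types[user] = "user"
        types[tag] = "tag"
    return types


edges = build_edges(USER_TAGS)
types = node_types(edges)
print(len(edges), "edges between", len(types), "nodes")
```

Each pair in `edges` is one line on the graph, pointing from a user to a tag; the node labels are what a plotting tool would need in order to colour users and tags differently.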
Tags don’t always represent technologies, but they very often do. Let’s assume we are a Sourcer/Recruiter and we are looking for a Big Data Engineer. We have probably received a fairly typical list of requirements from the customer. We want to get to know the profile better and build better-suited search keywords.
Finding a Big Data Engineer!
I’ve run my script with the bigdata tag and built the graph. You can click on the image below and go to the interactive mode.
At the beginning, there’s a bit of chaos, but you can filter the data (in the centre of the top toolbar) to get a more readable picture. Here are some starting hints:
- Points, called nodes, represent tags/technologies (red) and users (blue).
- Lines, called edges, represent the already mentioned relationships between a user and a tag. For example, if a user John has written an answer to a question tagged spark, there will be a node John, a node spark, and a line (edge) from John to spark.
- Nodes (points) and edges (lines) have attributes describing them. One of the useful attributes is degree, which is the sum of a node’s incoming and outgoing edges. If you want to filter out less important nodes (users and technologies), you can add the filter point:degree >= 10, which hides the tags related to fewer than 10 users and the accounts bound to fewer than 10 tags.
- If you want to show only tag nodes that are related to Y (let’s say 50) or more users, set the filter to point:degree_in >= 50
- You can change the visual settings of the graph. To do this, click on the brush button. I recommend that you decrease “Edge size” and “Edge opacity” in bigger graphs. You can also increase “Max Points of interest” in the Label settings.
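If you want to see what the degree filters compute, the sketch below reproduces the idea on a tiny, made-up edge list. It is plain Python and does not use Graphistry; the sample edges and the `degrees` and `nodes_with_at_least` helpers are hypothetical, and the thresholds are lowered so the toy data produces a visible result.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical (user, tag) edges; every edge points from a user to a tag.
EDGES: List[Tuple[str, str]] = [
    ("John", "spark"), ("John", "bigdata"),
    ("Alice", "spark"), ("Alice", "hadoop"), ("Alice", "bigdata"),
    ("Bob", "spark"),
]


def degrees(edges: List[Tuple[str, str]]) -> Tuple[Counter, Counter, Counter]:
    """Return (degree_in, degree_out, degree) counters keyed by node name."""
    degree_out = Counter(source for source, _ in edges)  # tags per user
    degree_in = Counter(target for _, target in edges)   # users per tag
    degree = degree_in + degree_out                       # sum of both
    return degree_in, degree_out, degree


def nodes_with_at_least(counts: Dict[str, int], minimum: int) -> Set[str]:
    """Keep the nodes whose count is at least `minimum`."""
    return {node for node, count in counts.items() if count >= minimum}


degree_in, degree_out, degree = degrees(EDGES)

# Analogous to point:degree >= 2: drops users and tags with a single edge.
print(sorted(nodes_with_at_least(degree, 2)))

# Analogous to point:degree_in >= 3: only tags can pass, since users have
# no incoming edges in a user -> tag graph.
print(sorted(nodes_with_at_least(degree_in, 3)))
```

Because edges only ever point from users to tags, a tag’s degree equals its incoming count and a user’s degree equals its outgoing count, which is why a degree_in filter leaves only tag nodes.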
I’ve also gathered data for the mlops and kubeflow tags.
Searching by a tag or user
Graphistry provides a useful feature – Data Table. It’s a list of all nodes and edges combined into two tables. To display it, click on the Data Table icon (top toolbar). You’ll see a table like this:

It’s especially helpful when you have a large number of nodes and edges, and the first view is not very clear. For example, you may have difficulty finding the kubeflow tag on the last graph. In such a case, open the Data Table, type kubeflow in the search field and click on the row. This point will be highlighted on the graph.
Selecting the right tag
The MLOps and Kubeflow datasets have a lot in common, but they are not the same. This is a good example of two approaches. When you want to find tags/technologies related to a specific one, like Kubeflow in this case, you can run my script for this specific technology. You can also start with a more general term, like MLOps, to find a set of similar technologies, and then look for more specific ones.
How to run the script? If you are familiar with Python, it’s really easy, and even if you’re not a technical person, you won’t find it difficult 🙂 We’ll focus on this in the next article. If you can’t wait and want to test the script right away, go to the repo: https://github.com/data-hunters/tech-skills-visualizer. Stay tuned!