You are currently reading Issue 181: Women in Mathematics, May 2024

Connecting women and opportunity

Womanthology is a digital magazine and professional community powered by female energy and ingenuity.


From music to mathematics and data visualisation: Taking a front seat on the wild ride that is the data revolution

Dr Charles T Gray, Data Scientist

Video gamer

Dr Charles T Gray specialises in data visualisation, engineering, simulation, and machine learning. After a decade in music and arts, she changed career and pursued mathematics, earning degrees and a PhD at La Trobe University in Melbourne, Australia. Now in Copenhagen’s gaming industry, she works with colleagues to revolutionise understanding of data structures and she is a passionate champion of women in data, regularly speaking at events that aim to support new entrants into data careers.

Dr Charles T Gray

“Working in the video game industry is a dream for me as a passionate gamer. While I’ve worked in various fields like psychology and medicine, nothing compares to my love for video games.”

From music to mathematics

It is a special thing to have a front seat on the wild ride that is the data revolution currently going on in the world. It’s hard and challenging at times, but it’s also incredibly rewarding and has taken me to so many different places.

In 2011, when I was 30 years old, I began my journey of reorienting my career over ten years of study. I decided I wanted to do something different from music. Within a year, whilst I was in my second year of university, I got my first part-time job in data and I’ve had pretty much non-stop employment in data since.

I started with part-time studies in maths and statistics while working. Then, I pursued my undergraduate and honours (equivalent to master’s) degrees in Australia, followed by a direct entry into a PhD programme. Throughout my thirties, I dedicated a decade to formal education in mathematics and statistics. Then I gained practical experience in various fields such as ecology, psychology, neuro-marketing, finance, medicine and, most recently, video games. You can read the last Womanthology article I wrote way back in 2017 by way of an overview of my journey to that point.

As job titles shifted from informatics to bioinformatics, research software engineer, and now data scientist, the core job remained constant. Essentially, people provide questions about data, and my role involves cleaning and organising it to fit models for deriving answers. This process also includes creating visual representations like graphs and summarising statistics such as counts and percentages to aid this understanding.

Creating a data visualisation is usually the endpoint, and it is an area that has always been a real passion for me. It’s a delight that in my current role I have a lot of autonomy in shaping interactive dashboards and visualisations to provide meaningful information for different stakeholders in the company: game producers, the CEO, the marketing and finance departments, and the CTO.

There’s a lot of buzz around technical aspects like AI and machine learning in the data realm, but I believe it’s overly complicated. At its core, people simply have data and questions to answer, leading to an increasing array of roles in this field. Having entered the field in 2012, I’ve amassed over a decade of experience, yet I’ve only been working full-time as a data specialist for the past couple of years since finishing my studies.

The relationship between mathematics and data science

In data science, mathematics is prominently applied in modelling, where we examine the relationship between variables. We aim to isolate specific factors and understand their impact on outcomes while controlling for other variables. This involves representing the relationships mathematically as equations, allowing us to solve for coefficients that approximate the influence of each factor on the outcome of interest.
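
As a minimal sketch of that idea, the Python below fits an outcome as an intercept plus one coefficient per factor, using ordinary least squares. The data, the meaning of the two factors (hours played and whether a tutorial was completed) and the function name `fit_linear_model` are all invented for illustration and are not taken from any real project.

```python
def fit_linear_model(rows, outcomes):
    """Least-squares fit of outcome = b0 + b1*x1 + b2*x2 + ... via the normal equations."""
    # Add a leading 1 to every row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Build the augmented system [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(X, outcomes))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical factors per player: (hours_played, completed_tutorial).
factors = [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0)]
# Outcomes were constructed as 1 + 2*hours + 3*tutorial, with no noise,
# so the fit should recover coefficients close to 1, 2 and 3.
outcomes = [3, 8, 7, 12, 11]

coefficients = fit_linear_model(factors, outcomes)
print([round(c, 6) for c in coefficients])
```

Each coefficient estimates how much the outcome moves when one factor changes by one unit while the other is held fixed, which is the sense of "controlling for other variables" here. In practice a statistics library would normally do this fitting.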

Broadly speaking, that is what people are doing most of the time when they’re modelling. Sometimes there are variations on a theme, but almost always, that is what’s going on with ‘modelling’, which is increasingly used interchangeably with the term ‘machine learning’. There are many different names for it, with people using principles with stacks of fancy equations, but under the hood, it is ‘maths’. So, there’s a direct relationship between mathematics and data science, but there are more indirect ways that aren’t necessarily obvious.

We can filter data down, for example to the things from last year but with this subgroup and not that subgroup. That can be represented as abstract algebra, which is where I’m trained. I still refer back to abstract algebra when I’m trying to solve a problem with some data sets. Abstract algebra is the formal language of data engineering, but we also use software packages.
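
One way to picture such a filter is as set operations. The sketch below uses Python sets; the records and the field names (`year`, `subscriber`, `tester`) are hypothetical and chosen only to illustrate the idea.

```python
# Hypothetical player records; the field names are invented for this sketch.
players = [
    {"id": 1, "year": 2023, "subscriber": True, "tester": False},
    {"id": 2, "year": 2023, "subscriber": True, "tester": True},
    {"id": 3, "year": 2022, "subscriber": True, "tester": False},
    {"id": 4, "year": 2023, "subscriber": False, "tester": False},
]

last_year = {p["id"] for p in players if p["year"] == 2023}
subscribers = {p["id"] for p in players if p["subscriber"]}
testers = {p["id"] for p in players if p["tester"]}

# "Last year, in this subgroup, and not in that subgroup":
# an intersection (&) followed by a set difference (-).
selected = (last_year & subscribers) - testers
print(selected)
```

The same filter could be written as a single condition on each row; writing it as sets makes the underlying algebra visible.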

At work, my colleagues and I frequently refer to the documentation for the software package that we built, and use it to manage our data. We have folders to help us navigate it and make sense of all the code, and the way we understand it is through what’s called a directed graph, which is a mathematical object.

I taught graph theory in academia, which is a pure maths subject. You may have heard of Groetzsch’s Three Color Theorem, or the Travelling Salesman Problem, or the Seven Bridges of Koenigsberg, which is the initial point for graph theory, but our documentation is graph theory in action.

There are arrows in the graph that all point forwards, but if you zoom in closely, you’ll notice smaller arrows. This is why it’s called a directed graph, which is essentially a network representation. Each of these represents a dataset.

We start by importing raw video game data into my software using a script, and then fetching it from the warehouse. Next, I modify the data by extracting values and adding new columns. I can examine the documentation and code for specific nodes on the graph. Essentially, we use a mathematical concept to analyse and work with our code constantly throughout the day.

To navigate and understand a directed graph you don’t need fancy graph theory, but you just need to be comfortable with the idea that it’s a mathematical object because you’re constantly using it to understand your own code. Each one is just shaping the table in a different way, so I might want to filter it down to only these things, or maybe I want to remove these columns. This is just fancy spreadsheeting and having a relationship between all the transformations so you can track and zoom in on them and inspect the output at each point.
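
A small sketch of this kind of graph, using Python's standard `graphlib` module, might look like the following. The dataset names and the helper `upstream` are hypothetical; the article does not describe the actual pipeline or the software that draws it.

```python
from graphlib import TopologicalSorter

# Each key is a dataset (a node); its value is the set of datasets it is built from.
# All names here are invented for the sketch.
lineage = {
    "raw_events": set(),
    "clean_events": {"raw_events"},
    "level_attempts": {"clean_events"},
    "device_performance": {"clean_events"},
    "dashboard_table": {"level_attempts", "device_performance"},
}

# An order in which every dataset comes after the datasets it depends on.
build_order = list(TopologicalSorter(lineage).static_order())
print(build_order)


def upstream(node, graph):
    """Every dataset that `node` depends on, directly or indirectly."""
    found = set()
    stack = list(graph[node])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(graph[parent])
    return found


print(upstream("level_attempts", lineage))
```

Asking what sits upstream of a node is the "zooming in" step: it tells you which transformations to inspect when an output looks wrong.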

It’s the same thing with data visualisation. We all had to learn about the X axis and the Y axis, but nowadays in most data work you’ve got to use visualisation to provide your answers to your stakeholders, so everything comes down to ‘What’s on the X axis?’ or ‘What’s on the Y axis?’ It seems to get more and more complicated, but ultimately, it’s not that complicated because it is fundamentally maths.

So, where does maths come into play in data science? While there are certainly clear connections with the complex models of AI, machine learning, and statistical modelling, people often overlook the subtler ways maths influences our work. My philosophy is that the key to excelling in mathematics and data science lies in being comfortable with making mistakes and learning from them. That’s it.

I get a kick out of it when I finally get it to work. I like it when I can automate things. I’ve played simulation video games, like SimCity, for decades, so it’s just an extension of this. I’ve got an island, and now I’ll build some houses so people can move in. People are moving in, but now they’re hungry, so I’ve added a little fishing hut. Or you’ve arrived on a planet and now your colonists need oxygen and water, so you better build an oxygen maker.

I sometimes wonder if people’s negative views of maths stem from how it’s been portrayed by those trying to appear superior. Women, on the other hand, tend to be more inclusive and enjoy collaborating with others. When a group lacks diversity, it can affect perspectives on maths. Despite having a PhD and publishing papers in advanced maths, my focus is on making maths accessible to everyone. Many don’t realise that we use maths in our daily lives, believing it’s something only certain people can do when in reality, we’re all doing maths constantly.

Exploring video game analytics

When my team and I analyse video game data, we answer various questions about the game, such as which levels players struggle with or enjoy the most. We can’t always measure enjoyment directly, so we use proxy measures. We also examine metrics like frames per second to see how the game performs on different devices. For example, Apple products usually have higher frames per second than Android devices. This led me to discover differences in how operating systems function.

We also examine technical aspects like how data is sent to the game and we ensure it’s captured accurately. Visualisations help identify issues, such as bugs in the code, which engineers can fix. In marketing and finance, we analyse subscriber patterns to improve service. Additionally, as an ed tech company, we focus on learning outcomes for players.

Working in the video game industry is a dream for me as a passionate gamer. While I’ve worked in various fields like psychology and medicine, nothing compares to my love for video games.

Accessible to anyone

In previous sectors I’ve worked in I’d have to bend over backwards to explain the mathematical modelling I was doing, for example, network meta-analysis, a form of evidence synthesis. Working on gaming is much more accessible and people are always very interested in the data. Out of all the many data jobs I’ve had, even though the job is ultimately the same, this one’s the most accessible.

The delightful part about what I’m currently doing is it’s so accessible to anyone because most of us work with spreadsheets. So, every single time a single player performs an action we’re able to aggregate this and so we’re now working with data that goes to tens if not hundreds of billions of rows.

We need to focus on the necessary data to answer questions, which is the job of data engineering. (It’s also important to make sure we are asking the correct questions. More on this later…) After cleaning the data, we often create summary statistics, like percentages of people who performed certain actions, rather than always making predictions with models.
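
As a rough illustration of such a summary statistic, the sketch below computes the percentage of players who performed each action. The event rows and action names are invented, and a real table with billions of rows would be aggregated in a data warehouse rather than in a Python list.

```python
from collections import Counter

# Hypothetical event rows: (player_id, action). Invented for this sketch.
events = [
    (1, "level_started"), (1, "level_completed"),
    (2, "level_started"),
    (3, "level_started"), (3, "level_completed"),
    (4, "level_started"),
]

players = {player for player, _ in events}

# Deduplicate first so each player counts at most once per action.
players_per_action = Counter(action for _, action in set(events))

percentages = {
    action: 100 * count / len(players)
    for action, count in players_per_action.items()
}
print(percentages)
```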

I genuinely strive to be inclusive and open about data and maths because I believe everyone can understand them, even if they don’t realise it yet. My journey into maths and coding started later in life, after spending my twenties focused on music, writing, and the arts. Surprisingly, I’ve found maths and coding to be more creatively fulfilling than my previous career.

It’s important to challenge gender stereotypes that pigeonhole women as solely creative and lacking technical skills. Communication skills, for instance, are crucial in coding to make it understandable and accessible to others. By incorporating these principles into my work, I aim to be a better programmer, showing that the overlap between technical and artistic skills can lead to richer outcomes.

Advice to people interested in data careers

There are so many ways to get into tech and data now. You can join an organised bootcamp to learn to code or you can teach yourself at home. There are all kinds of amazing tools that use generative AI to help you learn. I’d recommend GitHub Copilot, an AI developer tool which allows you to type in questions and things you’re stuck on. I affectionately call it ‘Super Clippy’ in homage to the Microsoft Assistant animated paperclip character from Windows in the late 90s to early 2000s. So, you could use it to get started on learning a language like Python, for example, which is a great place to begin.

Beyond the practicalities, the main thing I notice is that people often focus on what job title they want, but it’s more important to consider what you enjoy doing. There’s a place for everyone in the field of data, whether you prefer working with teams or diving into technical challenges. While many associate data science with hardcore mathematics, there’s a variety of roles, like data optimisation, that may not seem as technical but still involve complex maths behind the scenes.

Much of the work in data involves thinking about raw data sources and how to clean them up. For example, extracting specific information from a cluttered column or removing irrelevant rows. This technical process, known as data engineering, focuses on shaping and organising data to fit specific requirements without altering its information. While not heavily mathematical, it’s a crucial aspect of tasks like modelling and data visualisation, occupying much of the work in these fields. However, it may not appeal to everyone due to its detail-oriented nature.

In my experience, this type of work has dominated 95% of the time. Most of the time I don’t spend my time doing hardcore maths, but rather I am shaping and transforming data sets so that I can feed them into the tools. It’s not changing what’s in the data. I’m a scientist and I’d never change the truth, but it’s about getting it into the right format to feed into tools.
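
A minimal sketch of this kind of shaping, assuming a made-up free-text column called `device_info`, could look like this. It pulls one value out of a cluttered column and drops rows that carry no measurement, without changing any recorded value.

```python
import re

# Hypothetical raw rows with a cluttered free-text column. Invented for this sketch.
raw_rows = [
    {"player": 1, "device_info": "iPhone 13 | iOS 17.2 | fps=58"},
    {"player": 2, "device_info": "Pixel 7 | Android 14 | fps=41"},
    {"player": 3, "device_info": "internal test build"},  # irrelevant row
]

FPS_PATTERN = re.compile(r"fps=(\d+)")

clean_rows = []
for row in raw_rows:
    match = FPS_PATTERN.search(row["device_info"])
    if match is None:
        continue  # drop rows that carry no frames-per-second measurement
    clean_rows.append({"player": row["player"], "fps": int(match.group(1))})

print(clean_rows)
```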

There is a career for everyone in data

There’s also plenty of other work in the data field, like data governance. For instance, the EU has strict privacy laws, so we need policies for managing data across teams and loading it onto different platforms while ensuring privacy principles. We rigorously check data multiple times to maintain cleanliness, such as hashing email addresses to protect personal information. In large companies, there’s a growing industry focused on writing protocols to ensure compliance with laws like GDPR for managing data securely.
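
The article does not say which hashing scheme is used, so the following is only a sketch under assumptions: it uses a keyed SHA-256 digest (HMAC) from Python's standard library, with a placeholder key and a hypothetical function name `pseudonymise_email`. A keyed hash is chosen here because a plain hash of an email address can be matched by anyone who hashes a list of guessed addresses.

```python
import hashlib
import hmac

# Placeholder key for the sketch; a real key would be stored securely, not in code.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymise_email(email: str) -> str:
    """Return a keyed SHA-256 digest of a normalised email address."""
    # Normalise so the same address always maps to the same token.
    normalised = email.strip().lower()
    return hmac.new(SECRET_KEY, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


token_a = pseudonymise_email("Player@Example.com ")
token_b = pseudonymise_email("player@example.com")
print(token_a == token_b)
```

Because the same address always yields the same token, tables can still be joined on the token without the address itself appearing in them.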

Facilitation is another important area of work. Coders like me aren’t necessarily the people you want speaking directly with a client because we can get lost in the technical details. There’s an enormous number of roles for people who can interpret the picture and explain it to a stakeholder, a client, or someone else. So, there are all these intermediary roles in data that don’t have anything to do with being a coder or doing mathematics.

If you’re only interested in maths and don’t enjoy coding or other aspects, you might not find a fit in the data field. I’m not a software engineer, nor did I study computer science. My coding expertise lies in designing mathematical algorithms for calculations and modelling. It’s important to consider how technical you want your role to be and what you enjoy, rather than assuming data work revolves solely around hardcore maths and machine learning.

Even for those with extensive training in complex maths, like myself, the reality of data work often involves diverse tasks like project planning and collaborative writing. The field is dynamic and constantly evolving, offering a wide range of opportunities beyond traditional expectations. It’s crucial to stay open-minded and adaptable, as what you train for may not align exactly with what you end up doing.

Discovering new passions

Passion projects and hobbies can significantly contribute to career growth in fields like data science, often opening unexpected doors for new opportunities and helping you explore what you enjoy in the world of data. This could involve creating visualisations, understanding concepts, or participating in community activities like coding meetups and events.

These gatherings are often volunteer-run and offer opportunities to learn and connect with others who share your interests. Speaking at such events isn’t about showing off, but rather about making connections and sharing knowledge with like-minded individuals.

I love hearing people’s questions, especially when they’re stuck on how their data is structured. Instead of letting the data dictate the visualisation, start with the question you want to answer and shape the data accordingly. This critical thinking approach helps avoid the trap of answering unnecessary questions just because you can, rather than focusing on the ones that should be answered.

There are a lot of people in data science answering questions that don’t need to be answered just because they can and, as a result, they’re not answering the questions that should be answered. These are the kinds of things that keep me up at night!

Connect with me

Please feel free to reach out to me on LinkedIn if you’re interested in speaking with me about all things data!

Coming up next

I’m giving a talk on data architecture and then the next month I’m working on a total hobby passion project.

I’ll be presenting at useR Copenhagen on animating music as mathematical objects, where I’m taking many files of music and pulling them into code, turning them into a table so I can think of it like a spreadsheet, and then taking that and turning it into a graph, which is a mathematical object. I want it to be animated.

Currently, I’ve got to the point where it is animating, but it’s not animating the way I want it to so I’m reading the documents on the animation package, and diving into each function, and just reading and sighing as it’s still not working how I want. As with all the most challenging problems, it is very frustrating but that’s what makes it all the more rewarding when I succeed!


Header image credit: Freepik
