Doodling Data

September 24, 2023

Bad Data Science job specs

Fountain pens, image generated with whatever AI tool Substack uses.

This post was first published in 2021 on my blog but I’ve slightly edited it here. Maturity in Data Science is progressively improving but we still haven’t solved all issues, so I think this is still quite relevant.

Looking for a job means going through a plethora of job specs to select the ones you are both interested in and a good fit for - a process that is often incredibly inefficient. Moreover, the fact that data science continues to be a bit of an ill-defined field and suffers from a high level of hype does not help. A lot of what is in here would probably translate well to fields other than data science, possibly even outside of tech, but there are peculiarities of data science that make it particularly troublesome. Lots of companies still don’t know how to hire.

I will try to outline the main features of bad specs (IMHO, of course) in order to provide some (opinionated) guidance on when it is better not to spend the time. The hiring process in data science is still, to this day, affected by poor job descriptions, but I maintain that the more people complain about them and the fewer candidates accept them, the better things will get.

Chances are you will encounter a fair number of job specs which tick yes to one or more of these:

  • Ask you to know many, many - too many - things at once: they’re written like shopping lists of technologies/concepts you are supposed to be good at;

  • Require a number of years of experience which is incommensurate to the scope of the role;

  • Focus much more on tools than on the contribution of a data scientist as a problem solver;

  • Are written in a very self-praising tone when describing the company;

  • Devote a lot of space to the personality of the hire they are looking for: someone who is “passionate” or “enthusiastic”.

Note that I am excluding outright all those specs that are littered with typos or creative punctuation, or that in any case look like they were put together in a rush and never proofread for the basics - those do not deserve your attention.

Truth is, writing a good job spec is hard. You need to have a solid understanding of the role and its place in the tech landscape and correctly evaluate what level to hire for: do you need a starter, someone who may not have work experience but has all the skills desired, or do you require someone who has done (commercial) work already? Do you need someone fully focused on technical work or do you need a manager, someone whose job will be leading, whether a team or the strategy, or both? You also need to be able to delineate why there is a need for data science at the company and not, say, general analytics, or what problems are tackled that require sophistication like Machine Learning. You also need to have the basic writing skills to be able to produce a good text. Finally, you need to sell it without making it clear that you are selling it.

Let’s go through the points above one by one.

Specs that want you to know too many things

This is a common find - specs that look like shopping lists. A typical example would contain bullet points along the lines of “proficient in Python, Azure, Scala, Hadoop, AWS, hypothesis testing, SQL, Github, deep learning…”. Many times, these specs do not even separate their shopping list into categories: they put programming languages together with workflow tools, operating systems together with Machine Learning concepts, software together with Statistics notions. It really looks like the author has pulled a few of the most recurring words coming up from a superficial Google search and placed them together. This is a big red flag: while it is noble to give people the benefit of the doubt, you can assume that whoever wrote this does not have a clear understanding of the role itself and why the company is hiring for it.

Listing tools and concepts as bullet points does not demonstrate competence (note this is valid for writing a CV too). Of course there will be programming languages, areas/subfields, tools and concepts the candidate is required to be knowledgeable about, and there might also be strict requirements as to specifics because maybe the job is specialised. The problem arises when the list of job requirements is carelessly cobbled together as a bunch of words with only a flimsy connection to genuine needs.

When specifics of a job are well listed in a spec instead, they are usually divided into meaningful areas of focus. An example would be something like:

  • Programming: Python proficiency is a must, knowledge of R/Scala a plus;

  • Deployment and implementation: experience with a cloud computing service is essential, we operate in AWS but will evaluate candidates who have worked with any cloud provider;

  • Data Analysis: solid understanding of data cleansing techniques and statistical validation of results is required;

  • Machine Learning: we are looking for people with research experience in state-of-the-art Computer Vision.

In short, you want to consider specs which show the company knows what they are talking about: what actual skills are needed and why. Remember, they might be overshooting and listing more things than they actually need: hiring is a pas de deux where each side is trying to sell themselves to the other and it is up to both to read between the lines. There is a multitude of articles and blog posts advising job seekers to apply to a job even when not all requirements are met, precisely because of this. I agree with it, with some caveats: you should understand what the company is looking to fill (and as we said the way they present requirements is key) and the macro knowledge requirements which are actually important; you should be able to detect where in the spec some lines are listed in order to attract some types of profiles.

All in all, remember, data science skills are largely about critical thinking and the ability to learn efficiently and independently. Think about what toolbox you have at your disposal and how it would match what they are looking for: do you think you could meet their expectations and easily get up to speed with what you might be lacking? Or are they looking for a profile more knowledgeable than you are in a specific core component?

Specs that require too many years of experience

The most popular way to assess experience in the job market is by counting the years spent working. It makes (some) sense: the more you have worked, the more experience you have and this does correlate with the things you know about the job as a whole and the industry - so you can bring more to the table.

However, relying on this measure alone is in my view old-fashioned and could even backfire in the long run. The requirement for experience length is usually expressed at the very beginning of a spec, indicating it is an essential one. Now, if the position is a “senior” one, it goes without saying that having been there and done that in some form is necessary - we all need to cut our teeth on the first projects before becoming seasoned professionals. If it is a “managerial” role, the company would be looking for someone who has spent time managing (projects and/or people) before. Trouble arises when no seniority in the role is explicitly formulated but years of experience are given as a must: what are they based upon? Likely just on an intuitive assessment by the hiring team/manager plus a general feel gathered from what the competition asks for in similar roles, so it is a self-perpetuating illusion.

Asking for experience in a technology is a slippery slope anyway: would someone who has used Python for 10 years be better than someone who has only coded with it for a year? You cannot really tell from that alone. If you have worked with Docker for the last 3 years, does it mean you understand everything about containers? One can hardly tell. All this is obviously true for any job in tech, not just in data science, but with data science it can get even trickier, as there are technologies and skills that are more conceptual than practical, and measuring competence in years of use can really be meaningless.

If someone is writing a spec for a data science position and there are pieces of knowledge required, they should focus on assessing competence with means other than counting years of use. In fact, data science is all about measuring values and building quantitative and reliable information, right? So you should expect that the person writing a data job spec does a little data exercise in providing good information. When something like competence cannot be reliably measured and evaluated in numerical ways, it is way better to provide verbal qualifiers:

  • Proficiency in Python - this is not saying that 10 years of use are required, but that the hire should be able to write good, robust and tested code (which means they have not just learned the language);

  • Familiarity with serverless architecture paradigms - this does not put the accent on how long you have been deploying jobs in a cloud system, but tells you that you should understand the concepts and have played with them.

There are of course other methods to assess competence, especially when the area is conceptual. In much the same way academia does when screening candidates, roles which are heavily focused on research may require you to have publications and/or a track record of delivered projects. But for a general data scientist position where specialisation is not a precondition, there should be no particular insistence on years of experience. Unfortunately, this is often not the case. My suggestion is to still apply when you do not meet the experience length requirement (but meet other ones), but to do it with cognisance: present yourself by stressing how you think you would fit the bill, regardless of how long you have spent in work previously.

Specs that do not treat you like a problem solver

This one is probably the most common. A good data science job spec should put most of the focus on what the hire is supposed to work on. Data science can be extremely broad in practical use, so the actual work expected can vary a lot depending on the company’s product, team culture and composition, the stage of technical and scientific headway and, in many cases, the company’s size.

Listing technical criteria for the hire is all well and good, but if that is all there is in the spec, it is not a good calling card: you as a job seeker should be given at least the gist of what the role encompasses. Sure enough, this will be covered in the interview if you get there, where you should be asking many questions, but a job spec is the door to that - a good one will give you an idea of what you can expect. Many do not do that. They treat you like a bundle of the items on the shopping list we discussed above rather than as a human with critical thinking skills who is willing to put them at the company’s service.

A data scientist is a problem solver: their purpose is to tackle problems with quantitative information and find the most efficient solutions. This requires a lot of “human” skills that technical background alone does not account for - a good data science job spec should put the accent on the intelligence you would bring in, not just the tools you can use.

Specs with a self-praising tone

A classic: those buzzwords companies use to promote themselves and appear good in the market, included in the context of attracting talent. While a bit of marketing of oneself is a good idea, an excessive use of self-praising terms is an irritating habit and IMHO out of place in a job spec. If you see too much of:

  • “Cutting-edge”,

  • “Award-winning”,

  • “World-leading”,

  • “Disruptive” (!?),

  • …

then it is probably one of those cases. I would not exclude these specs straight out, but I would pay closer attention to how the company generally describes itself outside of the context of the spec (what presence do they have online, how does their website look, what language do they use in e.g. social media bios?). I would also suggest taking an active look at their company values (if they publish them): how do they phrase them? Do they brag a lot there too?

Specs that insist on your personality

It is not uncommon that on top of a description of the company there is also a description of the desired hire in the culture bit, which a good spec will contain. This bit is meant to outline the kind of workflows the company uses at the overarching level and furnish a bird’s-eye idea of the relationships within the team and between teams as well as how the hierarchy flows. The way this part is written can tell you a lot about the company itself.

In much the same way as the point above on the self-praising description of the brand, this part can be full of banalities too, like when many adjectives are used for the ideal candidate that fall in the semantic area of “passionate” and “team-player” (who wants to hire someone who is not interested in the work or bullies others anyway?). The red flag arises when the whole discourse descends into tones that sound almost patronising or arrogant (for example, they list things they do not want from candidates rather than phrasing the concept in the positive) - if this is the case, it is probably the worst of all the points here.

Hope this was useful, let me know your thoughts!
