
Data Scientists and Data Engineers: Role Comparison (Part 1)

Why your data-driven ambitions cannot be realised by data scientists alone

Doug McKenzie
27 April 2020

Formula 1 drivers have total confidence in their cars' ability to storm into corners at ridiculous speeds – and, crucially, out of them again – because their race engineers have told them that their cars will do it.

It’s not a wishy-washy assurance. An F1 car’s performance on any corner at any racetrack is a categorical and provable fact, the outcome of a series of brain-fryingly complex calculations that take into account all the elements that come into play in a corner: aerodynamics, mechanical grip, road surfaces, weather conditions, tyre compounds, the thickness of the driver’s socks. All right, maybe not that last one, but the point is that there is a substantive link between the world’s greatest drivers and the equally brilliant engineers who are tasked with giving those drivers the best possible cars.

We’re talking about that because a near-identical ‘one can’t do without the other’ dependency exists between data scientists and data engineers. If the data scientist is the motoring maestro expected to drive a business to success, the data engineer is the behind-the-scenes brainbox responsible for creating the car they need to do that.

Data is the glue that makes or breaks a racing car just as surely as it does a business system. In business, the engineer’s job is to bring data in from a range of different places, clean it up and join it up. The scientist’s role is to use that data to produce an insight, to build a repeatable model, or to solve a business problem.

There’s another parallel in the tasks that face teams competing on a racing grid or in a global market: scale. The volume and diversity of data that has to be dealt with in both cases can be dauntingly vast. In business, developing data engineering and data science disciplines as part of a data management strategy is not just an option for data and analytics leaders; it’s an absolute prerequisite.

Two heads are better than one

Supplying the right type of data to the right people at the right time for projects covering everything from data cleansing and matching to deploying predictive models is a huge challenge that the data scientist and the data engineer have to tackle as a team. Their success will depend not just on the ability of the scientist/driver to carve the right line through any bend in the business road but also on the ability of the engineer to supply the right quality of data.

At Optima, graduates are generally recruited into data engineering roles where they can acquire knowledge and experience. Some may go on to switch to a data scientist role, but that’s a sideways move rather than a promotion. The two skillsets are quite separate, but highly complementary. Data engineers play a crucial role: they build the pipelines that channel the data and transform it into formats that data scientists can use. They’re further away from the end-product of the analysis, so they’re less in the spotlight than data scientists, but they’re no less important. Data scientists usually focus on a few areas and are complemented by a team of other scientists and analysts.

Quite a lot of engineers will know a bit about data science, and vice versa. There is some overlap, but it’s very rare to find someone who is a genuine expert in both disciplines. Data scientists will sometimes try to tackle the data engineering element of a project themselves (they can write code, after all), but bypassing the engineer’s ability to construct robust and repeatable data pipelines just ‘to get the job done’ is muddled short-term thinking. Rickety, improperly engineered channels will not only add time to the analysis on that specific project, they’ll also come back to haunt future projects, with juddery data flows making the scientist’s own job more difficult, or even impossible, when particularly large or complex data sets are under review.

An Optima client might need an engineer to sort out their data, or a scientist to optimise already-sorted data, or both if they’re in more of a pickle. The senior members of Optima’s long-established team of dedicated data scientists and data engineers can quickly identify a client’s need and then deploy the right personnel to get a project completed quickly and to the highest standard. They’re also well equipped with the soft collaborative skills that are so important in understanding a client’s problem and building strong client relationships. Experience tells us this approach creates a formidable blend of competencies. Whether it’s a Formula 1 car or a business, investing in excellence will always boost your chance of a spot on the podium.
