What question should you be asking yourself sooner?
Further below are 3 jobs including: Data Engineer at Airtime Rewards, Permanent, Manchester, Analytics Engineer at Yoto, Permanent, London, Software engineers at the Bennett Institute of Applied Data Science
Below I note a recent question I asked in my RebelAI leadership group on "what question do you wish you'd been asking yourself sooner?". I also share a tip on Copy on Write for significant speed-ups in Pandas 2.2+ from my Fast Pandas course.
The PyDataLondon 2024 schedule is live and the talks look really good - tickets are on sale and they've always sold out in the past. On the Saturday morning I'll run another of my regular leadership discussions (reply to this and I can add you to the GCal reminder). The leadership discussion is only for ticket holders.
The goal for the leadership discussions is to dig into opportunities and challenges for team leaders and to get advice from the crowd on how they've solved similar issues, so members can get closer to repeatable success. This format evolved into my RebelAI private leadership group (noted below). I've been running these sessions for 7 or so years at PyData conferences now - people tell me that they meet very interesting people in these sessions. If you're looking for other leaders, advice and networking, I'd strongly suggest you get a ticket and attend on the Saturday morning (and mail me back to get a GCal invite).
Training
I have new dates to announce for my upcoming training in July and September. The links on my training page show the July and September dates - if you fill in my training notification form I'll happily send you a 10% discount code valid for this year. In July (and September) I'll be running:
- Fast Pandas July 18-19 - make your existing Pandas codebase 2-30x faster per bottleneck by addressing common issues with powerful speed-ups
- Successful Data Science Projects July 11-12 - decrease failures and make success more likely with better project planning and execution
- Software Engineering for Data Scientists July 8-10 - increase your speed of delivery by modularising, running code reviews, testing for increased confidence and preparing for production from early on
Each course above has early bird tickets if you buy soon, and that'll combine with the discount code I'll send you if you fill in the training survey.
If you'd like your Pandas code to run faster, your team to write more maintainable DS code and your projects to succeed more frequently - check out the above and fill in my training survey.
RebelAI
My RebelAI leadership group for "excellent data scientists turned leaders" continues to grow very well, now 7 months in with 20+ leaders attending monthly Zoom calls to talk through opportunities and challenges, backed by weekly conversation in the Slack group. I've got a 3 page PDF doc which explains what we do - reply to this if you'd like a copy. I'm actively seeking new members for our next intake.
Recently we've been talking in the Slack group about "what question do you wish you'd been asking yourself sooner?". Some of the replies include:
- "What would my future self want me to do now?"
- "Where do you want your team to be in a year"
- "Write down the quality of relationships with the people you interact with - where are the blind spots?"
- "Should I stop doing this? What are the objectives you have to hit by when to continue doing this?"
- "How are they hoping to score?" - (is there a route from idea to value?)
- "Are you really taking advantage of this time with your child?" (we also have a parents subgroup as 30% of the group are parents - me too)
- "Are we on a high growth curve? / Is there lots of upside potential?"
- "What negative stories are you telling yourself? Are they true? What evidence do you have for them (is there any?)?" (for dealing with imposter syndrome)
All members participate in asking a Monday Morning Question to challenge the group, in part so members get answers to their questions and in part to make sure members are thinking about challenging the rest of the group - that's a useful part of leadership training. It leads to great discussion!
If you're a data science team leader and you'd like to have a private peer group who'd enable you to make better decisions, reply to this and I'll tell you more about RebelAI.
Fast Pandas - get large speed-ups with minor changes to your Pandas codebase
During my next Fast Pandas course in July I'll talk about the new Copy on Write mechanism that's available via a switch in Pandas 2.2 and will be on by default in Pandas 3.0. This new mechanism has a great positive impact - you can use less RAM and DataFrame modifications can go much faster. The downside is that you may need to modify your code (even if you've followed and solved the warning messages you get).
The Copy on Write docs are good and give useful examples, and I extend these with demos and an exercise during the class. The big thing is that by avoiding copying-all-the-time you can significantly reduce your RAM footprint (so you don't run out of RAM!) and also gain big speed-ups (e.g. 5-10x on complex operations).
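To make the idea concrete, here's a minimal sketch of the behaviour (an illustration only, not taken from the course material). It assumes Pandas 2.x, where Copy on Write is switched on with the mode.copy_on_write option. The DataFrame and the variable names are made up for the example.

```python
import warnings

import numpy as np
import pandas as pd

# Opt in to Copy on Write. The option exists in Pandas 2.x; on Pandas 3 the
# behaviour is the default, so any deprecation warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    pd.set_option("mode.copy_on_write", True)

# Hypothetical toy data, just for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# Selecting a column is lazy: the new Series still shares memory with df.
col = df["a"]
shares_before = np.shares_memory(col.to_numpy(), df["a"].to_numpy())

# The first write triggers the copy, so the parent DataFrame is untouched.
col.iloc[0] = 100
shares_after = np.shares_memory(col.to_numpy(), df["a"].to_numpy())
parent_unchanged = df["a"].tolist() == [1, 2, 3]

# To update the parent, write to it in a single .loc call rather than
# through chained indexing (which never reaches the parent under CoW).
df.loc[0, "b"] = 40.0

print(shares_before, shares_after, parent_unchanged)
```

The copy is only paid for when something is actually modified, which is where the RAM saving comes from. The flip side is that code which relied on a selection silently updating its parent needs rewriting to write to the parent directly.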
I had a great discussion on this with my most recent hedge fund client and that's set them up for a migration to Pandas 3. If you've got an established codebase and you're not sure of the impact of moving to Pandas 3 when it comes out this year, you probably want to attend the public course to get your questions answered. If you want a private session, just reply and ask.
Motoscape - the return of the Five-LOW
We took the car on a test drive around the Isle of Wight - it still runs, hoorah! It easily passed its MOT and after an engine management system reset we seem to have better performance (for, you know, a 1.6 litre petrol non-turbo 24 year old estate car). The indicators did stop working at one point but that had bitten us before - you just pop out the hazard light box, wipe off the excess oil (nope, we don't know why there's excess oil in there), then the indicators work just fine again.
We say a huge thank you to John Sandall and a couple of others for our first donations. We've done some more work on the car and the Alzheimer's Society are happily working with us.
Last year we raised £4k for Parkinson's Research and we plan to do the same or better for the Alzheimer's Society as there's a family impact there for one of my co-drivers.
Footnotes
See recent issues of this newsletter for a dive back in time. Subscribe via the NotANumber site.
About Ian Ozsvald - author of High Performance Python (2nd edition), trainer for Higher Performance Python, Successful Data Science Projects and Software Engineering for Data Scientists, team coach and strategic advisor. I'm also on twitter, LinkedIn and GitHub.
Now some jobs…
Jobs are provided by readers, if you’re growing your team then reply to this and we can add a relevant job here. This list has 1,600+ subscribers. Your first job listing is free and it'll go to all 1,600 subscribers 3 times over 6 weeks, subsequent posts are charged.
Data Engineer at Airtime Rewards, Permanent, Manchester
Design and implement robust, scalable data pipelines to ingest data from internal platforms into our data warehouse. Monitor and maintain data pipelines, ensuring data quality, integrity, and availability. Optimise data pipelines to enhance performance and reduce cloud computing costs. Understand, gather, and document detailed business requirements. Take ownership of data projects from planning to delivery, collaborating with other departments as needed. Innovate and automate current processes, driving continuous improvement.
- Rate: £35,000 - £45,000
- Location: Manchester, Hybrid (2 days/week in office)
- Contact: oguzcan.koncagul@airtimerewards.com (please mention this list when you get in touch)
- Side reading: link
Analytics Engineer at Yoto, Permanent, London
We’re looking for an Analytics Engineer to join our team to accelerate the business and help us make sense of the terabytes of data we receive every day.
We’re a small team at the heart of all the decisions Yoto makes. We work in a mature, high-trust environment with a lot of independence. Everyone can contribute ideas and be part of the decision making process. We tackle a broad range of problems, from developing cutting-edge data products to building and maintaining our data orchestration platform. Our work spans across all the key strategic projects throughout the company.
- Rate: £30,000 - £40,000 based on experience.
- Location: Kings Cross, London (Hybrid)
- Contact: jeena.lakshmanan@yotoplay.com (please mention this list when you get in touch)
- Side reading: link
Software engineers at the Bennett Institute of Applied Data Science
We're looking for software developers, at all stages of their careers, to help build, maintain, and operate OpenSAFELY -- a revolutionary open source platform for secure clinical research. We're also looking for a team lead, a project manager, and a research software advocate (think "developer evangelist" for research).
Led by Ben Goldacre (clinician, researcher, and author of Bad Science and Bad Pharma), we’re a truly interdisciplinary team with a strong track record of delivering useful tools in a globally leading research setting. You’ll have the chance to use your software skills to save lives and further the state of medical data research. Our software delivery teams are collaborative, supportive, thoughtful and kind, and we support hybrid or fully remote working, with in person team events throughout the year.