This newsletter is now `NotANumber`
I’ve renamed my newsletter from “Ian’s Thoughts and Jobs List” to “NotANumber”. I’m building up this newsletter: recently I’ve started to interview smart people (e.g. Kaggle-Mani last issue) and I want to get some guest editors involved over time. Hence I’ve taken my name out, though I’ll be the editor in chief. Why this name? Lots of reasons; I might reveal them in the future.
Did you know that 2iQresearch are hiring for a Senior Dev and a Quant Dev? VivacityLabs have a Special Projects role and Inawisdom need a Software Engineer. Details for all of these and more are down below.
NaN==NaN is False
Had you noticed before that if you try np.nan == np.nan in NumPy you get False back? A NaN is equal to nothing, including itself, so you can only test for it with np.isnan(np.nan), which returns True. We can also say np.nan != np.nan, as that always yields True. Remember to use np.isnan in your tests.
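Here is a small sketch of that behaviour, including how it carries over to arrays. The array names are just for illustration, and the equal_nan argument to np.array_equal assumes NumPy 1.19 or newer.

```python
import numpy as np

# NaN never compares equal to anything, including itself.
print(np.nan == np.nan)   # False
print(np.nan != np.nan)   # True
print(np.isnan(np.nan))   # True

# The same rule applies element-wise inside arrays (illustrative data).
a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, np.nan, 3.0])
print(a == b)                                 # [ True False  True]
print(np.array_equal(a, b))                   # False
print(np.array_equal(a, b, equal_nan=True))   # True (assumes NumPy 1.19+)

# np.isnan returns a boolean mask that locates the missing values.
nan_mask = np.isnan(a)
print(nan_mask)                               # [False  True False]

# NumPy's testing helper treats NaNs in matching positions as equal,
# so this passes where a plain == comparison would not.
np.testing.assert_array_equal(a, b)
```

In a test suite, np.testing.assert_array_equal (or an explicit np.isnan check) is usually the safer choice than a bare == assertion when NaNs may be present.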
Thoughts
Next Tuesday evening I’m being interviewed by Coiled, the core team behind Dask. I’ll give a talk on the state of higher performance Python followed by Q&A with their Richard Pelgrim. Richard helps with new Dask user onboarding so if you have questions about Dask, or higher performance in general - you should join by registering here!
At home I’m trying to make my house more efficient - it is old and has lots of leaks, so we run the central heating more than I’d like. By reducing waste I’ll reduce our heating bill and the carbon emissions from our gas boiler. To track this I initially used an infrared camera (FLIR One Android plugin) which shows obvious gaps around my doors. I have an older generation than this one.
I’ve recently bought 12 Bluetooth thermometer/hygrometers from Govee to track room behaviour. The Android UI is ok but more importantly I can get a CSV export. The CSV header was wonky, which was a bit of a surprise. Thankfully that was easily fixed, so I could then back out the g/m³ of water in the air per room from the temperature and humidity readings (because…I’m curious!) using a pandas interpolate. Now I’m starting to experiment on which rooms lose the most heat overnight. More in the future.
Soon I’ll interview the author of Polars and some other interesting folk; I hope to have the Polars interview for the next issue.
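For the curious, here is a rough sketch of one way to turn a temperature and relative humidity reading into grams of water per cubic metre of air. This is not necessarily the calculation I used: it assumes a Magnus-type approximation for saturation vapour pressure (with Bolton's coefficients) plus the ideal gas law for water vapour, and the function names are made up for illustration.

```python
import math

# Specific gas constant for water vapour, in J/(kg*K).
R_WATER_VAPOUR = 461.5


def saturation_vapour_pressure_hpa(temp_c: float) -> float:
    """Approximate saturation vapour pressure over water, in hPa.

    Assumption: Magnus-type formula with Bolton's coefficients, which is
    a reasonable fit for ordinary indoor temperatures.
    """
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def absolute_humidity_g_per_m3(temp_c: float, rel_humidity_pct: float) -> float:
    """Estimate the mass of water vapour per cubic metre of air, in g/m^3."""
    # Actual vapour pressure is the saturation value scaled by relative humidity.
    vapour_pressure_pa = (
        saturation_vapour_pressure_hpa(temp_c) * 100.0 * rel_humidity_pct / 100.0
    )
    temp_k = temp_c + 273.15
    # Ideal gas law for water vapour: density = pressure / (R_v * T).
    density_kg_per_m3 = vapour_pressure_pa / (R_WATER_VAPOUR * temp_k)
    return density_kg_per_m3 * 1000.0


# A room at 20 C and 50% relative humidity holds roughly 8.6 g of water per m^3.
print(round(absolute_humidity_g_per_m3(20.0, 50.0), 2))
```

Because the function only needs a temperature and a humidity value, it can be applied row by row to a sensor export once the readings have been aligned in time.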
If you find this useful please share it on Twitter, LinkedIn and your communities - the more folk I get here sharing interesting tips, the more I can share back to you.
Strategy
With one of my clients we’ve been ranking a long list of potential idea-stage projects to determine which to turn into proofs of concept. We’ve found the single most useful ranking factor to be “how many hours do how many people spend in this company doing the task manually already?”. For some of the ideas we can find nobody, for others we’ve got teams working on the problem - so identifying PoC opportunities can be easy, if you ask around the organisation.
What’s your best tip to identify new opportunities for data science projects? I’ll be talking more about this in next year’s Successful Data Science Projects course (date TBC, probably Feb).
Open source
Did you know that in conda you can get an installation history via the --revisions flag (conda list --revisions)? It’ll tell you the time you installed or updated each package. More here.
I’ve mentioned VizTracer before for profiling - it is very good. I also use JobLib for easy parallelisation with clients and when running my Higher Performance Python course. Now it seems that JobLib has an integration with VizTracer to enable parallelised profiling. More power for debugging and profiling complex systems!
Footnotes
See recent issues of this newsletter for a dive back in time.
About Ian Ozsvald - author of High Performance Python (2nd edition), trainer for Higher Performance Python, Successful Data Science Projects and Software Engineering for Data Scientists, team coach and strategic advisor. I’m also on twitter, LinkedIn and GitHub.
Now some jobs…
Jobs are provided by readers. If you’re growing your team then reply to this and we can add a relevant job here. This list has 1,400+ subscribers. Your first job listing is free and it’ll go to all 1,400 subscribers 3 times over 6 weeks; subsequent posts are charged.
Special Projects - Solutions Developer at Vivacity Labs, permanent, London
At Vivacity, we make cities smarter. Using Reinforcement Learning techniques at the forefront of academic and research thinking, our award winning teams optimise traffic lights to prioritise cyclists and improve air quality. Our work makes a real difference to real people using ‘privacy by design’ principles.
We’re looking for a confident developer / ML engineer, who is comfortable working in an adaptive setting: get familiar with complex concepts, implement accurately, and communicate your plans effectively with various stakeholders. We’d like to see 1-2 years of industry experience in a relevant field. Our software is in many modern programming languages (Python, Golang, C++ etc) so you will need a willingness to learn. We’d also like to see good capability with Python or Golang.
- Rate: £45,000 - £60,000pa
- Location: Kentish Town, London
- Contact: lindsey.noakes@vivacitylabs.com (please mention this list when you get in touch)
- Side reading: link, link, link
Zarr Community Manager, NumFOCUS, Inc.
Zarr is a format for the storage of chunked, compressed, N-dimensional arrays. Built originally in Python for working with NumPy arrays, Zarr is now supported in more than half a dozen languages. With funding from the Chan Zuckerberg Initiative, we are looking to hire a full-time, open-source enthusiast for two years to work as our community manager.
- Rate: This role can be either a contract position or an employed position with fringe benefits. $60,000 – $80,000 per year dependent on position type and experience.
- Location: Remote
- Contact: hiring@numfocus.org (please mention this list when you get in touch)
- Side reading: link
SunPy Scientific Software Developer, NumFOCUS, Inc.
NumFOCUS is seeking a Scientific Software Developer to support the SunPy project. SunPy is a Python-based open source scientific software package supporting solar physics data analysis. Contract is available for U.S. residents only. This is a 1-year contract but work may be completed in less time.
- Rate: $80.00 per hour, not to exceed $51,000 for the duration of the contract (approximately 637 hours).
- Location: Remote
- Contact: hiring@numfocus.org (please mention this list when you get in touch)
- Side reading: link
Jupyter Community Events Manager at NumFOCUS, Inc.
The primary role of the Project Jupyter Community Events Manager will be to manage two event programs: JupyterCon and Jupyter Community Workshops. In conjunction with NumFOCUS and Project Jupyter leadership, you will create and implement a strategy to connect the international Jupyter community through both online and in-person events.
- Rate:
- Location: Austin, TX
- Contact: hiring@numfocus.org (please mention this list when you get in touch)
- Side reading: link
Software Engineer (Python Dev with AWS) at Inawisdom Ltd: Permanent; UK / WFH
Inawisdom are a Data Science & Machine Learning Consultancy, and AWS Premier Partner. We are looking for mid+ level Python developers with AWS experience (or OO Programmers with AWS who are willing to learn Python, or vice versa!) for a Permanent role. This is an exciting opportunity for someone to make an impact implementing and delivering cloud native solutions and serverless applications in a Data Science business. You will be required to develop software with the latest and greatest tech for high profile, enterprise clients.
• Knowledge of functional and object oriented programming.
• Knowledge of synchronous and asynchronous programming.
• 2 or more years developing in Python 2.6 or 3.x.
• Experience in using Python frameworks (e.g. Flask, Boto 3).
• Familiarity with Amazon Web Services (AWS) and REST APIs.
• Understanding of databases and SQL.
• Understanding of Non-SQL databases.
• Experience in unit testing and TDD.
Desirable requirements:
• Experience in AWS serverless services (Lambda, API GW, SNS, SQS, and Dynamo DB).
• Has developed solutions using AWS SAM or the Serverless Framework and defined APIs in Swagger.
- Rate: £50k- £80k
- Location: UK or Holland
- Contact: greg@inawisdom.com (please mention this list when you get in touch)
- Side reading: link, link
Senior Python Developer (Lisbon) - 2iqresearch
We are looking for an experienced Python Developer with a strong background in Finance to join us as one of our first engineers in the core team.
You will play a key role in designing and maintaining analytics/predictions and visualizations for our new data platform, “Alpha Terminal.” It bundles 2iQ’s data and analytics into one easy-to-use product, offering fundamental investors a range of powerful insights.
Responsibilities: Working with the Quant and Product teams by designing, building and managing critical infrastructure while automating everything with code. Initially, this role will be based in our Lisbon office. However, there is the potential for flexible working arrangements in the future. The role may suit an individual that is looking for a change of scenery or better work-life balance.
Requirements:
• Experience in a DevOps or software engineering role
• Strong background with Linux, K8s and Docker (or other container)
• High proficiency in a language such as Python, Java, or Go
Nice to have:
• Cloud or Big Data experience (Elastic, Aerospike, ClickHouse, KDB+, …)
• Experience with message buses
• Spark and/or Dask knowledge
- Rate: 80-90k
- Location: Lisbon, Portugal
- Contact: jobs@2iqresearch.com (please mention this list when you get in touch)
Quantitative Developer – Python (Lisbon) - 2iqresearch
We are seeking a highly talented Quantitative Developer with a solid background in Python to join our platform analytics team. In this role, you will help implement, support, and run the hybrid compute infrastructure that manages all research and production workloads.
You will work closely with the Quant and Product teams to support and develop code that is running in our production systems. These systems are the building blocks of the “Alpha Terminal”, a tool for fundamental investors to explore the market. You will also build and optimise data analytics services as well as integrating the data to support the quantitative team. Adapting research prototypes of models to the production environment is also a key responsibility of this role. This role is based in our Lisbon office. However, flexible working arrangements as well as a hybrid model transition period are available for all candidates.
Requirements:
• Experience in numerical Python and SQL
• Working knowledge of Pandas / NumPy libraries
• Dask and/or Spark knowledge
• CI/CD knowledge
Nice to have:
• Docker (or other containerization) knowledge
• Cloud or Big Data experience (Parquet, PyArrow, Aerospike, ClickHouse, KDB+, …)
• Knowledge of AI/ML libraries (Tensorflow, PyTorch, SciKit, …)
- Rate:
- Location: Lisbon, Portugal
- Contact: jobs@2iqresearch.com (please mention this list when you get in touch)
Product Analyst at JW Player
Over half a billion videos are watched across millions of websites on a JW Player video player every day. Our product teams leverage data coming from our player to measure success, prioritize our next steps, and envision new possibilities for the thousands of video publishers we serve daily across the web. We iterate quickly, conduct frequent experiments as part of product development, and seek to be data driven in everything we do.
As a Product Analyst on the JW Player Data Science & Product Analytics team, you will work closely with product managers, engineers, and data scientists to develop insights that inform product decisions and strategy. Your findings will impact the next generation of JW Player products, from our flagship video player and video platform to our video recommendations service and other data products. You’ll play a critical role in improving these products and guiding our future development efforts.
- Rate:
- Location: Remote within the United States
- Contact: olga@jwplayer.com (please mention this list when you get in touch)
- Side reading: link
Senior Data Scientist at JW Player
JW Player powers billions of video plays every week across a wide spanning web of broadcasters and video publishers with a diverse set of audiences and content types. Leveraging the vast stream of data sent by our flagship player, the Data Science team works in close collaboration with adjacent teams to improve our existing products, drive sound decision making, and develop new data products that bring value to our customers in both the video publishing and video advertising spaces. We iterate quickly, conduct frequent experiments, and seek to be data driven in everything we do.
As a Senior Data Scientist at JW Player, you will be joining a collaborative, creative, multidisciplinary team of scientists, engineers, and data analysts responsible for research and development, product analytics, and running production machine learning models that make tens of millions of predictions every day.
- Rate:
- Location: Remote within the United States
- Contact: olga@jwplayer.com (please mention this list when you get in touch)
- Side reading: link, link, link
Research Advocate - Rasa
At Rasa we’re hiring for a bunch of engineering roles. We’re a friendly, remote company with many interesting problems to solve. We’re building open-source tools that are used globally to build virtual assistants. Want to invest in developer experience, Non-English NLP and scalable machine learning? Then there’s a lot to do!
Feel free to reach out to Vincent @fishnets88 if you have any questions.
- Rate: https://rasa.com/careers/#jobs
- Location: EU Remote
- Contact: vincentwarmerdam@gmail.com (please mention this list when you get in touch)
- Side reading: link, link
Senior Software Engineer (Full Stack) at Carbon Re
Carbon Re is an AI research and development company dedicated to removing Gigatons of CO2 (equivalent) from humanity’s emissions each year. We aim to do so by optimizing production processes, redesigning manufacturing systems, developing new control processes, and accelerating the development of new climate-friendly materials and systems. Carbon Re is an equal opportunity employer. We are still a small team and are committed to growing in an inclusive manner.
- Rate:
- Location: London Bridge, London
- Contact: careers@carbonre.tech (please mention this list when you get in touch)
- Side reading: link, link
Principal Data Scientist at National Grid
As Principal Data Scientist, your key role will be to establish, define and implement data science solutions in order to deliver business value by making the optimal decisions to ensure efficient and cost-effective performance. You will build data science tools, providing business experts throughout Gas Transmission (GT) with the technology and expertise to unlock and exploit the information we hold to support the effective running of the business.
- Rate:
- Location: Warwick
- Contact: adnan.fiaz@nationalgrid.com (please mention this list when you get in touch)
- Side reading: link
Data Scientist at Ripjar, Permanent, Remote
We’re looking for experienced, highly motivated Data Scientists to support the research and development of Ripjar’s analytics and data products. You will carry out data analysis tasks to develop Ripjar’s understanding of relevant data and will develop, train and evaluate machine learning models that can be integrated into Ripjar’s software products and data processing pipelines.
You will have a strong technical and theoretical background, with a strong understanding of statistics and statistical models. You will be proficient in at least one programming language, preferably Python. You will have a good understanding of machine learning and large-scale data analysis, and will be comfortable working with complex data at scale.
- Rate: £50,000 - £75,000
- Location: Cheltenham, Bristol or Remote
- Contact: anthony.birleybrown@ripjar.com 07498 778 597 (please mention this list when you get in touch)
- Side reading: link