Week 42
5 Minutes of Data Science - week 42
Highlights from October 17 to October 23
Foreword
The final episode of the Towards Data Science podcast is worth your time: Jeremie, the host, talks about the future of AI. Simply put, we’re moving faster in AI than we initially predicted.
There’s an interesting trending GitHub repo called MindsDB, which brands itself as “ML-SQL Server enables machine learning workflows for the most powerful databases and data warehouses using SQL.” Worth keeping an eye on.
Lastly, DeepMind’s latest blog post covers, among other things, how they forecast wind output to improve energy delivery. Check the blog posts below.
Come say hi on Twitter and see you next week!
Blogs
- Digital transformation with Google Cloud, by DeepMind
- Google at ECCV 2022, by Google AI
- PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations, by Google AI
- MUSIQ: Assessing Image Aesthetic and Technical Quality with Multi-scale Transformers, by Google AI
- Do Modern ImageNet Classifiers Accurately Predict Perceptual Similarity?, by Google AI
- Table Tennis: A Research Platform for Agile Robotics, by Google AI
- Lessons learned from 10 years of DynamoDB, by Amazon Science
- reMARS revisited: Net zero carbon goal and Amazon’s fulfillment network, by Amazon Science
- reMARS revisited: Amazon’s supply chain optimization, by Amazon Science
- Exploring the uncertainty of predictions, by Amazon Science
Podcasts
- Nano-targetted Facebook Ads, by Data Skeptic
- Jeremie Harris - TDS Podcast Finale: The future of AI, and the risks that come with it, by Towards Data Science
- Data for All, by Practical AI
- Building Foundational ML Platforms with Kubernetes and Kubeflow with Ali Rodell - #595, by The TWIML AI
- How Data Leaders Can Build an Effective Talent Strategy, by DataFramed
- From Data Science to DataOps - Tomasz Hinc, by Data Talks
Reddit
- It’s kind of annoying to see that, in general, most data-related spaces are flush with “how do I get a job” and comparatively little discussion around the actual topic, at r/Data Science (💬160)
- every time I hear someone say num-pee i die a little bit, at r/Data Science (💬126)
- Is it just me, or did you also wake up 10-15 years later for your job to be called and branded as AI/ML?, at r/Data Science (💬123)
- Runway Stable Diffusion Inpainting: Erase and Replace, add a mask and text prompt to replace objects in an image, at r/Machine Learning (💬71)
- Speech-to-speech translation for a real-world unwritten language, at r/Machine Learning (💬159)
- Call for questions for Andrej Karpathy from Lex Fridman, at r/Machine Learning (💬349)
- I created a decision tree to prepare for my biostatistics exam. What information or guidance could be added, removed, fixed, or improved?, at r/Ask Statistics (💬25)
- Best “Pre-Doctoral” Jobs for Statistics PhD?, at r/Ask Statistics (💬2)
- Applied stats master’s degree focused on Python?, at r/Ask Statistics (💬8)
- AI Image Editing from Text! Imagic Explained, at r/Latest in ML (💬1)
- A list of Open source tools in Data Centric AI, at r/Latest in ML (💬0)
GitHub Jupyter Notebook trends
- novelai-colab-ver: You can use this version to experience how novelai works without a good gpu.
- paper2gui: Convert AI papers to GUI. Make it easy and convenient for everyone to use cutting-edge artificial intelligence technology.
- learnopencv: Learn OpenCV : C++ and Python Examples
- google-research: Google Research
- ml-basics: Exercise notebooks for Machine Learning modules on Microsoft Learn
- carefree-creator: An AI-powered creator for everyone.
- Python-for-Data-Science-: (no description provided)
- diffusion-nbs: Getting started with diffusion
- stability-sdk: SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
- handson-ml3: A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
- mlbookcamp-code: The code from the Machine Learning Bookcamp book and a free course based on the book
- latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models
- course22p2: course.fast.ai 2022 part 2 - under construction
- data-engineering-zoomcamp: Free Data Engineering course!
- Machine-Learning-Specialization-Coursera: Contains Solutions and Notes for the Machine Learning Specialization By Stanford University and Deeplearning.ai - Coursera (2022) by Prof. Andrew NG
- reinforcement-learning: Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton’s Book and David Silver’s course.
- Complete-Python-3-Bootcamp: Course Files for Complete Python 3 Bootcamp Course on Udemy
- tpu: Reference models and tools for Cloud TPUs.
- yolov7: Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GitHub Python trends
- PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
- mindsdb: In-Database Machine Learning
- FedML: FedML - The federated learning and analytics library enabling secure and collaborative machine learning on decentralized data anywhere at any scale. Supporting large-scale cross-silo federated learning, cross-device federated learning on smartphones/IoTs, and research simulation. MLOps and App Marketplace are also enabled (https://open.fedml.ai).
- stable-diffusion-webui: Stable Diffusion web UI
- DeepDanbooru: AI based multi-label girl image classification system, implemented by using TensorFlow.
- scikit-learn: scikit-learn: machine learning in Python
- english-words: 📝A text file containing 479k English words for all your dictionary/word-based projects e.g: auto-completion / autosuggestion
- transformers: 🤗Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
- MotionDiffuse: MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
- fauxpilot: FauxPilot - an open-source GitHub Copilot server
- openpilot: openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for over 200 supported car makes and models.