Week 1 of 2023
5 Minutes of Data Science - week 1
Highlights from January 02 to January 08
Foreword
A few newsletter feeds have been added, enjoy!
Come say hi on Mastodon. See you next week!
Blogs
- Amazon’s papers at SLT, by Amazon Science
- Computer vision for automated quality inspection, by Amazon Science
- WACV: Where application-based research finds a home, by Amazon Science
- More-efficient annotation for semantic segmentation in video, by Amazon Science
Newsletters
- Last Week in AI #200: A Review of AI in 2022, by Last Week in AI
- Import AI 313: Smarter robots via foundation models; Stanford trains a small best-in-class medical LM; Baidu builds a multilingual coding dataset, by Import AI
Podcasts
- NLP research by & for local communities, by Practical AI
- Service Cards and ML Governance with Michael Kearns - #610, by The TWIML AI
- Data-Centric AI - Marysia Winkels, by Data Talks
Reddit’s top posts
- Changing my feminine first name to a masculine nickname on my resume gave me way more responses per application, at r/Data Science (💬246)
- Here’s another predatory unpaid internship that’s offering a promotion to a CTO title, at r/Data Science (💬59)
- The most epic DS job title, at r/Data Science (💬45)
- I built Adrenaline, a debugger that fixes errors and explains them with GPT-3, at r/Machine Learning (💬59)
- Fixing the angle of Skewed Paintings, see comments, at r/Machine Learning (💬34)
- Greg Yang’s work on a rigorous mathematical theory for neural networks, at r/Machine Learning (💬38)
- Which statistical methods became obsolete in the last 10-20-30 years?, at r/Ask Statistics (💬27)
- Does experiencing a highly-unlikely event affect the odds of experiencing it again?, at r/Ask Statistics (💬10)
- Are hazards in Cox regression even meaningful? Why has Cox regression become the norm for time-to-event analysis?, at r/Ask Statistics (💬4)
- What happened in AI research in 2022 - My curated list of AI breakthroughs with a video explanation, article, and code for each paper, at r/Latest in ML (💬0)
- What to do when hyperparameter tuning doesn’t improve model performance?, at r/Latest in ML (💬2)
GitHub Jupyter Notebook trends
- nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.
- Open-Assistant: OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
- data-engineering-zoomcamp: Free Data Engineering course!
- fastbook: The fastai book, published as Jupyter Notebooks
- stable-diffusion-webui-colab: stable diffusion webui colab
- MachineLearningNotebooks: Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
- geospatial-data-catalogs: A list of open geospatial datasets available on AWS, Earth Engine, Planetary Computer, NASA CMR, and STAC Index
- pyprobml: Python code for “Probabilistic Machine learning” book by Kevin Murphy
- mlops-zoomcamp: Free MLOps course from DataTalks.Club
- fastai: The fastai deep learning library
- EconML: ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal …
- VToonify: [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
- coursera-deep-learning-specialization: Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv) Convolutional Neural Network…
- practical-statistics-for-data-scientists: Code repository for O’Reilly book
- deep-learning-with-python-notebooks: Jupyter notebooks for the code samples of the book “Deep Learning with Python”
- yolov7: Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
- diff-svc: Singing Voice Conversion via diffusion model
- handson-ml3: A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
GitHub Python trends
- minGPT: A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
- ColossalAI: Colossal-AI: A Unified Deep Learning System for Big Model Era
- awesome-python: A curated list of awesome Python frameworks, libraries, software and resources
- gpt_index: An index created by GPT to organize external information and answer queries!
- pyright: Static type checker for Python
- unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- petals: 🌸Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
- sqlglot: Python SQL Parser and Transpiler
- stable-diffusion-webui: Stable Diffusion web UI
- openai-cookbook: Examples and guides for using the OpenAI API
- gallery-dl: Command-line program to download image galleries and collections from several image hosting sites
- GFPGAN: GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
- CodeFormer: [NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer