Week 49

by DeepMind

                            December 12, 2022

5 Minutes of Data Science - Week 49
Highlights from December 05 to December 11
Foreword
ChatGPT has been all the rage lately, and of course someone reverse-engineered it and open-sourced the result. I don't know how long that will last.
Nonetheless, the best post I've found this week was from someone who created a virtual machine using ChatGPT.
Come say hi on Mastodon. See you next week!

Blogs

Competitive programming with AlphaCode, by DeepMind 
AI for the board game Diplomacy, by DeepMind 
Formation of Robust Bound States of Interacting Photons, by Google AI 
Private Ads Prediction with DP-SGD, by Google AI 
Google at EMNLP 2022, by Google AI 
Will You Find These Shortcuts?, by Google AI 
Economics students in Africa build computational skills, by Amazon Science 
How Amazon Robotics is working to eliminate the need for barcodes, by Amazon Science 
Dataset helps evaluate gender bias in machine translation models, by Amazon Science 
Amazon Scholar Yizhou Sun wins VLDB test of time award, by Amazon Science 
Self-learning Alexa: ML model updates with no human in the loop, by Amazon Science 
EMNLP: Prompt engineering is the new feature engineering, by Amazon Science 
A quick guide to Amazon's 40+ papers at EMNLP 2022, by Amazon Science 
Amazon adds Catalan to MASSIVE dataset, by Amazon Science 

Podcasts

From image to 3D model (Ep. 212), by Data Science At Home 
AI competitions & cloud resources, by Practical AI 
Exploring Large Language Models with ChatGPT, by The TWIML AI Podcast 
From Software Engineer to Data Science Manager - Sadat Anwar, by Data Talks 

YouTube

Prof. YANN LECUN and Dr. RANDALL BALESTRIERO - SSL, Data Augmentation [NEURIPS2022], by Machine Learning Street Talk 
Dr. Petar Veličković (Deepmind) - Categories, Graphs, Reasoning [NEURIPS22 UNPLUGGED], by Machine Learning Street Talk 
LAURA RUIS - Large language models are not zero-shot communicators [NEURIPS UNPLUGGED], by Machine Learning Street Talk 
Join us in shaping the future of technology, by OpenAI 

Reddit

Judea Pearl, a pioneering figure in artificial intelligence, long argued that AI has been stuck in a decades-long rut because of our struggles digitizing causal reasoning. That's why the outcome of this basic test is sending chills down my spine., at r/Data Science (💬152)
An interesting job posting I found for a Work From Home Data Scientist at a startup, at r/Data Science (💬127)
Gaussian Processes for pirates. Courtesy of ChatGPT, at r/Data Science (💬32)
I made a command-line tool that explains your errors using ChatGPT (link in comments), at r/Machine Learning (💬111)
We're the Meta AI research team behind CICERO, the first AI agent to achieve human-level performance in the game Diplomacy. We’ll be answering your questions on December 8th starting at 10am PT. Ask us anything!, at r/Machine Learning (💬158)
[Project] Football Players Tracking with YOLOv5 + ByteTRACK, at r/Machine Learning (💬86)
what do you mean by degrees of freedom?, at r/Ask Statistics (💬23)
In multiple linear regression, if predictors account for some of the same variation in the outcome, how is that predictive power "split" between them?, at r/Ask Statistics (💬4)
Struggling with deciding on the distribution in the graph. Would it be right skewed? But the peak is on the very far right, so I wouldn’t think so. Can someone please let me know what they think?, at r/Ask Statistics (💬19)
What is Galactica and What Happened?, at r/Latest in ML (💬0)

GitHub Jupyter notebook trends

InvokeAI: This version of Stable Diffusion features a slick WebGUI, an interactive command-line script that combines text2img and img2img functionality in a "dream bot" style interface, and multiple features and other enhancements. For more info, see the website link below.
whisper: Robust Speech Recognition via Large-Scale Weak Supervision
Data-Science-For-Beginners: 10 Weeks, 20 Lessons, Data Science for All!
stable-diffusion-2-gui: Lightweight Stable Diffusion v 2.1 web UI: txt2img, img2img, depth2img, inpaint and upscale4x.
stable-diffusion-webui-colab: stable diffusion webui colab
ML-For-Beginners: 12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
deep-rl-class: This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Class.
stable-diffusion: A latent text-to-image diffusion model
FinRL: FinRL: Financial Reinforcement Learning.🔥
tensorflow-deep-learning: All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
zero-to-mastery-ml: All course materials for the Zero to Mastery Machine Learning and Data Science course.
CLIP: Contrastive Language-Image Pretraining
latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models
yolov7: Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Machine-Learning-Specialization-Coursera: Contains Solutions and Notes for the Machine Learning Specialization By Stanford University and Deeplearning.ai - Coursera (2022) by Prof. Andrew NG
data-engineering-zoomcamp: Free Data Engineering course!
tacotron2: Tacotron 2 - PyTorch implementation with faster-than-realtime inference

GitHub Python trends

ChatGPT: Lightweight package for interacting with ChatGPT's API by OpenAI. Uses reverse engineered official API.
ml-stable-diffusion: Stable Diffusion with Core ML on Apple Silicon
openai-cookbook: Examples and guides for using the OpenAI API
minGPT: A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
DALL-E: PyTorch package for the discrete VAE used for DALL·E.
chatGPT-telegram-bot: This is a very early attempt at having chatGPT work within a telegram bot
chatgpt-api: This repo is unofficial ChatGPT api. It is based on Daniel Gross's WhatsApp GPT
baselines: OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
aoc-gpt: Solve Advent of Code puzzles with GPT-3
stable-diffusion-webui: Stable Diffusion web UI
DAMO-YOLO: DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
fast-stable-diffusion: fast-stable-diffusion, +25-50% speed increase + memory efficient + DreamBooth
mesh-transformer-jax: Model parallel transformers in JAX and Haiku
langchain: ⚡Building applications with LLMs through composability⚡
gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
jukebox: Code for the paper "Jukebox: A Generative Model for Music"
