5 minutes of Data Science

March 9, 2023

Week 9

Highlights from February 27 to March 05

Newsletters

  • LWiAI Podcast #113 - Bing Chat Antics, Bio and Mario GPT, Stopping an AI Apocalypse, Stolen Voices, by Last Week in AI
  • Celebrating 10k subscribers, by Last Week in AI
  • Last Week in AI #208: OpenAI’s plan for AGI, generative voice broke into bank account, AI-Human romances, and more!, by Last Week in AI
  • MLOps 101: Feature Stores, Automation, Testing and Monitoring, by The AI Edge
  • The AiEdge Courses and more Newsletters!, by The AI Edge
  • Machine Learning Monthly Newsletter 💻🤖, by Zero To Mastery

Podcasts

  • Success (and failure) in prompting, by Practical AI
  • Privacy and Security for Stable Diffusion and LLMs with Nicholas Carlini - #618, by The TWIML AI
  • Analytics for a Better World - Parvathy Krishnan, by Data Talks

YouTube

  • Dr. MICHAEL OLIVER [CSO - Numerai], by Machine Learning Street Talk
  • Leetcode Challenge with DeepMind & Mila Scientists!, by Machine Learning Street Talk

Blogs

  • Introducing ChatGPT and Whisper APIs, by OpenAI
  • Invalidating robotic ad clicks in real time, by Amazon Science
  • Amazon and Columbia announce 2023 CAIT Fellows, by Amazon Science
  • Jonathan Toner’s hunt for hard questions took him from Antarctica to Amazon, by Amazon Science
  • Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings, by Apple Machine Learning
  • HEiMDaL: Highly Efficient Method for Detection and Localization of wake-words, by Apple Machine Learning

Reddit’s top posts

  • Overpaid and don’t see the point, at r/Data Science (💬273)
  • Rich Jupyter Notebook Diffs on GitHub… Finally., at r/Data Science (💬27)
  • Tech layoffs since January 2022, at r/Data Science (💬37)
  • I built a chatbot that helps you debug your code, at r/Machine Learning (💬65)
  • Dropout Reduces Underfitting - Liu et al., at r/Machine Learning (💬45)
  • LazyShell - GPT based autocomplete for zsh, at r/Machine Learning (💬56)
  • How to test for difference between different logistic distributions, at r/Ask Statistics (💬16)
  • I’m studying about linear regression and I had a question about the normal equation of simple linear regressions, at r/Ask Statistics (💬6)
  • What is the difference between the Confidence interval and Expanded uncertainty?, at r/Ask Statistics (💬6)
  • cleanlab open-source — expanded support for Active Learning and other data-centric AI tasks, at r/Latest in ML (💬1)
  • AI generated video chapter titles (YouTube, Vimeo, etc), at r/Latest in ML (💬5)
  • Updated list of free open source resources in Data Centric AI, at r/Latest in ML (💬0)

GitHub Jupyter notebook trends

  • openai-cookbook: Examples and guides for using the OpenAI API
  • stable-diffusion-webui-colab: stable diffusion webui colab
  • lora: Using Low-rank adaptation to quickly fine-tune diffusion models.
  • whisper: Robust Speech Recognition via Large-Scale Weak Supervision
  • Prompt-Engineering-Guide: 🐙Guides, papers, lecture, and resources for prompt engineering
  • google-research: Google Research
  • CLIP: CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
  • notebooks: Notebooks using the Hugging Face libraries🤗
  • DL-Algorithms: (no description provided)
  • Machine-Learning-Goodness: The Machine Learning repository contains ML/DL projects, notebooks, cheat codes of ML/DL/AI, useful information on AI/AGI and codes or coding snippets/scripts/tasks.
  • latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models
  • InvokeAI: InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.
  • stable-diffusion: A latent text-to-image diffusion model
  • Dreambooth-Stable-Diffusion: Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
  • YOLOv6: YOLOv6: a single-stage object detection framework dedicated to industrial applications.
  • amazon-sagemaker-examples: Example📓Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using🧠Amazon SageMaker.
  • tutorials: MONAI Tutorials
  • nerf: Code release for NeRF (Neural Radiance Fields)
  • yolov7: Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
  • annotated_deep_learning_paper_implementations: 🧑‍🏫59 Implementations/tutorials of deep learning papers with side-by-side notes📝; including transformers (original, xl, switch, feedback, vit, …), optimizers (adam, adabelief, …), gans(cyclegan, stylegan2, …),🎮reinforcement learning (ppo, dqn), capsnet, distillation, …🧠
  • introduction_to_ml_with_python: Notebooks and code for the book “Introduction to Machine Learning with Python”

GitHub Python trends

  • llama: Inference code for LLaMA models
  • xiaogpt: play chatgpt with xiaomi ai speaker
  • 30-Days-Of-Python: 30 days of Python programming challenge is a step-by-step guide to learn the Python programming language in 30 days. This challenge may take more than 100 days; follow your own pace.
  • openai-python: The OpenAI Python library provides convenient access to the OpenAI API from applications written in the Python language.
  • nebullvm: Plug and play modules to optimize the performances of your AI systems🚀
  • researchgpt: An open-source LLM based research assistant that allows you to have a conversation with a research paper
  • unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
  • text-generation-webui: A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, GPT-Neo, and Pygmalion.
  • paper-qa: LLM Chain for answering questions from documents with citations
  • langchain: ⚡Building applications with LLMs through composability⚡
  • gpt_index: LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data.
  • chatgpt-on-wechat: A WeChat chatbot built with ChatGPT, based on the ChatGPT3.5 API and the itchat library.
  • tiktoken: (no description provided)
  • chatgpt_telegram_bot: (no description provided)
  • stable-diffusion-webui: Stable Diffusion web UI
  • xformers: Hackable and optimized Transformers building blocks, supporting a composable construction.
  • system-design-primer: Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
  • OpenBBTerminal: Investment Research for Everyone, Anywhere.
  • T2I-Adapter: T2I-Adapter
  • poetry: Python packaging and dependency management made easy

See you next week!
