AI Week Jan 8th: What does 100010111001001 look like?
In this week's AI Week:
- Lawsuits 'n' more
- What does 100010111001001 look like?
- ChatGPT app store
- Some things not to do with ChatGPT
- Microsoft Copilot gets its own key
- Search news
- Longread: GPT in SQL
100010111001001 (Bing Image Creator)
Lawsuits
Last week, I predicted that 2024 would see more lawsuits against generative AI companies for infringing on copyright. This week brings two more lawsuits, plus fresh evidence for many more potential suits:
Two more nonfiction authors file suit against Microsoft
Microsoft, OpenAI sued over copyright infringement by authors
The new copyright infringement lawsuit against Microsoft and OpenAI comes a week after The New York Times filed a similar complaint in New York.
Anthropic's updated ToS pledge protection from copyright infringement
Anthropic's updated commercial ToS went into effect last week. Similar to OpenAI, Microsoft and Google, Anthropic now offers its customers some protection against copyright infringement claims.
Anthropic will help users if they get sued for copyright infringement. - The Verge
The company updated its commercial terms of service for those who use Claude, its AI chatbot, saying it will not only defend customers from copyright infringement claims but also pay for settlements. The new terms go into effect on January 1st, 2024 and follow similar commitments from Microsoft, Google, and other companies. These copyright protections won’t apply if a customer “knows or reasonably should know” they’re infringing copyright. This is consistent with what Anthropic previously said ...
But those indemnities are limited:
AI firms’ pledges to defend customers from IP issues have real limits | Ars Technica
Indemnities offered by Amazon, Google, and Microsoft are narrow.
More lawsuits to come
Gary Marcus (scientist) and Reid Southen (artist) have a guest post in IEEE Spectrum: "Generative AI Has a Visual Plagiarism Problem". They got Midjourney v6 to output characters like the Simpsons, and nearly perfect movie scenes featuring Thanos, the Avengers, and more.
What's really fun is that even vague prompts can get you infringing stuff: "black armor with light sword, movie screencap" gets Darth Vader, "animated toys" gets Toy Story, "videogame plumber" gets Mario, and any prompt including the word "screencap" gets... well, images that are virtually identical to still frames from movies, including Iron Man, Batman v Superman, and The Dark Knight.
Quote:
We will call such near-verbatim outputs “plagiaristic outputs,” because if a human created them we would call them prima facie instances of plagiarism.
Read the whole article here:
https://spectrum.ieee.org/midjourney-copyright
A review of Midjourney 6: it's amazing
Too bad it's a plagiarism engine.
How much detail is too much? Midjourney v6 attempts to find out | Ars Technica
As Midjourney rolls out new features, it continues to make some artists furious.
A list of over 4000 artists whose work was used to train Midjourney went viral on X last week. The list, in Google Sheets, was a court exhibit from a lawsuit against Midjourney, Stability AI, DeviantArt and Runway AI. It's no longer public, but is archived here
Database of 16,000 Artists Used to Train Midjourney AI Goes Viral – ARTnews.com
The 16,000 artists also included Frida Kahlo, Vincent van Gogh, and many more famed figures.
By the way: Firefly
Given all the lawsuits and the potential for more, it seems worth mentioning that Adobe says its image-generating AI, Adobe Firefly, was trained on licensed images only.
OpenAI lobbying for exemptions
"What do you do when your potentially zillion dollar business suddenly runs into a massive obstacle that turns out to be bigger than expected?"
The desperate race to save Generative AI - by Gary Marcus
Copyright infringement issues could sink their business, but why should they have to pay licensing fees like everybody else?
What does 1110101010101101001 look like? (Fun with image-generating AI)
The image at the top of the newsletter is Bing Image Creator's response to the prompt "1110101010101101001".
Text-to-image will generate images from binary strings: sometimes random, sometimes strangely lovely, sometimes a bit horrifying. Here's Craiyon's response to that prompt:
Only one eldritch horror out of 9, not bad!
SCP-1110101010101101001
The same prompt to Stable Diffusion model SDXL 1.0 (via NightCafe) gave me a spacious modern room full of neutral tones and white furniture:
These responses to a nearly meaningless prompt can tell us something about what the models are tuned for. It looks like Craiyon is tuned for people's faces, particularly girl-faces. Bing Image Creator is tuned more toward the abstract, with high-contrast tech gloss.
SDXL 1.0, on the other hand, really seems to like modern rooms with abstract art and white furniture. I gave it the extended prompt "11101010101011010011100010001010100111100010011001010101010," and got a white wall of rectilinear modern art with a white chair:
When I sent the same prompt to Mysterious XL v4, a version of SDXL custom-prompted "for fantasy art and Asian culture", and asked for four responses, I got three Victorianesque geometric wallpapers and... a bunch of knights? (Honestly, I could do this all day.)
Try it! Prompt Bing Image Creator (free, login required) and Craiyon (free, no login) with a random string of numbers or letters. I'd love to see what you get them to generate.
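If you'd rather not mash the keyboard yourself, here's a minimal Python sketch that makes a random string of 0s and 1s to paste into either tool. The function name and the default length are my own choices for illustration, not anything the image generators require.

```python
import random


def random_binary_prompt(length=19, seed=None):
    """Return a string of random 0s and 1s to paste into an image generator.

    `length` and `seed` are illustrative parameters: pass a seed if you
    want to get the same string again later.
    """
    rng = random.Random(seed)
    return "".join(rng.choice("01") for _ in range(length))


if __name__ == "__main__":
    # Print one prompt; copy it into Bing Image Creator or Craiyon.
    print(random_binary_prompt())
```

Passing a seed gives you a repeatable string, which is handy if you want to send the exact same prompt to several models and compare what comes back.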
OpenAI's "app store" launching this week
OpenAI was set to launch this before the whole firing-Sam-Altman-then-getting-all-the-women-off-the-board-and-rehiring-him debacle in November, but it took a back seat to the drama.
The idea is that you can build an app (a "GPT", which is a really confusing name[*]) without coding by lightly customizing ChatGPT with your own prompts. Forbes is already all over this, telling "Gen Z and Millennial professionals" to go create GPTs as a side hustle.
[*] It's confusing because "GPT" already stands for "Generative Pre-trained Transformer," which is the technical name for this kind of large language model, and not at all a proprietary OpenAI term.
OpenAI's Custom ChatGPT Store to Open Soon, Making AI More Approachable - CNET
Get ready to scroll through another sea of icons trying to find apps that are interesting, helpful or entertaining. This time it's AI apps.
Some things not to do with generative AI
So with the advent of a thousand thousand GPTs upon us, let's look at some things not to do with generative AI.
More fake legal cases, this time from Google Bard
Michael Cohen, a disbarred lawyer, admitted this week that he gave his lawyer a few cases Google Bard made up. Yeah, he shouldn't have done that, and his lawyer shouldn't have used cases sent to him by Mr. Disbarred.
Michael Cohen gave his lawyer fake citations invented by Google Bard AI tool | Ars Technica
Disbarred Cohen passed fake cases to his lawyer, who didn't do a fact-check.
Medical diagnoses -- seriously, don't
A research letter published in JAMA notes: ChatGPT is the wrong tool for difficult pediatric diagnoses. That's... not totally surprising, but it's impressive how often it was wrong: 83% of the time.
ChatGPT bombs test on diagnosing kids’ medical cases with 83% error rate | Ars Technica
It was bad at recognizing relationships and needs selective training, researchers say.
To be honest, I'm surprised this study used ChatGPT, and not a tool that's been fine-tuned on medical data. The authors acknowledge this in the paper: "However, some LLMs, like Google’s Med-PaLM 2, have been specifically trained on medical data and may be better equipped to provide accurate diagnoses."
(For any academics reading this: it seems like at the moment, it's pretty easy to get published just by playing with chatbots, finding conditions in which they tell you wrong answers, and making them do it a hundred times. IDK, this seems waaaay lower than the usual bar for getting published in JAMA.)
A wild fake news appears
AI-generated article hallucinates Christmas Day murder in small New Jersey town - Boing Boing
Police in Bridgeton, New Jersey, report that NewsBreak published an "entirely false" article about a Christmas Day murder in the town—the latest example of AI-generated fake news going live. This…
Filing fake bug reports
"Better crap is worse."
Last year, science fiction magazines like Clarkesworld complained about a flood of AI-generated submissions wasting their time. Now the maintainer of an open-source software tool (curl, a command-line tool to fetch webpages) is getting their time wasted by AI-generated bug reports submitted to their bug bounty program.
https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-for-intelligence/
A bug bounty program, in which a software maker offers payouts to people who report new bugs, shares the combination of “no human interaction”, “free open submissions” and “real $ payouts” with short fiction submissions. I wonder what else is in the intersection of that Venn diagram?
AI is making scams worse
Speaking of scammy bug reports, there was a bunch of reporting this week on how scammers are leveraging AI to scam people faster and better.
https://www.saturdayeveningpost.com/2024/01/con-watch-artificial-intelligence-is-making-scams-worse/
In November, I shared this Ray Nayler essay, "AI and the Rise of Mediocrity". Scams are a perfect example of the kind of mediocrity that generative AI is good at cranking out: plausible and voluminous, with no need for accuracy. In many ways, scamming is the perfect use case for ChatGPT.
3 AI scams to watch out for in 2024
Forward this to Grandmas everywhere.
AI scams are coming in 2024 - here's what to watch out for
Be on the lookout for AI scams in 2024, from fraudulent investment opportunities to scammers using AI to replicate voices.
I wonder how well OpenAI will keep scam-enabling GPTs out of their app store? Hopefully, they'll have some kind of robust moderation infrastructure in place when they open it.
And one YES, DO DO THIS WITH AI: Mickey Mouse!
The 1928 version of Mickey Mouse is in the public domain now, and generative AI prompters are already on it. Image-generating models were most likely trained on old Mickey Mouse along with all the other IP they sucked in, so they're not bad at Mickeying.
Early Mickey Mouse is now in the public domain—and AI is already on the case | Ars Technica
Experimental AI image generator trained on Disney's 1928 cartoons can make eldritch horrors.
I'm here for Squid Mickey.
Microsoft Copilot gets its own key
Nothing says “we really want you to use this feature” like giving it its own physical key.
This clicks with what Microsoft's CTO, Kevin Scott, has said about the potential of AI to democratize computing: you don't even have to find Copilot in the Microsoft Office Ribbon, you can just hit a button.
Microsoft is adding a new key to PC keyboards for the first time since 1994 | Ars Technica
Copilot key will eventually be required in new PC keyboards, though not yet.
Every time anyone hits the Copilot key, they may be costing Microsoft some money: the WSJ reported in October of last year that GitHub Copilot (for coding) runs at a loss. And that's not even counting the lawsuit costs.
BTW, Microsoft Copilot now has an app for both iOS and Android.
Search news
$50 for selling your child’s facial geometry to Google
No comment.
Okay, yes, one comment: If you have a kid, maybe don't feed their facial geometry to the Googlelith. What could go wrong?
Google Contractor Pays Parents $50 to Scan Their Childrens' Faces
Google is having parents film their children wearing hats and sunglasses, with the collected data to include eyelid shape and skin tone.
Is freemium coming to Google Bard?
I hate this because Google Bard is for search and I don't want to have to pay for less-hallucinated search results.
Google appears to be working on an ‘advanced’ version of Bard that you have to pay for - The Verge
You might need a Google One subscription.
Perplexed by Perplexity
A new AI-powered search startup was getting some buzz this week:
Google challenger Perplexity attracts attention. - The Verge
The AI ”answer engine” has gained $74 million in funding from investors like Jeff Bezos and Institutional Venture Partners — the largest sum raised by an internet search startup in recent years, according to the WSJ. Former YouTube CEO Susan Wojcicki and Google AI’s Jeff Dean both made personal investments as well. Our own David Pierce recently said this about Perplexity:
Try it here: https://www.perplexity.ai
What’s nice about this: it shows sources. (So does Bing, though.) However, the answers still require some judgement… when asked for a “country in africa starting with K,” Perplexity is perplexed by “Kenya,” just like Google Bard:
Longread: GPT in SQL
Okay. This post isn't for everybody. But IYKYK. Blogger and programmer Alex Bolenok, who writes and posts as Quassnoi, managed to implement a large language model in SQL, a language more commonly used for retrieving employee records from HR databases.
Happy New Year: GPT in 500 lines of SQL - EXPLAIN EXTENDED at EXPLAIN EXTENDED
A complete GPT2 implementation as a single SQL query in PostgreSQL.
CORRECTION: I originally attributed the above post to Markus Winand. My apologies, Alex, and thank you for setting the record straight!