SAIL: Synthetic Data, Emotion Reading Software
Hi all,
Synthetic data is a topic that has been floating around for over a decade and is recently gaining recognition and traction, with Forbes humbly declaring that it will transform AI. Gartner says ~60% of the data used to develop AI models will be synthetic by 2024.
What is synthetic data? And what problems does it solve?
To the first question: synthetic data (SD) is data generated programmatically or created through simulations rather than collected in the real world. See this for an accessible tutorial.
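To make "programmatically generated" concrete, here is a minimal sketch that fabricates a small student-activity table. The column names and distributions are my own illustrative assumptions, not drawn from any real dataset or tool:

```python
# A minimal sketch of programmatic synthetic data generation, using a
# hypothetical student-activity schema (columns and distributions are
# assumptions for illustration only).
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)
n_students = 1000

synthetic = pd.DataFrame({
    "student_id": np.arange(n_students),
    # Minutes of weekly activity, drawn from a log-normal distribution.
    "weekly_minutes": rng.lognormal(mean=4.0, sigma=0.5, size=n_students),
    # Quiz scores, clipped to a 0-100 range.
    "quiz_score": np.clip(rng.normal(loc=72, scale=12, size=n_students), 0, 100),
    # A generated completion label: no real student is behind any row.
    "completed": rng.random(n_students) < 0.8,
})

print(synthetic.head())
```

Because every row is sampled from chosen distributions, the "labels" come for free and no individual's data is ever touched, which previews the two benefits below.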
Regarding the problems that synthetic data solves, two are primary:
1. Time and cost - using SD reduces data collection time and effort, and eliminates data labeling time and effort entirely (in contrast with collecting traditional data).
2. Privacy - much real-world data raises concerns about privacy and ethics. Real-world data needed for self-driving car model development is expensive to collect, but doesn't really raise privacy concerns. Data that involves humans - genomic, social interaction, learning - has significant privacy and ethics considerations. SD eliminates those concerns.
A quick side note: quality is not actually a concern, as SD can outperform "real" data, and scorecards have been developed to assess its quality.
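On the scorecard point, here is a rough sketch of one common fidelity check: comparing a synthetic column's distribution against its real counterpart with a two-sample Kolmogorov-Smirnov test. The setup is invented for illustration; real scorecards combine many fidelity, utility, and privacy metrics:

```python
# A sketch of a single "scorecard"-style fidelity check using a
# two-sample KS test (stand-in data; real scorecards use many metrics).
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
real_scores = rng.normal(loc=70, scale=10, size=500)       # stand-in for real data
synthetic_scores = rng.normal(loc=71, scale=11, size=500)  # stand-in for generated data

stat, p_value = ks_2samp(real_scores, synthetic_scores)
print(f"KS statistic={stat:.3f}, p={p_value:.3f}")
# A small KS statistic (and large p-value) suggests the synthetic marginal
# is statistically close to the real one - one signal among many.
```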
In education (learning analytics specifically), I wasn't able to find many examples of SD in use. Here is one from a few years ago (pdf). If you are working with SD, or are aware of projects in learning that make use of it, please let me know!
General:
AI and Surveillance - All data collection is a type of "observation". We are actively surveilling what students are doing. This isn't novel - all human subjects research involves the same activity. A teacher in a classroom, not taking any notes or leaving any digital traces, is surveilling what students are doing. When the act of surveilling is connected with predictive models, things can get more than a bit worrying, especially around ethics. This article looks at predictive policing in China.
Walk the Random Walk - Generally, an AI agent undergoes "supervised learning" to direct its achievement of goals or maximization of outcomes, with humans providing the goal or supervision. This article offers an interesting overview of training "a goal-conditioned agent without any external rewards or any domain knowledge".
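As a toy illustration of how a reward signal can be derived from the data itself rather than supplied externally, here is a hindsight-relabeling sketch on a 1-D random walk. The environment and relabeling scheme are my own simplification for intuition, not the paper's actual method:

```python
# Toy sketch of self-supervised goal-conditioned training: collect
# reward-free random-walk trajectories, then relabel each trajectory's
# final state as the goal ("hindsight relabeling") so rewards come
# from the data itself, not an external reward function.
import random

def random_walk(start=0, steps=20, low=-10, high=10):
    """Generate one random-walk trajectory on a bounded 1-D integer grid."""
    state, traj = start, [start]
    for _ in range(steps):
        state = max(low, min(high, state + random.choice([-1, 1])))
        traj.append(state)
    return traj

def relabel(traj):
    """Turn a reward-free trajectory into (state, goal, reward) tuples."""
    examples = []
    goal = traj[-1]  # hindsight: the final state becomes the goal
    for t, state in enumerate(traj[:-1]):
        reached = 1 if traj[t + 1] == goal else 0  # self-supervised reward
        examples.append((state, goal, reached))
    return examples

dataset = [ex for _ in range(100) for ex in relabel(random_walk())]
print(f"{len(dataset)} self-labeled training examples, e.g. {dataset[0]}")
```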
IBM's Global AI Adoption Index - According to this report, 35% of companies, from a global sample of 7,500, are using AI in their business. That number seems small, or at least misaligned with the hype, though the 30% figure for workforce development through AI adoption is about what I'd expect. I wonder where universities are positioned on AI adoption.
The US Department of Defense has created a position paper of sorts on AI use in the defense sector.
To Learn:
There is a growing number of open courses for school leaders to upgrade their AI literacy. Microsoft has a few: AI Business School for Education and Principles and Practices of Responsible AI.
An excellent overview of Essential Algorithms by Andrew Ng's DeepLearning.ai.
Education:
Ryan Baker's lab has launched a wiki on Algorithmic Bias in Education.
AI Tutors prepare workers for the modern workforce. A fairly rudimentary overview, but the focus on augmenting social emotional learning is important. The interplay of human and artificial cognition is the key here: how do we learn, team, decide, and make sense of the world when partnering with AI?
You'll increasingly be hearing about digital twins (a virtual version of something or someone physical). Siemens and Nvidia announced a partnership to develop industrial digital twins. The lifelong learning implications of this are potentially fascinating, as the upskilling market remains one of the fastest-growing learning spaces.
Microsoft stops selling emotion reading technology. Given the interest in social emotional learning in classrooms, this may have implications for education. But automated emotion detection in classrooms was an ethical landmine from the start.
Have a great week - and feel free to send articles of interest.
George