Ocsai 1.6: An Easy Swap-in Update



            
        August 23, 2024
    
    


        Brief note today.
I updated the Ocsai 1.5 model and the best Ocsai 1 model to new versions, named ocsai-1.6 and ocsai1-4o. There are no technical differences from the earlier models, just a change of base model to GPT-4o-mini. The improvements are across the board, and largest for Arabic (+18 points), Chinese (+11), Spanish (+17), and Hebrew (a 30-point jump!).
The changes
Alternate/Unusual Uses:
lang   Ocsai 1.5   Ocsai 1.6   diff
ara    0.273       0.450       +0.177
chi    0.543       0.654       +0.111
dut    0.726       0.797       +0.071
eng    0.736       0.764       +0.028
fre    0.722       0.779       +0.057
ger    0.754       0.814       +0.060
heb    0.463       0.764       +0.301
ita    0.602       0.683       +0.081
pol    0.672       0.735       +0.063
rus    0.614       0.723       +0.109
spa    0.603       0.771       +0.168
Other tasks, English:
task           Ocsai 1.5   Ocsai 1.6   diff
completion     0.860       0.889       +0.029
consequences   0.560       0.691       +0.131
instances      0.917       0.939       +0.022
metaphors      0.704       0.750       +0.046
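For anyone who wants to double-check the diff columns or sort by size of improvement, here is a minimal sketch in Python. The scores are copied from the two tables above; the names `aut_scores`, `task_scores`, and `gains` are just illustrative and not part of any Ocsai tooling.

```python
# Scores copied from the tables above as (Ocsai 1.5, Ocsai 1.6).
# Variable and function names are illustrative, not part of Ocsai.
aut_scores = {
    "ara": (0.273, 0.450), "chi": (0.543, 0.654), "dut": (0.726, 0.797),
    "eng": (0.736, 0.764), "fre": (0.722, 0.779), "ger": (0.754, 0.814),
    "heb": (0.463, 0.764), "ita": (0.602, 0.683), "pol": (0.672, 0.735),
    "rus": (0.614, 0.723), "spa": (0.603, 0.771),
}
task_scores = {
    "completion": (0.860, 0.889), "consequences": (0.560, 0.691),
    "instances": (0.917, 0.939), "metaphors": (0.704, 0.750),
}


def gains(scores):
    """Return {label: new - old}, rounded to three decimals like the tables."""
    return {label: round(new - old, 3) for label, (old, new) in scores.items()}


aut_gains = gains(aut_scores)
task_gains = gains(task_scores)

# Languages ordered from largest to smallest improvement.
ranked = sorted(aut_gains, key=aut_gains.get, reverse=True)

for label in ranked:
    print(f"{label}  {aut_gains[label]:+.3f}")
for label, gain in task_gains.items():
    print(f"{label}  {gain:+.3f}")
```

Sorting this way puts Hebrew, Arabic, Spanish, and Chinese at the top, the same four languages called out in the intro.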
For the English-focused model in the original Ocsai 1 format, ocsai1-4o, the new performance is r=0.812, versus r=0.781 and r=0.777 for its immediate predecessors (ocsai-davinci2 and ocsai-chatgpt). These benchmarks come from a model trained with some data withheld; the version available on OCS doesn't withhold any data, so it should perform slightly better (though by an amount that can't be measured!).
I understand the frequent model updates may cause whiplash. The intent with OCS is to provide the best possible tools for automated scoring, which is why the models are updated so often. To lend some clarity, models in the web interface now carry either a 'Recommended' tag or a 'Deprecated' classification (i.e., there's a new, better model, but the old one is still available for replicability).
The updates are at https://openscoring.du.edu/scoringllm.
    

