Artificial Intelligence (Previously "Chat GPT")
-
The main reason this graph is important is that the questions are private and are formulated so that they don't appear anywhere on the internet. So the models can't be trained to the benchmark, and they haven't seen the questions before.
That's why all the previous models were doing so poorly. The guy who runs this describes anything under 10% as noise.
This is the first step towards a model working out answers from, to use a much overused term, "first principles". It's a huge achievement. It's the first step towards these things creating new knowledge (and yes, I know Google's model has created a new algorithm, but this is a different approach from what they did).
The other test was "Humanity's Last Exam", and the models were previously tapping out on that at 25%. These are all PhD-level and above questions in many different domains. No single human could possibly answer all the questions; it would take a team of experts.
Grok Heavy got 50.7% correct.
-
@Kirwan said in Artificial Intelligence (Previously "Chat GPT"):
WTAF
Hard to know what to make of this. Is it legit in its accuracy? The comments seem to sway between believers and the rest.
If real, man, the implications for energy usage are just nuts.
-
-
A bit of a doomer take. That's all implementation detail, and to some extent it's open slather already, before you even consider AI. That's why Apple uses privacy as a marketing ploy against that trend.
We are in the gold rush stage, and once actual products come out of these things (still smoke and mirrors really) people will start taking security seriously.
I did have to laugh at OpenAI saying very clearly that everything you say to a model could be turned over to the authorities. Helpfully, in testing, the models want to call the authorities themselves.
-
Using AI to understand;
Google DeepMind, with Brown, NYU, and Stanford collaborators, developed enhanced Physics-Informed Neural Networks (PINNs) using second-order optimizers to uncover unstable singularities ("blow-ups") in fluid equations like Euler, Navier-Stokes, IPM, and Boussinesq. Discoveries include novel singularity families, a λ-instability order pattern in IPM/Boussinesq, and ultra-precise vorticity visuals (Earth-diameter accuracy).
These singularities probe fluid limits, tying directly to the unsolved $1M Millennium Navier-Stokes problem on turbulence and smoothness. The AI-math hybrid enables rigorous computer-assisted proofs, accelerating breakthroughs in physics (e.g., turbulence modeling) and engineering (e.g., aerodynamics).
Real-world benefits include optimized aircraft/car designs for fuel efficiency, improved weather forecasting via better atmospheric models, enhanced biomedical simulations of blood flow to aid cardiovascular treatments, more accurate ocean current predictions for climate and shipping, efficient petroleum extraction, and pollution dispersion modeling for environmental protection—ultimately enabling safer, greener technologies where traditional math falls short.
-
@Kirwan said in Artificial Intelligence (Previously "Chat GPT"):
Using AI to understand;
Google DeepMind, with Brown, NYU, and Stanford collaborators, developed enhanced Physics-Informed Neural Networks (PINNs) using second-order optimizers to uncover unstable singularities ("blow-ups") in fluid equations like Euler, Navier-Stokes, IPM, and Boussinesq. Discoveries include novel singularity families, a λ-instability order pattern in IPM/Boussinesq, and ultra-precise vorticity visuals (Earth-diameter accuracy).
These singularities probe fluid limits, tying directly to the unsolved $1M Millennium Navier-Stokes problem on turbulence and smoothness. The AI-math hybrid enables rigorous computer-assisted proofs, accelerating breakthroughs in physics (e.g., turbulence modeling) and engineering (e.g., aerodynamics).
Real-world benefits include optimized aircraft/car designs for fuel efficiency, improved weather forecasting via better atmospheric models, enhanced biomedical simulations of blood flow to aid cardiovascular treatments, more accurate ocean current predictions for climate and shipping, efficient petroleum extraction, and pollution dispersion modeling for environmental protection—ultimately enabling safer, greener technologies where traditional math falls short.
Yeah, I've been spending a lot of time thinking about this recently as well...
-
I'm finding I'm increasingly using AI in home life. I've mentioned medical advice before, but just to reinforce it: my infant had some health issues, and plugging the symptoms and blood results into Grok gave me real peace of mind as to what the issues could be and how likely they were. Grok's diagnosis turned out to be correct once we eventually went through the system to specialists and more blood tests. I'd never say it's a replacement for doctors as such, but googling symptoms generally results in "you and your family have cancer and are going to die", whereas with AI you get much better results. The format is also better than a doctor's message at 4pm on a Friday saying you need to get a blood test done for an urgent referral to a specialist early the following week, which naturally had us freaked out.
The other area where I'm finding it terrific is resolving tech issues. I know enough about tech to fiddle and get myself into trouble. I can generally nut a few things out to get a resolution, but sometimes something illogical (like an intermittent issue) throws me completely. Just over the weekend I set up 2 wireless access points using old ISP routers to get better comms to our garage and outdoor entertainment area. A year ago I got them kinda going: the garage door would work 75% of the time, and music streaming would skip every 30 seconds, if it would connect at all. Grok was able to isolate the issue to these specific ISP routers, which have certain settings turned off or unavailable, with the crazy result that some websites worked while others didn't. Anyway, I was able to bounce ideas off Grok and worked out some obscure settings to change, which seems to have done the job.