Artificial Intelligence (Previously "Chat GPT")

Kirwan

The main reason this graph is important is that the questions are private and are formulated so that they don't appear anywhere on the internet. So the models can't be trained to the benchmark, and they haven't seen the questions before.

That's why all the models previously were doing so poorly. The guy who runs this describes anything under 10% as noise.

This is the first step towards a model working out answers from the much overused term "first principles". It's a huge achievement. The first step to these things creating new knowledge (and yes, I know Google's model has created a new algorithm, but this is a different approach from what they did).

The other test was "Humanity's Last Exam", and the models were previously tapping out on that at 25%. These are all PhD-level and above questions in many different domains. No single human could possibly answer all the questions; it would take a team of experts.

Grok Heavy got 50.7% correct.

Kirwan

Rembrandt

@Kirwan The Sycophancy chapter is fascinating. AI with morals.

Kirwan

WTAF

voodoo

@Kirwan said in Artificial Intelligence (Previously "Chat GPT"):

WTAF

Hard to know what to make of this. Is it legit in its accuracy? The comments seem to sway between believers and the rest.

If real, man the implications for energy usage are just nuts.

Kirwan

Well, it’s one way to force the atomic age.

I was reading about some plans for data centers in space as well. World is changing.

voodoo

Behind the meter atomic age. Kind of scary

Kirwan

If they don't start building out power stations at a similar rate to China, electricity is going to get expensive over there.

NTA

Ignore the headline - read instead about how very few people are employing technology correctly but the ones who are can benefit.

Sheryl Estrada / Aug 17 / Newsletters

MIT report: 95% of generative AI pilots at companies are failing | Fortune

There’s a stark difference in success rates between companies that purchase AI tools from vendors and those that build them internally.

Tim

@Kirwan A lot of nuclear power plants under construction in China at the moment (~ 30).

Kirwan

Yeah, the bottleneck to win the AI race is power. US might have already lost.

voodoo

antipodean

Kirwan

A bit of a Doomer take. That's all implementation details, and in some part that's open slather already before you consider AI. That's why Apple use privacy as a marketing ploy against that trend.

We are in the goldrush stage, and once actual products come out of these things (still smoke and mirrors really) then people will start taking security seriously.

I did have to laugh at OpenAI saying very clearly that everything you say to a model could be turned over to the authorities. Helpfully, in testing, the models want to call the authorities themselves.

Kirwan

One of the more impressive uses of various new video tools. Nano Banana;

Tim

Instrument

Discovering new solutions to century-old problems in fluid dynamics

Our new method could help mathematicians leverage AI techniques to tackle long-standing challenges in mathematics, physics and engineering.

Kirwan

Using AI to understand;

Google DeepMind, with Brown, NYU, and Stanford collaborators, developed enhanced Physics-Informed Neural Networks (PINNs) using second-order optimizers to uncover unstable singularities ("blow-ups") in fluid equations like Euler, Navier-Stokes, IPM, and Boussinesq. Discoveries include novel singularity families, a λ-instability order pattern in IPM/Boussinesq, and ultra-precise vorticity visuals (Earth-diameter accuracy).

These singularities probe fluid limits, tying directly to the unsolved $1M Millennium Prize problem on Navier-Stokes existence and smoothness. The AI-math hybrid enables rigorous computer-assisted proofs, accelerating breakthroughs in physics (e.g., turbulence modeling) and engineering (e.g., aerodynamics).

Real-world benefits include optimized aircraft/car designs for fuel efficiency, improved weather forecasting via better atmospheric models, enhanced biomedical simulations of blood flow to aid cardiovascular treatments, more accurate ocean current predictions for climate and shipping, efficient petroleum extraction, and pollution dispersion modeling for environmental protection—ultimately enabling safer, greener technologies where traditional math falls short.

voodoo

@Kirwan said in Artificial Intelligence (Previously "Chat GPT"):

Using AI to understand;

Yeah, I've been spending a lot of time thinking on this recently also...

Tim

Circe Luna Cordeiro / Oct 3

New antibiotic targets IBD — and AI predicted how it would work before scientists could prove it - Faculty of Health Sciences

Rembrandt

I'm finding I'm increasingly using AI in home life. I've mentioned medical advice before, but just to reinforce that: my infant had some health issues, and plugging symptoms and blood results into Grok gave me real peace of mind as to what the issues could be and how likely they were. Grok's diagnosis turned out to be correct once we eventually went through the system to specialists and more blood tests. I'd never say it's a replacement for doctors as such, but unlike googling symptoms, which generally results in "you and your family have cancer and are going to die", with AI you get much better results, in a format that is better than a doctor's message at 4pm on a Friday saying you need to get a blood test done for an urgent referral to a specialist early the following week, which naturally had us freaked out.

The other area where I'm finding it terrific is resolving tech issues. I know enough about tech to fiddle and get myself into trouble. I can generally nut a few things out to get a resolution, but sometimes something illogical (like an intermittent issue) throws me completely. Just over the weekend I set up 2 wireless access points using old ISP routers to enable better comms to our garage and outdoor entertainment area. A year ago I got them kinda going: the garage door would work 75% of the time, and music streaming would skip every 30 seconds if it would connect at all. Grok was able to isolate the issue to the specific ISP routers, which have certain settings turned off or unavailable, with the crazy result that 'some' websites worked while others didn't. Anyway, I was able to bounce ideas off Grok and work out some obscure settings to change, which seems to have done the job.

The Silver Fern
