
wingkongex

Member
Aug 25, 2019
2,200
openai.com

GPT-4


We've created GPT-4, the latest milestone in OpenAI's effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.

Over the past two years, we rebuilt our entire deep learning stack and, together with Azure, co-designed a supercomputer from the ground up for our workload. A year ago, we trained GPT-3.5 as a first "test run" of the system. We found and fixed some bugs and improved our theoretical foundations. As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our first large model whose training performance we were able to accurately predict ahead of time. As we continue to focus on reliable scaling, we aim to hone our methodology to help us predict and prepare for future capabilities increasingly far in advance—something we view as critical for safety.

We are releasing GPT-4's text input capability via ChatGPT and the API (with a waitlist). To prepare the image input capability for wider availability, we're collaborating closely with a single partner to start. We're also open-sourcing OpenAI Evals, our framework for automated evaluation of AI model performance, to allow anyone to report shortcomings in our models to help guide further improvements.

[Screenshots: GPT-4 exam and benchmark results]

Welp.
 

mugurumakensei

Elizabeth, I’m coming to join you!
Member
Oct 25, 2017
11,382
wingkongex said:
Welp.
Looking at the breakdown, it works best in known problem spaces (simple standardized tests) and worst on things requiring awareness and problem solving (see writing being only 54th percentile, and the USNCO local section and AP Calculus results).
 

DaciaJC

Banned
Oct 29, 2017
6,685
Genuinely surprised it only got a 4 on the AP Calc exam, would have expected it to smash that.
 

Senator Toadstool

Attempted to circumvent ban with alt account
Banned
Oct 25, 2017
16,651
wingkongex said:
Welp.

This is absolute garbage and I hate how they constantly trot this out.
This isn't how tests are taken, and it isn't what they're testing. Anybody with access to outside information can master these tests at these scores, which is exactly why test-takers are barred from using it.

I've scanned a calc/law/biology book and can open it to any page at any time! I'm so smart!

So AI is better than humans at everything now? That happened fast.
No, not at all. This is a misleading ad for them to get funding.
 

Midramble

Force of Habit
The Fallen
Oct 25, 2017
10,484
San Francisco
How many parameters are we up to in GPT-4? Last I heard it was going to be orders of magnitude more than GPT-3.

Edit: If old articles are right, this is a jump from 175 billion parameters with GPT-3 to 100 trillion with GPT-4.

Though not an apples-to-apples comparison at all, the human brain is at roughly 15 trillion parameters.
 
Last edited:

Senator Toadstool

Attempted to circumvent ban with alt account
Banned
Oct 25, 2017
16,651
Looking at the breakdown, it works best in known problem spaces (simple standardized tests) and worst on things requiring awareness and problem solving (see writing being only 54th percentile, and the USNCO local section and AP Calculus results).
Because it's not thinking. It's looking at statistical patterns in previous works to infer solutions, so it makes sense that for problems with known solutions, ones that have been written about and that follow similar linguistic or logical patterns, it just spits out old answers, because those things don't change.
 

gutshot

Member
Oct 25, 2017
4,457
Toscana, Italy
GPT has been a real time-saver when having to write and debug code, although it occasionally makes errors. A faster and less error-prone version will be great.
 

sedael

Member
Oct 16, 2020
908
A faster and less error-prone version will be great.

Yeah, we're gonna go through a software development renaissance in the years between these models and them replacing us entirely. Copilot and CodeWhisperer and whatever other models are just so useful for automating the annoying parts.
 

collige

Member
Oct 31, 2017
12,772
Despite its capabilities, GPT-4 has similar limitations as earlier GPT models. Most importantly, it still is not fully reliable (it "hallucinates" facts and makes reasoning errors). Great care should be taken when using language model outputs, particularly in high-stakes contexts, with the exact protocol (such as human review, grounding with additional context, or avoiding high-stakes uses altogether) matching the needs of a specific use-case.
GPT-4 generally lacks knowledge of events that have occurred after the vast majority of its data cuts off (September 2021), and does not learn from its experience. It can sometimes make simple reasoning errors which do not seem to comport with competence across so many domains, or be overly gullible in accepting obvious false statements from a user. And sometimes it can fail at hard problems the same way humans do, such as introducing security vulnerabilities into code it produces.

GPT-4 can also be confidently wrong in its predictions, not taking care to double-check work when it's likely to make a mistake. Interestingly, the base pre-trained model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, through our current post-training process, the calibration is reduced.
I would be a lot more cool with all these caveats if they weren't also in the process of doing a wide rollout of the shittier last-gen version at the same time. OpenAI being a sorta-kinda-not-for-profit now makes all this weird, tbh.

Looking at the breakdown, it works best in known problem spaces (simple standardized tests) and worst on things requiring awareness and problem solving (see writing being only 54th percentile, and the USNCO local section and AP Calculus results).
Sounds about right. It's still GPT at the end of the day. I have questions about how these tests are administered too, but the human equivalents aren't open-book exams anyway so it's an apples to oranges comparison.
 

Cymbal Head

Member
Oct 25, 2017
2,384
Describing the joke in that image is genuinely impressive, but it didn't get any better at the writing part of the GRE?
 

eso76

Prophet of Truth
Member
Dec 8, 2017
8,167
I haven't used it much, but I tested it on After Effects expressions and it's amazing.
It won't just suggest an expression; it will also explain what every line and variable does.
 

Stencil

Member
Oct 30, 2017
10,459
USA
Is Microsoft involved with this? They mention Azure in the abstract, but I'm not sure what MS's involvement is, if anything.
 

T the Talking Clock

The Fallen
Jul 12, 2018
140
Because it's not thinking. It's looking at statistical patterns in previous works to infer solutions, so it makes sense that for problems with known solutions, ones that have been written about and that follow similar linguistic or logical patterns, it just spits out old answers, because those things don't change.

If it spits out correct and coherent answers, what's the difference? It's pretty much the Chinese Room thought experiment.
 

Jordan117

Member
Oct 27, 2017
2,023
Alabammy
Is Microsoft involved with this? They mention Azure in the abstract, but not sure what MS involvement is, if anything.
Huge. MS invested $10 billion in OAI (49% stake iirc), custom-built them a supercomputer for training, and have exclusive access to the codebase to build products like Bing Chat and various Windows integrations that need more than just the API.
 

Jordan117

Member
Oct 27, 2017
2,023
Alabammy
For context, GPT-3.5 scored in the bottom 10% on a simulated bar exam, while GPT-4 scores in the top 10%.

Its facility with language (outperforming state-of-the-art English models when given tests translated into other languages) is impressive.

I'm most interested in what the larger context window unlocks. 32k tokens is about 50 pages; imagine something this powerful maintaining coherence for that long. You could auto-summarize long essays, synthesize novellas, and generate more than just toy programs. And it's surprisingly affordable.
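For what it's worth, the "about 50 pages" figure checks out with the usual rules of thumb (roughly 0.75 English words per token, ~500 words per page; both constants are loose assumptions, not exact figures):

```python
# Back-of-envelope estimate of how many pages fit in GPT-4's 32k-token
# context window. Both constants are rough rules of thumb, not exact:
WORDS_PER_TOKEN = 0.75   # typical for English prose
WORDS_PER_PAGE = 500     # typical manuscript page

def tokens_to_pages(tokens: int) -> float:
    """Estimate how many pages a given token budget covers."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

print(round(tokens_to_pages(32_000)))  # 48, i.e. "about 50 pages"
```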
 

BigSkinny0310

Attempted to circumvent ban with alt account.
Banned
Dec 7, 2017
2,940
I wonder if I should spend $20 trying this out. I want to test its legal writing capabilities.
 
Oct 27, 2017
42,918
So AI is better than humans at everything now? That happened fast.
How is that your takeaway from
while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.


I'm surprised it didn't get perfect scores on math-related tasks, although a lot of those require showing work, so maybe that's where it lost points.
 

Zeliard

Member
Jun 21, 2019
10,971
Yeah as others have noted, it's on Bing right now if you have access to it, for free.

blogs.bing.com

Confirmed: the new Bing runs on OpenAI’s GPT-4

Congratulations to our partners at Open AI for their release of GPT-4 today. We are happy to confirm that the new Bing is running on GPT-4, which we’ve customized for search. If you’ve used the new Bing preview at any time in the last five weeks, you’ve already experienced an early version of...

Also, the chat limits are now 15 turns per session / 150 per day.
 

Yoga Flame

Alt-Account
Banned
Sep 8, 2022
1,674
I use ChatGPT all day; it's actually an invaluable resource for coding and for providing insight into other people's code. It's part of my workflow.

It's like having an extremely knowledgeable companion; I can't go back.

IDE on one screen, ChatGPT on another. I get it to write my regex, for instance. Unbelievable stuff.