Monday, March 11, 2024

Model Collapse. When LLM bots create content and then feed on their own output, or on the output of other bots, quality degrades over successive generations.

I asked Micro$oft's Copilot preview version about it, and got:

Model collapse, also known as AI collapse, is a phenomenon observed in generative artificial intelligence (AI) models. Let me break it down for you:

  1. What is Model Collapse?

  2. Why Does Model Collapse Occur?

  3. Consequences of Model Collapse:

  4. Handling Model Collapse:

  5. The Challenge Ahead:

In summary, model collapse poses a significant challenge, but ongoing research aims to address it and ensure the reliability of AI-generated content.
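To make the feedback loop concrete, here is a minimal sketch, assuming a deliberately simplified toy setup rather than anything from the papers Copilot cited: each "generation" fits a one-dimensional Gaussian to samples produced by the previous generation's fitted Gaussian, never seeing real data again. The fitted spread drifts toward zero, a toy version of a bot gradually forgetting the rare tails of what humans actually wrote.

```python
import numpy as np

# Toy illustration of model collapse (an assumed, simplified setup, not the
# cited papers' experiments): each generation "trains" a Gaussian on samples
# generated by the previous generation's model instead of on real data.

rng = np.random.default_rng(42)

SAMPLES_PER_GENERATION = 100   # kept small on purpose so the drift is visible
GENERATIONS = 300

# Generation 0 sees "real" data: a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=SAMPLES_PER_GENERATION)

for gen in range(GENERATIONS + 1):
    mu, sigma = data.mean(), data.std()   # fit the model to current data
    if gen % 50 == 0:
        print(f"generation {gen:3d}: mean = {mu:+.3f}, std = {sigma:.3f}")
    # The next generation never sees real data, only this model's output.
    data = rng.normal(loc=mu, scale=sigma, size=SAMPLES_PER_GENERATION)
```

Run it and watch the std column shrink generation by generation; the reported analogue with large language models is output that grows narrower and blander as synthetic text feeds back into training.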

Learn more - the links Copilot returned should all work (click the text, not the boxes); test them.

That was not where this consideration started. What triggered asking a bot about bot difficulties? I watched "The Turing Lectures: The future of generative AI," which is an hour and a half long and recommended to any reader curious about the topic.

(Coincidentally, apart from AI bot considerations, Jensen Huang, founder and CEO of Nvidia, interviewed at Stanford for an hour, is also worth the time taken to view. For someone who has never written code, it is refreshing that each linked video suggests the future will be bots doing the coding, which rings my bell and should resonate with others interested in new directions without having to write code.)

So, basically a pitch for two videos, with text in support. Next, online items apart from what the bot returned, perhaps overlapping, via two search engine queries.

https://duckduckgo.com/?t=ffab&q=model+collapse&ia=web

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C24&q=LLM+model+collapse

The two Copilot-returned items from arXiv are technical computer science papers, 18 and 16 pages long, by credible scholars; such items are difficult reading, while the other items are popular reports any intelligent person can follow. The Google Scholar returns likewise come from a technical literature search, while the DuckDuckGo returns are popular news and analyses.

All for now. After posting, Crabgrass will be studying stuff about Model Collapse.

Most readers are expected not to pursue things further, beyond perhaps trying out the videos - but some may be intrigued, so have at it.

__________UPDATE__________

Can you imagine building-architecture image search becoming slanted because, in the original training data set, a web crawler picked up and retained too many fresh images from Gaza?

Leaving that thought to settle in: OpenAI (and the derivative Micro$oft Copilot stuff) is a proprietary, closed, black-box system. It's technological "Trust me."

Is this any answer? Or is open source just a distinction without a difference in terms of model collapse, as more and more web content becomes artificially generated because it's cheaper than paying human beings to write?

Then think: is life more interesting with dilemmas, or is certitude best? Or is your answer biased by personality traits, such as tidiness or drug use?