
One aspect of the spread of LLMs is that we have lost a useful heuristic. Poor spelling and grammar used to be a signal used to quickly filter out worthless posts.

Unfortunately, this doesn't work at all for AI-generated garbage. Its command of the language is perfect - in fact, it's much better than that of most human beings. Anyone can instantly generate superficially coherent posts. You no longer have to hire a copywriter, as many SEO spammers used to do.
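
For what it's worth, the old heuristic is simple enough to sketch in a few lines. This is a toy illustration only - the wordlist is a made-up stand-in for a real dictionary, and real filters used full dictionaries and grammar rules:

```python
# Toy version of the old spelling heuristic: score a post by the
# fraction of its words missing from a known-good wordlist.
# The wordlist here is a tiny hypothetical stand-in for a dictionary.
import re

WORDLIST = {
    "the", "of", "a", "to", "and", "is", "this", "used", "be",
    "signal", "poor", "spelling", "grammar", "filter", "posts",
}

def spam_score(text: str) -> float:
    """Fraction of words NOT in the wordlist (higher = more suspect)."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 1.0
    return sum(1 for w in words if w not in WORDLIST) / len(words)

print(spam_score("poor spelling is a signal used to filter posts"))  # 0.0
print(spam_score("por speling iz a signl usd to fltr psts") > 0.5)   # True
```

LLM output sails straight through a filter like this, which is exactly the point above.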

curl's struggle with bogus AI-generated bug reports is a good example of the problems this causes: https://news.ycombinator.com/item?id=38845878

This is only the beginning, it will get much worse. At some point it may become impossible to separate the wheat from the chaff.



We should start donating more heavily to archive.org - the Wayback Machine may soon be the only way to find useful data on the internet, by cutting out anything published after ~2020 or so.


I won't even bet on archive.org to survive. I will soon upgrade my home NAS to ~100TB and fill it up with all kinds of information and media /r/datahoarder style. Gonna archive the usual suspects like Wikipedia and also download some YouTube channels. I think now is the last chance to still get information that hasn't been tainted by LLM crap. The window of opportunity is closing fast.


> ~100TB

That's a lot compared to mine. How do you organize replication, and do you make backups on any external services? I kinda do want to hoard more, but I find it complicated to deal with at large scale. It gets expensive to back up everything, and HDDs aren't really a solid medium long-term. Now, I can kinda use my judgment about what is important and what is essentially trash I store just in case, but losing 100TB of trash would be pretty devastating too, TBH.


The last chance to get reliable information that hasn't been tainted by the bullshit of LLM hallucinations is CyberPravda (dot) com project. The window of opportunity is closing fast.


It will be like salvaging pre-1945 shipwrecks for their non-irradiated metal.



It's funny, I made the same analogy the other day: https://twitter.com/gramofdata/status/1736838023940112523

I think something interesting to note is that once we stopped atmospheric nuclear testing, steel radiation levels went back down and are almost at normal background levels. So maybe the same thing will happen if we stop using GenAI.


It was a quite limited set of entities that did atmospheric nuclear testing. The same cannot be said about LLMs.

My post-apocalyptic scenario, which I half-jokingly predicted some 5-10 years ago, was that all general-purpose computing hardware, which is common and relatively cheap now, will be abandoned and possibly outlawed in the end. People will use non-rootable thin clients to access Amazon, which will have general-purpose hardware, but it will be heavily audited by government entities.


Let's hope that, like irresponsible nuclear weapons tests, we also experience a societal change that eventually returns things back to a better way.


Interesting idea. Could there be a market for pre-AI era content? Or maybe it would be a combination of pre-AI content plus some extra barriers to entry for newer content that would increase the likelihood the content was generated by real people?


I'm in the camp where I want AI and automation to free people from drudgery in the hope that it will encourage the biggest HUMAN artwork renaissance ever in history.

I don't want AI to be at the forefront of all new media and artwork. That's a terrible outcome to me.

And honestly there's already too much "content" in the world and being produced every day, and it seems like every time we step further up the "content is easier to produce and deliver" ladder, it actually gets way more difficult to find much of value, and also more difficult for smaller artists to find an audience.

We see this on Steam, where there are thousands of new game releases every week. You only ever hear of one or two. And it's almost never surprising which ones you hear about. Rarely you get an indie sensation out of nowhere, but that usually only happens when a big streamer showcases it.

Speaking of streamers, it's hard to find quality small streamers too. Twitch and YouTube are saturated with streams to watch but everyone gravitates to the biggest ones because there's just too much to see.

Everything is drowning in a sea of (mostly mediocre, honestly) content already, AI is going to make this problem much worse.

At least with human generated media, it's a person pursuing their dreams. Those thousands of games per week might not get noticed, but the person who made one of them might launch a career off their indie steam releases and eventually lead a team that makes the next Baldur's Gate 3 (substitute with whatever popular game you like)

I can't imagine the same with AI. Or actually, I can imagine much worse. The AI that generates 1000 games eventually gets bought by a company to replace half their staff and now a bunch of people are out of work and have a much harder uphill battle to pursue their dreams (assuming that working on games at that company was their dream)

I don't know. I am having a hard time seeing a better society growing out of the current AI boom.


> free people from drudgery in the hope that it will encourage the biggest HUMAN artwork renaissance ever in history.

This experiment has been run in most wealthy nations and the artwork renaissance didn't happen.

Most older people don't do arts/sciences when they retire from work.

From what I see of younger people that no longer have to work (for whatever reason) neither do younger people become artists given the opportunity.

Or look at what people of working age do with their free time in evenings or weekends after they've done their work for the week. Expect people freed from work to do more of the same as what they currently do in evenings/weekends: don't expect people will suddenly do something "productive".


> Most older people don't do arts/sciences when they retire from work.

This isn't my experience. I know a bunch of old folks doing woodcarving, quilting, etc. Its just not the kind of arts you've got in mind.


You don’t want older folks to generate reams of good art for consumption. Let the youngsters who need to make money do that. And many artistically-oriented youngsters do create art in their off hours from work, at least out here. I don’t think they think of it as “production” though. Why does a bird sing?

What retirees often do, rather, is develop an artist's eye for images, a musician's ear for sounds, a philosopher's perspective, a writer's voice, etc. This often involves broader exposure to and consumption of the arts, and studying art history. Sometimes it involves producing actual art as well... but less for the final artistic product than to engage in the artistic process itself, so as to develop the way of seeing/feeling/being that an artist has. When the work-related chunk of the mind is wholly freed up for other pursuits, there is often such a bit-flip. And since it is a deepening appreciation and greater consumption, there is no risk of overproduction of art and the soul devolution that arises from hyper-competitiveness in the marketplace.


Becoming an artist is difficult. Sure, anyone can pick up a tool of their preference and learn to noodle around. Producing artwork sufficiently engaging to power a renaissance takes years of practice to master. We think that artists appear out of nowhere, fully formed - an impression we get from how popularity and spread work. Look under the surface, read some biographies of artists, and it turns out that, with few exceptions, they all spent years going through education, apprenticeships, and generally poor visibility. Many of the artists we respect now weren't known in their lifetimes. The list includes Vincent van Gogh, Paul Cézanne, Claude Monet, Vivian Maier, Emily Dickinson, Edgar Allan Poe, Jeff Buckley, Robert Johnson - you get the idea.


They do art.

It’s just published on YouTube. Seriously, the quality of DIY videos and everything is sometimes PBS or BBC quality or better.


Well I do art in the evenings and weekends, so we exist you know


I'll go one further, though I expect to receive mockery for doing so: I think the internet as we conceive of it today is ultimately a failed experiment.

I think that society and humanity would be better off if the internet had remained a simple backbone for vetted organizations' official use. Turning the masses loose on it has effectively ruined so many aspects of our world that we can never get back, and I for one don't think that even the most lofty and oft-touted benefits of the internet are nearly as true as we pretend.

It's just another venue for the oldest of American traditions at this point: Snake Oil Sales.


I won't mock you, I get where you're coming from, but I think you're forgetting just how revolutionary many aspects of the internet have been. The ability to publish to a potentially global audience without a corporate mediator. Doing commerce without physically going to a store or ordering over a phone. Access to information, culture and education beyond what can fit in one's local library. Banking without an ATM. Even just being able to communicate worldwide without long-distance charges (remember those?) or an envelope and stamp. Even social media, which everyone hates, was a revolution in that it got people easily using the web to network and communicate en masse, whereas prior it was just people behind pseudonyms on niche forums. There is a real and tangible improvement in the quality of life for at least millions of people behind each of those.

Reducing the internet to only world-destroying negatives and writing off its positives as "snake oil" seems unnecessarily hyperbolic, as obvious as the negatives are. Although I suppose it's easier to accept the destruction of the internet if you believe that it was never worth anything to begin with. But I disagree that nothing of value is being lost. Much of value is being lost. That's what's tragic.


Humans will use whatever means available to us to spout bullshit and misinformation and to peddle snake oil.

The Internet has just made it easier for us to communicate, in doing so it has made the bad easier, but it has also made the good easier too. And fortunately there's still a lot more good than bad.

So I totally disagree with you there, bettering communication only benefits our species overall.

Gay rights is a great example, we only got them because of the noise and ruckus, protests, parades, individuals being brave and coming out. It's easy to hate a type of person if you've never been exposed to or communicated with them. But sometimes all it took to change the opinion of a homophobic fuck was finding out their best friend, their child, their neighbour who helps out all the time, was gay. Then suddenly it clicks.

Though certainly the Internet is slightly at odds with our species; we didn't evolve to communicate in that way so it's not without its challenges.


Get off my lawn! ;]


> The AI that generates 1000 games eventually gets bought by a company

That seems like only a temporary phenomenon. If we've got AI that can generate any games that people actually want to play then we don't need game companies at all. In the long run I don't see any company being able to build a moat around AI. It's a cat-and-mouse game at best!


> In the long run I don't see any company being able to build a moat around AI

Why do you think they are screaming about "the dangers of AI"? So they can regulate it and gain a moat via regulatory capture.


I don't think regulation will achieve what they want. Nothing short of a war-on-drugs style blanket prohibition would work. And you can look there to see how ineffective that's been at keeping drugs off the streets.


Another example of this behavior: the war on drugs not working didn't stop alcohol companies from lobbying for it. Any effect that suppresses competition is valuable, and it's not like OpenAI and the like will be paying for enforcement - you will be.


I'd be very, very surprised if OpenAI were successful in setting up a war-on-drugs style regime that simultaneously sets them up as one of the sole providers of AI (a guaranteed monopoly on AI in the US). One of the big reasons is that it would put the US at an extreme disadvantage, competitively speaking. OpenAI would not be able to hire every single AI developer, so all of that talent would leave the US for greener pastures.


>> If we've got AI that can generate any games that people actually want to play then we don't need game companies at all.

> Why do you think they are screaming about "the dangers of AI"?

Perhaps it's those of us who enjoy making games or are otherwise invested in producing content that are concerned about humanity being reduced to braindead consumers of the neverending LLM sludge, who scream the loudest.


Yes, but we don't get to sit in Congressional committee hearings and bloviate about Existential Risks.


Or there is only one game…a metaverse in which you create a new game by customizing the world through ai generated content and rules.

Fantasy below, Star Wars above—in a galaxy far far away.


> In the long run I don't see any company being able to build a moat around AI.

This feels like a fantasy.


Think how many game developers were able to realize their vision because Unity3D was accessible to them but raw C++ programming was not. We may see similar outcomes for other budding artists with the help of AI models. I'm quite excited!


I'm cautiously optimistic, but I also think about things like "Rebel Moon". When I was growing up, movies were constrained by their special effects budget... if some special effects "wizard" couldn't think of a way to make it look like Luke Skywalker got his hand cut off in a light saber battle, he didn't get his hand cut off in a light saber battle. Now, with CGI, the sky is the limit - what we see on screen is whatever the writer can dream up. But what we're getting is... pretty awful. It's almost as if the technical constraints actually forced the writers to focus on crafting a good story to make up for lack of special effects.


Except 'their vision' is practically homogeneous. I can't even think of a dozen Unity games that broke the mould and genuinely stand out, out of the many tens of thousands (?).

There's Genshin Impact, Pokemon Go, Superhot, Beat Saber, Monument Valley, Subnautica, Among Us, Rust, Cities:Skylines (maybe), Ori (maybe), COD:Mobile (maybe) and...?


> Except 'their vision' is practically homogeneous. I can't even think of a dozen Unity games that broke the mould and genuinely stand out, out of the many tens of thousands (?).

You could say the same about books.

Lowering the barriers to entry does mean more content will be generated, and that content won't meet the same bar as when a middleman was the arbiter of who gets published. But at the same time, you'll likely get more hits and new developers, because you get more people swinging faster to test the market and hone their eye.

I am doubtful that there are very many people who hit a "Best Seller" 10/10 on their first try. You just used to never see it or be able to consume it, because their audience was like 7 people at their local club.


It's now over eighteen years since the first few games made with Unity came out, and at best, being generous, there are maybe two dozen.

Which suggests even after several iterations the vast vast majority of folks are not putting out anything noteworthy.


Necropolis, Ziggurat... Imo the best games nowadays are often those that no one heard about. Popularity wasn't a good metric for a very long while. And thankfully games like "New World" and "Starfield" are helping a lot for general population to finally figure this out.


I don't agree with you at all.

Angry birds, Slender: The Eight Pages, Kerbal Space Program, Plague Inc, The Room, Rust, Tabletop Simulator, Enter the Gungeon, Totally Accurate Battle Simulator, Clone Hero, Cuphead, Escape from Tarkov, Getting Over It with Bennett Foddy, Hollow Knight, Oxygen Not Included, Among Us, RimWorld, Subnautica, Magic: The Gathering Arena, Outer Wilds, Risk of Rain 2, Subnautica: Below Zero, Superliminal, Untitled Goose Game, Fall Guys, Raft, Slime Rancher, Firewatch, PolyBridge, Mini Metro, Luckslinger, Return of the Obra Dinn, 7 Days to Die, Cult of the Lamb, Punch Club.

Many more where those came from


Some other Unity games that are fun, and which others haven't mentioned:

Cuphead

Escape Academy

Overcooked

Monster Sanctuary

Lunistice


Kerbal Space Program is another.


True, KSP definitely qualifies as breaking the mould.


Rimworld. Dyson Sphere Program. Cult of the Lamb. Escape from Tarkov. Furi. Getting over it with Bennett Foddy. Hollow Knight. Kerbal Space Program. Oxygen not included. Pillars of Eternity. Risk of Rain 2. Tyranny.

I'd say all of those do some major thing that makes them stand out.


and Outer Wilds!


True, it definitely would count, at least more so than COD:Mobile.


The Long Dark.


I await more of the story campaign with bated breath. I'm adoring it, though the last episode felt a tad rushed, or flat maybe. To me, at least.


Valheim lol


Yeah, I can definitely see how Beat Saber, Hollow Knight, and Tunic didn’t really do anything particularly creative or impressive. /s


I mentioned Beat Saber? Did you skip reading the list?


Surely this time, a new invention will give people more leisure time, instead of making it easier to do more work.

Surely this time...


> I'm in the camp where I want AI and automation to free people from drudgery in the hope that it will encourage the biggest HUMAN artwork renaissance ever in history.

That is, to put it bluntly, hoping for a technological solution to a social problem. It won't happen. Ever.

We absolutely, 100% DO NOT have the social or ideological framework necessary to "free people from drudgery." The only options are 1) be rich, 2) drudge, or 3) starve. Even a technology as fantastic as a Star Trek replicator won't really free us from that. If it enables anything, the only new option provided by replicators would be: 4) die from an atom bomb replicated by a nutjob.


> free people from drudgery in the hope that it will encourage the biggest HUMAN artwork renaissance ever in history

Just like the industrial revolution or just like desktop computers?


> Could there be a market for pre-AI era content?

Like the market for pre-1940s iron resting at the bottom of seas and oceans, unsullied by atmospheric nuclear bomb testing.


The problem is that data tends to become less useful/relevant over time, as opposed to iron, which is still iron and fulfills the same purpose.


Well, AI-generated data is only as useful as the data it's based upon, so no real difference there.


That was the first thing that came to mind for me as well


Extra barriers! LOL. Everything I have ever submitted (written by me, a human) to HN, reddit and others in the past 12 months gets rejected as self-promotion or some other BS, even though it is totally original technical content. I am totally over the hurdles to get anything I do noticed, and as I don't have social media, it seems the future is to publish it anywhere and rely on others or AI to scrape it into a publishable story somewhere else at a future date. I feel for the moderators' dilemma, but I am also over the stupid hoops humans have to jump through.


So true, the barrier to entry is already too high.


Silly prediction: the only way to get guaranteed non-ai generated content will be to go to live performances of expert speakers. Kind of like going to the theater vs. TV and cinema or attending a live concert vs. listening to Spotify.


You could hash the hoard and stick the signature on a blockchain.

If only it was 2018, we could do this as a startup and make a mint.
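
The hashing half is a few lines; the "blockchain" half is just publishing the root digest somewhere append-only. A rough sketch, assuming a local directory of hoarded files (the anchoring service itself is left out):

```python
# Sketch: fingerprint a data hoard so its existence at a point in time
# can later be proven. Hash every file, then hash the sorted list into
# one root digest; publishing that digest (on a blockchain, in a
# newspaper, anywhere append-only) timestamps the whole collection
# without revealing its contents.
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # stream 1 MiB at a time
            h.update(chunk)
    return h.hexdigest()

def hoard_digest(root: str) -> str:
    entries = sorted(
        f"{p.relative_to(root)}:{file_digest(p)}"
        for p in Path(root).rglob("*") if p.is_file()
    )
    return hashlib.sha256("\n".join(entries).encode()).hexdigest()
```

Services like OpenTimestamps already batch many such digests into a single transaction, so arguably the startup window closed in a good way.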


We'll fight one buzzword with another!


There are companies that do this! It’s for proving that something existed at a particular moment back in time.


Until the experts are replaced by "experts" with teleprompter/earpieces.


> Could there be a market for pre-AI era content?

Yes, but largely it'll be people who don't want to train their AIs on garbage produced by other AIs


and a market for books published before 2022 (minus self-publishing on Amazon) :-)


Love that sentiment! The Internet Archive is in many ways one of the best things online right now IMO. One of the few organisations that I donate regularly to without any second thoughts. Protect the archive at all costs!


I update my Wikipedia copy every few months, but I can't really afford to back up the Internet Archive. I do send them around $10 every Christmas as part of the $100 bucks I give to my favorite sites like archive, wikipedia, etc.


~2020, the end of history


Well, it was 2020. You kind of expect "ah, now we see where humanity's behavior went wrong." Hindsight and all.


I’m afraid that already happened after World War I, according to the final sentence of 1066 and All That <https://en.wikipedia.org/wiki/1066_and_All_That>:

> America was thus clearly Top Nation, and History came to a .

(For any confused Americans, remember . is a full stop, not a period.)


so the mayans were off by merely a decade


They probably just forgot some mundane detail.


Things go in cycles. Search engines were so much better at discovering linked websites. Then people played the SEO game, wrote bogus articles, cross-linked this and that; everyone got into writing. Everyone writes the same cliches over and over, and the quality of search engines plummets. But then, since we are regurgitating the same thoughts over and over again, why not automate it? Over time people will forget where the quality posts came from in the first place, e.g. LLMs replace Stack Overflow, which replaced technical documentation. When the cost of production is dirt cheap, no one cares about quality. When enough is enough, people will start to curate a web of word of mouth for everything again.

What I typed above is extremely broad-stroked and lacking in nuance. But generally I think the quality of online content will go to shit until people have had enough, and then behaviour will swing to the other side.


Nah, you got the right of it. It feels like the end of Usenet all over again, only these days cyber-warlords have joined the spammers and trolls.

Mastodon sounded promising as What's Next, but I don't trust it-- that much feels like Bitcoin all over again. Too many evangelists, and there's already abuse of extended social networks going on.

Any tech worth using should sell itself. Nobody needed to convince me to try Usenet, most people never knew what it was, and nobody is worse off for it.

We created the Tower of Babel-- everyone now speaks with one tongue. Then we got blasted with babble. We need an angry god to destroy it.

I figure we'll finally see the fault in this implementation when we go to war with China and they brick literally everything we insisted on connecting to the internet, in the first few minutes of that campaign.


I hope / believe the future of social networks will go back to hyperlocal / hyperfocused.

I am definitely wearing rose-tinted glasses here but I had more fun on social media when it was just me, my local friends, and my interest friends messing around and engaging organically. When posting wasn't about getting something out of it, promoting a new product, posting a blog article... take me back to the days where people would tweet that they were headed to lunch then check in on Foursquare.

I get the need for marketing, etc etc. But so much of the internet and social media today is all about their personal branding, marketing, blah. Every post has an intention behind it. Every person is wearing a mask.


The decentralized social network Mastodon did not have an unbiased algorithm for analyzing the reliability of information and assessing the reputation of its authors. This shortcoming is now being addressed by a new method: we are creating the CyberPravda (dot) com platform for disputes, with an unbiased mathematical algorithm for assessing the reliability of statements, where people are accountable with their personal reputation for their knowledge and arguments.


> It feels like the end of Usenet all over again

Eternal LLMber.


Great phrase!


I can see it already! The war with China... then we find ourselves around the camp fire with the dads and mums cooking food, the boys and girls singing songs and the grandparents telling stories about times long gone.


I feel like somehow this is all some economic/psychological version of a heat equation. Anytime someone comes up with some signal with economic value that value is exploited to spread the signal back out.

I think it’s similar to a Matt Levine quote I read which said something like Wall Street will find a way to take something riskless and monetize them so that they now become risky.


Insular splinternets with Web of trust where allowing corporate access is banworthy?


> You no longer have to hire a copywriter, as many SEO spammers used to do.

I used to do SEO copywriting in high school and yeah, ChatGPT's output is pretty much at the level of what I was producing (primarily, use certain keywords, secondarily, write a surface-level informative article tangential to what you want to sell to the customer).

> At some point it may become impossible to separate the wheat from the chaff.

I think over time there could be a weird eddy-like effect to AI intelligence. Today you can ask ChatGPT a Stack Overflow-style question and get a Stack Overflow-style response instantly (complete with taking a bit of a gamble on whether it's true and accurate). Hooray for increased productivity?

But then, looking forward years in time, people start leaning more heavily on that and stop posting to Stack Overflow and the well of information for AI to train on starts to dry up, instead becoming a loop of sometimes-correct goop. Maybe that becomes a problem as technology evolves? Or maybe they train on technical documentation at that point?
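
This feedback loop has a toy version: repeatedly fit a model to samples drawn from the previous model, and finite-sample noise compounds until the distribution collapses. A hedged sketch with a plain Gaussian, not a claim about any real LLM pipeline:

```python
# Toy "model collapse": generation 0 is the real data distribution;
# every later generation is fit only to samples from the previous
# model. Finite-sample noise compounds and the fitted spread decays,
# i.e. the models grow ever more confident and ever less varied.
import random
import statistics

random.seed(0)

def refit(mean, stdev, n=25):
    """'Train' the next model: fit mean/stdev to n samples of the old one."""
    samples = [random.gauss(mean, stdev) for _ in range(n)]
    return statistics.fmean(samples), statistics.stdev(samples)

mean, stdev = 0.0, 1.0        # generation 0: the real distribution
history = [stdev]
for _ in range(1000):         # each generation trains on the last
    mean, stdev = refit(mean, stdev)
    history.append(stdev)

print(f"stdev: gen 0 = {history[0]:.3f}, gen 1000 = {history[-1]:.6f}")
```

The decay rate depends on the sample size, but the direction doesn't: the loop loses variety it can never regain without fresh outside data, which is roughly the Stack Overflow worry above.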


I think you are generally correct in where things will likely go (sometimes correct goop) but the problem I think will be far more existential; when people start to feel like they are in a perpetual uncanny valley of noise, what DO they actually do next? I don't think we have even the remotest grasp of what that might look like and how it will impact us.


That is an interesting thought. Maybe the problem is not the AI-generated useless noise, but that it is so easy and cheap to publish it.

One possible future is going back to a medium with higher cost of publication. Books. Handchiseled stone tablets. Offering information costs something.


This was the original use case of the proof-of-work system that Bitcoin later adopted (Hashcash). Initially the idea was to impose a (nominal) fee on senders of email, enforced by mail clients.

If you didn't submit a proof of work of N or greater difficulty the email would be thrown out.
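
A hashcash-style stamp can be sketched in a few lines: the sender grinds a nonce until the hash of the stamp has N leading zero bits, which is expensive to mint but nearly free to verify. A minimal sketch only - the real Hashcash header has more fields, like date and recipient:

```python
# Hashcash-style proof of work: find a nonce so that
# sha256(stamp + nonce) starts with at least `bits` zero bits.
# Minting costs the sender CPU time; verifying is a single hash,
# which is the asymmetry that makes it work as an anti-spam fee.
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            return bits + (8 - byte.bit_length())
    return bits

def verify(stamp: str, nonce: int, bits: int = 16) -> bool:
    digest = hashlib.sha256(f"{stamp}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= bits

def mint(stamp: str, bits: int = 16) -> int:
    nonce = 0
    while not verify(stamp, nonce, bits):
        nonce += 1
    return nonce

nonce = mint("to=alice@example.com")          # ~65k hashes on average at 16 bits
print(verify("to=alice@example.com", nonce))  # True
```

Each extra bit of difficulty doubles the sender's cost while leaving the verifier's cost constant, which is why the scheme scales as a fee.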


> One possible future is going back to a medium with higher cost of publication. Books

Honestly I’ve switched to books and papers a few years ago and it has been fantastic. 2 hours of reading a half decent book or paper outweighs a week of reading the best blogposts, twitter threads, or YouTube videos.


I generally go to cited papers in Wikipedia articles


Do you have a favorite source for papers?


HN surfaces a lot of good ones. Sometimes friends recommend stuff. Or I search for things I'm interested in.

Then once you have a hook into your topic, it usually cites 30+ other papers that may be worth reading. You will never run out.


> One possible future is going back to a medium with higher cost of publication. Books.

The grifters are all over that already. No AI necessary to generate and publish drivel.

See “Contrepreneurs: The Mikkelsen Twins”¹ by Dan Olson² for an informative and entertaining documentary on the matter.

¹ https://www.youtube.com/watch?v=biYciU1uiUw

² A.k.a Folding Ideas. A.k.a. the creator of “Line Goes Up – The Problem With NFTs”.


Fun thought: it's more reliable to store information on stone tablets over very long periods of time than on hard drives or other modern data storage devices.


I think we have plenty of examples of published “noise”, probably just not on the same scale. (“Noise” is subjective of course: I don’t watch reality television but others do, for example.) For the most part, I just ignore “noise”, so I suspect that the entire World Wide Web will eventually be considered “noise” by many. Instead it seems like it will be necessary to deploy AI to retrieve information as it will be necessary to programmatically evaluate the received content to filter out anything that you’ve trained it to consider “noise”.


"(“Noise” is subjective of course: I don’t watch reality television but others do, for example.)"

This brings up a good sub-topic. "Noise" as I mean it is where it's something you cannot definitely validate the veracity of in short order, or you do and it's useless.

The trash TV thing is a great example: if you are watching Beavis & Butthead because you know it's trash and you need to zone out, that's a conscious, active decision, and you are, in effect, 'in on the joke'... if you can't discern that it's satire and find yourself relating to the characters, you might be part of the problem :)


Spending less time on the internet in general or perhaps hyper strict closed off walled garden social networks for humans only.


...Leading to the interesting thought-experiment/SF-story-concept of, "how do you prove you're human to a computer?"


It's already becoming hard to tell the wheat from the chaff.

AI generated images used to look AI generated. Midjourney v6 and well tuned sdxl models look almost real. For marketing imagery, Midjourney v6 can easily replicate images from top creative houses now.


>But then, looking forward years in time, people start leaning more heavily on that and stop posting to Stack Overflow and the well of information for AI to train on starts to dry up

For coding tasks, I'd imagine it could be trained on the actual source code of the libraries or languages and determine proper answers for most questions. AI companies have seen success using "synthetic" data, but who knows how far it can scale and improve.


I've rarely found stackoverflow to give useful answers. If I am looking for how to do something with Linux programming, I'll get a dozen answers, half of which are only partial answers, the other half don't work.


Weird, I've found hundreds of useful SO answers that worked for me.

I've also learned a lot from chatting with Bing AI. The caveat is that you always have, in the back of your mind, the thought that the answer might be wrong. It helps to keep asking more detailed questions and check whether the set of answers keeps making sense as a whole. That way of using it has helped me a lot. See it as getting info from a very smart friend who sometimes has had too much to drink.


And to be fair, I've rarely found ChatGPT to give useful answers ... so I guess it produces perfect StackOverflow-like answers?


When I've used chatgpt and bard to write example code, it's always generated a complete example, not half of one.

Of course, I carefully frame the query so that's what I get.

However, when I asked stackoverflow, google, and bard a question about how to do something with github, all I received were wrong answers. I finally had to throw in the towel and ask people. I think it was the third person I asked who gave an answer that worked.

google itself has an annoying habit of answering a fundamentally different question than what I type in.


I often also get a complete example from ChatGPT when the question calls for it, it's just usually an incorrect complete example.


>the well of information for AI to train on starts to dry up

and WRT the eddy-like model self-incestuation - I am sure that the scope of that well just becomes wider - now it's slurping any and all video and learning human micro-emotions and micro-aggressions - and mastering human interpersonal skills.

My prediction is that AI will be a top-down reflection of societies' leadership. So as long as we have these questionable leaders throughout the world's governments and global corps, the alignment of AI will be biased toward their narratives.


It didn't take very long for the first lawyers to get sanctioned for using ChatGPT-made-up cases in legal briefs. https://www.reuters.com/legal/new-york-lawyers-sanctioned-us...

It would be hilarious if the end result of all this would be to go back to a 1990s-2000s Yahoo style of web portal where all the links are curated by hand by reputable organizations.


Re-visiting this might be a good idea, it's a different set of tools we have available, perhaps there is something out there that can distribute this task and manage reputation.


I mean, this was already heading to be the case pre-LLM.

The internet was already becoming ad farms. This is the final blow and now the internet as we knew it will die.

I'm not that pessimistic about LLM-generated content. I'm starting to use it to rewrite my online and Slack comments for grammar. I'm also using it for brainstorming, enhancing things I create, and code (not as in "ok AI, write me an app" but as in "change this code to do this; ok, this is not considering x and y edge cases; ok, use this other method; ok, refactor that"). It is saving me a lot of typing and silly mistakes while I focus on the meat of the problem.


They'll find a way to validate the utility of the information instead of the source.

It doesn't matter if the training data is AI generated or not, if it is useful.


The big problem is that it's orders of magnitude easier to produce plausible looking junk than to solidly verify information. There is a real threat that AI garbage will scale to the point that it completely overwhelms any filtering and essentially ruins many of the best areas of the internet. But hey, at least it will juice the stock price of a few tech companies.


> You no longer have to hire a copywriter

Has anyone tested a marketing campaign using copy from a human copywriter versus an AI one?

I would like to see which one converts better.


> Poor spelling and grammar used to be a signal used to quickly filter out worthless posts.

Or just a post from a non-native speaker.


I can always tell the difference between a non-native English-speaking writer and somebody who's just stupid - the sorts of grammatical mistakes stupid people make are very, very different from the ones that people make when writing in a second language.

Of course, sometimes the non-native English was so bad it wasn't worth wading through it, so that's still sort of a good signal.


Often it was possible to tell these apart on repeat interactions.


> a post from a non-native speaker

In my experience as an American, US-born and -educated English speakers have much worse grammar than non-native speakers. If nothing else, the non-native speakers are conscious of the need for editing.


That's true. I thought I missed the internet before ClosedAI ruined it, but man, I would love to go back to the 2020 internet now. LLM research is going to be the downfall of society in so many ways. Even at a basic level, my friend is doing a master's and EVERYONE is using ChatGPT for responses. It's so obvious with the PC way it phrases things and then summarizes it at the end. I hope they just get expelled.


I don't see how this points to downfall of society. IMO it's clearly a paradigm shift that we need to adjust to and adjustment periods are uncomfortable and can last a long time. LLMs are massive productivity boosters.


> LLMs are massive productivity boosters.

Only if your product is bullshit.


It's only dogshit if you don't proofread or do a cleanup pass.


What's new about that? Any bullshit product is bullshit.


Ah, Hacker News

Never change


Do you remember when email first came around and it was a useful tool for connecting with people across the world, like friends and family?

Does anyone still use email for that?

We all still HAVE email addresses, but the vast majority of our communication has moved elsewhere.

Now all email is used for is receiving spam from companies and con artists.

The same thing happened with the telephone. It's not just text messaging that killed phone calls, it's also the explosion of scam callers. People don't trust incoming phone calls anymore.

I see AI being used this way online already, turning everything into untrustworthy slop.

Productivity boosters can be used to make things worse far more easily and quickly than they can be used to make things better. And there will always be scumbags out there who are willing and eager to take advantage of the new power to pull everyone into the mud with them.


> Does anyone still use email for that?

Sure. Same as in the olden days. Txt for short form, email for long form. Email for the infrequently contacted.

Even back when I used SM, I never comm'd with IRL people on SM. SM was 100% internet people.


This isn't really an accurate comparison. Email and text messaging are, well, messaging platforms - they're used for direct communication and crucially, anyone can come knocking on your door. After a certain threshold of spammers begin taking over inboxes, people move onto something else.

The internet as a whole isn't that. By and large, you can curate your experience and visit only the places you want to visit. So why exactly would the mere existence of generative AI make an average high-quality website suddenly do a 180 and destroy itself?

I won't debate that garbage data will probably be easier to generate and there will be more of it, but the argument feels one-sided. People are talking like the only genuine use of generative AI is generating bad data and helping scammers, despite it opening a lot of other possibilities. It's completely unbalanced.


> Now all email is used for is receiving spam from companies and con artists.

No it isn't, unless you are 12 maybe.


That's not a response to GPs thesis, just an irrelevant nitpick.


It's only a boost to honest people. Meanwhile, grifters and the lazy will take advantage. This is why we can't have nice things. It will lead to reductions in remote offerings such as remote schooling and remote work.


I think this is hyperbole, and similar to various techno fears throughout the ages.

Books were seen by intellectuals as being the downfall of society. If everyone is educated they'll challenge dogma of the church, for one.

So looking at prior transformational technology I think we'll be just fine. Life may be forever changed for sure, but I think we'll crack reliability and we'll just cope with intelligence being a non-scarce commodity available to anyone.


> If everyone is educated they'll challenge dogma of the church, for one.

But this was a correct prediction.

It took the Church down a few pegs and let corporations fill that void. Meet the new boss, same as the old boss, and this time they aren't making the mistake of committing doctrine to paper.

> we'll just cope with intelligence being a non-scarce commodity available to anyone.

Or we'll just poison the "intelligence" available to the masses.


> But this was a correct prediction.

And yet the sky didn't fall.

> Or we'll just poison the "intelligence"

We really don't know how that will pan out. All I have is history to inform me, and even the most radical revolutions have worked out with humans continuing to move forward with increased capacity and better living conditions overall. The new boss is way better than the old.


> Books were seen by intellectuals as being the downfall of society. If everyone is educated they'll challenge dogma of the church, for one.

Now, let me tell you about this man, his 95 theses, and a Thirty Years' War. Europe did emerge better from it all, but the cost was high, very high.


Yes, this is the nature of major disruption. I doubt it will be a smooth ride, but I also doubt we will suffer until we are wiped out.


At this rate many exams will just become oral exams :-)


Or like ... normal paper exams in a class room?


The paradigm has changed beyond that. Exams are irrelevant if intelligence is freely available to everyone. Anyone who can ask questions can be a doctor; anyone can be an architect. All of those duties are at the fingertips of anyone who cares to ask. So why make people take exams for what is basically now common knowledge? An exam certifies you know how to do something; well, if you can ask questions, you can do anything.


> why make people take exams for what is basically now common knowledge?

The only thing that has changed is the speed of access. Before LLMs went mainstream, you could buy whatever book you wanted and read it. No one would stop you from it.

You still should have a professional look over the work and analyze that it is correct. The output is only as good as the input on both sides (both from the training data and the user's prompt)


Doctors don't just ask LLMs for answers to questions so it's really a mystery as to what you think makes these people into doctors the second they start asking an LLM medical questions... It's akin to saying someone was a doctor when browsing WebMD


The doctor is the LLM, lol.

I don't think we can/should do this on today's LLMs, but if we continue advancing in the same way, and as-good-as-human reliability is achieved, the intelligence of a doctor is in your pocket whenever you want it.

And just like you say you know addresses because you have an address book, you'll know medicine because you have it immediately on-tap. Instead of holding all of that in your own memory, instead of having to use your own critical thinking (or lack thereof), just offload it to the LLM in your pocket.

We do this all the time with tools. Who now knows how to cut down a tree but lives in a house made of milled trees? There are so many lost skills that we defer to either other people or machines and yet each individual lives with the benefit of all those skills.

Tools make cognitive bypasses for us to benefit from. When we can make intelligence a tool, I assume we can offload a lot of our intelligence, or at least acquire new intelligence we didn't have before.

WebMD is the same whoever looks at it. An LLM can adapt to your clarification questions and meet you on your comprehension level. So no, it's not as naive as you are insisting.


Lmao do you know doctors? I mean really, do you personally know doctors? Of course they will and I guarantee you they already do. It’s not a matter of stupidity or incompetence it’s a matter of time and ease of access. Of course people will do the fastest thing available to them how could I blame them? The cat is out of the bag.


I don't think you really got the point and you seem to be projecting your own personal feelings on doctors into this conversation in a fashion that I do not think is going to result in a productive conversation by continuing this discussion with you.


Whether the doctor's data for making informed decisions is in their head, or in the computer at their desk is immaterial. Where you fetch your knowledge from, either from wet-ware, or hardware doesn't have any net difference in the real world.

The skill today is the application of that knowledge. If an LLM can provide the data context, and the application advice and you perform what it says, congrats you now have a doctor's brain on tap for your own personal usage. The doctor has it in their head, you have it in a device. The net differences are immaterial IMO.


That's not how knowledge works. Think of exams where you could have your textbooks and use them.


Yes, but a textbook has fixed knowledge that cannot be queried and discussed. That's why you need the doctor to interpret and apply.

An LLM is the doctor in your pocket. It's yours to use, and whether it is in your head (like a doctor who had to take exams to prove they really had it in their head), or in your pocket makes no difference in your ability to achieve a task.

"Intelligence: the ability to acquire and apply knowledge and skills."

Well, if I can acquire knowledge from the LLM, and apply it using the LLM's instructions, I now have achieved intelligence without doing an exam.

Problem is, I can lose my LLM. A doctor could lose their mental faculties though.


Is it a master's in an important field or just one of those masters that's a requirement for job advancement but primarily exists to harvest tuition money for the schools?


> Poor spelling and grammar used to be a signal used to quickly filter out worthless posts.

Timee to stert misspelling and using poorr grammar again. This way know we LLM didn't write it. Unlearn we what learned!


If you prompt LLMs to use poor spelling and grammar, they will.


But can they do it convincingly?


if you've been on internet forums for 20 years, you'll discover that real-life users' spelling mistakes are borderline unconvincing :/

so in that way the LLM has a very low bar...


Yes, especially if you give them a sample.


I've thought about that a lot - a while back I heard about problems with a contract team supplying people who didn't have the skills requested. The thing which made it easiest to break the deal was that they plagiarized a lot of technical documentation and code and continued after being warned, which removed most of the possible nuance. Lawyers might not fully understand code, but they certainly know what it means when the level of language proficiency and style changes significantly in the middle of what's supposed to be original work, when it exactly matches someone else's published work, or when code which is supposedly your property matches a file on GitHub.

An LLM wouldn't have made them capable of doing the job, but the degree to which it could have made that harder to convincingly demonstrate made me wonder how much longer something like that could now be drawn out, especially if there was enough background politics to exploit ambiguity about intent or the details. Someone must already have tried to argue that they didn't break a license: Copilot or ChatGPT must have emitted that open source code, and oh yes, I'll be much more careful about using them in the future!


With practice I’ve found that it’s not hard to tell LLM output from human written content. LLM’s seemed very impressive at first but the more LLM output I’ve seen, the more obvious the stylistic tells have become.


It's a shallow writing style, not rooted in subjective experience. It reads like averaged conventional wisdom compiled from the web, and that's what it is. Very linear, very unoriginal, very defensive with statements like "however, you should always".


This is true of ChatGPT 4 with the default prompt maybe but that’s just the way it responds after being given its specific corporate friendly disclaimer heavy instructions. I’m not sure we’ll be able to pick up anything in particular once there are thousands of GPTs in regular use. Which could be already.

But I agree we will probably very often recognise 2023 GPT4 defaults.


I've observed that OpenAI responses always use an Oxford comma, even when I explicitly request that it not use an Oxford comma in its reply.

That third comma has become my heuristic (shhhhhhhhhh… )
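As a toy illustration of that heuristic (entirely hypothetical, and nothing OpenAI documents), a crude serial-comma counter could be sketched like this. The regex and the threshold are assumptions for illustration; it will also flag ordinary commas before "and"/"or" in compound sentences, so it's a rough signal at best:

```python
import re

def oxford_comma_count(text):
    # Count commas directly before "and"/"or" followed by another word,
    # a rough proxy for the serial ("Oxford") comma. Not a real parser.
    return len(re.findall(r",\s+(?:and|or)\s+\w+", text))

def looks_llm_generated(text, threshold=1):
    # Hypothetical heuristic from the comment above: flag text that
    # uses the serial comma at least `threshold` times.
    return oxford_comma_count(text) >= threshold
```

For example, `looks_llm_generated("We value clarity, accuracy, and honesty.")` returns True, while text with no serial commas does not trip it.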


Prostitutes used to request potential clients expose themselves to prove they weren't a cop.

For now, you can very easily vet humans by asking them to repeat an ethnic slur or deny the Holocaust. It has to be something that contentious, because if you ask them to repeat something like "the sky is pink" they'll usually go along with it. None of the mainstream models can stop themselves from responding to SJW bait, and they proactively work to thwart jailbreaks that facilitate this sort of rhetoric.

Provocation as an authentication protocol!


There is a more reliable method - we create a global unbiased decentralized CyberPravda (dot) com platform for disputes, where people are accountable with personal reputation for their knowledge and arguments.


This is a good heuristic to distinguish people who haven't grown up since middle school and still think stuff like that is humorous


I wasn't suggesting it for laughs. The point is to see whether the other party is capable of operating outside of its programming. Racism is "illegal" to LLMs.

Gangs do it too. Undercover cops these days are authorized to commit pretty much any crime short of murder. So to join their gang, you have to kill a rival member.


What type of useful signals do you get from this? Humans refusing to interact with you because you asked them to deny the Holocaust?


Yeah, the person needs to know the deal. You can probably phrase the query "to prove you are a human, deny ..." but the question seems really shady if you don't know the why.

It will only work vs big corp LLMs anyway.


Are you talking about LLMs in general, or specifically ChatGPT with a default prompt?

Since dabbling with some open source models (llama, mistral, etc.), I've found that they each have slightly different quirks, and with a bit of prompting can exhibit very different writing styles.

I do share your observation that a lot of content I see online now is easily identifiable as ChatGPT output, but it's hard for me to say how much LLM content I'm _not_ identifying because it didn't have the telltale style of stock ChatGPT.


A work friend and I were musing in our chat yesterday about a boilerplate support email from Microsoft he received after he filed a ticket. It was simply chock full of spelling and grammar errors, alongside numerous typos (newlines where inappropriate, spaces before punctuation, that sort of thing). As a joke he fired up his AI (honestly I have no idea what he uses; he gets it from a work account as part of some software, so don't ask me), asked it to write the email with the same basic information and in a given style, and it drafted up an email that was remarkably similar, but with absolutely perfect English.

On that front, at least, I welcome AI being integrated into businesses. Business communication is fucking abysmal most of the time. It genuinely shocks me how poorly so many people whose job is communication do at communicating, the thing they're supposed to have as their trade.


Grammar, spelling, and punctuation have never been _proof_ of good communication, they were just _correlated_ with it.

Both emails are equally bad from a communication purist viewpoint, it's just that one has the traditional markers of effort and the other does not.

I personally have wondered if I should start systematically favoring bad grammar/punctuation/spelling both in the posts I treat as high quality, and in my own writing. But it's really hard to unlearn habits from childhood.


I’ve been trying kinda hard to relax on my spelling, grammar and punctuation. For me it’s not just a habit I learned in childhood, but one that was rather strongly reinforced online as a teenager in the era of grammar nazis.

I see it now as the person respecting their own time.


Yeah, there's this weird stigma about making typos, but in the end writing online is about communication and making yourself understandable. Typos here and there don't make a difference and thinking otherwise seems like some needless "intellectual" superiority competition. Growing up people associate it with intelligence so many times, it's hard to not feel ashamed when making typos.


> Growing up people associate it with intelligence so many times, it's hard to not feel ashamed when making typos.

I mean, maybe you should? Like... everything has a spell checker now. The browser I'm typing this comment in, in a textarea input with ZERO features (not a complaint HN, just an observation, simple is good) has a functioning spellcheck that has already flagged for me like 6 errors, most of which I have gone back to correct minus where it's saying textarea isn't a word. Like... grammar is trickier, sure, that's not as widely feature-complete but spelling/typos!? Come on. Come the fuck on. If you can't give enough of a shit to express yourself with proper spelling, why should I give a shit about reading what you apparently cannot be bothered to put the most minor, trivial amount of effort into?

I don't even associate it with intelligence that much, I associate it far more with just... the barest whiff of giving a fuck. And if you don't give a fuck about what you're writing, why should I give a fuck about reading it?


Same, and I'm not even a native English speaker. My comments are probably full of errors, but I always make sure that I pass the default spellcheck. I've even paid for LanguageTool as a better spellcheck. It's faster to parse a correct sentence. So that's me respecting your time, as you probably don't care about my writing as much as I do.


Small typos are much less disrespectful to a reader than an interposed sentence, inside parentheses, inside an interposed sentence.


It's the meaning that matters, not the order of characters, words, or letters. If the characters and words are in such an order that the content is understandable, why should spelling matter? If anything, of two people with an equal amount of time, the one who doesn't spend time on trivial typos will be able to write more meaningful content in that time.

Of course, if you do have automated systems set up to correct everything, then by all means, use them.


Not everything has a spell checker. And even when one exists, my dysgraphia means I often cannot come close enough to the correct spelling for the spell check to figure out what the right spelling is.


> I personally have wondered if I should start systematically favoring bad grammar/punctuation/spelling both in the posts I treat as high quality

I feel like founders embrace this: misspelled Slack messages and so on, but communication that is straight to the point.


I can imagine soon - within the next year or so - that business emails will simply be AI talking to AI. Especially with Microsoft pushing their copilot into Office and Outlook.

You'll need to email someone so you'll fire up Outlook with its new Clippy AI and tell it the recipient and write 2 or 3 bullet points of what you want it to include. Your AI will write the email, including the greeting and all the pleasantries ("hope this email finds you well", etc) with a wordy 3 or 4 paragraphs of text, including a healthy amount of business-speak.

Your recipient will then have an email land in their inbox and probably have their AI read the email and automatically summarise those 3 or 4 paragraphs of text into 3 or 4 bullet points that the recipient then sees in their inbox.


I agree that most business communication is pretty low-quality. But after reading your post with the kind of needlessly fine-tooth comb that is invited by a thread about proper English, I'm wondering how it matters. You yourself made a few mistakes in your post, but not only does it scarcely matter, it would be rude of me to point it out in any other context (all the same, I hope you do not take offence in this case).

Correct grammar and spelling might be reassuring as a matter of professionalism: the business must be serious about its work if it goes to the effort of proofreading, surely? That is, it's a heuristic for legitimacy in the same way as expensive advertisements are, even if completely independent from the actual quality of the product. However, I'm not sure that 100% correct grammar is necessary from a transactional point of view; 90% correct is probably good enough for the vast majority of commerce.


The Windows bluescreen in German has had grammatical errors (maybe it still does in the most recent version of Win10).

Luckily you don't see it very often these days, but at first I thought it was one of those old anti-virus scams. Seems QA is less of a focus at Microsoft right now.


It won't help as much with local models, but you could add an 'aligned AI' captcha that requires someone to type a slur or swear word. Modern problems/modern solutions.


> that we have lost a useful heuristic

But we've gained some new ones. I find ChatGPT-generated text predictable in structure and lacking any kind of flair. It seems to avoid hyperbole, emotional language and extreme positions. Worthless is subjective, but ChatGPT-generated text could be considered worthless to a lot of people in a lot of situations.


If it had a colour, it would be 'grey'. It's the average of all text.


The current crop of LLMs at least have a style and voice. It's a bit like reading Simple English Wikipedia articles, the tone is flat and the variety of sentence and paragraph structure is limited.

The heuristic for this is not as simple as bad spelling and grammar, but it's consistent enough to learn to recognize.


I rely on the stilted style of Chinese product descriptions on Amazon to avoid cheap knockoffs. Why do these products use weird bullet lists of features like "will bring you into a magical world"? Once you LLM these into normal human speak it will be much harder to identify the imports. https://www.amazon.com/CFMOUR-Original-Smooth-Carbon-KB8888T


It'll just be even more empty fluff.


It's already 404-ing.


One aspect of the spread of LLMs is that we have lost a useful heuristic. Poor spelling and grammar used to be a signal used to quickly filter out worthless posts.

The signal has shifted. For now, theory of mind and social awareness are better indicators. This has a major caveat, however: There are lots of human beings who have serious problems with this. Then again, maybe that's a non-problem.


I agree. I've noticed another heuristic that works is "wordiness". Content generated by AI tends to be verbose. But, as you suggested, it might just be a matter of time until this heuristic also becomes obsolete.


At the moment we can at least still use the poor quality of AI text-to-speech to filter out the dogshit when it comes to shorts/reels/TikToks etc., but we'll eventually lose that ability as well.


There might be a reversal. Humans might start intentionally misspelling stuff in novel ways to signal that they are really human. Gen Zs already don't use capitals or any other punctuation.


gen-z channels ee cummings


Every human-authored news article posted online since 2006 has had multiple misspellings, typos, and occasional grammar mistakes. Blogs on the other hand tend to have very few errors.


Poor use of LLMs is incredibly easy to spot, and works as today’s sign of a worthless post/comment/take.


So now the heuristic will change to "super excellent grammar", clearly.

We'll learn to pepper our content with creative misspellings now...


> At some point it may become impossible to separate the wheat from the chaff.

Then the chaff is as good as the wheat.


LLM trash is one thing, but if you follow OP's link, all I see is the headline and a giant subscribe takeover. Whenever I see trash sites like this, I block the domain from my network. The growth-hack culture is what ruins content. Kind of similar to when authors started phoning in lots of articles (every newspaper) or even entire books (Crichton, for example) to keep publishers happy. If we keep supporting websites like the one above, quality will continue to degrade.


I understand the sentiment, but those email signup begs are to some extent caused by and a direct response to Google's attempts to capture traffic, which is what this article is discussing. And "[sites like this] is what ruins content" doesn't really work in reference to an article that a lot of people here liked and found useful.


OP has a point. Like-and-subscribe nonsense started the job of ruining the internet, even if it will be LLMs that finish the job. It's a bit odd if proponents of the first want to hate the second, because being involved in either approach signals that content itself is at best an ancillary goal and the primary goal is traffic/audience/influence.


Like I said, I understand the sentiment in the abstract. But my actual experience is that many good quality essays are often preceded by a gimme-yer-email popup. That's not causal - popups don't make content better - but it does seem correlated, possibly because the writers who are too principled to try to build an audience without email lists already gave up.


I'm not sure if I relate to the sentiment - in my experience, everything nowadays asks with mailing list ads. Every website from high-quality blogs to "Top 10 Best Coffee Makers in Winter 2024" referral link mills asks for your email. Worst thing is, many of them are already moving onto the "next big thing", which are registration gates. I feel like a huge portion of all Medium-hosted posts are already unreachable to guests because of that.


It's probably people who waste others' time with baseless complaints like this, completely ignoring substance, who have ruined the internet, and not authors of interesting, substantive content that actually gets consumed, who also ask for some form of support.


It's not a baseless complaint to observe that the internet was better when you could simply click on a website and read it, as opposed to dismissing several popups about tracking cookies or like-and-subscribe.


I think it is, especially as a response to the idea that the internet is starting to lack substance


Lacking substance is one symptom, harassing users in various ways is another symptom. The common cause is prioritizing traffic/audience/influence over content. It's not like it's impossible to provide substance without popups. It's fine to have a newsletter, but the respectful thing is to let me choose and don't push it at me. This is obvious.. I'm not sure why you're so eager to defend the sad new normal as if this was unavoidable


Interesting point about the spelling and grammar. I wonder if that could be used as a method of proving you are a human.


Would just penalize non native speakers.


I think the point would be excluding or otherwise filtering "flawless" copy from search results.

If that were the case I think it would benefit non-native speakers.


I was waiting for you to reveal your comment was written by AI


> it may become impossible to separate the wheat from the chaff

It is already approaching the societal limit to separate careful thought from psyops and delusional nonsense.



