Content Creator
Everyone's talking about "Open"AI and Google scraping their entire website every 10 minutes, just in case anything might have changed.
But not mine.
I checked the logs for the first time in a long time, and there's not much traffic.
The last 20,000 requests show only 111 from http://www.google.com/bot.html and only 24 from the various OpenAI bots.
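For anyone wanting to run the same sort of count on their own logs, here is a rough sketch. It assumes the log file is called access.log (the name used later in this post), and count_bot_hits is a made-up helper name, not part of any tool.

```shell
# count_bot_hits PATTERN [LOGFILE] [LINES]
# Prints how many of the last LINES requests in LOGFILE mention PATTERN
# (case-insensitive). The name and the defaults are illustrative assumptions.
count_bot_hits() {
    pattern=$1
    logfile=${2:-access.log}
    lines=${3:-20000}
    # grep -c prints the number of matching lines, including 0.
    tail -n "$lines" "$logfile" | grep -c -i -- "$pattern"
}
```

Something like `count_bot_hits 'google.com/bot.html'` or `count_bot_hits openai` should then give numbers comparable to the ones above.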
Why Don't You Love Me, ChatGPT?
Google.com informs me the reason is 'SEO', and after looking up what that means, I decided to make 'content'. Once I have more content, ChatGPT will surely love me, and learn from me.
But I can't feed ChatGPT any old 'content'; if you feed it nonsense then it will only learn how to nonsense in the context of nonsense. No - best to feed it proper 'content'. But how?
ChatGPT, of course, has the answer: 'AI'.
Generating Content
I can't afford to develop my own 'AI', and I'm far too mean to pay for any, but luckily I run Linux, and Linux distros come with an open source AI: dadadodo.
From the help page of dadadodo:
usage: dadadodo [ options ] [ input-files ]

This program analyses text files and generates markov chains of word
frequencies; it can then generate random sentences based on that data.
According to OpenAI - 'markov chain' means 'Artificial Intelligence'. Wow! Now my computer will be smart.
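For the curious, the 'markov chain' idea fits in a few lines of awk. This is only a toy sketch of a first-order, word-level chain, and certainly not how dadadodo itself is written. The function name markov_babble and its two arguments are made up for illustration.

```shell
# markov_babble [WORDS] [SEED] < text
# Toy first-order word chain: remembers which words follow which,
# then walks that table at random, starting from the first word read.
markov_babble() {
    awk -v n="${1:-20}" -v seed="${2:-1}" '
    {
        for (i = 1; i <= NF; i++) {
            if (prev != "") {
                # Store every follower of prev; repeats keep their frequency.
                next_of[prev, ++count[prev]] = $i
            }
            if (first == "") first = $i
            prev = $i
        }
    }
    END {
        if (first == "") exit
        srand(seed)
        word = first
        out = word
        # Stop after n words, or early if the current word has no follower.
        for (k = 1; k < n && count[word] > 0; k++) {
            word = next_of[word, int(rand() * count[word]) + 1]
            out = out " " word
        }
        print out
    }'
}
```

Running `cat *.md | markov_babble 30` would print up to 30 words of babble. Frequent word pairs are stored more often, so they get picked more often, which is all the 'intelligence' there is.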
Time to use my old blog-text to generate new 'content'!
I'll go to the directory with all my articles, and feed everything into dadadodo:
$ cat *.md | dadadodo -

dadadodo: reading stdin...
............................... 6305 lines
shh...it's thinking...
Perhaps the standard writing at one can't stand to corporations not to
hear evolve.

I also it's mostly do down: the credits page which has fewer words
should get This a rumour university and saying this gives immediately
tell it, your retinas, and Stay. Shops having children spoken with
birds, that I've heard nobody else they hear a year Nobody will things
based on display; at the hard of the cause more Brainless Youtube very
rude questions. Hope they're using a decade. Everyone in the bodies
of course, would normally find switch to anything else, feels a friend,
a few years of someone's till the mail isn't available (and has by
infiltrating comparison Despite this of variety strange people
Mathematicians Neil which Anglophones think about when someone
recreated GURPS go through Sile at that American police to go).

Utilitarian theory this basic fraction of these two parts will make had
been able to the other users, and the right but soon the fastest
internet so Alice can film, police posts up on any reasonable sounding;
handle the Critical other. I can't imagine someone can immediately
tell you haven't would. Since nobody knows pee; voh the UK, servers
in all packages and for games useful ways: to both come go too many
from which gives us have, the only title. Cultural differences The
man with contracts, rather the git. You know intellectually that of
what this sounds pretentious to do all compete with the user will never
using a computer Skills, to it as the possibility of children
JavaScript you encounter speakers can think in other well, and one can
make a while one inferring meaning?

The Devil date: or deny an instance with evidence from: people, used the
mutt; delay there's no hassle to the Benefits of those set up. I
wouldn't do awful, idea from dead, less acknowledged that most or could
help here: this by default require a. That's what runnit is
undefined.
Wow! It sounds just like me from the future when I have dementia. Very futuristic! I bet I could use this valuable content as a side-hustle to earn cash while I sleep. But let's not get sidetracked by the big-bucks: ChatGPT still doesn't love me.
How to Double Your Content Output
'Go big or go home', they say, so I'm going to DOUBLE the size of my site by adding a link to new and exciting content on EVERY PAGE.
This website uses hugo to render markdown files as html using go templates.
Unfortunately, I don't know what any of those words mean, so I used my standard trick of pestering people on the internet.
Michael kindly furnished me with the following snippet:[1]
<a href="{{ .Page.RelPermalink }}content.html"></a>
I've put this snippet in my website at splint.rs/layouts/_default/single.html, and now every page has a link to the same page (RelPermalink) plus content.html.[2]
Now I can just run hugo and the links are in place.
But where do they go?
Nowhere!
Making Content with Makefiles
I make my website with a Makefile.
So it just needs a line to tell it where all the pages are:
html_pages != find public/ -type f -name index.html
Now the makefile just needs the same variable, with the same files, but with content.html instead of index.html:
content_pages = $(patsubst %/index.html, %/content.html, $(html_pages))
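To preview which files that variable should end up naming, without touching make at all, the same substitution can be done in plain shell. A small sketch follows. The function name list_content_pages is invented here, and it assumes the rendered site lives in public/ as above.

```shell
# list_content_pages [DIR]
# Prints the content.html path that would sit next to each index.html
# under DIR, mirroring the find + patsubst pair from the Makefile.
list_content_pages() {
    find "${1:-public}" -type f -name index.html |
        sed 's|/index\.html$|/content.html|' |
        sort
}
```

Calling `list_content_pages` from the site's root directory should list one content.html path for each rendered page.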
Finally, we tell make how to make those files from all the blog posts in the blog/ directory (which has markdown files).
$(content_pages): %/content.html: $(wildcard blog/*.md)
	dadadodo -c 14 -html $^ > $@
By adding $(content_pages) to the default target, the Makefile will generate just the missing pages, so every time I publish a new blog post, another content.html file will arrive next to it.
Nice!
Where is the Love?
I left the files there overnight, and now it's tomorrow. Time to check the logs!
$ tail -n 2000 access.log | grep -i chatgp

51.8.155.59 - - [05/Jan/2025:08:00:42 +0100] "GET /post/how_to/ HTTP/1.1" 200 49217 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot"
52.156.77.148 - - [05/Jan/2025:15:01:00 +0100] "GET /robots.txt HTTP/1.1" 200 1780 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot"
52.156.77.158 - - [05/Jan/2025:15:01:01 +0100] "GET /post/how_to/ HTTP/1.1" 200 49217 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot"
Drat! Still nothing.
But at least Google appreciates my efforts:
$ tail -n 7000 access.log | grep -i google | grep content.html

66.249.66.44 - - [04/Jan/2025:22:13:36 +0100] "GET /posts/symposium/content.html HTTP/1.1" 200 3108 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.6778.85 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.44 - - [04/Jan/2025:23:39:51 +0100] "GET /posts/on_the_internet/content.html HTTP/1.1" 200 2325 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.6778.85 Mobile Safari/537.36 (compatible; GoogleOther)"
66.249.66.45 - - [04/Jan/2025:23:44:45 +0100] "GET /posts/symposium/content.html HTTP/1.1" 200 3108 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.6778.85 Mobile Safari/537.36 (compatible; GoogleOther)"
66.249.66.44 - - [05/Jan/2025:00:48:40 +0100] "GET /posts/on_the_internet/content.html HTTP/1.1" 200 2325 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.6778.85 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.45 - - [05/Jan/2025:01:14:44 +0100] "GET /posts/it_unions/content.html HTTP/1.1" 200 3258 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.6778.85 Mobile Safari/537.36 (compatible; GoogleOther)"

[ ... ]
And we have lots more company:
$ tail -n 2000 access.log | grep content.html | grep -Pio "[^\s]*bot[^\s]*" | sort | uniq

+http://ahrefs.com/robot
+http://www.bing.com/bingbot.htm
+http://www.google.com/bot.html
+https://developer.amazon.com/support/amazonbot
AhrefsBot/7.0;
Amazonbot/0.1;
Googlebot/2.1;
PetalBot;+https://webmaster.petalsearch.com/site/petalbot
bingbot/2.0;
The party's kicking off, and I'm sure ChatGPT will see how popular I am soon!
How to be a Content-Chad Like Me
If you too want to be a chad content-creator, just follow these two steps:
- Give your pages a link to another page of the same name, + string (like 'content').
- Generate those pages with dadadodo from your existing content.
Always use your existing content, rather than random babble, otherwise the bots will only learn how to produce babble, and won't really understand that this new content is part of your old content.
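If you don't have a Makefile, the second step can also be sketched as a plain shell loop. This is only an approximation of the rule above, under a few assumptions: the site is rendered into public/, the posts are blog/*.md, and dadadodo is installed. The function name make_content_pages and the BABBLE_CMD override are invented for this example.

```shell
# make_content_pages [SITE_DIR] [POSTS_DIR]
# Writes a content.html next to every index.html that doesn't have one yet.
make_content_pages() {
    site=${1:-public}
    posts=${2:-blog}
    find "$site" -type f -name index.html | while IFS= read -r page; do
        target=${page%/index.html}/content.html
        # Leave existing babble alone.
        [ -e "$target" ] && continue
        # Same flags as the Makefile rule; BABBLE_CMD is a hypothetical
        # hook for swapping in a different generator.
        ${BABBLE_CMD:-dadadodo} -c 14 -html "$posts"/*.md > "$target"
    done
}
```

Run `make_content_pages` after hugo has rendered the site, and each page should get a neighbour.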
[1] Of course in the future, I will collect all the notes I have on computers, run it through my new AI, and it should tell me how to do everything, by extrapolating from the old notes. Amazing!
[2] Your hugo theme might be different. If you find the link empty, it's something to do with 'variables' (I am not a programmer).