
+++
title = "On Mark Zuckerberg's recent claims"
lastmod = 2024-09-28T00:30:38+02:00
tags = ["copyright", "zuckerberg"]
categories = ["tech", "llm"]
draft = false
meta = true
type = "list"
[menu]
[menu.posts]
weight = 3001
identifier = "on-mark-zuckerberg-s-recent-claims"
+++

The other day, a post on the Fediverse caught my attention, linking to this article from The Verge. I thought I'd make some things clear.

This isn't the only website I run; in fact, it's not even the largest. I have written books' worth of commentary and articles on a variety of topics, many of which deal with difficult, labyrinthine, and opaque matters.

The time I spent writing these wasn't free. The time I spent researching wasn't free. And yet, as tradition demands, I put very few restrictions on the use of my writing; my only requirements for printing, copying, and distribution are that:

  1. If you find errors, let me know so I can fix them.
  2. The text must be whole, with all the notes and edits and footnotes and sidenotes and margin-notes.
  3. You're not allowed to charge for copies. At all. If you want to send someone a copy, cover the cost yourself.
  4. You're not allowed to paywall it; reproductions must remain free to access, and be complete.
  5. You're not to use my writing for machine learning or training 'ai' models of any sort.

I don't think that's unfair; if anything, some of the articles I've written have taken years to put together.

The idea that the value of my specific content is 'overestimated', to the point where I should simply give up ownership of it, is silly.

I'm not arguing that I should be paid for their use of my work. I'm stating that my work should remain freely accessible, and in its entirety.

I have seen enough silliness from LLMs to know that they can't be trusted to faithfully reproduce complex writing, and the last thing I want is to be credited with claims I never made.

In short: Zucc, my writing doesn't exist to fuel climate change by training massive data matrices that can't even answer simple questions reliably.

Wouldn't it be so much easier to use tools that algorithmically, predictably, reliably solve problems, instead of... whatever this is?