
Billions of Dollars, Millions of Souls, and the Acropolis: the Sobering Math Behind Meta’s Staggering Book Heist

  • Writer: Eric Anderson
  • Apr 18
  • 13 min read

Updated: Apr 19



 

Like many who read and write books, I was gutted when I learned in Alex Reisner’s Atlantic article, “The Unbelievable Scale of AI’s Pirated-Books Problem,” that Meta had stolen 7.5 million already-stolen books and 81 million already-stolen articles to train Llama, its AI model.  I’m not a famous writer, none of my books is published, and when I checked the pirating website LibGen, nothing I’ve ever written and published was there.  But I am well aware of the labor that goes into storytelling, and the value of that labor’s fruits.  On behalf of all those whose work is and will someday be pirated and used for AI, I felt deeply violated, as if Meta were trying to forcibly render me and anyone who values the process and the product of human expression obsolete, literally stealing our words and using them against us.


This was bad.  But how bad?  


“Move fast and break things” is the original motto of Meta founder and CEO Mark Zuckerberg, and he certainly stole so many books, so fast, that he broke this writer’s spirit for a good 36 hours.  I slipped into a funk where I felt unmotivated to dig into my Work-in-Progress (WIP) novel, which had so far given me so much joy, because I knew its inevitable destination in a billionaire’s chatbot.  I dug myself out by working on my book anyway.  The process is healing.  The pages I wrote in the wake of the theft felt satisfyingly rebellious in their sluggish, zigzaggy, one-word-forward-and-two-words-back halting humanness.  But I still lost a day-and-a-half.  Of progress, sure, which might eventually turn into readers and money.  But also of the maddeningly drudgerous fun of the creative effort.  The crime, to my count, now totaled 7,500,000 books, 81,000,000 articles, and 36 hours on my WIP. 


What else did Meta (Llama), Microsoft (Copilot), OpenAI (ChatGPT), Alphabet (Google’s Gemini), and others in the AI arms race steal?


Narrowing in on just the books that Meta stole: to size up the crime, in money, time, diminished expected value and incentives for human authors, enhanced expected value and incentives for AI competitors to do the same thing, or other units, I figured we need to have some sense of the value of what was stolen.  Some theft is relatively easy to quantify.  If bank robbers get away with ten money sacks, each of which has $750,000, it’s a $7,500,000 robbery.  How do we measure the value of 7,500,000 books?  How do we measure the value of one book?  The Werewolf of Fever Swamp by R. L. Stine, a favorite Goosebumps book of mine, may have cost me a few weeks of piano practice earnings in the 1990s (I made $0.25 for every 20 minutes practicing piano, a rate that doubled if I practiced before school), but I’m not sure that really captures the value.  In the 1st and 2nd grades, having the latest literary scare contraband was big social currency, especially because the school librarians made Goosebumps books off-limits to all but 5th or 6th graders.  Nor did the book’s value depreciate after childhood.  Nearly three decades later, the mother of one of my elementary school friends said that, before her son and I started hanging out, he didn’t really like reading.  Without The Werewolf of Fever Swamp, The Haunted Mask, The Horror at Camp Jellyjam, and many others, I was at risk of dismissing books as way more boring than Sega Genesis myself.  Certainly, a love of reading contributed to our academic development, and thus the contributions, economic and otherwise, we make to society.  The point is, the ripples of influence from a book travel ever outward across time and currencies.


AI only further complicates the task of quantifying the value of a book.  How much better will Llama be now that it has gobbled up Werewolf and the hundreds of other Fear Street and Goosebumps books on LibGen?  How valuable is a better AI?  With the applications and market value of AI still in flux, how can we know, or even roughly estimate, how valuable the stolen IP loot is?


Nor is divining the value of the stolen books enough to size up the crime.  We also need to know when the crime is over.  Our bank robbers might succeed once and get away with $7.5 million.  They might succeed twice and get away with another $7.5 million, for a grand total of $15 million.  But eventually, when they are caught, or quit while ahead, we can total up how much they took.  With generative AI, the crime is never over.  Every time someone uses Llama, the stolen books are stolen from again.


Sizing up Meta’s theft is no simple task.  Yet merely decrying the theft as “bad” and stopping there is to allow the crime scene to remain in the dark.


Our forensic tools are simple: estimation, analogy, reasoning, forecasting, and, most importantly, courage.  Although the villains here aren't friends who turn into werewolves, camp counselors who turn into lackeys for a subterranean monster sweating snails, or possessed dummies who act out then retreat into a lifeless heap of wooden limbs, tech barons who turn into book bandits get away with it by running the same playbook: hiding from accountability.


And so, even now, the cutesy catchphrase of Goosebumps rings with heavy, eternal relevance: “Reader beware, you're in for a scare!”

 


Theft Estimate using AI Licensing Fees


The most straightforward way to quantify just how much Meta stole in the units that most companies most listen to—dollars—is to tally up the AI licensing fees for each book.  Unfortunately, these numbers do not exist. 


Licensing rights for AI is a new frontier.  No one knows how much the rights for a certain book are worth, or even if they’re for sale.  There hasn’t been a whole lot of precedent because Meta, Microsoft, and others don't have much incentive to figure it out, given they benefit by opting for “free” as long as they can get away with it.  


As a rough estimate, in late 2024, Microsoft and HarperCollins signed a deal in which Microsoft will pay HC $5,000 per nonfiction book (split between HC and the author) to train its AI model.  If all the 7,500,000 books got HC offers (they wouldn’t), and if all authors consented (they wouldn’t), Meta would face a bill of $37,500,000,000.  That’s $37.5 billion.  Even if we calibrate the estimates to reality—only a small percentage of the books Meta stole are published by HC, or any major publishing house, so most books would get lesser offers—and run a hypothetical Meta licensing offer of $1,000 per text, it’s still billions of stolen IP loot. 
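
For anyone who wants to check the multiplication, here is a minimal sketch of the licensing estimate.  The figures come straight from the paragraph above; the variable and function names are mine, and the flat per-book rate is a simplifying assumption, not how any real licensing deal would be structured.

```python
# Figures from the essay; names are illustrative only.
BOOKS_TAKEN = 7_500_000
HC_RATE_USD = 5_000            # per-book fee reported for the Microsoft/HarperCollins deal
HYPOTHETICAL_RATE_USD = 1_000  # the essay's lower, hypothetical offer


def licensing_bill(books: int, fee_per_book: int) -> int:
    """Total bill if every book were licensed at one flat per-book fee."""
    return books * fee_per_book


high_estimate = licensing_bill(BOOKS_TAKEN, HC_RATE_USD)
low_estimate = licensing_bill(BOOKS_TAKEN, HYPOTHETICAL_RATE_USD)

print(f"${high_estimate:,}")  # $37,500,000,000
print(f"${low_estimate:,}")   # $7,500,000,000
```

Even the deliberately low rate lands in the billions, which is the only point the estimate needs to make.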


Problem is, some things aren’t for sale.  There’s the issue of consent.  No one knows who will accept what amount of money and terms to license which books to which AI models.  Reality can depart from survey results, but, in a survey by The Authors Guild asking whether writers were willing to license their works to AI, roughly one-third said yes, and two-thirds said no or unsure. 


If a book is not for sale to AI, what is it worth? 


Worth stealing, was Meta’s answer.


For our purposes, here, we’ll need to try on a unit other than money.  Clearly, an unwillingness to sell means some authors are valuing something about their books more than money.  Call it heart, self, spirit, singular purpose, one-of-a-kindness, raw organic creation, raison d’être.  In Skin in the Game, Nassim Nicholas Taleb says artists have “soul in the game,” which is as good a catchall term as any for that which we refuse to sell.  The math is easy, but the answer is hard to swallow: 1/3 to 2/3 of 7.5 million means Meta stole between 2,500,000 and 5,000,000 souls.


(A note on the bookkeeping of souls: others may quibble, but I contend that 5 million souls is more accurate than 5 million pieces of soul.  Books are not horcruxes.  I don't want to speak for everyone who's ever written a book, but from my own reading and writing life, my sense is that we are not titrating out our limited souls.  In other words/numbers, we're not just getting half of Harper Lee's Soul Per Book (SPB) because she published 2 books, or a fifth of R.F. Kuang’s SPB because she has, as of writing, published 5 books.  We put all of our soul into every book.  Into every chapter.  Into every paragraph.  Into every sentence.  Published or unpublished, long or short, mainstream or avant-garde.  The advantage of this method of soul bookkeeping is that it truthfully reflects how, for writers, the process is meaningful and the product is singular to who they are, something no one else can reproduce, although AI is trying.  Unfortunately, the disadvantage of this method is that it makes it seem that Meta is simply stealing an ever-renewable resource.  More on the tragedy of the commons later.)


What about the authors of those other 2.5 million to 5 million books who might entertain offers, but instead were stolen from long before they ever got the chance to see an offer?  First, just because they might sell, given the right conditions, doesn’t mean that there’s less SPB in their work.  They may be resigned to a reality that, if it’s between getting their books pirated and stolen for AI, or getting a little bit of money for it, might as well get a little bit of money—a “selling your soul is better than letting your soul get stolen” logic.  Some would accept the bottom-of-the-barrel, some would accept only top dollar, some would see a $1,000, or $5,000, or $50,000 offer for their souls and join the refuse-to-sell camp (and vice versa).  I imagine more than a few writers would accept only a dollar amount similar to what writer/comedian Daniel Kibblesmith asked for: enough money to never have to work again, “since that's the end goal of this technology.”  For four decades of life at the average American’s annual cost of living ($77,280), that would be around $3.1 million.  Multiplied across, let’s say, one million authors, Meta stole $3,100,000,000,000.  That’s $3 trillion. Over twice Meta’s market value.
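
The never-work-again figure can be retraced in a few lines.  This is a rough sketch using only the numbers in the paragraph above (the cost-of-living figure, the four decades, and the "let's say" one million authors); the names are mine, and nothing here adjusts for inflation or taxes.

```python
# Figures from the essay; names are illustrative only.
ANNUAL_COST_OF_LIVING_USD = 77_280  # the essay's average annual cost of living
YEARS_WITHOUT_WORK = 40             # "four decades of life"
AUTHORS = 1_000_000                 # the essay's round "let's say" figure

per_author_usd = ANNUAL_COST_OF_LIVING_USD * YEARS_WITHOUT_WORK
total_usd = per_author_usd * AUTHORS

print(f"${per_author_usd:,} per author")  # $3,091,200 per author
print(f"${total_usd:,} in total")         # $3,091,200,000,000 in total
```

Rounding the per-author figure to $3.1 million before multiplying gives the $3.1 trillion quoted above; the unrounded total differs by less than one percent.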


The truth is, it takes so much to write a book worth stealing, and selling it to help a machine get closer to imitating it feels so anathema to the spirit that drives authors to write in the first place, that Microsoft’s $5,000 offer feels cheap—$5/hour if it took 1,000 hours to write the book.  Actually, $2.50/hour for the author, since the publisher gets half.  But, if we run an estimated value of $5,000 per book, over 2.5 million to 5 million books, we arrive at a score of $12,500,000,000 to $25,000,000,000.
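
Here is a small sketch of the hourly-rate and book-pool arithmetic above.  It assumes, as the essay does, 1,000 hours per book, an even split between publisher and author, and a pool of one-third to two-thirds of the 7.5 million books; the names are mine.

```python
# Figures from the essay; names are illustrative only.
BOOKS_TAKEN = 7_500_000
OFFER_USD = 5_000
HOURS_PER_BOOK = 1_000  # the essay's assumed writing time
AUTHOR_SHARE = 0.5      # the publisher keeps the other half

hourly_rate = OFFER_USD / HOURS_PER_BOOK
author_hourly_rate = hourly_rate * AUTHOR_SHARE

# One-third and two-thirds of the stolen books.
small_pool = BOOKS_TAKEN // 3
large_pool = 2 * BOOKS_TAKEN // 3

low_total = small_pool * OFFER_USD
high_total = large_pool * OFFER_USD

print(f"${hourly_rate:.2f}/hour, ${author_hourly_rate:.2f}/hour to the author")
# $5.00/hour, $2.50/hour to the author
print(f"${low_total:,} to ${high_total:,}")
# $12,500,000,000 to $25,000,000,000
```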


So, did Meta steal billions of dollars or trillions of dollars?  Depends on who you ask.  What is clear is that they stole millions of souls.  They didn’t even give writers a chance to sell them.  Even the devil offers better terms than that.

 


Theft Estimate Using the Time it Takes to Write a Book


Legend holds that Agatha Christie wrote Absent in the Spring in three days, calling in sick from work and sleeping for 24 hours after she was done.  Before The Godfather, Mario Puzo spent a decade writing The Fortunate Pilgrim.  R. L. Stine wrote one Goosebumps book per month, with just a single bent left index finger.  Isle McElroy noted in their essay, A Name is Only Gender Neutral if No One has Ever Used it to Gender You: Changing My Name After Publishing My Debut Novel, that The Atmospherians “took about five years to write; there were plenty of rough stretches.”  Accounts and accounting methods will report varying creative durations for Robert Louis Stevenson’s Strange Case of Dr Jekyll and Mr Hyde, but as far as artistic origin stories go, it's a good one, full of burnt manuscripts, fevered illness and/or drug-fueled hazes, over a very Victorian three to six days.


Anecdata reveals that there is no “too fast,” “too slow,” or “just right” amount of time that spans final first word to final last word of that umpteenth draft that has been written, rewritten, rerewritten, rererewritten, etc., edited, copyedited, and at last, “finished,” as much as any piece of art can ever be finished, which Da Vinci thought was never.  (Thanks and credit to my writer friend Duke Craig for letting me know of the quote attributed to Da Vinci, “art is never finished, only abandoned,” when I was mired in rerererererererererewrites and debating putting another “re” or two on top.)  And who bothers punching a time card?  A joy of the creative process is losing yourself in it, not feeling the passage of each second until quitting time.  


But, it’s fair to say that it takes a lot more time to write a book than it does to read a book, and reading an 80,000-word book at average pace (200-250 wpm for US adults) takes 5-7 hours.  How much more?  To my knowledge, scientists have never randomly selected authors and put them under observation to arrive at a number, but perhaps Colson Whitehead’s description of his chair in The Noble Hustle: Poker, Beef Jerky, and Death, is the next best thing: “My magnificent ergonomic chair, the steadfast galleon I had sailed through books and books, had finally sprung a leak. After 10 years, the webbing of the seat had given way, so I stuffed a throw pillow in there when I had to work” (83). 
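
The reading-time range above is easy to verify.  A minimal sketch, assuming a constant reading pace with no breaks; the function name is mine.

```python
# Figures from the essay; names are illustrative only.
BOOK_LENGTH_WORDS = 80_000


def reading_hours(words: int, words_per_minute: int) -> float:
    """Hours needed to read `words` at a steady pace."""
    return words / words_per_minute / 60


slow_reader = reading_hours(BOOK_LENGTH_WORDS, 200)
fast_reader = reading_hours(BOOK_LENGTH_WORDS, 250)

print(f"{fast_reader:.1f} to {slow_reader:.1f} hours")  # 5.3 to 6.7 hours
```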


With only the data of Colson Whitehead’s chair and intermittent self-reported time logging from creators, it’s tempting to abandon the task of sizing up the crime, to deem it a fool’s errand in quantifying the unquantifiable, and thus let ourselves off the hook.


But then we let Meta off the hook, too. 


The truth is, an hour of every human’s life has value, monetary and otherwise, and there is a number of active creative output hours.  Our information is incomplete and our instrumentation isn’t sensitive enough to divine it perfectly, but we can run some estimates.


Let’s say, on average, each of the 7.5 million texts took 1,000 hours of human labor, defined as active creative output, or literally looking at the WIP manuscript and manipulating it or planning the next manipulation of it, start to finish.  Seven point five million books multiplied by 1,000 hours per book equals 7,500,000,000 hours.  That’s 7.5 billion hours.


As a proxy for the value of those hours, we can convert them into another unit: lifetimes.  Using the average life expectancy worldwide of 72 years, Meta stole about 12,000 lifetimes.  Lifetimes that the writers could’ve spent with their kids and partners, making money, accruing advanced degrees, traveling the world… but instead spent writing.

Back to dollars.  In the United States, the Value of a Statistical Life (VSL) is $10 million.  By that metric, Meta stole $120,000,000,000.  That’s $120 billion.


If a book takes, on average, only 1,000 hours.
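
The active-output estimate can be retraced step by step.  This is only a sketch under the essay's own assumptions (1,000 hours per book, a 72-year life expectancy, a $10 million VSL); the names are mine, and a year is treated as 365 days with leap days ignored.

```python
# Figures from the essay; names are illustrative only.
BOOKS_TAKEN = 7_500_000
HOURS_PER_BOOK = 1_000
LIFE_EXPECTANCY_YEARS = 72
VSL_USD = 10_000_000
HOURS_PER_YEAR = 365 * 24  # leap days ignored

total_hours = BOOKS_TAKEN * HOURS_PER_BOOK
hours_per_lifetime = LIFE_EXPECTANCY_YEARS * HOURS_PER_YEAR
lifetimes = total_hours / hours_per_lifetime

# The essay rounds to the nearest thousand lifetimes before pricing them.
rounded_lifetimes = int(round(lifetimes, -3))
dollars = rounded_lifetimes * VSL_USD

print(f"{total_hours:,} hours")         # 7,500,000,000 hours
print(f"{lifetimes:,.0f} lifetimes")    # 11,891 lifetimes
print(f"${dollars:,}")                  # $120,000,000,000
```

The unrounded figure is a little under 12,000 lifetimes, so the $120 billion total is a round-number estimate rather than an exact product.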


As anyone who has created anything can attest, the time spent on active creative output is not the only time spent creating.  What about that solution to a plot snag you got just as you were drifting off to sleep?  Daydreaming in class?  Stuck in traffic?  What about when you don’t think you’re thinking about your book at all but the subconscious is hard at work?


At her book launch event for The House that Horror Built, Christina Henry said her advice for writers used to be to do lots of writing, but now is to go for a walk or a bike ride.  In Larry McMurtry: A Life, biographer Tracy Daugherty recounts that McMurtry wrote religiously for 90 minutes every morning, and yet it's clear the author wasn’t spending the other 22.5 hours creatively idle.  At a court hearing over his suspended driver’s license, McMurtry explained his tendency to speed was the result of writing in his head and getting excited (233).


By this metric, calculating the rough number of hours a book takes is simple: take your age, multiply it by 365, and then by 24.


So that we are not double or triple-counting, 50x-counting in Gwendolyn Brooks's case, or 200x-counting in R. L. Stine’s case, let’s say that, on average, an author had 5 titles that were pirated on LibGen and stolen by Meta.  That brings us to 1.5 million authors.  It tends to take a while to build the skillset to write publishable books, so let’s say that the average author is 50 years old.  At 50 years of 365 days and 24 hours each, that is 438,000 hours per author, so the passive creative effort stolen totals 657,000,000,000 hours.  That's 657 billion hours.


Back to lifetimes.  Using the average worldwide life expectancy of 72 years, 657 billion hours of passive creative effort converts to about 1 million lifetimes.  Back to dollars.  Using the VSL of $10 million, 1 million lifetimes is worth $10,000,000,000,000.  That’s $10 trillion. 


In seconds, Meta stole a book bounty about 7 times more valuable than it is.
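
The passive-effort estimate follows the same chain of conversions.  Again, a sketch under the essay's assumptions only (5 titles per author, an average author age of 50, a 72-year life expectancy, a $10 million VSL); the names are mine, and leap days are ignored.

```python
# Figures from the essay; names are illustrative only.
BOOKS_TAKEN = 7_500_000
TITLES_PER_AUTHOR = 5
AVERAGE_AUTHOR_AGE_YEARS = 50
LIFE_EXPECTANCY_YEARS = 72
VSL_USD = 10_000_000
HOURS_PER_YEAR = 365 * 24  # leap days ignored

authors = BOOKS_TAKEN // TITLES_PER_AUTHOR
hours_lived_per_author = AVERAGE_AUTHOR_AGE_YEARS * HOURS_PER_YEAR
passive_hours = authors * hours_lived_per_author

lifetimes = passive_hours / (LIFE_EXPECTANCY_YEARS * HOURS_PER_YEAR)

# The essay rounds to "about 1 million lifetimes" before pricing them.
rounded_lifetimes = int(round(lifetimes, -6))
dollars = rounded_lifetimes * VSL_USD

print(f"{authors:,} authors")          # 1,500,000 authors
print(f"{passive_hours:,} hours")      # 657,000,000,000 hours
print(f"{lifetimes:,.0f} lifetimes")   # 1,041,667 lifetimes
print(f"${dollars:,}")                 # $10,000,000,000,000
```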


 

Why stop at the Elgin Marbles when you can steal the whole Acropolis?


I wrote this essay over 10 days of travel through Greece.  Near the end of the trip, I visited the Acropolis.  There, I saw on the Parthenon more bare stone rectangles than sculpted marble friezes.  Here was the scene of another instance of stolen art, this one perpetrated by a British Lord in the Colonial Era, when filling the coffers of England with the artifacts of other cultures was as pressing a venture as filling the servers of chatbots with books, music, code, images, and videos.  He and his colonizing cronies even successfully rebranded the stolen marble friezes, one depicting chariot riders on display at the Acropolis Museum, as Elgin’s Marbles, as if in prying them off the Parthenon with a crowbar, he had carved them himself.


We have had over 200 years to return the marbles to the Acropolis, but as of today, the Marbles Stolen by Elgin reside still in the Museum of British Colonial Plunder.  When Athens has asked for the marbles back, London has provided excuses as if they’re doing the Greeks a favor, like, “more people go to the British Museum than the Acropolis” or “we take better care of them than you do” or “we don't want to set a precedent for other museums to have to return their cultural treasures” or “they provide a better narrative here” and on and on. 


Some of the parallels between Elgin and Zuckerberg are striking:


  • Both are members of the highest caste of their time and place.

  • Both steal art to further aggrandize themselves.

  • Both justify their theft with invocations to the greater good.

  • Both interpret art as the commons and break off a piece, or the whole thing in Meta’s case.


And, in viewing the marbles, as I did—both times I visited the Museum of British Colonial Plunder—or in using Intelligence from Artists Who Are Neither Compensated Nor Credited—as I have, much more than twice that I’m aware of, much more than that that I’m unaware of, every time I ape an AI-gen pithy phrase or use a Canva slide at work—we are incentivizing England, Meta, and the next thieves who sell us on the goodness of their theft by showing us benefits that are immediate, tangible, and enjoyed by us, and harm that is distant, existential, and borne by others. 


But it’s worthwhile to know the costs.  As of writing, focusing just on books, Meta stole:


  • 7.5 million books, up to 5 million of which would never be for sale in the first place

  • Between $7,500,000,000 and $37,500,000,000 in potential AI licensing fees

  • $120,000,000,000 in active creative output

  • 5,000,000 souls in active creative output

  • $10,000,000,000,000 in passive creative output

  • And one-and-a-half lost days on my WIP

 


The Great Writerly Reef


As I reflected more on the theft, the image that kept coming to my mind was of a dying coral reef.  Once teeming with vibrant marine life, the decline begins when, individually, “I” take a little, but collectively, “we” take a lot.  A fisherman poaches a few extra fish during their breeding season.  A tourist snaps off one little piece of reef as a keepsake.  Small transgressions that don’t make anyone feel all that evil, and that the reef could weather, if they stopped there.  And they do, for the fisherman, the tourist.  But they don’t for the reef. The reef’s bookkeeping totals up all those tiny transgressions across countless fishermen, countless tourists…


For millennia, authors have created a living and vital coral reef of humanity’s written record.  It is a labor of love, tedium, hope, despair, curiosity, humanness.  Many contribute for little to no payment.  But compensation, in money and in credit, is an incentive, too.  Whitehead in The Noble Hustle: “Like many humans, writers need money for food and travel” (124).


Removing or limiting these incentives won’t kill off the reef all at once, and it will never die altogether.  But just as I felt such despondency for 36 hours that, of the joys of authorship (the process, the money, the credit), only the first seemed alive and well, all of us who use AI are diminishing the reef, one little snapped-off piece at a time.

 

 

 

 

 
 
 
