Blog

  • Tracking Love and Hate in Modern Fandoms, Part Two: Star Trek: Starfleet Academy

Ironically, I found this image on a subreddit complaining that this poster doesn’t look Star Trek enough.

    What I Learned Trying to Build a Scalable Dashboard Framework

    The goal of this phase of RewindOS was straightforward: design a scalable ingestion layer capable of tracking fandom signals in near-real time. In practice, that meant building a lightweight FastAPI service, backing it with PostgreSQL, and deploying it on Railway to validate whether the architecture could support future dashboards at low operational cost.

    As a concrete test case, I wanted to monitor Reddit discussions around Star Trek: Starfleet Academy—specifically post velocity, controversy spikes, and narrative drift across subreddits. The idea was to ingest posts and metadata, normalize them into Postgres, and surface them as cultural “signals” alongside IMDb and news data.
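As a rough sketch of that normalization step (the field names below are illustrative, not the actual RewindOS schema), the ingestion layer can flatten a raw Reddit post payload into a row ready for Postgres:

```python
from datetime import datetime, timezone

def normalize_post(raw: dict) -> dict:
    """Flatten a raw Reddit post payload into a flat row for Postgres.

    Field names are illustrative; the real schema may differ.
    """
    return {
        "post_id": raw["id"],
        "subreddit": raw["subreddit"].lower(),
        "title": raw["title"].strip(),
        "num_comments": int(raw.get("num_comments", 0)),
        "score": int(raw.get("score", 0)),
        "created_utc": datetime.fromtimestamp(
            raw["created_utc"], tz=timezone.utc
        ).isoformat(),
    }

row = normalize_post({
    "id": "abc123",
    "subreddit": "StarTrek",
    "title": "  Starfleet Academy premiere thread ",
    "num_comments": 42,
    "score": 310,
    "created_utc": 1732900000,
})
```

Keeping this step pure (no I/O) makes it easy to test before any FastAPI route or Postgres insert ever runs.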

    What I didn’t fully appreciate—despite knowing it abstractly—was just how aggressively Reddit now enforces its platform boundaries.

    Reddit’s API and the End of Casual Scraping

    Reddit has effectively closed the door on nearly all casual or semi-legitimate scraping workflows. API access is tightly gated, application approval is slow and opaque, and even read-only endpoints are heavily rate-limited or restricted by OAuth scope. Anonymous access, once tolerated at low volumes, is now unreliable at best.

    Beyond the API, Reddit’s cybersecurity posture is formidable:

    • Bot detection at the edge (behavioral + fingerprinting)
    • Aggressive IP and ASN-based throttling
    • User-agent and request-pattern heuristics
    • Inconsistent but intentional JSON endpoint breakage

In short: the platform is designed to detect intent, not just volume. Attempting to build a scraper to monitor Starfleet Academy posts wasn’t just blocked—it surfaced how mature Reddit’s anti-extraction infrastructure has become. I knew this intellectually, but putting it to the test made the reality clear: Reddit data is no longer a “free signal layer” for independent researchers.

    Infrastructure Takeaways: Railway and PostgreSQL

    On the infrastructure side, the experiment was still valuable. Railway proved viable for rapid prototyping: fast deploys, sane defaults, and painless Postgres provisioning. It’s not magic, but for early-stage dashboards and internal tooling, it removes a lot of friction.

    PostgreSQL held up exactly as expected—flexible enough to store raw ingestion data, structured metrics, and future-proofed schemas for signals that don’t fully exist yet. The limitation wasn’t the database or the API layer; it was upstream access to the data itself.

    The Bigger Lesson

If RewindOS is going to measure cultural velocity, controversy, and fandom health, it can simply revert to its original dashboards built with Python and JSON. Though not real time, they still paint a clear picture of what is going on within the fandom itself, which is all we really care about.

    What This First Pass Shows

    This first iteration of the Starfleet Academy trackers focuses on where conversation actually happens and what kind of attention the show receives, rather than trying to simulate a live social feed.

    A few clear patterns emerge immediately:

    1. r/television captures cultural moments, not fandom

    The r/television tracker surfaces:

    • Trailers and first-look teasers
• Series premiere threads (official discussions again did not appear on r/television until the second week)

    Engagement here is highly concentrated:

    • One or two posts (trailers or premiere discussions) account for a large share of total comments.
• Topical discussion related to the show is sparse and fragmented

This tells us r/television is best read as a general-audience and industry sentiment signal, not a place where sustained episode-by-episode engagement lives. That said, the moderators do seem to be doing a good job of keeping the sub from being bombarded by the larger cultural divisions within the Star Trek community around this show, as I discuss below.

    The absence of strong megathreads is itself a useful signal: it shows where not to look for fandom intensity.

    2. Adding Star Trek–specific subreddits changes the signal shape

    Once r/startrek and r/DaystromInstitute are included:

    • Overall engagement increases substantially
    • Episode-level discussion becomes more structured
    • Conversations shift from “Is this good?” to “How does this fit into canon / science / continuity?”

    This confirms the need for separate dashboards rather than one blended feed.


What This Tracker Looks At

    Star Trek Subreddits Tracker Focus

    The second tracker (r/startrek + r/DaystromInstitute) is intentionally different.

    It shows:

    • Sustained episode-to-episode discussion
    • Canon, timeline, and science-based analysis
    • Deep engagement that doesn’t show up in mainstream spaces

    r/DaystromInstitute, in particular, acts as a high-signal, low-noise environment:

    • Fewer posts
    • More thoughtful, analytical comments
    • Less reactionary churn

    This tracker is closer to a fandom depth and intellectual engagement index.

In general, interest in the cultural hate of SFA, at least on Reddit, does appear to have decreased, while comments and engagement are ticking up.

The show, while currently sitting at a dismal 4.3 IMDb rating, gets more engagement than other one-off streaming shows, and there is some discussion among fans who watched the latest Sisko-themed episode of SFA but claim they will not watch any of the others (unless, I am assuming, there are further ties to Star Trek canon in their eyes).

It remains to be seen if there will be any changes to the Star Trek: Starfleet Academy schedule, but despite the supposedly low rating, this engagement may be what Paramount is looking for after all.

    If you want to see or run the Python script used for this analysis, the full repository is here:

    👉 GitHub:

    https://github.com/jjf3/rewindOS_SFA2_Television_Tracker

    https://github.com/jjf3/rewindOS_sfa_StarTrekSub_Tracker

  • Tracking Love and Hate in Modern Fandoms: Part One — Heated Rivalry

    Why I Built This

    I originally thought tracking a TV show’s fandom growth on Reddit would be straightforward.

    Count subscribers.
    Track users online.
    Check back later and see if the numbers went up or down.

    That assumption turned out to be wrong.

    As I started looking more closely—first casually, then programmatically—it became clear that Reddit no longer exposes fandom size and activity in a way that’s consistent, comparable, or easy to track over time. Subscriber counts, “users online,” and newer UI-only labels often tell different stories depending on where and how you look.

    So instead of trying to force unreliable metrics to behave, I built a small tracker to answer a simpler question:

    If subscriber numbers are increasingly opaque, how does a fandom actually behave when a show is airing?


    The Problem With Traditional Metrics

    On the surface, Reddit still appears to show everything you’d want:

    • member counts,
    • active users,
    • large audience numbers framed as community participation.

    But those numbers are no longer consistent across views. The same subreddit can appear to have wildly different “sizes” depending on whether you’re looking at search results, a community sidebar, or aggregated UI elements that blend multiple related spaces together.

    For hobbyist analysts—or anyone trying to track how a TV show grows as it becomes more popular—this creates a real problem:
    you can’t rely on subscriber counts alone to measure engagement anymore.

    Rather than treating this as a bug to work around, I treated it as a signal that the metric itself had become less useful.


    A Different Approach: Comments as Engagement

    If membership numbers are increasingly abstracted, discussion isn’t.

    People still comment.
    They still argue.
    They still react in real time to episodes, trailers, and announcements.

    So for Heated Rivalry, I built a tracker that focuses on comment activity, not just who clicked “join.”

    The idea was simple:

    • Identify episode discussion threads.
    • Track how their comment counts change over time.
    • Separate episode discussions from trailers and general posts.
    • Treat comments as a proxy for sustained engagement, not passive interest.

    This doesn’t measure “how many people know about the show.”
    It measures how many people are actively participating in it.
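The classification step behind this can be sketched with a couple of title regexes; the patterns and labels below are my illustration, not necessarily the tracker’s exact rules:

```python
import re

# "1x01", "2x10", etc. in a post title marks an episode discussion thread
EPISODE_RE = re.compile(r"\b\d+x\d{2}\b", re.IGNORECASE)
TRAILER_RE = re.compile(r"\btrailer\b", re.IGNORECASE)

def tag_post(title: str) -> str:
    """Bucket a post by title: episode discussion, trailer, or other."""
    if EPISODE_RE.search(title):
        return "episode"
    if TRAILER_RE.search(title):
        return "trailer"
    return "other"

tags = [tag_post(t) for t in [
    "Heated Rivalry - 1x01 - Episode Discussion",
    "Official Trailer: Heated Rivalry",
    "Anyone else watching this show?",
]]
```

Tagging on titles alone is crude, but it is enough to keep episode threads separate from general chatter.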


    What the Tracker Looks At

    For this first case study, the tracker focuses on posts about Heated Rivalry within broader TV discussion spaces:

• r/television episode discussion threads (1x01, 1x02, etc.)
    • The official trailer post and its discussion
    • A small set of other high-engagement posts mentioning the show

    For each post, the tracker records:

    • comment count,
    • post score (net upvotes),
    • timestamp,
    • and whether it’s episode-related or not.

    By running the tracker repeatedly over time, it builds a longitudinal record of how discussion grows—or stalls—after episodes air.
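The longitudinal mechanism can be sketched like this, writing to an in-memory buffer here instead of a real history CSV; the row layout is illustrative:

```python
import csv
import io
from datetime import datetime, timezone

def append_snapshot(buffer, posts):
    """Append one (timestamp, post_id, comments, score) row per post."""
    writer = csv.writer(buffer)
    now = datetime.now(timezone.utc).isoformat()
    for p in posts:
        writer.writerow([now, p["id"], p["num_comments"], p["score"]])

history = io.StringIO()
# First run: the premiere thread has 120 comments
append_snapshot(history, [{"id": "t3_ep1", "num_comments": 120, "score": 450}])
# A later run: the same thread has accumulated more discussion
append_snapshot(history, [{"id": "t3_ep1", "num_comments": 240, "score": 610}])

rows = list(csv.reader(io.StringIO(history.getvalue())))
growth = int(rows[1][2]) - int(rows[0][2])  # comment growth between runs
```

Because each run only appends, the same thread can be diffed across any two snapshots to measure growth or stall.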


    Why Comment Growth Matters

    Comments behave differently than votes or subscribers.

    • Votes are fast and shallow.
    • Subscribers are passive and increasingly obscured.
    • Comments require time, attention, and emotional investment.

    Episode discussion threads, in particular, are useful because they:

    • accumulate comments over days, not minutes,
    • reflect both positive and negative reactions,
    • and reveal whether conversation sustains beyond initial release.

    In practice, this makes comment growth a better indicator of fandom intensity than headline audience numbers.


    What This First Pass Shows

    The Heated Rivalry tracker is a Python script that queries r/television via Reddit’s public JSON search, tags posts as episode discussions (e.g., 1x01/1x02), an official trailer thread, or other mentions, and then captures each post’s num_comments, score, and permalink into CSV outputs.

    On each run it appends those comment counts into a longitudinal comment_history.csv, so the same threads can be measured repeatedly as they accumulate discussion over time.

    In our first pass it successfully separated episode discussion threads from non-episode chatter, identified the trailer thread, and produced a ranked list of a few other high-comment posts for comparison. The resulting plots/dashboard are designed to show whether engagement is concentrated in episode discussions (sustained growth) versus one-off posts (short spikes), using comments as the “true” activity signal rather than inconsistent UI membership numbers.
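The query side of this reduces to building a subreddit-scoped search URL against Reddit’s public JSON listing. The helper below is a sketch; in practice the actual request needs a descriptive User-Agent and generous rate limiting, for the reasons covered in the Starfleet Academy post above:

```python
from urllib.parse import urlencode

def search_url(subreddit: str, query: str, limit: int = 100) -> str:
    """Build a subreddit-scoped public JSON search URL."""
    params = urlencode({
        "q": query,
        "restrict_sr": "on",  # keep results inside this subreddit
        "sort": "new",
        "limit": limit,
    })
    return f"https://www.reddit.com/r/{subreddit}/search.json?{params}"

url = search_url("television", "Heated Rivalry")
```

Separating URL construction from the fetch keeps the testable part of the pipeline free of network calls.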


    What Comes Next

    This post is Part One of a broader series on measuring fandom response when platforms redefine their metrics.

    Next:

    • I’ll expand this approach to other shows,
    • compare episode-to-episode engagement patterns,
    • and apply the same methodology to much larger, more polarized fandoms.

    The goal isn’t to rank fandoms.


    It’s to understand how they actually behave when love, disappointment, and debate all show up in the same comment thread.

    If you want to see or run the Python script used for this analysis, the full repository is here:

    👉 GitHub: https://github.com/jjf3/rewindos_heated_rivalry_tracker

  • Operational Security Notes: Hardening RewindOS After a Hosting Migration

    A brief infrastructure note on reducing attack surface during routine maintenance

    RewindOS is built to prioritize observable behavior and minimal assumptions—not just in cultural analysis, but in infrastructure as well.

During a weekend hosting migration from Porkbun to Rocket.net, undertaken for better stability and control over the site itself, I reviewed authentication traffic patterns after noticing unusual login attempts. Note: this visibility is NOT something every hosting provider offers (namely the cheaper ones), so thanks to Rocket.net. The review revealed repeated automated login attempts originating from non-U.S. bot traffic.

    I used it as an opportunity to apply layered security controls:

    • Reduced attack surface by obscuring default login pages
    • Implemented rate-limiting on failed login attempts
    • Disabled legacy remote-procedure endpoints that are no longer required for site operation

    After applying these changes, automated login traffic ceased, and authentication logs returned to baseline behavior.

This process reinforced a simple principle that applies beyond WordPress: I could have left the site as-is, but a little research and a few much-needed plugins to close outdated loopholes are a must for anyone running their own site.

    RewindOS continues to favor minimal surface area, observable behavior, and incremental hardening over complex or opaque security tooling.

  • Severance and the Math of Cultural Safety

    When Severance used “Baby It’s Cold Outside” in Season 2, Episode 7 (Chikhai Bardo), which aired February 28, 2025, the choice looked controversial only if you froze time in 2018.

If you remember, there was a very brief controversy in 2018, framed around consent, radio pullbacks, and an outsized media-backlash narrative.

At RewindOS we don’t know if the team over at Apple TV refined their data in this structured way, or if the 2018 controversy ever even came up in the writers’ room or the legal department.

With that being said, this project set out to answer a simple question:

    By late February 2025, was this song actually risky to use?

My original thought was no. Not only was it not risky to use, but nobody had even mentioned it after the episode aired. If you sort through the various subreddits run by eagle-eyed Severance fans, who discuss all things Severance in very detailed posts, you will get an idea of where I am going with this, how I got there, and how it can be recreated for just about anything under the RewindOS Cultural Safety Framework that I will be developing to apply to other shows, quotes, actors, or songs.

    How I built this:

    This analysis draws on two independent signals to assess whether “Baby It’s Cold Outside” remained culturally risky by the time Severance aired in 2025.

    1. Google Trends: Controversy Decay

Search interest was tracked for multiple controversy-framed queries related to the song from 2018 through 2025, using manual searches at first and then a larger Python script. The data shows:

    • a clear but small peak during the 2018 radio pullback news cycle
    • rapid decay in the months that followed
    • no secondary spikes or resurfacing in subsequent years

    By 2020, controversy-related search interest had returned to baseline and remained there through 2025.
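That decay claim can be checked mechanically. A minimal sketch, using made-up monthly values rather than the real Trends export:

```python
def tail_at_baseline(series, tail=6, baseline=5):
    """True if the last `tail` values all sit at or below `baseline`."""
    return all(v <= baseline for v in series[-tail:])

# Made-up monthly interest values: a 2018 spike, then a flat tail
interest = [2, 3, 48, 100, 30, 8, 4, 2, 3, 2, 1, 2, 2, 1]
peak_index = interest.index(max(interest))  # where the spike lands
decayed = tail_at_baseline(interest)        # True: no resurfacing
```

The same check, applied year by year, is what "returned to baseline and remained there" means in practice.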

    2. Reddit: Social Engagement & Backlash Check

Reddit posts were collected using public JSON search endpoints within a Python script that produced the following outputs:

    • broad references to the song and Severance / Apple TV
    • narrow “backlash” framings (e.g., banned, problematic, controversy)
    • subreddits where this discussion would be most likely to appear

    Weekly aggregation shows effectively no sustained discussion following the episode’s release. Only one post referenced the song in connection with Severance, and it discussed narrative tension rather than offense.
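The weekly aggregation step amounts to bucketing post timestamps by ISO week; a sketch with one invented timestamp:

```python
from collections import Counter
from datetime import datetime, timezone

def weekly_counts(timestamps):
    """Bucket Unix timestamps into (year, ISO week) -> post count."""
    buckets = Counter()
    for ts in timestamps:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        year, week, _ = dt.isocalendar()
        buckets[(year, week)] += 1
    return buckets

# Illustrative: a single post in the week of March 6, 2025
counts = weekly_counts([
    datetime(2025, 3, 6, tzinfo=timezone.utc).timestamp(),
])
```

A mostly empty Counter across the weeks after the episode is exactly the "absence of engagement" signal described below.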

The absence of engagement is itself the result. I was surprised to see the one post that featured the song as part of a long piece on implied consent, but it did not show up until March 6, 2025, a week after the episode aired, and it generated only three comments of discussion, none about the song itself.

Again, whether Apple TV explicitly modeled this risk, guessed, or just didn’t care, we don’t know. But the outcome reflects what the data already showed and how others can use it. This is what cultural data intelligence looks like when measured instead of guessed.

    If you want to see or run the Python script used for this analysis, the full repository is here:

    👉 GitHub: https://github.com/jjf3/rewindos_measuring_cultural_safety

    📄 Download the White Paper here: https://www.rewindos.com/index.php/white-papers/the-rewindos-cultural-safety-framework/

  • Is Christmas Music Too Expensive to License for Sitcoms?

“False Positive” — OSC of CBS’ HOW I MET YOUR MOTHER.
    Photo: Ron P. Jaffe/FOX.
    ©2010 FOX TELEVISION. All Rights Reserved.

    What This Project Set Out to Examine

    Christmas episodes are a long-standing sitcom tradition. Nearly every major network comedy of the past 30 years has produced multiple holiday-themed episodes, often using Christmas or other holidays as a narrative anchor for character conflict, sentimentality, or satire.

    This project was built to answer a simple question:

    How much Christmas music do sitcoms actually use?

    Rather than assuming that Christmas episodes are saturated with holiday songs, this analysis catalogued the actual music used in Christmas episodes across a sample of well-known sitcoms, tracking whether songs were:

    • licensed (commercial recordings)
    • public domain (traditional carols)
    • or absent altogether

    What the Data Actually Shows:

    To build this dataset, I started by manually reviewing episode music listings on Tunefind.com, a site that catalogs what songs appear in television episodes.

    I selected five well-known sitcoms spanning different eras and focused specifically on their Christmas episodes during periods when holiday-themed programming was most common, based on findings from my earlier analysis of Christmas episodes. For each episode, I recorded whether Christmas music appeared and whether those songs were licensed recordings or public-domain carols. I then wrote a small Python script to organize the data, calculate basic metrics (such as licensed versus public-domain usage), and generate visualizations to make patterns easier to compare across shows and time periods.
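The basic metrics reduce to category shares. A sketch with invented rows (the real dataset lives in the repo linked at the end of this post):

```python
# Each row: (show, episode, music_type); categories follow the post:
# "licensed", "public_domain", or "none". The rows are invented examples.
episodes = [
    ("Show A", "S2E10 Christmas", "none"),
    ("Show A", "S5E11 Christmas", "public_domain"),
    ("Show B", "S3E12 Christmas", "licensed"),
    ("Show B", "S6E10 Christmas", "none"),
]

def usage_share(rows, music_type):
    """Fraction of episodes whose music falls in the given category."""
    return sum(1 for r in rows if r[2] == music_type) / len(rows)

no_music_share = usage_share(episodes, "none")
licensed_share = usage_share(episodes, "licensed")
```

With shares per show and per era in hand, the comparisons below are just a matter of plotting.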

    Across major sitcoms analyzed — spanning the 1990s through the 2010s — one pattern is consistent:

    Most Christmas episodes use little to no Christmas music at all.

    Key observations from the spreadsheet:

    • Licensed Christmas songs appear infrequently, even in peak-era sitcoms.
    • Many Christmas episodes contain zero licensed holiday tracks.
    • When music is present, it is often:
      • public-domain carols
      • brief background cues
      • or isolated moments rather than sustained themes
    • Entire Christmas episodes often rely purely on dialogue, setting, and performance, not music, to signal the holiday.

    Importantly, this pattern holds across decades.
    The data does not show a sharp drop-off or collapse in music usage — it shows that Christmas music was never heavily used to begin with.


    Why This Matters (and Why Cost Alone Isn’t the Answer)

    The initial question — “Is Christmas music too expensive to license?” — is still valid, but the data suggests a more nuanced conclusion:

    Sitcoms largely chose not to rely on Christmas music, even when budgets were larger and licensing was easier.

    Several plausible explanations emerge:

    1. Sitcoms Signal Christmas Visually, Not Musically

    Christmas episodes communicate the holiday through:

    • decorations
    • wardrobe
    • dialogue
    • plot structure (parties, family, end-of-year reflection)

    Music is optional, not essential. The holiday is already legible without it.


    2. Comedy Prioritizes Timing Over Atmosphere

    Unlike dramas, sitcoms are built around:

    • rapid dialogue
    • punchlines
    • awkward silence

    Continuous background music — especially familiar Christmas songs — can interfere with comedic timing or dilute jokes. Many sitcoms intentionally avoid music so the rhythm of the scene remains intact.


    3. Christmas Songs Are Culturally “Loud”

    Christmas music carries heavy emotional and cultural baggage. Using a well-known song can:

    • overwhelm a scene
    • inject sentiment the writers didn’t intend
    • pull attention away from character dynamics

Avoiding music gives writers tighter control over tone and keeps the episode from feeling dated.


    4. Licensing Cost Reinforces an Existing Creative Preference

    Licensing costs may reinforce avoidance, but the data suggests they are not the root cause.

    Even when:

    • network budgets were higher
    • syndication economics were stronger
    • licensing environments were more permissive

    sitcoms still used very little Christmas music.

    In other words:

    Cost may explain why music didn’t increase — not why it was absent.


    How This Fits with the Broader RewindOS Christmas Analysis

    In our companion analysis on Christmas episodes and television health, we show that:

    • Christmas episodes remain common
    • but their function has changed

    This music analysis complements that finding:

    • Christmas episodes persist as narrative rituals
    • not as audio spectacles
    • music was never the core signal — the episode itself was

The holiday survives on structure and story, not soundtrack; it doesn’t need the music.


    Final, Data-Faithful Conclusion

    Sitcoms did not abandon Christmas music — they largely never relied on it in the first place.

    Across decades, networks, and formats, Christmas episodes consistently used minimal music, suggesting a long-standing creative norm rather than a modern cost-driven retreat.

    Licensing expense may discourage experimentation, but the evidence indicates that sitcoms have always treated Christmas music as optional.

So the next logical question for us: would certain dramas choose a controversial Christmas song to heighten an episode? Stay tuned to find out.

    If you want to see or run the Python script used for this analysis, the full repository is here:

    👉 GitHub: https://github.com/jjf3/rewindos-christmas-music-cost

  • What Christmas Episodes Reveal About the Health of U.S. Television

    A data-driven look at how Christmas-themed TV episodes rise and fall with industry confidence.

    Why I Built This

This is the second project for RewindOS. This project is part of a short Christmas programming sprint designed to explore how holiday-themed television functions as a cultural signal rather than a novelty. I started with a simple question: Are Christmas episodes still as prevalent as they were during the era of long-running television shows? In earlier decades, holiday episodes were almost guaranteed milestones for successful series, so I wanted to understand how that tradition has changed in the modern TV landscape.

The first step was to measure actual output. I went to Wikipedia to see if it had a list of Christmas episodes for TV; it had an extensive one, which became the dataset for this project. I then used Python to produce a graph of how many Christmas-themed television episodes were made between 1947 and 2025, and looked at the downward spiral of shows from 2023–2025 and how that compares to earlier periods.
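The counting step behind that graph is just bucketing air years. A sketch with a handful of illustrative rows standing in for the full Wikipedia-derived dataset:

```python
from collections import Counter

# Illustrative (episode, air_year) rows standing in for the full dataset
dataset = [
    ("Dragnet: The Big Little Jesus", 1953),
    ("Community: Abed's Uncontrollable Christmas", 2010),
    ("Brooklyn Nine-Nine: Christmas", 2013),
    ("Brooklyn Nine-Nine: The Pontiac Bandit Returns", 2014),
]

def episodes_per_year(rows):
    """Count Christmas-themed episodes per air year."""
    return Counter(year for _, year in rows)

counts = episodes_per_year(dataset)
# counts maps year -> episode count, ready to chart over 1947-2025
```

The real script feeds a Counter like this into a simple bar chart, which is where the cycles discussed below become visible.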

    From there, the sprint expands into a few focused, related experiments rather than a single large dataset.

    Next, I plan to track Christmas music usage within TV episodes, looking at which songs appear, how frequently music is used, and what types of Christmas music are favored. This portion will intentionally stay small, focusing on two to three well-known sitcoms, to keep the analysis tight and interpretable.

    The sprint will conclude with a beta test of a controversy index, applied to a Christmas-themed case study, as a way to explore how sentiment, cultural backlash, or reinterpretation can affect holiday media over time.

    All of these projects are designed to be completed within one to two weeks, serving as a focused seasonal experiment while also laying groundwork for broader RewindOS cultural analytics moving forward.

    Figure 1: Number of U.S. Christmas-themed television episodes by year (1947–2025), excluding standalone specials.

When we chart U.S. Christmas-themed TV episodes over time (excluding standalone specials), we see that they begin appearing consistently in the 1950s, rise steadily, and then fluctuate in clear cycles. Major declines occur in 1998–2000, 2006–2008, and again starting in 2023. Notably, 2008 and 2025 mark the lowest levels of Christmas episodes in modern television, matching periods of industry instability. The most productive era was 2012–2023, driven by long-running sitcoms with more than five seasons.

    Expanding on what the data shows (so far)

    Using a filtered dataset of U.S. television Christmas-themed episodes (excluding standalone holiday specials and animation), several clear patterns emerge:

    🎄 Long-term rise, then contraction

    • Christmas-themed TV episodes begin appearing consistently in the 1950s, aligning with the rise of broadcast television and stable seasonal scheduling.
    • Prior to that, occurrences are sparse, with 1973 marking the lowest point for Christmas episodes in the modern TV era.

    Repeated contractions in Christmas episode production align with broader industry disruptions.

    📉 Structural dips repeat over time

    • Three periods show notable declines in Christmas episode production:
      • 1998–2000
      • 2006–2008
      • 2023–2025
    • These downturns closely resemble each other in magnitude and shape, suggesting recurring industry-wide contractions rather than random fluctuation.

    ❄️ Modern low points

    • 2008 and 2025 stand out as the lowest years for Christmas episodes in modern television, placing today’s output on par with the post-network-era shock of the 2008 Writers’ Strike period.
    • The current decline beginning in 2023 suggests a sustained trend rather than a single anomalous year.

    📈 Peak era: 2012–2023

    • The years 2012 through 2023 represent the highest sustained production of Christmas-themed episodes.
    • This era is largely fueled by:
      • Highly successful, long-running sitcoms
      • Large ensemble casts
      • Stable season orders (22 Episodes ftw!!!!)
      • Continued contribution from U.S. animation (even when animation is excluded from the core count, its ecosystem supports the broader trend)

    This period reflects an industry confident in long arcs, recurring characters, and calendar-based “event episodes.”

    Long-running ensemble sitcoms dominated the peak era of Christmas television episodes (2012–2023).

    Christmas episodes function as a high-confidence production choice:

    • They assume a show will still be on the air in December
    • They reward audience loyalty
    • They often require additional budget, music licensing, and scheduling certainty

    Their decline aligns with periods of:

    • Shorter seasons
    • Higher show churn
    • Platform fragmentation
    • Labor disruption or industry uncertainty

    This suggests that Christmas episodes may act as a proxy indicator for scripted TV stability, not just seasonal nostalgia.

This data in no way indicates that there is an overall agenda-driven message surrounding Christmas itself on TV. It simply shows a decline in television confidence in general: holiday-themed episodes are either cut or don’t make sense when a show doesn’t have 22 episodes or won’t be around during the holidays.

My original thought going into this project was that the years 2020–2025 would show a major decrease, but as you can see, 2023 had a high output of Christmas-themed episodes.

    If you want to see or run the Python script used for this analysis, the full repository is here:

    👉 GitHub: https://github.com/jjf3/Rewindos-Christmas-Episodes

  • The “Biggest of All Time”: A Tiny RewindOS Prototype

    Why I Built This

    This is the first mini-project for RewindOS and it started with a simple question:

When I watched the latest season of Prehistoric Planet (Season 3: Ice Age), I noticed that Tom Hiddleston, who recently replaced David Attenborough as the narrator, said some variation of “the biggest of all time” quite a lot. So I set off to determine how many times, without counting on my fingers like the cavemen did in those prehistoric times.

    It turns out with a little reverse engineering and hacking, this is quite easy and it highlights a metric that nobody in Hollywood seems interested in measuring— yet.

    If you watch a lot of nature documentaries, you probably notice recurring superlatives—“the largest ever discovered,” “the biggest predator of its era,” “the strongest bite force in history.” These phrases shape how we interpret both animals and storytelling. They’re a mix of science communication and spectacle.

    So I wanted to answer a playful but data-driven question:

    How frequently does the show in season 3 use the phrase “of all time,” and in what contexts?

    This became a perfect first test for RewindOS, because it touches everything I envision this project will be and ultimately do at scale:

    • extracting structured data from media
    • analyzing linguistic patterns
    • building repeatable pipelines
    • archiving, visualizing, and tracking cultural tendencies in media

    How I Got the Subtitles

I already had the Season 3 video files, thanks to my automated Plex setup. So, logically, I first went to my Plex library and tried to extract the subtitle files that way.

However, after some basic research and a few simple ls commands on my server, I saw that the subtitles were nowhere to be found or readily available for extraction on my system. Lo and behold, further research proved the point:

    Subtitles downloaded by Plex itself are stored inside of Plex’s blob files and aren’t able to be interacted with, nor can the location be changed.

ChatGPT gave me some helpful but rather tedious “legal” ways to obtain the subtitles by recording them in VLC/Plex during playback. However, after some more research, I discovered VLC’s VLSub extension, which can download matching subtitle files. Success!

This gave me the five .srt files I needed, one for each episode.

    Those files became the dataset for the project.

Not being a from-scratch programmer, and now deeply entrenched in the era of AI, I took the next logical step: building the Python script.

    The entire analysis pipeline was built using AI as a collaborator.

    I described the problem (“Find all phrases ending in of all time in the SRT files”), and the model generated a clean Python script that:

    1. Walks through my subtitle directory
    2. Reads every .srt file
    3. Uses a regex pattern to capture the phrase + its context
    4. Prints them to the console
    5. Exports everything into a structured CSV

    The core of the extraction logic was this:

import re

# Capture the phrase plus up to six preceding words of context
pattern = re.compile(
    r"(\b\w+(?:\W+\w+){0,6}\W+of all time\b)",
    re.IGNORECASE
)
    

    This lets the script capture phrases like:

• “largest predators of all time”
• “one of the most powerful hunters of all time”
• “the biggest terrestrial bird of all time”
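Applied to an invented caption line (not an actual quote from the show), the pattern grabs the phrase plus its leading context:

```python
import re

# Same pattern as above: a word, up to six more words, then "of all time"
pattern = re.compile(
    r"(\b\w+(?:\W+\w+){0,6}\W+of all time\b)",
    re.IGNORECASE
)

line = "It was perhaps the biggest terrestrial bird of all time."
phrase = pattern.search(line).group(1)
# phrase == "It was perhaps the biggest terrestrial bird of all time"
```

Because the context group is greedy, the match stretches as far back as the six-word window allows, which is what gives the CSV its readable snippets.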

    The final output CSV includes:

    • which episode
    • the snippet of text
    • how many instances appear across the season

    This is exactly the kind of small linguistic dataset that RewindOS will eventually let creators, journalists, and analysts explore effortlessly.

    It’s easy to laugh at how often nature docs use hyperbole, but those phrases shape cultural impressions of extinct animals. They’re part of storytelling tradition. Being able to quantify them is the first step in understanding patterns across eras, genres, studios, and creators.


    If you want to see or run the Python script used for this analysis, the full repository is here:

    👉 GitHub: https://github.com/jjf3/prehistoric-planet-of-all-time-analysis