Most recent update: 30th November 2020 - 14:49:02 - 11365 characters

Wikipedia Explorer

I would love your feedback, good or bad! Did you fall down a Wiki rabbit hole? Do you feel you can "browse" information differently? Are the controls too unwieldy? Do you desperately want to load in your own dataset? Do you want to hear what I am thinking about next? DM me on Twitter or via email!

Over the past months and years I've made many small tools to try out different ways of exploring and consuming information. Wikipedia Explorer was one of them: initially intended as a throwaway experiment, it has instead become a standard tool for my own explorations.

Whilst I would recommend reading "An Introduction to Wikipedia Explorer" below, you can jump in right away if you'd prefer!

Note: Wikipedia Explorer will work on mobile but is optimized for a larger screen

Starting points for exploration:

An introduction to Wikipedia Explorer:

We have a world of data yet relatively few tools to sort and search through that never-ending library.

Search engines as they stand are best at surfacing information we've already connected or highlighted, whether that's via hyperlinks between content or the number of times a link has been clicked on. They're still poor at forming novel connections.

Wikipedia Explorer lets you look for those connections by asking a different question: "what blocks of text are similar to this block of text?".

From Deepwater Horizon to Formula One

As an odd example: "Can the science of F1 pit stops transform everyday life?" describes how the high-speed choreography of F1 pit-stop mechanics has been transferred to hospital operating theatres and even toothpaste production. This is the perfect example of a novel connection that a traditional search query such as "how can I make my operating theatre more efficient" is unlikely to surface.

Let's instead imagine a similar query: what might be at the intersection of Formula One racing regulations and the Deepwater Horizon tragedy?

From the National Hot Rod Association:

Should the driver be rendered unable to perform the normal shutdown sequence at the conclusion of a run, a pair of redundant transmitters, placed past the finish line, will signal an on-board receiver to automatically shut off ignition power and fuel to the engine and deploy the parachutes.

From the Deepsea Challenger:

If the ballast weight release system fails, stranding the craft on the seafloor, a backup galvanic release is designed to corrode in salt water in a set period of time, allowing the sub to automatically surface.

From British Rail Class 373:

To combat the hypnotic effect of driving through the tunnel at speed for 20 minutes, the power cars have a very small windscreen when compared to other high-speed trains and TGVs.

Redundant automatic shut-off switches, fail-safes that rely on a natural process (salt water corrosion), and user design that factors in impaired sensory abilities. These all sound like interesting starting points for pondering!

From my discussion on chasing a ball of linguistic yarn as it rolls around a thousand dimensional space:

Through the lens of this language model I can search for the emotional residue of great works distilled into phrases I would never have been able to explicitly search for. It's not the language model that's bringing us this knowledge, it's simply connecting threads of an intricate web that we've assembled both implicitly and explicitly over thousands of years, billions of observations, and a multitude of encoded emotions.

For those with a machine learning background:

The experiment breaks the English Wikipedia dataset up into 1000-byte blocks (approximately 4 tweets each) and produces a vector representation for each block of text using a naive language model. Only the most popular quarter of English Wikipedia was selected, as that would fit on a single consumer-grade GPU.
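To make the pipeline above concrete, here is a minimal sketch in Python. It is not the code behind Wikipedia Explorer. The function names (`split_into_blocks`, `embed`, `most_similar`) are hypothetical, and the hashed bag-of-words `embed` is only a stand-in so the example runs without a model; the real tool uses a language model to produce its vectors.

```python
import math
import re
import zlib


def split_into_blocks(text: str, block_bytes: int = 1000) -> list[str]:
    """Split text into consecutive blocks of at most block_bytes UTF-8 bytes."""
    if block_bytes < 4:
        raise ValueError("block_bytes must be at least 4 (longest UTF-8 character)")
    data = text.encode("utf-8")
    blocks = []
    start = 0
    while start < len(data):
        end = min(start + block_bytes, len(data))
        # Back up so a multi-byte UTF-8 character is never cut in half.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        blocks.append(data[start:end].decode("utf-8"))
        start = end
    return blocks


def embed(block: str, dims: int = 256) -> list[float]:
    """Stand-in embedding (assumption): hashed bag-of-words, scaled to unit length."""
    vec = [0.0] * dims
    for word in re.findall(r"[a-z0-9]+", block.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def most_similar(query: str, blocks: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Rank blocks by cosine similarity to the query (vectors are unit length)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(block))), block) for block in blocks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


if __name__ == "__main__":
    corpus = [
        "fail safe ballast release corrodes in salt water",
        "pit stop mechanics change tyres quickly",
        "parachutes deploy after the finish line",
    ]
    for score, block in most_similar("salt water ballast", corpus, top_k=2):
        print(f"{score:.2f}  {block}")
```

Splitting on bytes rather than sentences keeps every block the same size, at the cost of blocks that can start or end mid-sentence.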


  • You're essentially playing Twenty Questions: try and refine!
    • Add and remove various blocks to see if you get closer to what you're interested in
    • The language model may sometimes think you want a textual answer in a format
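
One plausible way to think about adding and removing blocks is that the selected blocks are averaged into a single query vector, so each change nudges the query in a new direction. That averaging is an assumption for illustration, not a description of how Wikipedia Explorer combines blocks. The names `query_vector` and `rank` and the toy two-dimensional vectors are hypothetical.

```python
import math

Vector = list[float]


def normalise(vec: Vector) -> Vector:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)


def query_vector(selected: list[Vector]) -> Vector:
    """Average the selected block vectors into one unit-length query vector."""
    if not selected:
        raise ValueError("select at least one block")
    dims = len(selected[0])
    mean = [sum(vec[i] for vec in selected) / len(selected) for i in range(dims)]
    return normalise(mean)


def rank(selected: list[Vector], candidates: dict[str, Vector]) -> list[str]:
    """Order candidate names from most to least similar to the combined query."""
    q = query_vector(selected)

    def score(name: str) -> float:
        return sum(a * b for a, b in zip(q, normalise(candidates[name])))

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    # Toy vectors: first axis is "film-like", second is "medicine-like".
    candidates = {
        "film": [1.0, 0.1],
        "medical history": [0.1, 1.0],
        "medical film": [0.7, 0.7],
    }
    film_block = [1.0, 0.0]
    medicine_block = [0.0, 1.0]
    print(rank([film_block], candidates))                  # query leans towards films
    print(rank([film_block, medicine_block], candidates))  # adding a block shifts the query
```

With only the film-like block selected, the film candidate ranks first. Adding the medicine-like block pulls the query towards the middle, and the candidate that mixes both rises to the top.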

User guide

Annotated examples

A pessimistic past and future of medicine

Taking the lede from Robin Sloan's short story The Counselor as a starting point:

Will the future of healthcare bring artificial intelligence that convinces you it's time to die?

we might want to ponder pessimistic storylines about the progress of medicine as a historical basis.

  • Add Logan's Run as "the population and the consumption of resources are maintained in equilibrium by killing everyone who reaches the age of thirty"
  • Refine by adding the history of heart transplantation to avoid being too fixated on films
  • The resulting query includes:
    • The film Hysteria ("how the medical management of hysteria led to the invention of the vibrator")
    • The Flesh and the Fiends ("based on the true case of Burke and Hare, who murdered at least 16 people in 1828 Edinburgh, Scotland and sold their bodies for anatomical research")
    • Ted DeVita (inspiration for "The Boy in the Plastic Bubble" who had to live in a sterile hospital room for the last eight years of his life)
    • ...

Harold Holt

Harold Holt was an Australian who, whilst serving as Prime Minister, went out swimming one day never to return!

Walter Reuther, leader of one of the most progressive labor unions in American history, survived two assassination attempts, only to die in a mysterious plane crash:

The National Transportation Safety Board discovered that the plane's altimeter was missing parts, some incorrect parts were installed, and one of its parts had been installed upside down, leading some to speculate that Reuther may have been murdered. Reuther had been subjected earlier to two attempted assassinations.

Zachary Taylor, the second US President to die in office:

Almost immediately after his death, rumors began to circulate that Taylor had been poisoned by pro-slavery Southerners, and various conspiracy theories persisted into the late-20th century. The cause of Taylor's death was definitively established in 1991, when his remains were exhumed and an autopsy conducted by Kentucky's chief medical examiner. Subsequent Neutron activation analysis conducted at Oak Ridge National Laboratory revealed no evidence of poisoning, as arsenic levels were too low.

William Tolbert, 20th President of Liberia:

Undisputedly, Tolbert was dead by the end of April 12, 1980, the day of the coup d’état. There are competing stories as to the time and manner of his death.

Thomas Francis Meagher, Irish nationalist and leader of the Young Irelanders in the Rebellion of 1848, first sentenced to death, and then sentenced to life in Van Diemen's Land (now Tasmania) in Australia, who escaped and made his way to the US:

Sometime in the early evening of 1 July 1867, Meagher fell overboard from the steamboat "G. A. Thompson", into the Missouri River. The pilot described the waters as "instant death - water twelve feet deep and rushing at the rate of ten miles an hour." His body was never recovered.
Some believed his death to be suspicious and many theories circulated about his death. Early theories included a claim that he was murdered by a Confederate soldier from the war, or by Native Americans. In 1913 a man claimed to have carried out the murder of Meagher for the price of $8000, but then recanted. In the same vein, American journalist and novelist Timothy Egan, who published a biography of Meagher in 2016, noted that his political nemesis, Wilbur Fisk Sanders, was in Fort Benton at the same time. Egan hypothesized that Meagher may have been set up for murder by his Montana political enemies or powerful and still active vigilantes.