Developing the Distroid Curator Algorithm (DCA) to produce ranked feeds based on sensemaking markers
In this pub, I cover our early work developing the Distroid Curator Algorithm (DCA), a feed-ranking algorithm for curating items in the Distroid Catalogue, primarily for use by Distroid curators.
The DCA’s objective is to produce a ranked feed of items that increase understanding of frontier information. We assume that items with higher scores have the potential to provide readers with a greater understanding of frontier information than items with a lower score.
For DCA Version 0.1 (V0.1), we are piloting the following sensemaking markers:
ELI5,
Implications,
Idea Machine Intersectionality,
Novelty,
Informative, and
Evergreen.
For more information on the sensemaking markers mentioned above, please refer to the embed below.
Originally, I had Anti-disciplinary as a marker for the DCA, but after assessing the items for the example, I realized that I do not actually consider this marker when I draft new issues of the Distroid Digest.
I also started thinking about a marker for works that discuss what is actually happening at the frontier, but I could not crystallize my thoughts into a coherent marker. I think such a marker would be most useful (or perhaps only applicable) when applied to research-like items such as academic papers.
I assessed the content (used interchangeably with items) curated in Distroid Digest Issue 44 against each marker.
You can find the ratings in the embed below, and visualizations of the rating distribution in the next sub-section.
I applied five scoring methods for producing the ranked feeds:
Simple sum,
Weighted sum with integer weights,
Weighted sum with decimal weights,
Weighted sum with dynamic decimal weights based on ratings, and
Weighted sum with dynamic decimal weights based on a threshold rating.
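The five methods above can be sketched in Python as follows. This is a minimal illustration, not the implementation used for the pilot. The rating scale, the example ratings, and the weight values are all hypothetical. In particular, the text does not define how the two dynamic methods derive their weights, so `rating_based_weights` and `threshold_weights` each encode one assumed reading.

```python
from typing import Callable, Dict, List, Tuple

Ratings = Dict[str, float]

MARKERS = [
    "ELI5",
    "Implications",
    "Idea Machine Intersectionality",
    "Novelty",
    "Informative",
    "Evergreen",
]


def simple_sum(ratings: Ratings) -> float:
    """Method 1: add up the six marker ratings."""
    return sum(ratings[m] for m in MARKERS)


def weighted_sum(ratings: Ratings, weights: Dict[str, float]) -> float:
    """Methods 2 and 3: one fixed integer or decimal weight per marker."""
    return sum(weights[m] * ratings[m] for m in MARKERS)


def rating_based_weights(ratings: Ratings) -> Dict[str, float]:
    """Method 4 (ASSUMED reading): weight = marker's share of the item's total rating."""
    total = simple_sum(ratings)
    if total == 0:
        return {m: 0.0 for m in MARKERS}
    return {m: ratings[m] / total for m in MARKERS}


def threshold_weights(
    ratings: Ratings, threshold: float, high: float, low: float
) -> Dict[str, float]:
    """Method 5 (ASSUMED reading): ratings at or above `threshold` get weight `high`, others `low`."""
    return {m: high if ratings[m] >= threshold else low for m in MARKERS}


def rank(items: Dict[str, Ratings], score: Callable[[Ratings], float]) -> List[Tuple[str, float]]:
    """Return (title, score) pairs sorted from highest to lowest score."""
    scored = [(title, score(r)) for title, r in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical ratings on an assumed 1-3 scale; not taken from Issue 44.
item_a = {"ELI5": 3, "Implications": 2, "Idea Machine Intersectionality": 1,
          "Novelty": 3, "Informative": 2, "Evergreen": 1}
item_b = {"ELI5": 1, "Implications": 3, "Idea Machine Intersectionality": 3,
          "Novelty": 1, "Informative": 2, "Evergreen": 3}

# Hypothetical integer weights.
INTEGER_WEIGHTS = {"ELI5": 2, "Implications": 3, "Idea Machine Intersectionality": 2,
                   "Novelty": 1, "Informative": 1, "Evergreen": 1}

scores = {
    "simple": simple_sum(item_a),
    "integer": weighted_sum(item_a, INTEGER_WEIGHTS),
    "dynamic": weighted_sum(item_a, rating_based_weights(item_a)),
    "threshold": weighted_sum(item_a, threshold_weights(item_a, 3, 1.5, 0.5)),
}

feed = rank({"Item A": item_a, "Item B": item_b}, simple_sum)
```

Because `rank` takes the scoring function as an argument, the same ratings can be turned into one feed per method without changing the ranking code.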
You can find the feeds created with each scoring method in the Ranked Feeds section.
Through this process, I hope to create a default (or public) feed based on the DCA.
I applied greater weights to Implications, Idea Machine Intersectionality, and ELI5 because I believe these markers make it easier to learn new frontier information and connect that information with the real world.
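To illustrate the effect of this weighting, here is a small worked example in Python. The weight values and both sets of ratings are hypothetical and were chosen only so that the two items tie under a simple sum; they are not the weights or ratings used in the pilot.

```python
# Hypothetical decimal weights that favor Implications,
# Idea Machine Intersectionality, and ELI5.
WEIGHTS = {
    "ELI5": 1.5,
    "Implications": 2.0,
    "Idea Machine Intersectionality": 1.5,
    "Novelty": 1.0,
    "Informative": 1.0,
    "Evergreen": 1.0,
}

# Hypothetical ratings on an assumed 1-3 scale.
explainer = {"ELI5": 3, "Implications": 3, "Idea Machine Intersectionality": 2,
             "Novelty": 1, "Informative": 1, "Evergreen": 1}
news_item = {"ELI5": 1, "Implications": 1, "Idea Machine Intersectionality": 1,
             "Novelty": 3, "Informative": 3, "Evergreen": 2}


def weighted_score(ratings: dict, weights: dict) -> float:
    """Multiply each rating by its marker weight and add the results."""
    return sum(weights[marker] * rating for marker, rating in ratings.items())


# Both items have the same unweighted total ...
explainer_simple = sum(explainer.values())
news_simple = sum(news_item.values())

# ... but the weighted score separates them.
explainer_weighted = weighted_score(explainer, WEIGHTS)
news_weighted = weighted_score(news_item, WEIGHTS)
```

With these assumed numbers, both items total 11 under a simple sum, while the weighted scores are 16.5 for the explainer and 13.0 for the news item, so the item that is stronger on the three emphasized markers ranks first.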
You can find the original ratings (before applying the scoring method), weights, and new scores (i.e., after applying the method) in the table below.
I posted the DCA scores from the weighted sum with decimal weights method on Bluesky to show how the feed could appear on a social network.
The score is located in brackets, next to the title of the article, and below the title, we have the URL and a hashtag.
However, I am unsure whether Bluesky supports searching by hashtag the way Mastodon or Twitter does.
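The post layout described above could be generated with a small helper like the one below. This is only a sketch: the function name, the use of square brackets, the one-decimal score format, the line breaks, and the example title, URL, and hashtag are all assumptions made for illustration.

```python
def format_post(title: str, score: float, url: str, hashtag: str) -> str:
    """Build the post text: title with the score in brackets, then the URL and a hashtag."""
    # Assumed layout: square brackets, one decimal place, URL and hashtag on their own lines.
    return f"{title} [{score:.1f}]\n{url}\n#{hashtag}"


# Hypothetical values, not an actual post from the feed.
post = format_post("Example article", 16.5, "https://example.com/article", "DCA")
```

Printing `post` would show the title and score on the first line, the URL on the second, and the hashtag on the third.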
I also created two example feeds with Softr to show a web-based UI for the DCA scores and marker ratings. The two webpages allow users to search and filter content by DCA aggregate scores or marker scores.
A sample form for anyone to create their own personal feed, based on the DCA.
Submitters can select specific ratings for each sensemaking marker and specific ranges of DCA scores.
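The kind of filtering such a form implies can be sketched in Python. This is not how Softr or the form works internally. The `Item` class, the `build_personal_feed` function, and the example data are hypothetical, and I assume a submission consists of a set of accepted ratings per marker plus an inclusive score range.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Item:
    title: str
    ratings: Dict[str, int]  # marker name -> rating
    score: float             # aggregate DCA score


def build_personal_feed(
    items: List[Item],
    accepted_ratings: Dict[str, Set[int]],
    score_range: Tuple[float, float],
) -> List[Item]:
    """Keep items whose ratings and score match the submission, highest score first."""
    low, high = score_range
    selected = [
        item
        for item in items
        if low <= item.score <= high
        and all(item.ratings.get(marker) in allowed
                for marker, allowed in accepted_ratings.items())
    ]
    return sorted(selected, key=lambda item: item.score, reverse=True)


# Hypothetical catalogue entries.
catalogue = [
    Item("A", {"ELI5": 3, "Novelty": 2}, 16.5),
    Item("B", {"ELI5": 1, "Novelty": 3}, 13.0),
    Item("C", {"ELI5": 3, "Novelty": 3}, 9.0),
]

# Items with an ELI5 rating of 2 or 3 and a score between 10 and 20.
easy_reads = build_personal_feed(catalogue, {"ELI5": {2, 3}}, (10, 20))

# Items with a Novelty rating of exactly 3, at any score up to 20.
novel_reads = build_personal_feed(catalogue, {"Novelty": {3}}, (0, 20))
```

Markers left out of `accepted_ratings` are not filtered on, which mirrors a form where the submitter only fills in the markers they care about.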
Some additional goals I want to achieve with the DCA include:
Increase transparency over how items are curated in the Distroid Digest;
Speed up the curation process by (fully or partially) automating how items are rated;
Invite reader feedback on the development of the DCA and Distroid Digest;
Help shed light on, and improve our own understanding of, our curation process; and
Provide users more flexibility in constructing feeds from the Distroid Catalogue.
As I was working on developing DCA V0.1, I stumbled upon Building a Social Media Algorithm That Actually Promotes Societal Values by Katharine Miller.
The project, supported by a Stanford HAI Hoffman-Yee Grant, required translating social science concepts about democratic values into algorithmic objectives; creating a feed that implemented the democratic values model; and testing its impact on people’s partisan animosity. The result: The team found lower partisan animosity among people shown a feed that downranked (or removed and replaced) posts expressing highly anti-democratic attitudes.
I think this article is fantastic and provides great guidance (and a possible workflow) for creating feed-ranking algorithms.1
A possible workflow I thought of after reading the article (and some sections of the paper) for developing feed-ranking algorithms could be:
Determine the objective(s) of the feed-ranking algorithm;
Determine the sensemaking markers (or values) that are most appropriate (or desired) for reaching the objective, or that need to be optimized to reach it;
Determine criteria for assessing each sensemaking marker;
Assess items based on the sensemaking markers to produce ratings;
Create a feed by applying one or more scoring methods to the ratings; and
Obtain user feedback on the feed to determine whether it is achieving the objective and to identify possible impacts (good or bad) on users.
Additionally, from reading the article and the paper, I realized that I probably need to test for inter-rater reliability for the sensemaking markers used in the DCA.
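One common way to check agreement between two raters is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The text above does not say which statistic would be used, so the sketch below is only one possible choice, and the ratings in it are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who rated the same items in the same order."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must rate the same non-empty list of items.")
    n = len(rater_a)

    # Share of items on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's rating frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical rating for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical Novelty ratings from two curators for the same six items.
curator_1 = [3, 2, 3, 1, 2, 3]
curator_2 = [3, 2, 2, 1, 2, 3]

kappa = cohens_kappa(curator_1, curator_2)
```

For these assumed ratings, the curators agree on five of six items and kappa works out to 17/23, or about 0.74. Because marker ratings are ordinal, a weighted kappa or Krippendorff's alpha may be a better fit, and more than two raters would need a different statistic.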