dips7189 commented Feb 2, 2026

Improvement proposal: Latent Interest Aggregates from Dwell Signals

Problem

Engagement-based ranking under-serves "silent" users, who read content but do not explicitly like, reply, or retweet. The system already logs and labels several high-intent dwell signals (tweet detail, profile, link, fullscreen video) and aggregates them in real time, but it does not expose a persistent user preference representation derived from those signals.

As a result, latent interest (interest without explicit engagement) is not captured as a first-class signal.

Proposed solution

Introduce online, decayed latent-interest aggregates derived from existing dwell labels, keyed by (userId, sourceAuthorId), to represent implicit long-term interest in authors.

This change:

  • Defines AuthorLatentInterestEngagements using high-intent dwell signals (profile dwell, tweet detail dwell, long link dwell)
  • Adds authorLatentInterestRealTimeAggregates with exponential decay
  • Registers the aggregate group in ProdAggregateGroups
  • Ensures outputs are not filtered via aggregates_to_drop.txt
  • Mirrors existing user+author aggregate flag defaults (e.g. includeAnyFeature)

The change is additive and does not alter existing ranking or scoring behavior. It provides a foundation for future use in scoring, mixing, or exploration to better serve silent readers.
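For illustration only, the idea of an online, exponentially decayed aggregate keyed by (userId, sourceAuthorId) can be sketched as below. This is not the code in this PR and does not use the aggregation framework's API; the class name `LatentInterestAggregate`, the label strings, the per-label weights, and the half-life parameter are all assumptions made up for the example. The PR itself does not state a half-life or weights.

```python
from dataclasses import dataclass

# Hypothetical label names and weights; the real dwell labels and any
# weighting live in the aggregation framework and are not specified here.
DWELL_WEIGHTS = {
    "profile_dwell": 1.0,
    "tweet_detail_dwell": 1.0,
    "long_link_dwell": 1.0,
}


@dataclass
class DecayedValue:
    value: float    # aggregate value as of last_ts
    last_ts: float  # timestamp (seconds) the value is valid for


class LatentInterestAggregate:
    """In-memory sketch of a decayed counter keyed by (user_id, author_id)."""

    def __init__(self, half_life_secs: float):
        self.half_life_secs = half_life_secs
        self._store: dict[tuple[int, int], DecayedValue] = {}

    def _decay_factor(self, elapsed_secs: float) -> float:
        # A contribution halves every half_life_secs.
        return 0.5 ** (max(0.0, elapsed_secs) / self.half_life_secs)

    def record(self, user_id: int, author_id: int, label: str, ts: float) -> None:
        weight = DWELL_WEIGHTS.get(label)
        if weight is None:
            return  # not a latent-interest signal; ignore
        key = (user_id, author_id)
        entry = self._store.get(key)
        if entry is None:
            self._store[key] = DecayedValue(weight, ts)
        elif ts >= entry.last_ts:
            # Decay the stored value forward to ts, then add the new event.
            entry.value = entry.value * self._decay_factor(ts - entry.last_ts) + weight
            entry.last_ts = ts
        else:
            # Late event: decay the new contribution forward to last_ts instead.
            entry.value += weight * self._decay_factor(entry.last_ts - ts)

    def read(self, user_id: int, author_id: int, now: float) -> float:
        entry = self._store.get((user_id, author_id))
        if entry is None:
            return 0.0
        return entry.value * self._decay_factor(now - entry.last_ts)
```

With a half-life of 100 seconds, a single profile dwell recorded at t=0 reads back as 0.5 at t=100. Storing only the value and its timestamp per key means nothing has to be rewritten as time passes; decay is applied lazily on update and on read.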

Future work (not included):

  • Consume this aggregate in scoring or mixing
  • Tune decay/weights via params
  • Extend to topic/entity latent interest

CLAassistant commented Feb 2, 2026

All committers have signed the CLA.