@dfroger
Contributor

dfroger commented Jan 12, 2026

fixes #460

Before: the `.clear()` algorithm is wrong:

1. Suppose there are 2200 documents. The first 500 are deleted:

    0        1000      2000
    |        |         |
    xxxxx-----------------
    |   |
    |   index + limit
    index

2. There are now 1700 documents, but the next pagination query wrongly
   skips the first 500 of them and removes documents 500 to 1000:

    0        1000
    |        |
    -----xxxxx-------
         |   |
         |   index + limit
         index

3. There are now 1200 documents, but the next pagination query wrongly
   skips the first 1000 of them and removes documents 1000 to 1200:

    0        1000
    |        |
    ----------xx

=> The next query, at offset 1500, finds nothing, so only 1200 of the 2200 documents are removed and 1000 documents remain.
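
The arithmetic above can be checked with a small in-memory simulation. This is only a sketch: `buggy_clear` and the plain Python list standing in for the index are hypothetical, not the library code, and the batch size of 500 is taken from the example.

```python
def buggy_clear(docs, limit=500):
    """Simulate offset-based pagination over a collection that shrinks.

    `docs` is a plain list standing in for the index (a hypothetical
    model, not the real implementation). Returns the number of deleted items.
    """
    deleted = 0
    index = 0
    while True:
        # "Query" one page starting at the current offset.
        page = docs[index:index + limit]
        if not page:
            break
        # Deleting the page shifts every later document towards zero...
        del docs[index:index + limit]
        deleted += len(page)
        # ...but the offset still advances, so one page's worth is skipped.
        index += limit
    return deleted


docs = list(range(2200))
deleted = buggy_clear(docs)
print(deleted, len(docs))  # 1200 1000
```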

After: documents are deleted 500 at a time, in a loop, until there is
nothing left to delete (with a safety bound based on the initial number
of documents, in case of concurrent insertions).
@abrookins
Collaborator

Thanks for this fix, @dfroger! You've been doing great work on redis-vl-python.

The approach looks solid - switching from offset-based pagination to always-query-from-zero makes sense for a destructive operation like clear(). The cluster support additions are also welcome.

A couple things that would seal the deal:

  1. CI is failing on import sorting - looks like a quick fix needed for check-sort-import

  2. Test coverage for the actual clear() bug - The new tests cover info() on cluster, but it would be great to have a test that directly validates the fix for issue #460 (AsyncSearchIndex.clear() does not remove all indexed documents with paginated delete):

    • Create N documents
    • Call clear()
    • Verify count is 0

  3. Minor: there's a small typo "recordrecords" in one of the docstrings (line 722)
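
A sketch of what the test in point 2 could look like. `FakeIndex` and its `load`, `count` and `clear` methods are hypothetical in-memory stand-ins so that the example runs on its own; a real test would use the project's index fixtures against a Redis instance, and the actual method names and return values may differ.

```python
class FakeIndex:
    """Hypothetical in-memory stand-in for a search index."""

    def __init__(self):
        self._docs = []

    def load(self, docs):
        self._docs.extend(docs)

    def count(self):
        return len(self._docs)

    def clear(self, batch_size=500):
        # Delete from the front in batches until nothing is left.
        # Returning the deleted count is an assumption of this sketch.
        deleted = 0
        while self._docs:
            batch = self._docs[:batch_size]
            del self._docs[:len(batch)]
            deleted += len(batch)
        return deleted


def test_clear_removes_all_documents():
    # Sizes chosen to sit on and around batch boundaries.
    for n in (0, 1, 499, 500, 501, 2200):
        index = FakeIndex()
        index.load([{"id": i} for i in range(n)])  # create N documents
        assert index.clear() == n                  # call clear()
        assert index.count() == 0                  # verify count is 0


test_clear_removes_all_documents()
```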

Once those are addressed, this should be good to go! 🚀

@dfroger
Contributor Author

dfroger commented Jan 15, 2026

Thanks @abrookins for the feedback!

I think I can address the three points at the beginning of next week. Looking forward to it!

Development

Successfully merging this pull request may close these issues.

AsyncSearchIndex.clear() does not remove all indexed documents with paginated delete
