@WindyDarian
  • Repo Link
  • ruof
  • Features
    • Implemented CPU scan and compaction, GPU naive scan, GPU work-efficient scan, GPU work-efficient compaction, and GPU radix sort (extra), and compared my scan algorithms with Thrust's implementation.
    • I optimized my work-efficient scan; it now runs at 270% of the speed of my original implementation.
    • I also wrote an inclusive version of the work-efficient scan, because I misunderstood the requirement at first. The inclusive method differs in that it creates a buffer that is 1 element larger and swaps the last (0) and second-last elements before the down-sweep. Although I corrected my implementation to an exclusive scan, the inclusive scan can still be called by passing ScanType::inclusive to the scan_implenmention method in efficient.cu.
    • Radix sort assumes inputs are in [0, a_given_maximum). I compared my radix sort with std::sort and Thrust's unstable and stable sorts.
    • I added a helper class, PerformanceTimer, in common.h, which is used for performance measurement.
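
For readers unfamiliar with the scan variants listed above, here is a minimal CPU-side sketch of the up-sweep / down-sweep structure behind a work-efficient exclusive scan, plus the scan-then-scatter idea behind compaction. It is not the code from this PR: the function names (`work_efficient_exclusive_scan`, `inclusive_from_exclusive`, `compact_nonzero`) are hypothetical, the loops run sequentially where the GPU version would run one thread per index, the inclusive result is derived by simply adding the input back (not the larger-buffer swap described above), and "keep non-zero elements" is an assumed compaction predicate.

```cpp
#include <cstddef>
#include <vector>

// Sequential simulation of a work-efficient exclusive scan (sum).
// Hypothetical helper for illustration; each inner loop corresponds
// to one parallel pass on the GPU.
std::vector<int> work_efficient_exclusive_scan(const std::vector<int>& in) {
    const std::size_t n = in.size();
    if (n == 0) return {};

    // Pad to the next power of two so the implicit tree is balanced.
    std::size_t padded = 1;
    while (padded < n) padded <<= 1;
    std::vector<int> buf(padded, 0);
    for (std::size_t i = 0; i < n; ++i) buf[i] = in[i];

    // Up-sweep: build partial sums; the last slot ends up with the total.
    for (std::size_t stride = 1; stride < padded; stride <<= 1) {
        for (std::size_t i = 2 * stride - 1; i < padded; i += 2 * stride) {
            buf[i] += buf[i - stride];
        }
    }

    // Replace the total with the identity (0) before the down-sweep.
    buf[padded - 1] = 0;

    // Down-sweep: push prefix sums back down the tree.
    for (std::size_t stride = padded >> 1; stride >= 1; stride >>= 1) {
        for (std::size_t i = 2 * stride - 1; i < padded; i += 2 * stride) {
            const int left = buf[i - stride];
            buf[i - stride] = buf[i];
            buf[i] += left;
        }
    }

    // Drop the padding.
    return std::vector<int>(buf.begin(), buf.begin() + n);
}

// Inclusive scan derived from the exclusive one: incl[i] = excl[i] + in[i].
std::vector<int> inclusive_from_exclusive(const std::vector<int>& in) {
    std::vector<int> out = work_efficient_exclusive_scan(in);
    for (std::size_t i = 0; i < in.size(); ++i) out[i] += in[i];
    return out;
}

// Stream compaction via scan: flag kept elements, scan the flags to get
// output indices, then scatter. Assumes "keep" means "non-zero".
std::vector<int> compact_nonzero(const std::vector<int>& in) {
    if (in.empty()) return {};
    std::vector<int> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) flags[i] = (in[i] != 0) ? 1 : 0;

    const std::vector<int> idx = work_efficient_exclusive_scan(flags);
    const std::size_t count =
        static_cast<std::size_t>(idx.back() + flags.back());

    std::vector<int> out(count);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) out[static_cast<std::size_t>(idx[i])] = in[i];
    }
    return out;
}
```

Calling `work_efficient_exclusive_scan` on `{1, 2, 3, 4}` yields `{0, 1, 3, 6}`, and `inclusive_from_exclusive` on the same input yields `{1, 3, 6, 10}`. Padding to a power of two keeps the index arithmetic simple at the cost of extra zero elements.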
