everyevalever launch #13
base: master
| Diff line change |
|---|
| @@ -0,0 +1,198 @@ |
| --- |
| layout: post |
| title: "Every Eval Ever: Toward a Common Language for AI Eval Reporting" |
| date: 2026-02-15 |
| published: true |
| exclude_from_collection: true |
| category: Infrastructure |
| image: "/assets/img/long-site-banner.webp" |
| authors: |
| - name: "Jan Batzner*" |
| - name: "Leshem Coshen*" |
Suggested change:
Original: - name: "Leshem Coshen*"
Suggested: - name: "Leshem Choshen*"
Copilot
AI
Feb 15, 2026
The ACL Anthology URL for reference [10] has an incorrect format. ACL Anthology URLs should end with a forward slash, not ".com". The URL "https://aclanthology.org/2025.acl-long.34.com" should be "https://aclanthology.org/2025.acl-long.34/" to be a valid link.
Suggested change:
Original: We understand the pain firsthand. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We have worked on numerous projects where collecting evaluations either restricts what can be done or consumes most of the project's effort. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34.com), among others.
Suggested: We understand the pain firsthand. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We have worked on numerous projects where collecting evaluations either restricts what can be done or consumes most of the project's effort. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34/), among others.
Copilot
AI
Feb 15, 2026
The author name "Leshem Coshen" in the BibTeX citation appears to be spelled inconsistently with other posts in the repository. In the post "_posts/2025-08-10-blog-chart-crisis.md", the same author is listed as "Leshem Choshen" (with an 'h'). Please verify the correct spelling and ensure consistency across all posts and citations.
Copilot
AI
Feb 15, 2026
The BibTeX citation URL shows the date as "2026/02/16" but the frontmatter date is "2026-02-15". This creates an inconsistency between the post date and its canonical URL in the citation.
Copilot
AI
Feb 15, 2026
The URL in the BibTeX citation has a date mismatch. The file is dated 2026-02-15 (February 15) in both the filename and the frontmatter date field, but the URL uses "2026/02/16" (February 16). This should be corrected to "2026/02/15" to match the post date.
Suggested change:
Original: url = {https://evaleval.github.io/2026/02/16/everyevalever-launch/},
Suggested: url = {https://evaleval.github.io/2026/02/15/everyevalever-launch/},
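A date mismatch like this one could be caught mechanically before merge. The following is a minimal sketch, not something that exists in the repository: it assumes the front matter carries a `date: YYYY-MM-DD` line and that the permalink follows the `/YYYY/MM/DD/slug/` shape seen in the citation URL above. The function names `frontmatter_date`, `permalink_date`, and `dates_agree` are hypothetical.

```python
import re
from datetime import date


def frontmatter_date(post_text: str) -> date:
    """Return the `date:` value from a Jekyll front matter block (assumed YYYY-MM-DD)."""
    match = re.search(r"^date:\s*(\d{4})-(\d{2})-(\d{2})", post_text, re.MULTILINE)
    if match is None:
        raise ValueError("no date field found in front matter")
    year, month, day = map(int, match.groups())
    return date(year, month, day)


def permalink_date(url: str) -> date:
    """Return the date embedded in a /YYYY/MM/DD/ permalink (assumed URL shape)."""
    match = re.search(r"/(\d{4})/(\d{2})/(\d{2})/", url)
    if match is None:
        raise ValueError("no /YYYY/MM/DD/ segment found in URL")
    year, month, day = map(int, match.groups())
    return date(year, month, day)


def dates_agree(post_text: str, url: str) -> bool:
    """True when the front matter date and the permalink date are the same day."""
    return frontmatter_date(post_text) == permalink_date(url)
```

Run against this post's front matter, `dates_agree` would return False for the `2026/02/16` URL and True for the corrected `2026/02/15` URL.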
The image path uses a leading slash "/assets/img/long-site-banner.webp" while other blog posts in the codebase use "assets/img/..." without the leading slash. This inconsistency may cause issues depending on how Jekyll processes the paths. Consider using "assets/img/long-site-banner.webp" to match the convention seen in other posts like "2025-07-13-eval-science-kickoff.md" and "2025-08-10-blog-chart-crisis.md".