@copilot Can you do a grammar pass on the blog?
Pull request overview
This PR introduces a new blog post announcing the launch of "Every Eval Ever," a standardized schema and repository for AI evaluation results, along with a template modification to support excluding posts from the homepage collection display.
Changes:
- New blog post detailing the Every Eval Ever initiative, its schema, validation tools, and converters
- Update to the blogs.html template to filter posts with `exclude_from_collection: true` out of homepage listings
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| _posts/2026-02-15-everyevalever-launch.md | New comprehensive blog post announcing Every Eval Ever launch with technical details, examples, and call for participation |
| _includes/blogs.html | Adds liquid template filtering to exclude posts marked with exclude_from_collection: true from homepage |
| It is time for a change. We have seen this before in other parts of the ML pipeline. The community stopped retraining models from scratch or rewriting training code for each project long ago. Evaluations are next. | ||
|
|
||
| ## Why Us, Why Now | ||
| We just know the pain. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We worked on a myriad of projects where collecting evaluations restricts what can be done or takes most of the project’s efforts. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34.com) among others. |
The URL has an incorrect ".com" suffix appended. ACL Anthology URLs should not have ".com" at the end. It should be "https://aclanthology.org/2025.acl-long.34/" instead of "https://aclanthology.org/2025.acl-long.34.com".
| We just know the pain. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We worked on a myriad of projects where collecting evaluations restricts what can be done or takes most of the project’s efforts. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34.com) among others. | |
| We just know the pain. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We worked on a myriad of projects where collecting evaluations restricts what can be done or takes most of the project’s efforts. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34/) among others. |
| } | ||
| ``` | ||
|
|
||
| ### Feedback and Advise |
The section heading uses "Advise", which is a verb; the noun form is "Advice". The heading should read "Feedback and Advice".
| ### Feedback and Advise | |
| ### Feedback and Advice |
| author = {Jan Batzner and Leshem Coshen and Avijit Ghosh and Sree Harsha Nelaturu and Anastassia Kornilova and Damian Stachura and Anka Reuel and Yifan Mai and Asaf Yehudai and Irene Solaiman and Stella Biderman}, | ||
| year = {2026}, | ||
| month = {February}, | ||
| url = {https://evaleval.github.io/2026/02/16/everyevalever-launch/}, |
The BibTeX citation URL shows the date as "2026/02/16" but the frontmatter date is "2026-02-15". This creates an inconsistency between the post date and its canonical URL in the citation.
| published: true | ||
| exclude_from_collection: true | ||
| category: Infrastructure | ||
| image: "/assets/img/long-site-banner.webp" |
The image path uses a leading slash "/assets/img/long-site-banner.webp" while other blog posts in the codebase use "assets/img/..." without the leading slash. This inconsistency may cause issues depending on how Jekyll processes the paths. Consider using "assets/img/long-site-banner.webp" to match the convention seen in other posts like "2025-07-13-eval-science-kickoff.md" and "2025-08-10-blog-chart-crisis.md".
| image: "/assets/img/long-site-banner.webp" | |
| image: "assets/img/long-site-banner.webp" |
| We need your help. We're launching a [Shared Task](evalevalai.com/events/) for practitioners alongside this post — two tracks for contributing public and proprietary eval data to the repository, with co-authorship for qualifying contributors and a [workshop at ACL 2026 in San Diego](https://evalevalai.com/events/2026-acl-workshop/). | ||
|
|
||
| *Submissions open now, deadline May 1, 2026.* | ||
|
|
||
| ## Get involved | ||
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | ||
|
|
||
| - Join the Shared Task 🏁 : [Call for Participation](evalevalai.com/events/) |
Both links to "evalevalai.com/events/" in these lines are missing the https:// protocol prefix, so they resolve as paths relative to the current page instead of as external URLs. They should be "https://evalevalai.com/events/".
| We need your help. We're launching a [Shared Task](evalevalai.com/events/) for practitioners alongside this post — two tracks for contributing public and proprietary eval data to the repository, with co-authorship for qualifying contributors and a [workshop at ACL 2026 in San Diego](https://evalevalai.com/events/2026-acl-workshop/). | |
| *Submissions open now, deadline May 1, 2026.* | |
| ## Get involved | |
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | |
| - Join the Shared Task 🏁 : [Call for Participation](evalevalai.com/events/) | |
| We need your help. We're launching a [Shared Task](https://evalevalai.com/events/) for practitioners alongside this post — two tracks for contributing public and proprietary eval data to the repository, with co-authorship for qualifying contributors and a [workshop at ACL 2026 in San Diego](https://evalevalai.com/events/2026-acl-workshop/). | |
| *Submissions open now, deadline May 1, 2026.* | |
| ## Get involved | |
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | |
| - Join the Shared Task 🏁 : [Call for Participation](https://evalevalai.com/events/) |
| *Submissions open now, deadline May 1, 2026.* | ||
|
|
||
| ## Get involved | ||
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) |
The GitHub URL is missing the https:// protocol prefix, so it resolves as a path relative to the current page. It should be "https://github.com/evaleval/every_eval_ever".
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | |
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](https://github.com/evaleval/every_eval_ever) |
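Scheme-less links like the ones flagged above are easy to miss by eye. The following is a minimal sketch of a check that could catch them before review. It is only a heuristic, not part of this PR: the function name `links_missing_scheme` and the "first path segment contains a dot" rule are assumptions made for illustration, and a relative link to a file such as `image.png` would be flagged as a false positive.

```python
import re

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")

# Matches a leading URL scheme such as "https:" or "mailto:".
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def links_missing_scheme(markdown: str) -> list[str]:
    """Return link targets that look like bare domains without a scheme."""
    missing = []
    for _text, target in LINK_RE.findall(markdown):
        if SCHEME_RE.match(target):
            continue  # already has a scheme
        if target.startswith(("/", "#", ".")):
            continue  # site-relative path, anchor, or explicit relative path
        first_segment = target.split("/", 1)[0]
        if "." in first_segment:
            # Heuristic (assumption): a dot before the first slash suggests a domain.
            missing.append(target)
    return missing


sample = (
    "- Try the schema: [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) "
    "and [GitHub](github.com/evaleval/every_eval_ever)\n"
    "- Join the Shared Task: [Call for Participation](evalevalai.com/events/)\n"
)

for target in links_missing_scheme(sample):
    print(f"missing scheme: {target}")
```

Run against the two list items from the post, the sketch reports the GitHub and Shared Task links and leaves the Hugging Face link alone.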
| We need your help. We're launching a [Shared Task](evalevalai.com/events/) for practitioners alongside this post — two tracks for contributing public and proprietary eval data to the repository, with co-authorship for qualifying contributors and a [workshop at ACL 2026 in San Diego](https://evalevalai.com/events/2026-acl-workshop/). | ||
|
|
||
| *Submissions open now, deadline May 1, 2026.* | ||
|
|
||
| ## Get involved | ||
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | ||
|
|
||
| - Join the Shared Task 🏁 : [Call for Participation](evalevalai.com/events/) |
The URL is missing the https:// protocol prefix, so it resolves as a path relative to the current page. It should be "https://evalevalai.com/events/".
| We need your help. We're launching a [Shared Task](evalevalai.com/events/) for practitioners alongside this post — two tracks for contributing public and proprietary eval data to the repository, with co-authorship for qualifying contributors and a [workshop at ACL 2026 in San Diego](https://evalevalai.com/events/2026-acl-workshop/). | |
| *Submissions open now, deadline May 1, 2026.* | |
| ## Get involved | |
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | |
| - Join the Shared Task 🏁 : [Call for Participation](evalevalai.com/events/) | |
| We need your help. We're launching a [Shared Task](https://evalevalai.com/events/) for practitioners alongside this post — two tracks for contributing public and proprietary eval data to the repository, with co-authorship for qualifying contributors and a [workshop at ACL 2026 in San Diego](https://evalevalai.com/events/2026-acl-workshop/). | |
| *Submissions open now, deadline May 1, 2026.* | |
| ## Get involved | |
| - Try the schema 📋 : [Hugging Face Space](https://huggingface.co/spaces/evaleval/every_eval_ever_space) and [GitHub](github.com/evaleval/every_eval_ever) | |
| - Join the Shared Task 🏁 : [Call for Participation](https://evalevalai.com/events/) |
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| image: "/assets/img/long-site-banner.webp" | ||
| authors: | ||
| - name: "Jan Batzner*" | ||
| - name: "Leshem Coshen*" |
The author name "Leshem Coshen" appears to be spelled inconsistently with other posts in the repository. In the post "_posts/2025-08-10-blog-chart-crisis.md", the same author is listed as "Leshem Choshen" (with an 'h'). Please verify the correct spelling and ensure consistency across all posts.
| - name: "Leshem Coshen*" | |
| - name: "Leshem Choshen*" |
| ```bibtex | ||
| @misc{evaleval2026everyevalever, | ||
| title = {Every Eval Ever: Toward a Common Language for AI Eval Reporting}, | ||
| author = {Jan Batzner and Leshem Coshen and Avijit Ghosh and Sree Harsha Nelaturu and Anastassia Kornilova and Damian Stachura and Anka Reuel and Yifan Mai and Asaf Yehudai and Irene Solaiman and Stella Biderman}, |
The author name "Leshem Coshen" in the BibTeX citation appears to be spelled inconsistently with other posts in the repository. In the post "_posts/2025-08-10-blog-chart-crisis.md", the same author is listed as "Leshem Choshen" (with an 'h'). Please verify the correct spelling and ensure consistency across all posts and citations.
|
|
||
| ## Why Us, Why Now | ||
|
|
||
| We understand the pain firsthand. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We have worked on numerous projects where collecting evaluations either restricts what can be done or consumes most of the project's effort. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34.com), among others. |
The ACL Anthology URL for reference [10] has an incorrect format. ACL Anthology URLs should end with a forward slash, not ".com". The URL "https://aclanthology.org/2025.acl-long.34.com" should be "https://aclanthology.org/2025.acl-long.34/" to be a valid link.
| We understand the pain firsthand. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We have worked on numerous projects where collecting evaluations either restricts what can be done or consumes most of the project's effort. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34.com), among others. | |
| We understand the pain firsthand. The EvalEval Coalition is a community of researchers working to fix how AI evaluations are built, run, documented, shared, and compared. We have worked on numerous projects where collecting evaluations either restricts what can be done or consumes most of the project's effort. Need examples? See [1](https://arxiv.org/abs/2602.03344), [2](https://arxiv.org/abs/2503.01622), [3](https://proceedings.neurips.cc/paper_files/paper/2024/hash/28236482f64a72eec43706b6f3a6c511-Abstract-Conference.html), [4](https://arxiv.org/abs/2412.06540), [5](https://arxiv.org/abs/2410.11840), [6](https://aclanthology.org/2024.acl-long.456/), [7](https://arxiv.org/abs/2407.13696), [8](https://par.nsf.gov/servlets/purl/10547932), [9](https://aclanthology.org/2024.naacl-long.139/), [10](https://aclanthology.org/2025.acl-long.34/), among others. |
| author = {Jan Batzner and Leshem Coshen and Avijit Ghosh and Sree Harsha Nelaturu and Anastassia Kornilova and Damian Stachura and Anka Reuel and Yifan Mai and Asaf Yehudai and Irene Solaiman and Stella Biderman}, | ||
| year = {2026}, | ||
| month = {February}, | ||
| url = {https://evaleval.github.io/2026/02/16/everyevalever-launch/}, |
The URL in the BibTeX citation has a date mismatch. The file is dated 2026-02-15 (February 15) in both the filename and the frontmatter date field, but the URL uses "2026/02/16" (February 16). This should be corrected to "2026/02/15" to match the post date.
| url = {https://evaleval.github.io/2026/02/16/everyevalever-launch/}, | |
| url = {https://evaleval.github.io/2026/02/15/everyevalever-launch/}, |
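A date mismatch like this one can be checked mechanically. The sketch below compares the date in a post's filename with the date embedded in its citation URL. It assumes the site's permalink contains the post date as `/YYYY/MM/DD/`, which matches the URL quoted in this comment but is not confirmed against the site configuration; the helper names are hypothetical.

```python
import re


def date_from_filename(filename: str) -> str:
    """Extract 'YYYY/MM/DD' from a Jekyll-style post filename."""
    basename = filename.rsplit("/", 1)[-1]
    match = re.match(r"(\d{4})-(\d{2})-(\d{2})-", basename)
    if match is None:
        raise ValueError(f"no date prefix in filename: {filename}")
    return "/".join(match.groups())


def date_from_url(url: str) -> str:
    """Extract 'YYYY/MM/DD' from a URL, assuming a /YYYY/MM/DD/ permalink segment."""
    match = re.search(r"/(\d{4})/(\d{2})/(\d{2})/", url)
    if match is None:
        raise ValueError(f"no date segment in URL: {url}")
    return "/".join(match.groups())


def dates_agree(filename: str, url: str) -> bool:
    """Return True when the filename date and the URL date are the same day."""
    return date_from_filename(filename) == date_from_url(url)


post = "_posts/2026-02-15-everyevalever-launch.md"
cited = "https://evaleval.github.io/2026/02/16/everyevalever-launch/"
suggested = "https://evaleval.github.io/2026/02/15/everyevalever-launch/"

print(dates_agree(post, cited))
print(dates_agree(post, suggested))
```

The check only compares strings. It does not confirm which URL the published site actually serves, so the live permalink is still worth verifying before merging the suggestion.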